
Deep Thinking, Dynamic Acting: How DeepAgent Redefines General Reasoning

In the fast-evolving landscape of agentic AI, one critical limitation persists: most frameworks can think or act, but rarely both in a fluid, self-directed manner. They follow rigid ReAct-like loops—plan, call, observe—resembling a robot that obeys instructions without ever truly reflecting on its strategy. The recent paper “DeepAgent: A General Reasoning Agent with Scalable Toolsets” from Renmin University and Xiaohongshu proposes an ambitious leap beyond this boundary. It envisions an agent that thinks deeply, acts freely, and remembers wisely. ...
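To ground the contrast, here is a minimal sketch of the rigid loop the paper pushes against (a generic ReAct-style illustration, not DeepAgent's architecture; llm_plan and call_tool are hypothetical stand-ins):

```python
# Minimal sketch of a rigid ReAct-style loop: plan -> call -> observe, repeated
# until the model emits a final answer. Illustrative only; llm_plan and
# call_tool are hypothetical stand-ins, not DeepAgent's actual interfaces.
def react_loop(task: str, llm_plan, call_tool, max_steps: int = 10) -> str:
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        step = llm_plan("\n".join(history))   # model proposes the next thought/action
        if step["type"] == "final":
            return step["answer"]             # the loop ends as soon as an answer appears
        observation = call_tool(step["tool"], step["args"])
        history.append(f"Action: {step['tool']}({step['args']})")
        history.append(f"Observation: {observation}")
    return "No answer within the step budget."
```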

October 31, 2025 · 4 min · Zelina

Seeing Green: When AI Learns to Detect Corporate Illusions

Oil and gas companies have long mastered the art of framing—selectively showing the parts of reality they want us to see. A commercial fades in: wind turbines turning under a soft sunrise, a child running across a field, the logo of an oil major shimmering on the horizon. No lies are spoken, but meaning is shaped. The message? We care. The reality? Often less so. ...

October 31, 2025 · 4 min · Zelina

Teaching Safety to Machines: How Inverse Constraint Learning Reimagines Control Barrier Functions

Autonomous systems—from self-driving cars to aerial drones—are bound by one inescapable demand: safety. But encoding safety directly into algorithms is harder than it sounds. We can write explicit constraints (“don’t crash,” “stay upright”), yet the boundary between safe and unsafe states often defies simple equations. The recent paper “Learning Neural Control Barrier Functions from Expert Demonstrations using Inverse Constraint Learning” (Yang & Sibai, 2025) offers a different path. It suggests that machines can learn what safety looks like—not from rigid formulas, but from watching experts. ...
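For readers who want the formal anchor, this is the standard control-affine CBF condition (textbook background, not the paper's specific learning objective): a function h with safe set C = {x : h(x) ≥ 0} certifies safety when some admissible input can keep h from decaying too fast.

```latex
% Standard control-affine control barrier function condition (background only,
% not the paper's inverse-constraint-learning objective). With dynamics
% \dot{x} = f(x) + g(x)u and safe set C = \{x : h(x) \ge 0\}, h is a CBF if
\[
  \sup_{u \in U} \; \nabla h(x)^{\top}\!\left( f(x) + g(x)\,u \right)
  \;\ge\; -\alpha\!\left( h(x) \right)
  \qquad \text{for all } x \in C,
\]
% where \alpha is an extended class-\mathcal{K} function.
```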

October 31, 2025 · 4 min · Zelina

The Benchmark Awakens: AstaBench and the New Standard for Agentic Science

The latest release from the Allen Institute for AI, AstaBench, represents a turning point for how the AI research community evaluates large language model (LLM) agents. For years, benchmarks like MMLU or ARC have tested narrow reasoning and recall. But AstaBench brings something new—it treats the agent not as a static model, but as a scientific collaborator with memory, cost, and strategy. ...

October 31, 2025 · 4 min · Zelina

The Rise of FreePhD: How Multiagent Systems Are Reimagining the Scientific Method

In today’s AI landscape, most “autonomous scientists” still behave like obedient lab assistants: they follow rigid checklists, produce results, and stop when the checklist ends. But science, as any human researcher knows, is not a checklist—it’s a messy, self-correcting process of hypotheses, failed attempts, and creative pivots. That is precisely the gap freephdlabor seeks to close. Developed by researchers at Yale and the University of Chicago, this open-source framework reimagines automated science as an ecosystem of co-scientist agents that reason, collaborate, and adapt—much like a real research group. Its tagline might as well be: build your own lab, minus the PhD. ...

October 25, 2025 · 4 min · Zelina

When Numbers Meet Narratives: How LLMs Reframe Quant Investing

In the world of quantitative investing, the line between data and story has long been clear. Numbers ruled the models; narratives belonged to the analysts. But the recent paper “Exploring the Synergy of Quantitative Factors and Newsflow Representations from Large Language Models for Stock Return Prediction” from RAM Active Investments argues that this divide is no longer useful—or profitable.

Beyond Factors: Why Text Matters

Quantitative factors—valuation, momentum, profitability—are the pillars of systematic investing. They measure what can be counted. But markets move on what’s talked about, too. Corporate press releases, analyst notes, executive reshuffles—all carry signals that often precede price action. Historically, this qualitative layer was hard to quantify. Now, LLMs can translate the market’s chatter into vectors of meaning. ...
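To make the synergy concrete, a hedged sketch (under generic assumptions, not the paper's actual pipeline; the embedding function and the ridge regressor are placeholders) would simply concatenate per-stock factor values with an LLM embedding of recent news and fit a cross-sectional return model:

```python
# Hedged sketch: combine numeric quant factors with an LLM embedding of recent
# newsflow and fit a simple cross-sectional return model. The embed() function
# and the ridge regressor are placeholders, not the paper's pipeline.
import numpy as np
from sklearn.linear_model import Ridge

def build_features(factors: np.ndarray, news_texts: list[str], embed) -> np.ndarray:
    """factors: (n_stocks, n_factors); embed(text) returns a 1-D news vector per stock."""
    news_vecs = np.vstack([embed(t) for t in news_texts])   # (n_stocks, d_text)
    return np.hstack([factors, news_vecs])                  # numbers plus narrative

def fit_return_model(factors, news_texts, next_period_returns, embed):
    X = build_features(factors, news_texts, embed)
    return Ridge(alpha=1.0).fit(X, next_period_returns)     # placeholder learner
```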

October 25, 2025 · 3 min · Zelina

Beyond Utility: When LLM Agents Start Dreaming Their Own Tasks

When large language models started solving math problems and writing code, they were celebrated as powerful tools. But a recent paper from INSAIT and ETH Zurich—LLM Agents Beyond Utility: An Open‑Ended Perspective—suggests something deeper may be stirring beneath the surface. The authors don’t simply ask what these agents can do, but whether they can want to do anything at all.

From Obedience to Autonomy

Most current LLM agents, even sophisticated ones like ReAct or Reflexion, live inside tight task loops: you prompt them, they plan, act, observe, and return a result. Their agency ends with the answer. But this study challenges that boundary by giving the agent a chance to set its own goals, persist across runs, and store memories of past interactions. ...
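A minimal sketch of what that shift could look like in code (an illustration of the idea, not the authors' implementation; llm is a hypothetical completion function and the JSON file is just one way to persist state) is an agent that reloads its memories, proposes its own next goal, and writes back what it learned:

```python
# Minimal sketch of an "open-ended" agent: it reloads memories from previous
# runs, asks the model to propose its own next goal, pursues it, and persists
# the outcome. Illustrative only; llm is a hypothetical completion function.
import json
from pathlib import Path

MEMORY_PATH = Path("agent_memory.json")  # assumption: a simple on-disk store

def load_memory() -> list[str]:
    return json.loads(MEMORY_PATH.read_text()) if MEMORY_PATH.exists() else []

def run_once(llm) -> None:
    memory = load_memory()
    goal = llm("Given these past experiences, propose your next goal:\n" + "\n".join(memory))
    outcome = llm(f"Pursue this goal and report what happened: {goal}")
    memory.append(f"Goal: {goal} | Outcome: {outcome}")
    MEMORY_PATH.write_text(json.dumps(memory, indent=2))  # memory survives across runs
```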

October 23, 2025 · 4 min · Zelina

Blueprints of Agency: Compositional Machines and the New Architecture of Intelligence

When the term agentic AI is used today, it often conjures images of individual, autonomous systems making plans, taking actions, and learning from feedback loops. But what if intelligence, like biology, doesn’t scale by perfecting one organism — but by building composable ecosystems of specialized agents that interact, synchronize, and co‑evolve? That’s the thesis behind Agentic Design of Compositional Machines — a sprawling, 75‑page manifesto that reframes AI architecture as a modular society of minds, not a monolithic brain. Drawing inspiration from software engineering, systems biology, and embodied cognition, the paper argues that the next generation of LLM‑based agents will need to evolve toward compositionality — where reasoning, perception, and action emerge not from larger models, but from better‑coordinated parts. ...

October 23, 2025 · 4 min · Zelina

When the Lab Thinks Back: How LabOS Turns AI Into a True Co-Scientist

When we talk about AI in science, most imaginations stop at the screen — algorithms simulating molecules, predicting reactions, or summarizing literature. But in LabOS, AI finally steps off the screen and into the lab. It doesn’t just compute hypotheses; it helps perform them.

The Missing Half of Scientific Intelligence

For decades, computation and experimentation have formed two halves of discovery — theory and touch, model and pipette. AI has supercharged the former, giving us AlphaFold and generative chemistry, but the physical laboratory has remained stubbornly analog. Robotic automation can execute predefined tasks, yet it lacks situational awareness — it can’t see contamination, notice a wrong reagent, or adapt when a human makes an unscripted move. ...

October 23, 2025 · 4 min · Zelina

When Lateral Beats Linear: How LToT Rethinks the Tree of Thought

AI researchers are learning that throwing more compute at reasoning isn’t enough. The new Lateral Tree-of-Thoughts (LToT) framework shows that the key isn’t depth—but disciplined breadth.

The problem with thinking deeper

As models like GPT and Mixtral gain access to massive inference budgets, the default approach—expanding Tree-of-Thought (ToT) searches—starts to break down. With thousands of tokens or nodes to explore, two predictable pathologies emerge: ...
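As a rough intuition for disciplined breadth (a generic sketch, not LToT's actual operators; propose, cheap_score, and deep_expand are placeholders): spawn many short lateral candidates, screen them cheaply, and spend deep expansion only on the survivors.

```python
# Generic sketch of "disciplined breadth": spawn many cheap lateral candidates,
# keep only the top few after an inexpensive screen, then pay for deep expansion.
# Illustrative only; propose, cheap_score, and deep_expand are placeholder
# callables, not LToT's actual operators.
def lateral_then_deep(root: str, propose, cheap_score, deep_expand,
                      width: int = 64, survivors: int = 4):
    candidates = [propose(root) for _ in range(width)]          # wide, shallow generation
    ranked = sorted(candidates, key=cheap_score, reverse=True)  # cheap screening pass
    return [deep_expand(c) for c in ranked[:survivors]]         # deep search for the few
```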

October 21, 2025 · 3 min · Zelina