
Think, Then Do: Why ReAct Turned LLMs into Real Agents

A chatbot answers. An agent checks. That distinction sounds small until a workflow fails at 2:17 p.m. because the model confidently invented a policy clause, skipped the database lookup, and then explained itself with the serene authority of a consultant who has already left the building. The 2022 paper ReAct: Synergizing Reasoning and Acting in Language Models matters because it made that failure mode harder to ignore.[1] It did not simply ask language models to “think step by step.” Chain-of-thought prompting already did that. It did not simply attach a search box to a model. Retrieval-augmented systems were already moving in that direction. The paper’s real contribution was more architectural: it showed that a language model could alternate between reasoning, acting, observing, and revising its next move. ...

March 4, 2026 · 16 min · Zelina

Consistency Is Not a Coincidence: When LLM Agents Disagree With Themselves

A support ticket arrives. The agent reads the same customer history, sees the same policy document, and has access to the same tools. On Monday, it searches for the refund rule, retrieves the correct clause, and gives a clean answer. On Tuesday, with the same input, it searches for a different phrase, retrieves a less relevant document, wanders through two extra steps, and ends with a confident answer that is only approximately useful. ...

February 14, 2026 · 16 min · Zelina

Agentic Systems Need Architecture, Not Vibes

Agentic AI has a habit of sounding more engineered than it is. A demo connects an LLM to a search tool, adds a memory store, wraps the whole thing in a planner, and suddenly the slide deck says “autonomous agent.” The system may still forget what it just saw, retrieve the wrong context, misuse tools, loop on bad actions, or politely hallucinate its way into a support ticket. But the diagram has arrows, so morale remains high. ...

February 2, 2026 · 14 min · Zelina

Beyond Utility: When LLM Agents Start Dreaming Their Own Tasks

A task list is usually where enterprise automation becomes reassuringly boring. Someone defines the work. The system executes it. A dashboard turns green, or, in more honest organisations, amber with an explanation. The point is not mystery. The point is control. The paper behind this article, LLM Agents Beyond Utility: An Open-Ended Perspective, asks what happens when that tidy arrangement is disturbed: what if the agent does not merely complete tasks, but proposes them? What if it can remember what it has done, inspect its environment, write notes to itself, and continue across runs?[1] ...

October 23, 2025 · 15 min · Zelina

Org Charts for Robots: What AgentArch Really Tells Us About Enterprise AI

Enterprise AI teams love an architecture diagram. Boxes, arrows, specialist agents, memory stores, tool registries, a tasteful orchestrator sitting at the top like a middle manager with JSON access. It looks reassuring. It looks intentional. It also looks suspiciously like the kind of thing that can fail in six different places while still producing a beautifully formatted answer. ...

September 20, 2025 · 16 min · Zelina

ReAct Without the Chaos: AgentScope 1.0 Turns Tools into Strategy

TL;DR for operators: AgentScope 1.0 is best read as a production-shaping framework for agentic applications, not as a victory lap over rival agent frameworks. Alibaba’s paper describes a developer-centric stack that rebuilds agents around four core abstractions (message, model, memory, and tool), then places a ReAct-style reasoning-and-action loop on top of them.[1] ...

August 25, 2025 · 17 min · Zelina

Who Sees What, Who Pays the Cost? Teaching Agents to See Through Others’ Eyes

TL;DR for operators: The paper’s useful message is not “symbolic planners can teach LLM agents to reason socially.” That would be tidy, flattering, and mostly wrong. The useful message is narrower and more operational: planner-derived thought-action examples can scaffold some agent behaviour, especially local decision discipline, but they do not automatically create robust perspective-taking. In the tested Director–Matcher environment, agents do well when the task is basically “ignore what the other party cannot see.” They struggle when they must imagine what exists in another agent’s private view, or decide whether it is worth asking, moving, opening, or acting under uncertainty.[1] ...

August 23, 2025 · 20 min · Zelina

Plans Before Action: What XAgent Can Learn from Pre-Act's Cognitive Blueprint

TL;DR for operators: Pre-Act is a useful reminder that enterprise agents do not fail only because they choose the wrong tool. They fail because they lose the plot. A customer asks for help, the agent gathers one fact, calls one API, sees an unexpected result, and then behaves as if the workflow has reset. Charming, in the same way a lift that forgets floors is charming. ...

May 18, 2025 · 18 min · Zelina