
From Retry to Recovery: Teaching AI Agents to Learn from Their Own Mistakes

A failed automation run usually tells you more than a successful one. A coding agent compiles the wrong program and receives a concrete error. A web-navigation agent clicks into the wrong product page and sees that the attributes do not match. A task agent tries an invalid action and the environment complains, patiently, like a machine that has seen too much. In each case, the system does not merely say “failed.” It gives clues. ...

March 18, 2026 · 17 min · Zelina

The Slides That Explain Themselves: When AI Learns to Reverse Its Own Thinking

Slides are supposed to be obvious. That is their entire professional excuse for existing. A good presentation does not merely contain information; it makes the intended argument recoverable by someone who was not inside the author’s head. This is why a deck can look expensive and still fail. The gradients are polished, the icons are friendly, and the narrative has quietly wandered into a swamp wearing a consultant’s blazer. ...

March 18, 2026 · 16 min · Zelina

Mind Over Machine: When AGI Starts Thinking in Needs

A factory line does not need a chatbot with feelings. It needs a control system that can tell the difference between a harmless deviation, a costly delay, and a situation that deserves to interrupt a human operator before the machine becomes expensive sculpture. That is the useful way to read Computational Concept of the Psyche by Anton Kolonin and Vladimir Krykov. The paper’s title sounds as if we are about to attach a synthetic soul to a machine, perhaps with a dashboard of emotions and a tasteful blue glow. Fortunately, the core argument is more operational than theatrical: an intelligent agent should not only predict the next state of the world; it should manage its own state of needs while acting under uncertainty, risk, and resource limits. ...

March 17, 2026 · 16 min · Zelina

When Right Meets Wrong: Teaching LLMs by Letting Their Mistakes Talk

Training a reasoning model is often treated like running a classroom with a very impatient teacher: give the model a problem, let it produce several answers, mark each answer right or wrong, and push the policy toward the winners. That is already useful. It is also slightly wasteful. Because in a real classroom, the wrong answers are not just trash to be swept off the floor. They reveal what the student misunderstood. They show which shortcuts are tempting, which algebra step keeps breaking, and which false pattern looks suspiciously persuasive. A good teacher does not only praise the correct solution. A good teacher puts the correct and incorrect attempts side by side and asks: what exactly changed? ...

March 16, 2026 · 16 min · Zelina

Too Smart to Share: When AI Agents Get Smarter, Systems Get Worse

Chargers are boring until everyone arrives at the same time. That is the useful way to enter this paper. Not through grand claims about artificial general intelligence, swarm intelligence, or the coming society of agents. Start with something embarrassingly practical: seven autonomous electric vehicles, two charging slots, and no reliable cloud coordinator telling everyone what to do. ...

March 14, 2026 · 19 min · Zelina

Agents That Learn From Their Own Mistakes: The Rise of Retroactive AI

Mistakes are useful only when they are converted into something operational. That is the small, inconvenient detail often missing from agent hype. An LLM agent can fail at a web-shopping task, wander through a simulated room, push the wrong Sokoban box, or uncover the wrong MineSweeper cell. Fine. Failure happens. The useful question is not whether the agent failed. The useful question is whether the system can extract a reusable signal from that failure before the next attempt. ...

March 12, 2026 · 16 min · Zelina

Mirror, Mirror on the Agent: Teaching LLMs to Judge Their Own Actions

The agent did exactly what it was taught. That was the problem. A familiar business agent failure does not look dramatic. It looks boring. The agent searches the database, clicks the wrong record, receives an error, retries the same action, receives the same error, retries again, and then politely informs the user that it has encountered “temporary difficulty.” Very professional. Completely useless. ...

March 12, 2026 · 16 min · Zelina

The Long Conversation Problem: How MAPO Teaches AI to Care Over Time

Customer support has a familiar failure mode: the first answer sounds polished, the second answer sounds patient, the third answer sounds as if the system has quietly forgotten what problem it is solving. The user is still there. The emotional state has changed. The unresolved issue has shifted. The model, meanwhile, keeps producing individually acceptable replies, like a waiter bringing one beautifully plated dish at a time to the wrong table. ...

March 10, 2026 · 14 min · Zelina

Teaching Reinforcement Learning to Think Before It Acts

Agents are easy to impress and hard to trust. Give a reinforcement learning agent a game, a reward signal, and enough time, and it may discover something brilliant. Or it may discover the dumbest possible way to look successful. In Seaquest, that can mean shooting enemies while ignoring oxygen. In Kangaroo, it can mean punching enemies in a corner instead of climbing toward the joey. Technically, points go up. Strategically, the agent has learned the machine-learning equivalent of optimizing a dashboard while the business burns quietly in the background. ...

March 9, 2026 · 14 min · Zelina

When the Streets Flood, Let the AI Drive: Reinforcement Learning for Climate‑Resilient Cities

A flooded street is not only a drainage problem. It is a transport problem, a budget problem, an insurance problem, a public-trust problem, and, if the city waits long enough, a very expensive lesson in pretending that yesterday’s weather statistics are still a planning manual. Copenhagen is a useful place to begin because the paper’s case is not imaginary. In 2011, the city experienced a major cloudburst that flooded streets, disrupted roads and rail, and caused damage estimated at around 6 billion Danish kroner. The new research paper, Artificial Intelligence for Climate Adaptation: Using Reinforcement Learning for Climate Change-Resilient Transport, uses Copenhagen’s inner city as the testbed for a larger question: how should a city decide where, when, and how much to invest in flood adaptation between 2024 and 2100? ...

March 9, 2026 · 16 min · Zelina