
From Spreadsheets to Swarms: How Agentic AI Rewrites the Retail Supply Chain

Supermarkets look simple from the aisle. Milk is cold. Apples are stacked. Shampoo is there because, apparently, civilization requires thirty-seven variants of “moisture repair.” Behind that calm retail surface is a coordination machine that never really sleeps: demand planners, inventory teams, procurement staff, suppliers, warehouse coordinators, truck schedules, exception reports, and the occasional emergency because one popular SKU suddenly became everyone’s personality for the week. ...

April 8, 2026 · 18 min · Zelina

Protocol Over Prompts: Why ANX Rewrites the Rules of AI Agent Interaction

Forms are boring until an AI agent has to fill one. Then the boring form becomes a surprisingly expensive machine. The agent reads the page, interprets the fields, finds the dropdowns, waits for the browser, loads dynamic options, decides what to click, serializes actions, and tries not to leak whatever the user typed into the wrong place. This is not intelligence in the glamorous sense. It is office work wearing a robotic costume. ...

April 7, 2026 · 18 min · Zelina

World-Building for Agents: When Synthetic Environments Become Real Advantage

A customer-support agent can sound impressive in a demo and still collapse the first time it has to change an address, cancel a duplicate order, rebook a flight, and explain what happened afterward. That collapse usually does not come from weak prose. The model can write the apology beautifully. The problem is that the world behind the apology has state. Orders exist or do not exist. Inventory changes. Refunds create records. A bad tool call can mutate the wrong row. A follow-up answer must reflect what the agent actually did, not what it vaguely intended to do. ...

February 11, 2026 · 16 min · Zelina

When Coders Prove Theorems: Agents, Lean, and the Quiet Death of the Specialist Prover

A coder does not trust a program because it sounds plausible. A coder runs it, reads the error message, changes the implementation, tests again, searches the library, asks a colleague, splits the problem, and keeps going until the machine stops complaining. That mundane loop is the interesting part of Numina-Lean-Agent: An Open and General Agentic Reasoning System for Formal Mathematics.1 The headline result is easy to market: with Claude Opus 4.5 as the base model, Numina-Lean-Agent solves all 12 Putnam 2025 problems in Lean, matching the reported perfect score of AxiomProver. Nice. The trophy cabinet sparkles. ...

January 21, 2026 · 20 min · Zelina

When the Chain Watches the Brain: Governing Agentic AI Before It Acts

Approval is boring. That is why most automation diagrams hide it. A user request arrives, a sensor emits a signal, an AI agent reasons through the situation, a tool call fires, and something in the real world changes. A stock level is replenished. A traffic light is adjusted. A healthcare alert is escalated. In the clean version of the diagram, the agent looks wonderfully autonomous. In the operational version, someone eventually asks the unpleasant question: who allowed this thing to act? ...

December 28, 2025 · 19 min · Zelina

Traffic, but Make It Agentic: When Simulators Learn to Think

Traffic. A planner wants to test whether a new signal policy will reduce congestion near a hospital. A logistics operator wants to know whether a revised delivery schedule will overload a district during the evening peak. A city team wants to compare two neighborhoods, two time windows, and two control strategies before anyone touches asphalt, paint, or public patience. ...

December 25, 2025 · 18 min · Zelina

Shaking the Stack: Teaching Seismology to Talk Back

Simulation software has a talent for hiding intelligence inside inconvenience. A mature physics code may contain decades of numerical insight, community testing, and domain expertise. Then it asks the user to prove loyalty by editing parameter files, remembering command sequences, managing mesh directories, choosing execution binaries, checking output folders, and pretending that none of this is a productivity tax. This is not because scientists enjoy suffering. Mostly. It is because high-performance scientific software often grows around capability first and usability later. ...

December 17, 2025 · 17 min · Zelina

Agents on the Assembly Line: How Production-Grade AI Workflows Actually Get Built

Assembly lines are not exciting because every worker improvises. They are useful because each station does a narrow job, hands the result forward, and leaves as little room as possible for charming chaos. That is also the quiet lesson in A Practical Guide for Designing, Developing, and Deploying Production-Grade Agentic AI Workflows.1 The paper looks, at first glance, like another guide to agents, tools, MCP servers, multi-model reasoning, and cloud-native deployment. The tempting summary would be: “Here are nine best practices for building agentic AI.” ...

December 10, 2025 · 16 min · Zelina

Stacking the Odds: Why Blocksworld Still Breaks Your Fancy LLM Agent

A robot arm, a few colored blocks, and a table. That is the setup. No messy warehouse, no sensor dust, no tired operator, no forklift reversing into the wrong aisle. Just blocks. And still, the fancy LLM agent stumbles. That is the useful discomfort in Benchmark for Planning and Control with Large Language Model Agents: Blocksworld with Model Context Protocol.1 The paper does not show a robot revolution. It shows something more valuable for anyone trying to deploy LLM agents in industrial workflows: even in a symbolic world where the rules are explicit, the actions are discrete, the state can be queried, and the tool interface is standardized, reliability degrades as soon as the task stops being politely simple. ...

December 4, 2025 · 17 min · Zelina

The Agent Olympics: How Toolathlon Tests the Limits of AI Workflows

Office work is not one task. It is a chain of small obligations pretending to be one task. “Check the homework submissions, download the attached Python files, run them, grade the students in Canvas, and use the latest submission if someone sent more than one.” That sounds like a normal administrative request. It is also a compact torture device for an AI agent. The agent must read email, handle attachments, inspect local files, run code, interpret results, map students to course records, update Canvas, and not confidently grade the wrong person. Easy, apparently, as long as nothing has to actually work. ...

November 4, 2025 · 17 min · Zelina