
The Path of Least Assurance: Why AI Reliability Lives Between the Steps

TL;DR for operators: AI reliability is increasingly a process problem, not an answer-checking problem. Three recent arXiv papers make that point from very different angles. MoCo-EA shows that adversarial examples are not merely isolated malicious pixels lurking in the shrubbery; they can lie along continuous, optimisable paths.[1] ConceptAgent shows that erasing a concept from a diffusion model may disrupt the early text-to-image link while leaving later trajectory dynamics available for concept re-entry.[2] BlueFin shows that LLM agents doing finance spreadsheet work fail in ways that only appear when you inspect formulas, recalculation behaviour, workbook mutations, tool choices, and whether the output helps a human analyst do useful work.[3] ...

June 17, 2026 · 18 min · Zelina

Source Code, Not Source Dump: Why Multimodal AI Needs Evidence Routing

Video is easy to collect and expensive to understand. That is the awkward little truth behind many enterprise “AI video intelligence” projects. A warehouse camera records everything. A body camera records everything. A meeting room system records everything. A field-service headset records everything. Then someone asks a very human question: who handled the device after lunch, what did they say, and was the machine hot when they touched it? ...

June 12, 2026 · 15 min · Zelina

Hands-On Intelligence: Why Immersive AI Needs Both Eyes and Fingers

Immersive AI has a convenient myth: put a stronger multimodal model inside a headset, let it see what the user sees, and the future of work politely appears. Very cinematic. Slightly incomplete. The real problem is less glamorous and more operational. Extended-reality work is not just a visual scene. It is a long-running loop of perception, memory, reasoning, instruction, correction, confirmation, and physical effort. The model must understand what is happening over time. The human must still steer the system without becoming a tired thumb attached to a battery pack. ...

June 9, 2026 · 15 min · Zelina

Pixels to Purchase Orders: A Business Map for Choosing Vision-Language Models

Receipts are a good way to ruin an AI demo. A clean product photo is polite. A scanned receipt is not. It has shadows, folds, strange fonts, tiny numbers, merchant abbreviations, table-like structure, and one suspiciously important total amount hiding near the bottom. Ask a generic multimodal assistant what it sees, and it may produce an answer that sounds fluent enough to make everyone in the meeting relax. That is usually the dangerous part. ...

June 8, 2026 · 19 min · Zelina

Talk Is Cheap, Until It Trains ASR

Call centers are very good at producing audio. They are much worse at producing clean, labeled, domain-matched, multi-speaker training data. That distinction matters. A business may have thousands of hours of customer calls, branch conversations, medical consultations, field-service recordings, or internal support audio. But most of it is noisy, consent-constrained, poorly transcribed, unevenly distributed across accents and topics, and inconveniently full of humans doing human things: interrupting, pausing, talking over each other, drifting off-topic, and using domain-specific shorthand as if the ASR model had attended the onboarding session. ...

June 7, 2026 · 17 min · Zelina

The Tower of Babble Gets a Router

Why this matters now: Enterprise AI has a language problem. Not a charming one, like mispronouncing a French menu item with confidence. A structural one. Most companies do not operate in one clean English-speaking universe. Customer support conversations arrive in English, Tagalog, Spanish, Arabic, Thai, Vietnamese, Hindi, Indonesian, Turkish, and whatever dialectal mixture the internet felt like producing that morning. Compliance teams need summaries that preserve local meaning. E-commerce platforms need product search that understands regional idioms. Banks need customer explanations that do not flatten culture into machine-translated oatmeal. ...

May 1, 2026 · 16 min · Zelina

Spatial-Gym and the Illusion of Thinking: Why AI Can’t Walk Before It Runs

Agents are supposed to act. That is the promise hiding behind most enterprise AI demos: the model will not merely answer a question, but inspect a system, choose the next step, correct itself, and reach a useful outcome. The interface changes from chat box to workflow loop, and suddenly everyone starts using the word “agent” with the confidence of a person who has never watched a model get lost in a four-by-four grid. ...

April 13, 2026 · 18 min · Zelina

The Ask Gap: Why AI Agents Fail Not Because They Can’t Think — But Because They Don’t Know When to Stop

A ticket lands in the queue. It looks ordinary: update a parser, answer a business question, patch a workflow, produce a SQL query. The agent opens the files, explores the schema, writes code, runs a few checks, and submits something plausible. The output is polished. The reasoning trace is confident. The dashboard marks the task as completed. ...

April 13, 2026 · 16 min · Zelina

Protocol Over Prompts: Why ANX Rewrites the Rules of AI Agent Interaction

Forms are boring until an AI agent has to fill one. Then the boring form becomes a surprisingly expensive machine. The agent reads the page, interprets the fields, finds the dropdowns, waits for the browser, loads dynamic options, decides what to click, serializes actions, and tries not to leak whatever the user typed into the wrong place. This is not intelligence in the glamorous sense. It is office work wearing a robotic costume. ...

April 7, 2026 · 18 min · Zelina

Pre-Decision Intelligence: When AI Decides Before It Thinks

Audit logs are comforting things. They tell managers that a system took an action, they tell engineers which step fired, and they tell compliance teams that someone, somewhere, has a line of text to point at when the incident review begins. Now imagine an AI agent inside a business workflow. It has a customer request, a list of available tools, and a visible reasoning trace. The trace says it carefully considered whether to call an API, ask for missing information, or answer directly. It sounds deliberate. It sounds inspectable. It sounds like governance. ...

April 2, 2026 · 16 min · Zelina