Pretty Text, Ugly Logic: When Image Models Learn to Write but Not to Reason

A slide looks finished. The headline is sharp, the equations are aligned, the answer box is confident, and the design has the mild corporate glow of something that has already been approved by three people who did not read it. That is exactly the problem. For years, text-to-image models failed in a wonderfully obvious way: they could not spell. A poster would say “Qaurterly Reveneu,” the mockup button would contain mystical glyphs, and everyone understood the output was decorative, not operational. Recent models have changed that. They can now place readable text inside images, produce document-like pages, and generate slide-like visual artifacts. The failure mode has become less funny and more expensive: the text may be readable, but the reasoning may be wrong. ...

June 7, 2026 · 15 min · Zelina

Scaffold and Ladder: Why AI Agents Need Meta-Reasoning, Not Longer Monologues

Workflow is where AI agents usually stop looking magical. Ask one to summarize a short memo, and it behaves like a competent intern with suspiciously fast typing. Ask it to investigate a compliance question across policies, contract clauses, ticket histories, and messy attachments, and the illusion starts to wobble. The agent searches once, reads too much at once, jumps to a plausible answer, and then politely explains the wrong conclusion with the confidence of a junior consultant who has discovered formatting. ...

June 1, 2026 · 18 min · Zelina

Score and Disorder: Why LLM Reasoning Needs More Than Accuracy

A model review often begins with a spreadsheet. One column says accuracy. Another says cost. A third says latency. Someone asks whether the model is “good enough.” Someone else points at the benchmark score. A decision is made. Procurement smiles. Compliance does not, but compliance rarely smiles anyway. The problem is not that accuracy is useless. The problem is that accuracy is too small a container for the thing businesses actually want from reasoning systems. A final answer can be correct while the route to that answer is unstable, unnecessarily expensive, locally contradictory, or impossible to reproduce under a harmless rewording of the question. That is not a philosophical inconvenience. It is an operational failure mode waiting politely inside a dashboard. ...

June 1, 2026 · 16 min · Zelina

If Logic Were Enough: Why LLMs Still Miss the Point of Conditionals

A promise is rarely just a logical operator. “If you mow the lawn, I’ll give you 50 dollars” does not sound like a philosophical exercise in truth tables. It sounds like a deal. Most people hear it as: no mowing, no money. By contrast, “If you’re hungry, there’s pizza in the oven” does not mean the pizza appears only under the metaphysical condition of your hunger. It means the pizza is there, and your hunger merely explains why I am telling you. ...

May 29, 2026 · 16 min · Zelina

RL Needs a Menu, Not a Miracle

Menus are underrated. When a language model knows only one way to solve a problem, reinforcement learning can mostly reward or punish that route. It can make the model more confident, more selective, and sometimes more verbose. But it has little room to choose among genuinely different ways of reaching the answer. ...

May 25, 2026 · 14 min · Zelina

Think Less, Align Better: The New Economics of AI Reasoning

Opening: Why this matters now. Enterprise AI is entering its mildly awkward teenage phase: everyone wants intelligence, nobody wants the invoice. For the last two years, much of the AI conversation has revolved around more: more context, more reasoning tokens, more chain-of-thought, more human feedback, more evaluators, more synthetic data, more agents, more dashboards to explain why the agents broke the dashboards. The operating assumption was simple enough: if the model thinks more, explains more, or trains on more feedback, it should perform better. ...

May 9, 2026 · 19 min · Zelina

Credit Where It’s Due: The New Reasoning Stack for Agentic AI

Opening: Why this matters now. The current agentic AI conversation has a very convenient myth: if an AI agent fails, give it a better model, a longer context window, more tool calls, and perhaps a heroic prompt containing the phrase “think step by step” in several places. Then wait for magic. Preferably billable magic. ...

May 7, 2026 · 16 min · Zelina

Crystal Clear? Why AI Needs to Show Its Work

Answers are cheap. In a business setting, this is slightly annoying. A model reads a chart, extracts a number, answers a compliance question, classifies a product defect, or explains a visual inspection result. The answer lands in the dashboard. It looks clean. It may even be correct. Then someone asks the only question that matters: how did it get there? ...

March 16, 2026 · 16 min · Zelina

The Context Ceiling: When Long Context Stops Thinking

Documents are the easiest way to fool an AI system into looking serious. A procurement team uploads the full contract archive. A compliance team adds policy manuals, audit notes, and emails. A financial analyst stuffs transcripts, filings, and market commentary into one heroic prompt. The interface accepts it. The model answers fluently. Everyone relaxes. ...

March 2, 2026 · 12 min · Zelina

Gamma Rays and Toolboxes: Why Superintelligence May Be a Systems Engineering Problem

Toolboxes are not glamorous. Nobody gives a keynote about the screwdriver. Nobody writes breathless think-pieces about the socket wrench. But when a complicated system fails, the difference between “genius” and “expensive confusion” is often whether the operator had the right tool, used it at the right moment, and trusted it to do the part humans should not pretend to do mentally. ...

February 25, 2026 · 14 min · Zelina