
Org-Charted Territory: Why AI Agents Need Middle Management

Opening — Why this matters now The AI industry has spent the last two years trying to turn large language models into workers. The result is a small circus of agents: coding agents, browser agents, research agents, support agents, spreadsheet agents, and agents that appear to exist mainly to summon other agents. Naturally, the next problem is not intelligence. It is management. ...

April 28, 2026 · 16 min · Zelina

Search Me If You Can: Why AI Agent Discovery Needs Receipts

Opening — Why this matters now The AI agent market is beginning to look like an overconfident airport duty-free shop: everything claims to be premium, every label promises capability, and somehow the thing you need is still hard to find. That matters because the next phase of business automation will not be built from one general chatbot sitting politely in a browser tab. It will involve agent ecosystems: finance agents, customer-support agents, coding agents, compliance agents, research agents, scheduling agents, procurement agents, and a thousand microscopic “I can do that” assistants wrapped in glossy product pages. ...

April 28, 2026 · 13 min · Zelina

Model Citizens: Why Agentic AI Needs Laws, Not Just Loops

Opening — Why this matters now The current agentic AI conversation has a charmingly reckless habit: attach a large language model to tools, add a planner, sprinkle in memory, and call the result an autonomous system. This is not entirely wrong. It is merely incomplete in the way a paper airplane is technically aviation. ...

April 27, 2026 · 13 min · Zelina

Drift Happens: Stress-Testing AI Policies Before Sensors Lie

Opening — Why this matters now Most AI deployment failures do not arrive wearing a villain costume. They arrive as a camera calibration shift, a slightly worse classifier, a sensor that ages badly, a document parser that misses one field more often than expected, or a retrieval layer that suddenly sees the wrong context with impressive confidence. The policy may still be “the same.” The world it observes is not. ...

April 26, 2026 · 13 min · Zelina

Synthetic Data, Real Receipts: Why LLM Pipelines Need an Auditor

Opening — Why this matters now Synthetic data has become one of AI’s favorite escape routes. Real data is expensive, legally awkward, slow to collect, unevenly labeled, and sometimes simply unavailable. LLMs offer a tempting alternative: generate the missing examples, fill the long tail, create evaluation suites, simulate edge cases, and keep the training pipeline moving. Convenient. Elegant. Also mildly dangerous, which is usually where the interesting part begins. ...

April 25, 2026 · 12 min · Zelina

Clawing Back the Benchmark: When AI Agents Start Testing Themselves

Opening — Why this matters now AI agents are graduating from toy demos to operational labor: triaging tickets, coordinating calendars, filing reports, reconciling data, and occasionally inventing new ways to misuse a CRM. Yet the industry still evaluates many of these systems with static, hand-built benchmarks assembled like museum exhibits. That model is expensive, slow, and increasingly obsolete. Once a benchmark is published, it starts aging immediately. Models train on adjacent data, developers optimize toward the leaderboard, and reality moves elsewhere. ...

April 23, 2026 · 4 min · Zelina

Sirens in the Weights: Why AI Safety May Be Hiding Inside the Model

Opening — Why this matters now Every AI vendor claims to care about safety. Many even prove it by adding another model on top of the first model to police the first model. It is an elegant industry ritual: solve model complexity with more model complexity. But a newly uploaded paper, LLM Safety From Within: Detecting Harmful Content with Internal Representations, offers a more inconvenient thesis: perhaps the model already knows when content is dangerous — we simply have not been listening carefully enough. ...

April 23, 2026 · 4 min · Zelina

When RL Needs a Tour Guide: OGER and the Business of Smarter Exploration

Opening — Why this matters now The current arms race in AI reasoning has an awkward secret: many models are not truly thinking better so much as repeating better. Reinforcement learning has improved chain-of-thought performance dramatically, but often by polishing existing habits rather than discovering new ones. Efficient? Yes. Inspiring? Not especially. The paper OGER: A Robust Offline-Guided Exploration Reward for Hybrid Reinforcement Learning proposes a cleaner answer: teach models from strong examples, then reward them for going beyond those examples intelligently. Not chaos. Not blind randomness. Structured exploration. A rare commodity. ...

April 23, 2026 · 4 min · Zelina

CQ or Consequences: What This LLM Benchmark Reveals About AI Requirements Work

Opening — Why this matters now Everyone wants AI to automate the expensive, slow, deeply human parts of work. Requirements gathering is high on that list. It is also where many software and data projects quietly fail. A recent paper, Characterising LLM-Generated Competency Questions, examines whether large language models can reliably generate competency questions (CQs) — the structured questions used in ontology engineering to define what a knowledge system must know, answer, or reason about. In simpler terms: if you are building a knowledge graph, compliance engine, recommendation system, or enterprise AI layer, CQs help translate vague business intent into testable requirements. ...

April 22, 2026 · 5 min · Zelina

CQ, AI & The Question of Questions

Opening — Why this matters now Everyone wants AI systems that are explainable, reliable, and aligned to business needs. Few want to do the tedious work required to get there. That work often begins with asking the right questions. In knowledge engineering, those questions are called Competency Questions (CQs): natural-language prompts that define what an ontology or knowledge model must be able to answer. Think: “Which assets are on loan?”, “Who created this artifact?”, “What metadata is missing?” ...

April 22, 2026 · 4 min · Zelina