Cover image

Mind Reading Machines: When AI Knows Something Is Wrong (But Not What)

Mind Reading Machines: When AI Knows Something Is Wrong (But Not What) Alarm systems are useful even when they cannot write the incident report. A smoke detector does not need to identify the brand of burning toaster. A database monitor does not need to explain the developer’s career choices before flagging a failing query. The first job is simpler: notice that something is off. ...

March 6, 2026 · 15 min · Zelina
Cover image

The Model That Knows It Knows: When Introspection Hides in the Logits

Audit. That is the word enterprises prefer when they want something to sound measurable, serious, and safely boring. You audit model outputs. You audit prompts. You audit logs. You audit whether the assistant said the forbidden thing, leaked the private thing, or hallucinated the regulatory thing. The problem is that models are not only output machines. They are also representation machines. Between the input and the final answer, they build intermediate signals, suppress some of them, amplify others, and then hand management a neat little sentence pretending the whole internal mess never happened. ...

February 24, 2026 · 14 min · Zelina