
When Alignment Is Not Enough: Reading Between the Lines of Modern LLM Safety

Opening — Why this matters now
In the past two years, alignment has quietly shifted from an academic concern to a commercial liability. The paper behind this article (arXiv:2601.16589) sits squarely in this transition period: post-RLHF optimism, pre-regulatory realism. It asks a deceptively simple question—do current alignment techniques actually constrain model behavior in the ways we think they do?—and then proceeds to make that question uncomfortable. ...

January 26, 2026 · 3 min · Zelina

Judging the Judges: When AI Evaluation Becomes a Fingerprint

Opening — Why this matters now
LLM-as-judge has quietly become infrastructure. It ranks models, filters outputs, trains reward models, and increasingly decides what ships. The industry treats these judges as interchangeable instruments—different thermometers measuring the same temperature. This paper suggests that assumption is not just wrong, but dangerously so. Across thousands of evaluations, LLM judges show near-zero agreement with each other, yet striking consistency with themselves. They are not noisy sensors of a shared truth. They are stable, opinionated evaluators—each enforcing its own private theory of quality. ...
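As a rough illustration of that claim (not the paper's actual protocol), the sketch below contrasts agreement between two different judges with a single judge's agreement with itself across re-runs, using Cohen's kappa on toy pass/fail verdicts. All judge names and verdicts are made up.

```python
# Minimal sketch (not the paper's protocol): contrast inter-judge agreement
# with intra-judge self-consistency using Cohen's kappa on toy verdicts.
from sklearn.metrics import cohen_kappa_score

# Hypothetical verdicts (1 = accept, 0 = reject) over the same 12 outputs.
judge_a_run1 = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
judge_a_run2 = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1]  # same judge, re-run
judge_b_run1 = [0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0]  # a different judge

inter_judge = cohen_kappa_score(judge_a_run1, judge_b_run1)        # across judges
self_consistency = cohen_kappa_score(judge_a_run1, judge_a_run2)   # within one judge

print(f"inter-judge kappa:      {inter_judge:.2f}")
print(f"self-consistency kappa: {self_consistency:.2f}")
```

If the paper's finding holds, the first number sits near zero while the second stays high, which is exactly the "fingerprint" in the title.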

January 10, 2026 · 4 min · Zelina

Thinking Without Understanding: When AI Learns to Reason Anyway

Opening — Why this matters now
For years, debates about large language models (LLMs) have circled the same tired question: Do they really understand what they’re saying? The answer—still no—has been treated as a conversation stopper. But recent “reasoning models” have made that question increasingly irrelevant. A new generation of AI systems can now reason through problems step by step, critique their own intermediate outputs, and iteratively refine solutions. They do this without grounding, common sense, or symbolic understanding—yet they still solve tasks previously reserved for humans. That contradiction is not a bug in our theory of AI. It is a flaw in our theory of reasoning. ...
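The loop the excerpt describes can be sketched in a few lines. The `call_model` function below is a placeholder for any LLM API, and the stopping heuristic is an assumption of this sketch, not something taken from the paper.

```python
# Minimal sketch of a generate-critique-refine loop, assuming a generic
# text-completion function `call_model` (placeholder, not a real API).
def call_model(prompt: str) -> str:
    """Stand-in for an LLM call; replace with your provider's client."""
    return "DRAFT: ..."  # dummy output so the sketch runs end to end

def solve_with_refinement(task: str, max_rounds: int = 3) -> str:
    draft = call_model(f"Solve step by step:\n{task}")
    for _ in range(max_rounds):
        critique = call_model(f"Task:\n{task}\n\nDraft:\n{draft}\n\nList concrete flaws.")
        if "no flaws" in critique.lower():
            break  # the model judges its own draft acceptable
        draft = call_model(
            f"Task:\n{task}\n\nDraft:\n{draft}\n\nCritique:\n{critique}\n\nRevise the draft."
        )
    return draft

print(solve_with_refinement("Sort [3, 1, 2] and explain each swap."))
```

Nothing in this loop consults a world model or a grounded representation; it only rewrites text conditioned on text, which is precisely why its practical success is so awkward for our theory of reasoning.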

January 6, 2026 · 4 min · Zelina

When Models Start Remembering: The Quiet Rise of Adaptive AI

Opening — Why this matters now
For years, we have treated AI models like polished machines: train once, deploy, monitor, repeat. That worldview is now visibly cracking. The paper behind this article lands squarely on this fault line, arguing—quietly but convincingly—that modern AI systems are no longer well-described as static functions. They are processes. And processes remember. ...

January 4, 2026 · 3 min · Zelina

Ethics Isn’t a Footnote: Teaching NLP Responsibility the Hard Way

Opening — Why this matters now
Ethics in AI is having a moment. Codes of conduct, bias statements, safety benchmarks, model cards—our industry has never been more concerned with responsibility. And yet, most AI education still treats ethics like an appendix: theoretically important, practically optional. This paper makes an uncomfortable point: you cannot teach ethical NLP by lecturing about it. Responsibility is not absorbed through slides. It has to be practiced. ...

January 2, 2026 · 4 min · Zelina

When Models Look Back: Memory, Leakage, and the Quiet Failure Modes of LLM Training

Opening — Why this matters now
Large language models are getting better at many things—reasoning, coding, multi‑modal perception. But one capability remains quietly uncomfortable: remembering things they were never meant to remember. The paper underlying this article dissects memorization not as a moral failure or an anecdotal embarrassment, but as a structural property of modern LLM training. The uncomfortable conclusion is simple: memorization is not an edge case. It is a predictable outcome of how we scale data, objectives, and optimization. ...
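The excerpt does not say how the paper quantifies memorization, but one common operationalization is verbatim extraction: prompt the model with a training prefix and check whether it reproduces the true continuation. A minimal sketch of that check, applied to continuations you have already generated, with purely hypothetical data:

```python
# Minimal sketch of a verbatim-extraction check (one common way to
# operationalize memorization; not necessarily the paper's definition).
def memorized(true_continuation: str, model_continuation: str, k: int = 50) -> bool:
    """Flag as memorized if the first k characters match exactly."""
    return model_continuation[:k] == true_continuation[:k]

def extraction_rate(pairs: list[tuple[str, str]], k: int = 50) -> float:
    """Share of (true, generated) continuation pairs reproduced verbatim."""
    hits = sum(memorized(t, g, k) for t, g in pairs)
    return hits / len(pairs) if pairs else 0.0

# Toy example: two continuations, one reproduced verbatim.
pairs = [
    ("Alice's phone number is 555-0199 and her", "Alice's phone number is 555-0199 and her"),
    ("the quick brown fox jumps over the lazy dog", "a fast brown fox leapt over a sleeping dog"),
]
print(f"extraction rate: {extraction_rate(pairs, k=20):.2f}")  # 0.50
```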

December 30, 2025 · 3 min · Zelina

Reading the Room? Apparently Not: When LLMs Miss Intent

Opening — Why this matters now
Large Language Models are increasingly deployed in places where misunderstanding intent is not a harmless inconvenience, but a real risk. Mental‑health support, crisis hotlines, education, customer service, even compliance tooling—these systems are now expected to “understand” users well enough to respond safely. The uncomfortable reality: they don’t. The paper behind this article demonstrates something the AI safety community has been reluctant to confront head‑on: modern LLMs are remarkably good at sounding empathetic while being structurally incapable of grasping what users are actually trying to do. Worse, recent “reasoning‑enabled” models often amplify this failure instead of correcting it. ...

December 25, 2025 · 4 min · Zelina

Benchmarks on Quicksand: Why Static Scores Fail Living Models

Opening — Why this matters now
If you feel that every new model release breaks yesterday’s leaderboard, congratulations: you’ve discovered the central contradiction of modern AI evaluation. Benchmarks were designed for stability. Models are not. The paper behind this article dissects this mismatch with academic precision—and a slightly uncomfortable conclusion: static benchmarks are no longer fit for purpose. ...

December 15, 2025 · 3 min · Zelina

Paper Tigers or Compliance Cops? What AIReg‑Bench Really Says About LLMs and the EU AI Act

The gist
AIReg‑Bench proposes the first benchmark for a deceptively practical task: can an LLM read technical documentation and judge how likely it is that an AI system complies with specific EU AI Act articles? The dataset avoids buzzword theater: 120 synthetic but expert‑vetted excerpts portraying high‑risk systems, each labeled by three legal experts on a 1–5 compliance scale (plus plausibility). Frontier models are then asked to score the same excerpts. The headline: the best models reach human‑like agreement on ordinal compliance judgments—under some conditions. That’s both promising and dangerous. ...
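The excerpt describes an ordinal agreement task: three expert labels per excerpt on a 1–5 scale, with model scores compared against them. One natural way to measure "human‑like agreement" is quadratic‑weighted Cohen's kappa against an expert consensus; the metric choice and toy numbers below are illustrative assumptions, not AIReg‑Bench's reported setup.

```python
# Illustrative sketch: ordinal agreement between an LLM's 1-5 compliance scores
# and an expert consensus, via quadratic-weighted Cohen's kappa.
# (Metric choice and toy scores are assumptions, not AIReg-Bench's reported setup.)
import numpy as np
from sklearn.metrics import cohen_kappa_score

expert_scores = np.array([
    [5, 4, 5], [2, 2, 3], [1, 1, 2], [4, 4, 4], [3, 2, 3], [5, 5, 4],
])                                                            # three experts per excerpt
consensus = np.round(expert_scores.mean(axis=1)).astype(int)  # simple consensus label

llm_scores = np.array([5, 2, 1, 4, 2, 5])                     # hypothetical model output

kappa = cohen_kappa_score(consensus, llm_scores, weights="quadratic")
print(f"quadratic-weighted kappa vs expert consensus: {kappa:.2f}")
```

For the result to count as "human‑like", the same statistic would also be computed between pairs of experts as a reference point.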

October 9, 2025 · 5 min · Zelina

Options = Power: Turning Empowerment into a KPI for AI Agents

If your agents can reach more valuable futures with fewer steps, they’re stronger—whether you measured that task or not. Today’s paper offers a clean way to turn that intuition into a number: empowerment—an information‑theoretic score of how much an agent’s current action shapes its future states. The authors introduce EELMA, a scalable estimator that works purely from multi‑turn text traces. No bespoke benchmark design. No reward hacking. Just trajectories. This is the kind of metric we’ve wanted at Cognaptus: goal‑agnostic, scalable, and diagnostic. Below, I translate EELMA into an operator’s playbook: what it is, why it matters for business automation, how to wire it into your stack, and where it can mislead you if left unmanaged. ...
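EELMA itself is a learned, scalable estimator over multi‑turn text traces; as a toy illustration of the quantity it targets, the sketch below computes a plug‑in estimate of the mutual information between an agent's (discretized) action and the (discretized) future state it leads to. All action and state labels here are hypothetical.

```python
# Toy plug-in estimate of empowerment's core quantity, the mutual information
# I(action; future state), from logged (action, next_state) pairs.
# EELMA's actual estimator is learned and works over text traces; this only
# illustrates the information-theoretic target, with hypothetical data.
from collections import Counter
from math import log2

pairs = [  # (discretized action, discretized future state) from agent traces
    ("search", "found_doc"), ("search", "found_doc"), ("search", "no_doc"),
    ("ask_user", "clarified"), ("ask_user", "clarified"), ("wait", "no_doc"),
    ("wait", "no_doc"), ("search", "found_doc"),
]

n = len(pairs)
p_joint = Counter(pairs)
p_a = Counter(a for a, _ in pairs)
p_s = Counter(s for _, s in pairs)

# I(A; S') = sum over (a, s) of p(a, s) * log2( p(a, s) / (p(a) * p(s)) )
mi = sum(
    (c / n) * log2((c / n) / ((p_a[a] / n) * (p_s[s] / n)))
    for (a, s), c in p_joint.items()
)
print(f"estimated I(action; future state): {mi:.3f} bits")
```

Higher values mean the agent's current choice more strongly determines which futures are reachable, which is the sense in which options become power.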

October 3, 2025 · 5 min · Zelina