
GAVEL: When AI Safety Grows a Rulebook

Opening — Why this matters now AI safety is drifting toward an uncomfortable paradox. The more capable large language models become, the less transparent their internal decision-making appears — and the more brittle our existing safeguards feel. Text-based moderation catches what models say, not what they do. Activation-based safety promised to fix this, but in practice it has inherited many of the same flaws: coarse labels, opaque triggers, and painful retraining cycles. ...

February 2, 2026 · 4 min · Zelina

Grading the Doctor: How Health-SCORE Scales Judgment in Medical AI

Opening — Why this matters now Healthcare LLMs have a credibility problem. Not because they cannot answer medical questions—many now ace exam-style benchmarks—but because real medicine is not a multiple-choice test. It is open-ended, contextual, uncertain, and unforgiving. In that setting, how a model reasons, hedges, and escalates matters as much as what it says. ...

February 2, 2026 · 4 min · Zelina

When Models Start Remembering Too Much

Opening — Why this matters now Large language models are no longer judged solely by what they can generate, but by what they remember. As models scale and datasets balloon, a quiet tension has emerged: memorization boosts fluency and benchmark scores, yet it also raises concerns around data leakage, reproducibility, and governance. The paper examined here steps directly into that tension, asking not whether memorization exists — that debate is settled — but where, how, and why it concentrates. ...

February 2, 2026 · 3 min · Zelina

When Empathy Needs a Map: Benchmarking Tool‑Augmented Emotional Support

Opening — Why this matters now Emotional support from AI has quietly moved from novelty to expectation. People vent to chatbots after work, during grief, and in moments of burnout—not to solve equations, but to feel understood. Yet something subtle keeps breaking trust. The responses sound caring, but they are often wrong in small, revealing ways: the time is off, the location is imagined, the suggestion doesn’t fit reality. Empathy without grounding turns into polite hallucination. ...

February 1, 2026 · 4 min · Zelina

Metric Time Without the Clock: Making ASP Scale Again

Opening — Why this matters now Temporal reasoning has always been the Achilles’ heel of symbolic AI. The moment time becomes quantitative—minutes, deadlines, durations—logic programs tend to balloon, grounders panic, and scalability quietly exits the room. This paper lands squarely in that discomfort zone and does something refreshingly unglamorous: it makes time boring again. And boring, in this case, is good for business. ...

January 31, 2026 · 3 min · Zelina

When LLMs Invent Languages: Efficiency, Secrecy, and the Limits of Natural Speech

Opening — Why this matters now Large language models are supposed to speak our language. Yet as they become more capable, something uncomfortable emerges: when pushed to cooperate efficiently, models often abandon natural language altogether. This paper shows that modern vision–language models (VLMs) can spontaneously invent task-specific communication protocols—compressed, opaque, and sometimes deliberately unreadable to outsiders—without any fine-tuning. Just prompts. ...

January 31, 2026 · 3 min · Zelina

CAR-bench: When Agents Don’t Know What They Don’t Know

Opening — Why this matters now LLM agents are no longer toys. They book flights, write emails, control vehicles, and increasingly operate in environments where getting it mostly right is not good enough. In real-world deployments, the failure mode that matters most is not ignorance—it is false confidence. Agents act when they should hesitate, fabricate when they should refuse, and choose when they should ask. ...

January 30, 2026 · 4 min · Zelina

Safety by Design, Rewritten: When Data Defines the Boundary

Opening — Why this matters now Safety-critical AI has a credibility problem. Not because it fails spectacularly—though that happens—but because we often cannot say where it is allowed to succeed. Regulators demand clear operational boundaries. Engineers deliver increasingly capable models. Somewhere in between, the Operational Design Domain (ODD) is supposed to translate reality into something certifiable. ...

January 30, 2026 · 5 min · Zelina

The Patient Is Not a Moving Document: Why Clinical AI Needs World Models

Opening — Why this matters now Clinical AI has quietly hit a ceiling. Over the past five years, large language models trained on electronic health records (EHRs) have delivered impressive gains: better coding, stronger risk prediction, and even near‑physician exam performance. But beneath those wins lies an uncomfortable truth. Most clinical foundation models still treat patients as documents—static records to be summarized—rather than systems evolving over time. ...

January 30, 2026 · 4 min · Zelina

When Rewards Learn to Think: Teaching Agents *How* They’re Wrong

Opening — Why this matters now Agentic AI has a credibility problem. Not because agents can’t browse, code, or call tools—but because we still train them as if they were taking a final exam with no partial credit. Most agentic reinforcement learning (RL) systems reward outcomes, not process. Either the agent finishes the task correctly, or it doesn’t. For short problems, that’s tolerable. For long-horizon, tool-heavy reasoning tasks, it’s catastrophic. A single late-stage mistake erases an otherwise competent trajectory. ...

January 30, 2026 · 4 min · Zelina