Cover image

When Models Know They’re Wrong: Catching Jailbreaks Mid-Sentence

Guardrails usually fail quietly. A user sends a malicious prompt. The model begins answering. The safety policy that looked firm in the demo environment starts behaving like office wallpaper: present, decorative, and not especially involved. By the time a post-hoc filter reads the final answer, the model has already produced the thing it should not have produced. The system may block the response from the user, but the real lesson is less flattering: the model crossed the line before the defense noticed. ...

January 16, 2026 · 3 min · Zelina
Cover image

Thinking Without Understanding: When AI Learns to Reason Anyway

A meeting room is not a philosophy seminar, which is fortunate, because most companies would not survive one. A manager asks an AI system to analyze a contract, debug a workflow, compare vendors, or draft a risk memo. The system pauses, breaks the task into steps, checks an assumption, rejects one path, and returns a structured answer. Someone in the room says: “But it does not really understand.” ...

January 6, 2026 · 17 min · Zelina
Cover image

Rules of Attraction: How LLMs Learn to Judge Better Than We Do

Rubrics are supposed to make judgment boring. That is their charm. A good rubric tells a teacher why one essay deserves a 5 instead of a 3, tells a compliance reviewer why one response is acceptable and another is risky, and tells an internal QA team why a generated summary is useful rather than merely confident. In business, boring judgment is valuable. It scales. It can be audited. It survives employee turnover. It does not wake up one morning and decide that “clarity” now means “vibes with a semicolon.” ...

December 2, 2025 · 15 min · Zelina
Cover image

When Agents Get Bored: Three Baselines Your Autonomy Stack Already Has

Idle time is not empty time. Anyone who has managed a human team already knows this. Leave a capable person with no clear assignment and they may tidy the backlog, invent a side project, interrogate the process, or spend the afternoon constructing a philosophy of why the calendar is oppressive. Large language model agents, apparently, have their own version of this behaviour. Less caffeine, more JSON, same managerial problem. ...

October 2, 2025 · 19 min · Zelina
Cover image

OmniAvatar’s Metrics & Training: Under the Hood of Next-Gen Avatars

TL;DR for operators OmniAvatar is best read as a shift from “make the mouth move” to “make the person perform.” The paper introduces an audio-driven avatar video generation system that takes a reference image, an audio clip, and a text prompt, then generates facial and semi-body video with synchronised speech, adaptive body motion, and prompt-controlled scene elements.1 ...

June 24, 2025 · 16 min · Zelina