Mind the Loss Gap

TL;DR for operators: AI systems do not only fail because they are too small, too dumb, or insufficiently blessed by the gods of scale. They often fail because the formal objective supervises one slice of behavior and quietly leaves another slice unmanaged. Three recent papers make that point from different domains. MA-SBI shows how side-channel context can correct simulation-based inference when the simulator is misspecified.[1] A paper on non-adversarial LLM robustness shows that semantically neutral prompt changes can systematically shift internal module outputs, and that targeted debiasing can recover robustness without full retraining.[2] FiberTune shows that robot policy fine-tuning can preserve action-equivalent visual residuals that ordinary action loss is happy to compress into oblivion.[3] ...

June 25, 2026 · 14 min · Zelina

Spurious Minds: How Embedding Regularization Could Fix Bias at Its Roots

A hiring classifier works beautifully on average. A content moderation model passes global accuracy tests. A medical image model looks reassuringly competent across the validation set. Then someone asks the annoying question every serious deployment eventually faces: which group does it fail on? That is where average accuracy starts behaving like a corporate dashboard after a long lunch: technically present, emotionally comforting, and not especially interested in the unpleasant details. ...

November 8, 2025 · 16 min · Zelina