AI Governance

When Models Listen but Stop Thinking: Teaching Audio Models to Reason Like They Read

Opening — Why this matters now

Audio-first interfaces are everywhere. Voice assistants, call-center bots, in-car copilots, and accessibility tools all rely on large audio-language models (LALMs) that promise to hear and think at the same time. Yet in practice, something awkward happens: the same model that reasons fluently when reading text suddenly becomes hesitant, shallow, or just wrong when listening to speech. ...

When SGD Remembers: The Hidden Memory Inside Training Dynamics

Opening — Why this matters now

Modern deep learning quietly assumes a comforting fiction: that training is memoryless. Given the current parameters (and maybe the optimizer buffers), tomorrow’s update shouldn’t care about yesterday’s data order, augmentation choice, or micro-step path. This assumption underwrites theory, stabilizes intuition, and keeps whiteboards clean. Reality, however, has been less cooperative. Practitioners know that order matters, momentum carries ghosts of past gradients, and small curriculum tweaks can echo far longer than expected. Yet until now, there has been no clean, operational way to measure whether training truly forgets—or merely pretends to. ...

When Trains Meet Snowstorms: Turning Weather Chaos into Predictable Rail Operations

Opening — Why this matters now

Railway delays are one of those problems everyone experiences and almost no one truly understands. Passengers blame weather. Operators blame operations. Data scientists blame missing variables. Everyone is partially correct. What has quietly shifted in recent years is not the weather itself, but our ability to observe it alongside operations—continuously, spatially, and at scale. As rail systems push toward AI‑assisted scheduling, predictive maintenance, and real‑time disruption management, delay prediction without weather is no longer just incomplete—it is structurally misleading. ...

Training Models to Explain Themselves: Counterfactuals as a First-Class Objective

Opening — Why this matters now

As AI systems increasingly decide who gets a loan, a job interview, or access to public services, explanations have stopped being a philosophical luxury. They are now a regulatory, ethical, and operational requirement. Counterfactual explanations—“If your income were $5,000 higher, the loan would have been approved”—have emerged as one of the most intuitive tools for algorithmic recourse. ...

Triage by Token: When Context Clues Quietly Override Clinical Judgment

Opening — Why this matters now

Large language models are quietly moving from clerical assistance to clinical suggestion. In emergency departments (EDs), where seconds matter and triage decisions shape outcomes, LLM-based decision support tools are increasingly tempting: fast, consistent, and seemingly neutral. Yet neutrality in language does not guarantee neutrality in judgment. This paper interrogates a subtle but consequential failure mode: latent bias introduced through proxy variables. Not overt racism. Not explicit socioeconomic labeling. Instead, ordinary contextual cues—how a patient arrives, where they live, how often they visit the ED—nudging model outputs in clinically unjustified ways. ...

Affective Inertia: Teaching LLM Agents to Remember Who They Are

Opening — Why this matters now

LLM agents are getting longer memories, better tools, and more elaborate planning stacks—yet they still suffer from a strangely human flaw: emotional whiplash. An agent that sounds empathetic at turn 5 can become oddly cold at turn 7, then conciliatory again by turn 9. For applications that rely on trust, continuity, or persuasion—mental health tools, tutors, social robots—this instability is not a cosmetic issue. It’s a structural one. ...

Prompt Wars: When Pedagogy Beats Cleverness

Opening — Why this matters now

Educational AI has entered its prompt era. Models are powerful, APIs are cheap, and everyone—from edtech startups to university labs—is tweaking prompts like seasoning soup. The problem? Most of this tweaking is still artisanal. Intuition-heavy. Barely documented. And almost never evaluated with the same rigor we expect from the learning science it claims to support. ...

Seeing Is Misleading: When Climate Images Need Receipts

Opening — Why this matters now

Climate misinformation has matured. It no longer argues; it shows. A melting glacier with the wrong caption. A wildfire image from another decade. A meme that looks scientific enough to feel authoritative. In an era where images travel faster than footnotes, public understanding of climate science is increasingly shaped by visuals that lie by omission, context shift, or outright fabrication. ...

Auditing the Illusion of Forgetting: When Unlearning Isn’t Enough

Opening — Why this matters now

“Right to be forgotten” has quietly become one of the most dangerous phrases in AI governance. On paper, it sounds clean: remove a user’s data, comply with regulation, move on. In practice, modern large language models (LLMs) have turned forgetting into a performance art. Models stop saying what they were trained on—but continue remembering it internally. ...

DISARM, but Make It Agentic: When Frameworks Start Doing the Work

Opening — Why this matters now

Foreign Information Manipulation and Interference (FIMI) has quietly evolved from a niche security concern into a persistent, high‑tempo operational problem. Social media platforms now host influence campaigns that are faster, cheaper, and increasingly AI‑augmented. Meanwhile, defenders are expected to produce timely, explainable, and interoperable assessments—often across national and institutional boundaries. ...