Healthcare AI

SAFE Enough to Think: Federated Learning Comes for Your Brain

Hospitals do not usually wake up excited to pool brain data. Neither do device vendors, rehabilitation centers, or anyone with a lawyer who has read a privacy regulation without falling asleep halfway through. EEG data is useful precisely because it is personal. That is also why centralizing it is awkward. This is the practical tension behind SAFE, short for Secure and Accurate Federated Learning, a proposed framework for EEG-based brain-computer interfaces, or BCIs. The paper is not interesting because it says “federated learning protects privacy.” That line has already been printed on enough PowerPoint slides to qualify as industrial wallpaper. The interesting part is that the authors treat federated learning as only one piece of the problem. ...

Regrets, Graphs, and the Price of Privacy: Federated Causal Discovery Grows Up

A hospital changes its treatment protocol. Another keeps the old one. A third removes an approval step that had quietly influenced several downstream decisions. Their datasets now disagree. The usual federated-learning instinct is to treat that disagreement as a problem: smooth it, average it, or design an aggregation rule robust enough to survive it. In causal discovery, however, some disagreements contain precisely the information the global model lacks. Removing a local dependency can expose a previously hidden causal pattern. A policy difference that looks like statistical inconvenience may function as an accidental experiment. ...

When 100% Sensitivity Isn’t Safety: How LLMs Fail in Real Clinical Work

Clinic. That is where the comforting AI story starts to wobble. In a benchmark, a clinical model receives a clean question, enough context, and a scoring rule that usually rewards the right answer. In a clinic, the same model sees an elderly patient with multiple conditions, incomplete records, medication changes from years ago, possible specialist involvement, ambiguous prescribing history, and a problem that may not require action at all. The model is not merely being asked, “Can you spot a risk?” It is being asked, “Do you understand whether this risk is real, current, important, and safely actionable?” ...

Think Before You Beam: When AI Learns to Plan Like a Physicist

Beam planning sounds like the sort of work automation should have solved years ago. There is a target. There are organs at risk. There are dose constraints. There is an optimizer. Surely the machine should find the best plan while humans do something more dignified than nudging parameters inside a treatment planning system for the seventeenth time. ...

When Bigger Isn’t Smarter: Stress‑Testing LLMs in the ICU

A hospital does not buy “intelligence.” It buys a workflow. That distinction sounds obvious until an AI vendor arrives with a model that has billions of parameters, a clinical pretraining story, and the gentle implication that smaller models are now museum pieces. In the ICU, however, the useful question is not whether the model can talk like a doctor. It is whether it can detect tomorrow’s clinical deterioration from messy notes better than simpler systems that cost less, run faster, and attract fewer infrastructure headaches. ...

Black Boxes, White Coats: AI Epidemiology and the Art of Governing Without Understanding

A hospital does not need a perfect theory of neural network internals before it can notice that one clinical AI keeps recommending the wrong kind of follow-up. A bank does not need to decode every transformer layer before it can see that a credit assistant behaves oddly around post-bankruptcy applicants. A regulator does not need metaphysics. It needs repeatable measurements. ...

When Tools Think Before Tokens: What TxAgent Teaches Us About Safe Agentic AI

Tools are supposed to make AI safer. That is the sales pitch, anyway. Give the model access to curated biomedical databases, let it call APIs instead of hallucinating from memory, and clinical reasoning suddenly becomes more grounded. Less improvisation, more evidence. Less theatrical confidence, more traceable work. ...

Lost in Translation: When Multilingual LLMs Miss the Medical Plot

Accuracy is a seductive number. It is tidy, executive-friendly, and easy to put in a slide deck. A model gets 82% accuracy, someone says “good enough,” and suddenly a clinical workflow is being “transformed.” Healthcare, as usual, has a way of punishing this kind of optimism. Not loudly at first. Quietly. Through false negatives, silent majority-class prediction, and a dashboard that looks reassuring until someone asks the rude question: what exactly did the model miss? ...

Fog of Neuro: Why Speech May Become the Next MRI

Speech is a strange medical instrument. It does not look like one. It does not come with a scanner room, a radiology report, or a patient lying very still while a machine complains loudly. It comes out in ordinary life: a story, a pause, a word search, a sentence that loses its thread halfway through. For many neurological conditions, especially rare metabolic and neurodegenerative diseases, that ordinary speech may contain something today’s clinic often misses: the patient’s real cognitive state between appointments. ...

Scan, Plan, Report: When Agentic AI Starts Thinking Like a Radiologist

Report writing is the visible part of radiology. It is also the part easiest for AI vendors to misunderstand. A radiology report looks like text, so the naive automation pitch is obvious: give the CT scan to a vision-language model, ask for a report, and let the model type faster than a human. Congratulations, we have reinvented autocomplete with more liability. ...