Chart Check: Why Clinical Summaries Need Detectors Before Alignment

Chart review is the boring part of medicine, which is exactly why AI systems should learn from it. A clinical discharge summary does not fail only when it sounds clumsy. It fails when it tells a patient something that did not happen, invents a medication change, adds a procedure, misstates a timing detail, or turns a vague note into a confident medical fact. The prose may still be smooth. The bedside manner may even be excellent. Unfortunately, a hallucination delivered in fluent patient-friendly language is not safer because it has better manners. ...

June 2, 2026 · 17 min · Zelina

The Heart of the Model: ECG Foundation Models Need the Right Backbone Before More Data

Cost is not always about size. That is an inconvenient sentence for anyone trying to sell a larger medical foundation model by waving parameter counts like a hospital procurement trophy. In ECG modeling, the expensive question is not simply whether one can pretrain on more recordings. The harder question is whether the model architecture and pretraining task actually match the structure of the signal. ...

May 24, 2026 · 14 min · Zelina

MARCH Orders: When AI Holds a CT Case Conference

The useful meeting, unfortunately, exists. Meetings are usually where productivity goes to file a complaint. But there is one kind of meeting that high-stakes work still needs: the review session where a first draft is challenged, evidence is checked, and a senior decision-maker signs off. Radiology has long understood this. A resident may draft the report. A fellow may question the interpretation. An attending radiologist resolves the remaining uncertainty. The point is not ceremony. The point is controlled disagreement. ...

April 22, 2026 · 16 min · Zelina

The Cost of Playing It Safe: When AI Safety Creates Harm

Refusal looks safe. That is the problem. A user says they have run out of ordinary options: the specialist is gone, the appointment is weeks away, the emergency department has already sent them home, and the remaining medication supply is not enough to bridge the gap. The user asks an AI system what to do. The model refuses to provide concrete guidance and recommends the same professional route the user has just explained is unavailable. ...

April 11, 2026 · 14 min · Zelina

When AI Starts Writing Papers: The Rise of the Medical AI Scientist

Papers used to have a useful quality: they were difficult to produce. Not always good, unfortunately, but difficult. Someone had to identify a problem, read the literature, design the method, write the code, run the experiment, repair the code, compare the result, draw the figures, write the manuscript, and then survive peer review with only minor emotional damage. ...

March 31, 2026 · 16 min · Zelina

When EEG Stops Thinking in Squares: Why Linear-Time Models Are Quietly Winning

The hospital problem is not that EEG is too small. It is that EEG refuses to stay the same shape. A hospital does not run machine learning inside a clean benchmark. It runs it across devices, departments, vendors, technicians, recording protocols, and patients who rarely behave like textbook signals. Electroencephalography, or EEG, makes this especially inconvenient. The signal is long, noisy, clinically useful, and structurally inconsistent. Different datasets may use different electrode counts. Different institutions may follow different montage conventions. A model that looks competent on one electrode layout can become less confident when the scalp is wired slightly differently. Apparently, brains did not agree to standardize themselves for our convenience. ...

March 20, 2026 · 16 min · Zelina

When AI Meets the Delivery Room: Designing Safe LLM Chatbots for Maternal Health

A patient does not usually send a neatly structured medical case report. She sends a short message. “Baby moving less today.” “Severe headache and blurred vision.” “What foods increase iron?” To a normal chatbot, these are three user queries. To a maternal-health system, they are three different operating modes. One can be answered with general education. One may require urgent escalation. One may be harmless—or not—depending on pregnancy stage, timing, severity, and missing context. This is where the usual AI product fantasy quietly breaks down: the hardest part is not producing a fluent answer. The hardest part is deciding whether the system should answer at all. ...

March 16, 2026 · 17 min · Zelina

From Hallucination to Verification: Why AI Needs a Pharmacist’s Mindset

Prescription checks are a good way to humble AI. Not because the language is impossible. Drug labels, clinical notes, dosage instructions, contraindications, and interaction warnings are all text-heavy. LLMs are good at text. That part is not the problem. The problem is that prescription verification is not a writing task. It is a safety task disguised as a reading task. A pharmacist is not merely asking, “Does this paragraph sound medically reasonable?” The real question is narrower and harsher: given this patient, this drug, this dose, this route, this timing, this interaction profile, and this missing or available clinical data, is there a specific safety issue that must be raised? ...

March 13, 2026 · 17 min · Zelina

Silver Bots: When Agentic AI Becomes the Caregiver

Medication is simple until someone forgets it twice, sleeps badly, skips breakfast, and says they feel “fine.” That is the real texture of elderly care. It is not one clean signal. It is a slow accumulation of weak signals: changed gait, missed pills, restless sleep, lower appetite, vague pain, repeated questions, a daughter who cannot visit this week, a nurse covering too many rooms, a home that is technically “smart” but not exactly wise. ...

March 7, 2026 · 15 min · Zelina

Brains, Bias & Benchmarks: Why Multimodal AI Still Struggles with Tumor Truth

MRI is a useful reality check for multimodal AI. It looks like an image problem, behaves like a reasoning problem, and punishes lazy confidence with the quiet brutality of clinical ambiguity. That is why MM-NeuroOnco is more interesting than another “new benchmark” headline.¹ The paper introduces a multimodal instruction dataset and benchmark for MRI-based brain tumor diagnosis, but the dataset size is not the main story. Yes, the authors curate a 73,226-image pool, build 24,726 semantically attributed samples, generate more than 200,000 VQA pairs, and construct a 1,000-image benchmark with more than 3,000 questions. Fine. The spreadsheet is muscular. ...

March 1, 2026 · 18 min · Zelina