When Bigger Isn’t Smarter: Stress‑Testing LLMs in the ICU
Opening: Why this matters now

Healthcare AI has entered its foundation model phase. LLMs trained on trillions of tokens are being casually proposed for everything from triage to prognosis, often with an implicit assumption: bigger models must understand patients better. This paper quietly punctures that assumption. By benchmarking LLMs against smaller, task‑focused language models (SLMs) on shock prediction in ICUs, the authors confront a question most vendors avoid: do LLMs actually predict future clinical deterioration better, or do they merely sound more convincing? ...