Hard Problems Pay Better: Why Difficulty-Aware DPO Fixes Multimodal Hallucinations

Opening — Why this matters now

Multimodal large language models (MLLMs) are getting better at seeing—but not necessarily at knowing. Despite steady architectural progress, hallucinations remain stubbornly common: models confidently describe objects that do not exist, infer relationships never shown, and fabricate visual details with unsettling fluency. The industry response has been predictable: more preference data, more alignment, more optimization. ...
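To make the title's idea concrete before diving in, here is a minimal sketch of a difficulty-weighted DPO objective: the standard DPO loss with each preference pair re-weighted by an estimated difficulty score. Everything here is an illustrative assumption; the function name, the `1 + difficulty` weighting, and how difficulty would be scored are placeholders, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def difficulty_weighted_dpo_loss(
    policy_chosen_logps: torch.Tensor,    # log pi_theta(y_w | x), shape (B,)
    policy_rejected_logps: torch.Tensor,  # log pi_theta(y_l | x), shape (B,)
    ref_chosen_logps: torch.Tensor,       # log pi_ref(y_w | x), shape (B,)
    ref_rejected_logps: torch.Tensor,     # log pi_ref(y_l | x), shape (B,)
    difficulty: torch.Tensor,             # assumed per-pair score in [0, 1], shape (B,)
    beta: float = 0.1,
) -> torch.Tensor:
    # Standard DPO: implicit rewards are policy/reference log-ratios, scaled by beta.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Per-pair Bradley-Terry loss: -log sigmoid(reward margin).
    per_pair_loss = -F.logsigmoid(chosen_rewards - rejected_rewards)
    # Hypothetical difficulty weighting: hard pairs contribute up to 2x the gradient.
    weights = 1.0 + difficulty
    return (weights * per_pair_loss).mean()
```

The intuition behind this kind of weighting matches the title: easy pairs that the model already orders correctly carry little gradient signal, so letting hard pairs "pay better" concentrates optimization on the cases where hallucinations actually survive.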

January 5, 2026 · 4 min · Zelina