RxnBench: Reading Chemistry Like a Human (Turns Out That’s Hard)

Opening: Why this matters now

Multimodal Large Language Models (MLLMs) have become impressively fluent readers of the world. They can caption images, parse charts, and answer questions about documents that would once have required a human analyst and a strong coffee. Naturally, chemistry was next. But chemistry does not speak in sentences. It speaks in arrows, wedges, dashed bonds, cryptic tables, and reaction schemes buried three pages away from their explanations. If we want autonomous “AI chemists,” the real test is not trivia or SMILES strings; it is whether models can read actual chemical papers. ...

December 31, 2025 · 4 min · Zelina