
NeuralFOMO: When LLMs Care About Being Second

Opening — Why this matters now LLMs no longer live alone. They rank against each other on leaderboards, bid for tasks inside agent frameworks, negotiate in shared environments, and increasingly compete—sometimes quietly, sometimes explicitly. Once models are placed side-by-side, performance stops being purely absolute. Relative standing suddenly matters. This paper asks an uncomfortable question: do LLMs care about losing—even when losing costs them nothing tangible? ...

December 16, 2025 · 4 min · Zelina

When LLMs Stop Talking and Start Choosing Algorithms

Opening — Why this matters now Large Language Models are increasingly invited into optimization workflows. They write solvers, generate heuristics, and occasionally bluff their way through mathematical reasoning. But a more uncomfortable question has remained largely unanswered: do LLMs actually understand optimization problems—or are they just eloquent impostors? This paper tackles that question head‑on. Instead of judging LLMs by what they say, it examines what they encode. And the results are quietly provocative. ...

December 16, 2025 · 4 min · Zelina

When Medical AI Stops Guessing and Starts Asking

Opening — Why this matters now Medical AI has become very good at answering questions. Unfortunately, medicine rarely works that way. Pathology, oncology, and clinical decision-making are not single-query problems. They are investigative processes: observe, hypothesize, cross-check, revise, and only then conclude. Yet most medical AI benchmarks still reward models for producing one-shot answers — neat, confident, and often misleading. This mismatch is no longer academic. As multimodal models edge closer to clinical workflows, the cost of shallow reasoning becomes operational, regulatory, and ethical. ...

December 16, 2025 · 4 min · Zelina

When Precedent Gets Nuanced: Why Legal AI Needs Dimensions, Not Just Factors

Opening — Why this matters now Legal AI has a habit of oversimplifying judgment. In the race to automate legal reasoning, we have learned how to encode rules, then factors, and eventually hierarchies of factors. But something stubborn keeps leaking through the abstractions: strength. Not whether a reason exists — but how strongly it exists. ...

December 16, 2025 · 4 min · Zelina

When Reasoning Needs Receipts: Graphs Over Guesswork in Medical AI

Opening — Why this matters now Medical AI has a credibility problem. Not because large language models (LLMs) can’t answer medical questions—they increasingly can—but because they often arrive at correct answers for the wrong reasons. In medicine, that distinction is not academic. A shortcut that accidentally lands on the right diagnosis today can quietly institutionalize dangerous habits tomorrow. ...

December 16, 2025 · 3 min · Zelina

When Rewards Learn Back: Evolution, but With Gradients

Opening — Why this matters now Reinforcement learning has always had an uncomfortable secret: most of the intelligence is smuggled in through the reward function. We talk about agents learning from experience, but in practice, someone—usually a tired engineer—decides what “good behavior” numerically means. As tasks grow longer-horizon, more compositional, and more brittle to specification errors, this arrangement stops scaling. ...

December 16, 2025 · 4 min · Zelina

When Small Models Learn From Their Mistakes: Arithmetic Reasoning Without Fine-Tuning

Opening — Why this matters now Regulated industries love spreadsheets and hate surprises. Finance, healthcare, and insurance all depend on tabular data—and all have strict constraints on where that data is allowed to go. Shipping sensitive tables to an API-hosted LLM is often a non‑starter. Yet small, on‑prem language models have a reputation problem: they speak fluently but stumble over arithmetic. ...

December 16, 2025 · 3 min · Zelina

Benchmarks on Quicksand: Why Static Scores Fail Living Models

Opening — Why this matters now If it feels like every new model release breaks yesterday’s leaderboard, congratulations: you’ve discovered the central contradiction of modern AI evaluation. Benchmarks were designed for stability. Models are not. The paper under review dissects this mismatch with academic precision—and a slightly uncomfortable conclusion: static benchmarks are no longer fit for purpose. ...

December 15, 2025 · 3 min · Zelina

Green Is the New Gray: When ESG Claims Meet Evidence

Opening — Why this matters now Everyone suddenly cares about sustainability. Corporations issue glossy ESG reports, regulators publish directives, and investors nod approvingly at any sentence containing “net-zero.” The problem, of course, is that words are cheap. Greenwashing—claims that sound environmentally responsible while being misleading, partial, or outright false—has quietly become one of the most corrosive forms of corporate misinformation. Not because it is dramatic, but because it is plausible. And plausibility is exactly where today’s large language models tend to fail. ...

December 15, 2025 · 4 min · Zelina

Kill the Correlation, Save the Grid: Why Energy Forecasting Needs Causality

Opening — Why this matters now Energy forecasting is no longer a polite academic exercise. Grid operators are balancing volatile renewables, industrial consumers are optimizing costs under razor‑thin margins, and regulators are quietly realizing that accuracy without robustness is a liability. Yet most energy demand models still do what machine learning does best—and worst: optimize correlations and hope tomorrow looks like yesterday. This paper argues that hope is not a strategy. ...

December 15, 2025 · 4 min · Zelina