
The Sink That Remembers: Solving LLM Memorization Without Forgetting Everything Else

TL;DR for operators: Deletion is simple in a database. It is not simple in a neural network that has already used the deleted record to improve its internal machinery. That is the unpleasant little invoice this paper presents. Gaurav R. Ghosal, Pratyush Maini, and Aditi Raghunathan study why repeated natural text is hard to remove from language models after training, then propose MemSinks, a training-time mechanism designed to make memorization easier to isolate later.[1] The important shift is not “better pruning.” It is architectural accounting. Instead of hoping that memorized text happens to live in a few removable neurons, MemSinks gives repeated sequences a controlled place to accumulate memorization during training. ...

July 15, 2025 · 19 min · Zelina

Thinking Inside the Gameboard: Evaluating LLM Reasoning Step-by-Step

TL;DR for operators: Most AI evaluations still ask too narrow a question: did the model get the answer right? That is useful, but it is not enough when the model is expected to act as an agent, revise plans, obey constraints, and recover from failure without turning the workflow into a procedural bonfire. ...

June 20, 2025 · 16 min · Zelina

What Happens in Backtests… Misleads in Live Trades

TL;DR for operators: A beautiful backtest can still be a lie. Not because the model is malicious, obviously; spreadsheets have not yet formed a union. The problem is simpler and more expensive: a model can fit past data while misrepresenting the thing you actually care about. Charles Rathkopf’s paper on hallucination and reliability in scientific generative AI gives operators a useful way to think about this problem.[1] It argues that hallucination should not be defined mainly as deviation from training data. In science, and in business domains that behave like science, the real question is whether an output misrepresents the target phenomenon: a protein, a weather system, a molecule, a patient, a market, a factory, a supply chain. ...

April 15, 2025 · 17 min · Zelina