Auditing the Illusion of Forgetting: When Unlearning Isn’t Enough

Deletion requests sound simple until the model answers politely. A user asks for data to be removed. A publisher demands that copyrighted passages stop being reproduced. A compliance team wants evidence that a fine-tuned model no longer carries traces of a forbidden dataset. The model is run through an unlearning method, the surface tests improve, the dashboard turns less red, and everyone enjoys the brief spiritual comfort of a green checkmark. ...

January 22, 2026 · 17 min · Zelina

When Models Start to Forget: The Hidden Cost of Training LLMs Too Well

Duplicates are supposed to be boring. In data engineering, duplicate records are usually treated as a hygiene problem: remove them, clean the pipeline, reduce noise, move on. In language-model training, repetition is less innocent. Repeated text can help a model learn an underrepresented domain. It can also teach the model to reproduce specific sequences too well. Somewhere between “useful exposure” and “verbatim recall,” a model stops learning only the pattern and starts carrying around the document. ...

January 3, 2026 · 16 min · Zelina