Gradient Customs: AlphaToken Checks Which Tokens Are Allowed to Train
Fine-tuning looks deceptively democratic. Every response token gets its little vote in the gradient. The commas, the boilerplate, the obvious connective tissue, the wrong kind of certainty, the genuinely task-bearing step in the middle of the answer: all are invited to update the model. A charmingly egalitarian arrangement. Also a rather efficient way to teach a model to forget things it used to know. ...