Efficient Reasoning

TL;DR for operators

SynAdapt is not a paper about making models “think secretly” because mystery sells better on conference posters. It is a paper about inference budgeting: when a model should spend tokens explaining its reasoning, and when it can compress that reasoning into latent vectors and move on. The method trains a large language model to use synthetic continuous chain-of-thought (CCoT) as a dense internal reasoning representation instead of generating long natural-language reasoning traces. For easier problems, the model answers using this latent representation directly. For harder problems, a difficulty classifier detects that silent reasoning is likely insufficient and routes the question back to discrete chain-of-thought, with a prompt that keeps the re-thinking concise.1 ...
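
The routing decision described above can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's implementation. The names `route`, `RoutedAnswer`, `difficulty_score`, `answer_from_latent`, and `answer_with_cot` are hypothetical, and so are the `0.5` threshold and the prompt wording. It also assumes the classifier sees both the question and the latent representation. The model calls are passed in as plain callables so the control flow can be read, and run, without any model at all.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical wording: the source only says the prompt keeps re-thinking concise.
CONCISE_COT_PROMPT = "Reason step by step, but keep the reasoning brief."


@dataclass
class RoutedAnswer:
    mode: str    # "latent" or "discrete_cot"
    answer: str


def route(
    question: str,
    latent: Sequence[float],
    difficulty_score: Callable[[str, Sequence[float]], float],
    answer_from_latent: Callable[[str, Sequence[float]], str],
    answer_with_cot: Callable[[str], str],
    threshold: float = 0.5,  # assumed cutoff, not a value from the source
) -> RoutedAnswer:
    """Answer from the latent CCoT when the question looks easy,
    otherwise fall back to concise discrete chain-of-thought."""
    score = difficulty_score(question, latent)
    if score >= threshold:
        # Hard question: discard the latent path and re-think in tokens,
        # with a prompt that asks for short reasoning.
        prompt = f"{CONCISE_COT_PROMPT}\n\n{question}"
        return RoutedAnswer("discrete_cot", answer_with_cot(prompt))
    # Easy question: decode the answer directly from the latent vectors.
    return RoutedAnswer("latent", answer_from_latent(question, latent))
```

For operators, the point of this shape is that the threshold is the budget knob: raising it sends more questions down the cheap latent path, and lowering it buys more explicit reasoning at the cost of more generated tokens.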