From Building Blocks to Breakthroughs: Why RL Finally Teaches Models to Think
Training an AI model is often sold like a kitchen renovation: add more data, add reinforcement learning, install the shiny reasoning countertop, and suddenly the whole thing looks expensive enough to be intelligent. This paper is useful because it ruins that brochure. The authors of "Atomic Skills are the Prerequisite: When Reinforcement Learning Synthesizes Compositional Reasoning, and When It Only Amplifies" ask a deceptively simple question: does reinforcement learning create new reasoning ability, or does it only increase the probability of behaviors the model could already produce? [1] Their answer is not the clean slogan either camp wants. RL can synthesize new compositional reasoning, but only when the model has already learned the right underlying atomic skills. Without that foundation, RL mostly polishes whatever behavior already exists. Sometimes that is reasoning. Sometimes it is just a better-trained shortcut wearing a lab coat. ...