From Building Blocks to Breakthroughs: Why RL Finally Teaches Models to Think
Opening — Why this matters now

Large Language Models keep telling us they can “reason”—yet they break spectacularly the moment a question requires combining two simple facts that sit in different parts of their memory. The industry’s response has been predictable: train bigger models, gather more data, sprinkle some RL on top, and pray. This new paper—From Atomic to Composite: Reinforcement Learning Enables Generalization in Complementary Reasoning—politely shatters that illusion. It suggests something delightfully inconvenient: models don’t generalize because they’re big; they generalize because their training curriculum actually makes sense. And most current curricula do not. ...