Opening — Why this matters now
Reasoning models are having a moment. Latent-space architectures promise to go beyond chain-of-thought without emitting reasoning tokens or ballooning inference costs. Benchmarks seem to agree: some of these systems crack puzzles that leave large language models stuck at zero.
And yet, something feels off.
This paper dissects a flagship example—the Hierarchical Reasoning Model (HRM)—and finds that its strongest results rest on a fragile foundation. The model often succeeds not by steadily reasoning, but by stumbling into the right answer and staying there. When it stumbles into the wrong one, it can stay there too.
That distinction matters. Especially if we plan to deploy “reasoning” systems in settings where retries, luck, or majority vote are not acceptable safeguards.
Background — Context and prior art
HRM belongs to a growing family of recursive latent reasoning models. Instead of emitting intermediate reasoning tokens, the model repeatedly updates a hidden state. Each recursion is meant to refine the solution, much like a human revisiting a problem step by step.
Training relies on two key ideas:
- Deep supervision: every recursive step is evaluated against the same final label.
- One-step gradient approximation: gradients do not propagate through the entire recursion, making long reasoning chains computationally feasible.
This setup quietly assumes a property borrowed from implicit models: once the model finds the correct solution, the hidden state should stabilize. Formally, the solution should be a fixed point of the recursion.
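To make that setup concrete, here is a minimal PyTorch sketch of how deep supervision and the one-step gradient approximation fit together. Everything in it (the `LatentReasoner` module, the toy update rule, the tensor shapes) is an illustrative assumption, not the paper's implementation.

```python
# Minimal sketch: recursive latent reasoning trained with deep supervision and
# a one-step gradient approximation. Names are illustrative, not HRM's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentReasoner(nn.Module):
    """x and h have shape (batch, cells, dim); the readout predicts one digit per cell."""
    def __init__(self, dim: int, vocab: int):
        super().__init__()
        self.update = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())
        self.readout = nn.Linear(dim, vocab)

    def step(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # One latent refinement step: fold the input into the current state.
        return self.update(torch.cat([x, h], dim=-1))

def train_step(model: LatentReasoner, x: torch.Tensor, y: torch.Tensor,
               n_steps: int, opt: torch.optim.Optimizer) -> float:
    h = torch.zeros_like(x)
    losses = []
    for _ in range(n_steps):
        h = model.step(x, h.detach())   # detach: gradients stop after one step
        logits = model.readout(h)       # (batch, cells, vocab)
        # Deep supervision: every step is scored against the same final label y.
        losses.append(F.cross_entropy(logits.flatten(0, 1), y.flatten()))
    loss = torch.stack(losses).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Implicit assumption probed by the paper: once the prediction is correct,
    # step(x, h) should leave h (nearly) unchanged, i.e. h is a fixed point.
    return loss.item()
```

Note that nothing in this objective rewards staying put once the answer is right; stability is assumed, not trained for.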
The paper’s core question is simple: does HRM actually behave this way?
Analysis — What the paper does
The authors run a mechanistic analysis of HRM on Sudoku-Extreme, a dataset where standard LLMs fail completely.
Three findings stand out.
1. Failure on extremely simple puzzles
HRM can solve some of the hardest Sudokus available. But give it a nearly completed grid—sometimes with just one missing cell—and it may fail outright.
The reason is unsettling. Even after reaching the correct answer early, the model continues updating its latent state and drifts away. The fixed-point assumption breaks. Stability was never truly learned; it emerged only late in training, and only for puzzles similar to those seen before.
The fix is mundane but revealing: data mixing. By augmenting training data with partially solved puzzles, the model learns to remain stable once it is right.
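A hypothetical sketch of that augmentation: from each (puzzle, solution) pair, generate variants in which some blank cells are already filled in correctly, so the model also trains on states it should simply hold. The function name and reveal fractions are assumptions for illustration.

```python
# Data mixing sketch: create "nearly solved" training inputs from a solved grid.
import random

def mix_partial_solutions(puzzle: list[int], solution: list[int],
                          reveal_fracs=(0.25, 0.5, 0.9)) -> list[list[int]]:
    """Return augmented inputs where a fraction of the blank cells (0s)
    are already filled in with the correct digit."""
    blanks = [i for i, v in enumerate(puzzle) if v == 0]
    variants = []
    for frac in reveal_fracs:
        revealed = set(random.sample(blanks, k=int(len(blanks) * frac)))
        variants.append([solution[i] if i in revealed else v
                         for i, v in enumerate(puzzle)])
    return variants
```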
2. Grokking dynamics instead of gradual refinement
On average, more recursive steps reduce loss. Per sample, the story is different.
Individual puzzles show long plateaus where the error barely changes, followed by a sudden collapse to zero, or by no collapse at all. The recursion does not look like incremental reasoning. It looks like repeated attempts to land in a good region of latent space.
The authors describe this as grokking along segments: nothing happens, until suddenly everything does.
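One way to reproduce this diagnosis, assuming the toy `LatentReasoner` interface from the earlier sketch: record the error of each puzzle separately at every recursion step, rather than averaging over the dataset.

```python
# Per-sample diagnostic: error rate per puzzle at every recursion step.
import torch

@torch.no_grad()
def per_sample_error_curve(model, x: torch.Tensor, y: torch.Tensor,
                           n_steps: int) -> torch.Tensor:
    """Roll the recursion forward and record, for every puzzle separately,
    the fraction of incorrect cells at each step. Plotting one row per puzzle
    exposes the plateau-then-collapse pattern that a dataset average hides."""
    h = torch.zeros_like(x)
    curves = []
    for _ in range(n_steps):
        h = model.step(x, h)
        pred = model.readout(h).argmax(dim=-1)            # (batch, cells)
        curves.append((pred != y).float().mean(dim=-1))   # per-sample error
    return torch.stack(curves)                            # (n_steps, batch)
```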
3. Multiple fixed points, only one of them correct
By projecting latent trajectories into a lower-dimensional space, the paper identifies spurious fixed points.
These are internally consistent but wrong solutions. Once the model enters their neighborhood, it tends to stay there. Escaping requires a jump that temporarily increases error—something the dynamics rarely encourage.
In effect, HRM often commits to the first fixed point it finds. Correct or not.
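A rough sketch of how such a projection can be computed, again against the toy interface above. This is a generic PCA of the latent trajectory, not necessarily the exact projection used in the paper; consecutive points that stop moving are candidate fixed points, and some of them decode to wrong but internally consistent grids.

```python
# Project one puzzle's latent trajectory onto its top-k principal directions.
import torch

@torch.no_grad()
def project_trajectory(model, x: torch.Tensor, n_steps: int, k: int = 2) -> torch.Tensor:
    """Collect the latent state after each recursion step, then project the
    trajectory into a k-dimensional space for inspection."""
    h = torch.zeros_like(x)
    states = []
    for _ in range(n_steps):
        h = model.step(x, h)
        states.append(h.flatten())              # one flattened latent per step
    traj = torch.stack(states)                  # (n_steps, latent_dim)
    traj = traj - traj.mean(dim=0)              # center before PCA
    _, _, v = torch.pca_lowrank(traj, q=k, center=False)
    return traj @ v                             # (n_steps, k) projected trajectory
```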
Findings — Results with visualization
The paper’s interventions follow directly from this diagnosis. If the model is guessing fixed points, performance should scale with the quality and diversity of guesses.
| Technique | Exact Accuracy |
|---|---|
| Baseline HRM | 54.5% |
| + Data mixing | 59.9% |
| + Input relabeling | 73.2% |
| + Model bootstrapping | 64.7% |
| All combined (Augmented HRM) | 96.9% |
Three levers matter:
- Data augmentation improves the quality of guesses by enforcing stability.
- Input perturbation (e.g., relabeling Sudoku symbols) creates multiple semantic views of the same problem (see the sketch after this list).
- Model bootstrapping exploits variation across nearby training checkpoints.
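To illustrate the second lever, here is a hypothetical relabeling helper. It generates several views of one puzzle by permuting the digit symbols, along with the inverse map needed to translate each prediction back; the function name and default number of views are assumptions.

```python
# Input perturbation sketch: symbol relabeling for a Sudoku grid (0 = blank).
import random

def relabel_views(puzzle: list[int], n_views: int = 8):
    """Each view applies a random permutation of the digits 1-9. The constraints
    are unchanged, but the model sees a different point in input space, so each
    view is an independent attempt to land near the correct fixed point.
    Returns (relabeled_puzzle, inverse_map) pairs; the inverse map translates
    the model's prediction back to the original symbols."""
    views = []
    for _ in range(n_views):
        perm = list(range(1, 10))
        random.shuffle(perm)
        fwd = {0: 0, **{d: p for d, p in zip(range(1, 10), perm)}}
        inv = {p: d for d, p in fwd.items()}
        views.append(([fwd[v] for v in puzzle], inv))
    return views
```

In the same spirit, model bootstrapping adds another axis of diversity: each view can be run through several nearby checkpoints, multiplying the number of independent attempts at a correct fixed point.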
Together, they turn a brittle reasoner into a highly reliable one—without changing the core architecture.
Implications — What this means beyond Sudoku
The uncomfortable takeaway is that recursive depth does not guarantee reasoning in the human sense. HRM’s recursion mostly amplifies search, not deliberation.
That insight generalizes.
- For agentic systems, retries and perturbations may matter more than longer thought chains.
- For evaluation, single-pass accuracy hides instability and false convergence.
- For governance and assurance, fixed-point behavior deserves explicit testing, especially in safety-critical tasks.
Perhaps most importantly, the paper offers a vocabulary—fixed points, attractors, grokking plateaus—that travels well beyond Sudoku.
Conclusion — Guessing, but with intent
HRM does not reason the way we like to imagine. It guesses latent solutions and sticks to the first one that feels stable.
Once you accept that, the path forward becomes clearer. Don’t just scale models or depth. Scale attempts. Encourage diversity. Enforce stability. Treat reasoning as navigation through a rugged landscape, not a straight line.
That may be less romantic than “thinking machines.” It is also far more useful.
Cognaptus: Automate the Present, Incubate the Future.