Opening — Why this matters now
For decades, heuristic design has been a quiet tax on optimization. Every serious deployment of A* or tree search comes with a familiar cost: domain experts handcraft rules, tune parameters, and babysit edge cases. The process is expensive, slow, and brittle. Large Language Models promised automation—but until recently, mostly delivered clever greedy tricks for toy problems.
This paper argues something sharper: the bottleneck was never model intelligence, but context. When LLMs are given the algorithm itself, not just the problem description, they stop guessing and start collaborating.
Background — From handcrafted heuristics to LLM evolution
Classic A* search lives or dies by its heuristic function. Accuracy improves pruning; admissibility preserves optimality. Historically, this meant Manhattan distance, pattern databases, or domain-specific counting rules—each requiring human insight and long iteration cycles.
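As a concrete reference point, here is a minimal sketch of the classic Manhattan-distance heuristic for the sliding puzzle, the kind of handcrafted function the paper aims to automate (the state encoding is an assumption for illustration, not taken from the paper):

```python
def manhattan_distance(state, goal):
    """Classic handcrafted sliding-puzzle heuristic (sketch).

    Assumes `state` and `goal` are tuples of tuples of ints, with 0 as the blank.
    Returns the sum of horizontal and vertical distances of each tile from its
    goal position -- admissible, so A* keeps its optimality guarantee.
    """
    n = len(state)
    # Map each tile value to its goal coordinates.
    goal_pos = {goal[r][c]: (r, c) for r in range(n) for c in range(n)}
    total = 0
    for r in range(n):
        for c in range(n):
            tile = state[r][c]
            if tile != 0:  # skip the blank
                gr, gc = goal_pos[tile]
                total += abs(r - gr) + abs(c - gc)
    return total
```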
Recent work like FunSearch and Evolution of Heuristics (EoH) reframed the problem: let LLMs evolve heuristics through an evolutionary loop. Candidate heuristics are generated, evaluated, mutated, and retained. This works well for constructive problems (TSP, bin packing), but struggles when feasibility is state-dependent and early decisions lock the search space.
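A minimal sketch of that evolutionary loop, with the LLM call and the evaluator stubbed out (the function names and selection scheme are illustrative assumptions, not the papers' exact APIs):

```python
import random

def evolve_heuristics(llm_generate, evaluate, population_size=10, generations=20):
    """EoH-style loop (sketch): generate candidate heuristics with an LLM,
    score them on benchmark instances, and retain the best performers.

    Assumes `llm_generate(parents)` returns heuristic source code as a string
    and `evaluate(code)` returns a fitness score (higher is better).
    """
    population = [llm_generate(parents=[]) for _ in range(population_size)]
    for _ in range(generations):
        scored = sorted(population, key=evaluate, reverse=True)
        survivors = scored[: population_size // 2]  # selection
        offspring = [
            # LLM-driven mutation/crossover from surviving parents
            llm_generate(parents=random.sample(survivors, k=min(2, len(survivors))))
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + offspring
    return max(population, key=evaluate)
```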
In short: greedy heuristics scale poorly when constraints matter. A* does better—but only if its heuristic understands how A* actually works.
Analysis — Algorithmic Contextual Evolution of Heuristics (A-CEoH)
The core contribution of the paper is deceptively simple: embed the A* algorithm code directly into the prompt.
Instead of telling the LLM what the problem is, A-CEoH shows it how its output will be used:
- The A* driver loop
- How `f(n) = g(n) + h(n)` is computed
- How nodes are expanded, queued, and terminated
- Where the heuristic influences pruning and runtime
This algorithmic context is problem-agnostic. Swap the domain-specific functions (neighbors, goal test), keep the same A* skeleton, and the prompt still works.
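A minimal sketch of what such an embedded skeleton might look like: the driver loop, the `f(n) = g(n) + h(n)` computation, and the pluggable domain functions are all visible to the model. This is illustrative, assuming a generic Python A* with hashable states, not the paper's exact code:

```python
import heapq
import itertools

def a_star(start, is_goal, neighbors, heuristic):
    """Generic A* driver (sketch). Domain knowledge enters only through
    `is_goal`, `neighbors(state) -> [(next_state, step_cost), ...]`,
    and the (possibly LLM-generated) `heuristic(state)`.
    """
    tie = itertools.count()  # tie-breaker so states are never compared directly
    frontier = [(heuristic(start), 0, next(tie), start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path                      # termination
        for nxt, cost in neighbors(state):   # node expansion
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                # f(n) = g(n) + h(n): this is where the heuristic steers pruning
                heapq.heappush(
                    frontier,
                    (new_g + heuristic(nxt), new_g, next(tie), nxt, path + [nxt]),
                )
    return None  # no solution found
```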
The authors compare four setups:
| Framework | Problem Context | Algorithm Context |
|---|---|---|
| EoH | ✗ | ✗ |
| P-CEoH | ✓ | ✗ |
| A-CEoH | ✗ | ✓ |
| PA-CEoH | ✓ | ✓ |
The bet: LLMs reason better when they see execution semantics, not just natural-language descriptions.
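A hedged sketch of how those two flags might translate into prompt assembly. The prompt text and helper names below are assumptions for illustration, not the paper's templates:

```python
def build_prompt(task_description, algorithm_source,
                 problem_context=True, algorithm_context=True):
    """Assemble an evolution prompt for one of the four setups (sketch).

    EoH:     problem_context=False, algorithm_context=False
    P-CEoH:  problem_context=True,  algorithm_context=False
    A-CEoH:  problem_context=False, algorithm_context=True
    PA-CEoH: problem_context=True,  algorithm_context=True
    """
    parts = ["Write a Python heuristic function `h(state)` for the search below."]
    if problem_context:
        parts.append("Problem description:\n" + task_description)
    if algorithm_context:
        # The key move: the model sees the code that will consume its output.
        parts.append("Your heuristic will be called by this A* driver:\n" + algorithm_source)
    return "\n\n".join(parts)

# A-CEoH style: show the model the algorithm's source (e.g. via inspect.getsource
# on the A* driver), not the domain description.
# prompt = build_prompt("", a_star_source, problem_context=False)
```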
Findings — When smaller models beat bigger ones
Two testbeds anchor the results:
- Unit-Load Pre-Marshalling Problem (UPMP) — a niche warehouse optimization task with little prior heuristic literature
- Sliding Puzzle Problem (SPP) — a classic benchmark where strong human-designed heuristics already exist
Key empirical outcomes:
- Algorithmic context consistently improves heuristic quality over baseline EoH.
- Small, coding-oriented models outperform larger general models when given algorithmic prompts.
- LLM-generated heuristics beat handcrafted ones on both speed and solvability for hard instances.
A striking example: Qwen2.5-Coder-32B, with A-CEoH, outperforms GPT-4o on both domains, despite being a far smaller model.
Performance snapshot
| Domain | Best Framework | Notable Result |
|---|---|---|
| UPMP | PA-CEoH | Solves instances human heuristics fail to solve |
| SPP (20×20) | A-CEoH | Solves puzzles where pattern databases are infeasible |
Token usage increases modestly, but output becomes more focused—less verbosity, more executable signal.
Implications — Heuristic design as infrastructure, not craft
This paper quietly changes the ROI calculus of optimization engineering:
- Heuristics become cheap to regenerate when instance distributions shift
- Algorithm-aware prompting beats brute-force scaling of model size
- Local, deployable models become viable for serious operations research
More broadly, A-CEoH suggests a general pattern: LLMs perform best not when treated as oracles, but as junior engineers reading your codebase.
This reframes prompt engineering away from clever wording toward execution-grounded interfaces.
Conclusion — The algorithm is the prompt
The lesson is blunt: if you want LLMs to design algorithms, stop hiding the algorithms from them.
By embedding A* directly into the prompt, the authors show that heuristic discovery can be automated, accelerated, and even improved beyond human baselines—without larger models or exotic training.
Algorithmic context is not a prompt trick. It is the missing abstraction layer between symbolic algorithms and statistical models.
Cognaptus: Automate the Present, Incubate the Future.