Packing a Punch: How Model-Based AI Outperformed Decades of Sphere-Packing Theory

Expensive experiments have a nasty habit: they punish enthusiasm.

In many AI success stories, the hidden luxury is cheap feedback. Generate a million candidates, test them quickly, keep the survivors, call it discovery. This is the comfortable world of many coding benchmarks, puzzle solvers, and evolutionary search systems. It is not the world of high-precision semidefinite programming, where evaluating one candidate can take days and where a bad guess does not merely waste a GPU minute; it quietly burns a serious slice of the research budget.

That is why the new sphere-packing paper is interesting. Not because it proves the optimal way to stack spheres in every dimension. It does not. Not because it throws a large language model at geometry and waits for mathematical revelation to fall from the ceiling. It does not do that either, mercifully.

The paper, Model-Based and Sample-Efficient AI-Assisted Math Discovery in Sphere Packing, does something more operationally useful: it turns the construction of sphere-packing upper-bound certificates into a structured game, then uses model-based search to decide which very expensive mathematical objects are worth evaluating.¹ The headline is that the method reports new best-known upper bounds in twelve dimensions: $n=4,5,6,7$ and $n=9,\ldots,16$. The deeper lesson is about how AI can help when evaluation is scarce, constraints are rigid, and brute force is not a strategy but a very expensive personality flaw.

The mechanism matters more than the record table

Sphere packing asks how densely equal spheres can be arranged in $n$-dimensional Euclidean space without overlap. The question sounds harmless, almost decorative. It is not. Exact optima are known only in a few dimensions, including $n=2,3,8,$ and $24$. The famous eight-dimensional and twenty-four-dimensional results are not just numerical achievements; they are landmarks in modern mathematics.

For most dimensions, researchers do not know the exact optimum. They work with bounds. Lower bounds come from constructing packings. Upper bounds prove that no packing can exceed a certain density. When the gap between lower and upper bounds narrows, the mathematical fog thins.

The paper focuses on upper bounds generated through the three-point method, a semidefinite programming approach that improves on older two-point linear-programming bounds by incorporating constraints among triples of sphere centers. In simplified form, the method says: choose geometric parameters, choose admissible auxiliary functions, build an SDP, solve it, and the SDP objective gives a valid upper bound.

The catch is that the SDP is not unique. Different choices of parameters and functions create different SDP instances, and different instances can produce tighter or weaker upper bounds.

So the real problem becomes:

$$ \text{Find the SDP formulation that gives the smallest valid upper bound.} $$

That is a search problem, but not the kind that modern AI usually enjoys. The search space is mixed: continuous geometric parameters on one side, discrete structured polynomial constructions on the other. The evaluation is expensive: each candidate SDP can take days. The constraints are delicate: arbitrary functions do not qualify, because the certificate must remain mathematically valid.

This is where the paper’s mechanism begins.

The SDP Game: make valid moves before asking AI to optimise

The authors define an “SDP Game,” a single-player sequential decision process. The player is the AI system. The environment is the SDP solver. The reward is the quality of the upper bound returned by solving the candidate SDP.

The game has two levels.

First, the agent chooses continuous geometric parameters $(r,R)$. These parameters define the admissible function class $F(r,R)$ and therefore shape the search space available in the next stage.

Second, conditioned on $(r,R)$, the agent constructs a pair of admissible functions $(f_1,f_2)$ that define the SDP certificate.

This second stage is where the paper makes an important design move. The agent does not directly search over arbitrary functions. That would be mathematically chaotic and computationally useless. Instead, the authors encode valid function construction through a discrete vocabulary of polynomial building blocks and grammar tokens. The agent assembles polynomial “sentences” from this vocabulary. Because the tokens and grammar are designed around the three-point method’s admissibility rules, the search space is constrained to mathematically meaningful candidates.

That sounds technical because it is. But the business translation is simple: the system does not ask AI to be magically creative in an unconstrained universe. It first builds a safe design language, then lets AI search inside that language.

That distinction matters. Many failed “AI for expert work” products make the opposite mistake. They give a model a vague objective, an enormous action space, and a final evaluator that catches mistakes too late. Here, validity is pushed into the representation. The agent can explore, but it explores through legal moves.

Bayesian optimisation chooses where to look; MCTS chooses what to build

Once the search problem is turned into a game, the paper uses two model-based tools.

Bayesian optimisation handles the continuous parameters $(r,R)$. This is a sensible choice because BO is designed for expensive black-box optimisation. It builds a probabilistic surrogate model of the objective, estimates both expected performance and uncertainty, and uses an acquisition rule to decide the next candidate to evaluate. In plain language: it tries not to waste expensive solver calls on regions that are probably bad unless uncertainty makes them worth investigating.

Monte Carlo Tree Search handles the discrete polynomial construction. Each node in the tree represents a partially constructed sequence of polynomial tokens. Edges represent valid next moves. The search alternates among selection, expansion, simulation, and backpropagation, gradually concentrating effort on polynomial families that appear promising while still probing alternatives.

The full loop is:

Stage	What the system decides	Tool used	Why it matters
Continuous search	Which $(r,R)$ region to test	Bayesian optimisation	Reduces waste when solver calls are scarce
Valid construction	Which polynomial certificate structure to assemble	MCTS	Searches a large discrete grammar without losing admissibility
Evaluation	How good the candidate SDP is	High-precision SDP solver	Produces the certified upper bound
Feedback	What the next search should exploit or explore	Surrogate update	Turns each expensive evaluation into future guidance

This is the paper’s real contribution. The new bounds are the evidence. The mechanism is the transferable idea.

The reported bounds improve twelve dimensions, but the size of the improvement needs context

The results table reports improved upper bounds over both classical LP bounds and prior three-point constructions for twelve dimensions: $4$ through $7$, and $9$ through $16$. Dimension $8$ is excluded from this main table because the optimal packing density is already known.

The numerical improvements are small in absolute terms. That is normal in this field. Upper-bound progress in a mature mathematical area often looks tiny to outsiders and significant to specialists. A few final digits can matter when decades of work have already squeezed the obvious slack out of the problem.

Dimension $n$	Prior three-point bound	AI-based bound	New polynomial terms
4	0.63610733	0.63610277	83.3%
5	0.51264513	0.5122645	85.1%
6	0.41030328	0.4102905	84.7%
7	0.32114711	0.32111475	85.1%
9	0.19112042	0.19099063	83.7%
10	0.14341009	0.14340271	83.3%
11	0.10672529	0.10672201	80.8%
12	0.07971177	0.0797066	80.0%
13	0.06016444	0.06012103	84.4%
14	0.04506122	0.04501031	82.5%
15	0.03375644	0.03372585	81.6%
16	0.02499441	0.02492121	82.9%

The “new polynomial terms” column is more than decoration. It tells us that the agent is not merely reusing the old human-designed certificate structure and nudging a parameter. Across the reported dimensions, roughly 80–85% of the monomial terms in the agent’s construction are absent from prior state-of-the-art formulations.

But the paper also reports that the agent recovers core monomials identified by human experts. That is the useful combination: novelty without total amnesia. The system expands the construction space while retaining important known components. In business language, this is not “replace the expert.” It is “search the adjacent design space without throwing away the expert’s validated primitives.” Less cinematic, more useful.

The structure analysis is not a side note; it explains why search worked

The paper does not stop at the result table. It analyses what kinds of polynomial structures the agent discovered. This section functions as diagnostic evidence, not just mathematical ornament.

The main observation is that frequently selected new polynomials tend to be low-degree. In the SDP formulation, low-degree polynomials correspond to variables in larger SDP blocks, giving the optimisation more freedom to reshape the feasible region. The agent repeatedly returns to these components because they are high-leverage parts of the certificate.

This matters because it changes the interpretation of the result. A shallow reading would say: “AI found many new polynomial terms.” The better reading is: “The search process learned where the SDP was most sensitive.”

That is a different claim. The first sounds like brute combinatorial novelty. The second sounds like useful optimisation intelligence.

Higher-degree polynomials still appear, but less frequently. The paper interprets them as fine-grained refinements rather than the main source of leverage. This division is important:

Discovered structure	Likely role in the SDP certificate	Interpretation
Low-degree recurring polynomials	High-impact freedom in larger SDP blocks	Main leverage for tightening bounds
Higher-degree less frequent polynomials	Local refinement of constraints	Secondary tuning around promising structures
Recovered expert monomials	Known useful certificate components	Validation that search is not drifting randomly
New monomial combinations	Previously unexplored function-space regions	Evidence of genuine search expansion

This is the kind of evidence that makes the paper more interesting than a record announcement. The authors are not merely reporting that the agent reached better numbers. They are showing that the agent repeatedly exploited structurally meaningful parts of the SDP landscape.

For AI deployment, that distinction is gold. A system that gives a better answer once is a demo. A system that reveals where the leverage sits is a workflow.

Releasing $r=1$ turns a narrow tradition into a wider search landscape

The paper also examines the continuous geometric parameters. In prior mathematical approaches, researchers often fixed $r=1$ and varied $R$, effectively searching along a one-dimensional slice. The authors let the agent optimise jointly over $(r,R)$.

Figure 5 in the paper is best read as a sensitivity and exploration diagnostic. It shows the optimiser first sampling broadly across the two-dimensional parameter space, then concentrating near regions associated with tighter bounds in dimensions 14 and 16. The important point is not merely that BO found better numbers. It found high-performing regions outside the classical one-dimensional search regime.

That is a recurring pattern in expert optimisation problems. Human practice often narrows the search space for good reasons: tractability, convention, interpretability, prior success. Over time, the narrowed space becomes the space. The paper’s mechanism makes it possible to relax one of those inherited restrictions while still preserving validity through the SDP construction.

The practical lesson is not “AI ignores experts.” Please, no. The lesson is more precise: AI can help reopen parts of the design space that experts previously compressed for tractability, provided the reopened space is still governed by valid rules.

The eight-dimensional case is validation, not the main trophy

Dimension $8$ is special because the optimal packing density is already known through Viazovska’s solution. The paper uses $n=8$ as a case study rather than part of the main record table.

This case has a different evidential purpose. It is not claiming a new optimum. Instead, it asks whether the model-based search can move toward known deep structure without being given the special modular-form machinery behind the proof.

The reported result is that the discovered functions increasingly resemble the required sign and Fourier-transform behavior associated with the known solution. Numerically, the paper reports:

Method	Bound in $n=8$
LP bound	0.253670
Three-point method	0.2536699179
SDP Games	0.2536695134
Known optimum	0.2536695079

This is close to the optimum, though not equal to it. The authors also note that one condition is not fully met: the discovered function’s zero is close to $\sqrt{2}$ but not exactly there, and the normalisation behavior has not fully converged. That makes the case study an exploratory validation, not a proof of rediscovering Viazovska’s method.

Used properly, this is still valuable. It shows that the search dynamics can move toward a known optimal structure from the geometry of the optimisation problem itself. Used improperly, it becomes the usual nonsense headline: “AI rediscovers Fields Medal mathematics.” It does not. The paper is better than that headline, which is fortunate, because the headline is exhausting.

What this paper directly shows

The paper directly supports four claims.

First, SDP construction for sphere-packing upper bounds can be formulated as a hierarchical search game over continuous parameters and discrete polynomial certificates.

Second, a model-based combination of Bayesian optimisation and MCTS can search this space under severe evaluation constraints.

Third, the resulting system reports new best-known upper bounds in twelve dimensions, improving on prior LP and three-point bounds.

Fourth, the discovered constructions are structurally interpretable: they contain many new monomials, recover important known components, and repeatedly exploit low-degree polynomial terms that appear to have high leverage in the SDP.

That is the direct evidence. Everything beyond that is interpretation.

What Cognaptus infers for business use

The business relevance is not that companies should suddenly care about sphere packing unless they work in coding theory, materials, crystallography, geometric optimisation, or closely related domains. Most firms do not wake up needing a better upper bound in dimension 14. Their calendars are already full of smaller tragedies.

The relevance is the workflow pattern.

Many business and engineering problems share the paper’s uncomfortable structure:

Candidate evaluation is expensive.
Validity constraints are strict.
The search space mixes continuous parameters and discrete design choices.
Historical expert practice has narrowed the search space.
Brute-force experimentation is infeasible.
The final evaluator is reliable but slow.

Examples include materials design, logistics network design with regulatory constraints, pricing policy search under simulation, portfolio construction with risk constraints, manufacturing process tuning, energy-grid optimisation, and solver-heavy operations research.

The paper suggests a disciplined architecture for such settings:

Paper mechanism	Business analogue	Operational value
Encode admissible polynomial vocabulary	Encode valid design actions, policy rules, or engineering constraints	Prevent invalid candidates before evaluation
Use BO for continuous parameters	Tune expensive numerical or operational parameters	Reduce wasted simulations
Use MCTS for structured discrete choices	Explore process designs, network structures, or rule combinations	Search combinatorial options without pure brute force
Feed solver results back into surrogates	Learn from each costly experiment	Improve sample efficiency
Analyse discovered structures	Identify reusable motifs	Convert optimisation output into expert insight

The most important idea is validity-first search. Before optimising, define a design language where every generated candidate is legal or at least close enough to be repairable. Then allocate expensive evaluation to candidates chosen by uncertainty-aware search, not by random enthusiasm.

This is different from using an LLM as a clever suggestion box. It is closer to building an expert-aware search machine.

Boundaries: where the lesson does not automatically transfer

The paper’s method is powerful because the target problem has a formal structure. There is a clear evaluator: solve the SDP and obtain an upper bound. There are admissibility rules. There is a meaningful way to encode candidate functions as grammatical token sequences. There is enough mathematical structure for surrogate-guided search to be more than decorative curve-fitting.

Not every business problem has those properties.

If the evaluation metric is vague, delayed, politically manipulated, or impossible to compare across candidates, the loop weakens. If the constraint system is not encoded cleanly, the agent may generate attractive nonsense. If the solver or simulator is unstable, the surrogate learns from noisy shadows. If the search space cannot be decomposed into continuous and discrete structure, the BO-plus-MCTS split may be the wrong tool.

There is also a scaling boundary inside the paper itself. The authors acknowledge that large SDP solving remains the bottleneck. Surrogate models improve sample efficiency, but they do not eliminate expensive evaluation. Higher-dimensional settings, richer higher-point bounds, and more complex certificate classes may require faster solvers, better warm starts, differentiable approximations, or hybrid symbolic guidance.

So the business conclusion should be sober: model-based AI can be extremely useful when expensive evaluation is paired with strong structure. It is not a universal replacement for experiments, solvers, experts, or good problem formulation. How rude of reality to remain involved.

The real punch: AI as a search discipline, not a sample factory

This paper is easy to misread because the record table is tempting. Twelve improved dimensions. Eighty-plus percent new monomial terms. A near-optimal eight-dimensional case study. All good headlines.

But the better story is mechanical.

The authors take a mature mathematical method, identify where human search is bottlenecked, encode valid moves, split the search into continuous and discrete levels, apply sample-efficient model-based optimisation, and then interpret the discovered structures. That is a serious template for AI-assisted discovery in domains where every evaluation hurts.

The contribution is not that AI “outperformed decades of sphere-packing theory” in the cartoon sense. It did not replace the theory; it stood on it. The three-point method, admissibility conditions, SDP solvers, polynomial grammar, and prior constructions are all part of the machinery. The AI contribution is to navigate that machinery more systematically than manual search can.

That is precisely why the work matters. The future of AI in difficult scientific and operational settings will not be won by systems that merely generate more candidates. It will be won by systems that know which candidates deserve to be generated at all.

And in expensive optimisation, that is the difference between intelligence and a bonfire with a dashboard.

Cognaptus: Automate the Present, Incubate the Future.

Rasul Tutunov, Alexandre Maraval, Antoine Grosnit, Xihan Li, Jun Wang, and Haitham Bou-Ammar, “Model-Based and Sample-Efficient AI-Assisted Math Discovery in Sphere Packing,” arXiv:2512.04829, 2025, https://arxiv.org/abs/2512.04829. ↩︎

Packing a Punch: How Model-Based AI Outperformed Decades of Sphere-Packing Theory#

The mechanism matters more than the record table#

The SDP Game: make valid moves before asking AI to optimise#

Bayesian optimisation chooses where to look; MCTS chooses what to build#

The reported bounds improve twelve dimensions, but the size of the improvement needs context#

The structure analysis is not a side note; it explains why search worked#

Releasing $r=1$ turns a narrow tradition into a wider search landscape#

The eight-dimensional case is validation, not the main trophy#

What this paper directly shows#

What Cognaptus infers for business use#

Boundaries: where the lesson does not automatically transfer#

The real punch: AI as a search discipline, not a sample factory#