Play by Automata: How Regular Games Rewrites the Rules of General Game Playing

A game engine is usually where rules go to become software. Someone writes the rules, someone else encodes the rules, and an AI agent then spends its expensive little life asking the engine what moves are legal, what happens next, and whether it has already lost. Very glamorous. Very repetitive.

General Game Playing tries to remove the hand-built engine from that loop. Instead of building a custom simulator for chess, backgammon, Amazons, Reversi, or some procedural oddity invented on a tired Wednesday afternoon, a game is described in a formal language and a generic system turns that description into something agents can use.

That promise has always had a catch. Formal languages that are elegant enough to describe many games tend to be slow. Languages that are fast tend to be narrow, stuffed with assumptions about boards, pieces, or move formats. The paper behind Regular Games, or RG, is an attempt to stop choosing between generality and speed quite so miserably.¹

Its core move is architectural. RG treats game rules as a finite automaton: a graph of nodes and labelled edges, where edge actions test conditions, assign variables, reveal move tags, or perform reachability checks. Human-friendly languages sit above it. A compiler and optimiser sit below it. Agents interact with the generated C++ forward model, not with a squishy pile of bespoke rule logic.

That matters because the paper is not really saying, “Here is a better way to encode board games.” It is saying something more useful: if you want many rule-governed environments to feed agents, benchmarks, procedural generators, or simulators, the boring middle layer is where the leverage lives.

Naturally, the boring middle layer is called an automaton. Academia does enjoy making its gifts work for attention.

The key mechanism is not a game language; it is a rule compiler stack

Regular Games is easiest to misunderstand if we start from games. So let us start from infrastructure.

The system has three conceptual layers:

Layer	What it is	Why it matters
High-level authoring	HRG and specialised generators such as LineGames	Humans and procedural systems can describe games without writing raw automata
Low-level intermediate representation	RG, an NFA-style automaton language	Rules become compact, analyzable, transformable, and language-agnostic
Execution layer	Optimised C++ forward model	Agents get fast legal-move generation, move application, terminal checks, and score queries

The paper’s important design choice is the separation between pleasant description and efficient execution. Ludii, one of the major comparison points, builds many game concepts directly into a rich language. Regular Boardgames, another comparison point, stays minimal and fast, but is mainly aimed at deterministic perfect-information board games. RG tries to borrow the useful parts of both: high-level convenience above, compact compiled representation below.

The low-level RG language itself is deliberately small. A game description contains finite types, variables, constants, and automaton edges. The edge actions are few: empty actions, comparisons, assignments, reachability checks, and tags. Shorthand actions such as “assign any value from this type” and variable tags exist for convenience, but can be expanded away. Pragmas provide implementation hints without changing the game tree.
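To make the shape of such a description concrete, here is a minimal sketch of a data model with one class per action kind named above. Every class, field, and function name here is hypothetical; it mirrors the list in the previous paragraph, not the actual RG syntax or the authors' implementation. The small `type_errors` check only illustrates what working with finite types might involve.

```python
from dataclasses import dataclass, field
from typing import Union

# Hypothetical names throughout: one class per action kind listed in the text.

@dataclass(frozen=True)
class Empty:
    """No condition and no effect."""

@dataclass(frozen=True)
class Compare:
    """Passable only when the variable currently holds `value`."""
    var: str
    value: str

@dataclass(frozen=True)
class Assign:
    """Sets the variable to `value`."""
    var: str
    value: str

@dataclass(frozen=True)
class Reachability:
    """Passable only when `target` can be reached from `source`."""
    source: str
    target: str

@dataclass(frozen=True)
class Tag:
    """Reveals one symbol of the move being built."""
    symbol: str

Action = Union[Empty, Compare, Assign, Reachability, Tag]

@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    action: Action

@dataclass
class GameDescription:
    types: dict[str, tuple[str, ...]]        # type name -> finite set of symbols
    variables: dict[str, tuple[str, str]]    # variable -> (type name, initial value)
    edges: list[Edge] = field(default_factory=list)

def type_errors(game: GameDescription) -> list[str]:
    """Report values that fall outside the finite type of their variable."""
    errors = []
    for name, (type_name, initial) in game.variables.items():
        if initial not in game.types.get(type_name, ()):
            errors.append(f"{name}: initial value {initial!r} not in {type_name}")
    for edge in game.edges:
        act = edge.action
        if isinstance(act, (Compare, Assign)):
            if act.var not in game.variables:
                errors.append(f"{edge.src}->{edge.dst}: unknown variable {act.var}")
            elif act.value not in game.types[game.variables[act.var][0]]:
                errors.append(f"{edge.src}->{edge.dst}: {act.value!r} not valid for {act.var}")
    return errors

good = GameDescription(
    types={"Player": ("white", "black")},
    variables={"turn": ("Player", "white")},
    edges=[Edge("begin", "moved", Tag("pass")),
           Edge("moved", "begin", Assign("turn", "black"))],
)
bad = GameDescription(good.types, good.variables,
                      [Edge("begin", "moved", Assign("turn", "red"))])

print(type_errors(good))
print(type_errors(bad))
```

The point of the sketch is how little there is to model: five action kinds, one edge type, and finite domains that a validator can check exhaustively.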

This is why the “assembler” analogy works. RG is not meant to be the language every designer loves writing directly. It is meant to be the representation other languages compile into. HRG gives more familiar declarative and structural constructs. LineGames, implemented as a Python library, provides a specialised generator for Alquerque-like games on line boards and is used to define 23 existing games. RBG descriptions can be translated into RG through an automaton construction. GDL translation is also present, although the paper treats it as experimental.

That stack is the paper’s real business-facing idea. In agent infrastructure, most cost does not come from inventing the abstract notion of an action. It comes from maintaining many incompatible environments, each with its own semantics, edge cases, performance profile, and tooling. A common intermediate representation gives you one place to optimise, validate, visualise, debug, and compile. Obvious in hindsight. Still oddly rare in practice, because everybody prefers inventing one more framework with a heroic README.

RG’s universality claim is precise, and not as comforting as it sounds

The paper proves that RG can represent every finite turn-based game with imperfect information and rational randomness. The authors formulate this by reducing finite extensive-form games into RG. Imperfect information is handled by controlling visibility of tags, so each player sees only the relevant obfuscated move information. Randomness is represented through the special random player, with rational probability distributions achieved by duplicating choices.
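The "duplicating choices" idea can be illustrated with a few lines of arithmetic. The sketch below is an assumption about how such an expansion could work, not the paper's construction: it turns a distribution with rational probabilities into a list of equally likely options by scaling everything to a common denominator. The function name and the example distribution are invented for illustration.

```python
from fractions import Fraction
from math import lcm

def duplicate_choices(distribution: dict[str, Fraction]) -> list[str]:
    """Expand a rational distribution into a list of equally likely choices."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    # Smallest number of uniform slots that represents every probability exactly.
    slots = lcm(*(p.denominator for p in distribution.values()))
    choices: list[str] = []
    for outcome, p in distribution.items():
        choices.extend([outcome] * int(p * slots))
    return choices

# Hypothetical biased three-way event.
event = {
    "heads": Fraction(1, 2),
    "tails": Fraction(1, 3),
    "edge": Fraction(1, 6),
}
uniform_choices = duplicate_choices(event)
print(uniform_choices)
```

A random player picking uniformly from the expanded list reproduces the original probabilities. The cost is that the list grows with the common denominator, which is one reason the guarantee is stated for rational distributions only.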

This is a strong expressiveness result. It puts RG in the same broad theoretical class as GDL-II and Ludii for finite turn-based games with imperfect information and randomness. It also broadens the design space beyond deterministic board games.

But universality is not a free lunch. It is not even a discounted lunch. The paper also proves that deciding whether the initial state has a legal move is EXPSPACE-complete in the general case. The same complexity applies to verifying the conditions that make a game description proper. If the maximum type length is fixed, those problems fall to PSPACE-complete.

That distinction is important. RG can express a broad class of games because its states can be extremely succinct. Succinct representation is powerful precisely because it can hide enormous state spaces behind compact descriptions. Business readers should translate this carefully: “universal” means “the representation is capable,” not “every encoded environment will be cheap to reason about.”

A useful replacement belief is this:

Reader belief	Correction	Practical meaning
A universal game language should make all games easy to run	RG can encode the class, but legal-move reasoning is still hard in the worst case	Engineering performance depends on the encoding, optimisations, and game structure
A low-level automaton is too primitive for real authoring	RG is the target layer; HRG and specialised generators are the authoring layer	The stack separates designer productivity from runtime efficiency
Benchmarks prove a universal speed advantage	The benchmarks show strong performance on implemented games under a specific setup	The result is evidence for the architecture, not a magic guarantee

The misconception matters because it changes how to evaluate the paper. The correct question is not, “Has RG made game reasoning easy?” It has not, and cannot, unless complexity theory takes an unexpected sabbatical. The better question is, “Does RG create a representation where many useful games become easier to optimise, compile, and benchmark?” On the paper’s evidence, yes.

The automaton gives optimisers something to chew on

Once rules are represented as a graph, optimisation becomes a compiler problem. That is the paper’s second contribution, and probably the most commercially legible one.

The authors build a transformation pipeline that simplifies automata while preserving correctness. Some transformations are familiar from programming-language compilers: constant propagation, assignment inlining, pruning unused variables, and removing unreachable nodes. Others are specific to the RG setting: skipping redundant tags, simplifying reachability checks, joining equivalent automaton paths, and removing artificial tags.

Crucially, the optimiser is not a loose bag of tricks. Transformations are run in a fixed-point loop because one simplification can enable another. Data-flow analysis tracks information about game state at automaton nodes. Validators run after transformations to check properties such as type correctness, reachability plausibility, and map correctness.
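A toy version of the fixed-point loop shows why ordering alone is not enough. The two passes below are simplified stand-ins invented for this sketch (they are not the paper's transformations, and a real pass would also have to respect nodes with special meaning): one drops edges that leave unreachable nodes, the other redirects edges around a node whose only way out is an empty edge. Bypassing a node makes it unreachable, which gives the first pass new work on the next round.

```python
from typing import Callable, Optional

# An edge is (source, label, target); a label of None stands for an empty action.
Edge = tuple[str, Optional[str], str]
Automaton = frozenset[Edge]
START = "begin"  # hypothetical name for the initial node

def remove_unreachable(edges: Automaton) -> Automaton:
    """Drop every edge whose source cannot be reached from START."""
    reachable, frontier = {START}, [START]
    while frontier:
        node = frontier.pop()
        for src, _, dst in edges:
            if src == node and dst not in reachable:
                reachable.add(dst)
                frontier.append(dst)
    return frozenset(e for e in edges if e[0] in reachable)

def bypass_empty_edge(edges: Automaton) -> Automaton:
    """Redirect edges around one node whose only outgoing edge is empty."""
    for node in sorted({src for src, _, _ in edges} - {START}):
        outgoing = [e for e in edges if e[0] == node]
        if len(outgoing) == 1 and outgoing[0][1] is None and outgoing[0][2] != node:
            target = outgoing[0][2]
            incoming = {e for e in edges if e[2] == node}
            if incoming:
                redirected = {(src, label, target) for src, label, _ in incoming}
                return frozenset((edges - incoming) | redirected)
    return edges

def optimise(edges: Automaton,
             passes: list[Callable[[Automaton], Automaton]],
             max_rounds: int = 100) -> Automaton:
    """Apply all passes repeatedly until a full round changes nothing."""
    for _ in range(max_rounds):
        before = edges
        for run_pass in passes:
            edges = run_pass(edges)
        if edges == before:
            return edges
    raise RuntimeError("no fixed point within max_rounds")

automaton: Automaton = frozenset({
    ("begin", "a", "n1"),
    ("n1", None, "n2"),    # empty edge that can be bypassed
    ("n2", "b", "end"),
    ("dead", "x", "end"),  # never reachable from START
})
optimised = optimise(automaton, [remove_unreachable, bypass_empty_edge])
print(sorted(optimised))
```

On this input a single round is not enough: the bypass in round one leaves `n1` dangling, and only the second round's reachability pass removes it. The loop stops when a full round leaves the automaton unchanged.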

The appendix figures are best read as implementation evidence, not a second thesis. Figure 47 shows how individual optimisations affect metrics such as node counts, edge counts, skip edges, reachability subautomata, state size, and branching. Figure 48 shows interaction effects: enabling one transformation can improve or degrade the usefulness of others. The point is less “this one optimisation is king” and more “the compiler pipeline behaves like a pipeline.” One pass opens the door for another; one cleanup changes what the next pass can see.

The headline reductions are material. For games translated from RBG, optimisation cuts more than 72% of nodes and 66% of edges, and can reduce state size by 21%. For games written in HRG, optimisations reduce nodes by 47% and edges by 41%. These are not abstract aesthetics. In a forward model, smaller automata and leaner state representations affect legal-move generation, move application, caching, and memory behaviour.

The paper also reports translation and optimisation times. For most HRG games, the complete suite runs quickly enough for near-immediate feedback, often well under 100 ms. But there are exceptions: HRG backgammon reaches 4,233 ms with all optimisations, chess 1,344 ms, and chess king-capture 1,009 ms. Several RBG translations also become expensive; Arimaa variants and some other complex descriptions hit the 60-second timeout under full optimisation.

That boundary is useful. The system is not frictionless. It is compiler infrastructure, and compilers sometimes take time when the input is gnarly. The encouraging part is that the paper exposes this trade-off rather than hiding it behind a cheerful demo.

The compiler turns game rules into agent-facing machinery

The generated C++ forward model is where the stack becomes operational. The paper describes functions for checking terminal states, obtaining scores, identifying the current player, generating legal moves, applying moves, and handling reachability.

Legal-move generation traverses the automaton, applies legal actions, and collects unique tag sequences corresponding to moves. Move application searches for a move walk matching a given move and applies the transitions. Game states track both variable assignments and the current automaton node.
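The traversal can be sketched in a few dozen lines. This is a toy interpreter under loud assumptions, not the generated C++ model: actions are plain tuples, a move is taken to end at a hypothetical "stop" node, the automaton has no cycles inside a move, and reachability checks are omitted. It collects each distinct tag sequence once, together with the state that the matching walk produces.

```python
# Toy action encoding (hypothetical): ("empty",), ("tag", symbol),
# ("cmp", variable, value), ("set", variable, value).
Action = tuple
Edge = tuple[str, Action, str]
State = tuple[str, dict[str, str]]  # (current node, variable assignments)

def legal_moves(edges: list[Edge], stop_nodes: set[str],
                node: str, variables: dict[str, str]) -> dict[tuple, State]:
    """Map each distinct tag sequence to the state reached by one matching walk."""
    moves: dict[tuple, State] = {}
    seen = set()
    stack = [(node, dict(variables), ())]
    while stack:
        node, values, tags = stack.pop()
        key = (node, tuple(sorted(values.items())), tags)
        if key in seen:
            continue
        seen.add(key)
        if node in stop_nodes:
            # Several walks may yield the same tags; keep the first one found.
            moves.setdefault(tags, (node, values))
            continue
        for src, action, dst in edges:
            if src != node:
                continue
            kind = action[0]
            if kind == "cmp" and values[action[1]] != action[2]:
                continue  # condition fails, so this edge is not passable
            next_values, next_tags = values, tags
            if kind == "set":
                next_values = {**values, action[1]: action[2]}
            elif kind == "tag":
                next_tags = tags + (action[1],)
            stack.append((dst, next_values, next_tags))
    return moves

def apply_move(edges, stop_nodes, node, variables, move) -> State:
    """Find the walk matching `move` and return the state it leads to."""
    return legal_moves(edges, stop_nodes, node, variables)[move]

# Two cells; a move tags a cell, checks it is empty, then marks it.
EDGES: list[Edge] = [
    ("start", ("tag", "c1"), "chk1"),
    ("chk1", ("cmp", "c1", "empty"), "set1"),
    ("set1", ("set", "c1", "x"), "done"),
    ("start", ("tag", "c2"), "chk2"),
    ("chk2", ("cmp", "c2", "empty"), "set2"),
    ("set2", ("set", "c2", "x"), "done"),
    ("start", ("empty",), "alt"),      # second walk producing the same tags
    ("alt", ("tag", "c2"), "chk2"),
]
board = {"c1": "x", "c2": "empty"}
moves = legal_moves(EDGES, {"done"}, "start", board)
print(moves)
```

With `c1` already taken, the only legal move is the tag sequence `("c2",)`, even though two different walks produce it. That deduplication is what "unique tag sequences" refers to in the paragraph above.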

The compiler also exploits pragmas. For example, disjointness hints let generated code stop checking mutually exclusive branches once one has succeeded. Tag-index pragmas influence the move container representation. Repeat and uniqueness information improve caching. Iterator pragmas replace broad loops followed by filtering checks with precomputed iteration ranges. Arithmetic-like operations can be recovered from symbol encodings where pragmas mark a finite domain as integer-like.

This is the part business readers should not skip. Many agent systems fail not because the learning algorithm is too weak, but because the environment interface is slow, inconsistent, or hard to instrument. RG’s generated forward model provides a uniform interface across games. That means agents can evaluate legal moves, simulate trajectories, and run experiments without writing a new simulator every time.

In reinforcement learning terms, the simulator is the tax collector. Every rollout pays it. A faster forward model does not merely make one experiment nicer; it changes how many experiments are affordable.
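What a uniform interface buys can be shown with a small harness. The method names below are hypothetical; they follow the operations the paper is described as providing (current player, terminal check, scores, legal moves, move application) but are not the generated C++ API. The Nim class is a stand-in game, and the playout loop is a plain random simulation of the kind a playouts-per-second measurement would count.

```python
import random
import time
from typing import Callable, Protocol, Sequence

class ForwardModel(Protocol):
    """Hypothetical agent-facing interface; names are illustrative only."""
    def current_player(self) -> int: ...
    def is_terminal(self) -> bool: ...
    def scores(self) -> Sequence[int]: ...
    def legal_moves(self) -> Sequence[int]: ...
    def apply(self, move: int) -> None: ...

class Nim:
    """Stand-in game: take 1 or 2 stones; whoever takes the last stone wins."""
    def __init__(self, stones: int = 7) -> None:
        self.stones = stones
        self.player = 0

    def current_player(self) -> int:
        return self.player

    def is_terminal(self) -> bool:
        return self.stones == 0

    def scores(self) -> Sequence[int]:
        # Only meaningful in a terminal state: the player who just moved won.
        winner = 1 - self.player
        return [100 if p == winner else 0 for p in (0, 1)]

    def legal_moves(self) -> Sequence[int]:
        return [n for n in (1, 2) if n <= self.stones]

    def apply(self, move: int) -> None:
        self.stones -= move
        self.player = 1 - self.player

def random_playout(model: ForwardModel, rng: random.Random) -> Sequence[int]:
    """Play uniformly random legal moves until the game ends."""
    while not model.is_terminal():
        model.apply(rng.choice(model.legal_moves()))
    return model.scores()

def playouts_per_second(make_model: Callable[[], ForwardModel],
                        seconds: float, seed: int = 0) -> float:
    """Count how many complete random playouts fit into a time budget."""
    rng = random.Random(seed)
    count = 0
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        random_playout(make_model(), rng)
        count += 1
    return count / seconds

final_scores = random_playout(Nim(), random.Random(1))
rate = playouts_per_second(Nim, seconds=0.05)
print(final_scores, f"{rate:.0f} playouts/s")
```

Nothing in `random_playout` or `playouts_per_second` knows which game it is running. Swapping in another model that satisfies the same interface leaves the harness untouched, which is the practical meaning of not writing a new simulator every time.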

The benchmark evidence supports the architecture, with important asymmetries

The main efficiency experiment compares flat Monte Carlo playouts per second across games available in other GGP systems. Results are averaged over three one-minute runs on an AMD Ryzen 9 3950X machine with 64 GB RAM, Ubuntu 24.04.3 LTS, g++ 14.2.0, and GraalVM 25.0.1+8.1 for Ludii.

The most important result is simple: all games implemented in HRG produce faster RG forward models than the corresponding RBG and Ludii versions in the reported benchmark set. The paper also says RG is typically 10–20 times faster than Ludii.

Some examples make the scale clearer:

Game	RG from HRG (playouts/s)	Native RBG (playouts/s)	Ludii (playouts/s)	Interpretation
Chess	1,572	995	113	RG is faster than both, but chess remains relatively heavy
Connect Four	1,297,176	914,514	55,858	A large gain in an already fast game
Pentago split variant	172,626	61,878	3,933	RG benefits strongly from avoiding awkward workarounds
Reversi	28,445	19,838	1,497	RG improves over both baselines in the HRG implementation
Yavalath	415,251	352,910	93,642	Smaller but still meaningful improvement over RBG
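The ratios behind the table are easy to recompute. The short script below uses only the five rows shown above (playouts per second) and divides the RG-from-HRG figure by each baseline; the variable and function names are ours.

```python
# Playouts per second from the table above: (RG from HRG, native RBG, Ludii).
PLAYOUTS = {
    "Chess": (1_572, 995, 113),
    "Connect Four": (1_297_176, 914_514, 55_858),
    "Pentago split variant": (172_626, 61_878, 3_933),
    "Reversi": (28_445, 19_838, 1_497),
    "Yavalath": (415_251, 352_910, 93_642),
}

def speedups(table: dict[str, tuple[int, int, int]]) -> dict[str, tuple[float, float]]:
    """Return (RG / RBG, RG / Ludii) for every game in the table."""
    return {game: (rg / rbg, rg / ludii) for game, (rg, rbg, ludii) in table.items()}

ratios = speedups(PLAYOUTS)
for game, (vs_rbg, vs_ludii) in ratios.items():
    print(f"{game:<22} {vs_rbg:5.2f}x vs RBG   {vs_ludii:5.1f}x vs Ludii")
```

On these five rows the advantage over Ludii ranges from roughly 4x (Yavalath) to roughly 44x (Pentago split variant), while the advantage over native RBG stays between about 1.2x and 2.8x. The spread is a reminder that "typically 10–20 times" describes the benchmark set as a whole, not every game in it.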

The comparison with automatically translated RBG is more uneven. In some cases, RG translations are comparable to or faster than native RBG. In others, they are slower. Reversi translated from RBG reaches 2,249 playouts per second versus 19,838 for native RBG and 28,445 for the HRG implementation. Hex shows a similar pattern: RG from HRG reaches 23,535, native RBG 22,441, but RBG-to-RG only 2,447. Pentago is especially dramatic: the HRG version is fast, while the direct RBG translation is poor.

That asymmetry is not a failure; it reveals the mechanism. RG is strongest when the game is encoded in a way that lets the automaton and optimiser see the right structure. Translation from another language can preserve semantics while importing awkward representation choices. Semantics transfer. Performance does not always travel business class.

A concise evidence map is useful here:

Paper component	Likely purpose	What it supports	What it does not prove
Universality theorem	Main theoretical evidence	RG can represent finite turn-based games with imperfect information and rational randomness	Encoded games will be easy to solve or simulate
Complexity theorem	Boundary condition	Worst-case legal-move and proper-description checks are very hard	Typical implemented games hit worst-case behaviour
Playout benchmark table	Comparison with prior systems	HRG-generated RG forward models are faster on the reported games	Universal superiority across all possible encodings
Optimisation figures and reductions	Implementation detail and optimisation analysis	The compiler pipeline materially shrinks automata and state representations	Every individual optimisation is always beneficial
Translation-time appendix	Engineering practicality and sensitivity test	Many descriptions optimise quickly; complex cases can be costly	Full optimisation is always instant

The benchmarks are therefore strong but not magical. They show that the chosen architecture can generate very fast forward models for a meaningful set of games. They do not show that any arbitrary formal environment poured into RG will become fast simply because it now has automata on it. Automata are useful. They are not fairy dust. A small but important distinction.

The business value is a reusable simulation substrate, not “better board games”

The direct domain is General Game Playing. The practical interpretation is broader but should be stated carefully.

What the paper directly shows is that RG can represent a broad finite-game class, compile descriptions into C++ forward models, optimise automata aggressively, support tooling around description and transformation, and outperform RBG and Ludii on the implemented HRG benchmark set.

What Cognaptus infers is that the same pattern is relevant to agent infrastructure wherever rule-governed environments need to be generated, tested, simulated, and reused. That includes game AI tooling, reinforcement-learning benchmarks, synthetic environment generation, policy-testing sandboxes, negotiation games, security exercises, workflow simulators, and other discrete decision systems where state transitions can be formally encoded.

The business pathway looks like this:

1. Unify the representation. Multiple rule languages or domain-specific generators target one intermediate representation.
2. Optimise once. Improvements to the compiler pipeline benefit every upstream language.
3. Expose one runtime interface. Agents use consistent calls for legal actions, state transitions, terminal checks, and scoring.
4. Benchmark across variation. Procedural generators can produce many environments without requiring many bespoke engines.
5. Shorten iteration loops. IDE diagnostics, visualisation, transformation snapshots, and fast compilation reduce the cost of adding or modifying environments.

This is not limited to entertainment. Games are useful because they are compact laboratories for agency: actions, hidden information, chance, objectives, terminal conditions, and strategic interaction. Many business workflows are less fun but structurally similar. Procurement negotiations, compliance processes, inventory allocation, fraud-response playbooks, and support escalation rules all involve finite-ish states, permitted actions, hidden information, and outcomes. The question is whether they can be cleanly formalised. Sometimes yes. Often painfully. Occasionally not without making everyone regret the meeting.

The paper gives no enterprise case study, and it does not claim one. The practical lesson is architectural, not market-validated: if your organisation is building many agent testbeds, the intermediate representation may become more valuable than any single environment.

Where the result applies, and where it does not

RG applies cleanly to finite, turn-based games with discrete state and action structure. It handles imperfect information and rational randomness. It can model simultaneous moves through imperfect information. It is designed for formal descriptions where game state, legal transitions, and outcomes are explicit.

It does not directly solve continuous control, real-time interaction, open-ended natural language environments, or domains where rules are ambiguous, contested, or learned from messy evidence. It also does not remove the cost of encoding a domain. A business process that lives in exceptions, undocumented conventions, and “ask Margaret, she knows” is not automatically ready for an automaton. Margaret may be the actual legacy system.

The GDL translation is experimental. More high-level languages are future work, including areas such as fairy chess, card games, and dice games. Efficiency can also improve through additional automaton analysis and techniques such as bit-boarding. These future directions matter because RG’s central bet is ecosystem leverage: the more source languages and generators can target the same low-level representation, the more valuable the compiler layer becomes.

There is also a modelling boundary around probabilities. RG’s universality theorem covers rational randomness. That is appropriate for finite games but should not be mistaken for a general probabilistic programming system.

Finally, performance is encoding-sensitive. The HRG results are the cleanest evidence for the intended stack. The RBG translation results show that compatibility is useful but not enough. Translating old structure into a new representation can preserve old inefficiencies, which is rude but traditional.

The quiet lesson: agent systems need compilers, not just clever agents

The paper’s deeper contribution is not that automata are new. They are extremely not new. The contribution is showing how an automaton-based low-level language can sit inside a broader General Game Playing ecosystem with authoring tools, translators, validators, optimisers, and generated forward models.

That is the part worth carrying into business and AI infrastructure work. As agent systems become more common, teams will need controlled environments where agents can be evaluated, stress-tested, and improved. Those environments will multiply. Without a common representation, every new benchmark or simulator becomes another little kingdom with its own laws, plumbing, and bugs.

Regular Games proposes a more disciplined arrangement: let designers and generators work at higher levels; lower the result into a compact formal representation; optimise and validate that representation; compile it into fast runtime code; give agents a uniform interface.

That is not the flashiest story in AI. It is better than flashy. It is the sort of infrastructure story that quietly decides whether experimentation scales or drowns in glue code.

And yes, the rules are now automata. Apparently even games eventually become compiler problems.

Cognaptus: Automate the Present, Incubate the Future.

¹ Radosław Miernik, Marek Szykuła, Jakub Kowalski, Jakub Cieśluk, Łukasz Galas, and Wojciech Pawlik, “Regular Games – an Automata-Based General Game Playing Language,” arXiv:2511.10593, 2025, https://arxiv.org/abs/2511.10593.
