Opening — Why this matters now
AI coding agents are everywhere—and still, maddeningly unreliable. They pass unit tests they shouldn’t. They hallucinate imports. They invent APIs with confidence that would be admirable if it weren’t so destructive. The industry response has been predictable: bigger models, longer prompts, more retries.
This paper proposes something less glamorous and far more effective: stop asking stochastic models to behave like deterministic software engineers.
Instead, treat them like what they are—unpredictable generators—and wrap them in the same kind of control frameworks software engineering has used for decades to manage unreliable components. The result is not a new model, but a new architecture. And the empirical gains are not subtle.
Background — Context and prior art
Software engineering has long accepted that some components are inherently unreliable. Configuration management systems like CFEngine, CI/CD pipelines, and Test-Driven Development do not assume correctness; they assume eventual convergence through validation.
Modern LLM-based agents, however, blur a critical boundary. Techniques such as Chain-of-Thought or ReAct embed decision-making inside the same probabilistic process that generates text. When reasoning and generation share the same stochastic substrate, failure modes compound rather than correct.
The theoretical lens here is refreshingly classical, drawing from:
- Promise Theory: unreliable actors make promises; consumers verify.
- Agent–Environment separation (Sutton & Barto): only what you can modify belongs to the agent.
- Bounded rationality: satisficing beats optimizing when costs matter.
Under this view, the LLM is not the agent. It is part of the environment.
Analysis — What the paper actually does
1. Move the control boundary
The core move is architectural: relocate decision-making outside the LLM. The agent is a deterministic controller. The LLM is a stochastic oracle.
This single distinction unlocks everything else.
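To make the boundary concrete, here is a minimal sketch (the type names are mine, not the paper's): the LLM is an opaque callable that lives in the environment, while everything the agent owns is ordinary deterministic code that only ever branches on verifier booleans.

```python
from typing import Callable

# Illustrative type aliases only; the names are hypothetical.
Oracle = Callable[[str], str]   # stochastic oracle: prompt -> generated artifact (environment side)
Guard = Callable[[str], bool]   # deterministic guard: artifact -> pass/fail (agent side)

# The controller is plain deterministic code. It may invoke the Oracle,
# but it never branches on raw model text, only on Guard results.
```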
2. Dual-State Architecture
The system state is split cleanly:
| State | Role | Properties |
|---|---|---|
| Workflow State | Control | Finite, deterministic, auditable |
| Environment State | Generation | Stochastic, append-only, opaque |
The workflow state tracks truth assignments to validation guards—nothing more. The environment state stores artifacts, history, and error traces without polluting control logic.
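A minimal sketch of the split (field and class names are mine): the workflow state holds nothing but guard truth assignments, and the environment state is an append-only log the planner never reads directly.

```python
from dataclasses import dataclass, field

@dataclass
class WorkflowState:
    """Control side: finite, deterministic, auditable."""
    guards: dict[str, bool] = field(default_factory=dict)  # e.g. {"syntax_ok": True, "tests_pass": False}

    def satisfied(self, *names: str) -> bool:
        """The planner's only question: are these guards true?"""
        return all(self.guards.get(n, False) for n in names)

@dataclass
class EnvironmentState:
    """Generation side: stochastic, append-only, opaque to the planner."""
    artifacts: list[str] = field(default_factory=list)   # every candidate the LLM produced
    errors: list[str] = field(default_factory=list)      # guard error traces, kept out of control logic

    def record(self, artifact: str, error: str | None = None) -> None:
        self.artifacts.append(artifact)
        if error is not None:
            self.errors.append(error)
```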
3. Atomic Action Pairs
Every meaningful step is an indivisible transaction:
- Generate an artifact with the LLM
- Verify it immediately with a deterministic guard
If verification fails, the workflow state does not advance. Only the context is refined.
This eliminates an entire class of failure where invalid outputs silently corrupt downstream steps.
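A hedged sketch of one such transaction (generate and guard are placeholders for the model call and a deterministic check): the control state only flips when verification passes; a failure lands in the history and refines the next prompt, nothing more.

```python
from typing import Callable

def atomic_step(
    name: str,                                 # which guard this step tries to satisfy
    prompt: str,
    generate: Callable[[str], str],            # stochastic LLM call (placeholder)
    guard: Callable[[str], tuple[bool, str]],  # deterministic check -> (passed, error trace)
    workflow: dict[str, bool],                 # control state: guard truth assignments only
    history: list[dict],                       # environment state: append-only record
    max_attempts: int = 3,
) -> str | None:
    """Generate and verify as one indivisible transaction."""
    context = prompt
    for attempt in range(1, max_attempts + 1):
        artifact = generate(context)
        passed, trace = guard(artifact)
        history.append({"step": name, "attempt": attempt, "artifact": artifact, "trace": trace})
        if passed:
            workflow[name] = True              # only now does the control state advance
            return artifact
        context = f"{prompt}\n\nPrevious attempt failed verification:\n{trace}"
    workflow[name] = False                     # failure is explicit and bounded, never silent
    return None
```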
4. Guards as sensors, not filters
Guards do more than block bad outputs. They sense reality and project probabilistic generation into a binary, observable control state.
Syntax checks, unit tests, architectural rules, even human review—all are modeled uniformly as guard functions.
The planner never sees probabilities. It sees pass or fail.
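As an illustration of that uniformity (function names are mine, and the test guard assumes pytest is installed), three very different checks collapse into the same artifact-in, boolean-out shape:

```python
import ast
import subprocess
import sys
import tempfile
from pathlib import Path

def syntax_guard(code: str) -> bool:
    """Static check: does the artifact even parse?"""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

def unit_test_guard(code: str, test_code: str) -> bool:
    """Dynamic check: run the artifact against fixed tests in a scratch directory."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "solution.py").write_text(code)
        Path(tmp, "test_solution.py").write_text(test_code)  # tests import from solution.py
        result = subprocess.run([sys.executable, "-m", "pytest", "-q", tmp], capture_output=True)
        return result.returncode == 0

def human_review_guard(code: str) -> bool:
    """Even manual review fits the same shape: the planner only ever sees True or False."""
    print(code)
    return input("Approve this artifact? [y/N] ").strip().lower() == "y"
```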
Findings — Results with visualization
Across three diagnostic coding tasks (LRU Cache, Template Engine, Password Validator), the framework was tested on 13 models ranging from 1.3B to 15B parameters.
Reliability gains
| Task | Max Reliability Gain |
|---|---|
| Password Validator | +66 percentage points |
| Template Engine | +42 percentage points |
| LRU Cache | +50 percentage points |
Crucially, these gains were achieved at only 1.2–2.1× the baseline compute cost, dramatically cheaper than naïve best-of-N sampling.
The qualification insight
Not all models benefit. The framework amplifies capability—it does not create it.
Models with effectively zero probability of following instructions (ϵ ≈ 0) remain unusable. But once a minimal capability threshold is crossed, architectural constraints dominate parameter count.
A 6.7B model with guards can outperform a 15B model without them.
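A back-of-the-envelope illustration of why (the numbers below are mine, not the paper's): with a per-attempt success probability ϵ and guard-gated retries, the chance of passing within k attempts is 1 − (1 − ϵ)^k, so a model at ϵ ≈ 0 gains nothing while even a modest ϵ compounds quickly at low expected cost.

```python
def success_within(eps: float, k: int) -> float:
    """Probability that at least one of k verified attempts passes the guard."""
    return 1 - (1 - eps) ** k

def expected_attempts(eps: float, k: int) -> float:
    """Expected attempts when we stop at the first pass (capped at k attempts)."""
    return sum((1 - eps) ** (i - 1) for i in range(1, k + 1))

for eps in (0.0, 0.05, 0.3, 0.6):
    print(f"eps={eps}: p(success, k=3)={success_within(eps, 3):.2f}, "
          f"expected attempts={expected_attempts(eps, 3):.2f}")
# eps=0.0 stays at 0.00: guards amplify capability, they cannot create it.
# eps=0.3 already reaches 0.66 within three attempts at about 2.2 expected attempts,
# the same ballpark as the reported 1.2–2.1x compute overhead.
```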
Multi-step workflows (TDD)
In a test-driven development pipeline—tests first, implementation second—the same pattern holds. Reliability scales with model size, but failure modes are now explainable.
When things break, they break because specifications are wrong, not because the agent “got confused.” That distinction matters.
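A sketch of how the two stages chain (placeholder names; in practice each stage would wrap the retry loop shown earlier): the implementation step is only reachable once the test artifact has itself passed a guard, so a later failure points at a concrete, already-verified specification rather than a vague agent state.

```python
from typing import Callable

def tdd_pipeline(
    spec: str,
    generate: Callable[[str], str],          # stochastic LLM call (placeholder)
    syntax_ok: Callable[[str], bool],        # deterministic guard on the test artifact
    tests_pass: Callable[[str, str], bool],  # deterministic guard: (impl, tests) -> bool
) -> dict[str, bool]:
    workflow = {"tests_ready": False, "impl_ready": False}

    # Stage 1: tests first. The specification must pass its own guard before anything else runs.
    tests = generate(f"Write pytest tests for: {spec}")
    if not syntax_ok(tests):
        return workflow                      # stop: the specification artifact itself is invalid
    workflow["tests_ready"] = True

    # Stage 2: implementation second, verified against the now-frozen tests.
    impl = generate(f"Implement code that passes these tests:\n{tests}")
    workflow["impl_ready"] = tests_pass(impl, tests)

    # If impl_ready is False, the failure is attributable: either this generation attempt
    # failed, or the (already syntax-checked) tests encode a wrong specification.
    return workflow
```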
Implications — What this means in practice
Smaller models, local control
This framework makes sub-15B models viable for serious software engineering tasks. That has direct implications for:
- On-prem deployments
- IP-sensitive codebases
- Regulated environments
Safety becomes systemic
Instead of trying to train safety into model weights, safety emerges from workflow structure. The LLM remains creative—and dangerous—inside a sandbox of deterministic constraints.
Credit assignment finally makes sense
Immediate verification collapses the reward horizon. Every failure is attributed to the last generation attempt. This turns retries from waste into labeled training data.
In other words: the architecture quietly solves a reinforcement learning problem most agent frameworks ignore.
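A minimal sketch of what that buys in practice (format and field names are mine): because verification is immediate, every attempt leaves the loop already labeled by its guard verdict.

```python
import json

def log_attempt(path: str, prompt: str, artifact: str, passed: bool, trace: str) -> None:
    """Append one (prompt, artifact, label) record; the guard verdict is the label."""
    record = {"prompt": prompt, "artifact": artifact, "passed": passed, "trace": trace}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
```

Failures become negatives with an error trace attached, successes become positives, and each label belongs unambiguously to the single generation that produced it.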
Conclusion — A quieter kind of progress
This paper does not introduce a new loss function or a clever prompt trick. It does something rarer: it formalizes what good engineers already know.
Unreliable components should not be trusted. They should be constrained.
By treating LLMs as stochastic environments rather than decision-makers, and by enforcing atomic generate–verify loops with explicit guards, we get systems that are both imaginative and dependable.
No hype. No mysticism. Just architecture doing its job.
Cognaptus: Automate the Present, Incubate the Future.