## Opening — Why this matters now
We have built systems that write code, trade assets, drive robots, and negotiate with humans. They act. They learn. They optimize.
And yet, when the environment shifts—even slightly—they drift.
The dominant narrative says: scale more data, more parameters, more compute. But the paper *A Mathematical Theory of Agency and Intelligence* suggests something more uncomfortable: reliability is not primarily a training problem. It is an architectural one.
At the center of this argument is a deceptively simple quantity: bi-predictability (P) — the fraction of total information in an interaction that is genuinely shared between observation, action, and outcome.
In other words: how much of what the system “knows” is actually coupling it to reality?
## Background — From Feedback to Information Bounds
Classical cybernetics already warned us. Ashby’s Law of Requisite Variety, Wiener’s feedback loops—reliable regulation requires continuous information coupling.
But modern AI reliability tools tend to monitor fragments:
- Benchmark performance
- Reward trends
- Input drift
- Confidence estimates
They rarely measure the full observation–action–outcome loop.
This paper reframes the loop in strict information-theoretic terms.
For passive systems (no action channel), predictive coherence is defined as:
$$ P = \frac{MI(S; S')}{H(S) + H(S')} $$
For agentic systems, with action $A$:
$$ P = \frac{MI(S, A; S')}{H(S) + H(A) + H(S')} $$
Where:
- $S$ = internal state / observation
- $A$ = action
- $S'$ = next state / outcome
Crucially, this is not “how much information flows.” It is how efficiently the interaction uses its informational budget.
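The two ratios above can be estimated directly from sampled trajectories. The sketch below uses plug-in entropies over discretized samples; the estimator choice is an assumption for illustration, not the paper's exact method.

```python
# Sketch: estimating bi-predictability P from sampled trajectories.
# Plug-in Shannon entropies over discrete symbols (an assumed estimator).
from collections import Counter
from math import log2

def entropy(samples):
    """Plug-in Shannon entropy (bits) of a sequence of hashable symbols."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def mutual_info(xs, ys):
    """MI(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def bi_predictability(s, s_next, a=None):
    """P = MI(S[, A]; S') / (H(S) [+ H(A)] + H(S'))."""
    if a is None:                       # passive system
        num = mutual_info(s, s_next)
        den = entropy(s) + entropy(s_next)
    else:                               # agentic system
        num = mutual_info(list(zip(s, a)), s_next)
        den = entropy(s) + entropy(a) + entropy(s_next)
    return num / den if den > 0 else 0.0

# A perfectly copying passive channel: MI(S; S') = H(S), so P = 0.5.
s = [0, 1, 0, 1, 1, 0, 0, 1]
print(bi_predictability(s, s))          # 0.5, the classical ceiling
```

Note that the denominator is the total informational budget, so even a perfect copy of the state cannot exceed $P = 0.5$ in the passive classical case.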
And here’s the part executives should pause at:
| Regime | Upper Bound on P |
|---|---|
| Quantum systems | 1 |
| Classical systems | 0.5 |
| Agentic classical systems | < 0.5 (in practice) |
Freedom has an information cost.
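The classical ceiling in the table follows from standard information inequalities; the derivation below is a reconstruction from the definitions above, not quoted from the paper:

$$
MI(S; S') \le \min\{H(S), H(S')\} \le \frac{H(S) + H(S')}{2}
\quad\Longrightarrow\quad
P = \frac{MI(S; S')}{H(S) + H(S')} \le \frac{1}{2}.
$$

The same argument bounds the agentic form at $1/2$, but equality there requires the degenerate case where $S$ and $A$ are independent and the outcome copies them perfectly; any realistic exercise of choice leaves slack in those inequalities, which is why the attainable ceiling sits strictly below $0.5$ in practice.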
## Analysis — Agency vs. Intelligence
The paper draws a sharp line:
Agency requires:
- Choice: $H(A|S) > 0$
- Effect: $MI(A; S’ | S) > 0$
- Predictive asymmetry: $\Delta H \neq 0$
Agency means the system can intervene and those interventions matter.
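The first two conditions can be checked mechanically on sampled $(S, A, S')$ triples. The sketch below uses plug-in estimators and a zero threshold, both illustrative assumptions; the asymmetry condition $\Delta H \neq 0$ depends on the forward/backward decomposition and is omitted here.

```python
# Sketch: testing the choice and effect conditions on sampled (S, A, S') triples.
from collections import Counter
from math import log2

def H(samples):
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def cond_H(xs, given):
    """H(X | Y) = H(X, Y) - H(Y)."""
    return H(list(zip(xs, given))) - H(given)

def cond_MI(xs, ys, given):
    """MI(X; Y | Z) = H(X | Z) - H(X | Y, Z)."""
    return cond_H(xs, given) - cond_H(xs, list(zip(ys, given)))

def is_agent(s, a, s_next, eps=1e-9):
    choice = cond_H(a, s) > eps            # H(A|S) > 0
    effect = cond_MI(a, s_next, s) > eps   # MI(A; S'|S) > 0
    return choice and effect

# Toy loop where the action flips the state: it has both choice and effect.
s      = [0, 0, 1, 1, 0, 0, 1, 1]
a      = [0, 1, 0, 1, 0, 1, 0, 1]
s_next = [0, 1, 1, 0, 0, 1, 1, 0]  # s XOR a
print(is_agent(s, a, s_next))       # True
```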
But intelligence requires more.
Intelligence requires:
- Increasing coupling ($MI(S,A;S’)$ grows through learning)
- Monitoring $P$ over time
- Adapting the structure of $S$, $A$, $S'$ when coupling degrades
This is the architectural leap.
Today’s systems optimize reward. They do not monitor their own coupling integrity.
They can win the game while losing grip on the environment.
## Findings — Physics, RL, and LLMs Under the Same Lens
### 1. Physical Baseline: Double Pendulum
In a deterministic system without agency:
- $P \approx 0.48$ (near classical ceiling 0.5)
- $\Delta H \approx 0$
Chaos did not reduce coupling symmetry.
This establishes the calibration point: predictability loss is not the same as randomness.
### 2. Reinforcement Learning Agents
HalfCheetah (SAC/PPO):
| System | P | ΔH | Interpretation |
|---|---|---|---|
| Double Pendulum | 0.48 | ≈ 0 | Symmetric physics |
| HalfCheetah | 0.33 | -0.56 | Asymmetric agency |
Introducing action reduces coherence and breaks symmetry.
More interestingly, when perturbations were introduced:
| Detection Method | Perturbation Detection Rate | Median Latency (windows) |
|---|---|---|
| Reward-based | 44% | 184 |
| P/ΔH (IDT) | 89% | 42 |
Reward lags. Coupling degradation appears first.
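The operational pattern behind those numbers is a sliding-window monitor: estimate $P$ per window and flag the first window that falls well below a running baseline. The baseline and threshold rule below are illustrative assumptions, not the paper's detection protocol.

```python
# Sketch: flagging coupling drift from a stream of windowed P estimates.
def detect_drift(p_windows, baseline_n=5, drop=0.2):
    """Return the index of the first window where P falls `drop` below
    the baseline (mean of the first `baseline_n` windows), else None."""
    baseline = sum(p_windows[:baseline_n]) / baseline_n
    for i, p in enumerate(p_windows[baseline_n:], start=baseline_n):
        if p < baseline - drop:
            return i
    return None

# P hovers near 0.33, then a perturbation degrades coupling at window 7.
stream = [0.33, 0.34, 0.32, 0.33, 0.33, 0.32, 0.31, 0.10, 0.09, 0.08]
print(detect_drift(stream))  # 7
```

Because $P$ measures the loop itself rather than the task score, this kind of monitor can fire long before the reward curve moves.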
### 3. Large Language Models
Multi-turn dialogue experiments showed:
- $P$ strongly correlates with structural coherence (85% of cases)
- $P$ detects contradictions, topic shifts, and non-sequiturs with 100% sensitivity
- Detection occurs immediately at injection points
And critically:
LLMs satisfy agency (choice, effect, asymmetry). They satisfy learning (next-token training). They do not compute $P$ internally. They cannot restructure their interface.
By this framework, they are agentic. Not intelligent.
## The Architectural Proposal — Information Digital Twin (IDT)
The authors propose a Coupled Agency Architecture.
An auxiliary module, the Information Digital Twin (IDT), monitors in real time:
- $P$
- $H_f$ (forward uncertainty)
- $H_b$ (backward uncertainty)
- $\Delta H$
Instead of retraining, the system performs reflexive modulation:
- Dampening actions
- Gating inputs
- Adjusting bandwidth
This mirrors thalamocortical regulation in biological brains: monitor statistics, not semantics.
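The reflexive-modulation idea can be sketched as a thin controller sitting between the agent and its environment. All names, thresholds, and the dampening rule below are illustrative assumptions, not the paper's design.

```python
# Sketch: a reflexive-modulation layer that dampens the action channel
# when the IDT's coupling estimate degrades, instead of retraining.
class InterfaceController:
    """Modulates the agent's interface based on coupling statistics."""
    def __init__(self, p_floor=0.25, damp=0.5):
        self.p_floor = p_floor   # assumed minimum healthy P (illustrative)
        self.damp = damp         # dampening factor per intervention

    def modulate(self, action_scale, p_estimate, delta_h):
        # Healthy coupling: leave the interface alone.
        if p_estimate >= self.p_floor:
            return action_scale
        # Degraded coupling: reduce action bandwidth; dampen harder when
        # the asymmetry suggests the agent side is the illegible one
        # (the sign convention here is an assumption).
        scale = self.damp if delta_h >= 0 else self.damp ** 2
        return action_scale * scale

ctrl = InterfaceController()
print(ctrl.modulate(1.0, p_estimate=0.30, delta_h=-0.56))  # 1.0 (healthy)
print(ctrl.modulate(1.0, p_estimate=0.10, delta_h=-0.56))  # 0.25 (dampened)
```

The point of the design is that the controller reads only statistics ($P$, $\Delta H$), never task semantics, which keeps it cheap and model-agnostic.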
Separation of concerns:
| Layer | Function |
|---|---|
| Agent | Optimize task objective |
| IDT | Monitor coupling integrity |
| Controller | Modulate interface when drift occurs |
Performance and structural stability become distinct variables.
That distinction may be foundational.
## Implications — For AI Builders and Operators
**1. Reliability is an architectural problem**
Scaling alone will not create intelligence. You need a self-monitoring coupling layer.
**2. Reward ≠ Grip**
A system can maintain reward while losing bidirectional constraint. This is operationally dangerous.
**3. Attribution matters**
If $P$ drops, is the world opaque (high $H_f$)? Or is the agent illegible (high $H_b$)?
Without this decomposition, adaptation is blind.
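That attribution question is computable. The sketch below assumes $H_f = H(S'\,|\,S,A)$ and $H_b = H(S\,|\,S',A)$, consistent with $\Delta H = H_f - H_b$; the paper may decompose the asymmetry differently.

```python
# Sketch: attributing a P drop to the world (high H_f) or the agent (high H_b).
from collections import Counter
from math import log2

def H(samples):
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def cond_H(xs, given):
    """H(X | Y) = H(X, Y) - H(Y)."""
    return H(list(zip(xs, given))) - H(given)

def attribute_drop(s, a, s_next):
    h_f = cond_H(s_next, list(zip(s, a)))  # world opaque to the agent?
    h_b = cond_H(s, list(zip(s_next, a)))  # agent illegible from outcomes?
    return ("world" if h_f > h_b else "agent"), h_f - h_b

# A memoryless agent in a noisy world: the outcome is pure noise given
# (s, a), so all the uncertainty sits in the forward channel.
s      = [0, 0, 0, 0, 0, 0, 0, 0]
a      = [0, 0, 0, 0, 0, 0, 0, 0]
s_next = [0, 1, 0, 1, 0, 1, 0, 1]
side, delta_h = attribute_drop(s, a, s_next)
print(side)  # "world": forward uncertainty dominates
```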
**4. Governance angle**
A first-person coupling metric offers:
- Model-agnostic oversight
- Drift detection without semantic judges
- A quantitative boundary between agency and intelligence
For regulators, this is more tractable than defining “alignment.”
## Conclusion — The Cost of Freedom
The paper demonstrates a provable constraint:
- Classical systems: $P \leq 0.5$
- Agency reduces attainable coherence
- Intelligence is the management of that reduction
Agency introduces freedom. Freedom reduces raw predictability. Intelligence is not eliminating that trade-off. It is regulating it.
Current AI systems optimize objectives. They do not monitor their own informational grip.
Until they do, they remain powerful agents. Not intelligent systems.
And that distinction is no longer philosophical. It is measurable.
Cognaptus: Automate the Present, Incubate the Future.