Opening — Why this matters now
The AI industry has a habit of mistaking scale for structure.
Bigger models, longer context windows, more tokens, more modalities. And yet, when these systems leave benchmark leaderboards and enter the real world, something curious happens: the bottleneck is rarely raw capability. It is bandwidth, cost, interpretability, latency, and control.
The paper “Artificial Agency Program: Curiosity, compression, and communication in agents” (Csaky, 2026) reframes the problem entirely. Instead of asking how to build smarter models, it asks how to build embedded, resource-bounded agents that allocate limited budgets across observation, action, memory, and deliberation.
In other words: not intelligence in isolation — but agency under constraint.
For business leaders and system designers, this shift is more than philosophical. It is operational.
Background — From Prediction Machines to Embedded Agents
Modern AI systems are largely trained as prediction engines: next-token models, large-scale world models, or reward-optimized policies. The dominant paradigm assumes near-infinite training data and compute during development, followed by inference at scale.
But biological intelligence did not emerge in a cloud cluster.
The Artificial Agency Program (AAP) argues that intelligence is shaped by:
- Limited sensing bandwidth
- Limited actuation authority
- Finite memory
- Energy constraints
- Partial observability
- Constant need to act under uncertainty
Critically, the paper distinguishes capability from constraint proximity. Two systems may perform equally well on a task, but one may operate under human-like constraints while another operates in a superhuman regime of memory and recall. That difference profoundly affects interpretability, collaboration quality, and failure modes.
For enterprise AI, this is not academic. Systems that are “alien” in constraint structure are harder to supervise and align.
Analysis — The Core Architecture of Artificial Agency
The AAP framework formalizes an embedded agent interacting with a partially observed environment. The key move is to unify four elements into a single cost-sensitive objective:
- Curiosity as learning progress
- Predictive compression
- Empowerment and control
- Explicit resource budgets
1. Curiosity as Learning Progress (Not Novelty)
The intrinsic reward is defined as improvement in predictive compression, not raw surprise.
An agent is rewarded when its predictive loss decreases over a horizon:
$$ r_t = L_{pred}(\theta_{t-1}) - L_{pred}(\theta_t) $$
This avoids two common traps:
- Pure randomness seeking (novel but useless patterns)
- Exploiting trivial predictability
The agent is incentivized to operate in the “learning-progress zone” — patterns that are learnable but not yet mastered.
From a business standpoint, this resembles R&D allocation under bounded budgets: invest where marginal improvement is still possible.
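The learning-progress reward can be sketched in a few lines of Python. This is a minimal illustration, assuming a simple rolling-window estimate of predictive loss; the paper's exact horizon and loss function are not specified here:

```python
import numpy as np

def learning_progress_reward(loss_history, window=5):
    """Intrinsic reward as the drop in mean predictive loss between two
    consecutive windows: r_t = L_pred(theta_{t-1}) - L_pred(theta_t)."""
    if len(loss_history) < 2 * window:
        return 0.0  # not enough history to estimate progress
    prev = np.mean(loss_history[-2 * window:-window])
    curr = np.mean(loss_history[-window:])
    return float(prev - curr)  # positive while prediction keeps improving

# A mastered pattern (flat loss) earns ~0 reward; unlearnable noise also
# averages to ~0. Only learnable-but-unmastered patterns sustain reward.
losses_learning = [1.0, 0.8, 0.65, 0.5, 0.4, 0.33, 0.28, 0.25, 0.23, 0.22]
losses_mastered = [0.1] * 10
print(learning_progress_reward(losses_learning))  # > 0
print(learning_progress_reward(losses_mastered))  # ~ 0
```

Note how this sidesteps both traps above: random noise never reduces loss, and trivial patterns stop producing reward once mastered.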
2. A Unified Cost Objective
The full objective includes explicit penalties:
- Observation cost ($C_O$)
- Action cost ($C_E$)
- Compute/deliberation cost ($C_C$)
- Memory maintenance cost ($C_M$)
The optimization target becomes:
$$ J = \mathbb{E}\left[ \sum_t \gamma^t (r_t - \lambda_O C_O - \lambda_E C_E - \lambda_C C_C - \lambda_M C_M) \right] $$
This is where AAP diverges sharply from current frontier practice.
Today’s systems typically optimize performance first and treat cost as an afterthought. AAP embeds cost directly into the intelligence definition.
For organizations, this reframes AI evaluation from:
“How accurate is the model?”
to
“How efficiently does the system allocate observation, action, and thinking?”
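The objective above is straightforward to operationalize. A minimal sketch, assuming per-step cost streams for each resource and illustrative penalty weights (the λ values are assumptions, not from the paper):

```python
import numpy as np

def objective_J(rewards, costs, lambdas, gamma=0.99):
    """Discounted return with explicit resource penalties:
    J = E[ sum_t gamma^t (r_t - λ_O C_O - λ_E C_E - λ_C C_C - λ_M C_M) ].
    costs: per-step cost lists keyed 'O', 'E', 'C', 'M'."""
    T = len(rewards)
    discount = gamma ** np.arange(T)
    net = np.array(rewards, dtype=float)
    for k in ("O", "E", "C", "M"):
        net = net - lambdas[k] * np.array(costs[k], dtype=float)
    return float(np.sum(discount * net))

# Two agents with identical reward but different resource spend:
lambdas = {k: 0.5 for k in "OECM"}
frugal = {k: [0.1, 0.1, 0.1] for k in "OECM"}
wasteful = {k: [0.3, 0.3, 0.3] for k in "OECM"}
print(objective_J([1, 1, 1], frugal, lambdas, gamma=1.0))    # ≈ 2.4
print(objective_J([1, 1, 1], wasteful, lambdas, gamma=1.0))  # ≈ 1.2
```

Two systems with identical task reward score differently once resource spend enters the objective — exactly the shift from accuracy-only evaluation that AAP proposes.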
3. Empowerment and Plasticity
AAP connects prediction to empowerment — the information-theoretic channel capacity from an agent's actions to its future observations.
Empowerment roughly measures how much influence an agent has over its environment.
But the paper also introduces a complementary notion: plasticity, the degree to which observations influence actions.
Together they define a bidirectional coupling:
| Dimension | Interpretation | Risk if Imbalanced |
|---|---|---|
| Empowerment | Control over environment | Overconfident exploitation |
| Plasticity | Reactivity to environment | Costly over-adaptation |
| Predictive Compression | Internal model quality | Detached prediction without control |
Business implication: High reactivity (plasticity) without predictive structure leads to operational churn. High control without predictive grounding leads to brittle overreach.
AAP predicts pragmatic alignment between prediction and control — but only within specific constraint regimes.
4. Unification — Interface Quality as a Measurable Variable
Perhaps the most operationally interesting concept is unification.
Unification measures how much sensing and acting bottlenecks distort the coupling between agent and environment.
The paper defines a task-relative unification score combining:
- Observation losslessness
- Action authority
- Communication bottlenecks
Unification increases as interfaces become less lossy.
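As a toy illustration, one could score each bottleneck on [0, 1] and combine them. The multiplicative form below is an assumption for illustration, not the paper's definition — its useful property is that one severe bottleneck drags the whole interface down:

```python
def unification_score(obs_losslessness, action_authority, channel_capacity):
    """Hypothetical task-relative unification score. Each factor lies in
    [0, 1], where 1 means no bottleneck on that interface dimension."""
    for f in (obs_losslessness, action_authority, channel_capacity):
        if not 0.0 <= f <= 1.0:
            raise ValueError("each factor must lie in [0, 1]")
    return obs_losslessness * action_authority * channel_capacity

# A crisp sensor feeding a throttled actuator is still a poor interface:
print(unification_score(0.95, 0.2, 0.9))  # ≈ 0.171
```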
In enterprise AI terms, this translates to:
- Better data pipelines
- Lower latency integration
- Clearer human-AI interfaces
The hypothesis (H2) predicts that agents will allocate resources to improve interface quality — but only when long-horizon learning benefits justify the cost.
This is precisely how infrastructure investment works in mature firms.
Findings — The Hypothesis Matrix
AAP is not merely conceptual. It defines falsifiable hypotheses.
Below is a simplified synthesis of the core claims:
| Hypothesis | Positive Signal | Failure Case |
|---|---|---|
| H1: Prediction–Control Alignment | Improving learning progress improves control | Prediction improves without usable control |
| H2: Boundary Pressure | Agent invests in interface upgrades until marginal cost dominates | Agent over- or under-invests systematically |
| H3: Constraint Pressure | Stronger costs push toward better predictive organization | Constraints collapse performance without structure |
| H4: Adaptive Compute | Dynamic compute allocation beats fixed schedules | Static schedules perform equally well |
| H5: Self-Communication | Private tokens improve long-horizon reasoning | Degenerate verbosity or no gain over latent state |
The practical takeaway: compute allocation strategy matters as much as model size.
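H4 in particular is easy to make concrete. A toy allocator that spends a fixed deliberation budget in proportion to predicted input difficulty — proportional allocation is an assumption here, not the paper's mechanism:

```python
def allocate_compute(difficulties, total_steps):
    """Adaptive-compute sketch (H4): distribute a fixed budget of
    deliberation steps in proportion to predicted difficulty, instead of
    a static per-input schedule."""
    total_d = sum(difficulties)
    return [int(total_steps * d / total_d) for d in difficulties]

# Easy inputs get few steps, hard ones get many -- a static schedule
# would spend total_steps / len(inputs) everywhere.
print(allocate_compute([1, 1, 8], 100))  # [10, 10, 80]
```

The falsifiable part of H4 is whether this kind of dynamic schedule actually beats the static one on the cost-adjusted objective; if it does not, the hypothesis fails.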
Language as a Bottleneck, Not a Privilege
AAP treats language as a selective, lossy channel — not a sacred reasoning medium.
It proposes a modality-agnostic token taxonomy:
- Input tokens (vision, text, audio, etc.)
- Private tokens (internal deliberation)
- Output tokens (public communication)
This reframes chain-of-thought debates.
The question is not:
Should the model think in words?
But:
When is emitting a token worth the cost compared to silent latent computation?
This has immediate enterprise relevance:
- Token costs affect inference economics
- Deliberation latency affects UX
- Verbose traces affect compliance logging
Language becomes an economic decision variable, not a design default.
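In toy form, the emission decision reduces to a cost-benefit test. The quantities below (expected gain from the token, its emission cost, and what silent latent computation would deliver) are illustrative placeholders, not quantities the paper computes:

```python
def should_emit_token(expected_gain, token_cost, latent_gain=0.0):
    """Emit a (priced) private or output token only when its expected
    long-horizon benefit, net of cost, beats the free latent alternative."""
    return expected_gain - token_cost > latent_gain

print(should_emit_token(0.5, 0.1, latent_gain=0.3))  # True: worth emitting
print(should_emit_token(0.2, 0.1, latent_gain=0.3))  # False: think silently
```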
Implications — What This Means for Business and Governance
1. AI Evaluation Must Become Frontier-Based
AAP suggests measuring energy–performance frontiers.
Each system should be evaluated not only on performance, but on its distance to the Pareto frontier of cost vs. capability.
For AI procurement and governance, this means:
- Reporting compute-normalized performance
- Including pretraining energy in system cost accounting
- Tracking adaptive compute efficiency
This aligns directly with emerging regulatory pressures on energy transparency.
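A minimal dominance check shows what frontier-based evaluation could look like in practice; the cost and performance numbers below are invented for illustration:

```python
def is_dominated(system, systems):
    """(cost, performance) dominance: some other system is at least as
    cheap and at least as performant, and strictly better on one axis."""
    c, p = system
    return any((c2 <= c and p2 >= p) and (c2 < c or p2 > p)
               for c2, p2 in systems)

# Hypothetical systems as (cost, performance) pairs:
systems = [(1.0, 0.60), (2.0, 0.90), (3.0, 0.85)]
frontier = [s for s in systems if not is_dominated(s, systems)]
print(frontier)  # [(1.0, 0.6), (2.0, 0.9)] -- the third pays more for less
```

Procurement then compares a candidate's distance to this frontier rather than its headline accuracy alone.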
2. Infrastructure Is Part of Intelligence
If unification is measurable, then:
- API latency
- Data freshness
- Sensor resolution
- Human-AI interface clarity
are not peripheral engineering concerns — they are agency multipliers.
In effect, AI capability is partially a function of system architecture quality.
3. Curiosity as Capital Allocation
Curiosity-as-learning-progress provides a template for AI-driven exploration systems.
For firms building autonomous R&D agents, trading agents, or adaptive logistics systems, the principle becomes:
Allocate exploration budget where marginal compression improvement is highest under cost constraints.
This reframes exploration from randomness to disciplined learning progress.
4. Governance Through Constraints
Perhaps the most subtle contribution: constraints shape behavior.
Instead of attempting to impose alignment purely through reward shaping, AAP suggests that:
- Energy cost
- Memory maintenance cost
- Interface bottlenecks
- Viability constraints
naturally push agents toward predictive efficiency and selective control.
Designing constraint structures may become a more stable governance lever than reward engineering alone.
Conclusion — Engineering the Budgeted Mind
The Artificial Agency Program is not a benchmark proposal.
It is a research agenda for transforming AI from prediction engines into economically rational agents embedded in real systems.
It asks a question the industry has largely postponed:
Not “How smart is the model?” But “How does it spend its limited attention, energy, and communication?”
In a world where inference cost, latency, carbon impact, and regulatory scrutiny are rising, this may be the more durable axis of progress.
Intelligence without constraint is spectacle. Agency under constraint is infrastructure.
Cognaptus: Automate the Present, Incubate the Future.