Opening — Why this matters now
The AI industry has a habit of mistaking scale for structure.
Bigger models, longer context windows, more tokens, more modalities. And yet, when these systems leave benchmark leaderboards and enter the real world, something curious happens: the bottleneck is rarely raw capability. It is bandwidth, cost, interpretability, latency, and control.
The paper “Artificial Agency Program: Curiosity, compression, and communication in agents” (Csaky, 2026) reframes the problem entirely. Instead of asking how to build smarter models, it asks how to build embedded, resource-bounded agents that allocate limited budgets across observation, action, memory, and deliberation.
In other words: not intelligence in isolation — but agency under constraint.
For business leaders and system designers, this shift is more than philosophical. It is operational.
Background — From Prediction Machines to Embedded Agents
Modern AI systems are largely trained as prediction engines: next-token models, large-scale world models, or reward-optimized policies. The dominant paradigm assumes near-infinite training data and compute during development, followed by inference at scale.
But biological intelligence did not emerge in a cloud cluster.
The Artificial Agency Program (AAP) argues that intelligence is shaped by:
- Limited sensing bandwidth
- Limited actuation authority
- Finite memory
- Energy constraints
- Partial observability
- Constant need to act under uncertainty
Critically, the paper distinguishes capability from constraint proximity. Two systems may perform equally well on a task, but one may operate under human-like constraints while another operates in a superhuman regime of memory and recall. That difference profoundly affects interpretability, collaboration quality, and failure modes.
For enterprise AI, this is not academic. Systems that are “alien” in constraint structure are harder to supervise and align.
Analysis — The Core Architecture of Artificial Agency
The AAP framework formalizes an embedded agent interacting with a partially observed environment. The key move is to unify four elements into a single cost-sensitive objective:
- Curiosity as learning progress
- Predictive compression
- Empowerment and control
- Explicit resource budgets
1. Curiosity as Learning Progress (Not Novelty)
The intrinsic reward is defined as improvement in predictive compression, not raw surprise.
An agent is rewarded when its predictive loss decreases over a horizon:
$$ r_t = L_{pred}(\theta_{t-1}) - L_{pred}(\theta_t) $$
This avoids two common traps:
- Pure randomness seeking (novel but useless patterns)
- Exploiting trivial predictability
The agent is incentivized to operate in the “learning-progress zone” — patterns that are learnable but not yet mastered.
From a business standpoint, this resembles R&D allocation under bounded budgets: invest where marginal improvement is still possible.
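The learning-progress reward can be sketched in a few lines of Python. This is a minimal illustration, assuming a simple rolling-window estimate of predictive loss; the paper's exact horizon and loss function are not specified here:

```python
import numpy as np

def learning_progress_reward(loss_history, window=5):
    """Intrinsic reward as the drop in mean predictive loss between two
    consecutive windows: r_t = L_pred(theta_{t-1}) - L_pred(theta_t)."""
    if len(loss_history) < 2 * window:
        return 0.0  # not enough history to estimate progress
    prev = np.mean(loss_history[-2 * window:-window])
    curr = np.mean(loss_history[-window:])
    return float(prev - curr)  # positive while prediction keeps improving

# A mastered pattern (flat loss) earns ~0 reward; unlearnable noise also
# averages to ~0. Only learnable-but-unmastered patterns sustain reward.
losses_learning = [1.0, 0.8, 0.65, 0.5, 0.4, 0.33, 0.28, 0.25, 0.23, 0.22]
losses_mastered = [0.1] * 10
print(learning_progress_reward(losses_learning))  # > 0
print(learning_progress_reward(losses_mastered))  # ~ 0
```

Note how this sidesteps both traps above: random noise never reduces loss, and trivial patterns stop producing reward once mastered.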
2. A Unified Cost Objective
The full objective includes explicit penalties:
- Observation cost ($C_O$)
- Action cost ($C_E$)
- Compute/deliberation cost ($C_C$)
- Memory maintenance cost ($C_M$)
The optimization target becomes:
$$ J = \mathbb{E}\left[ \sum_t \gamma^t (r_t - \lambda_O C_O - \lambda_E C_E - \lambda_C C_C - \lambda_M C_M) \right] $$
This is where AAP diverges sharply from current frontier practice.
Today’s systems typically optimize performance first and treat cost as an afterthought. AAP embeds cost directly into the intelligence definition.
For organizations, this reframes AI evaluation from:
“How accurate is the model?”
to
“How efficiently does the system allocate observation, action, and thinking?”
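The objective above is straightforward to operationalize. A minimal sketch, assuming per-step cost streams for each resource and illustrative penalty weights (the λ values are assumptions, not from the paper):

```python
import numpy as np

def objective_J(rewards, costs, lambdas, gamma=0.99):
    """Discounted return with explicit resource penalties:
    J = E[ sum_t gamma^t (r_t - λ_O C_O - λ_E C_E - λ_C C_C - λ_M C_M) ].
    costs: per-step cost lists keyed 'O', 'E', 'C', 'M'."""
    T = len(rewards)
    discount = gamma ** np.arange(T)
    net = np.array(rewards, dtype=float)
    for k in ("O", "E", "C", "M"):
        net = net - lambdas[k] * np.array(costs[k], dtype=float)
    return float(np.sum(discount * net))

# Two agents with identical reward but different resource spend:
lambdas = {k: 0.5 for k in "OECM"}
frugal = {k: [0.1, 0.1, 0.1] for k in "OECM"}
wasteful = {k: [0.3, 0.3, 0.3] for k in "OECM"}
print(objective_J([1, 1, 1], frugal, lambdas, gamma=1.0))    # ≈ 2.4
print(objective_J([1, 1, 1], wasteful, lambdas, gamma=1.0))  # ≈ 1.2
```

Two systems with identical task reward score differently once resource spend enters the objective — exactly the shift from accuracy-only evaluation that AAP proposes.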
3. Empowerment and Plasticity
AAP connects prediction to empowerment — the information-theoretic channel capacity from an agent's actions to its future observations.
Empowerment roughly measures how much influence an agent has over its environment.
But the paper also introduces a complementary notion: plasticity, the degree to which observations influence actions.
Together they define a bidirectional coupling:
| Dimension | Interpretation | Risk if Imbalanced |
|---|---|---|
| Empowerment | Control over environment | Overconfident exploitation |
| Plasticity | Reactivity to environment | Costly over-adaptation |
| Predictive Compression | Internal model quality | Detached prediction without control |
Business implication: High reactivity (plasticity) without predictive structure leads to operational churn. High control without predictive grounding leads to brittle overreach.
AAP predicts pragmatic alignment between prediction and control — but only within specific constraint regimes.
4. Unification — Interface Quality as a Measurable Variable
Perhaps the most operationally interesting concept is unification.
Unification measures how much sensing and acting bottlenecks distort the coupling between agent and environment.
The paper defines a task-relative unification score combining:
- Observation losslessness
- Action authority
- Communication bottlenecks
Unification increases as interfaces become less lossy.
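As a toy illustration, one could score each bottleneck on [0, 1] and combine them. The multiplicative form below is an assumption for illustration, not the paper's definition — its useful property is that one severe bottleneck drags the whole interface down:

```python
def unification_score(obs_losslessness, action_authority, channel_capacity):
    """Hypothetical task-relative unification score. Each factor lies in
    [0, 1], where 1 means no bottleneck on that interface dimension."""
    for f in (obs_losslessness, action_authority, channel_capacity):
        if not 0.0 <= f <= 1.0:
            raise ValueError("each factor must lie in [0, 1]")
    return obs_losslessness * action_authority * channel_capacity

# A crisp sensor feeding a throttled actuator is still a poor interface:
print(unification_score(0.95, 0.2, 0.9))  # ≈ 0.171
```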
In enterprise AI terms, this translates to:
- Better data pipelines
- Lower latency integration
- Clearer human-AI interfaces
The hypothesis (H2) predicts that agents will allocate resources to improve interface quality — but only when long-horizon learning benefits justify the cost.
This is precisely how infrastructure investment works in mature firms.
Findings — The Hypothesis Matrix
AAP is not merely conceptual. It defines falsifiable hypotheses.
Below is a simplified synthesis of the core claims:
| Hypothesis | Positive Signal | Failure Case |
|---|---|---|
| H1: Prediction–Control Alignment | Improving learning progress improves control | Prediction improves without usable control |
| H2: Boundary Pressure | Agent invests in interface upgrades until marginal cost dominates | Agent over- or under-invests systematically |
| H3: Constraint Pressure | Stronger costs push toward better predictive organization | Constraints collapse performance without structure |
| H4: Adaptive Compute | Dynamic compute allocation beats fixed schedules | Static schedules perform equally well |
| H5: Self-Communication | Private tokens improve long-horizon reasoning | Degenerate verbosity or no gain over latent state |
The practical takeaway: compute allocation strategy matters as much as model size.
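H4 in particular is easy to make concrete. A toy allocator that spends a fixed deliberation budget in proportion to predicted input difficulty — proportional allocation is an assumption here, not the paper's mechanism:

```python
def allocate_compute(difficulties, total_steps):
    """Adaptive-compute sketch (H4): distribute a fixed budget of
    deliberation steps in proportion to predicted difficulty, instead of
    a static per-input schedule."""
    total_d = sum(difficulties)
    return [int(total_steps * d / total_d) for d in difficulties]

# Easy inputs get few steps, hard ones get many -- a static schedule
# would spend total_steps / len(inputs) everywhere.
print(allocate_compute([1, 1, 8], 100))  # [10, 10, 80]
```

The falsifiable part of H4 is whether this kind of dynamic schedule actually beats the static one on the cost-adjusted objective; if it does not, the hypothesis fails.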
Language as a Bottleneck, Not a Privilege
AAP treats language as a selective, lossy channel — not a sacred reasoning medium.
It proposes a modality-agnostic token taxonomy:
- Input tokens (vision, text, audio, etc.)
- Private tokens (internal deliberation)
- Output tokens (public communication)
This reframes chain-of-thought debates.
The question is not:
Should the model think in words?
But:
When is emitting a token worth the cost compared to silent latent computation?
This has immediate enterprise relevance:
- Token costs affect inference economics
- Deliberation latency affects UX
- Verbose traces affect compliance logging
Language becomes an economic decision variable, not a design default.
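In toy form, the emission decision reduces to a cost-benefit test. The quantities below (expected gain from the token, its emission cost, and what silent latent computation would deliver) are illustrative placeholders, not quantities the paper computes:

```python
def should_emit_token(expected_gain, token_cost, latent_gain=0.0):
    """Emit a (priced) private or output token only when its expected
    long-horizon benefit, net of cost, beats the free latent alternative."""
    return expected_gain - token_cost > latent_gain

print(should_emit_token(0.5, 0.1, latent_gain=0.3))  # True: worth emitting
print(should_emit_token(0.2, 0.1, latent_gain=0.3))  # False: think silently
```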
Implications — What This Means for Business and Governance
1. AI Evaluation Must Become Frontier-Based
AAP suggests measuring energy–performance frontiers.
Each system should be evaluated not only on performance, but on its distance to the Pareto frontier of cost vs. capability.
For AI procurement and governance, this means:
- Reporting compute-normalized performance
- Including pretraining energy in system cost accounting
- Tracking adaptive compute efficiency
This aligns directly with emerging regulatory pressures on energy transparency.
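A minimal dominance check shows what frontier-based evaluation could look like in practice; the cost and performance numbers below are invented for illustration:

```python
def is_dominated(system, systems):
    """(cost, performance) dominance: some other system is at least as
    cheap and at least as performant, and strictly better on one axis."""
    c, p = system
    return any((c2 <= c and p2 >= p) and (c2 < c or p2 > p)
               for c2, p2 in systems)

# Hypothetical systems as (cost, performance) pairs:
systems = [(1.0, 0.60), (2.0, 0.90), (3.0, 0.85)]
frontier = [s for s in systems if not is_dominated(s, systems)]
print(frontier)  # [(1.0, 0.6), (2.0, 0.9)] -- the third pays more for less
```

Procurement then compares a candidate's distance to this frontier rather than its headline accuracy alone.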
2. Infrastructure Is Part of Intelligence
If unification is measurable, then:
- API latency
- Data freshness
- Sensor resolution
- Human-AI interface clarity
are not peripheral engineering concerns — they are agency multipliers.
In effect, AI capability is partially a function of system architecture quality.
3. Curiosity as Capital Allocation
Curiosity-as-learning-progress provides a template for AI-driven exploration systems.
For firms building autonomous R&D agents, trading agents, or adaptive logistics systems, the principle becomes:
Allocate exploration budget where marginal compression improvement is highest under cost constraints.
This reframes exploration from randomness to disciplined learning progress.
4. Governance Through Constraints
Perhaps the most subtle contribution: constraints shape behavior.
Instead of attempting to impose alignment purely through reward shaping, AAP suggests that:
- Energy cost
- Memory maintenance cost
- Interface bottlenecks
- Viability constraints
naturally push agents toward predictive efficiency and selective control.
Designing constraint structures may become a more stable governance lever than reward engineering alone.
Conclusion — Engineering the Budgeted Mind
The Artificial Agency Program is not a benchmark proposal.
It is a research agenda for transforming AI from prediction engines into economically rational agents embedded in real systems.
It asks a question the industry has largely postponed:
Not “How smart is the model?” But “How does it spend its limited attention, energy, and communication?”
In a world where inference cost, latency, carbon impact, and regulatory scrutiny are rising, this may be the more durable axis of progress.
Intelligence without constraint is spectacle. Agency under constraint is infrastructure.
Cognaptus: Automate the Present, Incubate the Future.