Opening — Why this matters now
Agentic AI systems are everywhere—self-refining copilots, multi-step reasoning chains, autonomous research bots quietly talking to themselves. Yet beneath the productivity demos lurks an unanswered question: what actually happens when an LLM talks to itself repeatedly? Does meaning stabilize, or does it slowly dissolve into semantic noise?
The paper “Dynamics of Agentic Loops in Large Language Models” offers an unusually rigorous answer. Instead of hand-waving about “drift” or “stability,” it treats agentic loops as discrete dynamical systems and analyzes them geometrically in embedding space. The result is less sci-fi mysticism, more applied mathematics—and that’s a compliment.
Background — From prompts to trajectories
Most agent frameworks implicitly assume that iterating an LLM is benign: refine, critique, rewrite, repeat. But iteration is a transformation, and transformations have dynamics.
The paper introduces a clean separation:
- Artifact space: where text lives and LLMs generate strings.
- Semantic embedding space: where those strings are mapped into vectors and measured.
Once outputs are embedded, an agentic loop becomes a trajectory—a path traced through semantic space as the model repeatedly transforms its own outputs. This allows concepts like convergence, dispersion, clusters, and attractors to be defined precisely, rather than metaphorically.
If this sounds like borrowing tools from physics and dynamical systems theory—yes, that’s exactly the point.
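To make the trajectory picture concrete, here is a minimal sketch (not the paper’s code) of how a sequence of loop outputs can be embedded and turned into a trajectory. The sentence-transformers library and the all-MiniLM-L6-v2 model are assumptions for illustration, not choices made by the paper.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed embedding backend


def to_trajectory(outputs: list[str], model_name: str = "all-MiniLM-L6-v2") -> np.ndarray:
    """Map the text outputs of an agentic loop to a trajectory in embedding space."""
    model = SentenceTransformer(model_name)
    # Each row is one iteration's output, embedded as a unit vector.
    return model.encode(outputs, normalize_embeddings=True)


def step_similarities(traj: np.ndarray) -> np.ndarray:
    """Cosine similarity between consecutive iterates (rows are unit-normalized)."""
    return np.sum(traj[:-1] * traj[1:], axis=1)
```

Once the outputs are rows of a matrix, convergence, dispersion, and drift all become ordinary vector computations.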
Analysis — Fixing the measurement problem first
Before measuring trajectories, the author tackles an inconvenient truth: cosine similarity is biased.
Modern sentence embeddings are anisotropic: vectors crowd into a narrow cone, so everything looks at least somewhat similar to everything else. Raw cosine similarity therefore systematically overestimates semantic closeness, making it nearly useless for detecting subtle drift.
The paper’s solution is refreshingly practical:
- Use human-judged semantic similarity (STS benchmark) as ground truth.
- Apply isotonic regression to recalibrate cosine similarity.
This calibrated similarity:
- Eliminates systematic bias (mean bias error → 0)
- Achieves near-perfect calibration (ECE ≈ 0)
- Preserves local stability (~98%) despite being piecewise-constant
In short: the ruler is fixed before measuring the motion, a detail many agent papers conveniently skip.
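A minimal sketch of the recalibration idea, assuming scikit-learn’s IsotonicRegression and an STS-style dataset of (raw cosine similarity, human score) pairs; the toy numbers and variable names below are illustrative, not taken from the paper.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Illustrative ground truth: raw cosine similarities paired with human STS scores
# rescaled to [0, 1]. In practice these would come from the STS benchmark.
raw_cosine = np.array([0.62, 0.71, 0.78, 0.83, 0.88, 0.93, 0.97])
human_score = np.array([0.10, 0.25, 0.35, 0.50, 0.65, 0.80, 0.95])

# Fit a monotone, non-decreasing mapping from raw cosine to calibrated similarity.
calibrator = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
calibrator.fit(raw_cosine, human_score)


def calibrated_similarity(cos_sim: np.ndarray) -> np.ndarray:
    """Apply the learned monotone correction to raw cosine similarities."""
    return calibrator.predict(cos_sim)
```

The key property is monotonicity: the correction can stretch or compress the similarity scale to match human judgments, but it never reorders pairs, which is why local stability survives the calibration.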
Findings — Two prompts, two universes
With measurement solved, the paper runs a deceptively simple experiment: two single-prompt agentic loops, 50 iterations each, same model, same starting sentence.
Contractive loop (rewrite, preserve meaning)
Prompt: “Rewrite the sentence to sound slightly more natural while preserving meaning exactly.”
Observed dynamics:
- High local similarity (>0.85)
- Decreasing dispersion
- Clear cluster formation
- Eventual convergence to a single semantic attractor
This loop behaves like a contraction mapping. Iteration smooths, compresses, and stabilizes meaning.
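For intuition, here is a minimal sketch of what a single-prompt agentic loop looks like in code. The prompt text follows the paper, but the scaffolding is illustrative, and `generate` is an assumed callable wrapping whatever LLM API is in use.

```python
REWRITE_PROMPT = (
    "Rewrite the sentence to sound slightly more natural "
    "while preserving meaning exactly.\n\n{text}"
)


def run_loop(seed: str, prompt_template: str, generate, n_iters: int = 50) -> list[str]:
    """Feed the model its own output n_iters times and keep every iterate.

    `generate` is an assumed callable that sends a prompt to the LLM and
    returns its text completion.
    """
    outputs = [seed]
    for _ in range(n_iters):
        outputs.append(generate(prompt_template.format(text=outputs[-1])))
    return outputs
```

Embedding the returned list with the trajectory helper above yields the path whose geometry the paper analyzes.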
Exploratory loop (summarize, then negate)
Prompt: “Summarize the text, then negate its main idea.”
Observed dynamics:
- Large stepwise semantic jumps
- Low similarity between iterations
- No stable clusters
- Unbounded drift
This loop never settles. Meaning is repeatedly destroyed and rebuilt. The trajectory wanders freely through semantic space.
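The two regimes can be told apart from the trajectory alone. Below is a minimal sketch of that kind of diagnostic, reusing the embedding helpers above and scikit-learn’s DBSCAN for clustering; the thresholds and parameters are illustrative, not the paper’s.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def diagnose(traj: np.ndarray, sim_threshold: float = 0.85) -> dict:
    """Summarize a trajectory: local similarity, global drift, and cluster count."""
    local_sim = np.sum(traj[:-1] * traj[1:], axis=1)          # step-to-step cosine similarity
    global_drift = float(np.linalg.norm(traj[-1] - traj[0]))  # end-to-end displacement
    labels = DBSCAN(eps=0.3, min_samples=3, metric="cosine").fit_predict(traj)
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)  # ignore noise label -1
    return {
        "mean_local_similarity": float(local_sim.mean()),
        "global_drift": global_drift,
        "n_clusters": n_clusters,
        "regime": "convergent" if local_sim.mean() > sim_threshold else "divergent",
    }
```

Run on the contractive loop, a diagnostic like this reports high local similarity and a single surviving cluster; run on the exploratory loop, it reports low similarity and no stable clusters at all.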
Side-by-side summary
| Property | Contractive Loop | Exploratory Loop |
|---|---|---|
| Mean local similarity | High (>0.85) | Low (<0.5) |
| Global drift | Bounded | Unbounded |
| Clusters | Multiple → One | None |
| Regime | Convergent | Divergent |
Same model. Same temperature. Different prompt. Radically different physics.
Implications — Prompting is dynamical control
The uncomfortable takeaway is this: prompt design implicitly sets the dynamical regime of an agent.
- Want reliability, refinement, and consistency? You need contractive prompts.
- Want novelty, exploration, or creative destruction? You are inducing divergence, whether you realize it or not.
This reframes several practical concerns:
- Runaway agents are not mysterious—they are poorly constrained dynamical systems.
- Self-refinement works precisely because it creates semantic attractors.
- Creative agents need both expansion and contraction phases, or they either stagnate or collapse.
The paper sketches a natural next step: composite loops that alternate exploration and consolidation. In other words, creativity as controlled oscillation, not chaos.
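As a thought experiment only (the paper sketches the idea but does not publish an implementation), a composite loop could alternate a divergent prompt with a contractive one. The prompt texts, phase lengths, and `generate` callable below are all illustrative assumptions.

```python
EXPLORE_PROMPT = "Summarize the text, then negate its main idea.\n\n{text}"
CONSOLIDATE_PROMPT = (
    "Rewrite the sentence to sound slightly more natural "
    "while preserving meaning exactly.\n\n{text}"
)


def composite_loop(seed: str, generate, n_cycles: int = 5,
                   explore_steps: int = 2, consolidate_steps: int = 3) -> list[str]:
    """Alternate expansion and contraction: explore, then pull back toward an attractor."""
    outputs = [seed]
    for _ in range(n_cycles):
        for _ in range(explore_steps):
            outputs.append(generate(EXPLORE_PROMPT.format(text=outputs[-1])))
        for _ in range(consolidate_steps):
            outputs.append(generate(CONSOLIDATE_PROMPT.format(text=outputs[-1])))
    return outputs
```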
Conclusion — From vibes to vectors
This work does something rare in agent research: it replaces intuition with instrumentation.
By treating agentic loops as trajectories in calibrated semantic space, it shows that stability, drift, and creativity are not mystical properties of LLMs—they are geometric consequences of how we prompt and iterate them.
If you are building agentic systems, the message is clear: stop asking whether an agent is “smart.” Start asking what kind of dynamical system you’ve accidentally created.
Cognaptus: Automate the Present, Incubate the Future.