Opening — Why this matters now
Agentic AI is rapidly escaping the sandbox.
From copilots to autonomous workflows, we are now deploying systems that don’t just predict — they act. The problem? These systems are increasingly embedded in real-world environments where timing, safety, and consistency are not optional.
And yet, the underlying models — particularly large language models — are inherently non-deterministic. Same input, different output. Slight latency shifts, different behaviors. In a chatbot, this is charming. In a car, it’s fatal.
The paper tackles this uncomfortable truth head-on: how do we make agentic AI systems behave predictably when their core components are fundamentally unpredictable?
Their answer is not to “fix” the AI — but to redesign the system around it.
Background — Context and prior art
Cyber-Physical Systems (CPS) — think autonomous vehicles, industrial robots, smart infrastructure — rely heavily on determinism.
Determinism, in this context, is simple but unforgiving:
Given the same inputs, the system must produce the same outputs.
Why? Because determinism enables:
| Capability | Why It Matters |
|---|---|
| Repeatability | You can test and validate safety-critical behavior |
| Debuggability | Failures can be traced and reproduced |
| Composability | Systems can be reliably integrated |
| Certification | Regulators require predictable behavior |
Now introduce three sources of chaos:
- Human behavior (inconsistent, emotional)
- Physical environment (dynamic, stochastic)
- LLM-based agents (probabilistic, latency-variable)
You don’t get a system. You get a negotiation.
Previous approaches tried to improve parts of the system:
- Better models (accuracy)
- Fine-tuning (alignment)
- Formal verification (bounded guarantees)
But they largely ignored a structural issue:
Even a perfect model cannot guarantee system-level determinism if the execution architecture is non-deterministic.
Analysis — What the paper actually does
The authors propose a subtle but powerful shift:
Treat nondeterminism as input, not error.
1. System Formalization
They define the system behavior as:
$$ y(t) = F(x_i, i_h(t), i_c(t), i_a(t)) $$
Where:
- $x_i$: initial system state
- $i_h(t)$: human input
- $i_c(t)$: environment/car input
- $i_a(t)$: agent (LLM) input
The key idea is almost philosophical:
If you treat all variability as explicit inputs, the system itself can remain deterministic.
This reframes the problem from eliminating randomness to containing it.
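The formalization can be made concrete with a minimal sketch. All names below are illustrative, not from the paper's code; the point is only that once every source of variability is passed in explicitly, $F$ itself is a pure function:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Inputs:
    x_i: float  # initial system state
    i_h: float  # human input at logical time t
    i_c: float  # car/environment input at logical time t
    i_a: str    # agent (LLM) output at logical time t, treated as data

def F(inp: Inputs) -> float:
    """Pure function of explicit inputs: no hidden state, no wall-clock reads."""
    correction = -1.0 if inp.i_a == "ACTUATE" else 0.0
    return inp.x_i + inp.i_h + inp.i_c + correction

# Same inputs -> same output, even though i_a came from a stochastic model.
assert F(Inputs(0.0, 1.0, 0.5, "ACTUATE")) == F(Inputs(0.0, 1.0, 0.5, "ACTUATE"))
```

The LLM can still be as random as it likes; its output is captured as a timestamped input, so replaying the same input trace replays the same system behavior.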
2. Reactor Model of Computation (MoC)
Instead of loosely coupled services, the system is built using a reactor model, implemented via the Lingua Franca (LF) framework.
Core properties:
| Feature | Business Translation |
|---|---|
| Deterministic scheduling | No race conditions between components |
| Port-based communication | Clear data contracts between modules |
| Logical time | Controlled timing instead of real-time chaos |
| Hierarchical composition | Systems remain explainable and auditable |
Think of it as replacing an improvisational jazz band with a tightly conducted orchestra.
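The scheduling idea can be sketched in a few lines. This is an assumed simplification of reactor semantics, not the actual Lingua Franca runtime: events carry logical timestamps, and a total ordering over (time, sequence number) makes execution deterministic regardless of arrival order:

```python
import heapq
from typing import Callable

class Scheduler:
    """Toy logical-time event queue (illustrative, not the LF runtime)."""
    def __init__(self):
        self.queue = []  # entries: (logical_time, seq, reaction, payload)
        self.seq = 0     # tie-breaker makes the ordering total, hence deterministic

    def schedule(self, t: int, reaction: Callable, payload):
        heapq.heappush(self.queue, (t, self.seq, reaction, payload))
        self.seq += 1

    def run(self):
        log = []
        while self.queue:
            t, _, reaction, payload = heapq.heappop(self.queue)
            log.append(reaction(t, payload))
        return log

sched = Scheduler()
sched.schedule(10, lambda t, p: f"{t}: coach saw {p}", "deviation")
sched.schedule(5,  lambda t, p: f"{t}: driver did {p}", "brake")
result = sched.run()  # events fire in logical-time order, not arrival order
print(result)
```

Note that the "driver" event fires first even though it was scheduled second: logical time, not wall-clock arrival, decides the order.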
3. The Agentic Driving Coach (Case Study)
The system is decomposed into four reactors:
| Component | Role |
|---|---|
| Driver | Human behavior model |
| Car | Physical dynamics |
| Environment | External conditions |
| Coach | AI agent (LLM + planner) |
The Coach is where things get interesting:
- LLM generates: `CONTROL_SIGNAL | Instruction`
- Planner enforces modes: Monitoring → Warning → Actuate
This creates a controlled decision pipeline:
| Mode | Trigger | Action |
|---|---|---|
| Monitoring | Normal behavior | No intervention |
| Warning | Deviation | Suggest correction |
| Actuate | Safety breach | Override control |
This is not “AI autonomy.”
It’s AI under supervision with escalation protocols.
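The supervision pattern can be sketched as a planner that clamps the LLM's suggestion. The mode names come from the paper; the gating logic below is an illustrative assumption:

```python
MODES = ["MONITORING", "WARNING", "ACTUATE"]

def supervise(llm_suggestion: str, safety_breach: bool, deviation: bool) -> str:
    """Planner decides what escalation the rules permit; the LLM only suggests."""
    if safety_breach:
        return "ACTUATE"  # rules can force escalation regardless of the LLM
    allowed = "WARNING" if deviation else "MONITORING"
    # The LLM may stay at or below the permitted level, never escalate past it.
    if llm_suggestion in MODES and MODES.index(llm_suggestion) <= MODES.index(allowed):
        return llm_suggestion
    return allowed

# An overeager LLM cannot actuate when driving is normal:
assert supervise("ACTUATE", safety_breach=False, deviation=False) == "MONITORING"
```

The asymmetry is the point: rules can override the model upward (toward safety), but the model can never override the rules.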
4. Containing LLM Uncertainty
The paper introduces three practical mechanisms:
a. Structured Prompting
Instead of free-form responses, the LLM must answer in a fixed format:

`TOKEN | Message`
With hard rules like:
- If distance ≤ 25m and speed too high → ACTUATE
- Else if deviation → WARNING
This reduces ambiguity and forces bounded outputs.
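Bounding the output space makes the LLM's response mechanically checkable. A minimal parser for the `TOKEN | Message` contract might look like this (the token whitelist matches the paper's modes; the parsing details and fallback message are assumptions):

```python
VALID_TOKENS = {"MONITORING", "WARNING", "ACTUATE"}

def parse_llm_output(raw: str) -> tuple[str, str]:
    """Return (token, message); anything outside the contract gets a bounded fallback."""
    token, sep, message = raw.partition("|")
    token = token.strip().upper()
    if not sep or token not in VALID_TOKENS:
        return ("WARNING", "malformed LLM output; falling back")
    return (token, message.strip())

assert parse_llm_output("ACTUATE | Brake now") == ("ACTUATE", "Brake now")
assert parse_llm_output("free-form rambling")[0] == "WARNING"
```

Whatever the model emits, the rest of the system only ever sees one of three tokens.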
b. Deadline Enforcement
Each LLM call has a strict time budget:
| Model | Worst-case latency |
|---|---|
| 1B | 186 ms |
| 8B | 250 ms |
| 70B | 613 ms |
If the model is late, fallback logic triggers immediately.
This is critical:
A correct answer delivered late is equivalent to a wrong answer.
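One common way to enforce such a budget in plain Python is a future with a timeout. This is a generic sketch, not the paper's implementation; the budget value echoes the 1B model's worst-case latency from the table above:

```python
import concurrent.futures
import time

def with_deadline(llm_call, budget_s: float, fallback: str) -> str:
    """Run llm_call, but return a rule-based fallback if it misses its deadline."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(llm_call)
        try:
            return future.result(timeout=budget_s)  # answer arrived in time
        except concurrent.futures.TimeoutError:
            return fallback                          # late == wrong: use the rules

def slow_llm() -> str:
    time.sleep(0.5)  # simulates a call that blows its budget
    return "ACTUATE | Brake"

result = with_deadline(slow_llm, budget_s=0.186, fallback="WARNING | fallback")
print(result)  # the fallback, because 500 ms > 186 ms
```

In a real deployment the fallback would come from the rule-based controller, so a missed deadline degrades to a deterministic, certified behavior rather than silence.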
c. Logical Delays
Human reaction time (~500ms) and system delays are explicitly modeled.
This avoids the common fallacy of “instant AI decisions” in real-world systems.
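Explicitly modeled delay is simple to express in the logical-time framing. The 500 ms figure is from the text; the event encoding is illustrative:

```python
REACTION_DELAY_MS = 500  # human reaction time, modeled explicitly

def delayed_inputs(warnings: list[tuple[int, str]]) -> list[tuple[int, str]]:
    """Shift each (logical_time_ms, warning) to when the human can actually react."""
    return [(t + REACTION_DELAY_MS, w) for t, w in warnings]

# A warning issued at t = 1000 ms only influences the driver at t = 1500 ms.
print(delayed_inputs([(1000, "slow down")]))
```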
Findings — Results with structure
The experiments (see the figures on page 5 of the paper) reveal a non-obvious trade-off:
Model Size vs System Safety
| Model | Latency | Instruction Quality | Outcome |
|---|---|---|---|
| 1B | Low | Poor | Unsafe (fails to stop) |
| 8B | Medium | Good | Acceptable |
| 70B | High | Best | Safest behavior |
Two insights emerge:
- Smaller models are faster but dangerously inaccurate
- Larger models are safer but introduce timing risk
Which leads to a design paradox:
You cannot optimize for both intelligence and responsiveness without architectural intervention.
Determinism Achieved (With a Catch)
The system produces identical outputs when:
- Inputs are identical
- Timing is controlled
- LLM outputs are bounded
But note the fine print:
| Source of Variability | How It’s Handled |
|---|---|
| LLM randomness | Temperature = 0 |
| Latency variation | Deadlines + fallback |
| Human behavior | Modeled as input stream |
This is not pure determinism.
It’s engineered determinism — a constrained sandbox where chaos is allowed, but only within guardrails.
Implications — What this means for business
This paper quietly challenges how most companies are deploying AI today.
1. Prompt Engineering is Not Enough
Most teams focus on improving outputs.
This work shows:
The real risk lies in when and how outputs are delivered.
System architecture > model quality.
2. Agentic Systems Need Operating Systems
What Lingua Franca represents is essentially:
An OS for agent coordination
Expect a shift from:
- “LLM as a tool” → “LLM as a component in a deterministic pipeline”
3. Safety = Latency × Accuracy
Traditional AI metrics ignore timing.
This paper implies a more realistic objective:
| Metric | Interpretation |
|---|---|
| Accuracy | Is the decision correct? |
| Latency | Is it delivered in time? |
| Determinism | Is it repeatable? |
All three must hold simultaneously.
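The conjunction can be written down directly. This is an illustrative composite check, not a metric defined in the paper, with field names chosen for the sketch:

```python
def decision_valid(correct: bool, latency_ms: float, deadline_ms: float,
                   output: str, replay_output: str) -> bool:
    """A decision counts only if it is correct AND on time AND reproducible."""
    return correct and latency_ms <= deadline_ms and output == replay_output

assert decision_valid(True, 250, 300, "ACTUATE", "ACTUATE")       # all three hold
assert not decision_valid(True, 613, 300, "ACTUATE", "ACTUATE")   # correct but late
```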
4. The Rise of Hybrid Control Systems
The architecture blends:
- Rule-based systems (fallback)
- Probabilistic models (LLMs)
- Deterministic orchestration (reactors)
This hybrid approach is likely to dominate safety-critical AI deployments.
Pure AI systems won’t pass regulatory scrutiny.
Conclusion — Control is the new intelligence
The industry has been obsessed with making AI smarter.
This paper asks a more uncomfortable question:
What if intelligence is not the bottleneck — control is?
By reframing nondeterminism as an input and enforcing deterministic orchestration around it, the authors demonstrate a path forward for deploying agentic AI in real-world systems without gambling on unpredictability.
It’s less glamorous than scaling parameters.
But it’s what makes AI deployable.
And in the end, deployability beats brilliance.
Cognaptus: Automate the Present, Incubate the Future.