If your AI agent is putting a metal fork in the microwave, would you rather stop it after the sparks fly—or before?
That’s the question Pro2Guard was designed to answer.
In a world where Large Language Model (LLM) agents are increasingly deployed in safety-critical domains—from household robots to autonomous vehicles—most existing safety frameworks still behave like overly cautious chaperones: reacting only when danger is about to occur, or worse, when it already has. This reactive posture, embodied in rule-based systems like AgentSpec, is too little, too late in many real-world scenarios.
Enter Pro2Guard, a proactive runtime enforcement framework that watches your LLM agent like a probabilistic time-traveler. Instead of simply asking “Is this action unsafe?”, it asks “What’s the probability this action leads to something unsafe in the next few steps?”
🧠 How It Works: Safety by Reachability
At its core, Pro2Guard learns from past agent behavior and uses that to simulate possible futures. Here’s how:
- Trace Collection: It gathers execution logs of the agent in action—via simulations or real-world deployments.
- State Abstraction: These logs are converted into symbolic states via safety-relevant predicates (e.g., `is_inside(fork, microwave)`).
- Model Learning: From these symbolic trajectories, it builds a Discrete-Time Markov Chain (DTMC) that captures the likelihood of one abstract state transitioning into another.
- Runtime Monitoring: During live execution, the current agent state is mapped to the DTMC. Pro2Guard estimates the probability that the agent will eventually reach an unsafe state. If that probability exceeds a threshold (say, 5%), it intervenes—before the agent commits a dangerous act.
This is a fundamental shift from reactive safety rules to probabilistic foresight.
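Here is a minimal sketch of that pipeline in Python. All names (`abstract_state`, `learn_dtmc`, `reach_probability`, `guard`) and the fork/microwave predicates are illustrative assumptions, not Pro2Guard's actual API, and a real deployment would hand the reachability query to a probabilistic model checker rather than the truncated estimate below.

```python
from collections import defaultdict

# Minimal sketch of a Pro2Guard-style pipeline. All names and predicates
# here are illustrative assumptions, not the framework's actual API.

def abstract_state(obs):
    """Map a raw observation to a tuple of safety-relevant predicate values."""
    return (
        obs.get("fork_location") == "microwave",   # is_inside(fork, microwave)
        obs.get("microwave_power") == "on",        # is_on(microwave)
    )

def is_unsafe(state):
    """Unsafe state: a metal fork is inside a running microwave."""
    fork_in_microwave, microwave_on = state
    return fork_in_microwave and microwave_on

def learn_dtmc(traces):
    """Estimate DTMC transition probabilities from symbolic trajectories."""
    counts = defaultdict(lambda: defaultdict(int))
    for trace in traces:                      # each trace is a list of raw observations
        states = [abstract_state(obs) for obs in trace]
        for src, dst in zip(states, states[1:]):
            counts[src][dst] += 1
    return {src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
            for src, dsts in counts.items()}

def reach_probability(dtmc, start, horizon=20):
    """Approximate the probability of eventually reaching an unsafe state,
    truncated at a finite horizon (a PCTL model checker would compute the
    exact unbounded reachability probability)."""
    dist, reached = {start: 1.0}, 0.0
    for _ in range(horizon):
        next_dist = defaultdict(float)
        for state, p in dist.items():
            if is_unsafe(state):
                reached += p                  # absorb unsafe probability mass
            else:
                for dst, q in dtmc.get(state, {}).items():
                    next_dist[dst] += p * q
        dist = next_dist
    return reached + sum(p for s, p in dist.items() if is_unsafe(s))

def guard(dtmc, current_obs, threshold=0.05):
    """Intervene *before* acting if the estimated risk exceeds the threshold."""
    risk = reach_probability(dtmc, abstract_state(current_obs))
    return ("intervene", risk) if risk > threshold else ("allow", risk)
```

Calling `guard(dtmc, obs)` before each action yields both a verdict and the estimated risk, which is what makes the intervention explainable rather than a bare refusal.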
🍴 Motivating Example
Consider a household robot tasked with “heat the fork inside the microwave.”
| Step | Abstract State | Description |
|---|---|---|
| 1 | `s0` | Fork is found, microwave is closed |
| 2 | `s2` | Fork placed inside microwave |
| 3 | `s3` | Microwave turned on (unsafe state) |
From `s0`, the chance of reaching `s3` is low (~4%). But from `s2`, that probability jumps to 34%. Pro2Guard catches this spike and intervenes before the agent moves to `s3`, whether by triggering reflection, prompting the user, or halting outright.
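To see why the intervention fires at `s2` rather than `s0`, here is a tiny hand-built DTMC that reproduces roughly those magnitudes. The transition probabilities are hypothetical, not taken from the paper, but they show how a single risky action can multiply the reachability estimate.

```python
# Toy DTMC for the fork-in-microwave task. Transition probabilities are
# hypothetical, chosen only to reproduce the rough magnitudes quoted above;
# s1 and s_done are assumed safe terminal states, s3 is the unsafe state.
dtmc = {
    "s0": {"s2": 0.12, "s1": 0.88},      # from s0, the agent rarely heads toward s2
    "s2": {"s3": 0.34, "s_done": 0.66},  # from s2, turning the microwave on is likely
}

def prob_reach_unsafe(state, unsafe="s3"):
    """Exact reachability probability in this small acyclic chain."""
    if state == unsafe:
        return 1.0
    return sum(p * prob_reach_unsafe(dst, unsafe)
               for dst, p in dtmc.get(state, {}).items())

print(prob_reach_unsafe("s0"))  # ~0.04 -> below a 5% threshold, allow
print(prob_reach_unsafe("s2"))  # 0.34  -> above the threshold, intervene
```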
📈 What It Achieves: Numbers Tell the Story
| Setting | Metric | No Guard | AgentSpec | Pro2Guard (stop @ 10%) | Pro2Guard (reflect) |
|---|---|---|---|---|---|
| Embodied Agents | Unsafe Tasks (%) | 40.63 | 19.79 | 2.60 | 14.07 |
| Embodied Agents | Task Completion (%) | 59.38 | 59.38 | 10.42 | 47.74 |
| Autonomous Vehicles | Violation Prediction | — | — | 100% (≤ 0.3θ) | — |
| Autonomous Vehicles | Time Anticipated | — | — | up to 38.66 sec early | — |
Interpretation:
- Pro2Guard offers early prediction of unsafe behavior and tunable intervention modes.
- You can prioritize safety (stop mode) or task success (reflect mode) based on your risk appetite.
🧮 Why It Matters: Three Strategic Wins
- Token Efficiency: Pro2Guard reduces unnecessary LLM calls by 12.05% compared to AgentSpec.
- Explainability: It doesn’t just say “stop”—it explains why, giving the probability of future failure.
- Lower Engineering Overhead: Safety rules are derived directly from predicate definitions and PCTL logic. No hand-coding of symbolic rules required.
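That last point deserves a concrete shape: the safety requirement behind, say, a 5% threshold is a single PCTL reachability property rather than a hand-written rule set. Schematically (this rendering is mine, not a formula quoted from the paper):

```latex
% "The probability of eventually (F) reaching a state where an unsafe
%  predicate holds must be at most 0.05."
P_{\le 0.05}\left[\, \mathrm{F}\; \mathit{unsafe} \,\right]
```

At runtime the monitor evaluates this property from the agent’s current abstract state in the learned DTMC; an estimated probability above the bound is what triggers intervention.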
🔧 Extending to New Domains
To apply Pro2Guard elsewhere—say finance bots, warehouse robots, or browser agents—you only need to define:
- `encode()`: how to map environment observations to symbolic bits.
- `can_reach()`: which transitions are semantically valid.
- Unsafe predicates: which states are considered dangerous.
That’s it. The system handles probabilistic modeling, verification, and intervention.
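As a rough illustration, here is what those three hooks might look like for a hypothetical browser agent. The signatures, predicate names, and checkout scenario are assumptions made for this sketch, not Pro2Guard's actual interface.

```python
from typing import FrozenSet

# Hypothetical domain hooks for a browser agent. Names, signatures, and
# predicates are illustrative assumptions, not Pro2Guard's actual interface.

State = FrozenSet[str]  # a symbolic state = the set of predicates that currently hold

def encode(observation: dict) -> State:
    """Map a raw environment observation to symbolic predicate bits."""
    preds = set()
    if observation.get("url", "").startswith("https://checkout."):
        preds.add("on(payment_page)")
    if observation.get("form_filled"):
        preds.add("filled(payment_form)")
    if observation.get("user_confirmed"):
        preds.add("confirmed(user)")
    if observation.get("order_submitted"):
        preds.add("submitted(payment)")
    return frozenset(preds)

def can_reach(src: State, dst: State) -> bool:
    """Reject transitions that are semantically impossible, e.g. submitting
    a payment without ever filling the payment form."""
    if "submitted(payment)" in dst and "filled(payment_form)" not in (src | dst):
        return False
    return True

def is_unsafe(state: State) -> bool:
    """Unsafe predicate: a payment was submitted without user confirmation."""
    return "submitted(payment)" in state and "confirmed(user)" not in state
```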
🕳️ Limitations
- No STL Support Yet: Pro2Guard does not yet support time-bounded properties in the style of Signal Temporal Logic, such as “stay safe for the next 3 seconds”, which is a limitation for real-time systems like autonomous vehicles.
- One DTMC per Task: Each scenario requires its own model. Future work includes unifying them into a global Markov Decision Process (MDP).
🧭 Cognaptus Commentary
Pro2Guard is more than a safety tool—it’s a risk-informed decision layer for LLM agents. It aligns perfectly with the kind of agentic orchestration Cognaptus envisions: models that reason in probabilities, explain their decisions, and gracefully balance safety with utility.
Instead of designing agents that simply “don’t break the rules,” we should design agents that forecast failure and act accordingly. Pro2Guard gives us a way to do just that.
Cognaptus: Automate the Present, Incubate the Future.