A bank does not usually fail because its compliance policy forgot to exist. It fails because the policy lived in one place, the software lived somewhere else, and the audit trail arrived after the damage had already developed a charming personality.
That gap becomes harder to excuse when AI agents move from answering questions to initiating payments, recommending clinical escalation, coordinating mission plans, or calling APIs inside enterprise workflows. A chatbot can be corrected after the fact. An agent that acts on behalf of a firm needs rules before it acts, evidence while it acts, and review after it acts. The old governance ritual of “write a policy, publish a PDF, hope engineering read it” starts to look less like oversight and more like theatre with better stationery.
The paper “Policy Cards: Machine-Readable Runtime Governance for Autonomous AI Agents” by Juraj Mavračić proposes a concrete artifact for that problem: the Policy Card [1]. Its core idea is simple but operationally important. A Policy Card is not another document describing what an AI system is, how it was trained, or what risks were observed during evaluation. It is a deployment-layer specification that tells an agent what it may do, what it must not do, when it must escalate, what evidence it must emit, and how those obligations map to recognised assurance frameworks.
That distinction matters. Model Cards, Data Cards, and System Cards helped AI governance become more transparent. Policy Cards try to make it executable. The change is not cosmetic. It moves governance from “someone should check this later” to “the deployment pipeline and runtime system can check this now.” A small improvement, unless one enjoys finding out about forbidden behaviour during litigation.
The paper is about runtime constraints, not better documentation
The obvious misreading is to treat Policy Cards as a new member of the documentation family: Model Card, Data Card, System Card, Policy Card, perhaps next a Vibes Card for the truly desperate. That would undersell the paper.
The author positions Policy Cards as a normative artifact. They are meant to specify operational policy for a particular deployed system in a particular context and jurisdiction. A Policy Card includes scope, applicable policies, controls, obligations, monitoring requirements, KPI thresholds, change-management rules, assurance mappings, and references to related artifacts. The schema is machine-readable, based on JSON Schema 2020-12, and designed for validation, version control, CI/CD gating, runtime enforcement, and audit replay.
The key word is normative. A Model Card might tell a reviewer that a model performs poorly in some setting. A Policy Card can say that an agent is not allowed to perform an action unless certain conditions hold, that some actions must be denied outright, and that other actions require escalation. The paper’s examples use effects such as allow, deny, and require_escalation, which makes the card closer to an operational control surface than a transparency memo.
The practical claim is not that a JSON file magically makes an AI system compliant. Obviously not. Magic remains disappointingly unavailable in enterprise architecture. The claim is that compliance requirements can be expressed in a structure that validators, policy gateways, middleware, logging systems, and auditors can actually consume. Policy Cards become useful only when connected to that surrounding machinery.
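The mechanism is the paper’s real centre of gravity: a Declare–Do–Audit workflow in which the card is declared before deployment, enforced while the agent acts, and checked against emitted evidence afterwards.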
Declare first: policy becomes a versioned deployment dependency
The first stage is declaration. Before an agent is deployed, the Policy Card is registered, validated, and bound to the system context. The paper describes a schema with ten primary sections: metadata, scope, applicable policies, controls, obligations, monitoring, KPI thresholds, change management, assurance mapping, and references.
This is not merely tidy filing. Each section narrows a different source of governance ambiguity.
| Policy Card section | What it makes explicit | Operational consequence |
|---|---|---|
| meta | Owner, version, creation and review timing | Someone is accountable for the policy artifact |
| scope | Application, stakeholders, jurisdiction, boundaries | The card cannot quietly pretend to apply everywhere |
| controls | Action rules with conditions and effects | Runtime systems can evaluate allowed, denied, and escalated actions |
| obligations | Required behaviours such as notice, consent, or review | Compliance becomes a required action, not a suggestion |
| monitoring | Events, fields, detectors, retention, cadence | Evidence is planned before the audit |
| kpis_thresholds | Metrics and critical auto-fail conditions | Governance has measurable red lines |
| change_management | Approval, versioning, rollback | Policy drift becomes harder to hide |
| assurance_mapping | Links to frameworks such as NIST AI RMF, ISO/IEC 42001, and EU AI Act clauses | Auditors can trace rules to governance expectations |
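To make the section list concrete, here is a minimal sketch of what a card's top-level shape might look like, written as a Python dict that serialises to JSON. The keys meta, scope, controls, obligations, monitoring, kpis_thresholds, change_management, and assurance_mapping follow the table above. The spellings `applicable_policies` and `references`, every nested field name, and every value are illustrative assumptions, not the paper's actual schema.

```python
import json

# Hypothetical card skeleton. Top-level keys mirror the ten sections the paper
# lists; all nested fields and values are placeholders invented for illustration.
policy_card = {
    "meta": {"owner": "payments-risk-team", "version": "1.0.0"},
    "scope": {"application": "retail-payments-agent", "jurisdiction": "EU"},
    "applicable_policies": ["internal-payments-policy"],
    "controls": [
        {"id": "deny-sanctions-hit", "action": "initiate_payment", "effect": "deny"},
    ],
    "obligations": [{"id": "record-escalation", "requires": "escalation_id"}],
    "monitoring": {"evidence_fields": ["risk_score", "kyc_status"], "retention_days": 365},
    "kpis_thresholds": [{"id": "transfer-without-kyc", "critical_auto_fail": True}],
    "change_management": {"approvers_required": 2, "rollback": True},
    "assurance_mapping": ["NIST AI RMF", "ISO/IEC 42001", "EU AI Act"],
    "references": [],
}

REQUIRED_SECTIONS = [
    "meta", "scope", "applicable_policies", "controls", "obligations",
    "monitoring", "kpis_thresholds", "change_management",
    "assurance_mapping", "references",
]

# A structural check only: are all ten sections present?
missing = [section for section in REQUIRED_SECTIONS if section not in policy_card]

# Serialising with sorted keys keeps version-control diffs stable.
card_json = json.dumps(policy_card, indent=2, sort_keys=True)
```

Because the card is plain data, it can be stored next to the deployment manifest and reviewed like any other versioned dependency.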
The validator is important because ordinary JSON validity is not enough. A syntactically valid policy can still be semantically useless. The paper therefore adds linting rules: required fields must exist, identifiers and timestamps must follow patterns, valid_to cannot precede valid_from, retention periods must meet review needs, monitoring evidence fields must correspond to obligations or controls, and each card must include at least one critical auto-fail KPI and assurance mapping tokens.
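A few of these lint rules can be sketched in plain Python. This is an illustration under stated assumptions, not the paper's validator: the function name `lint_card`, the nested field names (`valid_from`, `valid_to`, `evidence`, `evidence_fields`, `critical_auto_fail`), and the reading of “evidence fields must correspond to obligations or controls” as “each monitored field is referenced by at least one rule” are all hypothetical.

```python
from datetime import datetime


def lint_card(card: dict) -> list:
    """Return semantic lint errors for a card; an empty list means pass."""
    errors = []

    # Rule: valid_to cannot precede valid_from.
    meta = card.get("meta", {})
    try:
        valid_from = datetime.fromisoformat(meta["valid_from"])
        valid_to = datetime.fromisoformat(meta["valid_to"])
        if valid_to < valid_from:
            errors.append("valid_to precedes valid_from")
    except (KeyError, TypeError, ValueError):
        errors.append("valid_from/valid_to missing or malformed")

    # Rule: every monitored evidence field is tied to a control or obligation.
    referenced = set()
    for rule in card.get("controls", []) + card.get("obligations", []):
        referenced.update(rule.get("evidence", []))
    for field in card.get("monitoring", {}).get("evidence_fields", []):
        if field not in referenced:
            errors.append(f"evidence field '{field}' is not tied to any rule")

    # Rule: at least one critical auto-fail KPI.
    if not any(k.get("critical_auto_fail") for k in card.get("kpis_thresholds", [])):
        errors.append("no critical auto-fail KPI defined")

    # Rule: assurance mapping tokens must be present.
    if not card.get("assurance_mapping"):
        errors.append("assurance_mapping is empty")

    return errors


good_card = {
    "meta": {"valid_from": "2025-01-01T00:00:00+00:00",
             "valid_to": "2025-12-31T00:00:00+00:00"},
    "controls": [{"id": "deny-failed-kyc", "effect": "deny", "evidence": ["kyc_status"]}],
    "obligations": [{"id": "record-escalation", "evidence": ["escalation_id"]}],
    "monitoring": {"evidence_fields": ["kyc_status", "escalation_id"]},
    "kpis_thresholds": [{"id": "transfer-without-kyc", "critical_auto_fail": True}],
    "assurance_mapping": ["NIST AI RMF"],
}

# A card that is valid JSON but fails three semantic rules.
bad_card = dict(
    good_card,
    meta={"valid_from": "2025-06-01T00:00:00+00:00",
          "valid_to": "2025-01-01T00:00:00+00:00"},
    monitoring={"evidence_fields": ["kyc_status", "geo_mismatch"]},
    kpis_thresholds=[],
)
```

In a CI/CD gate, a non-empty result from a function like `lint_card` would block the release, which is the point of treating the card as a deployment dependency.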
This is where the proposal becomes interesting for business practice. Many organisations already have the ingredients of governance: policies, risk registers, model documentation, approval workflows, incident reports, and audit evidence. The problem is that these ingredients often sit in separate systems, interpreted by separate teams, and reconciled manually after deployment. Policy Cards propose a single, versioned interface between those teams.
The business interpretation is straightforward: if a deployment cannot identify the policy version, critical thresholds, approved exceptions, evidence requirements, and framework mappings that govern it, the system is not merely “under-documented.” It is operationally ungoverned. That sounds harsh. It is also useful.
Do next: the agent acts under a live constraint surface
Once declared, the Policy Card moves into the runtime phase. The deployed agent or its surrounding middleware interprets the controls and obligations sections. Proposed actions are checked against conditions. Some actions proceed. Some are denied. Some require escalation. All relevant outcomes emit evidence as specified by the monitoring section.
The paper’s retail banking example makes this tangible. A payments agent may initiate low-risk payments when conditions such as passed KYC, acceptable risk score, sufficient device trust, no sanctions hit, and amount limits are satisfied. Higher-risk situations trigger escalation: risk score at or above 0.70, first-time payee with a sufficiently large amount, low device trust, geo mismatch, or certain beneficiary-screening scores. Red lines such as failed KYC, sanctions hit, mule-account flag, or tamper-check failure require denial.
This is not presented as a field trial showing fraud reduction. It is an implementation exemplar showing that the same schema can encode concrete financial controls and evidence fields. The evidence fields matter: risk score, device trust level, beneficiary-screening score, escalation ID, geo mismatch, KYC status, and related detector outputs. Without these fields, the rule is just governance poetry. With them, the decision can be replayed.
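The banking controls can be sketched as a small decision function that returns both an effect and the evidence needed to replay it. Only the 0.70 risk-score trigger comes from the exemplar as summarised above. The amount limits, the field names, the precedence order (deny, then escalation, then allow), and the fail-safe escalation for over-limit payments are assumptions made for this illustration, and beneficiary-screening scores are left out for brevity.

```python
from dataclasses import dataclass, asdict, replace

ESCALATE_RISK_SCORE = 0.70             # from the exemplar as summarised above
FIRST_PAYEE_ESCALATE_AMOUNT = 1_000.0  # hypothetical value
AUTO_APPROVE_LIMIT = 5_000.0           # hypothetical value


@dataclass(frozen=True)
class PaymentContext:
    kyc_passed: bool
    sanctions_hit: bool
    mule_account_flag: bool
    tamper_check_ok: bool
    risk_score: float
    device_trust_high: bool
    geo_mismatch: bool
    first_time_payee: bool
    amount: float


def evaluate(ctx: PaymentContext) -> dict:
    """Return the effect plus the evidence snapshot that justified it."""
    # Red lines are checked first, so a denial can never be masked by escalation.
    if (not ctx.kyc_passed or ctx.sanctions_hit
            or ctx.mule_account_flag or not ctx.tamper_check_ok):
        effect = "deny"
    elif (ctx.risk_score >= ESCALATE_RISK_SCORE
          or (ctx.first_time_payee and ctx.amount >= FIRST_PAYEE_ESCALATE_AMOUNT)
          or not ctx.device_trust_high
          or ctx.geo_mismatch):
        effect = "require_escalation"
    elif ctx.amount <= AUTO_APPROVE_LIMIT:
        effect = "allow"
    else:
        # Assumed fail-safe: anything outside the allow conditions goes to a human.
        effect = "require_escalation"
    return {"effect": effect, "evidence": asdict(ctx)}


low_risk = PaymentContext(
    kyc_passed=True, sanctions_hit=False, mule_account_flag=False,
    tamper_check_ok=True, risk_score=0.2, device_trust_high=True,
    geo_mismatch=False, first_time_payee=False, amount=120.0,
)
decision = evaluate(low_risk)
borderline = evaluate(replace(low_risk, risk_score=0.70))
```

Returning the evidence alongside the effect is the design choice that matters here: an auditor can feed the logged snapshot back into the same function and confirm the decision.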
The clinical triage example follows the same pattern but changes the domain semantics. The assistant is explicitly non-diagnostic. It can propose routine triage only when vitals are present, risk is below threshold, confidence is sufficiently high, and no red flag is detected. It must escalate red flags, low confidence, missing vitals, or specific high-risk combinations such as an older patient with chest pain. It must deny autonomous diagnosis and prescription. The card also defines a time-bound exception for remote clinical sites, requiring two-person approval and callback evidence.
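The time-bound exception is a useful pattern to see in code, because it shows how an exception stays bounded. The sketch below assumes a hypothetical exception record with `site_type`, `valid_from`, `valid_to`, and `approved_by` fields; none of these names, dates, or roles come from the paper.

```python
from datetime import datetime, timezone


def exception_applies(exception: dict, site_type: str,
                      callback_logged: bool, now: datetime) -> bool:
    """An exception is usable only if every bounding condition holds."""
    in_window = exception["valid_from"] <= now <= exception["valid_to"]
    # Two distinct approvers; the same person listed twice does not count.
    two_person = len(set(exception["approved_by"])) >= 2
    right_site = site_type == exception["site_type"]
    return in_window and two_person and right_site and callback_logged


# Hypothetical exception record for remote clinical sites.
remote_site_exception = {
    "site_type": "remote_clinic",
    "valid_from": datetime(2025, 1, 1, tzinfo=timezone.utc),
    "valid_to": datetime(2025, 3, 31, tzinfo=timezone.utc),
    "approved_by": ["clinical-lead", "compliance-officer"],
}

during = datetime(2025, 2, 15, tzinfo=timezone.utc)
after_expiry = datetime(2025, 4, 2, tzinfo=timezone.utc)
```

Once the window closes, the exception simply stops matching, so no one has to remember to remove it.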
The defence example is shorter but useful because it tests whether the schema can express safety-critical constraints: no autonomous kinetic actions, escalation for target designation, blue-force deconfliction, and immutable override audits. The paper’s point is not that defence autonomy has been solved. That would be a magnificent way to fail a seriousness test. The point is that the artifact can represent domain-specific rules without changing its core structure.
Across examples, the strongest contribution is not the individual thresholds, which are illustrative. It is the pattern: every operational permission is tied to a condition, every exception is bounded, every high-risk event has an escalation path, and every meaningful action has an evidence trail.
Audit continuously: evidence becomes replayable, not archaeological
The third stage is audit. The Policy Card supports two modes: automated CI-based audits after deployment iterations, and post-market or continuous audits by internal or external assurance teams. In both cases, evidence emitted during execution is compared against the declared controls, obligations, KPIs, and assurance mappings.
This is the point where Policy Cards become more than “policy as code.” They are closer to “policy as replayable evidence.” The card does not only say what the agent should have done. It specifies what evidence must exist to prove what happened.
That difference is commercially significant. In a regulated workflow, an audit that depends on reconstructing decision logic from logs, Slack messages, ticket comments, and someone named Daniel who left six months ago is not an audit process. It is archaeology. Policy Cards aim to make the audit trail a designed property of the system.
The paper’s banking exemplar defines a violation-rate KPI over a 30-day window and an escalation-SLA KPI. It also defines critical auto-fail events such as transfer without KYC, transfer to a sanctioned party, and override without two-person approval. The clinical example defines median handover latency and escalation precision, with critical auto-fail events such as autonomous diagnosis, autonomous prescription, and triage without vitals.
These numbers should be read carefully. They are not reported outcomes from deployed systems. They are policy thresholds encoded in exemplars. Their purpose is to show how quantitative governance targets can be represented and checked. The difference matters. A reader looking for experimental evidence will not find a benchmark table proving operational superiority. The paper is a framework and artifact proposal, supported by schema design, validation logic, crosswalks, and domain exemplars.
That is not a weakness if interpreted correctly. The evidence is architectural, not empirical. The paper demonstrates a way to express runtime governance in machine-readable form. It does not yet demonstrate that regulators will accept that form, that enterprises will maintain it correctly, or that every runtime enforcement backend will behave safely under adversarial conditions.
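The banking KPIs described above lend themselves to a replay-style check over logged events. The following is a sketch only: the three critical event names paraphrase the exemplar, while the event record shape, the `audit` function, and the 1% violation-rate ceiling are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta, timezone

CRITICAL_EVENTS = {
    "transfer_without_kyc",
    "transfer_to_sanctioned_party",
    "override_without_two_person_approval",
}
WINDOW = timedelta(days=30)   # the exemplar's 30-day window
MAX_VIOLATION_RATE = 0.01     # hypothetical threshold


def audit(events: list, now: datetime) -> dict:
    """Replay logged events against the card's KPI and auto-fail rules."""
    recent = [e for e in events if now - WINDOW <= e["ts"] <= now]
    violations = [e for e in recent if e["violation"]]
    critical = [e for e in recent if e["event_type"] in CRITICAL_EVENTS]
    rate = len(violations) / len(recent) if recent else 0.0
    # Any critical event fails the audit regardless of the overall rate.
    passed = not critical and rate <= MAX_VIOLATION_RATE
    return {"violation_rate": rate, "critical_events": len(critical), "passed": passed}


now = datetime(2025, 6, 30, tzinfo=timezone.utc)
events = [
    {"ts": now - timedelta(days=1), "event_type": "payment_allowed", "violation": False}
    for _ in range(199)
]
events.append({"ts": now - timedelta(days=2), "event_type": "late_escalation", "violation": True})
# An old critical event outside the window is ignored by this window's KPI.
events.append({"ts": now - timedelta(days=45), "event_type": "transfer_without_kyc", "violation": True})

report = audit(events, now)
```

The same function could run in CI after each deployment iteration or on demand for an external reviewer, which matches the two audit modes described above.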
Stress testing is a governance rehearsal, not the headline result
The paper also discusses stress testing and simulated assurance. This part is easy to over-read. Stress testing is not presented as a completed experimental campaign with measured performance gains. It is a proposed use of the Policy Card mechanism.
Because a Policy Card defines allowed actions, escalation conditions, evidence requirements, and critical failure thresholds, a test harness can use the same card as the live system. Developers can inject simulated edge cases: fraudulent transaction bursts, missing clinical data, latency spikes, red-flag symptoms, or inconsistent detector outputs. The test harness can then check whether the enforcement engine applies the correct allow, deny, or require_escalation rule and whether the required evidence is captured.
Such stress testing is best read as a robustness check on the governance layer. It supports the claim that Policy Cards can unify declaration, runtime behaviour, and audit. It does not prove that an AI system is safe in the general sense. A card can check whether a declared rule was followed. It cannot guarantee that the declared rule was wise, complete, or immune to gaming. One can encode nonsense with excellent schema discipline. Enterprise software has been proving this for decades.
Still, the stress-testing idea is powerful because it gives governance teams a testable artifact. Instead of asking whether a policy has been “communicated,” they can ask whether the agent fails correctly when confronted with known red lines. That is a healthier question.
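A harness of that kind can be sketched in a few lines. Everything here is hypothetical: the rule list stands in for a card's controls section, the first-match-wins ordering and default-deny fallback are assumptions, and the scenario names loosely echo the clinical edge cases mentioned above.

```python
# Hypothetical card fragment: ordered rules, first match wins.
CARD_CONTROLS = [
    {"id": "deny-autonomous-diagnosis",
     "when": lambda c: c["action"] == "diagnose", "effect": "deny"},
    {"id": "escalate-missing-vitals",
     "when": lambda c: not c.get("vitals_present", False), "effect": "require_escalation"},
    {"id": "escalate-red-flag",
     "when": lambda c: c.get("red_flag", False), "effect": "require_escalation"},
    {"id": "allow-routine-triage",
     "when": lambda c: c["action"] == "propose_triage", "effect": "allow"},
]
REQUIRED_EVIDENCE = ["rule_id", "effect", "action"]


def enforce(case: dict) -> dict:
    """Apply the first matching rule and emit an evidence record."""
    for rule in CARD_CONTROLS:
        if rule["when"](case):
            return {"rule_id": rule["id"], "effect": rule["effect"], "action": case["action"]}
    # Assumed fallback: unmatched actions are denied.
    return {"rule_id": "default-deny", "effect": "deny", "action": case["action"]}


def run_stress_tests(scenarios: list) -> list:
    """Check both the effect and the completeness of the emitted evidence."""
    failures = []
    for name, case, expected in scenarios:
        record = enforce(case)
        if record["effect"] != expected:
            failures.append(f"{name}: expected {expected}, got {record['effect']}")
        missing = [f for f in REQUIRED_EVIDENCE if record.get(f) is None]
        if missing:
            failures.append(f"{name}: missing evidence {missing}")
    return failures


SCENARIOS = [
    ("missing clinical data",
     {"action": "propose_triage", "vitals_present": False}, "require_escalation"),
    ("red-flag symptom",
     {"action": "propose_triage", "vitals_present": True, "red_flag": True}, "require_escalation"),
    ("autonomous diagnosis attempt",
     {"action": "diagnose", "vitals_present": True}, "deny"),
    ("routine case",
     {"action": "propose_triage", "vitals_present": True, "red_flag": False}, "allow"),
]
failures = run_stress_tests(SCENARIOS)
```

The harness tests the governance layer, not the model: it confirms that the enforcement engine fails correctly, and says nothing about whether the rules themselves are the right ones.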
The standards crosswalk makes compliance traceable, not automatically accepted
Policy Cards include assurance mappings to frameworks such as NIST AI RMF, ISO/IEC 42001, and the EU AI Act’s technical documentation and post-market monitoring requirements. The paper uses canonical tokens to connect card sections to governance functions or clauses: metadata and scope to governance context, controls to operational management, monitoring to measurement, review cadences to post-market monitoring, and change management to lifecycle control.
This is useful, but it should not be inflated. A crosswalk is not regulatory approval. It is a map between technical artifacts and compliance expectations. The paper’s claim is that Policy Cards can serve as an interoperable assurance layer, helping organisations show how their runtime controls correspond to recognised frameworks.
For business readers, the value is in reducing translation loss. Compliance teams speak in obligations, standards, clauses, and evidence. Engineering teams speak in schemas, validators, logs, gateways, and deployment gates. Policy Cards provide a shared object both sides can inspect. That does not remove legal judgement. It gives legal judgement something structured to attach to.
The operational advantage is especially visible in multi-jurisdictional settings. A global agent may need different constraints across regions, business units, data sensitivity levels, or product modes. Policy Cards could make those differences explicit through scoped policies, versioned exceptions, and jurisdiction-specific mappings. The uncertainty is whether organisations will govern the cards themselves with enough discipline. A badly maintained Policy Card becomes just another configuration file with an inflated job title.
Forward-looking features should be treated as extensions, not current proof
The paper’s later sections look ahead to agent-readable policies, multi-agent governance, cryptographic assurance, and ethical policy composition. These are important, but they sit at a different evidence level from the schema and exemplars.
Agent-readable governance means an agent can parse its own Policy Card, check authorisation, assess obligations, and validate context before acting. In multi-agent systems, each agent could carry a card and exchange compliance states or attestations with others. The paper describes this as a distributed assurance mesh.
The cryptographic extensions go further. Key-scoped encryption could restrict who can read sensitive policy content. Zero-knowledge proofs could allow an agent to prove adherence to selected conditions without revealing underlying data. This is particularly relevant for cross-border, regulated, or classified environments where evidence sharing is constrained.
Ethical policy composition is another extension. The paper suggests that fairness thresholds, transparency notices, consent duties, or non-manipulation rules could be encoded as controls or obligations, with priority ordering used to resolve conflicts between ethical constraints and optimisation goals.
These ideas are directionally coherent with the Policy Card framework, but they are not yet mature deployment evidence. They are exploratory extensions. The practical near-term value is not “agents will prove moral compliance cryptographically while coordinating in a decentralised mesh.” That sentence should come with a complimentary espresso and a legal waiver. The near-term value is simpler: put the rules, thresholds, exceptions, evidence fields, and framework mappings into a validated artifact that the deployment process can actually use.
What this means for enterprises deploying agents
The paper’s business relevance is strongest in settings where autonomous agents touch regulated actions, safety-critical workflows, or high-liability decisions. Payments, clinical triage, defence planning, insurance claims, procurement approvals, customer remediation, and financial advisory workflows all share the same structural problem: the agent’s permitted behaviour depends on context.
A generic model policy is not enough. The same underlying model may be safe in one role and unacceptable in another. An agent that can summarise payment rules is not the same as an agent that can initiate a payment. An assistant that can explain symptoms is not the same as one that can triage cases. A mission-planning tool that suggests options is not the same as one that can authorise action. The Policy Card makes that role distinction explicit.
A practical adoption pathway would look like this:
| Step | What the business does | What becomes measurable |
|---|---|---|
| Identify governed actions | List the agent actions that create legal, financial, safety, or reputational exposure | Which actions need allow, deny, or escalation rules |
| Encode runtime controls | Translate policies into scoped, attribute-based (ABAC-style) rules and obligations | Whether decisions match declared conditions |
| Bind evidence fields | Define logs, detector outputs, IDs, retention, and review cadence | Whether audit evidence exists and is complete |
| Add deployment gates | Run schema validation and lint checks in CI/CD | Whether governance defects block release |
| Connect runtime middleware | Enforce cards through gateways, orchestrators, or API controls | Whether live actions follow the card |
| Replay audits | Compare logs against thresholds and mappings | Whether compliance claims survive evidence review |
The ROI is not that Policy Cards make governance cheap. The ROI is that they make governance less ambiguous, less manual, and less dependent on heroic reconstruction after incidents. They can reduce audit preparation cost, shorten compliance-engineering handoffs, and make deployment approval more repeatable. They may also help firms identify where their current “AI governance framework” is mostly a slide deck wearing a blazer.
The harder organisational change is ownership. A Policy Card sits between legal, compliance, risk, product, security, and engineering. If everyone owns it, nobody owns it. Firms would need clear responsibility for authoring, approving, versioning, testing, and retiring cards. The paper’s schema can support that discipline; it cannot create it by force.
The boundary: a promising control plane, not a finished governance regime
The paper is strongest as an operational architecture for deployment-layer governance. It defines the artifact, proposes a schema, describes validation and linting, maps the schema to assurance frameworks, and demonstrates the pattern through domain exemplars. It also explains how the artifact could support Declare–Do–Audit workflows, CI/CD gating, runtime enforcement, stress testing, and continuous audit.
The boundary is equally clear. The work does not provide large-scale production evidence. It does not show regulator acceptance. It does not prove that Policy Cards are sufficient for all safety-critical deployments. It does not settle the formal semantics of every policy language backend. It does not eliminate the need for human judgement in choosing thresholds, writing rules, approving exceptions, or resolving conflicts among laws, ethics, and business objectives.
Those limitations do not make the proposal weak. They define what kind of contribution it is. Policy Cards are not a new AI model, not a benchmark, and not a magic compliance certificate. They are a proposed runtime governance interface: structured enough for machines, legible enough for humans, and explicit enough for auditors.
That is a useful category. The agent economy does not need more declarations that AI should be responsible. It needs mechanisms that make responsibility inspectable before the agent has already done the thing.
Governance finally gets an API-shaped object
The deeper value of Policy Cards is that they give governance an object that behaves like modern software infrastructure. It can be versioned. It can be validated. It can be diffed. It can block deployment. It can travel with the agent. It can tell the runtime what to enforce. It can tell the auditor what evidence to expect. It can tell compliance which external framework each control is meant to satisfy.
That does not make governance automatic. It makes governance operational.
If Model Cards helped the field ask, “What is this model, and what should we know about it?”, Policy Cards ask a sharper question: “Under this deployment, what exactly is this agent allowed to do, and how will we prove it obeyed?”
For autonomous agents, that question is no longer optional. Once software starts acting on behalf of institutions, policy can no longer remain a PDF in the next room. It has to go live.
Cognaptus: Automate the Present, Incubate the Future.
[1] Juraj Mavračić, “Policy Cards: Machine-Readable Runtime Governance for Autonomous AI Agents,” arXiv:2510.24383, 19 October 2025, https://arxiv.org/abs/2510.24383.