Game of Prompts: How Game Theory and Agentic LLMs Are Rewriting Cybersecurity

TL;DR for operators

A suspicious domain appears in a DNS log. A conventional classifier either recognises it, misses it, or assigns a confidence score that someone in the SOC must interpret while pretending the queue is under control. The paper’s more interesting proposal is not “let an LLM summarise the alert”. That would be the enterprise equivalent of putting a helpful intern on a fire alarm.

Quanyan Zhu’s chapter argues for something deeper: cybersecurity should be modeled as an adversarial game whose players now include LLM-powered agents, human analysts, attackers, users, tools, memories, prompts, and workflows.¹ Game theory supplies the strategic structure: attacker, defender, incentives, information asymmetry, deception, commitment, belief updating. Agentic LLMs supply the operational bridge: they can query tools, interpret language, reason from partial evidence, simulate perspectives, and coordinate across multi-agent workflows.

The practical implication is that prompts become more than instructions. In this framing, prompts are strategic controls. A red-team agent’s prompt shapes how it explores vulnerabilities. A blue-team agent’s prompt shapes how it weighs evidence, escalates uncertainty, or chooses containment. A coordinator agent’s prompt shapes how specialist agents disagree, reconcile, and update their shared view. The “game” has moved partly into the reasoning layer.

This is not an empirical paper showing that agentic LLM SOC systems beat existing tools by X percent. There is no production benchmark, no ablation table, no controlled deployment, and no measured ROI. Its value is architectural: it gives security leaders and AI builders a vocabulary for designing agentic cyber systems that are strategic, auditable, modular, and less embarrassingly naïve about adversaries.

The operator takeaway is simple: do not buy or build “AI security agents” as if they are smarter alert parsers. Treat them as participants in a strategic workflow. Define what they know, what they can do, what they are allowed to infer, how they resolve disagreement, how they remember prior rounds, and where humans remain the final authority. Otherwise, congratulations: you have automated confusion.

The alert is not the problem; the game around the alert is

Start with the kind of example the paper itself uses. A DNS query points to a suspicious domain, something like abcxyz.ru. A standard model checks whether the domain resembles known malicious infrastructure. A rules engine checks blocklists. A human analyst checks threat feeds, endpoint telemetry, and whether the affected machine belongs to finance, engineering, or that one executive who clicks everything with Olympic confidence.

The alert is technically small. The surrounding game is not.

The attacker may be probing. The defender may have partial visibility. The domain may be new, compromised, benign, or part of a multi-stage campaign. The user may have triggered it through negligence, business necessity, or credential compromise. The system’s response also teaches the attacker something. Block too quickly and the attacker learns the boundary of your detection. Watch too passively and the attacker gets time. Escalate everything and the SOC becomes a human denial-of-service attack against itself.

This is why the paper begins by treating cybersecurity as an interaction among agents: defenders, attackers, and users. That sounds obvious, which is usually where bad security architecture begins. The non-obvious part is that those agents have different information, different incentives, different capabilities, and different time horizons. Advanced persistent threat (APT) campaigns are not one-shot events; they are repeated, adaptive interactions where each side learns from the other.

Classical cybersecurity tooling often compresses this into classification: malicious or benign, high risk or low risk, block or allow. Game theory keeps the strategic structure visible. Who moves first? What does each player observe? What does each player believe the other knows? What signal is being sent by a honeypot, a patch, a delay, a quarantine, or a deliberately noisy system response?

The paper’s first contribution is to position game theory and agentic LLMs as complementary layers rather than rival buzzwords. Game theory explains the adversarial structure. LLM agents operationalise parts of the reasoning and coordination. In a sane architecture, the LLM is not asked to “be cyber smart” in the abstract. It is placed inside a game: a bounded role, with information, tools, objectives, uncertainty, and constraints.

That framing matters because adversaries also adapt. A defender who uses an LLM merely to summarise SIEM events has bought a clerical improvement. A defender who uses game-theoretic structure to decide how agents gather evidence, simulate attacker responses, deploy deception, and coordinate escalation may be changing the defensive posture itself. Modestly. Carefully. Without pretending the word “agent” comes with a warranty.

Game theory gives the SOC a map of adversarial structure

The paper reviews several game-theoretic frameworks: static games, dynamic games, incomplete-information games, signaling games, Bayesian games, Stackelberg games, network interdiction games, deception games, and socio-economic security games. That list can look like a syllabus trying to win a taxonomic staring contest, so the business reader should compress it into four questions.

First, is the interaction simultaneous or sequential? If both sides act without seeing the other’s move, Nash-style reasoning is useful. If the defender commits first and the attacker responds, Stackelberg reasoning becomes more natural. In cybersecurity, commitment matters: a defender may choose sensor placement, deception posture, patching priority, or access segmentation before the attacker chooses the path of least resistance.

Second, is information complete or asymmetric? It usually is asymmetric. Attackers may know a zero-day, defenders may know internal topology, users may know their own intent, and nobody knows quite why the procurement portal still runs that antique dependency. Incomplete-information games model private types, beliefs, and belief updates. This is especially relevant for insider threats, deception, phishing, and persistent campaigns.

Third, is the game one-shot or repeated? Real attacks unfold over time. APT behaviour, lateral movement, privilege escalation, credential reuse, and stealthy persistence are dynamic processes. Dynamic games and Markov-style formulations are useful because state matters. Yesterday’s observation changes today’s belief and tomorrow’s best response.

Fourth, are humans part of the system? The answer is yes, unfortunately and fortunately. Human users introduce bounded rationality, cognitive bias, fatigue, incentives, and patterned mistakes. Socio-economic security games matter because many breaches are not purely technical failures. They are failures of incentives, trust, compliance, attention, and design.
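The first question, simultaneous versus sequential, can be made concrete with a toy Stackelberg security game. The sketch below is not from the paper: the two targets, their values, and the coverage grid are hypothetical numbers chosen for illustration. The defender commits to how one unit of monitoring is split across two targets; the attacker sees that commitment and hits whichever target offers the higher expected gain.

```python
# Toy Stackelberg security game: the defender commits to coverage, the attacker best-responds.
# All targets and payoff values are hypothetical illustrations, not figures from the paper.

TARGETS = {
    "finance_db": {"value": 10.0},
    "build_server": {"value": 4.0},
}

def attacker_best_response(coverage):
    """Return the target maximising the attacker's expected gain under known coverage."""
    def expected_gain(target):
        # An attack pays off only when the target happens to be uncovered.
        return (1.0 - coverage[target]) * TARGETS[target]["value"]
    return max(TARGETS, key=expected_gain)

def defender_loss(coverage):
    """Expected defender loss once the attacker has best-responded."""
    target = attacker_best_response(coverage)
    return (1.0 - coverage[target]) * TARGETS[target]["value"]

def best_commitment(steps=100):
    """Grid-search how to split one unit of monitoring across the two targets."""
    best = None
    for k in range(steps + 1):
        c = k / steps
        coverage = {"finance_db": c, "build_server": 1.0 - c}
        loss = defender_loss(coverage)
        if best is None or loss < best[1]:
            best = (coverage, loss)
    return best

coverage, loss = best_commitment()
print(coverage, round(loss, 3))
```

With these made-up values the search lands close to the analytic indifference point of 10/14 coverage on the higher-value target, where the attacker's expected gain from either target is equal. The design point is the ordering: the defender optimises knowing the attacker will respond, rather than guessing a simultaneous move.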

A useful operator translation looks like this:

Game-theoretic idea	Cybersecurity interpretation	Operational design question
Nash equilibrium	No player benefits by changing strategy alone	Are our agents stable when multiple teams or tools act independently?
Stackelberg game	One player commits, another responds	What should defenders reveal, hide, or pre-commit to before attackers adapt?
Signaling game	One side sends signals under asymmetric information	Are alerts, honeypots, banners, delays, and responses changing attacker beliefs?
Bayesian updating	Beliefs change as evidence arrives	How should agents update suspicion after weak, conflicting, or delayed signals?
Dynamic game	Interaction unfolds across time	Are we designing for campaigns, not isolated alerts?
Deception game	Defender manipulates attacker perception	Where can decoys, honeypots, and moving-target defenses waste attacker effort?
Socio-economic game	Human incentives and bounded rationality shape outcomes	Are policies aligned with how people actually behave, or with the HR slideshow version of humanity?
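The Bayesian updating row is the easiest one to make tangible. The sketch below assumes a hypothetical prior and hypothetical likelihoods for three evidence types, and it assumes the evidence is conditionally independent given the domain's true status. None of these numbers come from the paper; they only show how weak and conflicting signals move a belief up and down.

```python
# Sequential Bayesian update of the belief that a domain is malicious.
# The prior and the likelihoods below are hypothetical; in practice they would be
# estimated from an organisation's own incident history.

# For each evidence type: (P(evidence | malicious), P(evidence | benign))
LIKELIHOODS = {
    "newly_registered": (0.60, 0.05),
    "on_threat_feed": (0.30, 0.01),
    "seen_on_many_hosts": (0.10, 0.40),  # popular domains are more often benign
}

def update(prior, evidence):
    """Apply Bayes' rule for one observed piece of evidence."""
    p_e_mal, p_e_ben = LIKELIHOODS[evidence]
    numerator = p_e_mal * prior
    return numerator / (numerator + p_e_ben * (1.0 - prior))

def posterior(prior, observations):
    """Fold a sequence of observations into the belief, assuming conditional independence."""
    belief = prior
    for evidence in observations:
        belief = update(belief, evidence)
    return belief

belief = posterior(0.01, ["newly_registered", "seen_on_many_hosts"])
print(f"P(malicious) = {belief:.3f}")
```

In this example the first observation raises suspicion and the second pulls it back down, which is the behaviour a triage agent needs when signals disagree. Swapping in `"on_threat_feed"` shows how a single strong signal dominates several weak ones.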

The paper is not claiming game theory magically solves cybersecurity. Classical game theory itself has limits. Many equilibrium concepts assume rational actors, stable payoffs, common knowledge, and tractable strategy spaces. Modern cyber conflict cheerfully violates all of these. Attackers improvise. Users misread warnings. Systems change. Defenders operate under time pressure. Payoffs are often inferred after the damage is done.

This is where the paper makes its pivot: LLM agents may help relax some of those classical assumptions.

LLMs change the modeling layer, not just the tooling layer

The common misconception is that LLMs are just automation glue: summarise logs, draft tickets, query tools, write scripts, generate playbooks. Useful, yes. But not the paper’s sharper claim.

The paper argues that LLMs can change the way strategic agents are modeled. Classical game theory often represents decision-making as explicit utility maximisation: an agent chooses the action that maximises expected payoff. This is analytically elegant and psychologically suspicious. In real cyber environments, actors reason from incomplete context, role-specific knowledge, heuristics, language, incentives, fatigue, and sometimes pure organisational theatre.

LLMs offer a different modelling primitive. Instead of assuming that every agent solves a clean optimisation problem, an LLM-based agent can generate an action distribution from a prompt, context, memory, role, and internal representation. That does not make the agent “rational”. It makes its reasoning process explicit enough to shape, test, constrain, and compare.

This is the paper’s intellectual hinge: strategic behaviour can move from action space into reasoning space.

In a classical Nash game, agents choose actions. In the paper’s LLM-Nash framing, agents choose prompts or reasoning configurations that induce action distributions. The prompt is not merely a user interface. It is a cognitive strategy. A defender prompt might instruct an agent to prioritise worst-case impact, update beliefs from weak signals, or avoid containment unless business-critical services are threatened. An attacker prompt might instruct a simulated red-team agent to exploit defender bias, mimic benign behaviour, or minimise detectability.

The resulting equilibrium is not just “which action is stable?” It becomes “which reasoning setup is stable, given the other agent’s reasoning setup?”
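A toy version of that question can be written down directly. In the sketch below, a hard-coded lookup table stands in for the model: each prompt maps to an action distribution, and payoffs are taken in expectation. The prompt names, the distributions, and the zero-sum payoff table are all hypothetical, and a pure-strategy equilibrium like the one found here is not guaranteed to exist in general.

```python
# Toy prompt-space game: each player picks a prompt, the prompt induces an action
# distribution, and payoffs are taken in expectation over those distributions.
# INDUCED_POLICY stands in for an LLM call; every number here is hypothetical.
from itertools import product

# Action distribution induced by each prompt, per player.
INDUCED_POLICY = {
    "defender": {
        "cautious": {"contain": 0.8, "monitor": 0.2},
        "patient": {"contain": 0.2, "monitor": 0.8},
    },
    "attacker": {
        "aggressive": {"exfiltrate": 0.9, "lie_low": 0.1},
        "stealthy": {"exfiltrate": 0.3, "lie_low": 0.7},
    },
}

# Defender payoff for each (defender action, attacker action); the game is zero-sum.
DEFENDER_PAYOFF = {
    ("contain", "exfiltrate"): 2.0,   # attack stopped
    ("contain", "lie_low"): -1.0,     # needless business disruption
    ("monitor", "exfiltrate"): -3.0,  # data lost
    ("monitor", "lie_low"): 1.0,      # intelligence gathered
}

def expected_payoff(defender_prompt, attacker_prompt):
    """Expected defender payoff when both prompts are fixed."""
    d_policy = INDUCED_POLICY["defender"][defender_prompt]
    a_policy = INDUCED_POLICY["attacker"][attacker_prompt]
    return sum(
        d_prob * a_prob * DEFENDER_PAYOFF[(d_act, a_act)]
        for (d_act, d_prob), (a_act, a_prob) in product(d_policy.items(), a_policy.items())
    )

def pure_prompt_equilibria():
    """Prompt pairs where neither side gains by switching its own prompt."""
    d_prompts = list(INDUCED_POLICY["defender"])
    a_prompts = list(INDUCED_POLICY["attacker"])
    equilibria = []
    for dp, ap in product(d_prompts, a_prompts):
        value = expected_payoff(dp, ap)
        defender_ok = all(expected_payoff(other, ap) <= value for other in d_prompts)
        attacker_ok = all(expected_payoff(dp, other) >= value for other in a_prompts)
        if defender_ok and attacker_ok:
            equilibria.append((dp, ap))
    return equilibria

print(pure_prompt_equilibria())
```

Note what the players are choosing: not "contain" or "exfiltrate", but the reasoning posture that makes those actions more or less likely. In a real system the lookup table would be replaced by sampled model outputs, which is exactly where the instability discussed later comes in.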

A simplified way to state the shift is:

$$ \text{Classical game: choose action } a_i \in A_i $$

$$ \text{LLM-agent game: choose prompt } p_i \in P_i \rightarrow \text{LLM-induced policy } \pi_i(a_i \mid p_i, I_i) $$

Here, $A_i$ and $P_i$ are agent $i$’s action and prompt sets, $I_i$ represents the agent’s information, and $\pi_i$ is the policy the model induces. The important move is the arrow. The prompt does not directly equal the action. It induces a policy through the model. That indirection is both the opportunity and the headache.

The opportunity is controllability. If prompts shape behaviour, then prompt design, retrieval, memory, fine-tuning, and agent role definition become security controls. The headache is instability. Similar-looking prompts can produce materially different behaviour. A tiny wording change may alter whether an agent escalates, quarantines, ignores, or invents a plausible-sounding explanation. Excellent. The machine is fluent and moody.

The paper explicitly discusses prompt-space stability: similar prompts should ideally induce similar output distributions. For production systems, that principle translates into regression testing, prompt versioning, adversarial prompt evaluation, role-specific benchmarks, and careful change management. Prompt engineering in this setting is not copywriting. It is policy engineering with a language-shaped wrench.
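One way to turn prompt-space stability into a regression test is to compare the action distributions two prompt variants induce. The sketch below uses total variation distance, which is my choice of metric rather than something the paper prescribes, and the sampled decisions are hypothetical stand-ins for repeated agent runs. The tolerance is likewise an assumption a team would have to set for itself.

```python
# Prompt-sensitivity regression check: compare the action distributions induced by
# two prompt variants using total variation distance.
# The sampled decisions below are hypothetical stand-ins for repeated agent runs.
from collections import Counter

def action_distribution(samples):
    """Empirical distribution over actions from a list of sampled decisions."""
    counts = Counter(samples)
    total = len(samples)
    return {action: n / total for action, n in counts.items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    actions = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in actions)

def is_stable(baseline_samples, variant_samples, tolerance=0.1):
    """Flag whether a reworded prompt keeps behaviour within a chosen tolerance."""
    p = action_distribution(baseline_samples)
    q = action_distribution(variant_samples)
    return total_variation(p, q) <= tolerance

baseline = ["escalate"] * 70 + ["monitor"] * 25 + ["quarantine"] * 5
reworded = ["escalate"] * 40 + ["monitor"] * 30 + ["quarantine"] * 30
print(is_stable(baseline, reworded))
```

A check like this belongs in the same pipeline as unit tests: a prompt edit that shifts the quarantine rate this much should fail review, however innocent the wording change looked.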

The suspicious domain becomes a multi-agent workflow

Return to the DNS alert. Under the paper’s agentic framing, a mature SOC workflow would not ask one model to decide everything. It would decompose the work.

One agent normalises telemetry. Another checks domain reputation. Another correlates endpoint behaviour. Another retrieves past incidents. Another evaluates whether the observed sequence fits phishing, command-and-control, supply-chain compromise, or benign automation. Another generates a containment recommendation. A human analyst reviews the decision boundary when business impact or uncertainty is high.

That is the paper’s third contribution: it maps multi-agent LLM workflow structures to cybersecurity tasks. The workflow taxonomy includes chain, star, parallel, feedback, heterogeneous, hybrid, and Gestalt-style game descriptions.

The figures in the paper are not empirical evidence. They are architectural diagrams. Their purpose is to clarify implementation: how tasks move through agents, where coordination happens, and where feedback loops can be placed. Treating them as performance proof would be a category error, though a surprisingly common one in AI slide decks.

Workflow pattern	What it does	Cybersecurity fit	Main failure mode
Chain	Passes output step by step through specialised agents	Forensics, post-incident reporting, structured triage	Early error propagation
Star	Uses a central coordinator with specialist peripheral agents	Alert triage, risk scoring, compliance checks	Coordinator bottleneck or bad synthesis
Parallel	Runs independent analyses across data streams or zones	Distributed threat hunting, cloud/endpoint comparison	Inconsistent assumptions across agents
Feedback	Loops monitoring, analysis, planning, and execution	Active defense, deception, red-blue simulation	Oscillation, runaway loops, unstable decisions
Heterogeneous	Combines LLM agents with human analysts	Incident response and high-risk escalation	Ambiguous authority between humans and agents
Hybrid	Blends sequential, parallel, star, and feedback structures	Real SOC workflows	Complex governance and audit burden
Gestalt game	Models layered games inside larger games	Cyber-physical systems, trust networks, APT defense	Difficult formal verification and coordination

This table is more useful than memorising the taxonomy because it forces a design choice. Chain workflows are interpretable but brittle. Star workflows are coordinated but depend heavily on the central agent. Parallel workflows scale but need synchronisation. Feedback workflows adapt but can become unstable. Hybrid workflows are realistic, which is another way of saying messy.

The paper’s deeper point is that workflow architecture is itself part of the game. A poorly designed multi-agent SOC can amplify hallucinations, propagate false assumptions, or create artificial consensus. A better one can use redundancy, disagreement, voting, debate, reconciliation, provenance tracking, and human review to make errors visible before they become actions.

Multi-agent design is therefore not a decorative layer around LLMs. It is the control surface.
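As a rough illustration of a star workflow that keeps disagreement visible, the sketch below has a coordinator collect verdicts from three specialists and hand the case to a human when they conflict or when average confidence is low. The specialists are plain functions standing in for LLM or tool calls, and their names, rules, confidence values, and the threshold are all hypothetical.

```python
# Star workflow sketch: a coordinator collects verdicts from specialist agents and
# escalates to a human when they disagree or when confidence is low.
# The specialists are plain functions standing in for LLM or tool calls; their names,
# rules, and thresholds are hypothetical.
from dataclasses import dataclass

@dataclass
class Verdict:
    agent: str
    label: str         # "malicious" or "benign"
    confidence: float  # 0.0 to 1.0
    evidence: str      # provenance kept alongside every verdict

def reputation_agent(alert):
    flagged = alert["domain_age_days"] < 30
    return Verdict("reputation", "malicious" if flagged else "benign", 0.7, "domain registration age")

def endpoint_agent(alert):
    suspicious = alert["new_process_spawned"]
    return Verdict("endpoint", "malicious" if suspicious else "benign", 0.8, "process telemetry")

def history_agent(alert):
    seen = alert["seen_in_past_incidents"]
    return Verdict("history", "malicious" if seen else "benign", 0.6, "incident archive lookup")

SPECIALISTS = [reputation_agent, endpoint_agent, history_agent]

def coordinate(alert, min_confidence=0.65):
    """Return (decision, verdicts); disagreement is surfaced rather than averaged away."""
    verdicts = [agent(alert) for agent in SPECIALISTS]
    labels = {v.label for v in verdicts}
    if len(labels) > 1:
        return "escalate_to_human", verdicts
    mean_confidence = sum(v.confidence for v in verdicts) / len(verdicts)
    if mean_confidence < min_confidence:
        return "escalate_to_human", verdicts
    decision = "recommend_containment" if labels == {"malicious"} else "close_alert"
    return decision, verdicts

alert = {
    "domain": "abcxyz.ru",
    "domain_age_days": 3,
    "new_process_spawned": False,
    "seen_in_past_incidents": False,
}
decision, verdicts = coordinate(alert)
print(decision)
for v in verdicts:
    print(f"  {v.agent}: {v.label} ({v.confidence}) via {v.evidence}")
```

The coordinator here deliberately refuses to break ties on its own. That is a design choice, not a requirement: a voting or debate step could replace the escalation, at the cost of the artificial consensus the paragraph above warns about.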

Deception is where game theory earns its keep

Cyber deception is one of the clearest places where the paper’s game-theoretic framing becomes operationally useful. Deception is not simply “deploy honeypots”. It is the deliberate shaping of adversary beliefs.

A honeypot, decoy asset, moving-target defense, or obfuscated response changes what the attacker thinks the system is. In signaling-game terms, the defender sends signals under asymmetric information. The attacker observes those signals, updates beliefs, and chooses an action. The defender’s goal is not only to detect but to steer the attacker toward less damaging behaviour: waste time, reveal capability, touch monitored infrastructure, or choose a less valuable path.
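The belief-shaping step can be sketched with a few lines of arithmetic. Below, a host is either real or a honeypot, the defender chooses how often each type displays a vulnerable-looking banner, and the attacker updates on seeing the banner before deciding whether to attack. The prior, the signaling probabilities, and the attacker's payoffs are hypothetical; the sketch illustrates the mechanism, not a recommended policy.

```python
# Signaling sketch: a host is either real or a honeypot, and the defender chooses how
# often each type shows a "vulnerable-looking" banner. The attacker sees the banner,
# updates its belief with Bayes' rule, and attacks only if the expected payoff is positive.
# All probabilities and payoffs are hypothetical.

def attacker_belief_real(prior_real, p_banner_given_real, p_banner_given_honeypot):
    """P(host is real | vulnerable banner observed)."""
    joint_real = prior_real * p_banner_given_real
    joint_honeypot = (1.0 - prior_real) * p_banner_given_honeypot
    return joint_real / (joint_real + joint_honeypot)

def attacker_attacks(belief_real, gain_if_real=5.0, cost_if_honeypot=8.0):
    """Attack only when the expected payoff of attacking beats walking away (payoff 0)."""
    expected = belief_real * gain_if_real - (1.0 - belief_real) * cost_if_honeypot
    return expected > 0.0

# 80% of hosts are real. Compare two defender signaling policies.
for label, p_real, p_honeypot in [
    ("honeypots stand out", 0.1, 0.9),
    ("honeypots blend in", 0.5, 0.5),
]:
    belief = attacker_belief_real(0.8, p_real, p_honeypot)
    print(label, round(belief, 3), attacker_attacks(belief))
```

The same banner produces opposite attacker decisions under the two policies, because the attacker's posterior differs. Which outcome the defender prefers depends on the goal: keeping attackers away from bannered hosts, or drawing them onto monitored infrastructure.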

Agentic LLMs add a second layer. They can help generate, interpret, and adapt language-based or context-rich signals. Classical signaling games often reduce messages to simple symbols. Cyber operations often involve natural language, logs, banners, phishing content, negotiation, support tickets, policy text, and ambiguous human communication. LLMs operate natively in that medium.

This does not mean defenders should unleash autonomous deception agents into production networks with poetic licence. It means deception design can become more adaptive. A cyber defense agent might retrieve previous intrusion patterns, policy updates, and incident history before recommending whether to expose a decoy, delay a response, or escalate quietly. A red-team agent might retrieve known vulnerabilities and failed attempts to tailor its next probe. Both sides can use context; both sides can learn.

That symmetry is uncomfortable, which is why it is useful. The same mechanisms that improve defense can improve attack. A paper that ignores this would be selling comfort. This one at least keeps the adversarial structure in view.

Prompt engineering becomes strategic control, not artisanal wording

The paper’s sections on prompt engineering, RAG, and fine-tuning are easy to misread as a standard LLM operations checklist. They are not. In the game-theoretic framing, each mechanism changes the strategic agent.

Prompt engineering changes the immediate reasoning scaffold. It can encode role, objective, uncertainty posture, belief hierarchy, escalation logic, and interpretation style. In a Stackelberg-style signaling setting, the leader’s prompt shapes the message it sends, while the follower’s prompt shapes how that message is interpreted and acted upon. The prompt is therefore part of the strategic move.

RAG changes what the agent can know at inference time. Instead of relying only on static model parameters or a bloated prompt, an agent retrieves relevant logs, threat indicators, previous incidents, policy constraints, or adversary precedents. For repeated games, this matters because memory supports temporal coherence. The agent can condition current decisions on what happened in earlier rounds.
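A minimal sketch of that retrieval step follows. Production systems typically use embedding search over a vector store; plain word-overlap (Jaccard) similarity is swapped in here so the example runs with the standard library alone. The incident notes and the query are invented for illustration.

```python
# Minimal retrieval sketch: rank past incident notes by word overlap with the current
# alert and return the closest matches as context for the agent.
# Real systems typically use embedding search; Jaccard overlap is swapped in here so the
# example runs with the standard library only. The incident notes are invented.

def tokens(text):
    """Lower-cased set of whitespace-separated words."""
    return set(text.lower().split())

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def retrieve(query, memory, k=2):
    """Return the k memory entries most similar to the query, best first."""
    query_tokens = tokens(query)
    ranked = sorted(memory, key=lambda note: jaccard(query_tokens, tokens(note)), reverse=True)
    return ranked[:k]

MEMORY = [
    "round 1 dns beacon to new domain from finance laptop contained",
    "round 2 phishing email reported by user no payload",
    "round 3 dns beacon to new domain from build server monitored",
]

context = retrieve("dns beacon to new domain from finance laptop", MEMORY)
print(context)
```

Whatever the similarity function, the governance point is the same: the retrieved notes become part of the agent's information set, so anyone who can write to the memory can influence the next round.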

Fine-tuning changes longer-term behavioural priors. The paper frames red-teaming and blue-teaming as strategic fine-tuning problems: attacker models can be tuned to simulate threat behaviours, while defender models can be tuned through preference alignment to counter adversarial behaviour. The paper also notes that multi-agent preference settings may require solution concepts beyond simple ranking, including coarse correlated equilibria, evolutionarily stable strategies, or Pareto frontiers.

A business reader should translate this into governance categories:

Control mechanism	Strategic function	Enterprise governance question
Prompt design	Sets immediate reasoning posture	Who approves prompts that affect containment, escalation, or deception?
RAG	Supplies situational memory and external evidence	Which sources are trusted, current, logged, and access-controlled?
Fine-tuning	Embeds durable behavioural tendencies	What data shaped the agent’s threat assumptions and risk tolerance?
Multi-agent debate	Surfaces disagreement and weak evidence	When does disagreement block action versus trigger escalation?
Human-in-the-loop review	Preserves authority under high uncertainty	Which decisions are never delegated fully to agents?
Provenance logging	Makes reasoning auditable	Can we reconstruct why an agent recommended an action?

This is where many “AI SOC” implementations will quietly fail. They will treat prompt updates as harmless configuration edits, RAG sources as convenience plumbing, and agent coordination as product UX. In adversarial settings, those are governance surfaces. If an attacker can influence retrieved context, steer prompts, poison memory, or exploit agent disagreement, the defender has not built intelligence. The defender has built a bureaucracy with APIs.

What the paper shows, what it proposes, and what it does not prove

The paper is best read as a conceptual architecture chapter. It synthesises existing game-theoretic cyber models, identifies limitations of classical assumptions, proposes LLM-augmented Nash and Stackelberg formulations, and organises multi-agent workflow patterns for cyber defense.

It does not run a SOC benchmark. It does not compare detection accuracy across models. It does not show that LLM-Nash prompting improves incident response time. It does not provide ablations proving that RAG beats fine-tuning or that star workflows outperform chain workflows. The figures are workflow schematics. The Rock-Paper-Scissors example is illustrative, designed to clarify prompt-space equilibrium, not to establish cybersecurity performance.

That distinction is not a weakness if read properly. It is only a weakness if someone tries to turn it into a vendor claim by Friday.

Element in the paper	Likely purpose	What it supports	What it does not prove
Review of static, dynamic, Bayesian, signaling, and Stackelberg games	Background and conceptual foundation	Cybersecurity can be modeled as strategic interaction under uncertainty	That a specific game model fits a specific enterprise environment
Cyber deception, network security, and socio-economic security examples	Application mapping	Game theory has practical cybersecurity domains	That LLM agents improve those domains empirically
LLM-Nash formulation	New conceptual model	Prompts can be strategic variables that induce policies	That prompt equilibria are stable in production models
Prompt-space Rock-Paper-Scissors example	Toy illustration	Reasoning-level equilibrium may differ from classical action-level equilibrium	That this behaviour generalises to real cyber operations
LLM-Stackelberg formulation	New conceptual model	Language-based signaling can be modeled as leader-follower reasoning	That LLM deception or defense is safe to automate
RAG and fine-tuning sections	Mechanism discussion	Memory and adaptation alter strategic behaviour	That one adaptation method is superior
Multi-agent workflow figures	Architecture clarification	Different coordination structures suit different tasks	That these workflows improve accuracy, latency, or cost
Gestalt game framing	System-of-systems abstraction	Local games can interact inside larger workflow games	That meta-equilibria are operationally computable in ordinary SOCs

The business value of the paper is therefore not “deploy this and reduce breach risk by 37%”. The business value is better diagnosis of what kind of AI security system you are actually building.

Are you building a classifier with a chat interface? A tool-using analyst agent? A red-team simulator? A deception planner? A multi-agent incident response workflow? A human-AI decision system with escalation boundaries? Each has different risks, controls, and evaluation requirements.

The paper gives language for that distinction. In a market where everything is currently an agent, a copilot, or a platform—terms that now mean “software, but wearing a cape”—that language is not trivial.

The architecture implication: design the game before buying the agents

For enterprise teams, the most practical use of this paper is as an architecture checklist.

Before deploying agentic cyber systems, define the players. Not just “attacker” and “defender”, but human analysts, endpoint agents, SIEM systems, threat intelligence feeds, identity providers, cloud controls, users, auditors, and adversarial input channels. Then define what each player can observe, what actions each can take, and what incentives or objectives shape those actions.

Next, define the information structure. Which agents see raw logs? Which see summaries? Which can query external APIs? Which can retrieve past incidents? Which can see user identity or business context? Which are deliberately blinded to reduce leakage or bias?

Then define the strategic workflow. If the task is post-breach analysis, a chain may be acceptable. If the task is alert triage across independent evidence sources, a star or parallel workflow may be better. If the task is active defense, feedback loops become necessary but dangerous. If humans must approve containment, the workflow is heterogeneous by design, not as an afterthought.

Finally, define equilibrium in operational terms. Not mathematical purity. Practical stability. Do the agents converge on consistent recommendations under small prompt changes? Do they disagree productively? Do they over-escalate? Do they under-escalate? Can they be tricked by adversarial language? Can they explain which evidence changed the decision? Can a human override them cleanly?

A useful enterprise implementation path might look like this:

Start with decision support, not autonomy. Use agents to gather evidence, generate hypotheses, and draft recommendations. Keep containment authority bounded.
Instrument provenance from day one. Log retrieved sources, prompts, tool calls, intermediate conclusions, disagreements, and final recommendations.
Use multi-agent disagreement deliberately. Do not force consensus too early. Disagreement is often the useful part.
Treat prompts as controlled assets. Version them, test them, review them, and restrict who can modify them.
Separate red-team and blue-team agents. Their objectives, data access, and evaluation criteria should not blur into one cheerful assistant.
Evaluate workflows, not only models. A weaker model in a better workflow may outperform a stronger model acting alone.
Define human authority clearly. “Human in the loop” is not a governance model. It is a phrase people use when they have not designed one.
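Two of those steps, provenance and prompt versioning, are cheap to prototype. The sketch below fingerprints a prompt with a SHA-256 hash so that any wording change produces a new version identifier, and it records what an agent retrieved, called, and recommended as a JSON line. The field names and example values are hypothetical rather than any standard schema.

```python
# Provenance sketch: fingerprint each prompt so changes are detectable, and record
# what an agent saw and recommended as an append-only JSON line.
# Field names and the example values are hypothetical, not a standard schema.
import hashlib
import json
from datetime import datetime, timezone

def prompt_fingerprint(prompt_text):
    """Stable short identifier for a prompt version; any wording change alters it."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()[:12]

def provenance_record(agent, prompt_text, retrieved_sources, tool_calls, recommendation):
    """Bundle the inputs and output of one agent step into a loggable dictionary."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "prompt_version": prompt_fingerprint(prompt_text),
        "retrieved_sources": list(retrieved_sources),
        "tool_calls": list(tool_calls),
        "recommendation": recommendation,
    }

record = provenance_record(
    agent="triage",
    prompt_text="Prioritise worst-case impact; escalate when evidence conflicts.",
    retrieved_sources=["incident-archive", "threat-feed"],
    tool_calls=["dns_lookup"],
    recommendation="escalate_to_human",
)
print(json.dumps(record, sort_keys=True))
```

A record like this answers the audit question of which prompt version, which sources, and which tool calls sat behind a recommendation. It does not capture intermediate reasoning or inter-agent disagreement, which a fuller implementation would need to add.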

This is where Cognaptus’ inference goes beyond the paper. The paper supplies the conceptual machinery. The business implication is that SOC automation should be procured and evaluated as a strategic workflow system, not as a model capability demo. The uncertain part is how much performance gain follows in real deployments. That requires empirical testing the paper does not provide.

The limits are real, and they are not the boring kind

The important limitation is not merely “LLMs can hallucinate”. Everyone knows this. Some people even remember it during procurement.

The sharper limitation is that agentic LLM systems introduce new strategic attack surfaces. If prompts are control surfaces, prompts can be attacked. If retrieval supplies memory, retrieval can be poisoned. If agents coordinate through language, that language can carry manipulation, ambiguity, or hidden instructions. If a coordinator synthesises specialist outputs, the coordinator becomes a high-value failure point. If feedback loops adapt, they can also drift.

The paper discusses robustness, resilience, coordination, controllability, observability, and stability for multi-agent workflows. Those are not academic decorations. They are the engineering terms that decide whether an agentic SOC is inspectable or merely theatrical.

There is also a measurement gap. The paper provides frameworks and formulations, but operational teams still need benchmarks: incident response time, false positive rate, false negative rate, escalation quality, analyst workload, containment precision, adversarial robustness, prompt sensitivity, retrieval reliability, and auditability. Without those, “agentic cyber defense” remains an impressive phrase parked dangerously close to the budget.
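Several of those benchmarks reduce to counting. As a starting point, the sketch below computes false positive rate, false negative rate, and precision for an agent's escalation decisions against ground truth. The labelled outcomes are invented; a real evaluation would draw them from reviewed incidents.

```python
# Confusion-matrix metrics for an agent's escalate / do-not-escalate decisions.
# The labelled outcomes below are invented for illustration.

def triage_metrics(decisions):
    """decisions: list of (agent_flagged, truly_malicious) boolean pairs."""
    tp = sum(1 for flagged, actual in decisions if flagged and actual)
    fp = sum(1 for flagged, actual in decisions if flagged and not actual)
    fn = sum(1 for flagged, actual in decisions if not flagged and actual)
    tn = sum(1 for flagged, actual in decisions if not flagged and not actual)
    return {
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }

labelled = (
    [(True, True)] * 8        # correctly escalated
    + [(True, False)] * 4     # needless escalations
    + [(False, True)] * 2     # missed incidents
    + [(False, False)] * 86   # correctly closed
)
print(triage_metrics(labelled))
```

Running the same calculation per workflow variant, rather than per model, is what makes the earlier advice to evaluate workflows measurable.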

One more boundary deserves attention. LLMs may simulate bounded rationality, role-specific belief, and language-based reasoning, but simulation is not validation. A model that produces plausible attacker reasoning is not necessarily modelling real attacker behaviour. A model that generates a convincing SOC recommendation is not necessarily calibrated to organisational risk. Fluency is not epistemology. It just dresses better.

The real shift is from tools to strategic software

The paper’s most useful message is that cybersecurity AI should not be understood as a pile of tools with a language model on top. It should be understood as strategic software: systems that reason under adversarial uncertainty, coordinate specialised agents, update beliefs, shape signals, and act within governed boundaries.

That is a more demanding standard. It requires security architects to think like game designers, AI engineers to think like control theorists, and executives to stop asking whether “the AI can handle alerts” as if the SOC were an email inbox with worse consequences.

The case-first lesson is straightforward. A suspicious domain is never just a suspicious domain. It is a move in a larger game: what the attacker knows, what the defender sees, what the user did, what the system reveals, what the agents infer, and what the next round will look like.

Game theory gives that game a structure. Agentic LLMs may give it an operational interface. Multi-agent workflows give it organisational form. The hard part is making the whole arrangement stable, auditable, and useful under pressure.

That is not as catchy as “AI will revolutionise cybersecurity”. It is also less likely to end in a postmortem.

Cognaptus: Automate the Present, Incubate the Future.

¹ Quanyan Zhu, “Game Theory Meets LLM and Agentic AI: Reimagining Cybersecurity for the Age of Intelligent Threats,” arXiv:2507.10621, 2025. https://arxiv.org/abs/2507.10621
