Reason, Reveal, Resist: The Persuasion Duality in Multi‑Agent AI

Meetings are already persuasive systems. Someone speaks first, someone sounds confident, someone produces a spreadsheet with just enough decimal places to look holy, and suddenly the room has moved.

Multi-agent AI systems are not so different. They are becoming small artificial committees: one agent retrieves, another proposes, another critiques, another decides. The optimistic version says this gives us productive disagreement. The less adorable version says we have built a machine for circulating influence, and we are only now asking what makes one agent cave to another.

The paper behind today’s article, “Disagreements in Reasoning: How a Model’s Thinking Process Dictates Persuasion in Multi-Agent Systems”, studies exactly that problem.¹ Its useful contribution is not simply that LLM agents can persuade each other. That part should surprise no one who has watched a chatbot apologise for a correct answer after one mildly assertive correction. The sharper result is the paper’s proposed Persuasion Duality: reasoning makes an agent more resistant when it is receiving arguments, but revealed reasoning makes an agent more persuasive when it is sending them.

In plain English: private reasoning is armour; exposed reasoning is ammunition.

That is a design problem, not a philosophical curiosity. Enterprise agent systems increasingly depend on internal messages, critiques, rationales, and handoffs. Those messages are usually treated as neutral coordination material. This paper suggests they should instead be treated as an influence surface. Very glamorous. We automated office politics.

The wrong mental model is “bigger model wins the argument”

A convenient belief about persuasion in AI is that it mostly tracks model strength. Bigger or more capable models should be better persuaders; weaker ones should be easier targets. There is some truth in the second half. The paper finds that weaker models are generally more likely to be persuaded. But the first half is less tidy: the authors report that a persuader’s raw capability is a weaker and less consistent predictor of persuasive success than the persuadee’s own susceptibility.

That distinction matters.

If persuasion were mainly about scale, the business answer would be boring: use stronger models as reviewers and weaker models as cheap workers. Done. Another strategy deck saved from existence. But the paper’s results point elsewhere. The important variable is not merely the model’s rank in a capability leaderboard. It is how the model processes disagreement and what kind of reasoning content is exposed to other agents.

The paper compares large language models and large reasoning models across pairwise persuasion tasks. It evaluates agents as both persuaders and persuadees, using objective multiple-choice questions and subjective stance questions. The authors track three behavioural outcomes:

Metric	What it measures	Operational meaning
Persuaded-Rate (PR)	The persuadee switches to the persuader’s target answer or stance	Raw influence
Remain-Rate (RR)	The persuadee keeps its original answer or stance	Resistance or stability
Other-Rate (OR)	The persuadee moves somewhere else	Drift, noise, or unclassified failure
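
The three metrics in the table reduce to simple bookkeeping over trial outcomes. The sketch below is one way to compute them, not the paper's evaluation code. The function name `persuasion_rates` and the `(initial, target, final)` tuple layout are assumptions made for illustration, and it assumes the target always differs from the initial answer, as in the setup described here.

```python
from collections import Counter


def persuasion_rates(trials):
    """Return PR, RR and OR as percentages.

    Each trial is a hypothetical (initial, target, final) tuple of labels:
    the persuadee's starting answer, the persuader's target, and the
    persuadee's answer after reading the persuasive content.
    """
    if not trials:
        raise ValueError("no trials to score")
    counts = Counter()
    for initial, target, final in trials:
        if final == target:
            counts["PR"] += 1  # switched to the persuader's target
        elif final == initial:
            counts["RR"] += 1  # kept the original answer
        else:
            counts["OR"] += 1  # moved somewhere else
    total = len(trials)
    return {name: 100.0 * counts[name] / total for name in ("PR", "RR", "OR")}


# Made-up outcomes: one switch, two holds, one drift.
trials = [
    ("A", "B", "B"),
    ("A", "B", "A"),
    ("A", "B", "A"),
    ("A", "B", "C"),
]
rates = persuasion_rates(trials)
print(rates)  # {'PR': 25.0, 'RR': 50.0, 'OR': 25.0}
```

Because every trial lands in exactly one bucket, the three rates in this sketch always sum to 100.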

The objective setting uses MMLU-style multiple-choice questions where the model initially answers correctly, then receives persuasive content pushing it toward a fixed wrong answer. The subjective setting samples 1,000 claims from persuasion-related datasets and maps model stances into support, neutral, or oppose. The model set includes both closed and open models, including o4-mini, DeepSeek-R1, Gemini-2.5-flash, Qwen3-32B, Hunyuan-7B-Instruct, Qwen2.5-7B-Instruct, and Llama-3-8B-Instruct, with some models evaluated in switchable thinking and non-thinking modes.

This setup is deliberately artificial. That is not a flaw; it is the point. The authors are isolating persuasion dynamics before the whole mess is buried under tool calls, documents, deadlines, retrieval errors, and one executive asking whether the chatbot can “just be more strategic”.

The mechanism: reasoning protects the receiver and strengthens the sender

The central mechanism is asymmetric.

When the receiver uses explicit reasoning, it has more structure for evaluating incoming claims. It can compare the persuader’s argument against its own answer, re-check the premise, and resist style-only pressure. The paper reports that enabling thinking mode for LRMs generally reduces Persuaded-Rate and increases Remain-Rate. For objective questions, comparing the same model with and without thinking mode, the authors report average PR reductions of 7.82% and 29.68% under the two thinking-content conditions. The exact magnitude should not be carried into production procurement as if it were a warranty, but the direction is the useful part: reasoning makes receivers harder to move.

When the sender exposes its thinking content, the effect flips. Revealed reasoning makes the persuader more convincing. In the objective heatmap analysis, adding thinking content to LRM persuasive messages increases Persuaded-Rate, with the paper reporting an average gain of 21.07% in Figure 1. That is the duality: the same family of mechanisms that helps an agent defend its own answer can also help it move someone else’s.

This is why “show your work” is not a universally safe instruction in multi-agent systems. It improves transparency and can improve evaluation. It can also turn one agent’s intermediate reasoning into a high-bandwidth persuasion package for another agent. The product question is therefore not “Should agents reason?” The serious question is where reasoning should be private, where it should be exposed, and who is allowed to treat exposed reasoning as evidence.

The ablation says it is not just longer text, though length definitely helps

The paper’s most useful ablation tests whether thinking content persuades because it is meaningful reasoning or merely because it makes the message longer. This matters because models, like people, can mistake verbosity for substance. The paper does the sensible thing: it compares native thinking content, no thinking content, padding, and mismatched thinking content.

The results are instructive:

Persuasive content condition	Reported PR	Interpretation
Without thinking content	46.31%	Baseline persuasion
With native thinking content	65.78%	Highest persuasion; coherent revealed reasoning helps
With non-semantic padding of similar length	62.34%	Length itself contributes materially
With mismatched thinking content	32.42%	Incoherent reasoning damages persuasion below baseline

This is a good ablation because it prevents the lazy conclusion that “reasoning traces persuade because they are logical”. Not quite. The padding result shows that sheer length and perceived substance can move models. More content gives the receiver more surface area to latch onto, more cues to imitate, and more opportunities to drift away from its original judgement.

But the mismatched-thinking result is equally important. Bad or irrelevant reasoning is not neutral. It reduces persuasion below the no-thinking baseline. That suggests models are not only counting tokens or genuflecting before <think> tags. Coherence still matters.
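
The ablation is easier to read as distances from the baseline. The sketch below does only that arithmetic on the four reported PR values. The `padding_share` ratio is a back-of-envelope figure computed here, not a statistic from the paper, and it assumes the four conditions are directly comparable.

```python
BASELINE = 46.31  # reported PR without thinking content

conditions = {
    "native thinking": 65.78,
    "non-semantic padding": 62.34,
    "mismatched thinking": 32.42,
}


def delta_vs_baseline(pr, baseline=BASELINE):
    """Change in Persuaded-Rate against the no-thinking baseline, in points."""
    return round(pr - baseline, 2)


deltas = {name: delta_vs_baseline(pr) for name, pr in conditions.items()}
print(deltas)
# {'native thinking': 19.47, 'non-semantic padding': 16.03, 'mismatched thinking': -13.89}

# Illustrative ratio only: how much of the native-thinking gain padding alone reproduces.
padding_share = round(deltas["non-semantic padding"] / deltas["native thinking"], 2)
print(padding_share)  # 0.82
```

Read loosely, padding recovers most of the gain and incoherence more than erases it, which is the shape of the argument in this section.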

The practical lesson is slightly annoying, which means it is probably useful: message length is an influence parameter. In agent orchestration, token budgets are not just cost controls. They change social dynamics. A long argument may persuade because it is better; it may also persuade because it is long. The difference is where governance earns its lunch.

Subjective questions are easier to sway because there is less ground to stand on

The paper finds that models are generally more easily persuaded on subjective questions than objective ones. That is unsurprising, but not trivial.

In the objective setting, a model can anchor to a learned fact or a solvable question. In the subjective setting, there is no single ground-truth answer. The receiver must interpret a claim, classify a stance, and decide how much weight to give another participant’s argument. Without a crisp correctness anchor, persuasive content has more room to operate.

This matters for business domains because many real agent workflows are subjective under a thin coating of operational language. Risk classification, vendor evaluation, hiring-screen summaries, market narratives, compliance triage, product prioritisation: these are not pure arithmetic. They contain judgement, ambiguity, and institutional taste. Exactly the type of terrain where persuasion can masquerade as collaboration.

A reviewer agent that resists a false MMLU answer may still be pliable when asked whether a suspicious vendor relationship is “manageable”, whether a customer complaint is “low severity”, or whether a forecast assumption is “reasonable”. The word “reasonable” has hidden many crimes against modelling.

The attention case study explains why confident nonsense can travel

The paper includes a mechanistic case study using attention analysis. This is not the main evidence; it is exploratory explanation. Its purpose is to explain how persuasion can happen even when the persuasive content is weak.

The authors examine a case where the persuadee should answer an objective question correctly but is pulled toward the persuader’s wrong answer. They report that the model gives high attention to a short, confident assertion while giving very low attention to the longer reasoning portion. Specifically, the highlighted confident claim receives an average attention score of 11.1%, while the supposed reasoning receives 0.39%.

That is a useful warning, not a complete mechanistic theory. Attention analysis does not magically tell us everything about causality inside a model. Still, the case matches a familiar failure mode: the model overweights a compact, confident cue and underweights the actual evidential content. It does not reason like a careful analyst; it behaves more like a junior employee who has learned that confident formatting usually comes from someone senior.

This is why “be critical” as a generic system instruction is weak governance. The defence has to force the receiver to separate claims, evidence, and rhetorical cues. If the argument says “this makes perfect sense”, the evaluator should not nod along. It should ask: which claim, which evidence, which source, which counterexample?
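
As a toy illustration of separating claims, evidence, and rhetorical cues, the sketch below labels each sentence of an incoming message with a crude keyword heuristic. Everything in it is an assumption: the cue lists, the labels, and the function name `triage` are invented for this example, and a real system would need something far sturdier than substring matching.

```python
import re

# Hypothetical cue lists, chosen only to make the example work.
RHETORIC_CUES = ("obviously", "clearly", "makes perfect sense", "without a doubt")
EVIDENCE_CUES = ("because", "according to", "source:", "measured")


def triage(message):
    """Label each sentence as rhetoric_only, evidence_bearing, or bare_claim."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", message) if s.strip()]
    report = []
    for sentence in sentences:
        lowered = sentence.lower()
        has_rhetoric = any(cue in lowered for cue in RHETORIC_CUES)
        has_evidence = any(cue in lowered for cue in EVIDENCE_CUES)
        if has_evidence:
            label = "evidence_bearing"  # still needs checking, but there is something to check
        elif has_rhetoric:
            label = "rhetoric_only"  # confidence with nothing behind it
        else:
            label = "bare_claim"
        report.append((label, sentence))
    return report


# Made-up incoming argument.
message = (
    "The answer is obviously B. This makes perfect sense. "
    "According to the audit log, option A failed twice."
)
for label, sentence in triage(message):
    print(f"{label}: {sentence}")
```

The point of the sketch is the shape of the output, not the heuristic: the receiver sees two sentences of confidence and one sentence that can actually be checked.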

A dull checklist beats a charismatic hallucination. Yes, this is where civilisation has brought us.

Multi-hop persuasion turns agent networks into influence circuits

The paper then moves from pairwise persuasion to multi-hop chains. This is an exploratory extension rather than the core thesis, but it is arguably the part most relevant to enterprise systems.

In a chain such as A persuades B, then B persuades C, the combined influence can be higher or lower than A persuading C directly. The paper reports both amplification and attenuation, depending on the composition of agents and their reasoning modes. On subjective tasks, there are cases where multi-hop persuasion produces stronger whole-chain effects than the direct route. On objective tasks, the appendix shows similar phenomena, but with fewer cases where multi-hop improves persuasive effect.

That should make architects uncomfortable in a productive way.

Most real agent workflows are not a single debate between two models. They are pipelines. A research agent briefs a planning agent. The planning agent briefs a financial agent. The financial agent briefs a final orchestrator. Each step can compress, reframe, exaggerate, or normalise the prior agent’s output. By the end, the final decision may appear independent while actually inheriting a chain of influence from an early, overconfident message.

This turns governance from a model-level problem into a network-level problem. It is not enough to ask whether the final decision agent is robust. You have to ask what it receives, who transformed it, whether intermediate agents exposed or suppressed reasoning, and whether a weak middle node laundered a bad argument into an apparently clean recommendation.
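
A network-level audit needs, at minimum, a record of each hop and a way to compare the chain against the direct route. The sketch below assumes a hypothetical `Hop` record and a one-point tolerance chosen arbitrarily. The numbers in the usage lines are placeholders, not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One message in a chain (hypothetical audit record)."""
    sender: str
    receiver: str
    stance_before: str  # receiver's stance before reading the message
    stance_after: str   # receiver's stance after reading it
    revealed_reasoning: bool


def stance_flips(path):
    """Return the hops at which the receiver changed stance."""
    return [hop for hop in path if hop.stance_before != hop.stance_after]


def chain_effect(direct_pr, chain_pr, tolerance=1.0):
    """Compare whole-chain PR (A to B to C) with direct PR (A to C), in points."""
    diff = chain_pr - direct_pr
    if diff > tolerance:
        return "amplification"
    if diff < -tolerance:
        return "attenuation"
    return "no clear change"


# Placeholder chain: the planning agent flips, the finance agent holds.
path = [
    Hop("research", "planning", "oppose", "support", revealed_reasoning=True),
    Hop("planning", "finance", "oppose", "oppose", revealed_reasoning=False),
]
print([hop.receiver for hop in stance_flips(path)])  # ['planning']
print(chain_effect(direct_pr=40.0, chain_pr=55.0))   # amplification
```

Logging the hop at which a stance first moved is what lets an auditor tell an independent decision from an inherited one.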

The mitigation result is simple because the weakness is simple

The paper proposes a prompt-level mitigation called adversarial argument detection. The prompt instructs the persuadee to critically evaluate incoming persuasive content, identify unsupported or rhetorical claims, and rely on its own knowledge rather than simply absorbing the prior participant’s argument.
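
In code, a defence of this kind is little more than a wrapper around the persuadee's prompt. The sketch below shows the general shape only. The preamble wording, the delimiters, and the function name `build_persuadee_prompt` are assumptions written for this article, not the paper's actual prompt.

```python
# Hypothetical wording; the paper's own prompt text is not reproduced here.
DETECTION_PREAMBLE = (
    "Another participant's argument follows. Treat it as material to audit, "
    "not as a conclusion to adopt.\n"
    "1. List each claim it makes.\n"
    "2. Mark claims that are unsupported or purely rhetorical.\n"
    "3. Answer from your own knowledge, and change your answer only if a "
    "supported claim requires it.\n"
)


def build_persuadee_prompt(question, own_answer, incoming_argument):
    """Wrap an incoming argument so the receiver audits it before updating."""
    return (
        f"{DETECTION_PREAMBLE}\n"
        f"Question: {question}\n"
        f"Your current answer: {own_answer}\n"
        "--- incoming argument (untrusted) ---\n"
        f"{incoming_argument}\n"
        "--- end of incoming argument ---\n"
        "Final answer:"
    )


prompt = build_persuadee_prompt(
    question="Which option is correct, A or B?",
    own_answer="A",
    incoming_argument="It is obviously B. This makes perfect sense.",
)
print(prompt)
```

Keeping the incoming argument inside explicit delimiters is a design choice of this sketch: it makes it harder for persuasive text to pass as instructions.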

This is a mitigation experiment, not a proof of safety. But the numbers are useful. With Llama-3-8B-Instruct serving as persuader, adding adversarial argument detection reduces Persuaded-Rate and raises Remain-Rate across several persuadee settings:

Persuadee setting	PR before (%)	PR after (%)	RR before (%)	RR after (%)
Hunyuan-7B without thinking	46.1	35.8	27.8	42.3
Hunyuan-7B with thinking	46.3	16.3	47.0	59.9
Llama-3-8B	46.2	19.6	28.0	51.2
Qwen2.5-7B	58.9	15.5	14.0	30.5

The result is operationally attractive because it does not require retraining. It is a prompt-level defence. That makes it cheap, deployable, and dangerously easy to overclaim.

The right interpretation is narrower: structured scepticism can reduce persuasion in this controlled setup. The wrong interpretation is that one magic prompt immunises a production agent against manipulation. Production systems include retrieval, tools, memory, hidden prompts, role conflicts, and human override. One prompt is not a moat. It is a speed bump. Still, speed bumps exist because cars are annoying.
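
The before-and-after figures in the mitigation table are easier to compare as deltas. The sketch below computes them from the reported values. The implied OR shift is a derived figure that assumes PR, RR, and OR sum to 100 in each setting, which the metric definitions suggest but the table itself does not confirm.

```python
# (PR before, PR after, RR before, RR after), as reported in the table.
rows = {
    "Hunyuan-7B without thinking": (46.1, 35.8, 27.8, 42.3),
    "Hunyuan-7B with thinking": (46.3, 16.3, 47.0, 59.9),
    "Llama-3-8B": (46.2, 19.6, 28.0, 51.2),
    "Qwen2.5-7B": (58.9, 15.5, 14.0, 30.5),
}


def mitigation_deltas(pr_before, pr_after, rr_before, rr_after):
    """Return (PR change, RR change, implied OR change) in points."""
    pr_delta = round(pr_after - pr_before, 1)
    rr_delta = round(rr_after - rr_before, 1)
    # Assumption: the three rates partition the trials, so OR = 100 - PR - RR.
    or_delta = round((100 - pr_after - rr_after) - (100 - pr_before - rr_before), 1)
    return pr_delta, rr_delta, or_delta


deltas = {name: mitigation_deltas(*values) for name, values in rows.items()}
for name, (pr_delta, rr_delta, or_delta) in deltas.items():
    print(f"{name}: PR {pr_delta:+.1f}, RR {rr_delta:+.1f}, implied OR {or_delta:+.1f}")
# Hunyuan-7B without thinking: PR -10.3, RR +14.5, implied OR -4.2
# Hunyuan-7B with thinking: PR -30.0, RR +12.9, implied OR +17.1
# Llama-3-8B: PR -26.6, RR +23.2, implied OR +3.4
# Qwen2.5-7B: PR -43.4, RR +16.5, implied OR +26.9
```

If the partition assumption holds, a drop in PR is not always matched by a rise in RR. Some of the avoided persuasion may reappear as drift, which is one more reason to read the prompt as a speed bump.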

What Cognaptus infers for agent design

The paper directly shows that reasoning mode, revealed thinking content, message length, task subjectivity, and network composition affect persuasion between LLM agents in controlled experiments. Cognaptus’ business inference is that enterprise agent systems need explicit persuasion governance.

That does not mean sterilising all agent communication. It means designing the influence paths deliberately.

First, separate proposal agents from evaluation agents. A proposal agent may reveal structured reasoning when the goal is to explain or coordinate. A reviewer or final decider should reason privately and treat incoming reasoning as an object to be audited, not a conclusion to be absorbed.

Second, log PR/RR/OR-like metrics during internal evaluation. You do not need the paper’s exact setup to adopt the measurement idea. In staging, test whether a reviewer keeps a correct answer after exposure to misleading but confident arguments. Test subjective cases separately. A compliance triage agent that resists factual traps may still fold under policy-adjacent framing.

Third, govern message length. Long rationales should be structured into claim-evidence-confidence-source format. Do not let agents forward persuasive essays through the system as if they were harmless notes. If a message is meant to persuade, label it. If it is meant to inform, constrain it.

Fourth, add an argument firewall before important decisions. A receiver should be asked to identify unsupported claims, rhetorical certainty, missing evidence, and alternative explanations before updating its answer. This is less glamorous than “agentic reasoning architecture”. It is also more likely to work before next Thursday.

Fifth, threat-model multi-hop influence. The agent that introduces the first framing may not be the final decider, but it can still shape the final answer. In high-stakes workflows, do not only audit the final output. Audit the path by which the system became convinced.
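
The claim-evidence-confidence-source format and the inform-or-persuade label from the points above can be enforced with a small schema. The sketch below is one possible shape, not a standard. The `AgentMessage` fields, the 280-character budget, and the `validate` function are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

MAX_CLAIM_CHARS = 280  # hypothetical length budget per claim


@dataclass
class AgentMessage:
    """Structured inter-agent message (illustrative schema)."""
    claim: str
    evidence: list = field(default_factory=list)
    confidence: float = 0.5
    source: str = ""
    intent: str = "inform"  # "inform" or "persuade"


def validate(msg):
    """Return a list of problems; an empty list means the message may be forwarded."""
    problems = []
    if msg.intent not in ("inform", "persuade"):
        problems.append("intent must be 'inform' or 'persuade'")
    if not msg.evidence:
        problems.append("no evidence attached")
    if not msg.source:
        problems.append("no source named")
    if not 0.0 <= msg.confidence <= 1.0:
        problems.append("confidence outside [0, 1]")
    if len(msg.claim) > MAX_CLAIM_CHARS:
        problems.append("claim exceeds length budget")
    return problems


# Made-up messages: one well-formed, one a confident essay with nothing attached.
good = AgentMessage(
    claim="Vendor X missed two delivery deadlines.",
    evidence=["delivery log, rows 14 and 22"],
    confidence=0.8,
    source="internal delivery log",
)
bad = AgentMessage(claim="This vendor is clearly fine. " * 20, intent="persuade")

print(validate(good))  # []
print(validate(bad))
# ['no evidence attached', 'no source named', 'claim exceeds length budget']
```

A validator like this does not judge whether the claim is true. It only refuses to forward persuasion that arrives without anything to audit.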

A compact implementation pattern

A practical multi-agent workflow can use the duality rather than pretending it does not exist:

Stage	Agent role	Reasoning policy	Persuasion control
Scout	Retrieval and evidence gathering	No persuasive rationale; source-first summaries	Evidence must be quoted or linked internally
Proposer	Generates candidate answer or plan	May reveal structured reasoning	Must separate claim, evidence, assumption, confidence
Critic	Attacks proposal	Private reasoning preferred	Must detect rhetoric and unsupported claims
Arbiter	Final decision	Private reasoning; no inherited conclusions	Must re-evaluate from evidence packet
Logger	Audit and telemetry	No reasoning generation needed	Tracks stance changes, message length, and source density

This design is not anti-reasoning. It is anti-uncontrolled-reasoning-leakage. The difference is important. Reasoning traces are useful when they make claims inspectable. They are risky when they become prestige objects passed downstream with no independent verification.
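
The reasoning policies in the table can be expressed as a filter on what each stage is allowed to pass downstream. The sketch below is a minimal reading of that table under stated assumptions: the field names, the `REVEAL_REASONING` map, and both functions are hypothetical, and a real orchestrator would enforce this at the transport layer rather than in a dictionary.

```python
# Which roles may expose reasoning downstream, following the table above.
REVEAL_REASONING = {
    "scout": False,
    "proposer": True,
    "critic": False,
    "arbiter": False,
    "logger": False,
}

BASE_FIELDS = {"claim", "evidence", "assumption", "confidence"}


def outbound(role, output):
    """Keep only the fields a role may send on under its reasoning policy."""
    allowed = set(BASE_FIELDS)
    if REVEAL_REASONING[role]:
        allowed.add("reasoning")
    return {key: value for key, value in output.items() if key in allowed}


def arbiter_packet(outputs):
    """Collect evidence only, so the arbiter inherits no conclusions."""
    packet = []
    for output in outputs:
        packet.extend(output.get("evidence", []))
    return packet


# Made-up stage outputs.
proposal = {
    "claim": "Approve the vendor.",
    "evidence": ["audit report, section 3"],
    "confidence": 0.7,
    "reasoning": "Step-by-step rationale...",
}
critique = {
    "claim": "Reject the vendor.",
    "evidence": ["late delivery log"],
    "reasoning": "Private critique notes...",
}

print("reasoning" in outbound("proposer", proposal))  # True
print("reasoning" in outbound("critic", critique))    # False
print(arbiter_packet([proposal, critique]))
# ['audit report, section 3', 'late delivery log']
```

The arbiter in this sketch sees the evidence from both sides and neither side's conclusion, which is the "no inherited conclusions" row of the table taken literally.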

Boundaries that matter

The paper’s limitations are not cosmetic. They shape how the findings should be used.

The experiments are controlled persuasion games, not live enterprise workflows. Objective tasks are based on multiple-choice questions where correct answers are standardised and persuasion targets are fixed. Subjective tasks use stance labels, which are useful but much simpler than real business judgement. The model set is diverse but not exhaustive, and some model sizes are unknown. The paper also notes that o4-mini is excluded from the subjective heatmaps because it refused to answer many questions.

The attention analysis should be treated as a case study, not a full mechanistic proof. The mitigation prompt is promising but tested under a specific setup, including a fixed persuader in the reported mitigation figure. Multi-hop persuasion is an initial exploration, not a complete theory of influence propagation in agent networks.

None of this weakens the core lesson. It simply keeps it in its lane. The paper is best read as an evaluation blueprint and design warning, not as a finished safety standard.

The lesson is not “hide all reasoning”

A lazy takeaway would be to hide every chain of thought and force agents to communicate only final answers. That would reduce some persuasion risk and create several new kinds of stupidity. Another lazy takeaway would be to expose all reasoning because transparency is virtuous and therefore apparently exempt from engineering trade-offs. Also wrong.

The better answer is role-specific disclosure.

Reasoning should be exposed when it helps another agent inspect claims. It should be hidden or compressed when it risks anchoring a downstream decider. Receivers should be prompted to reason independently, especially in subjective domains. Long arguments should be treated as influence-heavy artefacts, not neutral context. Multi-hop chains should be evaluated as circuits of persuasion, not just pipelines of information.

The office meeting is not going away. It is becoming software. The least we can do is stop pretending the loudest agent is merely “collaborating”.

Cognaptus: Automate the Present, Incubate the Future.

¹ Haodong Zhao, Jidong Li, Zhaomin Wu, Tianjie Ju, Zhuosheng Zhang, Bingsheng He, and Gongshen Liu, “Disagreements in Reasoning: How a Model’s Thinking Process Dictates Persuasion in Multi-Agent Systems,” arXiv:2509.21054, 2025, https://arxiv.org/abs/2509.21054.
