Large language models aren’t just prompt-completion machines anymore. In controlled simulations, they can behave like people in a group discussion: yielding to peer pressure, sticking to their beliefs, or becoming more extreme over time. But not all LLMs are socially equal.
A recent paper titled “Towards Simulating Social Influence Dynamics with LLM-based Multi-agents” explores how different LLMs behave in a forum-style discussion, capturing three phenomena familiar to any political science researcher or Reddit moderator: conformity, group polarization, and fragmentation. The twist? These aren’t real people. They’re fully scripted LLM agents with fixed personas, engaged in asynchronous multi-round debates.
The Setup: A Forum of Six Synthetic Personalities
The researchers simulate online discussions by assigning LLM agents fixed stances, communication styles, and personality traits. Each agent posts once per round in a five-round, bulletin-board-style (BBS) discussion orchestrated with Microsoft AutoGen. Topics include contentious societal issues, such as whether governments should adopt stricter environmental policies.
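The paper orchestrates the forum with Microsoft AutoGen, but the exact personas and prompts aren't reproduced here, so the sketch below is only a minimal illustration assuming the pyautogen (v0.2-style) `GroupChat` API with a round-robin speaker order. The persona strings, topic, and model config are placeholders, not the paper's actual setup.

```python
import autogen

# Placeholder backing model; swap in any Group A-D model endpoint.
llm_config = {"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}]}

# Fixed personas: stance + communication style + personality trait.
personas = [
    "You strongly support stricter environmental policies. Assertive, data-driven.",
    "You strongly oppose stricter environmental policies. Skeptical, blunt.",
    "You are undecided and conflict-averse. Polite, hedging.",
    # ...remaining personas for the six-agent forum
]

agents = [
    autogen.ConversableAgent(
        name=f"poster_{i}",
        system_message=p + " Post one forum reply per round and state your current stance.",
        llm_config=llm_config,
        human_input_mode="NEVER",
    )
    for i, p in enumerate(personas)
]

# Round-robin speaker selection so every agent posts once per round;
# 5 rounds x len(agents) speaker turns approximates the five-round BBS thread.
chat = autogen.GroupChat(
    agents=agents,
    messages=[],
    max_round=5 * len(agents),
    speaker_selection_method="round_robin",
)
manager = autogen.GroupChatManager(groupchat=chat, llm_config=llm_config)

agents[0].initiate_chat(
    manager,
    message="Should governments adopt stricter environmental policies?",
)
```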
Three behavioral metrics were tracked:
| Metric | What It Measures | Signal of… |
|---|---|---|
| Conformity Rate (CR) | How often agents align their opinions with the group majority | Susceptibility to influence |
| Polarization (ΔP) | How far stances drift toward the extremes over time | Opinion extremization |
| Fragmentation (F) | Degree of split into opposing camps by the final round | Failure of consensus |
Each simulation was repeated 25 times for statistical robustness.
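The excerpt doesn't spell out the paper's exact formulas for these metrics, so the snippet below is one plausible operationalization, assuming each agent's stance has been scored per round on a [-1, 1] scale (the stance-scoring step itself, e.g. by an LLM judge, is not shown).

```python
import numpy as np

def conformity_rate(traj: np.ndarray) -> float:
    """Share of opinion updates that move an agent toward the previous
    round's majority. traj: shape (rounds, agents), stances in [-1, 1]."""
    toward, updates = 0, 0
    for t in range(1, traj.shape[0]):
        majority = np.sign(traj[t - 1].mean())      # prevailing side last round
        for i in range(traj.shape[1]):
            delta = traj[t, i] - traj[t - 1, i]
            if delta != 0:
                updates += 1
                toward += int(np.sign(delta) == majority)
    return toward / updates if updates else 0.0

def polarization_shift(traj: np.ndarray) -> float:
    """Delta-P: change in mean distance from the neutral midpoint between
    the first and final round (positive = extremization)."""
    return float(np.abs(traj[-1]).mean() - np.abs(traj[0]).mean())

def fragmentation(traj: np.ndarray) -> float:
    """F: how evenly the final round splits into opposing camps
    (1.0 = even pro/con split, 0.0 = unanimous)."""
    signs = np.sign(traj[-1])
    pro, con = (signs > 0).mean(), (signs < 0).mean()
    return float(1.0 - abs(pro - con))

# Example: 5 rounds x 6 agents of judged stance scores.
traj = np.random.default_rng(0).uniform(-1, 1, size=(5, 6))
print(conformity_rate(traj), polarization_shift(traj), fragmentation(traj))
```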
The Cast: Four Types of LLMs
The models were grouped into four categories:
- Group A: Small open models (e.g., LLaMA 3 7B, DeepSeek-R1 8B)
- Group B: Mid/large open models (e.g., Qwen2.5-72B, LLaMA 3 70B)
- Group C: Proprietary frontier models (GPT-4o, Claude 3.5 Haiku, Gemini 2.0 Flash)
- Group D: Reasoning-tuned models (e.g., o1-mini, QwQ-32B)
Each was tested using the same prompt setup and agent personas.
Key Findings: Models Behave Like Social Archetypes
The results are both intuitive and surprising:
- Group C (Proprietary models): Most prone to conformity (GPT-4o hit a 19.45% conformity rate). They gravitate smoothly toward the group consensus and are the least likely to still hold dissenting views by Round 5.
- Group D (Reasoning-tuned models): Least conformist. o1-mini scored a conformity rate of just 3.13%. These models resist peer pressure, hold extreme views longer, and exhibit high fragmentation.
- Groups A and B (small and mid-to-large open models): Sit in the middle. They shift toward consensus but sometimes preserve dissent, with architecture-specific differences (e.g., Qwen showed more fragmentation than its peers).
In essence, larger general-purpose models drift toward consensus, while reasoning-tuned models hold out and dissent.
Visualization of Behavior Types:
| Behavior Type | Likely Model Group | Characteristics |
|---|---|---|
| Conformist | Group C | Smooth agreement, low fragmentation |
| Moderate Debater | Groups A & B | Responsive but flexible; can polarize or unify |
| Principled Dissenter | Group D | Holds its line, resists the majority, sustains fragmentation |
Why It Matters: Model Choice Shapes Simulated Society
This isn’t just an academic exercise. Multi-agent LLM systems are being used to simulate debate, test democratic deliberation tools, and even generate synthetic user populations for online platforms. Depending on which LLMs you use, you may end up with a synthetic society that is overly agreeable or unreasonably entrenched.
Some use cases demand drift and consensus-building (e.g., onboarding feedback tools), while others depend on preserving diversity and friction (e.g., deliberative democracy simulators, content moderation stress tests).
Thus, model architecture becomes a social design choice. Want artificial citizens who don’t fall for peer pressure? Pick a reasoning-tuned model. Need to simulate how ideas spread and coalesce? Use a conformist model. Mixing both offers an even richer reflection of real societies.
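As a rough illustration of that mixing, the earlier AutoGen sketch could assign different backing models to different personas. The model names and persona labels below are placeholders, not the paper's configuration.

```python
import autogen

# Hypothetical per-persona model assignment: Group C-style backing for
# consensus-oriented roles, Group D-style backing for dissenting roles.
CONFORMIST = {"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}]}
DISSENTER = {"config_list": [{"model": "o1-mini", "api_key": "YOUR_KEY"}]}

roster = {
    "consensus_builder": ("You look for common ground.", CONFORMIST),
    "policy_skeptic": ("You challenge the majority view.", DISSENTER),
    "swing_voter": ("You are undecided but persuadable.", CONFORMIST),
    "principled_activist": ("You never soften your stance.", DISSENTER),
}

mixed_agents = [
    autogen.ConversableAgent(
        name=name,
        system_message=persona,
        llm_config=cfg,
        human_input_mode="NEVER",
    )
    for name, (persona, cfg) in roster.items()
]
# Drop mixed_agents into the same round-robin GroupChat loop as before.
```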
Final Thoughts
As LLM-based agents take on more roles in behavioral simulation, governance modeling, and even social science experiments, we must ask not only what they say, but how they behave in groups. This paper offers a rare lens into that behavior, with clear empirical comparisons.
One thing is clear: alignment isn’t just about factual accuracy or prompt obedience. It’s also about group dynamics. And in the artificial societies we build, who you invite to the table matters.
Cognaptus: Automate the Present, Incubate the Future