
The Sims Get Smart? Why LLM-Driven Social Simulations Need a Reality Check

Social simulations are entering their uncanny valley. Fueled by generative agents powered by Large Language Models (LLMs), recent frameworks like Smallville, AgentSociety, and SocioVerse simulate thousands of lifelike agents forming friendships, spreading rumors, and planning parties. But do these simulations reflect real social processes, or merely replay the statistical shadows of the internet?

When Simulacra Speak Fluently

LLMs have demonstrated striking abilities to mimic human behaviors. GPT-4 has passed Theory-of-Mind (ToM) tests at levels comparable to 6–7-year-olds. In narrative contexts, it can detect sarcasm, understand indirect requests, and generate empathetic replies. But all of this arises not from embodied cognition or real-world goals; it’s just next-token prediction trained on massive corpora. ...

July 28, 2025 · 4 min · Zelina

The Most Dangerous Query Is the One You Don't Question

In the age of natural language interfaces to databases (NLIDBs), asking the right question has never been easier, or more perilous. While systems like ChatGPT or SQL-Palm can convert everyday English into valid SQL, they often do so without interrogating the quality of the question itself. And as Peter Drucker warned, “The most dangerous thing is asking the wrong question.” Enter VeriMinder, a system built not to improve SQL syntax or execution accuracy, but to diagnose and refine the analytical intent behind the user’s query. It tackles a deceptively simple yet far-reaching problem: a well-formed SQL query that answers a poorly formed question can yield confident but misleading insights. This is particularly problematic in enterprise settings where non-technical users rely on LLM-based BI assistants. ...

July 25, 2025 · 4 min · Zelina

Bias, Baked In: Why Pretraining, Not Fine-Tuning, Shapes LLM Behavior

What makes a large language model (LLM) biased? Is it the instruction tuning data, the randomness of training, or something more deeply embedded? A new paper from Itzhak, Belinkov, and Stanovsky, presented at COLM 2025, delivers a clear verdict: pretraining is the primary source of cognitive biases in LLMs. The implications of this are far-reaching, and perhaps more uncomfortable than many developers would like to admit.

The Setup: Two Steps, One Core Question

The authors dissected the origins of 32 cognitive biases in LLMs using a controlled two-step causal framework: ...

July 13, 2025 · 4 min · Zelina

Bias Busters: Teaching Language Agents to Think Like Scientists

In the recent paper “Language Agents Mirror Human Causal Reasoning Biases” (Chen et al., 2025), researchers uncovered a persistent issue affecting even the most advanced language model (LM) agents: a disjunctive bias, a tendency to prefer “OR”-type causal explanations over equally valid or even stronger “AND”-type ones. Surprisingly, this mirrors adult human reasoning patterns and undermines the agents’ ability to draw correct conclusions in scientific-style causal discovery tasks. ...

May 15, 2025 · 3 min