
Privacy by Proximity: How Nearest Neighbors Made In-Context Learning Differentially Private

Opening — Why this matters now
As large language models (LLMs) weave themselves into every enterprise workflow, a quieter issue looms: the privacy of the data used to prompt them. In-context learning (ICL) — the art of teaching a model through examples in its prompt — is fast, flexible, and dangerously leaky. Each query could expose confidential examples from private datasets. Enter differential privacy (DP), the mathematical armor for sensitive data — except that until now, DP methods for ICL have been clumsy and utility-poor. ...

November 8, 2025 · 4 min · Zelina

The Doctor Is In: How DR. WELL Heals Multi-Agent Coordination with Symbolic Memory

Opening — Why this matters now
Large language models are learning to cooperate. Or at least, they’re trying. When multiple LLM-driven agents must coordinate—say, to move objects in a shared environment or plan logistics—they often stumble over timing, misunderstanding, and sheer conversational chaos. Each agent talks too much, knows too little, and acts out of sync. DR. WELL, a new neurosymbolic framework from researchers at CMU and USC, proposes a cure: let the agents think symbolically, negotiate briefly, and remember collectively. ...

November 7, 2025 · 4 min · Zelina

Truth Machines: VeriCoT and the Next Frontier of AI Self-Verification

Why this matters now
Large language models have grown remarkably persuasive—but not necessarily reliable. They often arrive at correct answers through logically unsound reasoning, a phenomenon both amusing in games and catastrophic in legal, biomedical, or policy contexts. The research paper VeriCoT: Neuro-Symbolic Chain-of-Thought Validation via Logical Consistency Checks proposes a decisive step toward addressing that flaw: a hybrid system where symbolic logic checks the reasoning of a neural model, not just its answers. ...
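To make the idea concrete, here is a toy version of a per-step logical consistency check; the encoding and the use of the z3 solver are illustrative assumptions, not VeriCoT's actual formalization. A reasoning step is accepted only if it is entailed by the stated premises.

```python
# Toy illustration of verifying one chain-of-thought step with a solver.
# The propositions and rules below are hypothetical; VeriCoT's own encoding differs.
from z3 import Bool, Implies, Not, Solver, unsat

eligible, over_65, gets_discount = Bool("eligible"), Bool("over_65"), Bool("gets_discount")

premises = [
    Implies(over_65, eligible),        # stated rule
    Implies(eligible, gets_discount),  # stated rule
    over_65,                           # fact from the prompt
]
step = gets_discount                   # the model's reasoning step to verify

s = Solver()
s.add(*premises, Not(step))
# If premises plus the negated step are unsatisfiable, the step is entailed (verified);
# otherwise it is unsupported and should be flagged or repaired.
print("step verified" if s.check() == unsat else "step not entailed")
```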

November 7, 2025 · 4 min · Zelina

When AI Becomes Its Own Research Assistant

Opening — Why this matters now
Autonomous research agents have moved from the thought-experiment corner of arXiv to its front page. Jr. AI Scientist, a system from the University of Tokyo, represents a quiet but decisive step in that evolution: an AI not only reading and summarizing papers but also improving upon them and submitting its own results for peer (and AI) review. The project’s ambition is as remarkable as its caution—it’s less about replacing scientists and more about probing what happens when science itself becomes partially automated. ...

November 7, 2025 · 3 min · Zelina

When Democracy Meets the Algorithm: Auditing Representation in the Age of LLMs

Opening — Why this matters now
The rise of AI in civic life has been faster than most democracies can legislate. Governments and NGOs are experimenting with large language models (LLMs) to summarize public opinions, generate consensus statements, and even draft expert questions in citizen assemblies. The promise? Efficiency and inclusiveness. The risk? Representation by proxy—where the algorithm decides whose questions matter. The new paper Question the Questions: Auditing Representation in Online Deliberative Processes (De et al., 2025) offers a rigorous framework for examining that risk. It turns the abstract ideals of fairness and inclusivity into something measurable, using the mathematics of justified representation (JR) from social choice theory. In doing so, it shows how to audit whether AI-generated “summary questions” in online deliberations truly reflect the people’s diverse concerns—or just the most statistically coherent subset. ...
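For readers new to the criterion, the standard statement of justified representation from approval-based committee voting is sketched below; the paper adapts this idea to AI-generated summary questions, so the notation (voters N of size n with approval sets A_i, and a selected slate W of size k) is the textbook version rather than the authors' exact formulation.

```latex
% Justified representation (JR), standard form: every sufficiently large group
% of voters that agrees on at least one candidate must have at least one member
% whose approved candidate appears in the outcome W.
\[
\forall\, N^{*} \subseteq N:\;
\Bigl( |N^{*}| \ge \tfrac{n}{k} \;\wedge\; \bigcap_{i \in N^{*}} A_i \neq \emptyset \Bigr)
\;\Longrightarrow\;
\exists\, i \in N^{*}:\; A_i \cap W \neq \emptyset
\]
```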

November 7, 2025 · 4 min · Zelina

Trade Winds and Neural Currents: Predicting the Global Food Network with Dynamic Graphs

Opening — Why this matters now
When the price of rice in one country spikes, the shock ripples through shipping routes, grain silos, and trade treaties across continents. The global food trade network is as vital as it is volatile—exposed to climate change, geopolitics, and policy oscillations. In 2025, with global food inflation and shipping disruptions returning to headlines, predictive modeling of trade flows has become not just an academic exercise but a policy imperative. ...

November 6, 2025 · 4 min · Zelina

When ESG Meets LLM: Decoding Corporate Green Talk on Social Media

Opening — Why this matters now
Corporate sustainability is having a content crisis. Brands flood X (formerly Twitter) with green-themed posts, pledging allegiance to the UN’s Sustainable Development Goals (SDGs) while their real-world actions remain opaque. The question is no longer who is talking about sustainability—it’s what they are actually saying, and whether it means anything at all. A new study from the University of Amsterdam offers a data-driven lens on this problem. By combining large language models (LLMs) and vision-language models (VLMs), the researchers have built a multimodal pipeline that decodes the texture of corporate sustainability messaging across millions of social media posts. Their goal: to map not what companies claim, but how they construct the narrative of being sustainable. ...

November 6, 2025 · 4 min · Zelina

When RAG Meets the Law: Building Trustworthy Legal AI for a Moving Target

Opening — Why this matters now
Legal systems are allergic to uncertainty. Yet AI thrives on it. As generative models step into the courtroom—drafting opinions, analyzing precedents, even suggesting verdicts—the question is no longer can they help, but can we trust them? The stakes are existential: a hallucinated statute or a misapplied precedent isn’t a typo; it’s a miscarriage of justice. The paper Hybrid Retrieval-Augmented Generation Agent for Trustworthy Legal Question Answering in Judicial Forensics offers a rare glimpse at how to close this credibility gap. ...

November 6, 2025 · 4 min · Zelina

When Drones Think Too Much: Defining Cognition Envelopes for Bounded AI Reasoning

Why this matters now
As AI systems move from chatbots to control towers, the stakes of their hallucinations have escalated. Large Language Models (LLMs) and Vision-Language Models (VLMs) now make—or at least recommend—decisions in physical space: navigating drones, scheduling robots, even allocating emergency response assets. But when such models “reason” incorrectly, the consequences extend beyond embarrassment—they can endanger lives. Notre Dame’s latest research introduces the concept of a Cognition Envelope, a new class of reasoning guardrail that constrains how foundation models reach and justify their decisions. Unlike traditional safety envelopes that keep drones within physical limits (altitude, velocity, geofence) or meta-cognition that lets an LLM self-critique, cognition envelopes work from outside the reasoning process. They independently evaluate whether a model’s plan makes sense, given real-world constraints and evidence. ...
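As a rough sketch of where such an envelope sits (the constraint values, data structures, and checks below are illustrative assumptions, not Notre Dame's implementation), think of an independent validator that inspects an LLM-proposed plan before the flight controller ever executes it:

```python
# Illustrative plan-level validator: checks a proposed plan from outside the
# model's reasoning, rather than critiquing the reasoning itself.
from dataclasses import dataclass

@dataclass
class Waypoint:
    lat: float
    lon: float
    alt_m: float

# Hypothetical hard constraints; real envelopes also weigh mission evidence.
MAX_ALT_M = 120.0
GEOFENCE = {"lat": (40.0, 40.1), "lon": (-86.3, -86.2)}
BATTERY_PER_WP = 0.04  # assumed fraction of battery consumed per waypoint

def validate_plan(plan: list[Waypoint], battery_frac: float) -> tuple[bool, list[str]]:
    """Accept or reject an LLM-proposed plan before execution."""
    issues: list[str] = []
    for i, wp in enumerate(plan):
        if wp.alt_m > MAX_ALT_M:
            issues.append(f"waypoint {i}: altitude {wp.alt_m} m exceeds {MAX_ALT_M} m")
        if not (GEOFENCE["lat"][0] <= wp.lat <= GEOFENCE["lat"][1]
                and GEOFENCE["lon"][0] <= wp.lon <= GEOFENCE["lon"][1]):
            issues.append(f"waypoint {i}: outside geofence")
    if len(plan) * BATTERY_PER_WP > battery_frac:
        issues.append("plan exceeds remaining battery budget")
    return (not issues, issues)

# Only plans that pass these independent checks reach the flight controller.
ok, issues = validate_plan([Waypoint(40.05, -86.25, 80.0)], battery_frac=0.5)
```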

November 5, 2025 · 4 min · Zelina

Agents with Interest: How Fintech Taught RAG to Read the Fine Print

Opening — Why this matters now
The fintech industry is an alphabet soup of acronyms and compliance clauses. For a large language model (LLM), it’s a minefield of misunderstood abbreviations, half-specified processes, and siloed documentation that lives in SharePoint purgatory. Yet financial institutions are under pressure to make sense of their internal knowledge—securely, locally, and accurately. Retrieval-Augmented Generation (RAG), the method of grounding LLM outputs in retrieved context, has emerged as the go-to approach. But as Mastercard’s recent research shows, standard RAG pipelines choke on the reality of enterprise fintech: fragmented data, undefined acronyms, and role-based access control. The paper Retrieval-Augmented Generation for Fintech: Agentic Design and Evaluation proposes a modular, multi-agent redesign that turns RAG from a passive retriever into an active, reasoning system. ...
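A minimal sketch of what the “agentic” part can mean in practice (the agent roles, helper names, and acronym table are assumptions for illustration, not Mastercard's pipeline): a query-preparation agent expands fintech acronyms, and the retriever enforces role-based access before any context reaches the generator.

```python
# Illustrative agentic RAG flow for an enterprise fintech setting.
# Data and helpers are hypothetical; a real system would use a vector index and a local LLM.

ACRONYMS = {"STP": "straight-through processing", "AML": "anti-money laundering"}

def expand_acronyms(query: str) -> str:
    """Query-preparation agent: rewrite ambiguous abbreviations before retrieval."""
    for short, full in ACRONYMS.items():
        query = query.replace(short, f"{short} ({full})")
    return query

def retrieve(query: str, role: str, corpus: list[dict], k: int = 3) -> list[dict]:
    """Retriever with role-based access control: only documents the caller may see."""
    allowed = [d for d in corpus if role in d["roles"]]
    # Toy lexical scoring stands in for embedding similarity.
    scored = sorted(allowed, key=lambda d: -sum(w in d["text"].lower()
                                                for w in query.lower().split()))
    return scored[:k]

def answer(query: str, role: str, corpus: list[dict]) -> str:
    """Orchestrator: prepare the query, retrieve, then hand grounded context to an LLM."""
    q = expand_acronyms(query)
    ctx = retrieve(q, role, corpus)
    prompt = ("Answer using only this context:\n"
              + "\n".join(d["text"] for d in ctx)
              + f"\n\nQuestion: {q}")
    return prompt  # in a deployed system this prompt would go to a locally hosted LLM
```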

November 4, 2025 · 4 min · Zelina