
When ESG Meets LLM: Decoding Corporate Green Talk on Social Media

Opening — Why this matters now

Corporate sustainability is having a content crisis. Brands flood X (formerly Twitter) with green-themed posts, pledging allegiance to the UN’s Sustainable Development Goals (SDGs) while their real-world actions remain opaque. The question is no longer who is talking about sustainability—it’s what they are actually saying, and whether it means anything at all. A new study from the University of Amsterdam offers a data-driven lens on this problem. By combining large language models (LLMs) and vision-language models (VLMs), the researchers have built a multimodal pipeline that decodes the texture of corporate sustainability messaging across millions of social media posts. Their goal: to map not what companies claim, but how they construct the narrative of being sustainable. ...
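To make the pipeline concrete, here is a minimal offline sketch of the LLM+VLM tagging idea, assuming a placeholder SDG theme list and keyword matching standing in for the model calls (none of this is the paper's code):

```python
from dataclasses import dataclass

# Illustrative SDG theme subset; the paper's actual label set is richer.
SDG_THEMES = ["climate action", "clean energy", "responsible consumption"]

@dataclass
class Post:
    text: str
    image_caption: str  # in the real pipeline, a VLM would produce this from the image

def tag_sdg_themes(post: Post) -> list[str]:
    """Stand-in for the LLM classifier: tag a post with the SDG themes it invokes.

    Keyword matching plays the LLM's role so the sketch runs offline;
    a production pipeline would call a real model here.
    """
    combined = f"{post.text} {post.image_caption}".lower()
    return [t for t in SDG_THEMES if any(word in combined for word in t.split())]

posts = [
    Post("Taking climate action: our offices now run on clean rooftop energy!",
         "solar panels on a rooftop"),
    Post("New quarterly results are in.", "bar chart of revenue"),
]
for p in posts:
    print(p.text[:45], "->", tag_sdg_themes(p))
```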

November 6, 2025 · 4 min · Zelina

When RAG Meets the Law: Building Trustworthy Legal AI for a Moving Target

Opening — Why this matters now

Legal systems are allergic to uncertainty. Yet AI thrives on it. As generative models step into the courtroom—drafting opinions, analyzing precedents, even suggesting verdicts—the question is no longer can they help, but can we trust them? The stakes are existential: a hallucinated statute or a misapplied precedent isn’t a typo; it’s a miscarriage of justice. The paper Hybrid Retrieval-Augmented Generation Agent for Trustworthy Legal Question Answering in Judicial Forensics offers a rare glimpse at how to close this credibility gap. ...
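For readers unfamiliar with the "hybrid" part: it typically means fusing lexical and semantic retrieval scores before generation. A toy sketch of that fusion, with invented documents and crude stand-ins for BM25 and dense embeddings (the weighting is an assumption, not the paper's recipe):

```python
import math
from collections import Counter

DOCS = {
    "stat-101": "A statute governing admissibility of forensic evidence.",
    "case-742": "Precedent on expert testimony in judicial forensics.",
    "memo-003": "Office memo about parking arrangements.",
}

def lexical_score(query: str, doc: str) -> float:
    """Crude stand-in for BM25: fraction of query terms present in the doc."""
    terms, text = query.lower().split(), doc.lower()
    return sum(t in text for t in terms) / len(terms)

def vector_score(query: str, doc: str) -> float:
    """Crude stand-in for a dense retriever: cosine similarity of term counts."""
    qv, dv = Counter(query.lower().split()), Counter(doc.lower().split())
    dot = sum(qv[t] * dv[t] for t in qv)
    norm = math.sqrt(sum(v * v for v in qv.values())) * math.sqrt(sum(v * v for v in dv.values()))
    return dot / norm if norm else 0.0

def hybrid_rank(query: str, alpha: float = 0.5):
    """Blend the two signals; alpha is a tunable assumption, not a published value."""
    scored = {
        doc_id: alpha * lexical_score(query, text) + (1 - alpha) * vector_score(query, text)
        for doc_id, text in DOCS.items()
    }
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)

print(hybrid_rank("forensic evidence precedent"))
```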

November 6, 2025 · 4 min · Zelina

When Drones Think Too Much: Defining Cognition Envelopes for Bounded AI Reasoning

Why this matters now

As AI systems move from chatbots to control towers, the stakes of their hallucinations have escalated. Large Language Models (LLMs) and Vision-Language Models (VLMs) now make—or at least recommend—decisions in physical space: navigating drones, scheduling robots, even allocating emergency response assets. But when such models “reason” incorrectly, the consequences extend beyond embarrassment—they can endanger lives. Notre Dame’s latest research introduces the concept of a Cognition Envelope, a new class of reasoning guardrail that constrains how foundation models reach and justify their decisions. Unlike traditional safety envelopes that keep drones within physical limits (altitude, velocity, geofence) or meta-cognition that lets an LLM self-critique, cognition envelopes work from outside the reasoning process. They independently evaluate whether a model’s plan makes sense, given real-world constraints and evidence. ...
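A cognition envelope, in its simplest form, is an outer check that validates the plan a model emits, never the reasoning that produced it. A toy sketch with invented constraint values:

```python
from dataclasses import dataclass

@dataclass
class Waypoint:
    x: float
    y: float
    altitude_m: float

# Hard limits the envelope enforces; these numbers are illustrative only.
MAX_ALTITUDE_M = 120.0
GEOFENCE_RADIUS_M = 500.0
BATTERY_BUDGET_M = 1500.0  # max total path length the battery supports

def check_plan(plan: list[Waypoint]) -> tuple[bool, str]:
    """Independently validate a model-proposed plan against physical constraints.

    Note this never inspects the model's reasoning, only its output plan,
    which is the point of an outer envelope.
    """
    dist = 0.0
    prev = Waypoint(0.0, 0.0, 0.0)  # assume launch at the origin
    for wp in plan:
        if wp.altitude_m > MAX_ALTITUDE_M:
            return False, f"altitude {wp.altitude_m} m exceeds ceiling"
        if (wp.x ** 2 + wp.y ** 2) ** 0.5 > GEOFENCE_RADIUS_M:
            return False, "waypoint outside geofence"
        dist += ((wp.x - prev.x) ** 2 + (wp.y - prev.y) ** 2
                 + (wp.altitude_m - prev.altitude_m) ** 2) ** 0.5
        prev = wp
    if dist > BATTERY_BUDGET_M:
        return False, f"path length {dist:.0f} m exceeds battery budget"
    return True, "plan accepted"

print(check_plan([Waypoint(100, 50, 80), Waypoint(300, 200, 100)]))
print(check_plan([Waypoint(100, 50, 200)]))  # rejected: altitude above ceiling
```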

November 5, 2025 · 4 min · Zelina

Agents with Interest: How Fintech Taught RAG to Read the Fine Print

Opening — Why this matters now

The fintech industry is an alphabet soup of acronyms and compliance clauses. For a large language model (LLM), it’s a minefield of misunderstood abbreviations, half-specified processes, and siloed documentation that lives in SharePoint purgatory. Yet financial institutions are under pressure to make sense of their internal knowledge—securely, locally, and accurately. Retrieval-Augmented Generation (RAG), the method of grounding LLM outputs in retrieved context, has emerged as the go-to approach. But as Mastercard’s recent research shows, standard RAG pipelines choke on the reality of enterprise fintech: fragmented data, undefined acronyms, and role-based access control. The paper Retrieval-Augmented Generation for Fintech: Agentic Design and Evaluation proposes a modular, multi-agent redesign that turns RAG from a passive retriever into an active, reasoning system. ...
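Two of the failure modes named above, undefined acronyms and role-based access, map naturally onto pre- and post-retrieval agents. A minimal sketch of that wiring, using an invented glossary and permission model rather than Mastercard's actual design:

```python
# Illustrative glossary and corpus; a real system would load these from internal sources.
GLOSSARY = {"PSD2": "Payment Services Directive 2", "AML": "anti-money laundering"}

DOCS = [
    {"id": 1, "text": "PSD2 Payment Services Directive 2 compliance checklist",
     "allowed_roles": {"compliance"}},
    {"id": 2, "text": "anti-money laundering AML escalation process",
     "allowed_roles": {"compliance", "ops"}},
    {"id": 3, "text": "cafeteria menu", "allowed_roles": {"compliance", "ops", "all"}},
]

def expand_acronyms(query: str) -> str:
    """Pre-retrieval agent: rewrite the query so retrieval sees the full terms."""
    for short, full in GLOSSARY.items():
        if short in query:
            query = query.replace(short, f"{short} ({full})")
    return query

def retrieve(query: str, role: str, k: int = 2) -> list[dict]:
    """Retrieval with an RBAC filter: documents the role cannot see are never returned."""
    visible = [d for d in DOCS if role in d["allowed_roles"]]
    terms = query.lower().replace("(", " ").replace(")", " ").replace("?", " ").split()
    scored = sorted(visible, key=lambda d: -sum(t in d["text"].lower() for t in terms))
    return scored[:k]

q = expand_acronyms("What is our AML process?")
print(q)
print([d["id"] for d in retrieve(q, role="ops")])
```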

November 4, 2025 · 4 min · Zelina

The Memory Illusion: Why AI Still Forgets Who It Is

Opening — Why this matters now

Every AI company wants its assistant to feel personal. Yet every conversation starts from zero. Your favorite chatbot may recall facts, summarize documents, even mimic a tone — but beneath the fluent words, it suffers from a peculiar amnesia. It remembers nothing unless reminded, apologizes often, and contradicts itself with unsettling confidence. The question emerging from Stefano Natangelo’s “Narrative Continuity Test (NCT)” is both philosophical and practical: Can an AI remain the same someone across time? ...

November 3, 2025 · 4 min · Zelina

Two Minds in One Machine: How Agentic AI Splits—and Reunites—the Field

Opening — Why this matters now

Agentic AI is the latest obsession in artificial intelligence: systems that don’t just respond but decide. They plan, delegate, and act—sometimes without asking for permission. Yet as hype grows, confusion spreads. Many conflate these new multi-agent architectures with the old, symbolic dream of reasoning machines from the 1980s. The result? Conceptual chaos. A recent survey—Agentic AI: A Comprehensive Survey of Architectures, Applications, and Future Directions—cuts through the noise. It argues that today’s agentic systems are not the heirs of symbolic AI but the offspring of neural, generative models. In other words: we’ve been speaking two dialects of intelligence without realizing it. ...

November 3, 2025 · 4 min · Zelina

Who Really Runs the Workflow? Ranking Agent Influence in Multi-Agent AI Systems

Opening — Why this matters now

Multi-agent systems — the so-called Agentic AI Workflows — are rapidly becoming the skeleton of enterprise-grade automation. They promise autonomy, composability, and scalability. But beneath this elegant choreography lies a governance nightmare: we often have no idea which agent is actually in charge. Imagine a digital factory of LLMs: one drafts code, another critiques it, a third summarizes results, and a fourth audits everything. When something goes wrong — toxic content, hallucinated outputs, or runaway costs — who do you blame? More importantly, which agent do you fix? ...
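One generic way to make "who is in charge" measurable (a crude illustration, not necessarily the paper's method) is ablation: rerun the workflow with each agent disabled and score how much the output degrades:

```python
def run_workflow(active_agents: set[str]) -> float:
    """Stand-in for executing the pipeline and scoring its output on [0, 1].

    Hard-coded contributions simulate a workflow where the critic matters
    most; a real study would rerun the agents and score actual outputs.
    """
    contribution = {"drafter": 0.30, "critic": 0.40, "summarizer": 0.10, "auditor": 0.15}
    return sum(v for agent, v in contribution.items() if agent in active_agents)

AGENTS = {"drafter", "critic", "summarizer", "auditor"}
baseline = run_workflow(AGENTS)

# Ablation influence: how much output quality drops when each agent is removed.
influence = {a: baseline - run_workflow(AGENTS - {a}) for a in AGENTS}
for agent, score in sorted(influence.items(), key=lambda kv: -kv[1]):
    print(f"{agent:<12} influence = {score:.2f}")
```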

November 3, 2025 · 5 min · Zelina

When Rules Go Live: Policy Cards and the New Language of AI Governance

In 2019, Model Cards made AI systems more transparent by documenting what they were trained to do. Then came Data Cards and System Cards, clarifying how datasets and end-to-end systems behave. But as AI moves from prediction to action—from chatbots to trading agents, surgical robots, and autonomous research assistants—documentation is no longer enough. We need artifacts that don’t just describe a system, but govern it. ...
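What "govern, not just describe" could look like in practice: a machine-readable card consulted before the agent acts. The schema and limits below are hypothetical, purely to show the runtime-gate idea:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PolicyCard:
    """A machine-checkable policy artifact -- hypothetical schema, not a standard."""
    system: str
    max_transaction_usd: float
    requires_human_approval_above_usd: float
    forbidden_actions: frozenset

CARD = PolicyCard(
    system="trading-agent-v2",
    max_transaction_usd=10_000.0,
    requires_human_approval_above_usd=1_000.0,
    forbidden_actions=frozenset({"short_sell", "margin_trade"}),
)

def authorize(action: str, amount_usd: float) -> str:
    """Runtime gate: the card is consulted before the agent acts, not after."""
    if action in CARD.forbidden_actions:
        return "deny"
    if amount_usd > CARD.max_transaction_usd:
        return "deny"
    if amount_usd > CARD.requires_human_approval_above_usd:
        return "escalate to human"
    return "allow"

print(authorize("buy", 500.0))         # allow
print(authorize("buy", 5_000.0))       # escalate to human
print(authorize("short_sell", 100.0))  # deny
```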

November 2, 2025 · 4 min · Zelina

Forgetting by Design: Turning GDPR into a Systems Problem for LLMs

The “right to be forgotten” (GDPR Art. 17) has always seemed like kryptonite for large language models. Once a trillion-parameter system memorizes personal data, how can it truly be erased without starting training from scratch? Most prior attempts—whether using influence functions or alignment-style fine-tuning—felt like damage control: approximate, unverifiable, and too fragile to withstand regulatory scrutiny. This new paper, Unlearning at Scale, turns the problem on its head. It argues that forgetting is not a mathematical optimization problem, but a systems engineering challenge. If training can be made deterministic and auditable, then unlearning can be handled with the same rigor as database recovery or transaction rollbacks. ...
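The rollback analogy suggests a concrete mechanism: if every training step is a deterministic function of logged batches, then deleting a record means replaying the log without it from the nearest checkpoint. A toy hash-chain model of that replay logic (the paper's actual machinery is far more involved):

```python
import hashlib

def train_step(state: str, record: str) -> str:
    """Deterministic 'update': same state + same record always yields the same state.

    A hash chain stands in for gradient updates so the rollback property
    is easy to verify by eye.
    """
    return hashlib.sha256((state + record).encode()).hexdigest()

def train(records: list[str], init: str = "init") -> str:
    state = init
    for r in records:
        state = train_step(state, r)
    return state

log = ["alice", "bob", "carol"]  # audited training log
model_with_bob = train(log)

# Unlearning 'bob' == replaying the log without him, from the initial state
# (or from the last checkpoint taken before 'bob' entered the log).
model_without_bob = train([r for r in log if r != "bob"])

# The unlearned model is bit-identical to one never trained on 'bob':
assert model_without_bob == train(["alice", "carol"])
print("exact unlearning verified:", model_without_bob != model_with_bob)
```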

August 19, 2025 · 3 min · Zelina

From Ballots to Budgets: Can LLMs Be Trusted as Social Planners?

When you think of AI in public decision-making, you might picture chatbots handling service requests or predictive models flagging infrastructure risks. But what if we let large language models (LLMs) actually allocate resources—acting as digital social planners? That’s exactly what this new study tested, using Participatory Budgeting (PB) both as a practical decision-making task and as a dynamic benchmark for LLM reasoning.

Why Participatory Budgeting Is the Perfect Testbed

PB is more than a budgeting exercise. Citizens propose and vote on projects—parks, public toilets, community centers—and decision-makers choose a subset to fund within a fixed budget. It’s a constrained optimization problem with a human twist: budgets, diverse preferences, and sometimes mutually exclusive projects. ...
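Stripped of the human twist, the planner's core task is a 0/1 knapsack with side constraints: maximize vote-weighted value within the budget while respecting exclusivity. A brute-force sketch with invented projects:

```python
from itertools import combinations

# (name, cost, votes) -- illustrative projects, not real PB data.
PROJECTS = [("park", 60, 120), ("toilets", 30, 80), ("center", 70, 150), ("mural", 20, 40)]
BUDGET = 100
MUTUALLY_EXCLUSIVE = {("park", "center")}  # can't fund both

def feasible(subset) -> bool:
    names = {p[0] for p in subset}
    within_budget = sum(p[1] for p in subset) <= BUDGET
    no_conflict = not any(a in names and b in names for a, b in MUTUALLY_EXCLUSIVE)
    return within_budget and no_conflict

# Enumerate every feasible subset and keep the one with the most votes.
best = max(
    (s for r in range(len(PROJECTS) + 1) for s in combinations(PROJECTS, r) if feasible(s)),
    key=lambda s: sum(p[2] for p in s),
)
print([p[0] for p in best], "votes:", sum(p[2] for p in best))
```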

August 11, 2025 · 3 min · Zelina