
CompactRAG: When Multi-Hop Reasoning Stops Burning Tokens

Opening — Why this matters now Multi-hop reasoning has quietly become one of the most expensive habits in modern AI systems. Every additional hop—every “and then what?”—typically triggers another retrieval, another prompt expansion, another LLM call. Accuracy improves, yes, but so does the bill. CompactRAG enters this conversation with a refreshingly unfashionable claim: most of this cost is structural, not inevitable. If you stop forcing LLMs to repeatedly reread the same knowledge, multi-hop reasoning does not have to scale linearly in tokens—or in money. ...
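The scaling claim can be illustrated with back-of-the-envelope token arithmetic. All numbers below are invented for illustration, not taken from the paper:

```python
# Back-of-envelope comparison (all numbers are invented for illustration).
# A naive multi-hop RAG pipeline resends the full retrieved context on
# every hop; a compact variant pays for the shared knowledge once and
# sends only per-hop deltas afterward.
CONTEXT_TOKENS = 2000   # shared retrieved knowledge
HOP_TOKENS = 200        # per-hop question/answer overhead
HOPS = 4

naive = HOPS * (CONTEXT_TOKENS + HOP_TOKENS)   # rereads context on each hop
compact = CONTEXT_TOKENS + HOPS * HOP_TOKENS   # context is paid for once

print(naive, compact)  # the naive bill grows with context * hops; the compact one doesn't
```

Under these toy numbers the naive pipeline spends 8,800 tokens against 2,800 for the compact one, and the gap widens with every additional hop.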

February 8, 2026 · 3 min · Zelina

SD‑RAG: Don’t Trust the Model, Trust the Pipeline

Opening — Why this matters now RAG was supposed to make LLMs safer. Instead, it quietly became a liability. As enterprises rushed to bolt retrieval layers onto large language models, they unintentionally created a new attack surface: sensitive internal data flowing straight into a model that cannot reliably distinguish instructions from content. Prompt injection is not a corner case anymore—it is the default threat model. And telling the model to “behave” has proven to be more of a suggestion than a guarantee. ...

January 20, 2026 · 4 min · Zelina

Bubble Trouble: Why Top‑K Retrieval Keeps Letting LLMs Down

Opening — Why this matters now Enterprise teams didn’t adopt RAG to win leaderboard benchmarks. They adopted it to answer boring, expensive questions buried inside spreadsheets, PDFs, and contracts—accurately, repeatably, and with citations they can defend. That’s where things quietly break. Top‑K retrieval looks competent in demos, then collapses in production. The model sees plenty of text, yet still misses conditional clauses, material constraints, or secondary scope definitions. The failure mode isn’t hallucination in the usual sense. It’s something more procedural: the right information exists, but it never makes it into the context window in the first place. ...

January 16, 2026 · 4 min · Zelina

Replace, Don’t Expand: When RAG Learns to Throw Things Away

Opening — Why this matters now RAG systems are having an identity crisis. On paper, retrieval-augmented generation is supposed to ground large language models in facts. In practice, when queries require multi-hop reasoning, most systems panic and start hoarding context like it’s a survival skill. Add more passages. Expand the window. Hope the model figures it out. ...

December 12, 2025 · 4 min · Zelina

Privacy by Proximity: How Nearest Neighbors Made In-Context Learning Differentially Private

Opening — Why this matters now As large language models (LLMs) weave themselves into every enterprise workflow, a quieter issue looms: the privacy of the data used to prompt them. In‑context learning (ICL) — the art of teaching a model through examples in its prompt — is fast, flexible, and dangerously leaky. Each query could expose confidential examples from private datasets. Enter differential privacy (DP), the mathematical armor for sensitive data — except until now, DP methods for ICL have been clumsy and utility‑poor. ...

November 8, 2025 · 4 min · Zelina

Agents with Interest: How Fintech Taught RAG to Read the Fine Print

Opening — Why this matters now The fintech industry is an alphabet soup of acronyms and compliance clauses. For a large language model (LLM), it’s a minefield of misunderstood abbreviations, half-specified processes, and siloed documentation that lives in SharePoint purgatory. Yet financial institutions are under pressure to make sense of their internal knowledge—securely, locally, and accurately. Retrieval-Augmented Generation (RAG), the method of grounding LLM outputs in retrieved context, has emerged as the go-to approach. But as Mastercard’s recent research shows, standard RAG pipelines choke on the reality of enterprise fintech: fragmented data, undefined acronyms, and role-based access control. The paper Retrieval-Augmented Generation for Fintech: Agentic Design and Evaluation proposes a modular, multi-agent redesign that turns RAG from a passive retriever into an active, reasoning system. ...

November 4, 2025 · 4 min · Zelina

Confounder Hunters: How LLM Agents are Rewriting the Rules of Causal Inference

When Hidden Variables Become Hidden Costs In causal inference, confounders are the uninvited guests at your data party — variables that influence both treatment and outcome, quietly skewing results. In healthcare, failing to adjust for them can turn life-saving insights into misleading noise. Traditionally, finding these culprits has been the realm of domain experts, a slow and costly process that doesn’t scale well. The paper from National Sun Yat-Sen University proposes a radical alternative: put Large Language Model (LLM)-based agents into the causal inference loop. These agents don’t just crunch numbers — they reason, retrieve domain knowledge, and iteratively refine estimates, effectively acting as tireless, always-available junior experts. ...

August 12, 2025 · 3 min · Zelina

GraphRAG Without the Drag: Scaling Knowledge-Augmented LLMs to Web-Scale

When it comes to retrieval-augmented generation (RAG), size matters—but not in the way you might think. Most high-performing GraphRAG systems extract structured triples (subject, predicate, object) from texts using large language models (LLMs), then link them to form reasoning chains. But this method doesn’t scale: if your corpus contains millions of documents, pre-processing every one with an LLM becomes prohibitively expensive. That’s the bottleneck the authors of “Millions of GeAR-s” set out to solve. And their solution is elegant: skip the LLM-heavy preprocessing entirely, and use existing knowledge graphs (like Wikidata) as a reasoning scaffold. ...
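The core idea—walking an existing knowledge graph instead of extracting triples with an LLM—can be sketched in a few lines. The triples and entity names below are toy stand-ins for a real KG like Wikidata, not examples from the paper:

```python
# Toy sketch: multi-hop reasoning over a pre-existing knowledge graph.
# The triples are illustrative stand-ins for Wikidata-style facts; no LLM
# is needed to build the scaffold, because the triples already exist.
TRIPLES = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
    ("Poland", "member_of", "EU"),
]

def hop_chain(start, max_hops=3):
    """Follow outgoing edges from `start`, collecting a reasoning chain."""
    chain, node = [], start
    for _ in range(max_hops):
        # Pick the first outgoing edge from the current node, if any.
        step = next(((p, o) for s, p, o in TRIPLES if s == node), None)
        if step is None:
            break
        chain.append((node, *step))
        node = step[1]  # hop to the object of the triple
    return chain

print(hop_chain("Marie Curie"))
```

Here the three-hop chain from “Marie Curie” to “EU” falls out of pure graph traversal—the expensive LLM call is deferred to the final answer-generation step rather than spent on preprocessing every document.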

July 24, 2025 · 3 min · Zelina