Differential Privacy

Synthetic Data’s Ghost Problem: Auditing the Leaks That Weren’t

TL;DR for operators: Synthetic data privacy reviews should stop treating every rare match as proof of memorization. That is the useful correction in Phantoms and Disclosures: a Causal Framework for Auditing Synthetic Data, a paper that turns synthetic-data auditing into a controlled experiment rather than an anxious string search.[1] The paper’s mechanism is simple enough to be dangerous in the right way: split the source corpus into training and holdout records; generate synthetic data from the training split; extract rare features from training, holdout, and synthetic data; then ask whether synthetic matches are disproportionately concentrated in the training split. Matches against training records are potential true disclosures. Matches against holdout records are phantom disclosures: things that look like leaks but could have appeared even if that record had never been used. ...
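The train/holdout comparison described above can be sketched in a few lines. This is a minimal illustration, not the paper's procedure: the excerpt does not say how rare features are extracted or which statistical test is used, so this sketch assumes a "rare feature" is simply a value that occurs at most `max_count` times in a split. The function names `rare_features` and `audit_matches` are hypothetical.

```python
from collections import Counter


def rare_features(records, max_count=1):
    """Return the values that occur at most max_count times (assumed notion of 'rare')."""
    counts = Counter(records)
    return {value for value, n in counts.items() if n <= max_count}


def audit_matches(train, holdout, synthetic, max_count=1):
    """Compare how often rare training vs. rare holdout features reappear in synthetic data."""
    rare_train = rare_features(train, max_count)
    rare_holdout = rare_features(holdout, max_count)
    synth = set(synthetic)

    # Rare training features found in synthetic output: potential true disclosures.
    train_hits = len(rare_train & synth)
    # Rare holdout features found in synthetic output: phantom disclosures,
    # since the generator never saw the holdout split.
    phantom_hits = len(rare_holdout & synth)

    train_rate = train_hits / len(rare_train) if rare_train else 0.0
    holdout_rate = phantom_hits / len(rare_holdout) if rare_holdout else 0.0
    return {
        "train_match_rate": train_rate,
        "holdout_match_rate": holdout_rate,
        # A positive gap suggests matches concentrate in the training split.
        "excess_rate": train_rate - holdout_rate,
    }


report = audit_matches(
    train=["a", "b", "c", "c"],
    holdout=["x", "y", "z", "z"],
    synthetic=["a", "b", "x", "q"],
)
```

The holdout match rate acts as the baseline: only the portion of the training match rate that exceeds it is evidence that the generator leaked something it could not have produced otherwise. A real audit would also need a significance test on that gap, which this sketch omits.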

When Privacy Meets Chaos: Making Federated Learning Behave

Privacy is easy to admire in a slide deck. It becomes less elegant when the model begins to behave like a shopping cart with one broken wheel. Federated learning promises a clean bargain: data stay local, clients collaborate, and the central model improves without seeing everyone’s raw records. Add differential privacy, and the promise becomes more formal. Each client update is clipped, noise is injected, and individual influence is bounded. Everyone nods. The architecture looks responsible. ...
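The clip-then-noise step mentioned above can be sketched as follows. This is a generic illustration of the standard Gaussian-mechanism recipe with noise added on the server, not the specific method of the article being summarized. The names `clip_update` and `private_average`, and the parameters `clip_norm` and `noise_multiplier`, are assumptions chosen for the example.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale a client update so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def private_average(updates, clip_norm, noise_multiplier, rng):
    """Average clipped client updates after adding Gaussian noise to their sum."""
    clipped = [clip_update(u, clip_norm) for u in updates]
    dim = len(clipped[0])
    summed = [sum(u[i] for u in clipped) for i in range(dim)]

    # Clipping bounds each client's contribution to the sum by clip_norm,
    # so the noise scale is set relative to that bound.
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [x / len(updates) for x in noisy]


rng = random.Random(0)
aggregate = private_average(
    updates=[[3.0, 4.0], [0.1, -0.2], [10.0, 0.0]],
    clip_norm=1.0,
    noise_multiplier=1.1,
    rng=rng,
)
```

Clipping is what makes individual influence bounded; the noise is what turns that bound into a formal guarantee. Both also distort the aggregate, which is one source of the instability the excerpt alludes to.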

Noise Without Regret: How Error Feedback Fixes Differentially Private Image Generation

Photos are annoying data. They are useful because they contain details: the handle of a bag, the edge of a sleeve, the texture of a face, the faint classroom gesture that matters only after someone trains a model on it. They are risky for exactly the same reason. If a generated image looks too much like the real training data, it may quietly leak what the organization was trying not to reveal. If it is protected too aggressively, it becomes a blurry souvenir from a dataset that used to be useful. ...

Regrets, Graphs, and the Price of Privacy: Federated Causal Discovery Grows Up

A hospital changes its treatment protocol. Another keeps the old one. A third removes an approval step that had quietly influenced several downstream decisions. Their datasets now disagree. The usual federated-learning instinct is to treat that disagreement as a problem: smooth it, average it, or design an aggregation rule robust enough to survive it. In causal discovery, however, some disagreements contain precisely the information the global model lacks. Removing a local dependency can expose a previously hidden causal pattern. A policy difference that looks like statistical inconvenience may function as an accidental experiment. ...

Privacy by Proximity: How Nearest Neighbors Made In-Context Learning Differentially Private

TL;DR for operators: Private examples are not harmless just because they sit inside a prompt rather than inside model weights. In-context learning lets teams adapt a general LLM by adding examples at inference time, which is convenient until those examples are medical notes, legal clauses, customer tickets, invoices, or internal decisions that should not be inferable from the model’s output. ...

When AI Knows It Doesn’t Know: Turning Uncertainty into Strategic Advantage

TL;DR for operators: A model that says “I don’t know” is not automatically trustworthy. It may be cautious. It may be badly calibrated. It may be uncertain for the wrong reasons. It may also be using uncertainty as a very elegant trapdoor. Polite refusal, unfortunately, is still refusal. Stephan Rabanser’s thesis, Uncertainty-Driven Reliability: Selective Prediction and Trustworthy Deployment in Modern Machine Learning, is useful because it treats uncertainty not as a philosophical mood, but as an operational control layer.[1] The key question is not whether a model can emit a confidence score. Most models can emit something confidence-shaped. The harder question is whether that score can decide which cases should be automated, deferred, reviewed, rejected, routed to a larger model, or audited. ...
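One simple way to test whether a confidence score is usable as a control layer is to check what happens when it gates automation. The sketch below assumes a plain threshold rule (automate when confidence is at least `threshold`, defer otherwise) and reports coverage and accuracy on the automated cases. It is a generic selective-prediction illustration, not a method taken from the thesis, and `selective_report` is a hypothetical name.

```python
def selective_report(confidences, correct, threshold):
    """Return (coverage, selective_accuracy) for a confidence threshold.

    coverage: fraction of cases the model handles on its own.
    selective_accuracy: accuracy on those cases, or None if none are accepted.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        return 0.0, None

    # Cases at or above the threshold are automated; the rest are deferred.
    accepted = [ok for c, ok in zip(confidences, correct) if c >= threshold]
    coverage = len(accepted) / len(confidences)
    accuracy = sum(accepted) / len(accepted) if accepted else None
    return coverage, accuracy


confidences = [0.9, 0.8, 0.6, 0.4]
correct = [True, True, False, True]
strict = selective_report(confidences, correct, threshold=0.7)
loose = selective_report(confidences, correct, threshold=0.5)
```

If raising the threshold does not raise accuracy on the accepted cases, the score is confidence-shaped but not informative enough to route work.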