Privacy

The LoRA Mirage: Why Lightweight Finetuning Isn't Lightweight on Privacy

TL;DR for operators: Adapters look small. The privacy surface is not. The paper behind LoRA-Leak argues that LoRA fine-tuning does not magically protect the records used to specialise a language model.[1] Even though LoRA trains only low-rank adapter weights while leaving the base model frozen, the resulting model can still leak membership information: an attacker may infer whether a given sample was part of the fine-tuning dataset. ...
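To make "membership inference" concrete, here is a minimal sketch of the simplest textbook variant, a loss-threshold test. It is not the attack evaluated in the paper; the function names and the loss values are hypothetical, and it assumes the attacker can already obtain a per-sample loss from the fine-tuned model.

```python
from statistics import mean

def calibrate_threshold(member_losses, nonmember_losses):
    """Midpoint between the mean loss of known members and known non-members."""
    return (mean(member_losses) + mean(nonmember_losses)) / 2

def infer_membership(loss, threshold):
    """Guess 'member' when the model's loss on a sample falls below the threshold."""
    return loss < threshold

# Hypothetical per-sample losses; a real attacker would compute these by
# scoring candidate records with the fine-tuned model.
known_member_losses = [0.8, 1.0, 1.2]
known_nonmember_losses = [2.0, 2.4, 2.2]
threshold = calibrate_threshold(known_member_losses, known_nonmember_losses)

likely_member = infer_membership(0.9, threshold)
likely_nonmember = infer_membership(2.5, threshold)
```

The intuition is that a model tends to be more confident on records it was trained on, so an unusually low loss is weak evidence of membership. Freezing the base model does not remove that signal, because the adapter still fits the fine-tuning records.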

The Invisible Hand in the Machine: Rethinking AI Through a Collectivist Lens

TL;DR for operators: Users do not experience an AI product as a theorem. They experience it as a bargain. They give data, attention, labour, trust, prompts, feedback, documents, creative work, behavioural traces, and sometimes money. In return, they expect useful output, lower friction, safer decisions, visibility, compensation, privacy, or at least not being quietly turned into unpaid infrastructure. The bargain may be explicit. More often, because apparently we enjoy building planetary-scale systems on implied consent and vibes, it is not. ...

Collapse to Forget: Turning Model Collapse into a Privacy Feature for LLMs

TL;DR for operators: When an LLM leaks sensitive, copyrighted, or otherwise forbidden information, the obvious repair is to fine-tune it away from the bad answer. That sounds sensible until you notice the small operational comedy: the remediation process keeps using the very answer it is supposed to remove. The paper behind this article proposes Partial Model Collapse (PMC), a machine unlearning method that avoids directly optimising on ground-truth forget answers. Instead, PMC asks the model the sensitive question, samples multiple responses from the model itself, selects a response that is less like the model’s original answer, and fine-tunes on that self-generated response while also training on retain data to preserve general utility.[1] ...
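The selection step described above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: the string similarity from `difflib` is a stand-in for whatever criterion PMC actually uses, and every function name and example string here is hypothetical. Sampling from the model and the fine-tuning step itself are left out.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in metric, not PMC's."""
    return SequenceMatcher(None, a, b).ratio()

def select_forget_target(original_answer, sampled_responses):
    """Pick the sampled response least similar to the model's original answer."""
    return min(sampled_responses, key=lambda r: similarity(original_answer, r))

def build_training_batch(question, original_answer, sampled_responses, retain_examples):
    """One self-generated forget pair plus retain pairs to preserve utility."""
    target = select_forget_target(original_answer, sampled_responses)
    return [(question, target)] + list(retain_examples)

# Hypothetical data: the model's own original answer and three fresh samples.
question = "Where does Alice live?"
original = "Alice lives at 12 Elm Street"
samples = [
    "Alice lives at 12 Elm Street.",
    "Alice lives on Elm Street",
    "I cannot share that information",
]
retain = [("What is 2 + 2?", "4")]

batch = build_training_batch(question, original, samples, retain)
```

Note that the comparison is against the model's own original output, so the ground-truth forget answer never has to appear in the training data.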

The CoRAG Deal: RAG Without the Privacy Plot Twist

TL;DR for operators: CoRAG is not “RAG, but with more documents.” It is a way to let multiple organizations train a shared retrieval-augmented model while keeping their labeled question-answer data local. That matters because labels are usually the expensive, sensitive, commercially revealing part. Market documents, manuals, policies, public reports, and technical references are often easier to share than the annotations that say which answer was correct, for whom, and under what business condition. Tiny distinction. Large legal bill avoided. ...

Smart, Private AI Workflows for Small Firms to Save Costs and Protect Data

TL;DR for operators: Month-end close is not where small firms discover their love of manual labour. It is where invoices arrive half-labelled, clients reply with attachments named final_final_real.xlsx, and a senior accountant spends expensive hours doing work that is intellectually closer to sorting laundry than advising a business. The practical AI opportunity for small accounting and professional service firms is not “give everyone a chatbot and hope the profession becomes futuristic by Friday.” The better architecture is a cost-aware, privacy-first workflow: classify the task, remove or mask sensitive data where possible, retrieve the right firm knowledge, route the easy work to cheap or local tools, escalate uncertain cases to stronger models, and keep humans in charge of outputs that affect filings, financial statements, tax positions, or client advice. ...
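The mask, route, and escalate steps of that workflow can be sketched as follows. Everything here is an assumption made for illustration: the regular expressions, the keyword list, the route names, and the confidence threshold are hypothetical, the confidence score is taken as given from some upstream classifier, and retrieval of firm knowledge is omitted.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns; real masking needs a far more careful rule set.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
LONG_NUMBER = re.compile(r"\b\d{8,}\b")
# Hypothetical keywords marking outputs a human must sign off on.
HIGH_STAKES = ("filing", "tax", "financial statement", "advice")

@dataclass
class Decision:
    masked_text: str
    route: str
    needs_human_review: bool

def mask_sensitive(text: str) -> str:
    """Replace email addresses and long digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return LONG_NUMBER.sub("[NUMBER]", text)

def route_task(text: str, confidence: float, threshold: float = 0.8) -> Decision:
    """Send confident cases to a local tool; escalate uncertain ones."""
    masked = mask_sensitive(text)
    route = "local" if confidence >= threshold else "strong_model"
    needs_review = any(keyword in text.lower() for keyword in HIGH_STAKES)
    return Decision(masked, route, needs_review)

easy = route_task("Send invoice to bob@example.com for account 1234567890", 0.9)
hard = route_task("Draft tax position memo", 0.5)
```

Masking happens before any routing decision, so the text that leaves the firm is already scrubbed, and the human-review flag is set independently of which model handles the task.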