Cover image

The Phantom Menace in Your Knowledge Base

Retrieval-Augmented Generation (RAG) may seem like a fortress of AI reliability—until you realize the breach happens at the front door, not in the model. Large Language Models (LLMs) have become the backbone of enterprise AI assistants. Yet as more systems integrate RAG pipelines to improve their factuality and domain alignment, a gaping blindspot has emerged—the document ingestion layer. A new paper titled “The Hidden Threat in Plain Text” by Castagnaro et al. warns that attackers don’t need to jailbreak your model or infiltrate your vector store. Instead, they just need to hand you a poisoned DOCX, PDF, or HTML file. And odds are, your RAG system will ingest it—invisibly. ...

July 8, 2025 · 3 min · Zelina
Cover image

Swiss Cheese for Superintelligence: How STACK Reveals the Fragility of LLM Safeguards

In the race to secure frontier large language models (LLMs), defense-in-depth has become the go-to doctrine. Inspired by aviation safety and nuclear containment, developers like Anthropic and Google DeepMind are building multilayered safeguard pipelines to prevent catastrophic misuse. But what if these pipelines are riddled with conceptual holes? What if their apparent robustness is more security theater than security architecture? The new paper STACK: Adversarial Attacks on LLM Safeguard Pipelines delivers a striking answer: defense-in-depth can be systematically unraveled, one stage at a time. The researchers not only show that existing safeguard models are surprisingly brittle, but also introduce a novel staged attack—aptly named STACK—that defeats even strong pipelines designed to reject dangerous outputs like how to build chemical weapons. ...

July 1, 2025 · 3 min · Zelina