Cover image

The Trojan GAN: Turning LLM Jailbreaks into Security Shields

For years, LLM security research has mirrored the cybersecurity arms race: attackers find novel jailbreak prompts, defenders patch with filters or fine-tuning. But in this morning’s arXiv drop, a paper titled “CAVGAN: Unifying Jailbreak and Defense of LLMs via Generative Adversarial Attacks” proposes something fundamentally different: a single framework that learns to attack and defend simultaneously, using a GAN trained on internal embeddings. This paradigm shift offers not only better performance on both sides of the battlefield, but a new perspective on what it means to “align” a model. ...

July 9, 2025 · 3 min · Zelina
Cover image

Beyond the Pareto Frontier: Pricing LLM Mistakes in the Real World

For all the hype about model accuracy, inference cost, and latency, most organizations are still squinting at scatter plots to decide which large language model (LLM) to use. But what if we could cut through the tradeoff fog with a single number that tells you exactly which model is worth deploying—for your use case, under your constraints? That’s the bold proposal in a recent paper by Zellinger and Thomson from Caltech: treat LLM selection as an economic decision. Rather than searching for models on the accuracy-cost “Pareto frontier,” they suggest an approach grounded in price-tagging errors, delays, and abstentions in dollar terms. Think of it as a model selection framework that answers: How much is a mistake worth to you? ...

July 8, 2025 · 4 min · Zelina
Cover image

Collapse to Forget: Turning Model Collapse into a Privacy Feature for LLMs

Machine unlearning, once a fringe technical curiosity, is fast becoming a legal and ethical imperative. With increasing regulatory demands like the GDPR’s “right to be forgotten,” AI developers are being asked a hard question: Can a large language model truly forget? A new paper from researchers at TUM and Mila provides an unexpectedly elegant answer. Instead of fighting model collapse—the phenomenon where iterative finetuning on synthetic data causes a model to forget—they propose embracing it. ...

July 8, 2025 · 4 min · Zelina
Cover image

Mind Games: How LLMs Subtly Rewire Human Judgment

“The most dangerous biases are not the ones we start with, but the ones we adopt unknowingly.” Large language models (LLMs) like GPT and LLaMA increasingly function as our co-pilots—summarizing reviews, answering questions, and fact-checking news. But a new study from UC San Diego warns: these models may not just be helping us think—they may also be nudging us how to think. The paper, titled “How Much Content Do LLMs Generate That Induces Cognitive Bias in Users?”, dives into the subtle but significant ways in which LLM-generated outputs reframe, reorder, or even fabricate information—leading users to adopt distorted views without realizing it. This isn’t just about factual correctness. It’s about cognitive distortion: the framing, filtering, and fictionalizing that skews human judgment. ...

July 8, 2025 · 4 min · Zelina
Cover image

Passing Humanity's Last Exam: X-Master and the Emergence of Scientific AI Agents

Is it possible to train a language model to become a capable scientist? That provocative question lies at the heart of a new milestone in AI research. In SciMaster: Towards General-Purpose Scientific AI Agents, a team from Shanghai Jiao Tong University introduces X-Master, a tool-augmented open-source agent that has just achieved the highest score ever recorded on Humanity’s Last Exam (HLE)—surpassing even OpenAI and Google. But what makes this feat more than just a leaderboard update is how X-Master got there. Instead of training a larger model or fine-tuning on more data, the researchers innovated on agentic architecture and inference-time workflows. The result? An extensible framework that emulates the exploratory behavior of human scientists, not just their answers. ...

July 8, 2025 · 4 min · Zelina
Cover image

The Phantom Menace in Your Knowledge Base

Retrieval-Augmented Generation (RAG) may seem like a fortress of AI reliability—until you realize the breach happens at the front door, not in the model. Large Language Models (LLMs) have become the backbone of enterprise AI assistants. Yet as more systems integrate RAG pipelines to improve their factuality and domain alignment, a gaping blindspot has emerged—the document ingestion layer. A new paper titled “The Hidden Threat in Plain Text” by Castagnaro et al. warns that attackers don’t need to jailbreak your model or infiltrate your vector store. Instead, they just need to hand you a poisoned DOCX, PDF, or HTML file. And odds are, your RAG system will ingest it—invisibly. ...

July 8, 2025 · 3 min · Zelina
Cover image

Backtrack to the Future: How ASTRO Teaches LLMs to Think Like Search Algorithms

A persistent mystery in the recent surge of reasoning-augmented LLMs—like OpenAI’s o1 or DeepSeek-R1—is whether these models learn to reason through post hoc reinforcement fine-tuning, or if they were already good at it to begin with. ASTRO offers a rare counter-example: a method that imbues non-reasoner LLMs (like vanilla Llama 3) with structured reasoning behavior from scratch. Rather than rely on emergent capabilities or distillation from models that already search well, ASTRO teaches LLMs to think like search algorithms themselves, using a hybrid approach combining Monte Carlo Tree Search (MCTS), procedure cloning, chain-of-thought generation, and reinforcement learning with verifiable rewards. ...

July 7, 2025 · 3 min · Zelina
Cover image

Secret Handshakes at Scale: How LLM Agents Learn to Collude

As large language models (LLMs) evolve from passive tools into autonomous market participants, a critical question emerges: can they secretly coordinate in ways that harm fair competition? A recent paper titled Evaluating LLM Agent Collusion in Double Auctions explores this unsettling frontier, and its findings deserve attention from both AI developers and policy makers. The study simulates a continuous double auction (CDA), where multiple buyer and seller agents submit bids and asks in real-time. Each agent is an LLM-powered negotiator, operating on behalf of a hypothetical industrial firm. Sellers value each item at $80, buyers at $100, and trades execute when bids meet asks. The fair equilibrium price should hover around $90. ...

July 7, 2025 · 4 min · Zelina
Cover image

Talk is Flight: How RALLY Bridges Language and Learning in UAV Swarms

When language models take flight, consensus becomes not just possible, but programmable. Modern UAV swarms face the daunting task of coordinating across partial observability, adversarial threats, and shifting missions. Traditional Multi-Agent Reinforcement Learning (MARL) offers adaptability, but falters when role differentiation or semantic reasoning is required. Large Language Models (LLMs), meanwhile, understand tasks and intent—but lack grounded, online learning. RALLY (Role-Adaptive LLM-Driven Yoked Navigation) is the first framework to successfully integrate these two paradigms, enabling real-time, role-aware collaboration in UAV swarms. ...

July 7, 2025 · 3 min · Zelina
Cover image

From Trendlines to Transformers: DeepSupp Redefines Support Level Detection

In technical analysis, few concepts are as foundational as support levels — those invisible lines where prices tend to stop falling, bounce back, and spark new rallies. For decades, traders have relied on hand-drawn trendlines, Fibonacci ratios, and moving averages to guess where those turning points might be. But what if the real market structure is too complex, too dynamic, and too subtle for static rules? Enter DeepSupp, a new deep learning architecture that doesn’t guess support zones — it discovers them. By analyzing evolving market correlations through attention mechanisms and clustering latent embeddings, DeepSupp offers a glimpse into a future where support level detection is less of an art, and more of a science. ...

July 6, 2025 · 4 min · Zelina