
Big AI and the Metacrisis: When Scaling Becomes a Liability

Opening — Why this matters now
The AI industry insists it is ushering in an Intelligent Age. The paper behind this article argues something colder: we may instead be engineering a metacrisis accelerator. As climate instability intensifies, democratic trust erodes, and linguistic diversity collapses, Big AI—large language models, hyperscale data centers, and their political economy—is not a neutral observer. It is an active participant. And despite the industry’s fondness for ethical manifestos, it shows little appetite for restraint. ...

January 2, 2026 · 3 min · Zelina

Ethics Isn’t a Footnote: Teaching NLP Responsibility the Hard Way

Opening — Why this matters now
Ethics in AI is having a moment. Codes of conduct, bias statements, safety benchmarks, model cards—our industry has never been more concerned with responsibility. And yet, most AI education still treats ethics like an appendix: theoretically important, practically optional. This paper makes an uncomfortable point: you cannot teach ethical NLP by lecturing about it. Responsibility is not absorbed through slides. It has to be practiced. ...

January 2, 2026 · 4 min · Zelina

LeanCat-astrophe: Why Category Theory Is Where LLM Provers Go to Struggle

Opening — Why this matters now
Formal theorem proving has entered its confident phase. We now have models that can clear olympiad-style problems, undergraduate algebra, and even parts of the Putnam with respectable success rates. Reinforcement learning, tool feedback, and test-time scaling have done their job. And then LeanCat arrives — and the success rates collapse. ...

January 2, 2026 · 4 min · Zelina

MI-ZO: Teaching Vision-Language Models Where to Look

Opening — Why this matters now
Vision-Language Models (VLMs) are everywhere—judging images, narrating videos, and increasingly acting as reasoning engines layered atop perception. But there is a quiet embarrassment in the room: most state-of-the-art VLMs are trained almost entirely on 2D data, then expected to reason about 3D worlds as if depth, occlusion, and viewpoint were minor details. ...

January 2, 2026 · 4 min · Zelina

Planning Before Picking: When Slate Recommendation Learns to Think

Opening — Why this matters now
Recommendation systems have quietly crossed a threshold. The question is no longer what to recommend, but how many things, in what order, and with what balance. In feeds, short-video apps, and content platforms, users consume slates—lists experienced holistically. Yet most systems still behave as if each item lives alone, blissfully unaware of its neighbors. ...

January 2, 2026 · 3 min · Zelina

Question Banks Are Dead. Long Live Encyclo-K.

Opening — Why this matters now
Every time a new benchmark is released, the same ritual follows: models race to the top, leaderboards reshuffle, and a few months later—sometimes weeks—we quietly realize the benchmark has been memorized, gamed, or both. The uncomfortable truth is that static questions are no longer a reliable way to measure rapidly evolving language models. ...

January 2, 2026 · 3 min · Zelina

Secrets, Context, and the RAG Illusion

Opening — Why this matters now
Personalized AI assistants are rapidly becoming ambient infrastructure. They draft emails, recall old conversations, summarize private chats, and quietly stitch together our digital lives. The selling point is convenience. The hidden cost is context collapse. The paper behind this article introduces PrivacyBench, a benchmark designed to answer an uncomfortable but overdue question: when AI assistants know everything about us, can they be trusted to know when to stay silent? The short answer is no—not reliably, and not by accident. ...

January 2, 2026 · 4 min · Zelina

Deployed, Retrained, Repeated: When LLMs Learn From Being Used

Opening — Why this matters now
The AI industry likes to pretend that training happens in neat, well-funded labs and deployment is merely the victory lap. Reality, as usual, is less tidy. Large language models are increasingly learning after release—absorbing their own successful outputs through user curation, web sharing, and subsequent fine-tuning. This paper puts a sharp analytical frame around that uncomfortable truth: deployment itself is becoming a training regime. ...

January 1, 2026 · 4 min · Zelina

Gen Z, But Make It Statistical: Teaching LLMs to Listen to Data

Opening — Why this matters now
Foundation models are fluent. They are not observant. In 2024–2025, enterprises learned the hard way that asking an LLM to explain a dataset is very different from asking it to fit one. Large language models know a lot about the world, but they are notoriously bad at learning dataset-specific structure—especially when the signal lives in proprietary data, niche markets, or dated user behavior. This gap is where GenZ enters, with none of the hype and most of the discipline. ...

January 1, 2026 · 4 min · Zelina

Label Now, Drive Later: Why Autonomous Driving Needs Fewer Clicks, Not Smarter Annotators

Opening — Why this matters now
Autonomous driving research does not stall because of missing models. It stalls because of missing labels. Every promising perception architecture eventually collides with the same bottleneck: the slow, expensive, and error-prone process of annotating multimodal driving data. LiDAR point clouds do not label themselves. Cameras do not politely blur faces for GDPR compliance. And human annotators, despite heroic patience, remain both costly and inconsistent at scale. ...

January 1, 2026 · 4 min · Zelina