Tokens, Watts, and Waste: The Hidden Energy Bill of LLM Inference

Opening — Why this matters now
Large language models are now a routine part of software development. They autocomplete functions, explain repositories, and quietly sit inside CI pipelines. The productivity gains are real. The energy bill is less visible. As inference increasingly dominates the lifecycle cost of LLMs, the environmental question is no longer about how models are trained, but how often—and how inefficiently—they are used. This paper asks an unfashionable but necessary question: where exactly does inference energy go? The answer turns out to be uncomfortable. ...

February 8, 2026 · 3 min · Zelina

Ultra‑Sparse Embeddings Without Apology

Opening — Why this matters now
Embeddings have quietly become the metabolic system of modern AI. Every retrieval query, recommendation list, and ranking pipeline depends on them—yet we keep feeding these systems increasingly obese vectors. Thousands of dimensions, dense everywhere, expensive always. The paper behind CSRv2 arrives with an unfashionable claim: you can make embeddings extremely sparse and still win. ...

February 8, 2026 · 3 min · Zelina

When Words Start Walking: Rethinking Semantic Search Beyond Averages

Opening — Why this matters now
Search systems have grown fluent, but not necessarily intelligent. As enterprises drown in text—contracts, filings, emails, reports—the gap between what users mean and what systems match has become painfully visible. Keyword search still dominates operational systems, while embedding-based similarity often settles for crude averages. This paper challenges that quiet compromise. ...

February 8, 2026 · 3 min · Zelina

Benchmarks Lie, Rooms Don’t: Why Embodied AI Fails the Moment It Enters Your House

Opening — Why this matters now
Embodied AI is having its deployment moment. Robots are promised for homes, agents for physical spaces, and multimodal models are marketed as finally “understanding” the real world. Yet most of these claims rest on benchmarks designed far away from kitchens, hallways, mirrors, and cluttered tables. This paper makes an uncomfortable point: if you evaluate agents inside the environments they will actually operate in, much of that apparent intelligence collapses. ...

February 7, 2026 · 4 min · Zelina

Beyond Cosine: When Order Beats Angle in Embedding Similarity

Opening — Why this matters now
Cosine similarity has enjoyed an unusually long reign. From TF‑IDF vectors to transformer embeddings, it remains the default lens through which we judge “semantic closeness.” Yet the more expressive our embedding models become, the more uncomfortable this default starts to feel. If modern representations are nonlinear, anisotropic, and structurally rich, why are we still evaluating them with a metric that only understands angles? ...

February 7, 2026 · 4 min · Zelina

First Proofs, No Training Wheels

Opening — Why this matters now
AI models are now fluent in contest math, symbolic manipulation, and polished explanations. That’s the easy part. The harder question—the one that actually matters for science—is whether these systems can do research when the answer is not already in the training set. The paper First Proof arrives as a deliberately uncomfortable experiment: ten genuine research-level mathematics questions, all solved by humans, none previously public, and all temporarily withheld from the internet. ...

February 7, 2026 · 3 min · Zelina

Hallucination-Resistant Security Planning: When LLMs Learn to Say No

Opening — Why this matters now
Security teams are being asked to do more with less, while the attack surface keeps expanding and adversaries automate faster than defenders. Large language models promise relief: summarize logs, suggest response actions, even draft incident playbooks. But there’s a catch that every practitioner already knows—LLMs are confident liars. In security operations, a hallucinated action isn’t just embarrassing; it’s operationally expensive. ...

February 7, 2026 · 4 min · Zelina

When AI Forgets on Purpose: Why Memorization Is the Real Bottleneck

Opening — Why this matters now
Large language models are getting bigger, slower, and—paradoxically—more forgetful in all the wrong places. Despite trillion‑token training runs, practitioners still complain about brittle reasoning, hallucinated facts, and sudden regressions after fine‑tuning. The paper behind this article argues that the problem is not insufficient memory, but poorly allocated memory. ...

February 7, 2026 · 3 min · Zelina

When One Heatmap Isn’t Enough: Layered XAI for Brain Tumour Detection

Opening — Why this matters now
Medical AI is no longer struggling with accuracy. In constrained tasks like MRI-based brain tumour detection, convolutional neural networks routinely cross the 90% accuracy mark. The real bottleneck has shifted elsewhere: trust. When an algorithm flags—or misses—a tumour, clinicians want to know why. And increasingly, a single colourful heatmap is not enough. ...

February 7, 2026 · 3 min · Zelina

When RAG Needs Provenance, Not Just Recall: Traceable Answers Across Fragmented Knowledge

Opening — Why this matters now
RAG is supposed to make large language models safer. Ground the model in documents, add citations, and hallucinations politely leave the room—or so the story goes. In practice, especially in expert domains, RAG often fails in a quieter, more dangerous way: it retrieves something relevant, but not the right kind of evidence. ...

February 7, 2026 · 4 min · Zelina