
Painkillers with Foresight: Teaching Machines to Anticipate Cancer Pain

Opening — Why this matters now
Cancer pain is rarely a surprise to clinicians. Yet it still manages to arrive uninvited, often at night, often under-treated, and almost always after the window for calm, preventive adjustment has closed. In lung cancer wards, up to 90% of patients experience moderate to severe pain episodes — and most of these episodes are predictable in hindsight. ...

December 19, 2025 · 4 min · Zelina

Stepwise Think-Critique: Teaching LLMs to Doubt Themselves (Productively)

Opening — Why this matters now
Large Language Models have learned how to think out loud. What they still struggle with is knowing when that thinking is wrong — while it is happening. In high‑stakes domains like mathematics, finance, or policy automation, delayed error detection is not a feature; it is a liability. Most modern reasoning pipelines still follow an awkward split: first generate reasoning, then verify it — often with a separate model. Humans do not work this way. We reason and judge simultaneously. This paper asks a simple but uncomfortable question: what if LLMs were trained to do the same? ...

December 18, 2025 · 4 min · Zelina

Unpacking the Explicit Mind: How ExplicitLM Redefines AI Memory

Why this matters now
Every few months, another AI model promises to be more “aware” — but awareness is hard when memory is mush. Traditional large language models (LLMs) bury their knowledge across billions of parameters like a neural hoarder: everything is stored, but nothing is labeled. Updating a single fact means retraining the entire organism. The result? Models that can write essays about Biden while insisting he’s still president. ...

November 6, 2025 · 4 min · Zelina

Breaking the Tempo: How TempoBench Reframes AI’s Struggle with Time and Causality

Opening — Why this matters now
The age of “smart” AI models has reached an uncomfortable truth: they can ace your math exam but fail your workflow. While frontier systems like GPT‑4o and Claude‑Sonnet solve increasingly complex symbolic puzzles, they stumble when asked to reason through time—to connect what happened, what’s happening, and what must happen next. In a world shifting toward autonomous agents and decision‑chain AI, this isn’t a minor bug—it’s a systemic limitation. ...

November 5, 2025 · 4 min · Zelina

When AI Packs Too Much Hype: Reassessing LLM 'Discoveries' in Bin Packing

Opening — Why this matters now
The academic world has been buzzing ever since a Nature paper claimed that large language models (LLMs) had made “mathematical discoveries.” Specifically, through a method called FunSearch, LLMs were said to have evolved novel heuristics for the classic bin packing problem—an NP-hard optimization task as old as modern computer science itself. The headlines were irresistible: AI discovers new math. But as with many shiny claims, the real question is whether the substance matches the spectacle. ...

November 5, 2025 · 5 min · Zelina

Who Really Runs the Workflow? Ranking Agent Influence in Multi-Agent AI Systems

Opening — Why this matters now
Multi-agent systems — the so-called Agentic AI Workflows — are rapidly becoming the skeleton of enterprise-grade automation. They promise autonomy, composability, and scalability. But beneath this elegant choreography lies a governance nightmare: we often have no idea which agent is actually in charge. Imagine a digital factory of LLMs: one drafts code, another critiques it, a third summarizes results, and a fourth audits everything. When something goes wrong — toxic content, hallucinated outputs, or runaway costs — who do you blame? More importantly, which agent do you fix? ...

November 3, 2025 · 5 min · Zelina

Dial M—for Markets: Brain‑Scanning and Steering LLMs for Finance

TL;DR
A new paper shows how to insert a sparse, interpretable layer into an LLM to expose plain‑English concepts (e.g., sentiment, risk, timing) and steer them like dials without retraining. In finance news prediction, these interpretable features outperform final‑layer embeddings and reveal that sentiment, market/technical cues, and timing drive most short‑horizon alpha. Steering also debiases optimism, lifting Sharpe by nudging the model negative on sentiment.
Why this matters (and what’s new)
Finance teams have loved LLMs’ throughput but hated their opacity. This paper demonstrates a lightweight path to transparent performance: ...

September 1, 2025 · 4 min · Zelina

The Lion Roars in Crypto: How Multi-Agent LLMs Are Taming Market Chaos

The cryptocurrency market is infamous for its volatility, fragmented data, and narrative-driven swings. While traditional deep learning systems crunch historical charts in search of patterns, they often do so blindly—ignoring the social, regulatory, and macroeconomic tides that move crypto prices. Enter MountainLion, a bold new multi-agent system that doesn’t just react to market signals—it reasons, reflects, and explains. Built on a foundation of specialized large language model (LLM) agents, MountainLion offers an interpretable, adaptive, and genuinely multimodal approach to financial trading. ...

August 3, 2025 · 3 min · Zelina

How Sparse is Your Thought? Cracking the Inner Logic of Chain-of-Thought Prompts

Chain-of-Thought (CoT) prompting has become a go-to technique for improving multi-step reasoning in large language models (LLMs). But is it really helping models think better—or just encouraging them to bluff more convincingly? A new paper from Leiden University, “How does Chain of Thought Think?”, delivers a mechanistic deep dive into this question. By combining sparse autoencoders (SAEs) with activation patching, the authors dissect whether CoT actually changes what a model internally computes—or merely helps its outputs look better. ...

August 1, 2025 · 3 min · Zelina

Circuits of Understanding: A Formal Path to Transformer Interpretability

Can we prove that we understand how a transformer works? Not just describe it heuristically, or highlight patterns—but actually trace its computations with the rigor of a math proof? That’s the ambition behind the recent paper Mechanistic Interpretability for Transformers: A Formal Framework and Case Study on Indirect Object Identification. The authors propose the first comprehensive mathematical framework for mechanistic interpretability, and they use it to dissect how a small transformer solves the Indirect Object Identification (IOI) task. The result is not just a technical tour de force, but a conceptual upgrade for the interpretability field. ...

July 30, 2025 · 3 min · Zelina