When Tokens Become Actions: A Policy Gradient Built for Transformers

Opening — Why this matters now: Reinforcement learning has always assumed that actions are atomic. Large language models politely disagree. In modern LLM training, an “action” is rarely a single move. It is a sequence of tokens, often structured, sometimes tool‑augmented, occasionally self‑reflective. Yet most policy‑gradient methods still pretend that Transformers behave like generic RL agents. The result is a growing mismatch between theory and practice—especially visible in agentic reasoning, tool use, and long‑horizon tasks. ...

December 14, 2025 · 4 min · Zelina

ExaCraft and the Missing Layer of AI Education: When Examples Finally Adapt

Opening — Why this matters now: AI has learned how to explain everything. Unfortunately, it still explains things to no one in particular. Most educational AI systems today obsess over sequencing: which lesson comes next, which quiz you should take, which concept you’ve allegedly “mastered.” What they largely ignore is the most human part of learning—the example. Not the abstract definition. Not the symbolic formula. The concrete, relatable scenario that makes something click. ...

December 13, 2025 · 3 min · Zelina

ImplicitRDP: When Robots Stop Guessing and Start Feeling

Opening — Why this matters now: Robotic manipulation has always had a split personality. Vision plans elegantly in slow motion; force reacts brutally in real time. Most learning systems pretend this tension doesn’t exist — or worse, paper over it with handcrafted hierarchies. The result is robots that see the world clearly but still fumble the moment contact happens. ...

December 13, 2025 · 4 min · Zelina

RL Grows a Third Dimension: Why Text-to-3D Finally Needs Reasoning

Opening — Why this matters now: Text-to-3D generation has quietly hit a ceiling. Diffusion-based pipelines are expensive, autoregressive models are brittle, and despite impressive demos, most systems collapse the moment a prompt requires reasoning rather than recall. Meanwhile, reinforcement learning (RL) has already reshaped language models and is actively restructuring 2D image generation. The obvious question—long avoided—was whether RL could do the same for 3D. ...

December 13, 2025 · 4 min · Zelina

SceneMaker: When 3D Scene Generation Stops Guessing

Opening — Why this matters now: Single-image 3D scene generation has quietly become one of the most overloaded promises in computer vision. We ask a model to hallucinate geometry, infer occluded objects, reason about spatial relationships, and place everything in a coherent 3D world — all from a single RGB frame. When it fails, we call it a data problem. When it half-works, we call it progress. ...

December 13, 2025 · 4 min · Zelina

Suzume-chan, or: When RAG Learns to Sit in Your Hand

Opening — Why this matters now: For all the raw intelligence of modern LLMs, they still feel strangely absent. Answers arrive instantly, flawlessly even—but no one is there. The interaction is efficient, sterile, and ultimately disposable. As enterprises rush to deploy chatbots and copilots, a quiet problem persists: people understand information better when it feels socially grounded, not merely delivered. ...

December 13, 2025 · 3 min · Zelina

When Data Comes in Boxes: Why Hierarchies Beat Sample Hoarding

Opening — Why this matters now: Modern machine learning has a data problem that money can’t easily solve: abundance without discernment. Models are no longer starved for samples; they’re overwhelmed by datasets—entire repositories, institutional archives, and web-scale collections—most of which are irrelevant, redundant, or quietly harmful. Yet the industry still behaves as if data arrives as loose grains of sand. In practice, data arrives in boxes: datasets bundled by source, license, domain, and institutional origin. Selecting the right boxes is now the binding constraint. ...

December 13, 2025 · 3 min · Zelina

When LLMs Stop Guessing and Start Arguing: A Two‑Stage Cure for Health Misinformation

Opening — Why this matters now: Health misinformation is not a fringe problem anymore. It is algorithmically amplified, emotionally charged, and often wrapped in scientific‑looking language that fools both humans and machines. Most AI fact‑checking systems respond by doing more — more retrieval, more reasoning, more prompts. This paper argues the opposite: do less first, think harder only when needed. ...

December 13, 2025 · 3 min · Zelina

Agents Without Time: When Reinforcement Learning Meets Higher-Order Causality

Opening — Why this matters now: Reinforcement learning has spent the last decade obsessing over better policies, better value functions, and better credit assignment. Physics, meanwhile, has been busy questioning whether time itself needs to behave nicely. This paper sits uncomfortably—and productively—between the two. At a moment when agentic AI systems are being deployed in distributed, partially observable, and poorly synchronized environments, the assumption of a fixed causal order is starting to look less like a law of nature and more like a convenience. Wilson’s work asks a precise and unsettling question: what if decision-making agents and causal structure are the same mathematical object viewed from different sides? ...

December 12, 2025 · 3 min · Zelina

HAROOD: When Benchmarks Grow Up and Models Stop Cheating

Opening — Why this matters now: Human Activity Recognition (HAR) has quietly become one of those applied ML fields where headline accuracy keeps improving while real-world reliability stubbornly refuses to follow. Models trained on pristine datasets collapse the moment the sensor moves two centimeters, the user changes, or time simply passes. The industry response has been predictable: larger models, heavier architectures, and now—inevitably—LLMs. The paper behind HAROOD argues that this reflex is misplaced. The real problem is not model capacity. It is evaluation discipline. ...

December 12, 2025 · 3 min · Zelina