
Tensor-DTI: Binding the Signal, Not the Noise

Opening — Why this matters now Drug discovery has a scale problem. Not a small one. A billion-compound problem. Chemical space has outpaced every classical screening method we have—experimental or computational. Docking strains at a few million compounds. Diffusion models demand structural data that simply doesn’t exist for most targets. Meanwhile, enumerated libraries like Enamine REAL quietly crossed 70+ billion molecules, and nobody bothered to ask whether our AI tooling is actually ready for that reality. ...

January 14, 2026 · 4 min · Zelina

When Views Go Missing, Labels Talk Back

Opening — Why this matters now In theory, multi‑view multi‑label learning is a gift: more modalities, richer semantics, better predictions. In practice, it is a recurring disappointment. Sensors fail, annotations are partial, budgets run out, and the elegant assumption of “complete views with full labels” quietly collapses. What remains is the real industrial problem: fragmented features and half‑known truths. ...

January 14, 2026 · 4 min · Zelina

Click, Fail, Learn: Why BEPA Might Be the First GUI Agent That Actually Improves

Opening — Why this matters now Autonomous agents are very good at talking about tasks. They are far less competent at actually doing them—especially when “doing” involves clicking the right icon, interpreting a cluttered interface, or recovering gracefully from failure. GUI agents, in particular, suffer from a chronic problem: once they fail, they either repeat the same mistake or forget everything they once did right. ...

January 12, 2026 · 3 min · Zelina

STACKPLANNER: When Agents Learn to Forget

Opening — Why this matters now Multi-agent systems built on large language models are having a moment. From research copilots to autonomous report generators, the promise is seductive: split a complex task into pieces, let specialized agents work in parallel, and coordinate everything with a central planner. In practice, however, these systems tend to collapse under their own cognitive weight. ...

January 12, 2026 · 4 min · Zelina

When Debate Stops Being a Vote: DynaDebate and the Engineering of Reasoning Diversity

Opening — Why this matters now Multi-agent debate was supposed to be the antidote to brittle single-model reasoning. Add more agents, let them argue, and truth would somehow emerge from friction. In practice, what often emerges is something closer to a polite echo chamber. Despite the growing popularity of Multi-Agent Debate (MAD) frameworks, many systems quietly degenerate into majority voting over nearly identical reasoning paths. When all agents make the same mistake—just phrased slightly differently—debate becomes theater. The paper DynaDebate: Breaking Homogeneity in Multi-Agent Debate with Dynamic Path Generation tackles this problem head-on, and, refreshingly, does so by treating reasoning as an engineered process rather than a conversational one. ...

January 12, 2026 · 4 min · Zelina

When Robots Guess, People Bleed: Teaching AI to Say ‘This Is Ambiguous’

Opening — Why this matters now Embodied AI has become very good at doing things. What it remains surprisingly bad at is asking a far more basic question: “Should I be doing anything at all?” In safety‑critical environments—surgical robotics, industrial automation, AR‑assisted operations—this blind spot is not academic. A robot that confidently executes an ambiguous instruction is not intelligent; it is dangerous. The paper behind Ambi3D and AmbiVer confronts this neglected layer head‑on: before grounding, planning, or acting, an agent must determine whether an instruction is objectively unambiguous in the given 3D scene. ...

January 12, 2026 · 4 min · Zelina

Agents That Ship, Not Just Think: When LLM Self-Improvement Meets Release Engineering

Opening — Why this matters now LLM agents are no longer party tricks. They browse the web, patch production code, orchestrate APIs, and occasionally—quite creatively—break things that used to work. The industry’s instinctive response has been to make agents smarter by turning them inward: more reflection, more self-critique, more evolutionary prompt tinkering. Performance improves. Confidence does not. ...

January 11, 2026 · 4 min · Zelina

ResMAS: When Multi‑Agent Systems Stop Falling Apart

Opening — Why this matters now Multi-agent systems (MAS) built on large language models have developed a bad habit: they work brilliantly—right up until the moment one agent goes off-script. A single failure, miscommunication, or noisy response can quietly poison the entire collaboration. In production environments, this isn’t a hypothetical risk; it’s the default operating condition. ...

January 11, 2026 · 4 min · Zelina

When LLMs Stop Talking and Start Driving

Opening — Why this matters now Digital transformation has reached an awkward phase. Enterprises have accumulated oceans of unstructured data, deployed dashboards everywhere, and renamed half their IT departments. Yet when something actually breaks—equipment fails, suppliers vanish, costs spike—the organization still reacts slowly, manually, and often blindly. The uncomfortable truth: most “AI-driven transformation” initiatives stop at analysis. They classify, predict, and visualize—but they rarely decide. This paper confronts that gap directly, asking a sharper question: what does it take for large models to become operational drivers rather than semantic commentators? ...

January 11, 2026 · 4 min · Zelina

When Solvers Guess Smarter: Teaching SMT to Think in Functions

Opening — Why this matters now Quantified SMT solving has always lived in an uncomfortable space between elegance and brute force. As models grew richer—mixing non-linear arithmetic, real-valued domains, and uninterpreted functions—the solvers stayed stubbornly syntactic. They match patterns. They enumerate. They hope. Meanwhile, large language models have quietly absorbed a century’s worth of mathematical intuition. AquaForte asks an obvious but previously taboo question: what if we let SMT solvers borrow that intuition—without surrendering formal guarantees? ...

January 11, 2026 · 3 min · Zelina