Cognaptus Insights

When Papers Learn to Draw: AutoFigure and the End of Ugly Science Diagrams

AutoFigure shows why publication-ready scientific diagrams need reasoning-first visual pipelines, not prettier text-to-image prompts.

When Your Agent Starts Copying Itself: Breaking Conversational Inertia

A mechanism-first reading of conversational inertia: why long context can make agents imitate their own mistakes, and why strategic forgetting may beat bigger memory.

Click Like a Human: Why Avenir-Web Is a Quiet Breakthrough in Web Agents

Avenir-Web shows why reliable web agents need procedural experience, hybrid grounding, explicit progress tracking, and compressed memory—not just bigger multimodal models.

Click with Confidence: Teaching GUI Agents When Not to Click

SafeGround shows how uncertainty calibration can turn GUI agents from reckless clickers into risk-budgeted automation systems.

Coaching the Swarm: Why Multi‑Agent RL Finally Scales

A mechanism-first reading of MAPPA, a process-reward method for turning multi-agent LLM workflows from prompted collaboration into trainable systems.

DRIFT-BENCH: When Agents Stop Asking and Start Breaking

A business-focused reading of DRIFT-BENCH, showing why agent reliability depends less on asking more questions and more on knowing when clarification helps, when it harms, and when execution must stop.

Identity Crisis: How a Trivial Trick Teaches LLMs to Think Backwards

A mechanism-first reading of why identity-bridge data can weaken the reversal curse in autoregressive LLMs—and why the useful trick is more delicate than it first looks.

No More Bit-Length Anxiety: Policy Iteration Goes Strongly Polynomial

A mechanism-first reading of why robust policy iteration for $L_\infty$ robust MDPs is not merely convergent, but strongly polynomial for a fixed discount factor.

RAudit: When Models Think Too Much and Still Get It Wrong

RAudit shows why longer reasoning, stronger judges, and harsher critique can reveal LLM failures—but can also amplify them.

Seeing Is Not Reasoning: Why Mental Imagery Still Breaks Multimodal AI

A mechanism-first reading of MentisOculi, and why explicit visual thoughts still fail to become reliable reasoning evidence for multimodal AI.