Process Reward Agents — When Reasoning Learns to Judge Itself (Before It’s Too Late)
Opening — Why this matters now

There is a quiet but consequential flaw in modern AI reasoning systems: they are excellent storytellers, but poor self-editors. In domains like healthcare, finance, and law, correctness is not a property of the final answer—it is a property of the entire reasoning trajectory. Yet most large language models (LLMs) only discover their mistakes at the very end, if at all. By then, the damage is already embedded in the chain of thought. ...