Cover image

Click Less, Do More: Why API-GUI + RL Could Finally Make Desktop Agents Useful

The gist (and why it matters for business) Enterprise buyers don’t reward demos; they reward repeatable completions per dollar. ComputerRL proposes a path to that by (1) escaping pure GUI mimicry via a machine-first API-GUI action space, (2) scaling online RL across thousands of Ubuntu VMs, and (3) preventing policy entropy collapse with Entropulse—a cadence that alternates RL and supervised fine-tuning (SFT) on successful rollouts. The result: a reported 48.1% OSWorld success with markedly fewer steps than GUI-only agents. Translation for buyers: lower latency, lower cost, higher reliability. ...

August 20, 2025 · 5 min · Zelina