
From Sparse to Smart: How PROGRM Elevates GUI Agent Training
The GUI Agent Bottleneck: Stuck in Sparse Feedback

Training LLM-based GUI agents to complete digital tasks, such as navigating mobile apps or automating workflows, faces a fundamental limitation: reward sparsity. Traditional reward formulations (Outcome Reward Models, or ORMs) provide feedback only at the end of a trajectory. If the task fails, the agent receives zero signal, regardless of how many useful intermediate steps it took. This severely limits credit assignment and slows learning, especially in environments with long action horizons. ...
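
To make the sparsity problem concrete, here is a minimal Python sketch of outcome-only feedback. It is an illustration under stated assumptions, not the reward model described in this article: the function names `outcome_rewards` and `discounted_returns`, the success reward of 1.0, and the 5-step trajectory are all hypothetical choices made for the example.

```python
from typing import List


def outcome_rewards(num_steps: int, task_succeeded: bool) -> List[float]:
    """Per-step rewards under an outcome-only scheme (hypothetical helper).

    Every step gets 0.0; only the final step gets 1.0, and only on success.
    The 1.0 success value is an assumption made for illustration.
    """
    rewards = [0.0] * num_steps
    if num_steps > 0 and task_succeeded:
        rewards[-1] = 1.0
    return rewards


def discounted_returns(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Return-to-go for each step: G_t = r_t + gamma * G_{t+1}."""
    returns: List[float] = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# A hypothetical 5-step trajectory, once failing and once succeeding.
failed = discounted_returns(outcome_rewards(5, task_succeeded=False))
succeeded = discounted_returns(outcome_rewards(5, task_succeeded=True))

print(failed)     # [0.0, 0.0, 0.0, 0.0, 0.0]
print(succeeded)  # [1.0, 1.0, 1.0, 1.0, 1.0]
```

In the failed case every step carries the same zero return, so a learner cannot tell the useful intermediate steps from the mistaken ones. In the successful case every step is credited equally, which is the credit-assignment difficulty the paragraph above describes.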