3D Generation

RL Grows a Third Dimension: Why Text-to-3D Finally Needs Reasoning

Opening: Why this matters now

Text-to-3D generation has quietly hit a ceiling. Diffusion-based pipelines are expensive, autoregressive models are brittle, and despite impressive demos, most systems collapse the moment a prompt requires reasoning rather than recall. Meanwhile, reinforcement learning (RL) has already reshaped language models and is actively restructuring 2D image generation. The obvious question, long avoided, was whether RL could do the same for 3D. ...

SceneMaker: When 3D Scene Generation Stops Guessing

Opening: Why this matters now

Single-image 3D scene generation has quietly become one of the most overloaded promises in computer vision. We ask a model to hallucinate geometry, infer occluded objects, reason about spatial relationships, and place everything in a coherent 3D world, all from a single RGB frame. When it fails, we call it a data problem. When it half-works, we call it progress. ...