When One Token Rules Them All: Diffusion Models and the Quiet Collapse of Composition
Product teams often discover image-generation failure in the most boring possible way: the image looks good. The lighting is fine. The texture is convincing. The output is not deformed, not surreal in the bad way, and not obviously broken. Then someone notices the actual requested product is missing. A prompt asks for a famous castle on a coaster. The model gives the castle. It may give a postcard, a painting, a dramatic tourist shot, perhaps a suspiciously elegant architectural fantasy. The coaster quietly leaves the room. No farewell email. ...