Self-Improvement

Evolving Minds: How LLMs Teach Themselves Through Adversarial Cooperation

The dream of self-improving intelligence has long haunted AI research—a model that learns not from humans, but from itself. Multi-Agent Evolve (MAE) by Yixing Chen et al. (UIUC, NVIDIA, PKU) gives that dream a concrete architecture: three versions of the same LLM—Proposer, Solver, and Judge—locked in a continuous loop of challenge, response, and evaluation. No human labels. No external verifiers. Just the model, teaching itself through the friction of disagreement. ...

Mirror, Mirror in the Model: How MLLMs Learn from Their Own Mistakes

When multimodal large language models (MLLMs) like Gemini or Janus are asked to generate an image and then assess whether that image matches a prompt, you’d expect agreement. But a new study shows this harmony is often missing: the model’s own understanding branch disagrees with what its generation branch creates. This phenomenon—called self-contradiction—isn’t just an embarrassing quirk. As it turns out, it may be the most valuable feedback signal MLLMs have. ...