When Robots Disagree: Taming Gradient Conflicts in Cross-Embodiment Offline RL
A robot fleet looks efficient on a spreadsheet. One warehouse robot logs a few million movements. A quadruped logs a few million more. A bipedal platform contributes its own dataset. The obvious managerial instinct is to pour everything into one large training pool and let scale do its polite little miracle. This is where robots become less cooperative than cloud software. ...