Going With the Flow: How Community Density Might Replace Human Feedback
Opening — Why this matters now

Alignment has quietly become the most expensive line item in the modern AI stack. Training a large language model is already costly, but aligning it with human values is worse. Reinforcement Learning from Human Feedback (RLHF), preference datasets, annotation pipelines, and evaluation frameworks require armies of annotators and carefully curated tasks. The result is an alignment paradigm that works well for large companies — and poorly for everyone else. ...