Data Curation

The Data Diet for Reasoning Models: Why Less (But Smarter) Wins

A model-training team has a familiar bad habit: when the model fails, it asks for more. More examples. More domains. More synthetic prompts. More compute. More benchmarks to average over until the unpleasant details become small enough to ignore. This habit is understandable. It is also expensive. And, according to SuperNova, it may be the wrong first instinct. ...

When Maps Start Thinking: Teaching Agents to Plan in Time and Space

A map query is easy: get me from A to B. A service request is harder: leave after lunch, avoid tolls, find a charging station before the battery becomes theatrical, stop somewhere quiet for dinner, and make sure the restaurant is still open when we arrive. Every additional clause turns a lookup into a sequence of commitments. Locations must be resolved. Routes must be calculated. Opening hours, traffic, weather, prices, and travel times must remain mutually consistent. An incorrect essay can still sound intelligent. An incorrect itinerary can leave someone beside a closed charging station. ...

Eight Arms, One Mind: How OctoMed Turns Data Recipes into Medical Reasoning Power

Recipe sounds like a small word for an expensive problem. In medical AI, the usual boardroom story is simple: buy a bigger model, add more compute, sprinkle in reinforcement learning, and wait for clinical intelligence to appear. Very elegant. Also very convenient for anyone selling compute. ...