HAROOD: When Benchmarks Grow Up and Models Stop Cheating

A wearable model can look brilliant in the lab and embarrass itself on Monday morning. The user changes. The watch slides down the wrist. A sensor is mounted on the chest instead of the pocket. The same person walks differently after fatigue, injury, aging, or simply because life has the terrible habit of not matching the training set. Human Activity Recognition, or HAR, has always lived with this problem. It turns sensor streams from accelerometers, gyroscopes, EMG, ECG, and other wearable or ambient devices into labels such as walking, running, sitting, cycling, or stress state. It is useful precisely because it moves into the real world. That is also where benchmark accuracy goes to die. ...

December 12, 2025 · 20 min · Zelina

Noisy but Wise: How Simple Noise Injection Beats Shortcut Learning in Medical AI

X-rays look clinical. To a neural network, they can also look like stationery. A hospital name in the corner. A scanner signature. A compression pattern. A familiar positioning marker. A slightly different way of cropping the lung field. None of these is pneumonia. None of these is COVID-19. Yet a deep learning model trained on small medical datasets can treat them as wonderfully convenient diagnostic evidence, because machines are very good at passing exams and less naturally committed to understanding what the exam is about. ...

November 9, 2025 · 15 min · Zelina