Whispers Against the Noise: How Contrastive Decoding Tames Long‑Form ASR Hallucinations
A transcript is usually treated as boring infrastructure. It sits underneath meeting summaries, call-center analytics, podcast search, earnings-call review, legal discovery, medical documentation, and the cheerful dashboard that tells managers everything is now “AI-powered.” Then the transcript invents a sentence. Not a typo. Not a small mishearing. A fluent, confident, context-shaped sentence that nobody said. In short clips, this is irritating. In long recordings, it becomes structural. One bad segment can become context for the next segment; the next segment inherits the mistake; and soon the system is not transcribing a recording so much as continuing a badly seeded story. ...