OCR and the City: Why Document AI Still Needs Eyes
A document lands in an intake queue. It might be an invoice, a memo, a form, a résumé, or one of those corporate artifacts whose layout says more than the words do. Someone wants the system to classify it instantly, because every downstream workflow (routing, extraction, compliance, archiving) depends on that first label. The fashionable answer is: send it to a large language model. Extract the text, paste it into a prompt, ask for one label, and let the machine be clever. This is attractive because it feels general. It is also how many automation projects quietly turn a visual problem into a text problem, then act surprised when the system starts calling file folders “proposals” because the word “proposal” appeared somewhere on the page. ...
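To make the failure concrete, here is a minimal sketch of the text-only pipeline described above. Everything in it is an assumption made for illustration: the `OcrWord` structure, the `LABELS` list, the prompt wording, and the two sample pages are hypothetical, and no real model is called. It only shows what the model would be handed once the layout has been thrown away.

```python
from dataclasses import dataclass

# Hypothetical label set, loosely based on the document types named above.
LABELS = ["invoice", "memo", "form", "resume", "file folder", "proposal"]


@dataclass(frozen=True)
class OcrWord:
    """One OCR token with its page position (illustrative structure)."""
    text: str
    x: float  # left edge, as a fraction of page width
    y: float  # top edge, as a fraction of page height


def flatten(words):
    """Join tokens in reading order, discarding every coordinate."""
    ordered = sorted(words, key=lambda w: (w.y, w.x))
    return " ".join(w.text for w in ordered)


def build_prompt(words, labels=LABELS):
    """Build a text-only classification prompt (assumed wording)."""
    return (
        "Classify the document into exactly one of: "
        + ", ".join(labels)
        + ".\n\nDocument text:\n"
        + flatten(words)
        + "\n\nLabel:"
    )


# A folder tab: three words crammed into the top corner of a blank page.
folder_tab = [
    OcrWord("Proposal", 0.80, 0.02),
    OcrWord("Q3", 0.90, 0.02),
    OcrWord("Budget", 0.94, 0.02),
]

# A title page: the same words, centered and spread down the page.
title_page = [
    OcrWord("Proposal", 0.40, 0.30),
    OcrWord("Q3", 0.45, 0.40),
    OcrWord("Budget", 0.50, 0.50),
]

# Once coordinates are dropped, the two pages produce the same prompt.
same_prompt = build_prompt(folder_tab) == build_prompt(title_page)
print(same_prompt)
```

Under these assumptions the two prompts are identical, so any model that sees only the prompt has to give both pages the same label. The information that separated a folder tab from a title page was in the coordinates, and `flatten` discarded it before the model was ever asked.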