TL;DR for operators Do not ask, “Can the model do the task?” Ask, “Does the model use the capabilities it already has when the task becomes messy?” Hallucination is not one thing. In a medical, legal, financial, or investment workflow, it is a defect. In a labelled creative mode, it can be a feature. Revolutionary stuff: context matters. Goal-directedness is also not one thing. More goal pursuit can improve execution, but it also raises safety and governance questions. The sensible business pattern is not “deploy an autonomous AI analyst and hope it behaves”. It is mode governance: separate factual, creative, and decision-support modes with different metrics, interfaces, and controls. High-stakes workflows need scaffolding: memory, rule extraction, refinement loops, ensemble checks, scoring, audit trails, and humans who can edit policy rather than merely admire the model’s prose. AI products are currently being sold with a suspiciously convenient promise: one conversational interface will reason, search, write, create, decide, advise, analyse, and maybe spiritually support the quarterly planning meeting if procurement approves the invoice.
...