When Text Doesn’t Help: Rethinking Multimodality in Forecasting
TL;DR for operators

Text does not automatically make forecasts smarter. It often just makes the pipeline heavier. A new AWS study benchmarks multimodal time-series forecasting across 16 datasets and 7 domains, comparing time-series-only models, alignment-based multimodal models, and direct LLM prompting.1 The uncomfortable result is that multimodality is not a universal upgrade. Strong unimodal models still win on a substantial share of the benchmark, and the paper’s statistical tests do not support a blanket claim that adding text reliably improves accuracy. ...
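To make the "no blanket claim" point concrete, here is a minimal sketch of how an operator might check whether a multimodal model beats a unimodal baseline across a set of datasets. It uses an exact two-sided sign test on paired per-dataset errors. This is an illustration only: the section above does not say which tests the study ran, the function name `sign_test_p_value` is hypothetical, and the error values are made-up placeholders, not results from the paper.

```python
from math import comb


def sign_test_p_value(errors_a, errors_b):
    """Exact two-sided sign test on paired per-dataset errors.

    Null hypothesis: on any dataset, model A and model B are equally
    likely to have the lower error. Ties are dropped.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("paired inputs must have the same length")
    wins_a = sum(a < b for a, b in zip(errors_a, errors_b))
    wins_b = sum(b < a for a, b in zip(errors_a, errors_b))
    n = wins_a + wins_b  # number of non-tied datasets
    if n == 0:
        return 1.0
    k = min(wins_a, wins_b)
    # Probability of k or fewer wins under Binomial(n, 0.5), doubled for two sides.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Placeholder errors for 16 hypothetical datasets (NOT from the study).
unimodal_mae = [0.42, 0.31, 0.55, 0.27, 0.63, 0.48, 0.39, 0.51,
                0.36, 0.44, 0.58, 0.29, 0.47, 0.33, 0.61, 0.40]
multimodal_mae = [0.40, 0.33, 0.52, 0.29, 0.60, 0.50, 0.37, 0.53,
                  0.34, 0.46, 0.55, 0.31, 0.45, 0.35, 0.59, 0.38]

p = sign_test_p_value(unimodal_mae, multimodal_mae)
print(f"two-sided sign test p-value: {p:.3f}")
```

In this placeholder data the multimodal model wins on 9 of 16 datasets, yet the p-value is far above 0.05, so the wins are indistinguishable from coin flips. The sign test ignores the size of each improvement; a Wilcoxon signed-rank test is a common alternative when magnitudes matter.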