Forecasting

When Physics Remembers What Data Forgets

Opening — Why this matters now AI has become very good at interpolation and notoriously bad at extrapolation. Nowhere is this weakness more visible than in dynamical systems, where small forecasting errors compound into total nonsense. From markets to climate to orbital mechanics, the business question is the same: how much data do you really need before a model can be trusted to look forward? ...

MoE Money, MoE Problems? FinCast Bets Big on Foundation Models for Markets

TL;DR FinCast is a 1B‑parameter, decoder‑only Transformer trained on >20B financial time points with a token‑level sparse Mixture‑of‑Experts (MoE), learnable frequency embeddings, and a Point‑Quantile (PQ) loss that combines Huber point forecasts with quantile targets and a trend‑consistency term. In zero‑shot benchmarks across crypto/FX/stocks/futures, it reports ~20% lower MSE vs leading generic time‑series FMs, and it also beats supervised SOTAs—even without fine‑tuning—then widens the gap with a light fine‑tune. If you build risk or execution systems, the interesting part isn’t just accuracy points; it’s the shape of the predictions (tail‑aware, regime‑sensitive) and the deployment economics (conditional compute via sparse MoE + patching). ...

Crystal Ball, Meet Cron Job: What FutureX Reveals About ‘Live’ Forecasting Agents

The one-sentence take A new live benchmark, FutureX, swaps lab-style trivia for rolling, real-world future events, forcing agentic LLMs to search, reason, and hedge under uncertainty that actually moves—and the results expose where today’s “agents” are still brittle. Why FutureX matters now Enterprise teams are deploying agents to answer questions whose truth changes by the hour—markets, elections, sports, product launches. Static leaderboards don’t measure that. FutureX runs as a cron job on reality: it collects new events every day, has agents make predictions, and grades them after events resolve. That turns evaluation from a screenshot into a time series and makes overfitting to benchmark quirks a lot harder. ...

Forecast: Mostly Context with a Chance of Routing

Large language models can forecast surprisingly well when you hand them the right context. But naïve prompts leave money on the table. Today’s paper introduces four plug‑and‑play strategies—ReDP, CorDP, IC‑DP, RouteDP—that lift accuracy, interpretability, and cost‑efficiency without training new models. Here’s what that means for teams running demand, risk, or ops forecasts. Why this matters for business readers Most production forecasts are numeric workhorses (ARIMA/ETS/TS foundation models), while contextual facts—weather advisories, policy changes, promos, strikes—arrive as text. LLMs can read that text and adjust the forecast, but simply stuffing history+context into a prompt (“direct prompting”) is often fragile. The four strategies below are operational patterns you can drop into existing stacks without re‑architecting. ...

Forecast First, Ask Later: How DCATS Makes Time Series Smarter with LLMs

When it comes to forecasting traffic patterns, weather, or financial activity, the prevailing wisdom in machine learning has long been: better models mean better predictions. But a new approach flips this assumption on its head. Instead of chasing ever-more complex architectures, the DCATS framework (Data-Centric Agent for Time Series), developed by researchers at Visa, suggests we should first get our data in order—and let a language model do it. The Agentic Turn in AutoML DCATS builds on the trend of integrating Large Language Model (LLM) agents into AutoML pipelines, but with a twist. While prior systems like AIDE focus on automating model design and hyperparameter tuning, DCATS delegates a more fundamental task to its LLM agent: curating the right data. ...

When Small Coins Roar: Rethinking Systemic Risk in Crypto Volatility Forecasting

In traditional finance, systemic risk is often linked to size — the bigger the institution, the bigger the threat. But in crypto? The rules are different. A recent paper from researchers at Jinan University rewrites the forecasting playbook by demonstrating that systemic influence in crypto markets is more about network positioning than market cap. The authors introduce a state-adaptive volatility model that integrates multi-scale realized volatility measures (like semivariance and jump components) with time-varying quantile spillovers, producing a high-resolution view of inter-asset contagion — especially under stress. ...