
When LLMs Meet Time: Why Time-Series Reasoning Is Still Hard

Opening — Why this matters now
Large Language Models are increasingly marketed as general problem solvers. They summarize earnings calls, reason about code, and explain economic trends with alarming confidence. But when confronted with time—real, numeric, structured temporal data—that confidence starts to wobble. The TSAQA benchmark arrives at exactly the right moment, not to celebrate LLM progress, but to measure how far they still have to go. ...

February 3, 2026 · 3 min · Zelina

When Prophet Meets Perceptron: Chasing Alpha with NP‑DNN

Opening — Why this matters now
Stock prediction papers arrive with clockwork regularity, each promising to tame volatility with yet another hybrid architecture. Most quietly disappear after publication. A few linger—usually because they claim eye‑catching accuracy. This paper belongs to that second category, proposing a Neural Prophet + Deep Neural Network (NP‑DNN) stack that reportedly delivers 93%–99% accuracy in stock market prediction. ...

January 9, 2026 · 3 min · Zelina

Rationales Before Results: Teaching Multimodal LLMs to Actually Reason About Time Series

Opening — Why this matters now
Multimodal LLMs are increasingly being asked to reason about time series: markets, traffic, power grids, pollution. Charts are rendered. Prompts are polished. The answers sound confident. And yet—too often—they’re wrong for the most boring reason imaginable: the model never actually reasons. Instead, it pattern-matches.

This paper dissects that failure mode with unusual clarity. The authors argue that the bottleneck is not model scale, data access, or even modality alignment. It’s the absence of explicit reasoning priors that connect observed temporal patterns to downstream outcomes. Without those priors, multimodal LLMs hallucinate explanations after the fact, mistaking surface similarity for causality. ...

January 7, 2026 · 4 min · Zelina

Kill the Correlation, Save the Grid: Why Energy Forecasting Needs Causality

Opening — Why this matters now
Energy forecasting is no longer a polite academic exercise. Grid operators are balancing volatile renewables, industrial consumers are optimizing costs under razor‑thin margins, and regulators are quietly realizing that accuracy without robustness is a liability. Yet most energy demand models still do what machine learning does best—and worst: optimize correlations and hope tomorrow looks like yesterday. This paper argues that hope is not a strategy. ...

December 15, 2025 · 4 min · Zelina

HAROOD: When Benchmarks Grow Up and Models Stop Cheating

Opening — Why this matters now
Human Activity Recognition (HAR) has quietly become one of those applied ML fields where headline accuracy keeps improving, while real-world reliability stubbornly refuses to follow. Models trained on pristine datasets collapse the moment the sensor moves two centimeters, the user changes, or time simply passes. The industry response has been predictable: larger models, heavier architectures, and now—inevitably—LLMs.

The paper behind HAROOD argues that this reflex is misplaced. The real problem is not model capacity. It is evaluation discipline. ...

December 12, 2025 · 3 min · Zelina

MoE Money, MoE Problems? FinCast Bets Big on Foundation Models for Markets

TL;DR
FinCast is a 1B‑parameter, decoder‑only Transformer trained on >20B financial time points with a token‑level sparse Mixture‑of‑Experts (MoE), learnable frequency embeddings, and a Point‑Quantile (PQ) loss that combines Huber point forecasts with quantile targets and a trend‑consistency term. In zero‑shot benchmarks across crypto/FX/stocks/futures, it reports ~20% lower MSE vs leading generic time‑series FMs, and it also beats supervised SOTAs—even without fine‑tuning—then widens the gap with a light fine‑tune. If you build risk or execution systems, the interesting part isn’t just accuracy points; it’s the shape of the predictions (tail‑aware, regime‑sensitive) and the deployment economics (conditional compute via sparse MoE + patching). ...
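To make the PQ loss concrete, here is a minimal PyTorch sketch of a loss combining a Huber point term, pinball (quantile) terms, and a first-difference trend-consistency penalty. The quantile levels, weights, and the exact form of the trend term are illustrative assumptions, not FinCast's published formulation.

```python
import torch
import torch.nn.functional as F

def point_quantile_loss(point_pred, quantile_pred, target,
                        quantiles=(0.1, 0.5, 0.9),
                        delta=1.0, lam_q=1.0, lam_trend=0.1):
    """Sketch of a Point-Quantile (PQ) style loss.

    point_pred:    (batch, horizon)      point forecast
    quantile_pred: (batch, horizon, Q)   one forecast per quantile level
    target:        (batch, horizon)      ground truth
    Weights and quantile levels here are assumed, not from the paper.
    """
    # 1) Huber loss on the point forecast: robust to fat-tailed returns.
    point_loss = F.huber_loss(point_pred, target, delta=delta)

    # 2) Pinball (quantile) loss averaged over the chosen levels,
    #    which is what makes the forecast distribution tail-aware.
    q = torch.tensor(quantiles, device=target.device).view(1, 1, -1)
    err = target.unsqueeze(-1) - quantile_pred
    quant_loss = torch.maximum(q * err, (q - 1.0) * err).mean()

    # 3) Trend consistency: match first differences so the forecast
    #    gets the direction of moves right, not just their level.
    trend_loss = F.l1_loss(point_pred.diff(dim=-1), target.diff(dim=-1))

    return point_loss + lam_q * quant_loss + lam_trend * trend_loss
```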

August 30, 2025 · 5 min · Zelina

Quants With a Plan: Agentic Workflows That Outtrade AutoML

If AutoML is a fast car, financial institutions need a train with tracks—a workflow that knows where it’s going, logs every switch, and won’t derail when markets regime-shift. A new framework called TS-Agent proposes exactly that: a structured, auditable, LLM-driven agent that plans model development for financial time series instead of blindly searching. Unlike generic AutoML, TS-Agent formalizes modeling as a multi-stage decision process—Model Pre-selection → Code Refinement → Fine-tuning—and anchors each step in domain-curated knowledge banks and reflective feedback from real runs. The result is not just higher accuracy; it’s traceability and consistency that pass governance sniff tests. ...
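The audit claim is easiest to see as code. Below is a minimal sketch of that staged loop, with hypothetical placeholder stage functions standing in for the LLM's actual decisions; the point is that every transition passes through one choke point that records inputs and outputs, so a reviewer can replay the whole trajectory.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class StagedRun:
    """Sketch of an auditable multi-stage agent workflow (names assumed)."""
    log: list = field(default_factory=list)

    def stage(self, name: str, fn: Callable[[dict], dict], state: dict) -> dict:
        out = fn(state)
        # Every decision is logged with its inputs, so the full
        # model-development trajectory is traceable after the fact.
        self.log.append({"stage": name, "input": dict(state), "output": out})
        return out

# Placeholder stages; in TS-Agent these are LLM calls grounded in
# curated knowledge banks and reflective feedback from real runs.
def pre_select(s):  return {**s, "model": "candidate-A"}
def refine_code(s): return {**s, "code": f"train({s['model']})"}
def fine_tune(s):   return {**s, "val_metric": 0.42}

run, state = StagedRun(), {"task": "equity returns forecast"}
for name, fn in [("pre-selection", pre_select),
                 ("code refinement", refine_code),
                 ("fine-tuning", fine_tune)]:
    state = run.stage(name, fn, state)
print(len(run.log), "auditable steps; final state:", state)
```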

August 20, 2025 · 5 min · Zelina

Forecast First, Ask Later: How DCATS Makes Time Series Smarter with LLMs

When it comes to forecasting traffic patterns, weather, or financial activity, the prevailing wisdom in machine learning has long been: better models mean better predictions. But a new approach flips this assumption on its head. Instead of chasing ever-more complex architectures, the DCATS framework (Data-Centric Agent for Time Series), developed by researchers at Visa, suggests we should first get our data in order—and let a language model do it.

The Agentic Turn in AutoML
DCATS builds on the trend of integrating Large Language Model (LLM) agents into AutoML pipelines, but with a twist. While prior systems like AIDE focus on automating model design and hyperparameter tuning, DCATS delegates a more fundamental task to its LLM agent: curating the right data. ...
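A stylized version of that data-centric loop: hold the model fixed and deliberately simple, and let the search happen over candidate feature sets instead. Here a plain argmin over held-out error stands in for the LLM agent's proposals and justifications; the ridge stand-in model and the names are illustrative, not DCATS's implementation.

```python
import numpy as np

def score_dataset(X: np.ndarray, y: np.ndarray) -> float:
    """Fit a fixed, simple forecaster and report held-out MSE.
    In a data-centric loop the model stays constant; only data varies."""
    split = len(y) // 2
    Xtr, ytr, Xva, yva = X[:split], y[:split], X[split:], y[split:]
    # Ridge regression as a stand-in forecaster (illustrative choice).
    w = np.linalg.solve(Xtr.T @ Xtr + 1e-3 * np.eye(X.shape[1]), Xtr.T @ ytr)
    return float(np.mean((Xva @ w - yva) ** 2))

def curate(candidates: dict, y: np.ndarray) -> str:
    # An LLM agent would propose and justify candidate data sets;
    # a plain argmin over validation error stands in for that here.
    return min(candidates, key=lambda name: score_dataset(candidates[name], y))
```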

August 7, 2025 · 3 min · Zelina

Shattering the Spectrum: How PRISM Revives Signal Processing in Time-Series AI

In the race to conquer time-series classification, most modern models have sprinted toward deeper Transformers and wider convolutional architectures. But what if the real breakthrough came not from complexity—but from symmetry? Enter PRISM (Per-channel Resolution-Informed Symmetric Module), a model that merges classical signal processing wisdom with deep learning, and in doing so, delivers a stunning blow to overparameterized AI.

PRISM’s central idea is refreshingly simple: instead of building a massive model to learn everything from scratch, start by decomposing the signal like a physicist would—using symmetric FIR filters at multiple temporal resolutions, applied independently per channel. Like a prism splitting light into distinct wavelengths, PRISM separates time-series data into spectral components that are clean, diverse, and informative. ...
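The core move is small enough to sketch. Assuming a (channels × time) array and a bank of low-pass cutoffs (both illustrative; the paper's actual filter-bank design differs in detail), per-channel symmetric FIR decomposition looks roughly like this:

```python
import numpy as np
from scipy.signal import firwin

def prism_style_bands(x: np.ndarray, fs: float = 100.0,
                      cutoffs=(2.0, 8.0, 32.0), ntaps: int = 65) -> np.ndarray:
    """Decompose each channel with symmetric (linear-phase) FIR low-pass
    filters at several temporal resolutions.

    x: (channels, time) series; cutoffs in Hz (example values).
    Returns a (channels * len(cutoffs), time) array of band views.
    """
    bands = []
    for fc in cutoffs:
        taps = firwin(ntaps, fc, fs=fs)  # symmetric kernel => linear phase
        # Centred convolution keeps each band time-aligned with the input,
        # applied to every channel independently.
        bands.append(np.stack([np.convolve(ch, taps, mode="same") for ch in x]))
    return np.concatenate(bands, axis=0)
```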

August 7, 2025 · 3 min · Zelina

Causality in Stereo: How Multi-Band Granger Unveils Frequency-Specific Influence

Causality is rarely one-size-fits-all—especially in the dynamic world of time series data. Whether you’re analyzing brainwaves, financial markets, or industrial processes, the timing of influence and the frequency at which it occurs both matter. Traditional Granger causality assumes a fixed temporal lag, while Variable-Lag Granger Causality (VLGC) brings some flexibility by allowing dynamic time alignment. But even VLGC falls short of capturing frequency-specific causal dynamics, which are ubiquitous in complex systems. ...
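For intuition, a naive frequency-specific variant is easy to sketch: band-pass both series into each band, then run an off-the-shelf Granger test per band. The band edges and the test choice below are illustrative assumptions; the paper's multi-band formulation is more careful, not least because filtering by itself can distort causal estimates.

```python
import numpy as np
from scipy.signal import butter, filtfilt
from statsmodels.tsa.stattools import grangercausalitytests

def bandpass(x, lo, hi, fs, order=4):
    # Zero-phase band-pass so the filtered signals stay time-aligned.
    b, a = butter(order, [lo, hi], btype="band", fs=fs)
    return filtfilt(b, a, x)

def granger_by_band(cause, effect, fs, bands, maxlag=5):
    """Smallest Granger p-value per frequency band (illustrative only)."""
    out = {}
    for lo, hi in bands:
        # Column order matters: the test asks whether column 2
        # Granger-causes column 1.
        pair = np.column_stack([bandpass(effect, lo, hi, fs),
                                bandpass(cause, lo, hi, fs)])
        res = grangercausalitytests(pair, maxlag=maxlag, verbose=False)
        out[(lo, hi)] = min(r[0]["ssr_ftest"][1] for r in res.values())
    return out
```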

August 4, 2025 · 4 min · Zelina