Quants With a Plan: Agentic Workflows That Outtrade AutoML

If AutoML is a fast car, financial institutions need a train with tracks—a workflow that knows where it’s going, logs every switch, and won’t derail when markets regime-shift. A new framework called TS-Agent proposes exactly that: a structured, auditable, LLM-driven agent that plans model development for financial time series instead of blindly searching.

Unlike generic AutoML, TS-Agent formalizes modeling as a multi-stage decision process—Model Pre-selection → Code Refinement → Fine-tuning—and anchors each step in domain-curated knowledge banks and reflective feedback from real runs. The result is not just higher accuracy; it’s traceability and consistency that pass governance sniff tests.

What TS-Agent Actually Is (in business English)

A planner agent orchestrates a workflow for forecasting and synthetic data generation.
Decisions at each step are grounded in three structured knowledge banks:
1. Case Bank: past financial tasks & proven solutions
2. Financial TS Code Base: ready-to-run models & evaluation metrics
3. Refinement Knowledge Bank: training heuristics (e.g., learning-rate schedules, leakage checks, walk-forward validation)
The agent keeps auditable logs of what changed, why, and what happened, enabling compliance, debugging, and reproducibility.
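
To make "what changed, why, and what happened" concrete, here is a minimal sketch of what one log record could look like. The class and field names (EditLogEntry, AuditLog, metric_before, and so on) are assumptions made for illustration, not TS-Agent's actual schema, and the numbers are placeholders rather than reported results.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class EditLogEntry:
    """One auditable record: what changed, why, and what happened."""
    stage: str            # e.g. "code_refinement"
    model: str            # e.g. "PatchTST"
    change: str           # what changed
    rationale: str        # why it was tried
    metric_before: float  # validation metric before the edit
    metric_after: float   # validation metric after the edit
    accepted: bool        # whether the edit was kept


@dataclass
class AuditLog:
    entries: List[EditLogEntry] = field(default_factory=list)

    def record(self, entry: EditLogEntry) -> None:
        self.entries.append(entry)

    def to_jsonl(self) -> str:
        """Serialize one JSON object per line, ready for an append-only store."""
        return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in self.entries)


log = AuditLog()
log.record(EditLogEntry(
    stage="code_refinement",
    model="PatchTST",
    change="add ReduceLROnPlateau scheduler",
    rationale="validation loss plateaued",
    metric_before=0.0125,  # illustrative numbers, not results from the paper
    metric_after=0.0118,
    accepted=True,
))
print(log.to_jsonl())
```

One JSON object per line keeps the log easy to append to and easy to diff, which is roughly what a reviewer or auditor needs when replaying a model's history.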

The Three Knowledge Banks at a Glance

Bank	What it holds	What decision it improves	Why it matters in finance
Case Bank	Prior tasks & winning approaches	Model pre-selection	Case-based reasoning cuts dead-ends, aligns to familiar regimes
Financial TS Code Base	Implemented models (e.g., Autoformer, PatchTST, DLinear; TimeGAN, DDPM) + metric suite	Refine without re-inventing	Lowers variance vs. freehand code-gen; faster, more reliable
Refinement Knowledge Bank	Heuristics: scaling, leakage prevention, schedulers, weight decay, early stopping, cross-validation	Make code better (safely)	Enforces best practices and prevents silent model risk
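
The Case Bank row is easiest to picture as a retrieval step. The sketch below ranks past cases by tag overlap (Jaccard similarity) and merges the models of the closest cases into a shortlist. The similarity measure, the case entries, and the function names are all hypothetical; the source does not say how TS-Agent scores case similarity.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two tag sets, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical case bank: each entry pairs a past task with models that worked.
CASE_BANK: List[Dict] = [
    {"task": "forecasting", "tags": {"fx", "daily"}, "models": ["PatchTST", "DLinear"]},
    {"task": "forecasting", "tags": {"crypto", "hourly"}, "models": ["Autoformer", "PatchTST"]},
    {"task": "forecasting", "tags": {"stock", "daily"}, "models": ["Autoformer", "PatchTST"]},
    {"task": "generation", "tags": {"stock", "daily"}, "models": ["TimeGAN", "DDPM"]},
]


def shortlist_models(task: str, tags: Set[str], case_bank: List[Dict], top_k: int = 2) -> List[str]:
    """Return models from the top_k most similar past cases of the same task type."""
    same_task = [c for c in case_bank if c["task"] == task]
    ranked = sorted(same_task, key=lambda c: jaccard(tags, c["tags"]), reverse=True)
    shortlist: List[str] = []
    for case in ranked[:top_k]:
        for model in case["models"]:
            if model not in shortlist:  # de-duplicate while preserving rank order
                shortlist.append(model)
    return shortlist


print(shortlist_models("forecasting", {"stock", "daily"}, CASE_BANK))
```

Filtering by task type before ranking keeps generators such as TimeGAN out of a forecasting shortlist even when the asset class and frequency match.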

Why This Matters (beyond RMSE)

Auditability by design: Each code edit is isolated and justified; logs tie decisions to outcomes—ideal for model risk management (MRM) and internal audit.
Lower variance across LLMs: Because the agent edits within a curated code base, results don’t swing wildly with backbone swaps; that’s operational stability.
Finance-first metrics: Beyond RMSE/MAE/MAPE/sMAPE, TS-Agent optimizes Sharpe/VaR/ES deltas and distributional & dependency scores for generators—metrics risk teams actually care about.
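
For readers who want to see the difference between point-error metrics and risk "deltas", here is a small sketch. It assumes a delta is the absolute gap between a risk statistic computed on realized returns and the same statistic computed on forecast-implied returns; that reading, the historical (empirical) VaR/ES estimator, and the un-annualized Sharpe ratio with a zero risk-free rate are assumptions, not definitions taken from the source.

```python
import math
import statistics
from typing import Dict, Sequence, Tuple


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Symmetric MAPE in percent; terms with a zero denominator count as 0."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = abs(t) + abs(p)
        total += 0.0 if denom == 0 else 2 * abs(t - p) / denom
    return 100 * total / len(y_true)


def sharpe(returns: Sequence[float]) -> float:
    """Per-period Sharpe ratio (zero risk-free rate, not annualized)."""
    return statistics.mean(returns) / statistics.stdev(returns)


def var_es(returns: Sequence[float], alpha: float = 0.05) -> Tuple[float, float]:
    """Historical VaR and ES at tail level alpha, reported as positive losses."""
    losses = sorted(-r for r in returns)         # ascending losses
    k = max(1, math.ceil(alpha * len(losses)))   # number of tail observations
    tail = losses[-k:]                           # the k worst losses
    return tail[0], sum(tail) / k                # VaR = smallest tail loss, ES = tail mean


def risk_deltas(true_r: Sequence[float], pred_r: Sequence[float], alpha: float = 0.05) -> Dict[str, float]:
    """Absolute gaps between risk statistics of realized and predicted returns."""
    var_t, es_t = var_es(true_r, alpha)
    var_p, es_p = var_es(pred_r, alpha)
    return {
        "sharpe_delta": abs(sharpe(true_r) - sharpe(pred_r)),
        "var_delta": abs(var_t - var_p),
        "es_delta": abs(es_t - es_p),
    }
```

A forecast can score well on RMSE while still flattening the tails; the deltas only go to zero when the predicted return distribution carries the same risk profile as the realized one. Note that sharpe() needs at least two observations and a non-constant series.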

How the Workflow Runs

Stage 1 – Model Pre-selection
- Retrieve similar cases; shortlist models (e.g., Autoformer vs. PatchTST for 60→3-day stock forecasting).
Stage 2 – Code Refinement (two phases)
- Warm-up (round-robin): iterate per model with small edits (e.g., ReduceLROnPlateau, weight decay) and quick tuning; keep only best variants.
- Optimization: focus on the top candidate; iterate longer, rejecting edits that don’t improve validation metrics; keep full logs.
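
The two phases of Stage 2 can be sketched as a small accept-or-reject loop. This is a simplification under stated assumptions: edits are modeled as config overrides rather than real code edits, and toy_evaluate is a hypothetical stand-in for an actual train-and-validate run with made-up loss values.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, object]
Evaluate = Callable[[str, Config], float]  # returns validation loss (lower is better)


def try_edit(model: str, config: Config, edit: Config, evaluate: Evaluate, log: List[dict]) -> Config:
    """Apply one edit to a copy of the config; keep it only if validation loss improves."""
    candidate = {**config, **edit}
    before, after = evaluate(model, config), evaluate(model, candidate)
    accepted = after < before
    log.append({"model": model, "edit": edit, "before": before,
                "after": after, "accepted": accepted})
    return candidate if accepted else config


def refine(models: List[str], warmup_edits: List[Config], optimization_edits: List[Config],
           evaluate: Evaluate) -> Tuple[str, Config, List[dict]]:
    log: List[dict] = []
    configs: Dict[str, Config] = {m: {} for m in models}

    # Warm-up: round-robin over the shortlist, one small edit per model per pass.
    for edit in warmup_edits:
        for m in models:
            configs[m] = try_edit(m, configs[m], edit, evaluate, log)

    # Optimization: keep only the best model and iterate longer on it.
    best = min(models, key=lambda m: evaluate(m, configs[m]))
    for edit in optimization_edits:
        configs[best] = try_edit(best, configs[best], edit, evaluate, log)

    return best, configs[best], log


def toy_evaluate(model: str, config: Config) -> float:
    """Stand-in for a real train-and-validate run (hypothetical numbers)."""
    loss = {"Autoformer": 1.00, "PatchTST": 0.90}[model]
    if config.get("scheduler") == "ReduceLROnPlateau":
        loss -= 0.05
    if "weight_decay" in config:
        loss -= 0.02
    if config.get("dropout", 0.0) > 0.5:
        loss += 0.10  # a harmful edit that the loop should reject
    return loss


best, config, log = refine(
    models=["Autoformer", "PatchTST"],
    warmup_edits=[{"scheduler": "ReduceLROnPlateau"}],
    optimization_edits=[{"weight_decay": 1e-4}, {"dropout": 0.9}],
    evaluate=toy_evaluate,
)
print(best, config)
print(sum(e["accepted"] for e in log), "of", len(log), "edits accepted")
```

Because rejected edits never touch the stored config and every attempt is logged with before and after scores, the loop can be replayed or rolled back at any step.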

Think of it as chain-of-code-edits: a controlled, reversible path of improvements with checkpoints—like Git for modeling decisions.

Evidence: Does Planning Beat Searching?

Forecasting across Crypto (hourly), Exchange (daily FX), and U.S. Stock (daily):

TS-Agent achieved a 100% run success rate and the lowest error with modern LLMs, cutting RMSE by more than 20% vs. AutoML on Exchange and by about 8% on Crypto, and by up to 30% vs. DS-Agent and 15–40% vs. ResearchAgent.
On risk-sensitive metrics for Crypto, TS-Agent delivered the lowest Sharpe/VaR deltas (≈20% better than competing agents with comparable LLMs), indicating forecasts preserve market structure—not just point accuracy.

Synthetic generation (GAN/VAE/diffusion families on Exchange/Stock/Crypto):

TS-Agent consistently ranked top on Marginal, Correlation, Autocorrelation, and Covariance distances; it matched or beat Optuna while maintaining 100% success across LLMs.
Variance across backbones was materially lower than generic agents—practical reliability for stress testing and data augmentation.
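
To give a feel for what dependency scores measure, the sketch below implements two of them in plain Python: an autocorrelation distance (mean absolute gap in lag-k autocorrelation) and a correlation distance (mean absolute gap in cross-channel Pearson correlation). These are plausible definitions chosen for illustration; the exact formulas behind the reported scores may differ, and the autocorrelation here uses a simplified estimator that correlates a series with its shifted copy.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; both inputs must be non-constant and equally long."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def autocorr(x: Sequence[float], lag: int) -> float:
    """Lag-k autocorrelation as the correlation of the series with its shifted copy."""
    return pearson(x[:-lag], x[lag:])


def autocorrelation_distance(real: Sequence[float], synth: Sequence[float], max_lag: int = 3) -> float:
    """Mean absolute difference in autocorrelation over lags 1..max_lag."""
    lags = range(1, max_lag + 1)
    return sum(abs(autocorr(real, k) - autocorr(synth, k)) for k in lags) / max_lag


def correlation_distance(real: List[Sequence[float]], synth: List[Sequence[float]]) -> float:
    """Mean absolute difference in pairwise correlation across channels.

    `real` and `synth` are lists of channels (one series per asset or feature).
    """
    n = len(real)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    gaps = (abs(pearson(real[i], real[j]) - pearson(synth[i], synth[j])) for i, j in pairs)
    return sum(gaps) / len(pairs)


real = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]
synth = [[1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0]]
print(correlation_distance(real, synth))
```

In the usage example the two real channels move together while the synthetic ones move in opposite directions, so the correlation distance comes out at 2 up to floating-point rounding: the generator matched each marginal but destroyed the dependency, which is exactly what a stress-testing team would need to catch.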

Practical Implications for Financial Teams

Where it shines

Regulated environments needing explainable model evolution and tight change control.
Shops with partially standardized codebases seeking agentic acceleration without free-form code risk.
Volatile markets (crypto, EM FX) where risk-aligned metrics matter as much as pure error.

What to watch

Curation debt: The Case/Code/Refinement banks must be curated and updated; treat them like a product.
Guardrails: Keep write-access to production repos gated; TS-Agent should propose PRs, not hot-patch prod.
Data discipline: The system assumes leak-free splits and correct walk-forward; governance should validate these assumptions.
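
The data-discipline point is the easiest one to check mechanically. Below is a minimal sketch of a rolling walk-forward splitter plus a guard that fails loudly if any training index is not strictly earlier than every test index. The function names and the fixed-window design are assumptions for illustration; an expanding window would be an equally reasonable choice.

```python
from typing import Iterator, Optional, Tuple


def walk_forward_splits(n_samples: int, train_size: int, test_size: int,
                        step: Optional[int] = None) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges that roll forward through time.

    Samples are assumed to be sorted chronologically, oldest first.
    """
    step = test_size if step is None else step
    if step <= 0:
        raise ValueError("step must be positive")
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step


def assert_no_leakage(train: range, test: range) -> None:
    """Raise if any training index is not strictly before every test index."""
    if max(train) >= min(test):
        raise ValueError(f"leakage: train ends at {max(train)}, test starts at {min(test)}")


for train, test in walk_forward_splits(n_samples=10, train_size=4, test_size=2):
    assert_no_leakage(train, test)
    print(list(train), "->", list(test))
```

A guard like this only covers index ordering. Leakage through preprocessing (for example, fitting a scaler on the full series before splitting) needs a separate check.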

A 30-Day Adoption Playbook

Week 1: Baseline & Banks

Inventory current forecasting/generation tasks and metrics (add risk metrics if missing).
Seed the Code Base with vetted implementations and unit tests; draft the Refinement Bank from internal MRM checklists.

Week 2: Pilot Workflow

Run TS-Agent on one asset class (e.g., FX daily). Capture all logs; require human sign-off on each commit.
Compare against your current AutoML baseline on accuracy and Sharpe/VaR/ES deltas.

Week 3: Governance Tightening

Wire logs to your MRM system (model inventory IDs, approvals, owners).
Add policy checks (leakage detectors, dataset lineage tags) to the Refinement Bank.

Week 4: Scale & SRE

Parallelize Warm-up across symbols; centralize artifacts (configs, weights, logs) with retention policies.
Create a PR-only pathway: TS-Agent opens PRs with rationale & metrics; reviewers approve/merge.

Cheat-Sheet: Pain Points → TS-Agent Moves

Pain Point	TS-Agent Mechanism	Outcome
Black-box AutoML trails on risk metrics	Finance-first metric bank + reflective tuning	Lower Sharpe/VaR/ES deltas, better trading fidelity
Fragile freehand code generation	Edit within a curated code base	Fewer bugs, faster wins, easier audits
Hard-to-explain model drift	Chain-of-code-edits + logs	Reproducible, auditable evolution
LLM backbone instability	Knowledge banks + constrained edits	Lower variance across LLMs

Bottom Line

TS-Agent reframes “try everything and hope” into plan → edit → measure → log. In markets where what changed and why matter as much as how well the model performs, this is the agentic blueprint that finally respects both P&L and policy.

Cognaptus: Automate the Present, Incubate the Future.
