If AutoML is a fast car, financial institutions need a train with tracks—a workflow that knows where it’s going, logs every switch, and won’t derail when markets regime-shift. A new framework called TS-Agent proposes exactly that: a structured, auditable, LLM-driven agent that plans model development for financial time series instead of blindly searching.

Unlike generic AutoML, TS-Agent formalizes modeling as a multi-stage decision process (Model Pre-selection → Code Refinement → Fine-tuning) and anchors each step in domain-curated knowledge banks and reflective feedback from real runs. The result is not just higher accuracy; it’s traceability and consistency that pass governance sniff tests.

What TS-Agent Actually Is (in business English)

  • A planner agent orchestrates a workflow for forecasting and synthetic data generation.

  • Decisions at each step are grounded in three structured knowledge banks:

    1. Case Bank: past financial tasks & proven solutions
    2. Financial TS Code Base: ready-to-run models & evaluation metrics
    3. Refinement Knowledge Bank: training heuristics (e.g., learning-rate schedules, leakage checks, walk-forward validation)
  • The agent keeps auditable logs of what changed, why, and what happened, enabling compliance, debugging, and reproducibility.
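To make "what changed, why, and what happened" concrete, here is a minimal sketch of what one audit record could look like. The source does not describe TS-Agent's actual log schema, so `EditLogEntry`, its field names, and `append_entry` are hypothetical illustrations.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone


@dataclass
class EditLogEntry:
    """One auditable record (hypothetical schema, not TS-Agent's own)."""
    stage: str            # e.g. "code_refinement"
    model: str            # e.g. "PatchTST"
    change: str           # what changed
    rationale: str        # why it changed (e.g. which heuristic motivated it)
    metric_before: float  # validation metric before the edit
    metric_after: float   # validation metric after the edit
    accepted: bool        # whether the edit was kept
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def append_entry(log: list, entry: EditLogEntry) -> str:
    """Serialize the entry as one JSON line and append it to the log."""
    line = json.dumps(asdict(entry), sort_keys=True)
    log.append(line)
    return line
```

One JSON line per edit keeps the log append-only and easy to ship to whatever model inventory or audit system a team already runs.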

The Three Knowledge Banks at a Glance

| Bank | What it holds | What decision it improves | Why it matters in finance |
| --- | --- | --- | --- |
| Case Bank | Prior tasks & winning approaches | Model pre-selection | Case-based reasoning cuts dead-ends, aligns to familiar regimes |
| Financial TS Code Base | Implemented models (e.g., Autoformer, PatchTST, DLinear; TimeGAN, DDPM) + metric suite | Refine without re-inventing | Lowers variance vs. freehand code-gen; faster, more reliable |
| Refinement Knowledge Bank | Heuristics: scaling, leakage prevention, schedulers, weight decay, early stopping, cross-validation | Make code better (safely) | Enforces best practices and prevents silent model risk |

Why This Matters (beyond RMSE)

  • Auditability by design: Each code edit is isolated and justified; logs tie decisions to outcomes—ideal for model risk management (MRM) and internal audit.
  • Lower variance across LLMs: Because the agent edits within a curated code base, results don’t swing wildly with backbone swaps; that’s operational stability.
  • Finance-first metrics: Beyond RMSE/MAE/MAPE/sMAPE, TS-Agent optimizes Sharpe/VaR/ES deltas and distributional & dependency scores for generators—metrics risk teams actually care about.
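For readers who want to see what a "Sharpe/VaR/ES delta" might look like in practice, here is a rough sketch. It assumes a delta is the absolute gap between a risk statistic computed on predicted returns and the same statistic on realized returns; the source does not spell out the exact formulas, so treat the definitions, the function names, and the simple order-statistic VaR estimate as assumptions.

```python
import math
import statistics


def sharpe(returns):
    """Per-period Sharpe ratio, assuming a zero risk-free rate."""
    sd = statistics.stdev(returns)
    return statistics.mean(returns) / sd if sd > 0 else 0.0


def historical_var(returns, alpha=0.05):
    """Historical VaR: negative of the empirical alpha-quantile of returns."""
    ordered = sorted(returns)
    k = max(0, math.ceil(alpha * len(ordered)) - 1)
    return -ordered[k]


def expected_shortfall(returns, alpha=0.05):
    """ES: average loss over the worst alpha fraction of returns."""
    ordered = sorted(returns)
    k = max(1, math.ceil(alpha * len(ordered)))
    return -statistics.mean(ordered[:k])


def risk_deltas(actual, predicted, alpha=0.05):
    """Absolute gaps between risk statistics of predicted and actual returns."""
    return {
        "sharpe_delta": abs(sharpe(predicted) - sharpe(actual)),
        "var_delta": abs(
            historical_var(predicted, alpha) - historical_var(actual, alpha)
        ),
        "es_delta": abs(
            expected_shortfall(predicted, alpha) - expected_shortfall(actual, alpha)
        ),
    }
```

Under this reading, a forecast can have a decent RMSE and still show large deltas if it smooths away the tails, which is why these numbers are worth tracking alongside point error.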

How the Workflow Runs

  1. Stage 1 – Model Pre-selection

    • Retrieve similar cases; shortlist models (e.g., Autoformer vs. PatchTST for 60→3-day stock forecasting).
  2. Stage 2 – Code Refinement (two phases)

    • Warm-up (round-robin): iterate per model with small edits (e.g., ReduceLROnPlateau, weight decay) and quick tuning; keep only best variants.
    • Optimization: focus on the top candidate; iterate longer, rejecting edits that don’t improve validation metrics; keep full logs.

Think of it as chain-of-code-edits: a controlled, reversible path of improvements with checkpoints—like Git for modeling decisions.
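The two-phase loop described above can be sketched in a few lines. This is an illustrative toy, not TS-Agent's implementation: edits are modeled as simple config changes rather than real code edits, lower validation scores are assumed to be better, and `refine`, `try_edit`, and `toy_evaluate` are hypothetical names. The toy scoring function is synthetic and stands in for an actual training run.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]
Edit = Tuple[str, str, float]  # (description, config key, new value)


def try_edit(model, config, score, edit, evaluate, log):
    """Apply one edit; keep it only if the validation score improves."""
    desc, key, value = edit
    candidate = {**config, key: value}
    new_score = evaluate(model, candidate)
    accepted = new_score < score  # assumption: lower is better
    log.append({"model": model, "edit": desc, "before": score,
                "after": new_score, "accepted": accepted})
    return (candidate, new_score) if accepted else (config, score)


def refine(models: Dict[str, Config], warmup_edits: List[Edit],
           optimization_edits: List[Edit],
           evaluate: Callable[[str, Config], float]):
    log: List[dict] = []
    state = {m: (dict(c), evaluate(m, c)) for m, c in models.items()}
    # Warm-up: round-robin, each small edit is tried on every model.
    for edit in warmup_edits:
        for m in models:
            cfg, score = state[m]
            state[m] = try_edit(m, cfg, score, edit, evaluate, log)
    # Optimization: focus on the best model, rejecting non-improving edits.
    best = min(state, key=lambda m: state[m][1])
    cfg, score = state[best]
    for edit in optimization_edits:
        cfg, score = try_edit(best, cfg, score, edit, evaluate, log)
    return best, cfg, score, log


def toy_evaluate(model: str, config: Config) -> float:
    """Synthetic stand-in for a validation run (not real results)."""
    base = {"Autoformer": 1.0, "PatchTST": 0.9}[model]
    return (base + abs(config["lr"] - 1e-3) * 100
            + abs(config["weight_decay"] - 1e-4) * 100)
```

Because rejected edits never touch the kept config, every entry in `log` is a checkpoint you could roll back to, which is the "Git for modeling decisions" idea in miniature.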

Evidence: Does Planning Beat Searching?

Forecasting across Crypto (hourly), Exchange (daily FX), and U.S. Stock (daily):

  • TS-Agent achieved 100% run success and lowest error with modern LLMs, cutting RMSE >20% vs. AutoML on Exchange and ~8% on Crypto, and up to 30% vs. DS-Agent and 15–40% vs. ResearchAgent.
  • On risk-sensitive metrics for Crypto, TS-Agent delivered the lowest Sharpe/VaR deltas (≈20% better than competing agents with comparable LLMs), indicating forecasts preserve market structure—not just point accuracy.

Synthetic generation (GAN/VAE/diffusion families on Exchange/Stock/Crypto):

  • TS-Agent consistently ranked top on Marginal, Correlation, Autocorrelation, and Covariance distances; it matched or beat Optuna while maintaining 100% success across LLMs.
  • Variance across backbones was materially lower than generic agents—practical reliability for stress testing and data augmentation.
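As a rough guide to what distances like these measure, here is a simplified sketch comparing a real and a synthetic series. The exact definitions used in the TS-Agent evaluation are not given here, so the formulas below (a histogram-based marginal distance, a mean absolute gap in autocorrelation, and a gap in pairwise Pearson correlation) and all function names are illustrative assumptions.

```python
import math
import statistics


def marginal_distance(real, synth, bins=10):
    """Total-variation distance between histograms on shared bins."""
    lo = min(min(real), min(synth))
    hi = max(max(real), max(synth))
    width = (hi - lo) / bins or 1.0  # guard against constant series

    def hist(x):
        counts = [0] * bins
        for v in x:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        return [c / len(x) for c in counts]

    return sum(abs(a - b) for a, b in zip(hist(real), hist(synth))) / 2


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    mu = statistics.fmean(x)
    denom = sum((v - mu) ** 2 for v in x)
    if denom == 0:
        return 0.0
    num = sum((x[t] - mu) * (x[t + lag] - mu) for t in range(len(x) - lag))
    return num / denom


def autocorrelation_distance(real, synth, max_lag=5):
    """Mean absolute gap between autocorrelations at lags 1..max_lag."""
    gaps = [abs(autocorr(real, k) - autocorr(synth, k))
            for k in range(1, max_lag + 1)]
    return statistics.fmean(gaps)


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def correlation_distance(real_a, real_b, synth_a, synth_b):
    """Gap between cross-asset correlation in real vs. synthetic data."""
    return abs(pearson(real_a, real_b) - pearson(synth_a, synth_b))
```

A generator that scores near zero on all three is reproducing the shape of returns, their memory over time, and how assets move together, which is what stress-testing use cases need.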

Practical Implications for Financial Teams

Where it shines

  • Regulated environments needing explainable model evolution and tight change control.
  • Shops with partially standardized codebases seeking agentic acceleration without free-form code risk.
  • Volatile markets (crypto, EM FX) where risk-aligned metrics matter as much as pure error.

What to watch

  • Curation debt: The Case/Code/Refinement banks must be curated and updated; treat them like a product.
  • Guardrails: Keep write-access to production repos gated; TS-Agent should propose PRs, not hot-patch prod.
  • Data discipline: The system assumes leak-free splits and correct walk-forward; governance should validate these assumptions.
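Since the data-discipline point depends on getting walk-forward splits right, here is a minimal sketch of how a governance check could generate and validate them. The function names, the rolling fixed-size window, and the optional `gap` between train and test are assumptions for illustration; they are not taken from TS-Agent.

```python
def walk_forward_splits(n_samples, train_size, test_size, step=None, gap=0):
    """Return (train, test) index ranges that roll forward through time.

    `gap` leaves unused samples between train and test, which can help when
    features or labels span several periods (an illustrative option).
    """
    step = step or test_size
    splits = []
    start = 0
    while start + train_size + gap + test_size <= n_samples:
        train = range(start, start + train_size)
        test_start = start + train_size + gap
        test = range(test_start, test_start + test_size)
        splits.append((train, test))
        start += step
    return splits


def assert_no_leakage(splits):
    """Raise if any training index is at or after the first test index."""
    for i, (train, test) in enumerate(splits):
        if max(train) >= min(test):
            raise ValueError(f"split {i}: training data overlaps test period")
```

For example, `walk_forward_splits(10, train_size=4, test_size=2)` yields three folds, each testing only on observations that come strictly after its training window. A check like `assert_no_leakage` is cheap enough to run on every proposed pipeline before any metric is trusted.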

A 30-Day Adoption Playbook

Week 1: Baseline & Banks

  • Inventory current forecasting/generation tasks and metrics (add risk metrics if missing).
  • Seed the Code Base with vetted implementations and unit tests; draft the Refinement Bank from internal MRM checklists.

Week 2: Pilot Workflow

  • Run TS-Agent on one asset class (e.g., FX daily). Capture all logs; require human sign-off on each commit.
  • Compare against your current AutoML baseline on accuracy and Sharpe/VaR/ES deltas.

Week 3: Governance Tightening

  • Wire logs to your MRM system (model inventory IDs, approvals, owners).
  • Add policy checks (leakage detectors, dataset lineage tags) to the Refinement Bank.

Week 4: Scale & SRE

  • Parallelize Warm-up across symbols; centralize artifacts (configs, weights, logs) with retention policies.
  • Create a PR-only pathway: TS-Agent opens PRs with rationale & metrics; reviewers approve/merge.

Cheat-Sheet: Pain Points → TS-Agent Moves

| Pain Point | TS-Agent Mechanism | Outcome |
| --- | --- | --- |
| Black-box AutoML trails on risk metrics | Finance-first metric bank + reflective tuning | Lower Sharpe/VaR/ES deltas, better trading fidelity |
| Fragile freehand code generation | Edit within a curated code base | Fewer bugs, faster wins, easier audits |
| Hard-to-explain model drift | Chain-of-code-edits + logs | Reproducible, auditable evolution |
| LLM backbone instability | Knowledge banks + constrained edits | Lower variance across LLMs |

Bottom Line

TS-Agent reframes “try everything and hope” into plan → edit → measure → log. In markets where “what changed and why” matters as much as “how well it performs,” this is the agentic blueprint that finally respects both P&L and policy.

Cognaptus: Automate the Present, Incubate the Future.