Opening — Why This Matters Now

Everyone wants an “AI data scientist.” Very few want to debug one.

Uploading a CSV into a chat interface and asking for “insights” feels magical — until the dataset exceeds a few hundred megabytes, the metric definition becomes ambiguous, or the workflow quietly dissolves into hallucinated code and truncated context.

Data science is not a one-shot prompt. It is a multi-step, stateful, computationally grounded process. And that means the problem is not just model capability — it is context management.

CEDAR (Context Engineering for Data Science with Agent Routing) approaches this reality with refreshing pragmatism: instead of asking LLMs to “be smarter,” it engineers the context they operate within.

That distinction is not cosmetic. It is architectural.


Background — From Prompting to Orchestration

Traditional data science follows a familiar arc:

  1. Load data
  2. Clean and preprocess
  3. Engineer features
  4. Train models
  5. Evaluate metrics
  6. Visualize and iterate

This workflow is iterative, error-prone, and heavily dependent on code execution.
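
For concreteness, here is that arc in bare-bones Python, using pandas and scikit-learn; the file name and column names are placeholders, and every one of these steps normally needs a human at the keyboard:

```python
# A minimal sketch of the classic six-step workflow; file and column names
# are placeholders, and real projects iterate on every step.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# 1. Load data
df = pd.read_csv("data.csv")

# 2. Clean and preprocess
df = df.dropna()

# 3. Engineer features
df["ratio"] = df["feature_a"] / (df["feature_b"] + 1e-9)

# 4. Train a model
X, y = df.drop(columns=["target"]), df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = RandomForestClassifier().fit(X_train, y_train)

# 5. Evaluate metrics
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

# 6. Visualize and iterate (e.g., inspect feature importances, then loop back)
print(sorted(zip(model.feature_importances_, X.columns), reverse=True)[:5])
```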

Most current LLM-based solutions attempt to compress this process into conversational exchanges. The limitations are predictable:

  • Large datasets: file upload limits and context windows collapse under real-world data sizes
  • Mathematical rigor: LLMs are weak at reliable numeric computation without external execution
  • Multi-step reasoning: long conversational context becomes noisy and unintelligible
  • Privacy concerns: enterprise data cannot be freely uploaded to cloud APIs
  • Transparency: users cannot clearly see how solutions are produced

Recent agentic systems attempt to chain prompts together, but often at the cost of interpretability. When everything is hidden behind orchestration layers, users are left with results — not reasoning.

CEDAR chooses a different path: make the workflow explicit, structured, and executable.


Analysis — What CEDAR Actually Does

CEDAR is built around a deceptively simple principle:

Keep data local. Generate plans and code in small, structured steps. Render only what matters into context.

Let’s unpack the architecture.

1. Structured Prompts Instead of Free-Form Chaos

Rather than allowing users to write sprawling prompts, CEDAR enforces a structured input form divided into:

  • General instructions (verbosity, number of steps, plots expected)
  • Task description
  • Data description
  • Data location
  • Metrics
  • Inputs and outputs
  • Special instructions

This removes ambiguity and compresses cognitive overhead — both for the human and the LLM.

It’s less “chatting.” More specification.
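
As a rough sketch of the idea, the form fields above could be captured as a single structured object instead of a free-form prompt. The field names below mirror the list, not CEDAR's actual schema:

```python
# A hypothetical structured task specification; field names follow the form
# fields listed above, not CEDAR's internal schema.
from dataclasses import dataclass


@dataclass
class TaskSpec:
    general_instructions: str   # verbosity, number of steps, expected plots
    task_description: str
    data_description: str
    data_location: str          # the data itself stays local; only the path enters context
    metrics: list[str]
    inputs: list[str]
    outputs: list[str]
    special_instructions: str = ""


spec = TaskSpec(
    general_instructions="Be concise; at most 8 steps; include one plot.",
    task_description="Predict passenger survival.",
    data_description="CSV with demographic and ticket columns.",
    data_location="./data/train.csv",
    metrics=["accuracy"],
    inputs=["train.csv"],
    outputs=["predictions.csv"],
)
```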


2. Interleaved Plan-and-Code Workflow

CEDAR generates an enumerated sequence of steps that resembles a Jupyter notebook.

Each step contains:

  • A natural language plan (Markdown)
  • A corresponding Python code block

This separation is subtle but powerful.

Each component has a distinct responsibility:

  • Text block: explains intent, reasoning, and next steps
  • Code block: executes computation locally

By delegating computation to Python instead of the LLM, CEDAR avoids the classic “confidently wrong arithmetic” problem.

More importantly, the workflow becomes auditable.
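
A minimal sketch of the pattern (not CEDAR's implementation): each step pairs a Markdown plan with a code block, and the code runs in a local Python interpreter rather than inside the LLM.

```python
# Each step pairs a Markdown plan with a Python code block; computation is
# delegated to a local interpreter, not to the LLM itself.
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Step:
    plan_md: str   # natural-language intent, rendered as Markdown
    code: str      # executable Python produced for this step


def run_step(step: Step, timeout: int = 60) -> tuple[str, str, int]:
    """Execute the step's code in a fresh local interpreter and capture its output."""
    proc = subprocess.run(
        [sys.executable, "-c", step.code],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.stdout, proc.stderr, proc.returncode


# The CSV path is a placeholder; a missing file simply surfaces as stderr.
step = Step(
    plan_md="**Step 1.** Load the dataset and report its shape.",
    code="import pandas as pd; df = pd.read_csv('./data/train.csv'); print(df.shape)",
)
stdout, stderr, returncode = run_step(step)
```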


3. Three-Agent Routing Architecture

CEDAR uses three LLM agents:

  • Orchestrator: decides what happens next
  • Text Agent: produces human-readable explanations
  • Code Agent: produces executable Python

The orchestrator does not generate free-form content. Instead, it emits structured JSON actions:

  • request_text
  • request_code
  • finish

Each action includes either a spec (for narrative intent) or a purpose (for code execution intent).

This prevents ambiguity between “explain something” and “execute something.”

That separation alone reduces a surprising amount of agent instability.
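
To make the routing concrete, here is a hypothetical dispatch of those actions. The action names and the spec/purpose fields follow the description above; everything else is illustrative:

```python
# Hypothetical dispatch of orchestrator actions; action names and the
# spec/purpose fields come from the description above, the rest is a sketch.
import json


def dispatch(action_json: str, text_agent, code_agent) -> str | None:
    """Route a structured orchestrator action to the appropriate agent."""
    action = json.loads(action_json)
    kind = action["action"]
    if kind == "request_text":
        # Narrative intent: the text agent explains reasoning or next steps.
        return text_agent(action["spec"])
    if kind == "request_code":
        # Execution intent: the code agent produces runnable Python.
        return code_agent(action["purpose"])
    if kind == "finish":
        return None
    raise ValueError(f"Unknown action: {kind}")


example = json.dumps({"action": "request_code",
                      "purpose": "Load train.csv and print summary statistics."})
print(dispatch(
    example,
    text_agent=lambda spec: f"(Markdown block for: {spec})",
    code_agent=lambda purpose: f"# Python block for: {purpose}",
))
```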


4. Local Execution and Privacy-Aware Design

CEDAR executes Python code in Docker containers.

Implications:

  • Data never leaves the local environment
  • Only aggregate statistics or small outputs enter LLM context
  • Network access can be restricted
  • Sensitive enterprise data remains protected

This addresses a major enterprise adoption barrier: compliance and governance.

If you are building AI inside a regulated organization, this design choice is not optional — it is mandatory.
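
As an illustration of the local-first pattern (not CEDAR's actual sandbox configuration), a step's code can be run in a disposable container with networking disabled and the data directory mounted read-only:

```python
# Run one step's code in a throwaway container: no network, read-only data
# mount, memory cap. Image name and paths are illustrative assumptions.
import os
import subprocess


def run_in_sandbox(code: str, data_dir: str = "./data") -> subprocess.CompletedProcess:
    mount = f"{os.path.abspath(data_dir)}:/data:ro"  # data stays local, read-only
    return subprocess.run(
        [
            "docker", "run", "--rm",
            "--network", "none",   # code cannot exfiltrate data
            "--memory", "2g",      # bound resource usage
            "-v", mount,
            "python:3.11-slim",
            "python", "-c", code,
        ],
        capture_output=True, text=True,
    )


result = run_in_sandbox("print(open('/data/train.csv').readline())")
print(result.stdout or result.stderr)
```

Only the captured stdout (or the tail of a traceback) would then be rendered back into the LLM's context, never the raw dataset.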


5. History Rendering: Context as a First-Class Citizen

Here is where CEDAR becomes genuinely interesting.

Instead of dumping the entire notebook history into the LLM context, CEDAR applies controlled rendering rules:

  • Keep full text/code blocks (they are short and important)
  • Include only successful code outputs
  • Include only the head of outputs (most informative)
  • Include only the tail of error traces (most diagnostic)
  • Truncate history beyond a configurable threshold (default 10k characters)

This is context engineering in practice — not theory.

The LLM sees:

  • Enough history to reason
  • Not so much that reasoning collapses

In enterprise AI systems, this kind of disciplined context pruning often matters more than model size.
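
As a sketch, those rendering rules might look like the function below. The 10k-character default comes from CEDAR's description; the head/tail sizes and the choice to keep the most recent history are assumptions.

```python
# A simplified rendering pass over notebook-style history. The rules mirror
# the list above; helper names and head/tail sizes are illustrative.
def render_history(blocks: list[dict], max_chars: int = 10_000,
                   head_lines: int = 20, tail_lines: int = 20) -> str:
    rendered = []
    for block in blocks:
        if block["kind"] in ("text", "code"):
            rendered.append(block["content"])                 # short and important: keep whole
        elif block["kind"] == "output" and block.get("ok", True):
            lines = block["content"].splitlines()
            rendered.append("\n".join(lines[:head_lines]))    # head is most informative
        elif block["kind"] == "error":
            lines = block["content"].splitlines()
            rendered.append("\n".join(lines[-tail_lines:]))   # traceback tail is most diagnostic
        # unsuccessful outputs are dropped entirely
    history = "\n\n".join(rendered)
    return history[-max_chars:]  # truncate to the threshold (here: keep the most recent text)


context = render_history([
    {"kind": "text", "content": "**Step 1.** Load the data."},
    {"kind": "code", "content": "df = pd.read_csv('train.csv'); print(df.shape)"},
    {"kind": "output", "ok": True, "content": "(891, 12)"},
])
```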


Findings — What This Means in Practice

When tested on canonical Kaggle challenges, CEDAR demonstrates the following:

  • Multi-step DS workflow: generated transparently in readable steps
  • Error recovery: iterative code retries handle common bugs
  • Large data handling: works locally without upload limits
  • Model evaluation: metrics computed via executable code
  • Exportability: outputs can be saved as JSON, Markdown, or Jupyter notebooks

The shift in human role is notable:

  • Manual scripting → specification and supervision
  • Debugging pipelines → inspecting structured steps
  • Rewriting boilerplate → evaluating solution quality

In other words, the human becomes a reviewer and strategist, not a typist.

A subtle but important productivity multiplier.


Implications — Why This Architecture Matters for Business

CEDAR is not just a Kaggle toy. It suggests broader architectural lessons for AI deployment.

1. Context Engineering > Model Scaling

Organizations often chase larger models.

CEDAR shows that intelligent context control can unlock significant gains even with existing models.

2. Agent Routing Enables Transparency

Explicit routing between narrative and computation prevents blurred responsibilities.

This is critical for:

  • Auditability
  • Compliance
  • Explainability
  • Governance reviews

In regulated sectors, that is the difference between experimentation and production.

3. Local-First AI Is Strategically Valuable

Cloud-only AI workflows introduce risk:

  • Data exposure
  • Network latency
  • Vendor lock-in

Hybrid architectures — where LLM reasoning and local execution coexist — are more aligned with enterprise risk models.

4. The Human Role Evolves, Not Disappears

CEDAR does not remove the data scientist.

It changes the cognitive workload:

  • From writing repetitive code
  • To designing structured requirements
  • To critiquing whether generated solutions are faithful to the task and the data

That shift aligns with how high-leverage technical roles typically evolve.


Conclusion — Engineering the Invisible Layer

The hype cycle around AI agents often focuses on autonomy.

CEDAR quietly reminds us that autonomy without structure is chaos.

The real frontier is not “more intelligence.” It is disciplined orchestration.

By formalizing structured prompts, separating planning from execution, routing agent responsibilities explicitly, and pruning context intelligently, CEDAR reframes LLM-powered data science as a systems engineering problem.

And that is precisely how it should be treated.

If your organization is serious about operational AI, you will eventually confront the same lesson:

The prompt is not the product. The pipeline is.

Cognaptus: Automate the Present, Incubate the Future.