In enterprise AI automation, the devil isn’t in the details—it’s in the dependencies. As LLM-powered agents gain access to hundreds or thousands of external tools, they face a simple but costly problem: finding all the right tools for the job. Most retrieval systems focus on semantic similarity—matching user queries to tool descriptions—but ignore a crucial fact: some tools can’t work without others.

The result? A task that seems perfectly matched to a retrieved tool still fails, because a prerequisite tool never made it into the context window. Tool Graph Retriever (TGR) aims to solve this by making dependencies first-class citizens in retrieval.

From Similarity Search to Dependency-Aware Retrieval

TGR introduces a three-step process that weighs which tools a tool depends on as heavily as what the tool itself does:

  1. Dependency Identification

    • Defines two key dependency types: result-based (Tool A needs Tool B’s output) and verification-based (Tool A requires Tool B to verify something first).
    • Builds TDI300K, a synthetic + manually annotated dataset for classifying dependencies.
    • Trains a BERT-based discriminator to label tool pairs as A depends on B, no dependency, or B depends on A.
  2. Graph-Based Tool Encoding

    • Turns tools into nodes, dependencies into directed edges.
    • Uses a graph convolutional network (GCN) to propagate dependency signals across tool embeddings, enriching them with context from connected tools.
  3. Online Retrieval

    • Encodes the user query and compares it to these dependency-aware embeddings.
    • Returns top-k tools ranked by cosine similarity.
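
The three steps above can be sketched end-to-end in a few lines. This is a toy illustration, not the paper's implementation: the tool names, embeddings, and dependency edges below are invented, and a single mean-aggregation step stands in for the trained BERT discriminator and learned GCN layers.

```python
import numpy as np

# Toy tool embeddings (rows) — in practice these come from a text encoder.
tools = ["auth", "get_balance", "transfer"]
emb = np.array([
    [1.0, 0.1, 0.0],   # auth
    [0.2, 1.0, 0.1],   # get_balance
    [0.1, 0.8, 1.0],   # transfer
])

# Step 1 (stubbed): directed dependency edges (dependent, prerequisite),
# as a discriminator might label them — e.g. transfer depends on auth.
edges = [("get_balance", "auth"), ("transfer", "auth"), ("transfer", "get_balance")]
idx = {t: i for i, t in enumerate(tools)}

# Step 2: adjacency with self-loops; one row-normalized propagation step
# mixes each tool's embedding with its prerequisites' (a minimal GCN-style layer).
A = np.eye(len(tools))
for dep, pre in edges:
    A[idx[dep], idx[pre]] = 1.0
A = A / A.sum(axis=1, keepdims=True)
enriched = A @ emb                     # dependency-aware embeddings

# Step 3: rank tools against a query by cosine similarity.
def top_k(query_vec, k=2):
    """Return the k tools whose enriched embeddings best match the query."""
    q = query_vec / np.linalg.norm(query_vec)
    e = enriched / np.linalg.norm(enriched, axis=1, keepdims=True)
    return [tools[i] for i in np.argsort(-(e @ q))[:k]]

# A transfer-like query now also carries prerequisite signal, because
# transfer's embedding was mixed with auth's and get_balance's.
print(top_k(np.array([0.1, 0.8, 1.0])))  # → ['transfer', 'get_balance']
```

The key effect to notice is in `enriched`: before propagation, `auth` and `transfer` share little similarity; after it, tools pull their prerequisites' signal into their own vectors, so retrieving one surfaces the others.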

The Numbers Don’t Lie

On API-Bank, pairing TGR with ToolBench-IR boosts the pass rate from 62.4% to 78.8% at top-10 retrieval, a 26% relative improvement. On ToolBench-I1, the same pairing lifts the pass rate from 69.0% to 73.0%.

Even with a less specialized baseline like Paraphrase MiniLM-L3-v2, TGR delivers consistent gains, suggesting that dependency-awareness helps regardless of the underlying retriever.


Why This Matters for AI Agents in Business

In corporate automation, missing a prerequisite tool can mean:

  • A failed transaction in financial systems because the authentication tool wasn’t retrieved.
  • A customer service chatbot stalling when it needs a validation tool before updating user data.
  • A logistics AI skipping a rate-calculation API required before booking shipments.

TGR’s approach mirrors best practices in supply chain management—you wouldn’t schedule assembly without securing the parts. By encoding dependencies, it ensures AI agents retrieve not just the obvious tool, but the whole execution chain.
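
One concrete way to guarantee the whole chain lands in context (a hypothetical helper, not part of the TGR paper) is to walk the dependency graph from each retrieved tool and pull in its transitive prerequisites. The dependency map below is invented for illustration.

```python
# Made-up dependency map: tool -> list of prerequisite tools.
DEPENDS_ON = {
    "book_shipment": ["calculate_rate"],
    "calculate_rate": ["authenticate"],
    "update_user": ["validate_user", "authenticate"],
}

def with_prerequisites(tool):
    """Return the tool plus its transitive prerequisites, prerequisites first."""
    ordered, seen = [], set()
    def visit(t):
        if t in seen:
            return
        seen.add(t)
        for pre in DEPENDS_ON.get(t, []):   # visit prerequisites first
            visit(pre)
        ordered.append(t)                   # then the tool itself
    visit(tool)
    return ordered

print(with_prerequisites("book_shipment"))
# → ['authenticate', 'calculate_rate', 'book_shipment']
```

The depth-first order doubles as an execution order: authentication comes before rate calculation, which comes before booking, exactly the chain the logistics example above needs.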


The Dependency-Density Effect

One of TGR’s most intriguing findings is the graph density effect: the more interconnected the tools in a category, the greater the retrieval improvement. This suggests industries with complex, multi-step workflows (finance, healthcare, logistics) stand to benefit most.


Where This Could Go Next

TGR isn’t perfect: the accuracy of its dependency discriminator caps overall performance, and building the graph requires $O(N^2)$ pairwise classifications. But the roadmap is clear:

  • Smarter, more generalizable dependency classifiers.
  • Faster graph construction via rule-based filtering.
  • Integration with more efficient graph networks.

For enterprises deploying AI agents at scale, this isn’t just an academic improvement—it’s an operational safeguard. In a future where AI agents are juggling hundreds of APIs, dependency-aware retrieval could be the difference between a smooth orchestration and a costly failure.


Cognaptus: Automate the Present, Incubate the Future