Cognaptus DataHub Monitor
A monitored reference page for GitHub repositories surfaced from arXiv-paper digests, rendered from a machine-generated local data file.
Tracked Repositories: 115
Unique Papers: 47
Core Fields: paper title, ref_id, GitHub URL
Refresh Mode: local data file written by backend tasks
Use it as a lightweight index of implementation assets surfaced from research digests. The page stays defensive: required fields remain visible even when optional metadata is missing.
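The defensive behavior described above can be sketched roughly as follows. This is a minimal illustration, not the page's actual code: the key names (`paper_title`, `ref_id`, `github_url`, `description`, `authors`, `tags`), the `(missing)` placeholder, and both helper functions are assumptions introduced for the example.

```python
from dataclasses import dataclass, field

# Hypothetical key names for the three core fields listed on the page.
REQUIRED_FIELDS = ("paper_title", "ref_id", "github_url")


@dataclass
class RepoEntry:
    paper_title: str
    ref_id: str
    github_url: str
    description: str = ""
    authors: list = field(default_factory=list)
    tags: list = field(default_factory=list)


def normalize_entry(raw: dict) -> RepoEntry:
    """Build an entry that always carries the three core fields."""
    core = {}
    for name in REQUIRED_FIELDS:
        value = raw.get(name)
        # A visible placeholder keeps the field on the page when the value is absent.
        core[name] = value.strip() if isinstance(value, str) and value.strip() else "(missing)"
    description = raw.get("description")
    authors = raw.get("authors")
    tags = raw.get("tags")
    return RepoEntry(
        **core,
        description=description if isinstance(description, str) else "",
        authors=list(authors) if isinstance(authors, (list, tuple)) else [],
        tags=list(tags) if isinstance(tags, (list, tuple)) else [],
    )


def render_entry(entry: RepoEntry) -> str:
    """Render core fields unconditionally; optional fields only when present."""
    lines = [f"{entry.paper_title} [{entry.ref_id}]", entry.github_url]
    if entry.description:
        lines.append(entry.description)
    if entry.authors:
        lines.append("Authors: " + ", ".join(entry.authors))
    if entry.tags:
        lines.append("Tags: " + ", ".join(entry.tags))
    return "\n".join(lines)
```

Under these assumptions, a record that carries only the three core fields renders as two lines, and a record with a missing title still renders with the placeholder in its place rather than being dropped.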
This page is designed as a refreshable reference surface rather than a hand-maintained article.
The goal is simple: when paper-digest workflows identify linked GitHub repositories, keep them visible in one place with enough context to scan quickly and revisit later.
Search by paper title, arXiv reference, GitHub repository, author, or tags when those fields are available.
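A search of this kind could be sketched as a case-insensitive substring match across whichever fields a record actually carries. The field names and the `search_entries` helper below are hypothetical; the page's real search logic is not shown in the source.

```python
# Hypothetical key names for the searchable fields named above.
SEARCHABLE_FIELDS = ("paper_title", "ref_id", "github_url", "authors", "tags")


def search_entries(entries: list, query: str) -> list:
    """Return entries where any available searchable field contains the query."""
    needle = query.strip().casefold()
    if not needle:
        # An empty query leaves the full index visible.
        return list(entries)
    matches = []
    for entry in entries:
        haystack = []
        for name in SEARCHABLE_FIELDS:
            value = entry.get(name)
            if isinstance(value, str):
                haystack.append(value)
            elif isinstance(value, (list, tuple)):
                # Authors and tags are assumed to be lists of strings.
                haystack.extend(str(item) for item in value)
            # Missing or null fields are skipped rather than treated as errors.
        if any(needle in text.casefold() for text in haystack):
            matches.append(entry)
    return matches
```

Because absent fields are simply skipped, a record without authors or tags can still be found by its title, reference, or repository URL.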
An Awesome Data Agents repository linked directly in the paper header, likely used to collect or organize data-agent resources associated with the survey.
JoyAgent is discussed as a proto-L3 system that begins to address predefined-toolset limitations through tool evolution and multi-level thinking.
GitHub repository established by the authors as the project page associated with the survey on embodied learning for object-centric robotic manipulation.
GPT Engineer, a software-development agent implementation cited in the engineering application survey and open-source project discussion.
GPT Researcher, an experimental application that uses LLMs for research-question development, web crawling, source summarization, and aggregation.
AI Legion, an LLM-agent implementation cited in the survey's open-source library and reference set.
LoopGPT, an LLM-agent implementation cited in the survey's open-source library and reference set.
AGiXT, an agent framework implementation cited in the survey as a dynamic AI automation platform.
DemoGPT, a software-development agent repository cited in the engineering application survey and open-source project discussion.
MiniAGI, an LLM-agent implementation cited in the survey's open-source library and reference set.
AgentVerse, a multi-agent collaboration framework referenced among surveyed agent systems and open-source libraries.
AgentGPT, an LLM-based autonomous-agent system cited in the survey's open-source library and reference set.
Auto-GPT, an autonomous LLM-agent implementation included in the construction taxonomy and open-source library discussion.
SmolModels/developer-style agent repository cited as a software engineering application artifact.
WorkGPT, a workflow-oriented LLM-agent framework cited as similar to AutoGPT and LangChain.
SuperAGI, an autonomous-agent framework cited in the survey's open-source library and reference set.
XLang, an LLM-agent/tool-use framework cited as supporting executable language grounding and interaction with databases, web applications, and physical robots.
Repository for the AGENT KB cross-framework agent memory system introduced and evaluated in the paper.
Repository for the Agent Mentor / Agent Analytics open-source observability and analytics platform for agentic AI applications.
GitHub path identified by the paper as the code corresponding to the analytics pipeline used for semantic feature analysis.
Repository for the Agent-as-a-Judge project and DevAI-related evaluation artifacts.
GPT-Pilot is one of the three open-source code-generation agentic systems benchmarked in the paper.
Google's Agent-to-Agent Protocol repository, referenced as the source for A2A, one of the modern agent communication protocols compared in the paper.
GitHub Gist containing the Claude Code implementation prompt for the case-summarization-by-file-name microservice.
GitHub Gist containing the Case Summarization by Given Case Name Workflow pitch generated by the Planning Agent.
BabyAGI is used as a representative agentic framework showing how LLMs can be embedded in feedback loops to plan, act, adapt, and manage or prioritize subtasks.
Repository stated by the paper as the public code and data release for AirQA.
Repository reported by the authors as containing the code and data for Ask an Expert / BBMHReasoning experiments.
Author-referenced sample dataset of synthetic identification-document images covering five document types.
Author-referenced repository for generating synthetic document images used in the document identification and information extraction experiment.
The paper's AutoGen framework repository for building LLM applications via multi-agent conversations.
Repository for the ADAS codebase introduced by the paper, including the Meta Agent Search implementation and experimental framework.
AgentGPT is analysed as a general-purpose autonomous LLM-powered multi-agent system with user-guided alignment in selected aspects such as decomposition, agent generation, and resource utilization.
Auto-GPT is analysed as a general-purpose autonomous LLM-powered multi-agent system with autonomous goal decomposition, task action management, and resource utilization.
SuperAGI is analysed as a general-purpose autonomous LLM-powered multi-agent system with some user-guided alignment options for agent-related and resource-related aspects.
BabyAGI is analysed as a general-purpose autonomous LLM-powered multi-agent system with a profile similar to Auto-GPT across many assessed aspects.
An aggregated dataset of chess opening names and move sequences used by the paper to create opening-position concept datasets.
A GitHub repository listed by the paper as an accompanying curated resource for papers on code as agent harness.
Repository for the code, human-subject study materials, results, and supplementary materials associated with the Persona framework and AAAI 2025 paper.
Semantic routing package for routing inputs by embedding or intent similarity.
Framework using repeated generations, verification prompts, and confidence estimates to decide whether to escalate to larger models.
AWS multi-agent orchestration framework that includes prompt-based routing or agent selection patterns.
Implementation associated with routing prompts to pre-trained experts after fine-tuned meta-model categorisation.
Implementation associated with deciding whether a query requires a complex prompting strategy.
Iterative multi-agent code generation system using execution success as a routing signal.
LLM routing implementation associated with assessing model adequacy through multiple responses and ground-truth comparison.
Routing-agent implementation using synthetic data and small classifiers for classification-based routing.
Orchestrator implementation using decoder-only LLM representations for routing or model selection.
Framework for serving and evaluating routers that choose between LLMs using preference-oriented routing strategies.
Task-planning framework in which an LLM selects among models or tools based on descriptions and user tasks.
Implementation assessing consistency across reasoning representations for cascade-style routing.
OpenAI multi-agent orchestration framework discussed as an example of prompt-based routing practice.
Fine-tuned model framework for API call generation, discussed as treating routing as a code generation problem.
Framework for reducing LLM application cost using LLM cascades and related strategies.
Adaptive RAG framework that routes among no retrieval, single-step retrieval, and multi-step retrieval paths according to query complexity.
Code and data for a multi-LLM routing benchmark and evaluation framework.
Repository for EmbedLLM materials, described by the paper as containing the dataset, code, and embedder for further research and application.
Repository stated by the paper for the modular Python prototype that implements the neuro-symbolic ontology-based LLM validation pipeline.
Repository associated with the paper's contamination-detection work for LLM evaluation. The paper links it as code and data; the currently visible README describes a lightweight tool for identifying and analysing potential contamination without access to LLM training data.
Repository containing code and technical details for the Multi-Agent Scoring System for essay assessment.
Repository containing the benchmark data, task files, representative logs, and evaluation scripts for AutoGen, MetaGPT, and TaskWeaver.
Repository named after the paper 'From Large AI Models to Agentic AI: A Tutorial on Future Intelligent Communications' and described on GitHub as the paper's code repository.
BeeAI is described as the experimental platform central to IBM's ACP, supporting local-first orchestration, agent discovery, REST endpoints, SDKs, telemetry, and multi-agent execution.
The MCP servers repository is cited as an ecosystem of reference and integration servers for file management, databases, Google Drive, Git, GitHub, GitLab, Slack, Google Maps, image generators, and search APIs.
OpenAI Swarm is reviewed as a lightweight, stateless abstraction for multi-agent systems with agent definitions, dynamic handoffs, context management, direct function calling, streaming, and backend flexibility.
TheAgentCompany repository contains sandboxed work environments, task directories, evaluators, task instructions, and supporting files for many task instances listed in the paper's appendix task table.
GitHub directory linked by the paper for the deterministic prediction task source code, including Pauli string multiplication, divide-and-conquer, letter replacement, and addition-related files.
Repository for the FastAPI control server, callback-augmented interactive trainer, React/TypeScript dashboard, examples, and LLM-based tuning demonstration.
Open-source CAMEL framework for autonomous cooperation among communicative agents using inception prompting and role-play.
Open-source multi-agent collaborative framework associated with MetaGPT, discussed as a representative framework that embeds human workflow processes and SOPs into language-agent collaboration.
Open-source AutoGen framework for creating LLM applications using customizable agents that can be programmed through natural language and code.
Author-maintained repository for tracking LLM-based multi-agent papers and organizing them into streams such as frameworks, orchestration and efficiency, problem solving, world simulation, datasets, and benchmarks.
CAMEL is an open-source multi-agent framework for role-playing and agent collaboration.
Code repository for improving factuality and reasoning through multi-agent debate.
MetaGPT is a multi-agent framework that models a software company using role assignments and SOP-style workflows.
AutoAgents generates different roles for GPTs to form a collaborative entity for complex tasks.
Microsoft AutoGen, a framework for building multi-agent AI applications.
Code repository for Solo Performance Prompting / multi-persona self-collaboration.
AgentVerse provides task-solving and simulation frameworks for multiple LLM-based agents.
ChatDev implements LLM-powered multi-agent collaboration for software development.
Repository connected to AI Scientist-generated papers reported as having passed peer review at an ICLR workshop.
Code repository for MAD, a multi-agent debate framework using large language models.
Sibyl System repository included in the selected AutoGen application sample.
AutoTx repository for planning and executing on-chain transactions.
GPT-Academic repository for LLM-assisted academic reading, writing, translation, and code/project analysis workflows.
Composio platform/repository evaluated as a flexible agent application or platform with multiple autonomy-related configurations.
h2oGPT repository for private local GPT-style chat and document interaction.
GraphRag_Ollama repository combining AutoGen, GraphRAG, Ollama, and related tooling.
Langflow platform for building and deploying AI-powered agents and workflows.
Letta platform for stateful agents with advanced memory.
AutoGen open-source framework for building AI agent systems using language models, multi-agent conversations, and tool use.
AutoGen Studio application within the AutoGen repository.
Dream Team repository for building a team of AI agents with AutoGen.
GitHub source for the multiple-choice TruthfulQA variant used in the model-level ranking experiment.
CHALE repository used as a hallucination-evaluation dataset with non-hallucinated, half-hallucinated, and hallucinated answer categories.
Repository path for Agent Spec runtime adapters that translate Agent Spec components into framework-specific equivalents for popular agentic frameworks.
WayFlow is presented as the paper's reference runtime for executing Agent Spec components, including native support for Agent Spec Agents and Flows.
Library of ADMM applications for sparse and low-rank optimization used to test NewADMM.
Huawei Cloud VM-placement traces used in the cloud resource scheduling case study.
Repository for the ICLR 2023 ReAct prompting paper, including data, prompts, HotpotQA, FEVER, ALFWorld, and WebShop notebooks, plus Wikipedia environment wrappers.
Repository for serving, training, and evaluating LLM routers, including router types corresponding to the paper such as matrix factorization, similarity-weighted ranking, BERT, causal LLM, and random routing.
Repository reported by the paper for the AtomicTranslation code used in the language-to-logic translation experiments.
A curated repository associated with the paper that organizes efficient architecture papers according to the survey's categories.
Repository for the BESSTIE sentiment and sarcasm classification benchmark for varieties of English.
Repository for the InstruSum instruction-controllable summarization dataset referenced and used as an evaluation target in the paper.
Companion repository that organizes works on LLM-agent evaluation according to the survey's structure and tracks papers, benchmarks, methodologies, and frameworks.
Framework for evaluating and optimizing agents and models in container environments, discussed as part of emerging standardized cross-environment agent evaluation.
LangChain AgentEvals package for evaluating agent trajectories, including trajectory matching and graph-based evaluation.
HAL harness for centralized and reproducible evaluation across agent benchmarks.
Repository for SWE-agent, the LM-based agent system that attempts to fix GitHub issues using an agent-computer interface and configurable tools.
Repository linked by the paper for LLM uncertainty decomposition, with folders and scripts related to input uncertainty, decoding uncertainty, model uncertainty, data, models, utilities, and uncertainty scoring.
A GitHub repository collecting papers related to LLM-based agents, linked by the survey as a related-papers resource.
Repository containing code for Using Non-Expert Data to Robustify Imitation Learning via Offline Reinforcement Learning, including simulation pipeline, scripts, installation instructions, and training/evaluation commands.
Repository identified as the code for 'When Routing Collapses: On the Degenerate Convergence of LLM Routers'.
Public repository containing the cost-aware LLM routing system, training/data-preprocessing components, evaluation and serving pipeline, router tests, and documentation.