TL;DR for operators
SMARTAPS is not another chatbot sprinkled over enterprise software like parsley on a mediocre buffet. It is a tool-augmented interface for advanced planning systems: planners ask natural-language questions, the system detects the planning intent, retrieves the right expert-built API, extracts the necessary parameters, runs the tool, and turns the raw result into a readable answer.[1]
The operational promise is narrower, and therefore more useful, than “AI will optimise your supply chain.” SMARTAPS does not replace the optimisation model, the APS, or the OR consultant. It reduces the number of routine planning questions that must wait for a consultant to manually run analysis. In the paper’s production-planning case, planners reported that delay diagnosis and what-if analysis could move from a 1–2 day consultant-dependent cycle to a few hours.
That matters because APS value is often trapped behind interface and expertise costs. Companies may own powerful planning software but still depend on specialists to interrogate it. SMARTAPS attacks that bottleneck by making recurring analysis conversational without removing the underlying discipline.
The catch is simple: tools must exist before the LLM can use them. Unsupported queries, slow optimisation runs, multi-planner conflicts, and new API creation still require serious engineering. The system is best understood as a usability layer over curated OR capability, not as a silicon consultant with a charming accent.
The real planning problem is not optimisation. It is access.
A planner does not usually wake up wanting a “large language model experience.” A planner wakes up needing to know why a customer order is late, whether a machine outage breaks next week’s production schedule, or what happens if a material shipment arrives earlier than expected. These are not philosophical questions. They are tomorrow-morning-meeting questions.
Advanced planning systems already know a great deal about these situations. They store customer orders, bills of materials, inventory levels, equipment constraints, production calendars, and optimisation models. They can generate plans, compare alternatives, and visualise schedules. Yet using them well often requires domain expertise, custom configuration, and help from operations research consultants.
That is the practical gap SMARTAPS targets. The paper’s authors describe APS adoption as constrained not only by software cost but by maintenance, customisation, and reliance on expert consultants. The irritating part is that many planner questions are not novel research problems. They are recurring operational questions with slightly different parameters. Still, in many organisations, they travel through the same expert bottleneck.
SMARTAPS reframes the LLM’s role. The model is not asked to invent an optimisation model from a vague instruction, which would be brave in the same way that jumping from a balcony with an umbrella is brave. Instead, it acts as a mediator between human planning language and a catalogue of trusted tools.
SMARTAPS works because the LLM is not the solver
The mechanism is the paper’s most important contribution. SMARTAPS contains three main modules: a conversation manager, a tool retriever, and a tool manager. Each performs a different role in converting a planner’s request into an executable planning operation.
| Module | What it does | Operational consequence |
|---|---|---|
| Conversation manager | Detects whether the user is making an operations-planning request or continuing ordinary conversation; refines raw tool output into context-aware language | Prevents every message from triggering unnecessary tool calls and makes tool outputs readable |
| Tool retriever | Uses embeddings to match the user query against a catalogue of API contracts | Converts natural-language intent into a candidate planning tool |
| Tool manager | Identifies the relevant model and APS data, extracts required parameters, asks for missing inputs where needed, and executes the tool | Turns the chosen tool into an actual planning action |
This architecture matters because it keeps the LLM away from the part of the workflow where hallucination is most expensive. The system does not ask Mistral-7B to calculate an optimal production plan from scratch. It uses Mistral-7B for intent detection, parameter extraction, and response refinement; it uses BGE-LARGE-EN-V1.5 and ChromaDB for tool retrieval; and it relies on an optimisation engine, Huawei Cloud’s OptVerse AI Solver, for solver-backed planning analysis.
The paper’s core design principle is therefore containment. Let the LLM understand messy human language. Let the solver solve. Let the API contract define the handoff. This is less glamorous than “autonomous agentic planning,” but it has the small advantage of being plausible.
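To make the containment idea concrete, here is a minimal Python sketch of the three-module flow. It is an illustration under loud assumptions, not the paper's implementation: the keyword check stands in for the LLM intent detector, trigger words stand in for embedding retrieval, the `SO-1234` order-ID format is invented, and the `why_not` function returns hard-coded toy data where a real tool would call a solver endpoint.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Tool:
    """Hypothetical stand-in for an expert-built planning tool."""
    name: str
    triggers: Tuple[str, ...]   # toy substitute for embedded example queries
    required: Tuple[str, ...]   # parameters the tool cannot run without
    run: Callable[[Dict[str, str]], Dict[str, str]]
    template: str               # natural-language output template


def detect_planning_intent(message: str) -> bool:
    # Conversation manager, step 1. An LLM does this in SMARTAPS;
    # a keyword check stands in so the sketch runs offline.
    text = message.lower()
    return any(word in text for word in ("order", "plan", "delay", "schedule"))


def retrieve_tool(message: str, catalogue: Tuple[Tool, ...]) -> Optional[Tool]:
    # Tool retriever. Trigger words stand in for embedding similarity.
    text = message.lower()
    for tool in catalogue:
        if any(trigger in text for trigger in tool.triggers):
            return tool
    return None


def extract_params(message: str) -> Dict[str, str]:
    # Tool manager, parameter extraction. "SO-1234" is an invented ID format.
    match = re.search(r"SO-\d+", message)
    return {"order": match.group(0)} if match else {}


def why_not(params: Dict[str, str]) -> Dict[str, str]:
    # Placeholder for a solver-backed analysis; the cause is toy data.
    return {"order": params["order"], "cause": "a material shortage"}


CATALOGUE = (
    Tool(name="why_not", triggers=("why", "delay", "late"), required=("order",),
         run=why_not, template="Order {order} is late because of {cause}."),
)


def handle_message(message: str, catalogue: Tuple[Tool, ...] = CATALOGUE) -> str:
    if not detect_planning_intent(message):
        return "(chat) " + message              # small talk never triggers a tool
    tool = retrieve_tool(message, catalogue)
    if tool is None:
        return "No supported tool for this request."
    params = extract_params(message)
    missing = [p for p in tool.required if p not in params]
    if missing:
        return "Please provide: " + ", ".join(missing)   # ask, do not guess
    result = tool.run(params)                   # the tool computes, not the LLM
    return tool.template.format(**result)       # conversation manager, step 2
```

Calling `handle_message("Why is order SO-1042 delayed?")` walks the whole path, while `handle_message("Why is my order delayed?")` stops and asks for the missing order. The language layer never produces the answer itself; it only decides which bounded function gets to.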
The API contract is where the magic is quietly removed
The system’s most business-relevant artefact is not the chat window. It is the tool contract.
Each tool in SMARTAPS has metadata describing what it does, examples of natural-language queries that should retrieve it, a natural-language output template, a callable function, and input/output schemas. The description and examples are embedded for retrieval. The function call tells the system what to execute. The schemas tell the system what parameters are required and what kind of result should come back.
That contract turns a vague user request into a bounded operation. “Can we produce this order on time?” becomes a structured call to a why-not or feasibility-related tool. “What happens if 100 kg of natural rubber arrives on 2024-04-17?” becomes a what-if analysis over APS data. “How many more tires are produced in the new plan?” becomes a comparison between generated plans.
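A tool contract of this kind might look like the following Python sketch. The field names mirror the components listed above (description, examples, output template, callable function, input and output schemas), but the class itself, the `what_if_material_arrival` function, and its parameter names are hypothetical. The toy function only echoes the scenario back; a real tool would re-run the plan against APS data.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ToolContract:
    """Hypothetical contract shape; field names are assumptions."""
    description: str                     # embedded for retrieval
    examples: List[str]                  # embedded for retrieval
    output_template: str                 # natural-language answer template
    function: Callable[..., Dict[str, Any]]
    input_schema: Dict[str, type]        # required parameter names and types
    output_schema: Dict[str, type]       # expected result fields and types

    def missing_inputs(self, params: Dict[str, Any]) -> List[str]:
        # Parameters the system still has to ask the planner for.
        return [name for name in self.input_schema if name not in params]

    def call(self, params: Dict[str, Any]) -> str:
        missing = self.missing_inputs(params)
        if missing:
            raise ValueError(f"missing inputs: {missing}")
        for name, expected in self.input_schema.items():
            if not isinstance(params[name], expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        result = self.function(**{k: params[k] for k in self.input_schema})
        for name, expected in self.output_schema.items():
            if not isinstance(result.get(name), expected):
                raise TypeError(f"tool returned a bad {name}")
        return self.output_template.format(**result)


def what_if_material_arrival(material: str, quantity_kg: int,
                             arrival_date: str) -> Dict[str, Any]:
    # Toy body: echoes the scenario instead of re-planning.
    return {"scenario": f"{quantity_kg} kg of {material} arriving on {arrival_date}"}


contract = ToolContract(
    description="What-if analysis for a change in material arrival.",
    examples=["What happens if 100 kg of natural rubber arrives on 2024-04-17?"],
    output_template="What-if scenario accepted: {scenario}.",
    function=what_if_material_arrival,
    input_schema={"material": str, "quantity_kg": int, "arrival_date": str},
    output_schema={"scenario": str},
)
```

With only `{"material": "natural rubber"}` extracted, `contract.missing_inputs(...)` reports the quantity and date that must still be requested. The schema checks on both sides of the call are what make the operation bounded: the tool either receives well-typed inputs or does not run.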
The paper groups the available tool types into five categories:
| Tool category | Planning question it supports | What changes in the workflow |
|---|---|---|
| Query plan | “What is currently scheduled?” | The planner interrogates the existing plan without navigating complex screens |
| Why-not | “Why can’t this requirement be satisfied?” | The planner investigates infeasibility or delay causes |
| What-if | “What changes if data or assumptions change?” | Scenario analysis becomes conversational |
| Compare plan | “How does this plan differ from that one?” | Alternatives can be interpreted without manual spreadsheet archaeology |
| Display plan | “Show me the plan or chart.” | The interface returns tables, plots, or visual summaries |
The important point is that SMARTAPS does not make planning knowledge disappear into a black box. It relocates planning expertise into reusable tools. OR consultants still define the logic. The LLM makes that logic easier to reach.
This is where many enterprise AI projects either become useful or become expensive theatre. If the organisation has no well-defined APIs, no stable planning data, and no agreement about what each analysis should mean, a chat interface merely gives confusion a friendlier voice. SMARTAPS assumes the opposite: expert-defined planning capabilities are available and can be packaged into tool contracts.
Retrieval is control flow, not document search
Most business readers now associate retrieval with RAG: pull a few documents from a knowledge base, paste them into a prompt, ask the model to answer. SMARTAPS uses retrieval differently. It retrieves a tool, not a paragraph.
This distinction is small in wording and large in consequence. Retrieving a document helps the model talk about knowledge. Retrieving a tool helps the system act on an operational model. In SMARTAPS, the user query is embedded and compared against the embedded descriptions and examples of available API contracts. The closest match becomes the selected tool.
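The selection step can be sketched in a few lines of Python. This is deliberately a toy: word-count vectors and a plain loop stand in for the dense embedding model and vector database a real deployment would use, and the three-tool catalogue is invented from the example questions in this article. Only the shape of the logic (embed the query, score it against each tool's examples, pick the closest) is meant to carry over.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    # Toy embedding: bag of lowercase word counts (an assumption, not BGE).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical catalogue: tool name -> example queries from its contract.
CATALOGUE: Dict[str, List[str]] = {
    "why_not": ["Why can't this requirement be satisfied?",
                "Why is this order delayed?"],
    "what_if": ["What changes if data or assumptions change?",
                "What happens if a material shipment arrives earlier?"],
    "compare_plan": ["How does this plan differ from that one?",
                     "How many more tires are produced in the new plan?"],
}


def retrieve(query: str) -> Tuple[str, float]:
    """Return the closest tool and its similarity score."""
    q = embed(query)
    scored = [
        (max(cosine(q, embed(example)) for example in examples), name)
        for name, examples in CATALOGUE.items()
    ]
    score, name = max(scored)   # highest similarity wins
    return name, score
```

Here `retrieve("What happens if 100 kg of natural rubber arrives on 2024-04-17?")` selects `what_if`, because that query shares the most wording with the what-if examples. The output is a tool name that determines what runs next, which is why this retrieval is control flow rather than document search.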
The paper says this retrieval method was tested on 150 annotated user-query instances for the case study. That is likely intended as implementation evidence for the tool retriever rather than the main business evidence. However, the accessible HTML and PDF renderings reference “Table 1” without exposing the table values. So the safe interpretation is not “retrieval accuracy is X%”; it is “the authors built and evaluated a semantic tool-selection layer, but the accessible paper version does not provide usable numeric results for that table.”
That matters. The production-planning story should not be oversold as a statistically rich benchmark. The stronger claim is architectural: a retriever can map planner language to curated planning actions. The weaker claim is quantitative: we do not have enough visible table detail to judge retrieval performance from the public rendering alone.
The evidence is a realistic case study, not a tournament leaderboard
The paper’s evidence comes from a production-planning case at Huawei, built around workflows observed with supply-chain planners and OR consultants. The authors identify a common pain point: planners rely heavily on OR consultants to perform analyses, especially for diagnosing customer-order production delays and finding possible resolutions.
SMARTAPS was deployed in a realistic production-planning scenario and given to planners. The reported feedback is positive. Users said the system made plan queries more efficient and helped them identify reasons for production delays more readily. They especially valued why-not and what-if analysis. The paper reports that this could reduce analysis turnaround from a consultant-dependent 1–2 days to a few hours.
That is meaningful, but it is not a controlled productivity study with randomised teams, measured throughput, and audited decision quality. The likely purpose of the case study is demonstration and user validation: does the system address a real workflow bottleneck, and do planners find the interaction useful? On that question, the evidence supports cautious interest.
| Paper element | Likely purpose | What it supports | What it does not prove |
|---|---|---|---|
| Figure 1 interface | Implementation detail and usability demonstration | The system can expose tool execution steps and task history through a planner-facing UI | That planners make better decisions than with existing tools |
| Figure 2 framework | Main mechanism | The architecture separates conversation, retrieval, tool execution, APS data, and solver endpoints | That this exact architecture is optimal |
| Figure 3 API contract | Implementation detail | Tool contracts provide the bridge between natural language and callable planning functions | That tool creation is cheap or automatic |
| Figures 4–6 module diagrams | System design explanation | Intent detection, response refinement, retrieval, model identification, and parameter extraction are separated | That each component performs reliably under all planning conditions |
| Production-planning case | Main evidence | Planners reported faster diagnosis and scenario analysis, with potential turnaround reduction from 1–2 days to hours | Generalisable ROI across industries, sites, or APS vendors |
| Referenced Table 1 | Intended retrieval evaluation | The authors considered tool-retrieval performance on annotated queries | Exact retrieval accuracy, because the visible public version does not expose the table values |
The distinction is not academic fussiness. It changes how a company should pilot the idea. SMARTAPS should be evaluated as a workflow-acceleration layer, not as a universal decision engine. The pilot metric should not be “Does the AI sound smart?” It should be “How many recurring consultant-mediated analyses can planners now complete safely on their own?”
The business value is cheaper diagnosis, not cheaper thinking
The obvious business pitch is labour saving. The better one is latency reduction.
When a planner waits 1–2 days for an OR consultant to investigate a delay or run a scenario, the cost is not only consultant time. It is decision waiting time. Production plans decay as new information arrives from sales, warehouse, logistics, maintenance, and procurement. A decision that arrives tomorrow may still be correct mathematically and late operationally. Wonderful. The spreadsheet has achieved moral victory while the shipment misses its slot.
SMARTAPS compresses the interaction loop. The planner can ask follow-up questions, test assumptions, compare plans, and inspect results while the context is still active. That changes planning from a batch service request into an exploratory conversation with the APS.
Cognaptus’ practical inference is that the first ROI layer is not full automation. It is reducing avoidable expert handoffs in recurring analysis. This is especially relevant where planners repeatedly ask variants of the same questions:
- Why is this order delayed?
- Which constraint is binding?
- What if a material shipment arrives earlier?
- What if a plant, line, or machine is unavailable?
- How does the revised plan differ from the baseline?
- Which relaxation would make the requirement feasible?
These questions still require formal planning logic. SMARTAPS simply makes that logic accessible at the moment of need.
The misconception: this is not an OR consultant replacement machine
The tempting interpretation is that SMARTAPS makes the consultant unnecessary. The paper says almost the opposite, though politely, as papers do when they do not want to ruin the product demo.
SMARTAPS depends on OR consultants to create tools and API contracts. It depends on deployed solver endpoints. It depends on APS data and model identification. It depends on the tool manager being able to extract or infer required inputs. When a request is unsupported, the system cannot simply manifest a correct planning tool through inspirational prompt engineering. The authors describe a human-in-the-loop module for generating and recommending new tools when unsupported queries arise.
That is the right division of labour. Consultants should not spend their time repeatedly answering routine variants of known analyses. But they are still needed to design the analyses, encode assumptions, validate solver behaviour, and expand the tool catalogue. SMARTAPS reduces the dependency surface; it does not abolish expertise.
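One way to picture the unsupported-query path is a router that refuses weak matches and queues them for experts. The sketch below is an assumption about how such routing could work, not a description of the paper's human-in-the-loop module: the similarity cut-off, the `Router` class, and the backlog list are all hypothetical, and the threshold value would need tuning against real queries.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Router:
    min_score: float = 0.35                      # assumed cut-off, needs tuning
    backlog: List[str] = field(default_factory=list)

    def route(self, query: str,
              candidates: List[Tuple[str, float]]) -> Optional[str]:
        """Return the best tool name, or None after logging the query."""
        if candidates:
            name, score = max(candidates, key=lambda c: c[1])
            if score >= self.min_score:
                return name
        # No tool is close enough: hand the request to OR consultants
        # instead of letting the language model improvise an answer.
        self.backlog.append(query)
        return None
```

The backlog is the useful by-product. It tells consultants which analyses planners keep asking for and the catalogue does not yet cover, which is a better tool-development roadmap than guessing.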
The replacement story also misses a subtler point: planners are not passive users. In a real planning environment, they carry local knowledge about customer urgency, workaround feasibility, supplier reliability, and organisational priorities. A better interface lets them interrogate the plan more directly. It does not remove judgement from the room. Operations has enough invisible assumptions already.
The implementation boundary is where pilots will succeed or fail
SMARTAPS has three boundaries that matter for deployment.
First, the tool catalogue defines the possible action space. If a planner asks for analysis outside the available APIs, the system reaches its limit. That is not a defect; it is the price of grounding. But it means implementation should start by mapping high-frequency planner questions and building a small set of robust tools around them. Trying to cover every possible planning conversation on day one is how enterprise pilots become museum exhibits.
Second, solver time does not vanish. The paper notes that large optimisation runs may still take long enough that consultants submit jobs overnight and review the plan the next morning. A chat interface can make it easier to request and monitor such jobs, but it cannot repeal computational complexity. Future task management and parallel job tracking would be necessary for production use.
Third, the current system is framed as a one-user conversation. Real operations planning is multi-user and political in the dull but important sense: different planners optimise for different objectives. Sales wants customer promises met. Production wants feasible schedules. Procurement worries about materials. Logistics worries about movement. Finance would like all of this to cost less, naturally. A single-user assistant cannot fully model that negotiation. Multi-user planning support is not decoration; it is part of the enterprise problem.
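The second boundary, solver time, is the one most amenable to plain engineering. As a rough sketch of what task management and parallel job tracking could look like, the Python below submits long-running jobs to a worker pool and lets the chat layer poll their status. Everything here is hypothetical: `solve_plan` merely sleeps where a real job would call a solver endpoint, and the `JobTracker` class is an illustration rather than anything described in the paper.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict


def solve_plan(scenario: str, seconds: float) -> str:
    # Placeholder for a long optimisation run; sleep stands in for solver time.
    time.sleep(seconds)
    return f"plan for {scenario}"


class JobTracker:
    """Hypothetical tracker so a chat session need not block on the solver."""

    def __init__(self, workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, job_id: str, scenario: str, seconds: float) -> None:
        # Returns immediately; the job runs in a background thread.
        self._jobs[job_id] = self._pool.submit(solve_plan, scenario, seconds)

    def status(self, job_id: str) -> str:
        return "done" if self._jobs[job_id].done() else "running"

    def result(self, job_id: str) -> str:
        return self._jobs[job_id].result()   # blocks until the job finishes

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)
```

A planner could submit a baseline and a what-if scenario side by side, keep asking quick plan queries, and come back when `status` reports `done`. The solver is no faster, but the waiting stops being a reason to hand the request to someone else.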
What Cognaptus would watch in a real deployment
For a company evaluating this architecture, the first question is not whether the LLM is impressive. The first question is whether the organisation has enough structured planning capability for an LLM to orchestrate.
A practical pilot should answer four questions.
| Pilot question | Why it matters |
|---|---|
| Which planner questions recur every week? | These are candidates for tool contracts and early ROI |
| Which analyses currently require consultant intervention? | This identifies the handoff bottleneck SMARTAPS is meant to reduce |
| Which outputs require formal solver calls versus simple plan queries? | This separates fast conversational wins from long-running optimisation jobs |
| Which decisions involve multiple stakeholders? | These reveal where single-user chat support will be insufficient |
The expected sequence is not “install chatbot, become autonomous.” A more realistic sequence is: catalogue recurring planning questions, encode them as tools, connect them to APS data and solver endpoints, validate output schemas, expose the chat interface, and then track how many consultant requests are avoided or shortened.
That may sound less exciting than “AI agent transforms supply chain.” Good. Excitement is not a planning metric.
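The tracking step at the end of that sequence is simple enough to sketch. The Python below assumes a hypothetical request log in which each entry records who answered the question and how long it took; the field names and the sample figures are invented for illustration and are not results from the paper.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List, Optional


@dataclass
class Request:
    question_type: str       # e.g. "why_not", "what_if"
    handled_by: str          # "planner" (self-served) or "consultant"
    hours_to_answer: float


def pilot_summary(log: List[Request]) -> Dict[str, Optional[float]]:
    if not log:
        raise ValueError("empty request log")
    self_served = [r.hours_to_answer for r in log if r.handled_by == "planner"]
    escalated = [r.hours_to_answer for r in log if r.handled_by == "consultant"]
    return {
        # Share of requests planners completed without a consultant.
        "self_service_rate": len(self_served) / len(log),
        "median_hours_self_served": median(self_served) if self_served else None,
        "median_hours_escalated": median(escalated) if escalated else None,
    }


# Invented sample log, for illustration only.
sample_log = [
    Request("why_not", "planner", 2.0),
    Request("what_if", "planner", 4.0),
    Request("what_if", "consultant", 30.0),
    Request("query_plan", "planner", 3.0),
]
summary = pilot_summary(sample_log)
```

Splitting the same summary by `question_type` would show which tool categories actually absorb consultant requests and which still escalate. That is the evidence a pilot review should ask for.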
A useful pattern for enterprise AI: language in front, tools underneath
SMARTAPS is most valuable as a reference pattern. It shows how LLMs can help in operational decision environments without being trusted to improvise the decision logic.
The pattern is portable:
- Put natural language at the interface.
- Put retrieval over a curated catalogue of tools.
- Put expert-defined APIs behind the tools.
- Put formal models, databases, and solvers behind the APIs.
- Keep the LLM in charge of interpretation, not truth.
This design is relevant beyond supply-chain planning. Finance, logistics, HR capacity planning, procurement, maintenance scheduling, and project portfolio management all contain the same basic problem: complex models exist, but business users cannot easily interrogate them without specialist translation.
SMARTAPS suggests that the most credible enterprise AI systems may look less like autonomous geniuses and more like disciplined dispatchers. They know what tool to call, what parameters to extract, when to ask for missing information, and how to explain the result. Not glamorous. Very useful. A rare combination.
Conclusion: the smart sidekick is the one that knows its job
SMARTAPS does not prove that LLMs can run operations planning end to end. It demonstrates something more commercially interesting: LLMs can reduce the interface burden around existing OR and APS infrastructure when they are constrained by expert-built tools.
The paper’s strongest contribution is therefore architectural, not theatrical. It shows a path for moving from consultant-mediated analysis to planner-accessible analysis while preserving the role of formal optimisation. The planner gets a conversational interface. The OR consultant gets fewer repetitive requests. The APS becomes less of a sealed machine.
The remaining work is not small. Tool catalogues must be built. Solvers must be managed. Unsupported queries must be routed into expert development. Multi-user planning needs proper treatment. But those are implementation problems, not fantasy problems.
For operators, that is the point. SMARTAPS does not earn its place by making planning automatic. It earns it by making planning tools easier to use before everyone has forgotten why the question mattered.
Cognaptus: Automate the Present, Incubate the Future.
Notes
[1] Timothy T. Yu, Mahdi Mostajabdaveh, Serge J. Byusa, Rindra Ramamonjison, Giuseppe Carenini, Kun Mao, Zirui Zhou, and Yong Zhang, “SmartAPS: Tool-augmented LLMs for Operations Management,” arXiv:2507.17927v1, 2025. https://arxiv.org/abs/2507.17927