Teaching Safety to Machines: How Inverse Constraint Learning Reimagines Control Barrier Functions

Factory robots, drones, and autonomous vehicles do not usually fail because nobody cared about safety. They fail because “safe” is annoyingly difficult to write down.

An operator may know that a drone should not scrape the ground, that a warehouse robot should not cut across a human worker’s path, or that an autonomous car should not tailgate even when the road is technically clear. But turning that judgement into a formal mathematical boundary is another matter. The physical system has dynamics. The controller has limits. The dangerous state may not be a simple wall or circle. And the difference between “safe enough” and “please do not put that in production” may live in patterns of behaviour rather than in a clean rule.

That is the problem behind Learning Neural Control Barrier Functions from Expert Demonstrations using Inverse Constraint Learning, by Yuxuan Yang and Hussein Sibai.¹ The paper proposes ICL-CBF, a pipeline that uses inverse constraint learning to infer a safety constraint from expert demonstrations, then uses that inferred constraint to train a neural control barrier function. The result is a safety filter: a controller wrapper that tries to keep the system safe while changing the reference controller’s action as little as possible.

The interesting part is not merely that the method performs well in simulation. It is the mechanism. The paper is not saying, “Watch an expert and magically become safe.” That would be charming, and also nonsense. It says something narrower and more useful: if you have expert demonstrations, known system dynamics, a reference controller, and enough simulation coverage, you can infer the states the expert seems to avoid, turn that inference into labels, and train a neural barrier function that can be used inside a quadratic-program safety filter.

In business language: this is a route from tacit safety practice to deployable control logic. Not certification. Not autonomy pixie dust. A route.

The safety rule is not the controller

Control barrier functions, or CBFs, occupy a useful middle layer in autonomous systems. They are not the task policy itself. They do not decide that a robot should carry a package to Bay 3 or that a quadrotor should fly to a goal position. Instead, a CBF defines a region of state space that should remain invariant: once the system is inside the safe set, the controller should keep it there.

The standard CBF move is elegant. Given a reference controller $\pi_{ref}$, a safety filter solves a small optimisation problem online. It asks: what control input is closest to what the reference controller wanted, while still satisfying the barrier condition? In the paper’s notation, this becomes a CBF-QP policy: a quadratic-program filter that minimally edits unsafe actions.
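For readers who want the optimisation written out, a commonly used form of the CBF-QP is sketched below. It assumes control-affine dynamics $\dot{x} = f(x) + g(x)u$, a barrier function $h$ whose non-negative region is the safe set, an admissible control set $\mathcal{U}$, and an extended class-$\mathcal{K}$ function $\alpha$. These symbols follow the textbook convention and are assumptions here, not a quotation of the paper's exact notation.

$$
\pi_{\mathrm{safe}}(x) = \arg\min_{u \in \mathcal{U}} \; \lVert u - \pi_{ref}(x) \rVert^2 \quad \text{subject to} \quad \nabla h(x)^\top \big( f(x) + g(x)\,u \big) + \alpha\big(h(x)\big) \ge 0
$$

The objective keeps the filtered action close to the reference action. The constraint is the barrier condition: it prevents $h$ from decreasing faster than $\alpha$ allows, so trajectories that start with $h \ge 0$ stay there.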

This matters operationally because it separates performance from safety intervention. A warehouse robot may still use its normal navigation policy. A drone may still use its nominal trajectory controller. The safety filter only steps in when the proposed action threatens the safe set. The corporate translation is fairly plain: do not rebuild the whole machine if the problem is that the machine occasionally needs a better adult in the room.
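To make the "only steps in when needed" behaviour concrete, here is a minimal sketch of a safety filter for a toy single integrator ($\dot{x} = u$) avoiding a circular obstacle. The barrier is hand-written rather than learned, there are no control limits, and the names `barrier`, `cbf_filter`, and `alpha` are illustrative assumptions. With a single affine constraint and no input bounds, the quadratic program reduces to a closed-form projection, so no solver is required.

```python
from typing import Tuple

Vec = Tuple[float, float]


def barrier(x: Vec, centre: Vec, radius: float) -> float:
    """h(x) = |x - centre|^2 - radius^2, non-negative outside the obstacle."""
    dx, dy = x[0] - centre[0], x[1] - centre[1]
    return dx * dx + dy * dy - radius * radius


def cbf_filter(x: Vec, u_ref: Vec, centre: Vec, radius: float,
               alpha: float = 1.0) -> Vec:
    """Return the control closest to u_ref that satisfies grad_h . u + alpha * h >= 0.

    Assumes single-integrator dynamics x' = u and unbounded controls.
    """
    # Gradient of h with respect to x.
    ax, ay = 2.0 * (x[0] - centre[0]), 2.0 * (x[1] - centre[1])
    slack = ax * u_ref[0] + ay * u_ref[1] + alpha * barrier(x, centre, radius)
    if slack >= 0.0:
        # The reference action already satisfies the barrier condition.
        return u_ref
    norm_sq = ax * ax + ay * ay
    if norm_sq == 0.0:
        raise ValueError("state is at the obstacle centre; no safe direction exists")
    # Project u_ref onto the boundary of the half-space defined by the constraint.
    scale = slack / norm_sq
    return (u_ref[0] - scale * ax, u_ref[1] - scale * ay)


far = cbf_filter((5.0, 0.0), (-1.0, 0.0), (0.0, 0.0), 1.0)
near = cbf_filter((2.0, 0.0), (-1.0, 0.0), (0.0, 0.0), 1.0)
print("far from obstacle:", far)
print("near obstacle:", near)
```

Far from the obstacle the reference action passes through untouched. Closer in, the same reference action is scaled back just enough to meet the barrier condition with equality.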

The catch is that CBFs need a meaningful safety set. Traditional synthesis methods can build them when the failure set and dynamics are known, but those methods can suffer badly as dimensionality grows. Neural CBFs scale more easily, but they need labelled examples of safe and unsafe states. And here the industry fantasy begins to wobble. In many real systems, unsafe states are not nicely labelled. Sometimes the unsafe state is obvious, such as hitting the ground. Sometimes it is a system-dependent region from which failure becomes unavoidable. Sometimes the expert avoids something nobody has written down.

ICL-CBF targets precisely this gap.

The paper’s mechanism: infer, label, filter

The proposed method has three conceptual stages.

First, it uses inverse constraint learning to infer a constraint function from expert demonstrations. ICL differs from inverse reinforcement learning in a useful way. IRL asks, “What reward was the expert optimising?” ICL asks, “What constraint was the expert obeying while pursuing a known task?” That distinction is not academic housekeeping. In safety-critical control, the expert’s value is often visible in what they do not do.

Second, the inferred constraint function is used to label newly sampled trajectories. The method samples trajectories using the reference controller, then classifies states as safe or unsafe according to the learned constraint. This creates the labelled dataset required to train a neural CBF.
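The labelling stage can be sketched in a few lines. The example below is a toy under stated assumptions, not the authors' code: the dynamics are a single integrator, `toy_constraint` is a hand-written stand-in for the learned constraint network, `go_to_goal` is a hypothetical reference controller, and the convention that a score above `delta` means "unsafe" is an assumption about the direction of the threshold.

```python
from typing import Callable, List, Tuple

State = Tuple[float, float]
Control = Tuple[float, float]


def rollout(x0: State, controller: Callable[[State], Control],
            dt: float = 0.1, steps: int = 50) -> List[State]:
    """Euler rollout of single-integrator dynamics x' = u (an assumed toy model)."""
    states = [x0]
    x = x0
    for _ in range(steps):
        u = controller(x)
        x = (x[0] + dt * u[0], x[1] + dt * u[1])
        states.append(x)
    return states


def label_states(states: List[State], constraint: Callable[[State], float],
                 delta: float) -> List[Tuple[State, int]]:
    """Label 1 (unsafe) when the constraint score exceeds delta, else 0 (safe)."""
    return [(s, 1 if constraint(s) > delta else 0) for s in states]


def toy_constraint(x: State) -> float:
    """Hypothetical stand-in for a learned constraint: high score near the origin."""
    return 1.0 / (1.0 + x[0] ** 2 + x[1] ** 2)


def go_to_goal(x: State) -> Control:
    """Hypothetical reference controller: proportional drive towards (3, 0)."""
    return (3.0 - x[0], 0.0 - x[1])


# The reference controller drives straight through the high-score region,
# so the rollout yields both safe and unsafe labels.
labelled = label_states(rollout((-3.0, 0.0), go_to_goal), toy_constraint, delta=0.5)
unsafe_count = sum(label for _, label in labelled)
print(f"{unsafe_count} of {len(labelled)} sampled states labelled unsafe")
```

The point of sampling with the reference controller is visible even in this toy: because it ignores the constraint, it visits the states an expert would avoid, which is exactly where unsafe labels are needed.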

Third, the neural CBF is placed into a CBF-QP safety filter. At runtime, the filter compares the reference controller’s preferred action against the barrier condition and modifies the action only when needed.

That pipeline is the core contribution:

Stage	Technical role	Business interpretation	Boundary
Expert demonstrations	Show safe task completion under implicit constraints	Capture operational judgement without forcing experts to write equations	Demonstrations must cover relevant parts of the state space
Inverse constraint learning	Infer a constraint separating expert-like behaviour from unsafe sampled behaviour	Translate “what experts avoid” into a machine-usable safety signal	The inferred constraint can be conservative or wrong outside coverage
Trajectory labelling	Produce safe/unsafe labels for sampled states	Replace expensive manual labelling with simulation-driven annotation	Requires dynamics and simulation access
Neural CBF training	Learn a scalable barrier function	Create a reusable safety filter for deployment-time control	Neural CBFs lose formal correctness guarantees
CBF-QP filtering	Minimally modify the reference action	Preserve task performance while blocking risky controls	Real-time feasibility and validation still matter

The paper also adds a pragmatic training heuristic. Retraining the neural CBF at every inverse-constraint iteration is expensive, so the authors sometimes postpone CBF training until the final iteration. During earlier iterations, they use a grid over the control space to approximate the expert-like safe policy. This is sensible for low-dimensional action spaces. It is also the sort of engineering shortcut that should arrive with a warning label, and the authors provide one: as action dimensionality grows, grid search becomes less useful, and the original algorithm becomes more appropriate.
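One plausible reading of the grid heuristic is sketched below: enumerate candidate controls on a grid, discard those whose next state the learned constraint flags, and keep the candidate closest to the reference action. This is an illustration under assumptions, not the authors' procedure. The single-integrator dynamics, the one-step lookahead, the `toy_constraint` stand-in, and the "score above `delta` is unsafe" convention are all hypothetical.

```python
from itertools import product
from typing import Callable, Optional, Tuple

Vec = Tuple[float, float]


def toy_constraint(x: Vec) -> float:
    """Hypothetical stand-in for a learned constraint: high score near the origin."""
    return 1.0 / (1.0 + x[0] ** 2 + x[1] ** 2)


def grid_safe_action(x: Vec, u_ref: Vec, constraint: Callable[[Vec], float],
                     delta: float, u_max: float = 1.0, n: int = 21,
                     dt: float = 0.1) -> Optional[Vec]:
    """Pick the grid control nearest to u_ref whose next state scores <= delta.

    Assumes single-integrator dynamics x' = u and a box control set
    [-u_max, u_max]^2. Returns None if no grid point is acceptable.
    """
    axis = [-u_max + 2.0 * u_max * i / (n - 1) for i in range(n)]
    best: Optional[Vec] = None
    best_cost = float("inf")
    for u in product(axis, axis):  # n * n candidates in two dimensions
        nxt = (x[0] + dt * u[0], x[1] + dt * u[1])
        if constraint(nxt) > delta:
            continue  # next state is flagged as unsafe
        cost = (u[0] - u_ref[0]) ** 2 + (u[1] - u_ref[1]) ** 2
        if cost < best_cost:
            best, best_cost = u, cost
    return best


print(grid_safe_action((-3.0, 0.0), (1.0, 0.0), toy_constraint, delta=0.5))
print(grid_safe_action((-1.05, 0.0), (1.0, 0.0), toy_constraint, delta=0.5))
```

The loop visits every grid point, so the cost grows as the number of points per axis raised to the number of control dimensions. That is the scaling problem the authors' warning refers to.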

The misconception: demonstrations are not certificates

The likely misunderstanding is predictable: if the system learns safety from expert demonstrations, perhaps the expert demonstrations certify safety.

They do not.

The paper is explicit that supervised neural CBF approaches are more scalable but sacrifice correctness guarantees. ICL-CBF inherits that tradeoff. It does not recover the Platonic ideal of safety from a few expert traces. It uses expert traces, known dynamics, a reference controller, and sampled trajectories to approximate a useful constraint classifier.

That distinction matters because the theoretical bridge is idealised. Prior work cited by the paper shows that, under exact ICL or multi-task ICL assumptions, an inferred constraint can correspond to a backward reachable set: states from which failure cannot be avoided in the worst case. The complement of that set is the maximum controlled forward invariant safe set. Lovely. In actual experiments, however, the method uses learned neural functions, finite demonstrations, hyperparameters, and simulation.
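In symbols, the idealised objects can be written roughly as follows. This is a simplified, disturbance-free sketch using assumed notation: $\mathcal{X}$ is the state space, $\mathcal{F}$ the failure set, and $x(t; x_0, u)$ the state reached at time $t$ from $x_0$ under the control signal $u(\cdot)$. The worst-case version adds a disturbance term, which is omitted here for brevity.

$$
\mathcal{B} = \big\{\, x_0 \in \mathcal{X} \;:\; \forall u(\cdot),\ \exists t \ge 0 \ \text{such that}\ x(t; x_0, u) \in \mathcal{F} \,\big\}, \qquad \mathcal{S}^{*} = \mathcal{X} \setminus \mathcal{B}
$$

Read this way, $\mathcal{B}$ contains every state from which no control signal avoids failure, and $\mathcal{S}^{*}$ contains every state from which at least one control signal avoids it forever. The idealised claim is that exact ICL recovers $\mathcal{B}$; the experiments work with an approximation of it.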

So the correct reading is not “expert data equals safety proof.” It is “expert data can help construct training labels for a neural safety filter when direct safety labels are difficult to specify.” Less glamorous, much more useful.

What the experiments actually test

The paper evaluates ICL-CBF in four simulated systems: a single integrator robot avoiding a circular obstacle, an inverted pendulum avoiding unsafe states, a Dubins car avoiding a square obstacle, and a planar quadrotor avoiding the ground. The baselines are iDBF and ROCBF, two prior demonstration-based neural CBF methods. The authors also compare against L-CBF, a neural CBF trained with ground-truth safety labels, which functions as an upper reference point rather than a realistic deployment baseline.

The experimental structure is more layered than a simple leaderboard. The paper asks five research questions, and they serve different evidentiary roles.

Evidence item	Likely purpose	What it supports	What it does not prove
Main closed-loop comparison across four tasks	Main evidence and comparison with prior work	ICL-CBF improves safety-task tradeoffs against iDBF and ROCBF in these simulations	Field robustness or formal safety certification
Single integrator visualisation	Interpretability check	The learned constraint and learned CBF resemble the ground-truth structure more closely than baselines	General visual interpretability in high-dimensional systems
Inverted pendulum trajectory visualisation	Behavioural illustration	ICL-CBF and L-CBF intervene to avoid failure where iDBF and ROCBF trajectories enter unsafe regions	Statistical superiority beyond the measured trials
Heuristic versus non-heuristic comparison	Ablation and implementation tradeoff	Delayed training and grid approximation save training time, with larger degradation in the quadrotor task	That grid heuristics scale to high-dimensional action spaces
Sampling policy comparison	Ablation	Sampling with the reference controller gives better CBF training data than aggregating earlier learned-policy samples in the single integrator case	Universal superiority across all systems
Label-quality visualisation	Diagnostic evidence	ICL-generated labels align more closely with true labels than baseline labelling strategies in the inverted pendulum	Complete label accuracy across all state spaces
Sensitivity to $\delta$	Robustness/sensitivity test	Performance depends materially on the threshold used to partition safe and unsafe labels	Automatic hyperparameter reliability

This distinction is important. The paper’s strongest evidence is the closed-loop performance table. The visualisations explain why the method behaves as it does. The heuristic and $\delta$ studies explain where the method is operationally sensitive. If you read all of these as equal “results”, the paper becomes foggy. If you read them as mechanism, evidence, ablation, and sensitivity, the shape becomes much clearer.

The main result: better safety without completely murdering the task

The headline numbers are unusually interpretable because the paper reports two metrics: collision rate (CR) and success rate (SR). A useful safety filter should reduce collisions without destroying task completion. A filter that prevents all collisions by refusing to move is not a triumph; it is a very expensive paperweight.
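The two metrics are simple to compute from per-episode outcomes. The sketch below assumes each episode is summarised as a `(collided, reached_goal)` pair and that an episode counts as a success only if it reaches the goal without colliding. That definition is an assumption for illustration; the paper's exact bookkeeping may differ.

```python
from typing import Iterable, Tuple


def collision_and_success_rates(
        episodes: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (collision rate, success rate) in percent.

    Each episode is (collided, reached_goal). Success requires reaching the
    goal with no collision, so the two rates need not sum to 100.
    """
    outcomes = list(episodes)
    if not outcomes:
        raise ValueError("at least one episode is required")
    n = len(outcomes)
    collisions = sum(1 for collided, _ in outcomes if collided)
    successes = sum(1 for collided, reached in outcomes if reached and not collided)
    return (100.0 * collisions / n, 100.0 * successes / n)


# A filter that never moves: no collisions, but no successes either.
paperweight = [(False, False)] * 10
# A more useful filter: eight clean successes, one collision, one timeout.
useful = [(False, True)] * 8 + [(True, False)] + [(False, False)]

print(collision_and_success_rates(paperweight))
print(collision_and_success_rates(useful))
```

Reporting both numbers is what exposes over-conservative filters: a zero collision rate paired with a near-zero success rate is a failure, not a result.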

In the single integrator task, ICL-CBF achieves 0.00% collision rate and 80.60% success rate. ROCBF also gets 0.00% collisions, but only 9.80% success, which is the classic symptom of over-conservatism. iDBF performs disastrously here, with 99.20% collisions and 0.80% success. L-CBF, trained with true labels, gets 0.00% collisions and 86.20% success.

The inverted pendulum task is where ICL-CBF looks especially strong: a 0.20% collision rate and a 99.80% success rate, slightly ahead of L-CBF’s 99.40% on success and below both iDBF and ROCBF on collisions. That does not mean it is “better than ground truth” in any grand philosophical sense. It means the learned filter performed at least comparably under this simulation and evaluation setup. Calm down, leaderboard enthusiasts.

The Dubins car result is also favourable: ICL-CBF reports 1.80% collisions and 97.60% success. L-CBF is slightly safer and more successful, with 0.30% collisions and 99.60% success, but ICL-CBF is close. ROCBF struggles badly here, with 69.30% collisions and 6.40% success. iDBF has no collisions but only 75.00% success, suggesting again that safety without performance is a thin victory.

The quadrotor is the harder case and the most useful reality check. ICL-CBF produces 17.10% collisions and 77.20% success. That is far better than iDBF and ROCBF, which show 65.70% and 75.00% collision rates respectively, with very low success. But it remains well behind L-CBF, which achieves 1.50% collisions and 98.00% success. The business lesson is not “ICL-CBF solves quadrotor safety.” It is that inferred labels can move a difficult system meaningfully toward safer behaviour, but ground-truth-labelled training still wins when available.

The comparison is therefore best read as a tradeoff map:

Scenario	ICL-CBF result	Best practical interpretation
Single integrator	0.00% CR, 80.60% SR	Matches true-label collision avoidance, with modest success loss
Inverted pendulum	0.20% CR, 99.80% SR	Very strong safety-task balance in this setup
Dubins car	1.80% CR, 97.60% SR	Close to true-label CBF, much stronger than baselines
Quadrotor	17.10% CR, 77.20% SR	Clear improvement over baselines, but not close to true-label performance

That final row should be underlined in any executive briefing. Harder dynamics expose the distance between “learned from demonstrations” and “deployment-ready safety assurance.”

Why ICL-CBF beats the obvious demonstration baselines

The baselines fail for different reasons, and that is useful.

iDBF generates unsafe labels by taking expert states, sampling low-probability actions under a behaviour-cloned policy, and simulating the next state. In low-dimensional spaces with dense expert trajectories, this can produce unsafe samples that sit too close to safe expert states. The result is label overlap: the model is asked to learn a boundary from contradictory evidence. Neural networks are many things, but mind readers they are not.

ROCBF uses reverse nearest neighbours to identify the boundary of the expert-visited region and labels boundary states as unsafe. This avoids some overlap, but it can misclassify valid expert regions as unsafe. In the paper’s single integrator visualisation, ROCBF delineates the obstacle but also misclassifies part of the region visited by expert trajectories. In the inverted pendulum label visualisation, states near the origin are treated as boundary points and labelled unsafe.

ICL-CBF’s advantage is that it does not merely ask, “Where did the expert go?” It asks a more diagnostic question: “Where would the reference controller have gone if it were not constrained, and what did the expert avoid?” That is a stronger signal. It uses deviation from task-driven behaviour as evidence of an implicit constraint.

For business users, this difference is the whole point. In many operations, expert behaviour is not a perfect map of all safe states. Experts are efficient. They do not visit every safe area simply to help your dataset. The more valuable information may be in the contrast between the nominal plan and the safe correction. ICL-CBF is designed to exploit that contrast.

The heuristic saves time, but the bill arrives in harder systems

The training heuristic is one of the paper’s most practical contributions because it speaks directly to implementation cost. In the single integrator case, using the heuristic reduces training time from 578.11 to 217.80 while preserving 0.00% collision rate and reducing success only slightly, from 81.80% to 80.60%. That is a reasonable exchange.

In the quadrotor case, the same tradeoff is less comfortable. Without the heuristic, ICL-CBF records 9.30% collisions and 87.60% success, with training time of 2967.33. With the heuristic, collisions rise to 17.10% and success falls to 77.20%, while training time drops to 2369.38. That is still faster, but the safety-performance degradation is no longer pocket change.
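The two tradeoffs are easier to compare as relative changes. The short calculation below uses only the figures quoted above; the training-time unit is left unspecified, as it is in this summary.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a quantity shrinks when going from `before` to `after`."""
    return (before - after) / before


# (without heuristic, with heuristic) for each reported quantity.
runs = {
    "single integrator": {"time": (578.11, 217.80), "cr": (0.00, 0.00), "sr": (81.80, 80.60)},
    "quadrotor": {"time": (2967.33, 2369.38), "cr": (9.30, 17.10), "sr": (87.60, 77.20)},
}

for name, r in runs.items():
    saved = 100.0 * relative_reduction(*r["time"])
    cr_change = r["cr"][1] - r["cr"][0]  # percentage points
    sr_change = r["sr"][1] - r["sr"][0]  # percentage points
    print(f"{name}: training time -{saved:.1f}%, CR {cr_change:+.2f} pts, SR {sr_change:+.2f} pts")
```

The single integrator saves roughly 62% of training time for a 1.2-point drop in success. The quadrotor saves roughly 20% and pays 7.8 points in collisions and 10.4 points in success. The heuristic buys less and costs more as the system gets harder.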

The grid-based policy comparisons sharpen the message. In the single integrator task, a small grid can reduce inference time to 0.83 while maintaining 0.00% collisions and 79.40% success. But in the quadrotor task, grid policies perform worse: a $100^2$ grid gives 32.10% collisions and 57.20% success; a $300^2$ grid improves performance to 24.20% collisions and 67.60% success but increases inference time sharply to 29.70. The CBF-QP policy remains much more attractive for higher-dimensional control.
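The cost of a finer grid is plain arithmetic. The sketch below reads the exponent in $100^2$ and $300^2$ as two control dimensions, which is an assumption about the paper's notation, and counts how many candidate actions must be evaluated per control step.

```python
def grid_size(points_per_axis: int, action_dims: int) -> int:
    """Number of candidate actions in a uniform grid over the control space."""
    return points_per_axis ** action_dims


coarse = grid_size(100, 2)
fine = grid_size(300, 2)
print(f"100^2 grid: {coarse:,} candidates per step")
print(f"300^2 grid: {fine:,} candidates per step ({fine // coarse}x more)")

# The same resolution in a hypothetical four-dimensional control space.
print(f"100^4 grid: {grid_size(100, 4):,} candidates per step")
```

Tripling the resolution in two dimensions multiplies the work by nine, and adding control dimensions multiplies it again. A CBF-QP avoids this enumeration entirely, which is why it remains the more attractive option as dimensionality grows.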

This is an implementation detail with strategic consequences. Low-dimensional pilots may make grid-based shortcuts look seductive. Then the real robot arrives, with more state variables, more control dimensions, and less patience for brute force. A good prototype trick is not automatically a production architecture. Shocking, I know.

The threshold $\delta$ is not cosmetic

The paper’s sensitivity test examines $\delta$, the threshold used to partition sampled states into safe and unsafe classes based on the learned constraint function. This is not a decorative hyperparameter. It directly shapes the labelled data used to train the neural CBF.

Figure 5 shows that performance can deteriorate when $\delta$ is poorly chosen. The authors report that searching over $\delta \in [0,1]$ can find reasonable performance, and that $\delta = 0.6$ worked consistently well in their experiments. That is useful, but it should not be overread. A grid search over validation data is still a tuning procedure. It requires representative validation conditions. If the operating environment shifts, the chosen threshold may become less reliable.
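A threshold sweep of this kind might look like the sketch below. Everything in it is illustrative: the constraint scores and validation labels are made-up toy data, label agreement is used as the selection criterion (the paper may instead select on closed-loop performance), and the convention that a score above `delta` means "unsafe" is an assumption.

```python
from typing import List, Sequence, Tuple


def sweep_delta(scores: Sequence[float], true_unsafe: Sequence[bool],
                deltas: Sequence[float]) -> List[Tuple[float, float]]:
    """For each delta, return (delta, fraction of labels matching validation)."""
    results = []
    for delta in deltas:
        predicted = [s > delta for s in scores]  # True means "unsafe"
        matches = sum(p == t for p, t in zip(predicted, true_unsafe))
        results.append((delta, matches / len(scores)))
    return results


def best_delta(results: Sequence[Tuple[float, float]]) -> float:
    """Return the first delta with the highest agreement."""
    return max(results, key=lambda r: r[1])[0]


# Toy constraint scores for six validation states and their assumed true labels.
scores = [0.05, 0.20, 0.35, 0.45, 0.70, 0.90]
true_unsafe = [False, False, False, True, True, True]
deltas = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0

results = sweep_delta(scores, true_unsafe, deltas)
for delta, agreement in results:
    print(f"delta={delta:.1f}  agreement={agreement:.2f}")
print("selected delta:", best_delta(results))
```

The selected value is only as good as the validation set behind it. If the validation states stop resembling the operating conditions, the sweep will happily return a threshold that no longer separates safe from unsafe.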

For an enterprise deployment, $\delta$ would belong in the safety validation plan, not in a notebook cell labelled “miscellaneous.” It affects the boundary between safe and unsafe training data. That boundary affects the learned CBF. The learned CBF affects which controls are blocked at runtime. This is not back-office hyperparameter tidying; it is governance material.

What this means for business deployment

The direct paper result is modest and important: in four simulated control tasks, ICL-CBF produces neural CBF safety filters that outperform iDBF and ROCBF on the reported collision/success tradeoff, and often approach the performance of neural CBFs trained with ground-truth safety labels.

The Cognaptus interpretation is broader but bounded. ICL-CBF suggests a practical pattern for organisations that have expert operators, simulators, and reference controllers, but lack formal safety labels:

record expert safe behaviour;
define or retain the nominal task controller;
use simulation and dynamics to expose what the nominal controller would do;
infer the avoided constraint;
train a neural CBF;
deploy it as a minimally invasive safety filter;
validate aggressively before trusting it near expensive objects, fragile humans, or both.

The likely application areas are robotics, drones, autonomous vehicles, warehouse automation, and industrial control. The strongest fit is not where safety rules are already simple and fully formalised. If the unsafe set is just “do not cross this line,” you may not need inverse constraint learning. The stronger fit is where expert avoidance behaviour contains useful information that the organisation cannot easily turn into equations.

That said, the method assumes quite a lot. It needs known system dynamics. It needs a reference controller that pursues the task but may be unsafe. It needs expert demonstrations that sufficiently cover the relevant state space. It evaluates in simulation, not field deployment. And it does not restore the formal guarantees sacrificed by neural CBF training.

A sensible business reading is therefore:

Question	Answer
Does the paper show a new route to learning safety filters from demonstrations?	Yes.
Does it remove the need to specify every unsafe state manually?	Potentially, under the paper’s assumptions.
Does it prove certified safety from expert data?	No.
Is it ready to be treated as a production safety case?	Not by itself.
Where is the business value?	Reducing the cost of translating tacit expert safety behaviour into deployable control constraints.
Where is the operational risk?	Coverage gaps, simulation mismatch, hyperparameter sensitivity, and degraded performance in harder dynamics.

This is the right sort of research result for enterprise AI: not a miracle, but a useful conversion mechanism. It converts demonstrations into constraints, constraints into labels, and labels into a runtime safety filter. Each conversion step creates value. Each also creates a place where error can enter. Conveniently, reality remains undefeated.

The real contribution is a safety-learning workflow

The paper’s main intellectual move is to connect inverse constraint learning and neural CBF training. That bridge matters because it changes what counts as usable safety data. Instead of requiring labelled unsafe states, the method extracts constraint information from expert demonstrations and reference-controller deviations. It then uses that information to train a safety filter that can intervene online.

This is not a replacement for formal methods where formal methods are available. It is not a waiver from testing. It is not a compliance story in a lab coat. But it is a serious step toward a practical middle ground: safety filters that can be learned from how experts actually operate, not only from what engineers can fully specify.

For businesses building autonomous systems, the lesson is simple enough to be dangerous: expert demonstrations can be more than training data for imitation. They can be evidence of invisible constraints. The hard part is preserving that evidence through the full chain: inference, labelling, training, filtering, and validation.

That is where ICL-CBF is useful. It gives structure to a problem companies already have. Their best operators know what safe looks like. Their engineers need machinery that can use that knowledge. This paper shows one plausible way to connect the two, without pretending the connection is magic.

Cognaptus: Automate the Present, Incubate the Future.

¹ Yuxuan Yang and Hussein Sibai, “Learning Neural Control Barrier Functions from Expert Demonstrations using Inverse Constraint Learning,” arXiv:2510.21560, 2025. https://arxiv.org/abs/2510.21560
