Robustness, Safety, and Verification
A robot that works brilliantly in the lab and fails dangerously in the field is worse than useless — it creates risks that did not exist before. As robots move into hospitals, schools, roads, and homes, the question of trustworthiness becomes as important as the question of capability. This lesson examines the three distinct challenges of making learned autonomous systems reliable: robustness (maintaining performance under unexpected conditions), safety (guaranteeing the absence of harmful behaviors), and verification (providing the evidence needed to trust a system before deployment).
Robustness: Performance Under Distribution Shift
A learned robot policy is trained on a specific distribution of inputs. At deployment, the world does not behave exactly like the training distribution. Lighting changes. Sensor calibration drifts. New objects appear. People behave in unexpected ways. The gap between the training distribution and the deployment distribution is called distribution shift, and it is the primary source of learned-policy failures in real deployments. Distribution shift takes several forms:
- Covariate shift: the input distribution P(x) changes, but the true input-output relationship P(y|x) stays the same. A perception model trained on images from a warehouse in summer may fail in winter when lighting and reflections differ. The relationship it learned is still correct in principle; it simply never saw those inputs.
- Concept drift: the true relationship P(y|x) changes over time. A predictive maintenance model trained on data from new motors will misclassify wear patterns in aging motors, because the correct diagnosis for the same vibration signature changes as the motor ages.
- Out-of-distribution (OOD) inputs: inputs that fall entirely outside the training distribution, such as object types the robot has never seen or environments it was never trained in. These are the most dangerous because the learned function extrapolates arbitrarily.
Building robust policies requires addressing distribution shift proactively. Techniques include:
- data augmentation (expanding training diversity to cover more input variations)
- domain randomization (as discussed in Lesson 5)
- input normalization and calibration to reduce sensor drift effects
- ensemble methods (running multiple policies and flagging disagreement as an OOD signal)
- uncertainty quantification (having the model output a confidence estimate alongside its prediction so that low-confidence situations can trigger conservative behavior or human escalation)
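The ensemble technique mentioned above can be sketched in a few lines. This is a minimal illustration, not a recommended implementation: the three toy "policies", the function names, and the 0.1 disagreement threshold are all assumptions made for the example. The toy members are constructed to agree exactly on the range they were "trained" on (|x| <= 1) and to diverge outside it, which is the behavior an ensemble-based OOD check relies on.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence, Tuple

# A policy maps an observation vector to a scalar action (illustrative type).
Policy = Callable[[Sequence[float]], float]


def make_toy_policy(extrapolation_slope: float) -> Policy:
    """Hypothetical ensemble member: identical to the others for |x| <= 1,
    but with its own arbitrary behavior outside that 'training range'."""
    def policy(observation: Sequence[float]) -> float:
        x = observation[0]
        return x + extrapolation_slope * max(0.0, abs(x) - 1.0)
    return policy


def ensemble_predict(policies: List[Policy],
                     observation: Sequence[float]) -> Tuple[float, float]:
    """Return (mean action, disagreement), where disagreement is the
    population standard deviation of the members' outputs."""
    outputs = [p(observation) for p in policies]
    return mean(outputs), pstdev(outputs)


def is_probably_ood(disagreement: float, threshold: float) -> bool:
    """Flag the input when members disagree more than the threshold.
    The threshold is an assumption; in practice it would be tuned on
    held-out data."""
    return disagreement > threshold


ENSEMBLE = [make_toy_policy(s) for s in (-0.5, 0.0, 0.5)]

if __name__ == "__main__":
    for x in (0.5, 3.0):
        action, spread = ensemble_predict(ENSEMBLE, [x])
        print(x, action, spread, is_probably_ood(spread, threshold=0.1))
```

Agreement among members does not prove an input is in-distribution; all members can be wrong in the same way. Disagreement is a useful warning signal, not a guarantee.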
A robot that says 'I am 95% confident this is a clear path' and turns out to be wrong is still less dangerous than a robot that gives no confidence estimate, provided the estimate is calibrated: the rest of the system knows that roughly 1 in 20 such judgments will be wrong and can be designed around that rate. When a model outputs calibrated uncertainty, the system can use low-confidence states as a trigger: slow down, request help, or switch to a more conservative mode. Uncertainty quantification transforms the model's limitations from silent liabilities into actionable signals.
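As a rough sketch of how a confidence estimate can drive behavior, the snippet below maps confidence to an operating mode and computes a crude calibration check. The mode names, the 0.9 and 0.6 thresholds, and the function names are assumptions chosen for illustration. The calibration check compares only average confidence with observed accuracy, which is much coarser than a binned measure such as expected calibration error.

```python
from enum import Enum
from typing import Sequence


class Mode(Enum):
    NORMAL = "normal"
    SLOW = "slow"
    REQUEST_HELP = "request_help"


def select_mode(confidence: float,
                slow_below: float = 0.9,
                help_below: float = 0.6) -> Mode:
    """Map a confidence in [0, 1] to a behavior mode.
    Thresholds are illustrative, not recommended values."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if confidence < help_below:
        return Mode.REQUEST_HELP
    if confidence < slow_below:
        return Mode.SLOW
    return Mode.NORMAL


def calibration_gap(confidences: Sequence[float],
                    correct: Sequence[bool]) -> float:
    """Absolute difference between mean reported confidence and observed
    accuracy. A large gap suggests the confidences should not be trusted
    as probabilities."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("need two non-empty sequences of equal length")
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(1 for c in correct if c) / len(correct)
    return abs(mean_confidence - accuracy)


if __name__ == "__main__":
    print(select_mode(0.95), select_mode(0.75), select_mode(0.30))
    print(calibration_gap([0.9, 0.9, 0.9, 0.9], [True, True, True, False]))
```

Gating on confidence only helps if the gap is small: thresholds applied to an overconfident model will rarely trigger.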
Safety: Guaranteeing the Absence of Harm
Robustness is about performance. Safety is about constraints: certain things must never happen regardless of what the robot encounters. A surgical robot must never exceed a certain force on tissue. An autonomous vehicle must never cross the center line at highway speed. A service robot must never enter a room occupied by a patient without a nurse present. These are hard constraints, not soft preferences.
Formally, safety is often specified using control barrier functions (CBFs). A CBF h(x) is a function of the system state x such that h(x) >= 0 defines the safe set: the states where the robot is operating safely. A CBF-based safety filter monitors the robot's intended action, checks whether executing it would drive the state out of the safe set, and if so replaces the intended action with the nearest safe action. This creates a safety layer that wraps any underlying policy, learned or classical. Provided the dynamics model used by the filter is accurate and a safe action always exists within the actuator limits, the filter guarantees constraint satisfaction regardless of what the policy outputs.
Safe reinforcement learning adds safety constraints directly into the RL objective, formulated as a constrained MDP (CMDP): maximize expected return subject to an upper bound on the expected cumulative safety cost. Algorithms like Constrained Policy Optimization (CPO) and TRPO-Lagrangian address CMDPs with varying degrees of theoretical guarantee. In practice, safety constraints during training are combined with safety filters during deployment as defense in depth.
A third approach is formal synthesis: designing the controller analytically so that safety is guaranteed by construction, without relying on testing. For simple systems and simple safety properties, this is feasible. For complex learned systems, it is not yet generally applicable, but hybrid approaches that formally verify the safety filter while allowing learned policies behind it are an active research direction.
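The CBF safety filter can be made concrete with a deliberately simple example. The sketch below assumes a one-dimensional robot whose velocity is commanded directly (dynamics x' = u) and a wall at position x_max, so h(x) = x_max - x. The standard CBF condition h' >= -alpha * h then reduces to u <= alpha * (x_max - x), and in one dimension the "nearest safe action" is just the smaller of the nominal command and that limit. The system, the function names, and the parameter values are all assumptions for illustration; real filters typically solve a small quadratic program over a multi-dimensional action.

```python
from typing import List


def cbf_filter(x: float, u_nominal: float, x_max: float, alpha: float) -> float:
    """Return the safe action closest to u_nominal for dynamics x' = u.

    Barrier: h(x) = x_max - x, safe set is h(x) >= 0.
    CBF condition h' >= -alpha * h becomes u <= alpha * (x_max - x).
    """
    h = x_max - x
    u_limit = alpha * h
    return min(u_nominal, u_limit)


def simulate(x0: float, u_nominal: float, x_max: float,
             alpha: float, dt: float, steps: int) -> List[float]:
    """Euler-integrate the filtered system and return the visited positions.

    With this discretization the barrier stays non-negative as long as
    alpha * dt <= 1, because h_next >= (1 - alpha * dt) * h.
    """
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = cbf_filter(x, u_nominal, x_max, alpha)
        x += dt * u
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    # The nominal policy always pushes toward the wall at 1 m/s.
    traj = simulate(x0=0.0, u_nominal=1.0, x_max=1.0,
                    alpha=2.0, dt=0.05, steps=200)
    print(max(traj) <= 1.0)
```

Far from the wall the filter leaves the nominal command untouched; close to it, the allowed speed shrinks in proportion to the remaining distance. The guarantee is only as good as the model: if the true dynamics differ from x' = u, the bound no longer holds as stated.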
Verification: Providing Evidence for Trust
Verification answers the question: how do we know the system is safe before we deploy it? For classical software, formal verification methods (model checking, theorem proving) can in principle prove correctness for all possible inputs. For learned systems, this is far harder because neural networks are high-dimensional, nonlinear functions without the structure that formal verification exploits. Currently, the primary verification approach for learned robot systems is extensive empirical testing combined with statistical guarantees.
Empirical testing: run the system through a large and diverse set of test scenarios, including adversarially chosen edge cases. Record success and failure rates. Compute confidence intervals on performance. For autonomous vehicles, this means billions of miles of simulation and millions of miles of real-world testing (Waymo has logged over 20 million autonomous miles as of 2024). For surgical robots, this means thousands of cadaveric and simulation trials.
Scenario-based testing: rather than random tests, construct specific scenarios that target known failure modes, such as sensor occlusion, adversarial pedestrian behavior, and sensor noise at edge thresholds. This coverage-based approach is more efficient than random testing for finding failure modes.
Formal neural network verification: emerging methods (alpha-beta-CROWN, Marabou) use constraint propagation and branch-and-bound to formally verify properties of small neural networks. A typical property is: 'if the input image differs from a given reference image by at most epsilon in pixel space, the classification does not change.' For networks with millions of parameters, these methods currently give only approximate or incomplete answers, but the field is advancing rapidly.
Runtime monitoring: rather than certifying the system fully before deployment, deploy with monitors that observe behavior in real time and alert or intervene when anomalies are detected. A runtime monitor can check: are sensor readings within expected ranges? Are the robot's planned actions consistent with physical feasibility? Is the robot's confidence estimate suspiciously high for an unusual input? Runtime monitoring is a form of defense in depth, catching failures that verification missed.
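The three runtime checks just listed can be sketched as a single function that inspects one time step and returns a list of alerts. Every field name, limit, and alert label below is hypothetical and chosen only to make the example concrete. In particular, the "novelty" score assumes some separate mechanism (for example an OOD detector) already rates how unusual the input looks.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    """Monitored signals for one time step (all fields are illustrative)."""
    range_m: float        # reading from a range sensor, in metres
    planned_speed: float  # speed the planner intends to command, in m/s
    confidence: float     # model's self-reported confidence, 0..1
    novelty: float        # how unusual the input looks, 0..1


@dataclass
class Limits:
    """Assumed thresholds; real values would come from the robot's spec."""
    range_min_m: float = 0.05
    range_max_m: float = 10.0
    max_speed: float = 1.5
    high_confidence: float = 0.95
    high_novelty: float = 0.8


def check(snapshot: Snapshot, limits: Limits) -> List[str]:
    """Return the names of all monitors that fired for this snapshot."""
    alerts: List[str] = []
    # 1. Sensor reading within expected range (a NaN reading also fails).
    if not (limits.range_min_m <= snapshot.range_m <= limits.range_max_m):
        alerts.append("sensor_out_of_range")
    # 2. Planned action physically feasible for this platform.
    if abs(snapshot.planned_speed) > limits.max_speed:
        alerts.append("infeasible_action")
    # 3. Suspiciously high confidence on an unusual input.
    if (snapshot.novelty >= limits.high_novelty
            and snapshot.confidence >= limits.high_confidence):
        alerts.append("overconfident_on_unusual_input")
    return alerts


if __name__ == "__main__":
    print(check(Snapshot(2.0, 0.5, 0.90, 0.1), Limits()))
    print(check(Snapshot(float("nan"), 3.0, 0.99, 0.9), Limits()))
```

What the system does with an alert (slow down, stop, hand over to a human) is a separate design decision and is deliberately left out of this sketch.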
No finite number of test scenarios can prove that a system will behave safely on all possible future inputs. Testing builds confidence; formal methods provide guarantees; runtime monitoring provides last-resort detection. Safety engineering requires all three layers. A system that passes every test is not proven safe — it is tested to a specified coverage level. This distinction matters enormously when certifying robots for high-stakes deployment.
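The limits of testing can be quantified. If a system passes n independent trials with zero failures, the strongest statement the data support is an upper bound on the per-trial failure probability, not proof that it is zero. The sketch below computes that bound and the number of failure-free trials needed to support a target failure rate. It assumes the trials are independent and representative of deployment conditions, which is exactly the assumption that distribution shift undermines. The function names are illustrative.

```python
import math


def failure_rate_upper_bound(n_trials: int, confidence: float = 0.95) -> float:
    """One-sided upper confidence bound on the per-trial failure probability
    after n_trials independent trials with zero observed failures.

    Solves (1 - p) ** n_trials = 1 - confidence for p.
    For 95% confidence this is close to the 'rule of three', 3 / n_trials.
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)


def trials_needed(max_failure_rate: float, confidence: float = 0.95) -> int:
    """Smallest number of failure-free trials needed to claim, at the given
    confidence, that the failure probability is below max_failure_rate."""
    if not 0.0 < max_failure_rate < 1.0:
        raise ValueError("max_failure_rate must be in (0, 1)")
    return math.ceil(math.log(1.0 - confidence)
                     / math.log(1.0 - max_failure_rate))


if __name__ == "__main__":
    print(failure_rate_upper_bound(1000))  # roughly 0.003
    print(trials_needed(0.01))             # 299
```

Even 1,000 flawless trials support only a claim of roughly "fewer than 3 failures per 1,000 trials" at 95% confidence, and only for conditions like those tested.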
A learned object-detection model for a warehouse robot achieves 99.5% accuracy on its test set. A safety engineer argues this is insufficient evidence for deployment. What is the strongest version of the safety engineer's argument?
A control barrier function (CBF) safety filter is added around a learned locomotion policy for a robot operating near humans. Which statement correctly describes what this safety filter does?
Audit a Deployed Robot for Safety
- You are a safety auditor reviewing a home eldercare robot that autonomously assists elderly residents with daily living tasks (fetching objects, reminding about medications, detecting falls). The robot uses a learned end-to-end policy trained on 500,000 demonstration trajectories.
- Step 1: Distribution shift analysis. Identify three specific ways the deployment environment (a real home with real elderly residents) might differ systematically from the demonstration collection environment (a clinical lab with young volunteer demonstrators). For each, describe the failure mode it could cause.
- Step 2: Hard constraints. List five behaviors that must never occur regardless of what the robot's policy outputs. For each, specify whether you would implement the constraint via a control barrier function, a hardware limit, or an organizational procedure — and why.
- Step 3: Verification plan. The robot will be deployed in 50 homes. Design a verification testing program that you would require before approval. What specific test scenarios would you include? How many trials per scenario? What success rate would you require? What happens if the robot fails a scenario once?
- Step 4: Runtime monitoring. Design two runtime monitors for this system. For each monitor, describe: what it measures, what threshold triggers an alert, and what action the system takes when the threshold is crossed.
- Step 5: A resident's family asks: 'How do you know this robot is safe?' Write a two-paragraph honest answer that neither overclaims the safety evidence nor dismisses the genuine efforts made.