- Project website: https://cxu-tri.github.io/FAIL-Detect-Website/.
- The paper "Can We Detect Failures Without Failure Data? Uncertainty-Aware Runtime Failure Detection for Imitation Learning Policies" has been accepted at Robotics: Science and Systems (RSS) 2025.
- Please direct implementation questions to Chen Xu ([email protected]).
We base our environment on diffusion_policy. Set up the environment by running
mamba env create -f conda_environment.yaml
Tasks: we consider square, transport, tool_hang, and can tasks in robomimic.
Policy backbone: either a diffusion policy or a flow-matching policy. Both policies share the same network architecture and are trained on the same datasets with the same hyperparameters.
Usage: see diffusion_policy/configs_robomimic for the set of configs.
# This trains a flow policy (e.g., on the square task)
python train.py --config-dir=diffusion_policy/configs_robomimic --config-name=image_square_ph_visual_flow_policy_cnn.yaml training.seed=1103 training.device=cuda:0 hydra.run.dir='data/outputs/${name}_${task_name}'
# This trains a diffusion policy (e.g., on the square task)
python train.py --config-dir=diffusion_policy/configs_robomimic --config-name=image_square_ph_visual_diffusion_policy_cnn.yaml training.seed=1103 training.device=cuda:0 hydra.run.dir='data/outputs/${name}_${task_name}'
# For other tasks, change 'square' to be among ['transport', 'tool_hang', 'can']
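To avoid editing the config name by hand for each task, the training commands above can be generated programmatically. The following is a minimal sketch, not part of the repository: the helper `train_command` is hypothetical, and it assumes that every task's config follows the same `image_<task>_ph_visual_<policy>_policy_cnn.yaml` naming pattern as the square examples. Check `diffusion_policy/configs_robomimic` before relying on that.

```python
import shlex

# Task and policy names taken from the README; the config naming
# pattern below is an assumption extrapolated from the square examples.
TASKS = ["square", "transport", "tool_hang", "can"]
POLICIES = ["flow", "diffusion"]


def train_command(task, policy, seed=1103, device="cuda:0"):
    """Build the argument list for one training run (hypothetical helper)."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    config = f"image_{task}_ph_visual_{policy}_policy_cnn.yaml"
    return [
        "python", "train.py",
        "--config-dir=diffusion_policy/configs_robomimic",
        f"--config-name={config}",
        f"training.seed={seed}",
        f"training.device={device}",
        # Hydra resolves ${name} and ${task_name}; keep them literal here.
        "hydra.run.dir=data/outputs/${name}_${task_name}",
    ]


if __name__ == "__main__":
    # Print one shell-quoted command per (task, policy) combination.
    for task in TASKS:
        for policy in POLICIES:
            print(shlex.join(train_command(task, policy)))
```

The printed lines can be pasted into a shell or passed to `subprocess.run` as argument lists, which avoids shell expansion of the `${...}` placeholders.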
Here,
- $O_t$ = [embedded visual features, non-visual information (e.g., robot states)].
- $A_t$ = corresponding action in training data.
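As an illustration of the pair structure defined above, the sketch below concatenates a visual embedding with the robot state to form $O_t$ and pairs it with the action $A_t$. The function names, argument names, and the use of flat Python lists are assumptions made for this example; `save_data.py` may store the data differently.

```python
def build_observation(visual_embedding, robot_state):
    """Concatenate the visual embedding and robot state into one O_t vector."""
    return list(visual_embedding) + list(robot_state)


def build_pairs(visual_embeddings, robot_states, actions):
    """Return one (O_t, A_t) pair per timestep (hypothetical helper)."""
    if not (len(visual_embeddings) == len(robot_states) == len(actions)):
        raise ValueError("all inputs must have one entry per timestep")
    return [
        (build_observation(v, s), list(a))
        for v, s, a in zip(visual_embeddings, robot_states, actions)
    ]
```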
# For flow policy (e.g., on the square task)
python save_data.py --config-dir=diffusion_policy/configs_robomimic \
--config-name=image_square_ph_visual_flow_policy_cnn.yaml \
training.seed=1103 training.device=cuda:0 hydra.run.dir='data/outputs/${name}_${task_name}'
# For diffusion policy (e.g., on the square task)
python save_data.py --config-dir=diffusion_policy/configs_robomimic \
--config-name=image_square_ph_visual_diffusion_policy_cnn.yaml \
training.seed=1103 training.device=cuda:0 hydra.run.dir='data/outputs/${name}_${task_name}'
# For other tasks, change 'square' to be among ['transport', 'tool_hang', 'can']
We give examples using logpZO and RND, which are the best-performing methods. The other baselines are run the same way after switching to the corresponding folder.
cd UQ_baselines/logpZO/ # Or change to /RND/, /CFM/, /NatPN/, /DER/ ...
# flow policy
python train.py --policy_type='flow' --type 'square'
# diffusion policy
python train.py --policy_type='diffusion' --type 'square'
cd ../..
# For other tasks, change 'square' to be among ['transport', 'tool_hang', 'can']
cd UQ_test
# --modify=false: in-distribution (ID) evaluation
python eval_together.py --policy_type='flow' --task_name='square' --device=0 --modify=false --num=2000
python eval_together.py --policy_type='diffusion' --task_name='square' --device=0 --modify=false --num=2000
# --modify=true: out-of-distribution (OOD) evaluation
python eval_together.py --policy_type='flow' --task_name='square' --device=0 --modify=true --num=2000
python eval_together.py --policy_type='diffusion' --task_name='square' --device=0 --modify=true --num=2000
cd ..
# For other tasks, change 'square' to be among ['transport', 'tool_hang', 'can']
cd UQ_test
# flow
python plot_with_CP_band.py # Generate CP band and make decision
python barplot.py # Generate barplots
# diffusion
python plot_with_CP_band.py --diffusion_policy # Generate CP band and make decision
python barplot.py --diffusion_policy # Generate barplots
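The scripts above build a conformal prediction (CP) band and use it to decide whether a rollout has failed. As a rough sketch of the idea, and not the repository's implementation, the code below computes a pointwise split-conformal upper band from per-timestep scores of calibration rollouts and raises an alarm at the first timestep where a test score exceeds it. The function names, the equal-length rollout assumption, and the pointwise quantile are simplifying assumptions made here; `plot_with_CP_band.py` may construct its band differently.

```python
import math


def conformal_quantile(scores, alpha):
    """Split-conformal upper quantile: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to certify this alpha.
        return math.inf
    return sorted(scores)[k - 1]


def cp_band(calibration, alpha=0.1):
    """Per-timestep upper band from calibration rollouts of equal length."""
    horizon = len(calibration[0])
    if any(len(rollout) != horizon for rollout in calibration):
        raise ValueError("all calibration rollouts must have the same length")
    return [
        conformal_quantile([rollout[t] for rollout in calibration], alpha)
        for t in range(horizon)
    ]


def first_alarm(scores, band):
    """Return the first timestep whose score exceeds the band, or None."""
    for t, (score, bound) in enumerate(zip(scores, band)):
        if score > bound:
            return t
    return None
```

With calibration scores taken from successful rollouts only, a smaller `alpha` gives a wider band and fewer false alarms, at the cost of later or missed detections.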
