Shuo Sha¹, Yixuan Wang¹, Binghao Huang¹, Antonio Loquercio², Yunzhu Li¹
¹Columbia University ²University of Pennsylvania
Paper | Project Page | Deploy Code | Data & Checkpoints
- Python 3.11
- uv package manager
- CUDA 12.8 (matches IsaacSim 5.1.0 / PyTorch 2.7.0 cu128; for other versions see the IsaacLab installation guide)
git clone --recurse-submodules https://github.com/shuosha/Residual_Copilot.git
cd Residual_Copilot
uv venv --python 3.11 --seed .venv
source .venv/bin/activate
UV_HTTP_TIMEOUT=300 uv sync # Isaac Sim wheels are large; extended timeout prevents download failures
To verify that all environments are registered correctly:
python scripts/list_envs.py # should list XArm-{GearMesh,NutThread,PegInsert}-{Residual,GuidedDiffusion}
First run: Isaac Sim will prompt you to accept the EULA and will download/cache simulator assets, which may take several minutes.
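The brace pattern in that comment expands to six gym IDs. As a minimal sketch, the expansion and a check against the listing can be written as follows. The helper names are hypothetical, and the check assumes `list_envs.py` prints the IDs as plain text, which is not confirmed here.

```python
from itertools import product

TASKS = ("GearMesh", "NutThread", "PegInsert")
VARIANTS = ("Residual", "GuidedDiffusion")


def expected_env_ids():
    """Expand XArm-{task}-{variant} into the six expected gym IDs."""
    return [f"XArm-{task}-{variant}" for task, variant in product(TASKS, VARIANTS)]


def missing_env_ids(listing: str):
    """Return the expected IDs that do not appear in the given listing text.

    Assumption: `listing` is the captured stdout of scripts/list_envs.py and
    contains each registered ID verbatim.
    """
    return [env_id for env_id in expected_env_ids() if env_id not in listing]
```

Passing the captured output of `python scripts/list_envs.py` to `missing_env_ids` should return an empty list when all environments are registered.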
Note: Isaac Sim can exhaust static TLS slots, causing the error `cannot allocate memory in static TLS block`. Preload `libgomp` before running any script:
export LD_PRELOAD=/lib/x86_64-linux-gnu/libgomp.so.1
The provided shell scripts handle this automatically.
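When a script is launched from Python rather than from the provided shell scripts, the same preload can be applied to the child process environment. The sketch below is one way to do that; `env_with_preload` is a hypothetical helper, not part of this repository.

```python
import os

LIBGOMP = "/lib/x86_64-linux-gnu/libgomp.so.1"


def env_with_preload(base_env=None, lib=LIBGOMP):
    """Return a copy of the environment with `lib` first in LD_PRELOAD.

    Existing LD_PRELOAD entries are kept after `lib`; duplicates of `lib`
    are dropped so repeated calls do not grow the variable.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = [p for p in env.get("LD_PRELOAD", "").split(":") if p and p != lib]
    env["LD_PRELOAD"] = ":".join([lib, *existing])
    return env
```

The result can be passed as `env=` to `subprocess.run`, for example `subprocess.run([sys.executable, "scripts/list_envs.py"], env=env_with_preload(), check=True)`.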
Generate side-by-side comparison videos of the Residual Copilot vs. an unassisted Replay baseline:
bash scripts/demo_side_by_side.sh NutThread # launches with Isaac Sim viewer
bash scripts/demo_side_by_side.sh GearMesh --headless # no viewer (for remote machines or no display)
bash scripts/demo_side_by_side.sh PegInsert --no-clean # keep intermediate files
Output: logs/demos/<task>/demo_*.mp4, annotated videos with action overlays (red = base action, pink = residual, blue = net).
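To make the three overlay colors concrete, the sketch below shows the relationship they suggest. It assumes the net action is the plain element-wise sum of the base (pilot) action and the residual (copilot) correction; the repository may additionally scale or clip the residual, which is not shown here.

```python
def net_action(base, residual):
    """Element-wise sum of a base action and a residual correction.

    Assumption: net = base + residual, with no scaling or clipping.
    """
    if len(base) != len(residual):
        raise ValueError("base and residual must have the same dimension")
    return [b + r for b, r in zip(base, residual)]


# Red arrow (base), pink arrow (residual), blue arrow (net) for one frame.
base = [0.5, 0.0, -0.25]
residual = [-0.25, 0.125, 0.0]
net = net_action(base, residual)
```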
# Pilot + copilot with recording
python scripts/play.py \
--task NutThread --pilot kNNPilot --copilot ResidualCopilot \
--num_envs 16 --record
# Pilot only
python scripts/play.py --task NutThread --pilot kNNPilot --num_envs 16
Arguments: --task (GearMesh / PegInsert / NutThread), --pilot (see below), --copilot (optional), --num_envs (default 1), --record (save to logs/rollouts/), --no_rand (disable domain randomization), --checkpoint (override the HF copilot download with a local .pth; only applies to RL-Games copilots ResidualBC / ResidualCopilot).
Pilots:
| Name | Description |
|---|---|
| kNNPilot | kNN pilot |
| BCPilot | Teleop BC policy |
| ExpertPilot | Expert BC policy |
| NoisyPilot | Expert BC + error-prone behaviors (noisy actions) |
| LaggyPilot | Expert BC + laggy behaviors (repeated actions) |
| ReplayPilot | Replay recorded episodes |
Copilots:
| Name | Description |
|---|---|
| ResidualCopilot | Residual RL trained with Residual Copilot |
| ResidualBC | Residual RL trained with BC pilot |
| GuidedDiffusionBC | Guided Diffusion trained on teleop data |
| GuidedDiffusionExpert | Guided Diffusion trained on expert data |
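To compare several pilot and copilot pairings on one task, the `play.py` commands can be generated rather than typed by hand. The following is a rough sketch that uses only the flags documented above; the helper names are hypothetical, and it assumes every pilot can be paired with every copilot, which the tables do not guarantee.

```python
import shlex
from itertools import product

PILOTS = ["kNNPilot", "BCPilot", "ExpertPilot", "NoisyPilot", "LaggyPilot", "ReplayPilot"]
# None stands for a pilot-only run (no --copilot flag).
COPILOTS = [None, "ResidualCopilot", "ResidualBC", "GuidedDiffusionBC", "GuidedDiffusionExpert"]


def play_command(task, pilot, copilot=None, num_envs=16, record=True):
    """Build the argument list for one scripts/play.py invocation."""
    cmd = ["python", "scripts/play.py", "--task", task, "--pilot", pilot]
    if copilot is not None:
        cmd += ["--copilot", copilot]
    cmd += ["--num_envs", str(num_envs)]
    if record:
        cmd.append("--record")
    return cmd


def sweep(task):
    """All pilot x copilot commands for one task (pairings assumed valid)."""
    return [play_command(task, p, c) for p, c in product(PILOTS, COPILOTS)]


if __name__ == "__main__":
    for cmd in sweep("NutThread"):
        print(shlex.join(cmd))
```

Printing the commands first, instead of executing them, makes it easy to drop pairings that turn out to be unsupported.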
All checkpoints are auto-downloaded from HuggingFace on first use. To evaluate a local RL-Games copilot checkpoint instead, pass --checkpoint <path/to/FactoryXarm.pth>:
python scripts/play.py \
--task GearMeshIntent --pilot NoisyPilot --copilot ResidualCopilot \
--checkpoint logs/rl_games/FactoryXarm/<run_name>/nn/FactoryXarm.pth \
--num_envs 15 --no_rand --record
--checkpoint only overrides RL-Games copilots (ResidualBC, ResidualCopilot). Combined with --no_rand, envs are run in deterministic order (one episode per env), so --num_envs N evaluates exactly N ordered episodes.
Per-episode and collage videos from recorded rollouts:
# <recording_dir> is created by `play.py --record` under logs/rollouts/,
# named eval_<task>_with_<copilot>_and_<pilot> (e.g. eval_GearMesh_with_ResidualCopilot_and_kNNPilot)
# Annotated single-episode videos
python scripts/vis/to_videos.py \
logs/rollouts/<recording_dir> \
--single --annotate
# Collage grid
python scripts/vis/to_videos.py \
logs/rollouts/<recording_dir> \
--collage --annotate --cols 4 --scale 0.5
When --annotate is enabled, action arrows are drawn per frame: red = base action, pink = residual, blue = net action.
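The recording directory name follows the `eval_<task>_with_<copilot>_and_<pilot>` pattern noted in the comments above, so it can be derived from the arguments given to `play.py`. The sketch below uses hypothetical helper names and covers only runs with both a pilot and a copilot; the naming for pilot-only recordings is not stated here and is left out.

```python
from pathlib import Path


def recording_dir(task, copilot, pilot, root="logs/rollouts"):
    """Path that `play.py --record` is described as writing to."""
    return Path(root) / f"eval_{task}_with_{copilot}_and_{pilot}"


def to_videos_command(rec_dir, collage=False, cols=4, scale=0.5):
    """Build a scripts/vis/to_videos.py call using the flags shown above."""
    cmd = ["python", "scripts/vis/to_videos.py", str(rec_dir)]
    if collage:
        cmd += ["--collage", "--annotate", "--cols", str(cols), "--scale", str(scale)]
    else:
        cmd += ["--single", "--annotate"]
    return cmd


rec = recording_dir("GearMesh", "ResidualCopilot", "kNNPilot")
single_cmd = to_videos_command(rec)
collage_cmd = to_videos_command(rec, collage=True)
```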
See Residual_Copilot_Deployment for real-world setup, hardware configuration, and deployment instructions.
Train the RL residual copilot using PPO with a pilot model:
python scripts/train.py \
--task XArm-GearMesh-Residual \
--pilot kNNPilot \
--num_envs 128 \
--headless
Arguments: --task (full gym ID, e.g. XArm-GearMesh-Residual), --pilot (pilot model), --num_envs (default 128), --checkpoint (resume), --distributed (multi-GPU), --track (W&B logging), --wandb_project_name (W&B project name, defaults to task config name), --wandb_name (W&B experiment name, defaults to log directory). Logs saved to logs/rl_games/.
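After training, the newest checkpoint usually needs to be passed to `play.py --checkpoint` or back to `train.py --checkpoint`. The sketch below locates it by modification time. It assumes runs are laid out as `logs/rl_games/FactoryXarm/<run_name>/nn/FactoryXarm.pth`, as in the evaluation example earlier; `latest_checkpoint` is a hypothetical helper, and other layouts would need a different glob pattern.

```python
from pathlib import Path


def latest_checkpoint(root="logs/rl_games/FactoryXarm", name="FactoryXarm.pth"):
    """Return the most recently modified <root>/<run_name>/nn/<name>, or None.

    Assumption: each training run writes its checkpoint to <run_name>/nn/.
    """
    candidates = list(Path(root).glob(f"*/nn/{name}"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Selecting by modification time avoids relying on any particular run-name format.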
Diffusion policies are trained with LeRobot. Training data can either come from successful rollout data of an existing policy or real-world data collected using the deployment repo.
1. Collect data:
python scripts/collect_data.py \
--task GearMesh --pilot kNNPilot \
--num_envs 16 --num_episodes 500 \
--output_dir logs/data/gearmesh_knn_500 --headless
2. Augment (optional):
python scripts/augment_data.py \
--in logs/data/gearmesh_train.npy --target-total 2000 \
--pos-aug 0.02 --rot-aug-deg 5
Visualize training data with fingertip position heatmaps. You can point to either an .npy file or a data root directory:
# From .npy file
python scripts/vis/plot_data.py path/to/camera_image.jpg \
--npy-path logs/data/gearmesh_train.npy \
--out logs/vis/gearmesh_heatmap.png
# From data root directory
python scripts/vis/plot_data.py path/to/camera_image.jpg \
--data-root logs/data/gearmesh_knn_500 \
--out logs/vis/gearmesh_heatmap.png
3. Train DiffusionPolicy:
bash scripts/train_bc.sh <dataset_path> <job_name>
The dataset_path can be a HuggingFace dataset ID or a locally collected dataset. Ensure the observation and action spaces are consistent with the target task, as defined in source/pilot_models/config/state_dp_cfg.json.
Available HF datasets:
- Expert: shashuo0104/0126_gearmesh_expert_2000, shashuo0104/0129_peginsert_expert_2000, shashuo0104/0129_nutthread_expert_2000
- Augmented Teleop: shashuo0104/0121_gearmesh_teleop_aug_2000, shashuo0104/0121_peginsert_teleop_aug_2000, shashuo0104/0121_nutthread_teleop_aug_2000
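Because the dataset IDs carry different date prefixes, a literal lookup table is safer than building the names from a pattern. A small sketch, with hypothetical key and helper names, that maps a data source and task to the dataset ID and the matching `train_bc.sh` call:

```python
HF_DATASETS = {
    ("expert", "GearMesh"): "shashuo0104/0126_gearmesh_expert_2000",
    ("expert", "PegInsert"): "shashuo0104/0129_peginsert_expert_2000",
    ("expert", "NutThread"): "shashuo0104/0129_nutthread_expert_2000",
    ("teleop_aug", "GearMesh"): "shashuo0104/0121_gearmesh_teleop_aug_2000",
    ("teleop_aug", "PegInsert"): "shashuo0104/0121_peginsert_teleop_aug_2000",
    ("teleop_aug", "NutThread"): "shashuo0104/0121_nutthread_teleop_aug_2000",
}


def train_bc_command(kind, task, job_name):
    """Build `bash scripts/train_bc.sh <dataset_path> <job_name>` for a listed dataset.

    `kind` is "expert" or "teleop_aug" (labels chosen for this sketch only).
    """
    try:
        dataset = HF_DATASETS[(kind, task)]
    except KeyError:
        raise KeyError(f"no listed dataset for kind={kind!r}, task={task!r}") from None
    return ["bash", "scripts/train_bc.sh", dataset, job_name]
```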
To add a new task:
- Create USD assets for the held and fixed objects (local paths or uploaded to HuggingFace)
- Define asset configs in `assembly_tasks_cfg.py`: subclass `HeldAssetCfg` and `FixedAssetCfg`
- Define the task config as an `AssemblyTask` subclass: set data paths, rewards, and success threshold
- Create the env config in `xarm_env_cfg.py`: subclass `XArmEnvCfg`
- Register in `__init__.py` with `gym.register()`
- Verify with `python scripts/list_envs.py`
To add a new pilot:
- Implement in `source/pilot_models/` with `predict()` and `reset()` methods (see `knn_pilot.py`)
- Register in `_init_pilot()` in `xarm_env.py`
- Add a mapping in `source/utils/constants.py` (`PILOT_NAME_MAP`)
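As an illustration of the `predict()` / `reset()` interface named in the steps above, here is a toy wrapper that holds each action for several steps, loosely in the spirit of the "repeat actions" behavior described for LaggyPilot. Everything here is hypothetical: the method signatures (`predict(obs)` taking a single observation) are assumed, and this is not how the repository's pilots are implemented; `knn_pilot.py` is the reference for the real interface.

```python
class EchoPilot:
    """Toy inner pilot that returns the observation as the action (illustration only)."""

    def reset(self):
        pass

    def predict(self, obs):
        return obs


class RepeatActionPilot:
    """Wraps another pilot and holds each action for `hold` consecutive steps."""

    def __init__(self, inner, hold=3):
        if hold < 1:
            raise ValueError("hold must be at least 1")
        self.inner = inner
        self.hold = hold
        self._last = None  # action currently being held
        self._age = 0      # number of steps the held action has been returned

    def reset(self):
        """Clear the held action at the start of each episode."""
        self.inner.reset()
        self._last = None
        self._age = 0

    def predict(self, obs):
        """Query the inner pilot only when the held action has expired."""
        if self._last is None or self._age >= self.hold:
            self._last = self.inner.predict(obs)
            self._age = 0
        self._age += 1
        return self._last
```

A class like this would still need the registration and name-mapping steps listed above before `--pilot` could select it.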
All data, models, and assets are hosted as a HuggingFace collection, auto-downloaded on first use.
| Repo | Contents |
|---|---|
| residual_copilot_assets | Robot USD/URDF, object meshes, camera params |
| residual_copilot_models | BC pilots, RL copilot checkpoints, DP baselines |
| residual_copilot_data | Teleoperation trajectories |
See each HuggingFace repo for the detailed file structure.