wyhuai/Hand-Object-Tracking
Learning Generalizable Hand-Object Tracking from Synthetic Demonstrations

Project teaser

TODO 📋

  • Adapt to the Isaac Sim simulator
  • Release distilled student checkpoint
  • Release pre-trained teacher checkpoints
  • Release Text2HOI motion files
  • Release multi-object training data
  • Release data generation code
  • Release training, testing, and distillation code

Installation 💽

Step 1: Build Environment

Create the conda environment and install dependencies.

# Option 1: Create manually
conda create -n hot python=3.8
conda activate hot
pip install -r requirements.txt

# Option 2: Create from yaml
# conda env create -f environment.yml

Step 2: Install Isaac Gym

  1. Download Isaac Gym Preview 4 from the NVIDIA website.
  2. Unzip the file and install the python package:
tar -xzvf IsaacGym_Preview_4_Package.tar.gz -C /{your_target_dir}/
cd /{your_target_dir}/isaacgym/python/
pip install -e .
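After installing, a quick import check (run inside the activated conda environment) can confirm the package is visible to Python. This is a convenience sketch, not part of the official setup:

```shell
# Sanity check: verify that the isaacgym Python package imports in the active env.
if python -c "import isaacgym" 2>/dev/null; then
  MSG="isaacgym import OK"
else
  MSG="isaacgym not found: re-run 'pip install -e .' inside the activated env"
fi
echo "$MSG"
```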

Dataset & Preparation 💾

1. Included Data

To keep the repository size manageable, we include only a subset of the motion data here:

  • MANO: Bottle, Box, Hammer, Sword
  • Shadow Hand: Bottle
  • Allegro Hand: Bottle

2. Full Dataset Download

For all other objects (and the full dataset), please download them from Google Drive:

⬇️ Download Full Dataset (Google Drive)

Alternatively, you can generate the dataset from scratch (or extend it to new objects) by following our detailed guide:

👉 Data Generation & Processing Guide

3. Organization

After downloading, extract the data and ensure the directory structure looks like this:

hot/data/motions/
├── dexgrasp_train_mano/
│   ├── bottle/
│   ├── box/
│   ├── hammer/
│   ├── sword/
├── dexgrasp_train_shadow/
│   └── bottle/
├── dexgrasp_train_allegro/
│   └── bottle/
└── dexgrasp_train_mano_20obj/
     └── xxx/ ... (other objects from Google Drive)
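As a quick sanity check after extraction, something like the following can report which of the expected motion directories are present (the directory names mirror the tree above; adjust the base path if you extracted elsewhere):

```shell
# Report which expected motion directories exist under hot/data/motions/.
STATUS=""
for d in dexgrasp_train_mano dexgrasp_train_shadow dexgrasp_train_allegro dexgrasp_train_mano_20obj; do
  if [ -d "hot/data/motions/$d" ]; then
    STATUS="$STATUS $d:ok"
  else
    STATUS="$STATUS $d:missing"
  fi
done
echo "$STATUS"
```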

MANO Hand ✋

The MANO hand pipeline consists of two training stages: Precise Tracking and Noisy Generalization.

Stage 1: Precise Tracking

Train the policy to closely track the reference motion with low noise.

Shell Shortcut:

bash teacher_train_stage1.sh

Full Command:

CUDA_LAUNCH_BLOCKING=1 python hot/run.py --task SkillMimicHandRand \
--num_envs 4096 \
--episode_length 60 \
--cfg_env hot/data/cfg/mano/mano_stage1_precise_track.yaml \
--cfg_train hot/data/cfg/train/rlg/skillmimic_denseobj.yaml \
--motion_file hot/data/motions/dexgrasp_train_mano_gmp/bottle \
--state_noise_prob 0.2 \
--enable_obj_keypoints \
--enable_ig_scale \
--use_delta_action \
--enable_dof_obs \
--enable_early_termination \
--hand_model mano \
--objnames Bottle \
--headless

Stage 2: Noisy Generalization

Train with higher noise and object randomization to improve robustness.

Shell Shortcut:

bash teacher_train_stage2.sh

Full Command:

CUDA_LAUNCH_BLOCKING=1 python hot/run.py --task SkillMimicHandRand \
--num_envs 4096 \
--episode_length 60 \
--cfg_env hot/data/cfg/mano/mano_stage2_noisey_generalize.yaml \
--cfg_train hot/data/cfg/train/rlg/skillmimic_denseobj.yaml \
--motion_file hot/data/motions/dexgrasp_train_mano_gmp/bottle \
--state_noise_prob 0.5 \
--obj_rand_scale \
--enable_obj_keypoints \
--enable_ig_scale \
--use_delta_action \
--enable_dof_obs \
--enable_early_termination \
--hand_model mano \
--objnames Bottle \
--headless

Inference Command:

Test the trained model.

CUDA_LAUNCH_BLOCKING=1 python hot/run.py --test --task SkillMimicHandRand \
--num_envs 1 \
--cfg_env hot/data/cfg/mano/mano_stage1_precise_track.yaml \
--cfg_train hot/data/cfg/train/rlg/skillmimic_denseobj.yaml \
--motion_file hot/data/motions/dexgrasp_train_mano/bottle/grasp_higher_kp \
--state_init 2 \
--episode_length 180 \
--enable_obj_keypoints \
--use_delta_action \
--enable_dof_obs \
--objnames ${OBJ_NAME} \
--checkpoint ${CHECKPOINT}

Please note that each skill requires a specific --episode_length setting during training and inference. Refer to the table below for the values:

| Parameter | Grasp | Move | Place | Regrasp | Rotate | Catch | Throw | Freemove |
|---|---|---|---|---|---|---|---|---|
| Skill Label | 1 | 2 | 3 | 5 | 6 | 7 | 8 | 9 |
| Test Ep. Length | 180 | 120 | 220 | 180 | 120 | 100 | 50 | 120 |
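For scripting, the table can be encoded as a small lookup helper. The function name here is our own, not part of the repo:

```shell
# Map a skill name to its test episode length (values from the table above).
skill_ep_length() {
  case "$1" in
    grasp|regrasp)        echo 180 ;;
    move|rotate|freemove) echo 120 ;;
    place)                echo 220 ;;
    catch)                echo 100 ;;
    throw)                echo 50  ;;
    *) echo "unknown skill: $1" >&2; return 1 ;;
  esac
}

EP_LEN=$(skill_ep_length grasp)   # -> 180
echo "$EP_LEN"
```

You could then pass `--episode_length "$EP_LEN"` to the inference command.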

Inference Text2HOI Tracking:

# Replace ${OBJ_NAME} and ${CKPT_SUFFIX} according to the table below
CUDA_LAUNCH_BLOCKING=1 python hot/run.py --test --task SkillMimicHandRand \
--num_envs 1 \
--cfg_env hot/data/cfg/mano/mano_stage1_precise_track.yaml \
--cfg_train hot/data/cfg/train/rlg/skillmimic_denseobj.yaml \
--motion_file hot/data/motions/text2hoi/000_Text2HOI_${OBJ_NAME}-ckpt_${CKPT_SUFFIX}.pt \
--state_init 2 \
--episode_length 180 \
--enable_obj_keypoints \
--use_delta_action \
--enable_dof_obs \
--objnames Text2HOI_${OBJ_NAME} \
--checkpoint checkpoint/multiobj_teacher_checkpoints/GraspMovePlace_${CKPT_SUFFIX}_0.pth

Supported Trajectory Configurations:

Please refer to the following mapping to set the correct arguments for each trajectory:

| Trajectory ID | ${OBJ_NAME} | ${CKPT_SUFFIX} | Text2HOI Prompt |
|---|---|---|---|
| 1 | Apple | sword | Eat an apple with right hands. |
| 2 | Duck | airplane | Play duck with right hands. |
| 3 | Piggybank | book | Pass a piggybank with right hand. |
| 4 | Waterbottle | airplane | Hold a waterbottle with right hands. |

For example, to track the Piggybank, you would use OBJ_NAME=Piggybank and CKPT_SUFFIX=book.
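Concretely, for the Piggybank row the two variables expand into the following motion-file and checkpoint paths (shown only to illustrate the substitution; the path templates come from the command above):

```shell
OBJ_NAME=Piggybank
CKPT_SUFFIX=book
MOTION_FILE="hot/data/motions/text2hoi/000_Text2HOI_${OBJ_NAME}-ckpt_${CKPT_SUFFIX}.pt"
CHECKPOINT="checkpoint/multiobj_teacher_checkpoints/GraspMovePlace_${CKPT_SUFFIX}_0.pth"
echo "$MOTION_FILE"
echo "$CHECKPOINT"
```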


Shadow Hand 🦾

Training

CUDA_LAUNCH_BLOCKING=1 python hot/run.py --task SkillMimicHandRand \
--num_envs 4096 \
--episode_length 60 \
--cfg_env hot/data/cfg/shadow/shadow_stage1_precise_track.yaml \
--cfg_train hot/data/cfg/train/rlg/skillmimic_denseobj.yaml \
--motion_file hot/data/motions/dexgrasp_train_shadow/bottle/grasp_higher_kp \
--state_noise_prob 0.2 \
--enable_obj_keypoints \
--enable_ig_scale \
--enable_dof_obs \
--use_delta_action \
--enable_early_termination \
--hand_model shadow \
--objnames Bottle \
--headless

Inference

CUDA_LAUNCH_BLOCKING=1 python hot/run.py --task SkillMimicHandRand \
--test \
--num_envs 1 \
--episode_length 180 \
--cfg_env hot/data/cfg/shadow/shadow_stage1_precise_track.yaml \
--cfg_train hot/data/cfg/train/rlg/skillmimic_denseobj.yaml \
--motion_file hot/data/motions/dexgrasp_train_shadow/bottle/grasp_higher_kp \
--state_init 2 \
--enable_obj_keypoints \
--enable_ig_scale \
--enable_dof_obs \
--use_res_action \
--hand_model shadow \
--objnames Bottle \
--checkpoint checkpoint/shadow/shadow_bottle_grasp-move-place.pth

Allegro Hand 🦾

Training

CUDA_LAUNCH_BLOCKING=1 python hot/run.py --task SkillMimicHandRand \
--num_envs 4096 \
--episode_length 60 \
--cfg_env hot/data/cfg/allegro/allegro_stage1_precise_track.yaml \
--cfg_train hot/data/cfg/train/rlg/skillmimic_denseobj.yaml \
--motion_file hot/data/motions/dexgrasp_train_allegro/bottle/grasp_higher_kp \
--state_noise_prob 0.2 \
--enable_obj_keypoints \
--enable_ig_scale \
--enable_dof_obs \
--use_delta_action \
--enable_early_termination \
--hand_model allegro \
--objnames Bottle \
--headless

Inference

CUDA_LAUNCH_BLOCKING=1 python hot/run.py --task SkillMimicHandRand \
--test \
--num_envs 1 \
--episode_length 180 \
--cfg_env hot/data/cfg/allegro/allegro_stage1_precise_track.yaml \
--cfg_train hot/data/cfg/train/rlg/skillmimic_denseobj.yaml \
--motion_file hot/data/motions/dexgrasp_train_allegro/bottle/grasp_higher_kp \
--state_init 2 \
--enable_obj_keypoints \
--enable_ig_scale \
--enable_dof_obs \
--use_delta_action \
--hand_model allegro \
--objnames Bottle \
--checkpoint checkpoint/allegro/allegro_bottle_grasp-move-place.pth



Distillation 🧪

This section covers the policy distillation process, designed to train a unified student policy capable of handling multiple skills or multiple objects simultaneously.

⚠️ Important: Before running the commands below, edit hot/data/cfg/skillmimic_multiobjs_distill.yaml and hot/data/cfg/skillmimic_distill.yaml to specify:

  • obj_names: The list of objects you want to distill (e.g., ['Bottle', 'Box', ...]).
  • teacher_ckpt: The file paths to the pre-trained teacher checkpoints for each corresponding object.
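As an illustration, the relevant fields might look like the fragment below. The exact key structure may differ, so check the shipped yaml files; the checkpoint paths here are placeholders:

```yaml
# Illustrative fragment -- field names from the bullets above, paths are placeholders.
obj_names: ['Bottle', 'Box', 'Hammer']
teacher_ckpt:
  - checkpoint/teacher/bottle.pth
  - checkpoint/teacher/box.pth
  - checkpoint/teacher/hammer.pth
```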

💾 Data Preparation (Optional)

To improve distillation performance, you can generate physically plausible motion data using the trained teacher policies.

  1. Run the Teacher Policy Inference with the --save_refined_data flag.
  2. Use the path of the saved data to replace the --refined_motion_file argument in the distillation command below.
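Put together, the two steps might look like the sketch below. The `--save_refined_data` flag is named in step 1 above; the output directory is hypothetical, so substitute the path your inference run actually writes to:

```shell
# Step 1 (sketch): teacher inference with refined-data saving enabled;
# reuse the full inference command from the sections above and append the flag.
#   python hot/run.py --test ... --save_refined_data

# Step 2: point distillation at wherever the refined clips were written.
REFINED_DIR="hot/data/motions/refined/bottle"   # hypothetical output path
DISTILL_EXTRA_ARGS="--refined_motion_file ${REFINED_DIR}"
echo "$DISTILL_EXTRA_ARGS"
```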

Multi-Skill Distillation

Distill diverse skills (e.g., grasp, move, place) into a single policy.

Command:

DRI_PRIME=1 CUDA_VISIBLE_DEVICES=1  CUDA_LAUNCH_BLOCKING=1 python hot/run.py --task Distill \
--num_envs 1024 \
--episode_length 60 \
--cfg_env hot/data/cfg/skillmimic_distill.yaml \
--cfg_train hot/data/cfg/train/rlg/skillmimic_distill.yaml \
--motion_file hot/data/motions/dexgrasp_train_mano_gmp/bottle \
--refined_motion_file hot/data/motions/dexgrasp_train_mano_gmp/bottle \
--state_noise_prob 0.3 \
--enable_obj_keypoints \
--enable_ig_scale \
--enable_dof_obs \
--use_delta_action \
--enable_early_termination \
--headless \
--obj_rand_scale

Multi-Object Distillation

Distill interaction skills across different objects (e.g., Bottle, Box, Hammer) into a single policy.

Command:

DRI_PRIME=1 CUDA_VISIBLE_DEVICES=0  CUDA_LAUNCH_BLOCKING=1 python hot/run.py --task MultiObjDistill \
--num_envs 8192 \
--episode_length 60 \
--cfg_env hot/data/cfg/skillmimic_multiobjs_distill.yaml \
--cfg_train hot/data/cfg/train/rlg/skillmimic_distill.yaml \
--motion_file hot/data/motions/dexgrasp_train_mano_20obj \
--refined_motion_file hot/data/motions/dexgrasp_train_mano_20obj \
--state_noise_prob 0.3 \
--enable_obj_keypoints \
--enable_ig_scale \
--enable_dof_obs \
--use_delta_action \
--enable_early_termination \
--headless \
--obj_rand_scale

Inference

For general testing, you can use the standard inference commands described in the MANO/Shadow/Allegro sections above (ensure you point to the distilled checkpoint).

For Multi-Object Distillation: We provide a convenient script for testing multi-object policies. Please modify the CHECKPOINT_PATH variable in test.sh to your own checkpoint path before running:

bash test.sh

About

Code for the paper "Learning Generalizable Hand-Object Tracking Controller from Synthetic Hand-Object Demonstrations"
