Official code for "DexH2R: A Benchmark for Dynamic Dexterous Grasping in Human-to-Robot Handover" (ICCV 2025) Project Page | Paper
Please see the README in the `GraspAndMotionet` folder.
Please see the README in the `DiffusionPolicy` folder.
- Clone this repository:

```shell
git clone [email protected]:wang-youzhuo/DexH2R.git
```
- Create a `dataset` folder under `DexH2R`:

```shell
mkdir dataset
```
- Download the dataset from here, and put it under `dataset`.
We divided the dataset into individual units: all the data for each subject is stored in a single zip file. Each zip file contains the data from 18 viewpoints across three types of cameras (Kinect, RealSense, and ZCam). It also includes the qpos of the Shadow Hand, the human hand reconstruction results for the subject, segmented and clustered point clouds of the real-world objects, and the object pose at each time step. The detailed directory structure of each zip file is as follows:
```
DexH2R
├── GraspAndMotionet
├── DiffusionPolicy
└── dataset
    ├── DexH2R_dataset
    │   ├── 0 (subject_name)
    │   │   ├── air_duster (object_name)
    │   │   │   ├── 0 (sequence_index)
    │   │   │   │   ├── calibration
    │   │   │   │   ├── kinect
    │   │   │   │   ├── realsense
    │   │   │   │   ├── zcam
    │   │   │   │   ├── (right/left)_mano.pt
    │   │   │   │   ├── obj_pose.pt
    │   │   │   │   ├── qpos.pt
    │   │   │   │   ├── real_obj_pcd_color.pt
    │   │   │   │   └── real_obj_pcd_xyz.pt
    │   │   │   └── ...
    │   │   ├── bathroom_cleaner (object_name)
    │   │   └── ...
    │   │
    │   ├── 1 (subject_name)
    │   └── ...
    ├── object_model
    └── dataset_split.json
```
To save space, all depth images are stored as `torch.int16`. Please manually reinterpret the data as `numpy.uint16` to recover the normal depth format.
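The conversion above amounts to reinterpreting the raw 16-bit values rather than casting them (large depth values wrap to negative numbers in `int16`). A minimal sketch:

```python
import numpy as np
import torch

def depth_to_uint16(depth_int16: torch.Tensor) -> np.ndarray:
    """Reinterpret a torch.int16 depth image as numpy.uint16.

    Uses a bit-level view (not a value cast), so depth values above
    32767 that wrapped to negative int16 values are recovered intact.
    """
    return depth_int16.numpy().view(np.uint16)
```

Note that `.view()` reinterprets the underlying bytes, whereas `.astype(np.uint16)` would clip or wrap values incorrectly.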
Download `obj_and_data_split.zip` from here and unzip it in the `dataset` folder; this archive contains the objects' mesh models and the dataset split information.
Download `shadow hand description.zip` from here and unzip it in `GraspAndMotionet/assets`.
You can visualize the dataset with `GraspAndMotionet/viz_and_test_motion/viz_dataset.py`.
For all the URDF files in `dataset/object_model`, you need to change the mesh paths they contain to absolute paths on your computer.
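This rewrite can be automated. A minimal sketch (the `filename="..."` attribute pattern and the assumption that mesh files live relative to the URDF are ours, not from the repo):

```python
import re
from pathlib import Path

def make_mesh_paths_absolute(urdf_path):
    """Rewrite relative mesh filenames in a URDF to absolute paths.

    Assumes mesh files are referenced via filename="..." attributes
    and are located relative to the URDF file's own directory.
    """
    urdf_path = Path(urdf_path)
    text = urdf_path.read_text()

    def to_abs(match):
        rel = match.group(1)
        # Joining with an already-absolute path leaves it unchanged.
        return f'filename="{(urdf_path.parent / rel).resolve()}"'

    urdf_path.write_text(re.sub(r'filename="([^"]+)"', to_abs, text))
```

For example, running it over every file in `dataset/object_model` once after download would make the URDFs loadable from any working directory.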
In the camera extrinsic parameters folder:

- The content in the `kinect` folder refers to the extrinsic parameters between Kinect camera 0 and the other cameras.
- `hand_arm_mesh_to_kinect_pcd_0` refers to the extrinsic parameters between the robotic hand's coordinate system and Kinect camera 0.
- `realsense_to_forearm_index_0` refers to the coordinate transformation between RealSense 0 and the Shadow Hand's forearm (the RealSense and the forearm are relatively stationary); the same applies to `realsense_to_forearm_index_1`.
- `zcam_calibration.json` specifies all the extrinsic transformations between ZCams, and `zcam_to_kinect_transform` specifies the coordinate transformation between the ZCams and Kinect camera 0.

For the specific usage of all these coordinate transformations, please refer to `GraspAndMotionet/viz_and_test_motion/viz_dataset.py`.
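Chaining these extrinsics is ordinary homogeneous-transform composition. A minimal sketch with 4x4 matrices (frame names and the row-vector-of-points layout are illustrative; see `viz_dataset.py` for the conventions actually used):

```python
import numpy as np

def compose(T_a_to_b: np.ndarray, T_b_to_c: np.ndarray) -> np.ndarray:
    """Compose two 4x4 transforms: maps points from frame a to frame c."""
    return T_b_to_c @ T_a_to_b

def transform_points(T: np.ndarray, pts: np.ndarray) -> np.ndarray:
    """Apply a 4x4 homogeneous transform T to an (N, 3) point array."""
    homo = np.concatenate([pts, np.ones((pts.shape[0], 1))], axis=1)
    return (T @ homo.T).T[:, :3]

# e.g. a ZCam point cloud could be brought into the Kinect camera 0 frame
# with the matrix from zcam_to_kinect_transform, then on to another camera
# with the matrices stored in the kinect folder.
```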
For users training with the Zarr dataset, use the following commands to reassemble and verify the data.
- Extract all splits: concatenate the split parts and pipe them directly to `tar` for extraction.

```shell
# Extract train, test, and validation sets
cat train.tar.part* | tar -xvf -
cat test.tar.part* | tar -xvf -
cat val.tar.part* | tar -xvf -
```
- Verify data integrity: after extraction, use `sha256sum` to ensure the data is not corrupted.

```shell
# Verify all sets
sha256sum -c train.sha256 > train_check.txt
sha256sum -c test.sha256 > test_check.txt
sha256sum -c val.sha256 > val_check.txt
```
Note: check the `*_check.txt` logs. If all files report `OK`, you are ready to start training.
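Scanning the logs can be scripted; a small helper (the log filenames match the verification step above, everything else is illustrative):

```shell
# Count lines in each verification log that do not end in ": OK".
check_logs() {
  for log in "$@"; do
    [ -f "$log" ] || { echo "$log: missing"; continue; }
    bad=$(grep -cv ': OK$' "$log")
    echo "$log: $bad mismatched file(s)"
  done
}
check_logs train_check.txt test_check.txt val_check.txt
```

A count of 0 for every log means all archives extracted cleanly.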
```bibtex
@article{wang2025dexh2r,
  title={DexH2R: A Benchmark for Dynamic Dexterous Grasping in Human-to-Robot Handover},
  author={Wang, Youzhuo and Ye, Jiayi and Xiao, Chuyang and Zhong, Yiming and Tao, Heng and Yu, Hang and Liu, Yumeng and Yu, Jingyi and Ma, Yuexin},
  journal={arXiv preprint arXiv:2506.23152},
  year={2025}
}
```
This work and the dataset are licensed under CC BY-NC 4.0.

