Abstract
Performing in-hand, contact-rich, and long-horizon dexterous manipulation remains an unsolved challenge in robotics. Prior work on hand dexterity has considered each of these three challenges in isolation, but has not combined them in a single, complex task. To further test the limits of dexterity, we propose drumming as a testbed for dexterous manipulation. Drumming naturally integrates all three challenges: it involves in-hand control for stabilizing and adjusting the drumstick with the fingers, contact-rich interaction through repeated striking of the drum surface, and long-horizon coordination when switching between drums and sustaining rhythmic play. We present DexDrummer, a hierarchical object-centric bimanual drumming policy trained in simulation with sim-to-real transfer. The framework reduces the exploration difficulty of pure reinforcement learning by combining trajectory planning with residual RL corrections for fast transitions between drums. A dexterous manipulation policy handles contact-rich dynamics, guided by rewards that explicitly model both finger–stick and stick–drum interactions. In simulation, we show that our policy can play two styles of music: multi-drum, bimanual songs and challenging technical exercises that require increased dexterity. Across simulated bimanual tasks, our dexterous, reactive policy outperforms a fixed-grasp policy in F1 score by 1.87x on easy songs and 1.22x on hard songs. In real-world tasks, we demonstrate song performance on a multi-drum setup. DexDrummer plays our training song and its extended version with an F1 score of 1.0.
Method
System Overview
DexDrummer is a hierarchical, object-centric bimanual drumming policy.
- High-level policy: Plans drumstick-centered task-space trajectories from target drum sequences, maps them to end-effector motions, and applies residual RL for fast transition correction.
- Low-level policy: Produces dexterous finger control for stable in-hand stick manipulation under dynamic impacts and precise strikes.
- Rewards:
  - In-hand contact: fingertip/fulcrum grasp rewards with arm and energy regularization to encourage finger-driven control.
  - External contact: trajectory shaping + contact curriculum + reactive grasp for robust stick-drum interaction.
  - Task: sparse hit-timing reward for successful drum strikes.
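The high-level idea of planning a trajectory and then letting RL add a small correction can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration rather than a detail taken from DexDrummer: the minimum-jerk interpolation, the function names `plan_stick_trajectory` and `apply_residual`, and the residual scale of 0.05 are all hypothetical.

```python
def min_jerk_blend(s):
    """Minimum-jerk time scaling on s in [0, 1] (assumed planner profile)."""
    return 10 * s**3 - 15 * s**4 + 6 * s**5


def plan_stick_trajectory(start, goal, num_steps):
    """Interpolate stick-tip waypoints from start to goal (hypothetical planner)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    trajectory = []
    for k in range(num_steps):
        blend = min_jerk_blend(k / (num_steps - 1))
        trajectory.append([a + blend * (g - a) for a, g in zip(start, goal)])
    return trajectory


def apply_residual(planned, residual, scale=0.05):
    """Add a bounded correction to a planned waypoint.

    The residual stands in for a policy output; it is clipped to [-1, 1]
    and scaled so that the planner remains the dominant term.
    """
    return [p + scale * max(-1.0, min(1.0, r)) for p, r in zip(planned, residual)]


if __name__ == "__main__":
    waypoints = plan_stick_trajectory([0.0, 0.0, 0.2], [0.3, 0.1, 0.2], num_steps=5)
    corrected = apply_residual(waypoints[2], residual=[0.4, -2.0, 0.0])
    print(waypoints[2], corrected)
```

Bounding the residual keeps exploration close to the planned motion, which is one common way to make the RL problem easier than learning arm motion from scratch.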
Real-world Drumming Video
Everlong: original song audio and drum-only audio.
Seven Nation Army: original song audio and drum-only audio.
Analysis
Finger-driven vs. Arm-driven
Finger-driven
More dexterous stick stabilization and consistent strike timing.
Arm-driven
Relies more on arm motion with less in-hand adjustment.
Contact Curriculum
With Contact Curriculum
With contact curriculum, the policy unlocks finger dexterity and maintains in-hand exploration.
Without Contact Curriculum
Without it, the stick rests on the drum head and limits finger exploration; subtle finger movements are easily canceled by external contact.
Ablations for Arm Control
Motion Planning + Residual RL
Our full method. F1 score: 1.0.
Without Residual RL
Motion planning only. F1 score: 0.8.
Without Residual RL and Motion Planning
RL from scratch. F1 score: 0.5.
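The F1 scores above summarize how well played hits match the target song. This page does not spell out the matching rule, so the following is only a sketch under an assumed convention: a played hit counts as correct if it lands on the right drum within a timing tolerance of a target hit, and each hit can be matched at most once. The function name `hit_f1`, the `(time_seconds, drum_name)` tuple format, and the 50 ms tolerance are hypothetical.

```python
def hit_f1(target_hits, played_hits, tolerance=0.05):
    """F1 score between target and played drum hits.

    Each hit is a (time_seconds, drum_name) tuple. A target hit is matched
    greedily to the earliest unmatched played hit on the same drum within
    the tolerance (assumed matching rule, not taken from the paper).
    """
    unmatched = sorted(played_hits)
    true_positives = 0
    for target_time, target_drum in sorted(target_hits):
        for i, (played_time, played_drum) in enumerate(unmatched):
            if played_drum == target_drum and abs(played_time - target_time) <= tolerance:
                true_positives += 1
                del unmatched[i]  # each played hit may be matched only once
                break
    precision = true_positives / len(played_hits) if played_hits else 0.0
    recall = true_positives / len(target_hits) if target_hits else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    target = [(0.0, "snare"), (0.5, "tom"), (1.0, "snare"), (1.5, "tom")]
    played = [(0.01, "snare"), (0.52, "tom"), (1.2, "snare"), (1.49, "tom")]
    print(hit_f1(target, played))
```

Under this convention, a score of 1.0 means every target hit was played on the correct drum within the tolerance and no extra hits were played.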
BibTeX
@misc{fang2026dexdrummer,
title={DexDrummer: In-Hand, Contact-Rich, and Long-Horizon Dexterous Robot Drumming},
author={Hung-Chieh Fang and Amber Xie and Jennifer Grannen and Kenneth Llontop and Dorsa Sadigh},
year={2026},
eprint={2603.22263},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2603.22263},
}