DexDrummer: In-Hand, Contact-Rich, and Long-Horizon Dexterous Robot Drumming

Abstract

Performing in-hand, contact-rich, and long-horizon dexterous manipulation remains an unsolved challenge in robotics. Prior work on hand dexterity has addressed each of these three challenges in isolation but has not combined them in a single, complex task. To further test dexterous capabilities, we propose drumming as a testbed for dexterous manipulation. Drumming naturally integrates all three challenges: it involves in-hand control for stabilizing and adjusting the drumstick with the fingers, contact-rich interaction through repeated striking of the drum surface, and long-horizon coordination when switching between drums and sustaining rhythmic play. We present DexDrummer, a hierarchical object-centric bimanual drumming policy trained in simulation with sim-to-real transfer. The framework reduces the exploration difficulty of pure reinforcement learning by combining trajectory planning with residual RL corrections for fast transitions between drums. A dexterous manipulation policy handles contact-rich dynamics, guided by rewards that explicitly model both finger–stick and stick–drum interactions. In simulation, we show that our policy can play two styles of music: multi-drum, bimanual songs and challenging, technical exercises that require increased dexterity. Across simulated bimanual tasks, our dexterous, reactive policy outperforms a fixed-grasp policy in F1 score by 1.87x on easy songs and 1.22x on hard songs. In real-world tasks, we demonstrate song performance on a multi-drum setup, where DexDrummer plays our training song and its extended version with an F1 score of 1.0.

Method

System Overview

DexDrummer is a hierarchical, object-centric bimanual drumming policy.

High-level policy: Plans drumstick-centered task-space trajectories from target drum sequences, maps them to end-effector motions, and applies residual RL for fast transition correction.
Low-level policy: Produces dexterous finger control for stable in-hand stick manipulation under dynamic impacts and precise strikes.
Rewards:
- In-hand contact: fingertip/fulcrum grasp rewards with arm and energy regularization to encourage finger-driven control.
- External contact: trajectory shaping + contact curriculum + reactive grasp for robust stick-drum interaction.
- Task: sparse hit-timing reward for successful drum strikes.
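
The high-level combination of planning and residual RL described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the minimum-jerk interpolation, the `residual_scale` bound, the coordinates, and all function names are hypothetical choices made here for clarity.

```python
import numpy as np

def plan_waypoint(start, goal, phase):
    """Hypothetical planner: minimum-jerk blend between two stick-tip positions.

    phase runs from 0 (at start) to 1 (at goal).
    """
    s = np.clip(phase, 0.0, 1.0)
    blend = 10 * s**3 - 15 * s**4 + 6 * s**5
    return start + blend * (goal - start)

def compose_action(planned, residual, residual_scale=0.02):
    """Add a bounded correction to the planned target.

    tanh squashes the raw residual so the correction never exceeds
    residual_scale (an assumed bound, in meters) along any axis.
    """
    return planned + residual_scale * np.tanh(residual)

# Assumed stick-tip positions above two drums (meters).
start = np.array([0.00, 0.0, 0.10])
goal = np.array([0.30, 0.0, 0.10])

planned = plan_waypoint(start, goal, phase=0.5)
residual = np.array([5.0, -5.0, 0.0])  # stand-in for a learned policy output
target = compose_action(planned, residual)
```

Bounding the residual keeps the executed target close to the planned trajectory, so the learned component only has to explore small corrections rather than the full transition between drums.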

Real-world Drumming Video

Everlong

Original Song Audio

Drum-only Audio

Seven Nation Army

Original Song Audio

Drum-only Audio

Analysis

Finger-driven vs. Arm-driven

Finger-driven

More dexterous stick stabilization and consistent strike timing.

Arm-driven

Relies more on arm motion with less in-hand adjustment.

Contact Curriculum

With Contact Curriculum

With the contact curriculum, the policy unlocks finger dexterity and maintains in-hand exploration.

Without Contact Curriculum

Without it, the stick rests on the drum head and limits finger exploration; subtle finger movements are easily canceled by external contact.

Ablations for Arm Control

Motion Planning + Residual RL

Our full method. F1 score: 1.0.

Without Residual RL

Motion planning only. F1 score: 0.8.

Without Residual RL or Motion Planning

RL from scratch. F1 score: 0.5.
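
For readers unfamiliar with F1 as a drumming metric, the sketch below shows one plausible way to score a performance. The page does not specify the exact matching procedure, so everything here is an assumption: a hit counts as correct if it lands on the right drum within a tolerance `tol` of a target time, each target can be matched at most once, and the function and variable names are hypothetical.

```python
def f1_score(targets, hits, tol=0.05):
    """Score hits against targets; both are lists of (time_s, drum_name).

    Assumed rule: a hit matches the closest unmatched target on the same
    drum within tol seconds. Matching is greedy and one-to-one.
    """
    unmatched = list(targets)
    true_pos = 0
    for t_hit, drum in sorted(hits):
        candidates = [
            tg for tg in unmatched
            if tg[1] == drum and abs(tg[0] - t_hit) <= tol
        ]
        if candidates:
            best = min(candidates, key=lambda tg: abs(tg[0] - t_hit))
            unmatched.remove(best)
            true_pos += 1
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(hits)     # fraction of hits that were correct
    recall = true_pos / len(targets)     # fraction of targets that were hit
    return 2 * precision * recall / (precision + recall)

# Made-up example data, not results from the paper.
targets = [(0.0, "snare"), (0.5, "tom"), (1.0, "snare"), (1.5, "tom")]
hits = [(0.01, "snare"), (0.52, "tom"), (1.20, "snare")]
score = f1_score(targets, hits)
perfect = f1_score(targets, targets)
```

Under this rule, a score of 1.0 requires every target to be struck on the correct drum within the tolerance, with no extra hits.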

BibTeX

@misc{fang2026dexdrummer,
      title={DexDrummer: In-Hand, Contact-Rich, and Long-Horizon Dexterous Robot Drumming}, 
      author={Hung-Chieh Fang and Amber Xie and Jennifer Grannen and Kenneth Llontop and Dorsa Sadigh},
      year={2026},
      eprint={2603.22263},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2603.22263}, 
}