This repository contains the official implementation of our paper:
Act to See, See to Act: Diffusion-Driven Perception-Action Interplay for Adaptive Policies
Jing Wang, Weiting Peng, Jing Tang, Zeyu Gong, Xihua Wang, Bo Tao, Li Cheng
[NeurIPS 2025]
Existing imitation learning methods freeze perception during action sequence generation, ignoring how humans naturally refine perception through ongoing actions. DP-AG (Action-Guided Diffusion Policy) closes this gap by evolving observation features dynamically with action feedback.
- Latent observations are modeled via variational inference.
- An action-guided SDE evolves features, driven by the Vector–Jacobian Product (VJP) of diffusion noise predictions.
- A cycle-consistent contrastive loss aligns evolving and static latents, ensuring smooth perception–action interplay.
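To make the second bullet concrete, here is a minimal, illustrative sketch of an action-guided feature update: one Euler–Maruyama step of an SDE whose drift is the vector–Jacobian product (VJP) of the noise prediction with respect to the latent observation. All names (`eps_net`, the choice of cotangent, the step sizes) are hypothetical placeholders, not the paper's exact formulation:

```python
import torch

def action_guided_step(z, a, eps_net, dt=0.01, sigma=0.05):
    """One Euler-Maruyama step of dz = -VJP dt + sigma dW.

    z        : latent observation features, shape (B, D)
    a        : current action (or action chunk) conditioning, shape (B, D)
    eps_net  : callable (a, z) -> diffusion noise prediction, shape (B, D)
    """
    z = z.detach().requires_grad_(True)
    eps = eps_net(a, z)  # noise prediction conditioned on the action
    # VJP of eps w.r.t. z; using eps itself as the cotangent is one
    # plausible choice for illustration.
    vjp = torch.autograd.grad(eps, z, grad_outputs=eps.detach())[0]
    noise = sigma * (dt ** 0.5) * torch.randn_like(z)
    return (z - vjp * dt + noise).detach()
```

The key point is that the latent evolves *during* action generation, so perception is refreshed by action feedback rather than frozen after encoding.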
DP-AG significantly outperforms state-of-the-art methods on Robomimic, Franka Kitchen, Push-T, Dynamic Push-T, and real-world UR5 tasks, delivering higher success rates, faster convergence, and smoother actions.
Figure: DP-AG extends Diffusion Policy by evolving observation features through an action-guided SDE and aligning perception–action interplay with a cycle-consistent contrastive loss.
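The cycle-consistent alignment between evolving and static latents can be sketched, for intuition only, as a symmetric InfoNCE-style loss where matching batch indices are positives and everything else is a negative. This is a generic contrastive sketch, not the paper's exact objective:

```python
import torch
import torch.nn.functional as F

def cycle_contrastive_loss(z_evolved, z_static, temperature=0.1):
    """Symmetric InfoNCE sketch aligning evolving and static latents.

    Averaging both directions (evolved->static and static->evolved)
    gives the cycle-consistent, symmetric form.
    """
    z_e = F.normalize(z_evolved, dim=-1)
    z_s = F.normalize(z_static, dim=-1)
    logits = z_e @ z_s.t() / temperature      # (B, B) similarity matrix
    targets = torch.arange(z_e.size(0))        # positives on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```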
We follow the same setup as Diffusion Policy. To reproduce simulation benchmarks, install the conda environment on a Linux machine with an NVIDIA GPU.
```bash
sudo apt install -y libosmesa6-dev libgl1-mesa-glx libglfw3 patchelf
```

We recommend Mambaforge:

```bash
mamba env create -f conda_environment.yaml
```

For conda:

```bash
conda env create -f conda_environment.yaml
```

Note: `conda_environment_macos.yaml` is only for development on macOS and does not support full benchmarks.
Create the data directory under the repo root:

```bash
mkdir data && cd data
```

Download training datasets:

```bash
wget https://diffusion-policy.cs.columbia.edu/data/training/pusht.zip
unzip pusht.zip && rm -f pusht.zip && cd ..
```

Grab experiment configs:

```bash
wget -O image_pusht_diffusion_policy_cnn.yaml \
  https://diffusion-policy.cs.columbia.edu/data/experiments/image/pusht/diffusion_policy_cnn/config.yaml
```

We provide our Dynamic Push-T dataset in the `data/` folder of this repository for direct use with our implementation.
For all other simulation benchmarks (e.g., Robomimic, Franka Kitchen, Push-T), please refer to the official Diffusion Policy repository for instructions on downloading and preparing datasets.
We provide two Jupyter notebooks that contain the core implementation of DP-AG and are designed to be easy to use and understand:
- `PushT-Vision-Image-Action-Guided.ipynb` – Demonstrates our method on the Push-T benchmark.
- `Dynamic-PushT-Environment.ipynb` – Showcases our Dynamic Push-T environment with action–perception interplay.
👉 We strongly suggest starting from these notebooks, as they provide the clearest entry point for understanding and experimenting with DP-AG.
Activate the conda environment and log into wandb:

```bash
conda activate robodiff
wandb login
```

Train with a single seed:

```bash
python train.py --config-dir=. --config-name=image_pusht_diffusion_policy_cnn.yaml \
  training.seed=42 training.device=cuda:0
```

Train with multiple seeds using Ray:

```bash
export CUDA_VISIBLE_DEVICES=0,1,2
ray start --head --num-gpus=3
python ray_train_multirun.py --config-dir=. --config-name=image_pusht_diffusion_policy_cnn.yaml \
  --seeds=42,43,44 --monitor_key=test/mean_score
```

Download a checkpoint (example):

```bash
wget https://diffusion-policy.cs.columbia.edu/data/experiments/low_dim/pusht/diffusion_policy_cnn/train_0/checkpoints/epoch=0550-test_mean_score=0.969.ckpt -O data/checkpoint.ckpt
```

Run evaluation:

```bash
python eval.py --checkpoint data/checkpoint.ckpt --output_dir data/pusht_eval_output --device cuda:0
```

Our framework has been validated on a UR5 robot with RealSense cameras and SpaceMouse teleoperation.
Please refer to demo_real_robot.py and eval_real_robot.py for data collection, training, and evaluation following the same structure as Diffusion Policy.
The codebase follows Diffusion Policy's modular design:
- Tasks: dataset wrappers, environments, configs.
- Policies: inference + training.
- Workspaces: manage experiment lifecycle.
`train.py` is the entry point.
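As a rough sketch of this layering (all class and method names below are hypothetical placeholders, not the repository's actual API): the workspace owns the experiment lifecycle, the policy owns inference and training, and the task owns data and environment configuration.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    """Dataset wrapper, environment, and config for one benchmark."""
    name: str
    dataset_path: str

class Policy:
    """Owns inference (and, in the real code, training) logic."""
    def predict_action(self, obs):
        # placeholder inference: echo the observation back as an action
        return {"action": obs}

@dataclass
class Workspace:
    """Manages the experiment lifecycle, wiring task and policy together."""
    task: Task
    policy: Policy = field(default_factory=Policy)

    def run(self):
        obs = {"agent_pos": [0.0, 0.0]}  # stand-in observation
        return self.policy.predict_action(obs)
```

This separation lets a new benchmark be added by writing a task, and a new method by writing a policy, without touching the experiment loop.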
If you use this work, please cite:
```bibtex
@inproceedings{wang2025dpag,
  title={Act to See, See to Act: Diffusion-Driven Perception-Action Interplay for Adaptive Policies},
  author={Wang, Jing and Peng, Weiting and Tang, Jing and Gong, Zeyu and Wang, Xihua and Tao, Bo and Cheng, Li},
  booktitle={Advances in Neural Information Processing Systems},
  year={2025}
}
```

We build upon the foundational work of Diffusion Policy: Visuomotor Policy Learning via Action Diffusion (Chi et al.).
Their open-source code, benchmarks, and datasets enabled our development of DP-AG.
We especially thank the authors for releasing simulation environments, vision and state-based notebooks, and experiment data.
