# Real-world RL of Active Perception Behaviors

**NeurIPS 2025 & ARLET Workshop**

Edward S. Hu*, Jie Wang*, Xingfang Yuan*, Fiona Luo, Muyao Li, Gaspard Lambrechts, Oleh Rybkin, Dinesh Jayaraman

This is the official implementation of the Asymmetric Advantage Weighted Regression (AAWR) RL algorithm. AAWR enables efficient offline and online RL in the real world for learning active perception policies.

*(Figure: AAWR method overview)*

The key idea is to use privileged sensors at training time to learn high-quality value functions, which provide advantage weights for a weighted BC loss. This yields better policy extraction in POMDPs than BC and unprivileged AWR. With just 50 demos, AAWR policies outperform BC and AWR in search quality, speed, and success rate on real-world active perception tasks.
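As a minimal sketch of this idea (function names and signatures are ours, not the repo's — see `src/algorithm/asym_awr.py` for the actual implementation), the privileged critic turns into BC weights roughly as follows, assuming an exponential or indicator advantage filter in the style of AWR/IQL policy extraction:

```python
import torch

def aawr_weights(q_values, v_values, beta=3.0, filter_type="exp"):
    """Advantage weights from a *privileged* critic (hypothetical helper).

    q_values, v_values: critic outputs computed from privileged observations;
    the policy itself only ever sees the unprivileged observations.
    """
    adv = q_values - v_values
    if filter_type == "indicator":
        # Binary filter: imitate only actions with positive advantage.
        return (adv > 0).float()
    # Exponential weighting, clamped for numerical stability.
    return torch.clamp(torch.exp(beta * adv), max=100.0)

def weighted_bc_loss(log_probs, weights):
    # Weighted behavior cloning: -E[ w(s, a) * log pi(a | o) ].
    # Weights are detached so gradients only flow through the policy.
    return -(weights.detach() * log_probs).mean()
```

The asymmetry is that the critic consumes privileged state while the policy is trained purely by (weighted) imitation on its own observations, so nothing privileged is needed at deployment.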

## Repo Structure

The algorithm lives in `src/algorithm/asym_awr.py`, where we use IQL to train privileged Q and V networks and perform AWR policy extraction. Offline and online RL training is in `src/train_il.py`.

## Setup

Here, we show you how to get AAWR up and running for a toy simulated task.

1. Install Python dependencies. You can refer to the `environment.yaml` file or install them manually; there aren't many, mainly `torch`, `gymnasium`, and related packages.

2. Install the simulated xArm environment code and generate demos:

   ```bash
   git clone [email protected]:edwhu/gym-xarm.git
   cd gym-xarm
   pip install -e .
   python generate_demos.py  # generate demo data

   mkdir gymnasium_xarm_lift_bc
   mv buffer.pkl gymnasium_xarm_lift_bc  # move the demo buffer into the dataset directory
   ```
3. Train AAWR on a simple simulated lifting task:

   ```bash
   python src/train_il.py agent=asym_awr modality=all task=gymnasium_xarm_lift \
       dataset_dir=data/gymnasium_xarm_lift_bc seed=41 use_wandb=false \
       exp_name=aawr_xarm_lift horizon=1 offline_steps=20000 train_steps=40000 \
       lr=1e-4 grad_clip_norm=10 awr_filter=indicator expectile=0.9 A_scaling=3 \
       save_video=true switch_awr_filter=true
   ```
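Our reading of the key flags, inferred from their names and from the algorithm description above (check `src/train_il.py` for the authoritative meanings):

```
offline_steps: 20000   # offline RL phase on the demo buffer
train_steps: 40000     # total training steps; the remainder is online fine-tuning
expectile: 0.9         # IQL expectile for the privileged V network
A_scaling: 3           # scale/temperature applied to the advantage in the AWR weight
awr_filter: indicator  # binary advantage filter instead of exponential weights
```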

## References and Acknowledgements

If you find our project helpful, please cite us:

```bibtex
@inproceedings{
    hu2025realworld,
    title={Real-World Reinforcement Learning of Active Perception Behaviors},
    author={Edward S. Hu and Jie Wang and Xingfang Yuan and Fiona Luo and Muyao Li and Gaspard Lambrechts and Oleh Rybkin and Dinesh Jayaraman},
    booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
    year={2025},
    url={https://openreview.net/forum?id=RkdTtznSAL}
}
```
