The baseline implementation is based on softlearning.
Experiment results are here: link
We recommend making a new virtual environment to install the dependencies.
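For example, using Python's built-in venv module (the exact Python version is not pinned here; match whatever the dependencies in requirements.txt expect):

```
python3 -m venv rlv-env
source rlv-env/bin/activate
```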
```
git clone https://github.com/kschmeckpeper/rl_with_videos.git
cd rl_with_videos
pip install -r requirements.txt
python setup.py develop
```
Below, we provide the commands to replicate the experiments from the paper.
For the Acrobot experiments, we wrap the Acrobot environment in the AcrobotContinuous environment, which takes a continuous action and discretizes it before passing it to the original Acrobot environment.
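As an illustration of the idea (this is a sketch, not the exact wrapper in this repository; the class name, action bounds, and binning scheme are assumptions):

```python
import gym
import numpy as np

class ContinuousAcrobotSketch(gym.ActionWrapper):
    """Sketch of a continuous-action wrapper around Acrobot.

    Acrobot-v1 expects a discrete action in {0, 1, 2} (torque -1, 0, +1);
    here a continuous value in [-1, 1] is mapped to one of those bins.
    """

    def __init__(self, env):
        super().__init__(env)
        self.num_actions = env.action_space.n  # 3 for Acrobot-v1
        self.action_space = gym.spaces.Box(
            low=-1.0, high=1.0, shape=(1,), dtype=np.float32)

    def action(self, act):
        # Rescale [-1, 1] -> [0, 1], then pick the corresponding discrete bin.
        a = float(np.clip(np.asarray(act).flatten()[0], -1.0, 1.0))
        scaled = (a + 1.0) / 2.0
        return min(self.num_actions - 1, int(scaled * self.num_actions))

# Usage: env = ContinuousAcrobotSketch(gym.make("Acrobot-v1"))
```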
To run the SAC baseline, run the following commands.
```
cd examples/run_rl
python3 -u main.py --task=AcrobotContinuous-v1 --algorithm SAC --exp-name EXP_NAME --gpus=1 --trial-gpus=1
```
To run RLV, first download a replay pool containing the desired observations (see the table below). You may also use a replay pool generated during the training of SAC. A short sketch for inspecting a pool file is shown after the table.
| Avg. Reward | Link |
|---|---|
| -99 | here |
| -79 | here |
| -63 | here |
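The exact on-disk format depends on how the pool was saved; assuming a (possibly gzip-compressed) pickle of numpy arrays, as in softlearning-style pools, a quick sanity check might look like this (the path and the printed keys are placeholders, not guaranteed contents):

```python
import gzip
import pickle

path = "/PATH/TO/REPLAY/POOL"  # e.g. the .pkl file downloaded above

# Try a plain pickle first, then fall back to gzip, since pools are
# sometimes saved gzip-compressed.
try:
    with open(path, "rb") as f:
        pool = pickle.load(f)
except Exception:
    with gzip.open(path, "rb") as f:
        pool = pickle.load(f)

# Print the top-level structure to confirm what the pool actually contains.
if isinstance(pool, dict):
    for key, value in pool.items():
        shape = getattr(value, "shape", None)
        print(key, shape if shape is not None else type(value))
else:
    print(type(pool))
```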
Then, run the following commands.
```
cd examples/run_rl
python -u main.py --task=AcrobotContinuous-v1 --algorithm RLV --exp-name EXP_NAME --replace_rewards_bottom=-1.0 --replace_rewards_scale=10.0 --gpus=1 --trial-gpus=1 --replay_pool_load_path /PATH/TO/REPLAY/POOL
```
To run RLV with video understanding (RLVU), which also takes a video dataset through --video_data_path, run:
```
python -u main.py --task=AcrobotContinuous-v1 --algorithm RLVU --exp-name EXP_NAME --replace_rewards_bottom=-1.0 --replace_rewards_scale=10.0 --gpus=1 --trial-gpus=1 --replay_pool_load_path /PATH/TO/REPLAY/POOL --video_data_path /PATH/TO/VIDEO/DATA
```
To run the SAC baseline on the Sawyer pushing task, run the following command.
```
cd examples/run_rl
python3 -u main.py --task=Image48HumanLikeSawyerPushForwardEnv-v0 --domain mujoco --algorithm SAC --exp-name EXP_NAME --gpus=1 --trial-gpus=1
```
To run RLV, first download the human observations from here and the human paired data from here.
Run the following commands:
```
cd examples/run_rl
python3 -u main.py --task=Image48HumanLikeSawyerPushForwardEnv-v0 --domain mujoco --algorithm RLV --exp-name EXP_NAME --gpus=1 --trial-gpus=1 --replay_pool_load_path /PATH/TO/REPLAY/POOL --paired_data_path /PATH/TO/PAIRED/DATA --paired_loss_scale 1e-06 --replace_rewards_scale=10.0 --replace_rewards_bottom=0.0 --domain_shift --domain_shift_generator_weight 0.001 --domain_shift_discriminator_weight 1e-08
```
To run RLV with video understanding (RLVU), run the following command.
```
cd examples/run_rl
python -u main.py --task=Image48HumanLikeSawyerPushForwardEnv-v0 --domain mujoco --algorithm RLVU --exp-name EXP_NAME --gpus=2 --trial-gpus=1 --replay_pool_load_path /PATH/TO/REPLAY/POOL --paired_data_path /PATH/TO/PAIRED/DATA --paired_loss_scale 1e-06 --replace_rewards_scale=10.0 --replace_rewards_bottom=0.0 --domain_shift --domain_shift_generator_weight 0.001 --domain_shift_discriminator_weight 1e-08
```
To run the SAC baseline on the drawer opening task, run the following command.
```
cd examples/run_rl
python3 -u main.py --task=Image48MetaworldDrawerOpenSparse2D-v0 --domain Metaworld --algorithm SAC --exp-name EXP_NAME --gpus=1 --trial-gpus=1
```
Download the human observations from here and the human paired data from here.
Run the following commands:
```
cd examples/run_rl
python3 -u main.py --task=Image48MetaworldDrawerOpenSparse2D-v0 --domain Metaworld --algorithm RLV --exp-name EXP_NAME --gpus=1 --trial-gpus=1 --replay_pool_load_path /PATH/TO/REPLAY/POOL --paired_data_path /PATH/TO/PAIRED/DATA --paired_loss_scale 1e-08 --replace_rewards_scale=10.0 --replace_rewards_bottom=0.0 --domain_shift --domain_shift_generator_weight 0.001 --domain_shift_discriminator_weight 1e-08
```
If this codebase helps you in your academic research, you are encouraged to cite our paper. Here is an example BibTeX entry:
```
@article{schmeckpeper2020rlv,
  title={Reinforcement Learning with Videos: Combining Offline Observations with Interaction},
  author={Schmeckpeper, Karl and Rybkin, Oleh and Daniilidis, Kostas and Levine, Sergey and Finn, Chelsea},
  journal={Conference on Robot Learning},
  year={2020}
}
```