Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning

Silviu Pitis*, Harris Chan*, Stephen Zhao, Bradly Stadie, Jimmy Ba

This is the code for replicating our ICML 2020 paper. See also the ALA 2020 presentation (25 minutes, best paper).

See the mrl readme for general instructions.

The launch commands used for the main experiments are listed in commands_for_experiments.txt; they call the train_mega.py launch script.

To run a MEGA agent, use --ag_curiosity minkde. To use OMEGA, also pass --transition_to_dg (see the example invocations below). MEGA itself is implemented as DensityAchievedGoalCuriosity in mrl.modules.curiosity, which assumes the agent has a density module from mrl.modules.density (KDE / Flow / RND).
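For concreteness, a representative launch might look like the sketch below. The script path and the --env flag and value are illustrative assumptions; refer to commands_for_experiments.txt for the exact arguments used in the paper.

```bash
# MEGA: select behavioral goals from low-density (rarely achieved) regions of achieved-goal space
python experiments/mega/train_mega.py --env FetchPush-v1 --ag_curiosity minkde

# OMEGA: additionally transition toward the desired goal distribution over the course of training
python experiments/mega/train_mega.py --env FetchPush-v1 --ag_curiosity minkde --transition_to_dg
```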

While the paper experiments use the protoge_config from mrl.configs.continuous_off_policy, note that the best_slide_config hyperparameters with --optimize_every 10 --her rfaab_1_5_2_1_1 are much more stable for Stack (and likely for FPP and similar environments); see the sketch below.
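A sketch of a launch with those more stable settings, assuming the same train_mega.py entry point and an illustrative --env value (check commands_for_experiments.txt and the script's arguments for how best_slide_config is selected):

```bash
# Stack with the more stable hyperparameter overrides
python experiments/mega/train_mega.py --env FetchStack2-v1 --ag_curiosity minkde \
  --optimize_every 10 --her rfaab_1_5_2_1_1
```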

BibTeX

@inproceedings{pitis2020mega,
  title={Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning},
  author={Pitis, Silviu and Chan, Harris and Zhao, Stephen and Stadie, Bradly and Ba, Jimmy},
  booktitle={Proceedings of the Thirty-seventh International Conference on Machine Learning},
  year={2020}
}