This is the official repository of MuDreamer.
Read the MuDreamer paper on arXiv
Clone GitHub repository and set up environment
git clone https://github.com/burchim/MuDreamer && cd MuDreamer
./install.sh
Download Kinetics400 videos
python3 download_videos.py
Download DAVIS videos (Distracting Control Suite)
wget https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-trainval-480p.zip && unzip DAVIS-2017-trainval-480p.zip
Run an experiment:
env_name=dmc-walker-run run_name=dmc python3 main.py
The training task is selected with the env_name=dmc-domain-task variable. Training logs, the replay buffer, and checkpoints are saved to callbacks/MuDreamer/run_name/env_name.
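The env_name pattern and the log directory layout above can be sketched as follows. This is an illustrative sketch only: `parse_env_name` and `log_dir` are hypothetical helpers, not functions from main.py.

```python
import os

def parse_env_name(env_name):
    # env_name follows the "suite-domain-task" pattern, e.g. "dmc-walker-run".
    suite, domain, task = env_name.split("-", 2)
    return suite, domain, task

def log_dir(run_name, env_name, root="callbacks/MuDreamer"):
    # Logs, the replay buffer, and checkpoints land under
    # callbacks/MuDreamer/<run_name>/<env_name>.
    return os.path.join(root, run_name, env_name)

suite, domain, task = parse_env_name("dmc-walker-run")
print(suite, domain, task)                # dmc walker run
print(log_dir("dmc", "dmc-walker-run"))   # callbacks/MuDreamer/dmc/dmc-walker-run
```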
Replace 'dmc' with 'dis' to experiment with the Distracting Control Suite:
env_name=dis-walker-run run_name=dis python3 main.py
Model config hyperparameters can be overridden with the override_config variable. The example below sets apply_random_background to false:
env_name=dmc-walker-run run_name=dmc_bg_off override_config='{"loss_reward_scale": 1.0, "train_env_params": {"apply_random_background": false}, "eval_env_params": {"apply_random_background": false}}' python3 main.py
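The override_config value is a JSON string whose nested keys override the matching config entries. A minimal sketch of how such a recursive merge could work (this is an assumption about the mechanism, not the repository's actual implementation):

```python
import json

def deep_update(base, override):
    # Recursively merge override keys into base, so a nested key like
    # train_env_params.apply_random_background can be changed without
    # replacing the whole train_env_params sub-dict.
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = value
    return base

# Hypothetical defaults for illustration only.
config = {
    "loss_reward_scale": 0.5,
    "train_env_params": {"apply_random_background": True, "img_size": 64},
}
override = json.loads(
    '{"loss_reward_scale": 1.0,'
    ' "train_env_params": {"apply_random_background": false}}'
)
deep_update(config, override)
# Untouched keys (e.g. img_size) survive the merge.
```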
Monitor training with TensorBoard:
tensorboard --logdir ./callbacks
Agents can be evaluated with '--mode evaluation'. The '--load_last' flag scans the log directory and loads the most recent checkpoint; '--checkpoint' loads a specific '.ckpt' checkpoint file instead.
env_name=dmc-walker-run run_name=dmc python3 main.py --load_last --mode evaluation
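The '--load_last' behavior, picking the newest checkpoint in the log directory, could be sketched like this. `find_last_checkpoint` is a hypothetical helper that assumes checkpoint filenames carry an epoch number (e.g. "epoch_10.ckpt"); the actual selection logic in the repository may differ.

```python
import re

def find_last_checkpoint(filenames):
    # Among .ckpt files, return the one with the highest number in its
    # name, treating that number as the epoch; None if no checkpoint exists.
    ckpts = [f for f in filenames if f.endswith(".ckpt")]

    def epoch_key(name):
        m = re.search(r"(\d+)", name)
        return int(m.group(1)) if m else -1

    return max(ckpts, key=epoch_key, default=None)
```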
# Args
-c / --config_file type=str default="configs/mu_dreamer.py" help="Python configuration file containing model hyperparameters"
-m / --mode type=str default="training" help="Mode: training, evaluation, pass"
-i / --checkpoint type=str default=None help="Load model from checkpoint name"
--cpu action="store_true" help="Load model on cpu"
--load_last action="store_true" help="Load last model checkpoint"
--wandb action="store_true" help="Initialize wandb logging"
--verbose_progress_bar type=int default=1 help="Verbose level of progress bar display"
# Training
--saving_period_epoch type=int default=1 help="Model saving every 'n' epochs"
--log_figure_period_step type=int default=None help="Log figure every 'n' steps"
--log_figure_period_epoch type=int default=1 help="Log figure every 'n' epochs"
--step_log_period type=int default=100 help="Training step log period"
--keep_last_k type=int default=3 help="Keep last k checkpoints"
# Eval
--eval_period_epoch type=int default=5 help="Model evaluation every 'n' epochs"
--eval_period_step type=int default=None help="Model evaluation every 'n' steps"
# Info
--show_dict action="store_true" help="Show model dict summary"
--show_modules action="store_true" help="Show model named modules"
# Debug
--detect_anomaly action="store_true" help="Enable autograd anomaly detection"
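The '--keep_last_k' option above (keep the last k checkpoints) implies that older checkpoints get pruned as training progresses. A hedged sketch of that pruning rule, with `checkpoints_to_delete` a hypothetical helper assuming filenames carry an epoch number:

```python
def checkpoints_to_delete(ckpts, keep_last_k=3):
    # Keep the k newest checkpoints (by the epoch number embedded in the
    # filename) and return the older ones, which would be deleted.
    def epoch(name):
        digits = "".join(ch for ch in name if ch.isdigit())
        return int(digits) if digits else -1

    ordered = sorted(ckpts, key=epoch)
    return ordered[:-keep_last_k] if len(ordered) > keep_last_k else []
```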
If you find this code or paper helpful in your research, please use the following citation:
@article{burchi2024mudreamer,
title={MuDreamer: Learning predictive world models without reconstruction},
author={Burchi, Maxime and Timofte, Radu},
journal={arXiv preprint arXiv:2405.15083},
year={2024}
}
Official DreamerV3 Implementation: https://github.com/danijar/dreamerv3
Official DreamerPro Implementation: https://github.com/fdeng18/dreamer-pro
Distracting Control Suite: https://github.com/google-research/google-research/tree/master/distracting_control
