Online Optimization for Offline Safe Reinforcement Learning

This repository contains the implementation of O3SRL.

Installation

To train an O3SRL agent, run the following command:

python train_o3srl.py --task <env_name>

The default cost limit is 5 for BulletGym tasks and 10 for SafetyGym tasks. To train with a different limit, pass it explicitly with the --cost_limit parameter.
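The fallback rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the code in train_o3srl.py: the helper name `resolve_cost_limit`, the `suite` argument, and the dictionary keys are all hypothetical, and the real script may select the default in a different way (for example from the task name or from o3srl_configs.py).

```python
from typing import Optional

# Defaults as stated in this README. The keys are hypothetical labels
# chosen for this sketch, not identifiers taken from the repository.
DEFAULT_COST_LIMITS = {"BulletGym": 5.0, "SafetyGym": 10.0}


def resolve_cost_limit(suite: str, cost_limit: Optional[float] = None) -> float:
    """Return the explicit cost limit if one is given, else the suite default."""
    if cost_limit is not None:
        if cost_limit < 0:
            raise ValueError("cost_limit must be non-negative")
        return float(cost_limit)
    try:
        return DEFAULT_COST_LIMITS[suite]
    except KeyError:
        # Unknown suite and no explicit limit: there is nothing to fall back to.
        raise ValueError(f"unknown suite: {suite!r}") from None
```

With this sketch, `resolve_cost_limit("BulletGym")` falls back to the BulletGym default, while `resolve_cost_limit("SafetyGym", 20)` uses the explicit value, mirroring what passing --cost_limit does on the command line.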

To evaluate a trained agent, use the following command:

python eval_o3srl.py --path <path_to_model> --cost_limit 5 --eval_episodes 20
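To make the role of --cost_limit and --eval_episodes concrete, here is a minimal sketch of how per-episode results might be aggregated and compared against the limit. It assumes the common offline safe RL convention of averaging episode cost and dividing it by the cost limit. The function `summarize_eval` and its output keys are hypothetical and are not taken from eval_o3srl.py, which may report different metrics.

```python
from statistics import mean
from typing import Dict, Sequence


def summarize_eval(
    returns: Sequence[float], costs: Sequence[float], cost_limit: float
) -> Dict[str, object]:
    """Aggregate per-episode returns and costs against a cost limit."""
    if not returns or len(returns) != len(costs):
        raise ValueError("returns and costs must be non-empty and equally long")
    if cost_limit <= 0:
        raise ValueError("cost_limit must be positive to normalize by it")
    avg_cost = mean(costs)
    return {
        "mean_return": mean(returns),
        "mean_cost": avg_cost,
        # Assumed convention: a normalized cost of at most 1 counts as safe.
        "normalized_cost": avg_cost / cost_limit,
        "within_limit": avg_cost <= cost_limit,
    }
```

For the command above, the inputs would be the 20 episode returns and costs collected during evaluation, with `cost_limit=5`.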

Our implementation of O3SRL follows the OSRL repository design. We thank the authors for their well-structured codebase.
