
[ICLR'25] HERO: Human-Feedback-Efficient Reinforcement Learning for Online Diffusion Model Finetuning

This repository contains the official PyTorch implementation of the paper "HERO: Human-Feedback-Efficient Reinforcement Learning for Online Diffusion Model Finetuning", presented at ICLR 2025.

TL;DR: HERO efficiently finetunes text-to-image diffusion models for a variety of tasks using minimal online human feedback (fewer than 1K labels).

Setup

  1. Clone the repository

    git clone <your-repo-url>
    cd HERO
  2. Install dependencies

    pip install -e .
    cd rl4dgm
    pip install -e .
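After installation, a quick sanity check can confirm the environment is usable. The Python version floor below is a guess (the README does not state one), and the second line assumes the dependencies above installed PyTorch:

```shell
# Check the Python version (HERO's exact requirement is unstated; 3.8+ is an assumption)
python -c "import sys; print(sys.version)"
# Verify PyTorch imports and whether CUDA is visible (assumes the setup above installed PyTorch)
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```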

Training

The main training code is implemented in train_hero.py.

To start training, use the following command:

accelerate launch --num_processes 1 --dynamo_backend no --gpu_ids 1 train_hero.py
  • --num_processes 1: Run a single process (single GPU).
  • --dynamo_backend no: Disable the torch dynamo backend.
  • --gpu_ids 1: Use GPU 1 (change as needed).
  • train_hero.py: The main training script (make sure this file exists and is configured).

You may need to adjust the arguments or configuration files according to your experiment setup.
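As a sketch, the same script can be launched across multiple GPUs by raising the process count and widening the GPU list; the values below assume two visible GPUs and are not a configuration tested by the authors:

```shell
# Hypothetical multi-GPU launch: two processes, one per GPU on devices 0 and 1
accelerate launch --num_processes 2 --gpu_ids 0,1 --dynamo_backend no train_hero.py
```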

Configuration

Training and model parameters are managed via hydra config files; see HERO/config/hydra_configs for details.
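Hydra configs can generally be overridden on the command line with dotted key=value syntax, so individual parameters can be changed without editing the config files. The key names below are illustrative placeholders and are not guaranteed to match this repository's config schema:

```shell
# Hypothetical hydra overrides appended to the launch command
# (replace the dotted keys with real ones from HERO/config/hydra_configs)
accelerate launch --num_processes 1 --dynamo_backend no --gpu_ids 1 train_hero.py \
  train.learning_rate=1e-5 \
  train.batch_size=4
```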

Logging

  • Training progress and metrics are logged to Weights & Biases.
  • Images generated during training are saved to HERO/real_human_ui_images
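Since metrics are sent to Weights & Biases, an authenticated session is needed before training; if you have not logged in on the machine before, the standard wandb CLI command is:

```shell
# One-time authentication with Weights & Biases (prompts for an API key)
wandb login
```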

References


For more details, please refer to the code and comments in ddpo_trainer.py.

Contacts
