# Improved Mean Flows

This is the official JAX implementation of the paper *Improved Mean Flows: On the Challenges of Fastforward Generative Models*. The code is written and tested on TPUs. For a PyTorch implementation, please refer to this branch.


## Installation

Run `install.sh` to install the dependencies (JAX on TPUs). Log in to WandB to track your experiments if needed.

```shell
bash scripts/install.sh
wandb login YOUR_WANDB_API_KEY
```

## Inference

You can quickly verify your setup with our provided checkpoint.

|  | iMF-B/2 | iMF-M/2 | iMF-L/2 | iMF-XL/2 | iMF-XL/2 |
| --- | --- | --- | --- | --- | --- |
| pre-trained checkpoint (inference) | download | download | download | download | |
| pre-trained checkpoint (full) | download | download | download | download | |
| NFE | 1 | 1 | 1 | 1 | 2 |
| FID (this repo / original paper) | 3.37 / 3.39 | 2.27 / 2.27 | 1.85 / 1.86 | 1.70 / 1.72 | 1.53 / 1.54 |
| IS (this repo / original paper) | 256.0 / 255.3 | 260.9 / 257.7 | 278.6 / 276.6 | 282.0 / 282.0 | 292.0 / - |

Note that slight differences in FID/IS may arise due to different computation setups.

## Sanity Check

1. Download the checkpoint and FID stats:

   - Download the pre-trained checkpoint (inference) from the table above.
   - Download the FID stats file from here. Our FID stats are computed with JAX on TPUs and may differ slightly from stats computed with PyTorch on GPUs.

2. Unzip the checkpoint:

   ```shell
   unzip <downloaded_checkpoint.zip> -d <your_ckpt_dir>
   ```

   Replace `<downloaded_checkpoint.zip>` and `<your_ckpt_dir>` with your actual paths.

3. Set up the config:

   - Set `load_from` in `configs/eval_config.yml` to the path of `<your_ckpt_dir>`.
   - Set `fid.cache_ref` to the path of the downloaded FID stats file.
   - Set the CFG-related parameters for the corresponding model.

4. Launch evaluation:

   ```shell
   bash scripts/eval.sh JOB_NAME
   ```

   Our default evaluation script generates 50,000 samples with the pre-trained iMF-B/2 checkpoint for FID and IS evaluation. The expected FID and IS for this checkpoint are 3.37 and 256.0, compared to 3.39 and 255.3 reported in the original paper.
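For reference, FID measures the Fréchet distance between Gaussian fits of feature statistics for generated and reference samples. A minimal, illustrative sketch of the formula is below; it uses diagonal covariances for simplicity, and the function name and setup are hypothetical, not taken from this repo (real FID uses full covariance matrices of Inception features):

```python
import numpy as np

# The FID between two Gaussians N(mu1, S1) and N(mu2, S2) is
#   ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2}).
# With diagonal covariances the matrix square root is elementwise,
# which keeps this sketch short and dependency-free.
def fid_diagonal(mu1, var1, mu2, var2):
    mean_term = np.sum((mu1 - mu2) ** 2)
    cov_term = np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2))
    return mean_term + cov_term

mu, var = np.zeros(4), np.ones(4)
print(fid_diagonal(mu, var, mu, var))        # 0.0 for identical Gaussians
print(fid_diagonal(mu, var, mu + 1.0, var))  # 4.0
```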

## Data Preparation

Before training, you need to prepare the ImageNet dataset and compute latent representations:

### 1. Download ImageNet

Download the ImageNet dataset and extract it to your desired location. The dataset should have the following structure:

```
imagenet/
├── train/
│   ├── n01440764/
│   ├── n01443537/
│   └── ...
└── val/
    ├── n01440764/
    ├── n01443537/
    └── ...
```

### 2. Configure Data Paths

Update the data paths in scripts/prepare_data.sh:

```shell
IMAGENET_ROOT="YOUR_IMGNET_ROOT"
OUTPUT_DIR="YOUR_OUTPUT_DIR"
LOG_DIR="YOUR_LOG_DIR"
```

### 3. Launch Data Preparation

Run the data preparation script to compute latent representations:

```shell
IMAGE_SIZE=256 COMPUTE_LATENT=True bash ./scripts/prepare_data.sh
```

The script will:

- Encode ImageNet images to latent representations using a VAE model
- Save the latent dataset to `OUTPUT_DIR/`
- Compute FID statistics and save them to `OUTPUT_DIR/imagenet_256_fid_stats.npz`
- Log progress to `LOG_DIR/$USER/`
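As a quick way to inspect the resulting stats file: FID reference statistics are conventionally stored as an `.npz` holding the mean and covariance of Inception features. The key names below (`mu`, `sigma`) are an assumption for illustration, not taken from this repo; check the actual file's keys with `stats.files`:

```python
import io
import numpy as np

# Build a stand-in stats archive in memory so the example is self-contained;
# in practice you would pass the path to imagenet_256_fid_stats.npz instead.
buf = io.BytesIO()
np.savez(buf, mu=np.zeros(2048), sigma=np.eye(2048))
buf.seek(0)

stats = np.load(buf)
print(sorted(stats.files))                      # ['mu', 'sigma']
print(stats["mu"].shape, stats["sigma"].shape)  # (2048,) (2048, 2048)
```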

## Configuration Setup

After data preparation, you need to configure your FID cache reference in the config files:

### 1. Update Config Files

Edit your config files (e.g., `configs/train_config.yml` and `configs/eval_config.yml`) and replace the placeholder values:

```yaml
dataset:
    root: YOUR_DATA_ROOT  # Path to your prepared latent dataset, only for training config

fid:
    cache_ref: YOUR_FID_CACHE_REF  # Path to your FID statistics file
```

### 2. Available Config Files

- `configs/train_config.yml` - Configuration for iMF-B/2 model training (recommended)
- `configs/eval_config.yml` - Configuration for evaluation
- `configs/default.py` - Default configuration (Python format, used as the base)

**Configuration Hierarchy:** The system uses a hierarchical approach in which `train_config.yml` and `eval_config.yml` override specific parameters from `default.py`. This lets you customize only the parameters you need while keeping sensible defaults. Make sure to update both the dataset root path and the FID cache reference path according to your data preparation output.
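The hierarchical override described above can be sketched as a recursive dictionary merge. This is an illustrative assumption about how YAML overrides typically compose on top of defaults, not the repo's actual loader; all key names and values here are hypothetical:

```python
# Recursively apply override values on top of defaults (assumed behavior):
# nested dicts merge key by key, everything else is replaced outright.
def merge(default: dict, override: dict) -> dict:
    out = dict(default)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)
        else:
            out[key] = value
    return out

default = {"training": {"num_epochs": 240, "lr": 1e-4}, "fid": {"cache_ref": None}}
override = {"training": {"num_epochs": 80}, "fid": {"cache_ref": "/path/stats.npz"}}
cfg = merge(default, override)
print(cfg["training"])  # {'num_epochs': 80, 'lr': 0.0001} - untouched keys keep defaults
```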

## Training

Run the following commands to launch training:

```shell
bash scripts/launch.sh JOB_NAME
```

Note: Update the environment variables in `scripts/train.sh` before running:

- `DATA_ROOT`: Path to your prepared data directory
- `LOG_DIR`: Path where training logs are saved

### Config System

The training system uses two config files:

- `configs/default.py` - Base configuration with all default hyperparameters
- `configs/train_config.yml` - Model-specific overrides for iMF-B/2 training

The system merges these files, allowing you to customize only the parameters you need.

### Customizing Training

To create a custom experiment:

1. Create a new config file (e.g., `configs/my_exp_config.yml`)
2. Update the launch script to use your config:

   ```shell
   # In launch.sh, change the config line to:
   --config=configs/load_config.py:my_exp
   ```

Example custom config:

```yaml
training:
    num_epochs: 80            # Train for fewer epochs

method:
    model_str: imfDiT_B_2     # Use the iMF-B/2 model
    cfg_beta: 1.0             # Set the CFG distribution
```

For more details on configuration options, refer to `configs/default.py` and `configs/train_config.yml`.

## License

This repo is under the MIT license. See LICENSE for details.

## Citation

If you find this work useful in your research, please consider citing our paper :)

```bibtex
@article{imeanflow,
  title={Improved Mean Flows: On the Challenges of Fastforward Generative Models},
  author={Geng, Zhengyang and Lu, Yiyang and Wu, Zongze and Shechtman, Eli and Kolter, J Zico and He, Kaiming},
  journal={arXiv preprint arXiv:2512.02012},
  year={2025}
}
```

## Contributors

This repository is a collaborative effort by Kaiming He, Hanhong Zhao, Yiyang Lu, and Zhengyang Geng, developed in support of several research projects. We sincerely thank Qiao Sun, Zhicheng Jiang, and Xianbang Wang for their help in building the codebase and infrastructure.

## Acknowledgement

We gratefully acknowledge the Google TPU Research Cloud (TRC) for granting TPU access. We hope this work will serve as a useful resource for the open-source community.
