Lyy-iiis/pMF

Pixel Mean Flows

arXiv  License: MIT  Hugging Face 

This is the official JAX implementation for the paper One-step Latent-free Image Generation with Pixel Mean Flows. This code is written and tested on TPUs.

For the HSDP implementation, please refer to this branch, where we provide HSDP training and inference code for pMF-H models. For the PyTorch implementation, please refer to this branch.

Initialization

Run install.sh to install the dependencies (JAX+TPUs). Log in to WandB to track your experiments if needed.

bash scripts/install.sh
wandb login YOUR_WANDB_API_KEY

Inference

You can quickly verify your setup with our provided checkpoint.

ImageNet 256x256                      pMF-B/16       pMF-L/16       pMF-H/16
pre-trained checkpoint (inference)    download       download       download
pre-trained checkpoint (full)         download       download       download
FID (this repo / original paper)      3.11 / 3.12    2.50 / 2.52    2.11 / 2.22
IS (this repo / original paper)       256.4 / 254.6  266.0 / 262.6  270.5 / 268.8

ImageNet 512x512                      pMF-B/32       pMF-L/32       pMF-H/32
pre-trained checkpoint (inference)    download       download       download
pre-trained checkpoint (full)         download       download       download
FID (this repo / original paper)      3.64 / 3.70    2.73 / 2.75    2.37 / 2.48
IS (this repo / original paper)       274.4 / 271.9  276.6 / 276.8  285.3 / 284.9

Note that slight differences in FID/IS may arise from different computation setups. Our results were computed on a TPU v5p-64.
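For context, FID is conventionally defined as the Fréchet distance between two Gaussians fitted to image features: one from generated samples (mean and covariance written here as \mu_g, \Sigma_g) and one from reference images (\mu_r, \Sigma_r). This is the standard textbook definition, not something read from this repo's code, and the symbol names are chosen for illustration.

```latex
\mathrm{FID} = \lVert \mu_g - \mu_r \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_g + \Sigma_r - 2\,(\Sigma_g \Sigma_r)^{1/2} \right)
```

The matrix square root and the feature extraction are both sensitive to numerical precision and to the linear-algebra backend, which is one plausible reason for small differences across hardware and frameworks.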

Sanity Check

  1. Download the checkpoint and FID stats:

    • Download the pre-trained checkpoint from the table above.
    • Download the FID stats file from here and here. Our FID stats are computed with JAX on TPU and may differ slightly from those computed with PyTorch on GPU. You can also compute the FID stats yourself with prepare_ref.py if needed.
  2. Unzip the checkpoint:

    unzip <downloaded_checkpoint.zip> -d <your_ckpt_dir>

    Replace <downloaded_checkpoint.zip> and <your_ckpt_dir> with your actual paths.

  3. Set up the config:

    • Set load_from in configs/eval_config.yml to the path of <your_ckpt_dir>.
    • Set fid.cache_ref to the path of the downloaded FID stats file.
    • Set the parameters for the corresponding model, e.g., model.model_str and sampling.
  4. Launch evaluation:

    bash scripts/eval.sh JOB_NAME

    Our default evaluation script generates 50,000 samples with the pre-trained pMF-B/16 for FID and IS evaluation. The expected FID and IS for this checkpoint are 3.11 and 256.4 (compared to 3.12 and 254.6 reported in the original paper).
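Before launching a 50,000-sample evaluation, it can help to confirm that the two paths from step 3 actually exist. The following is a minimal sketch using only the Python standard library; check_eval_paths is a hypothetical helper, not part of this repo, and it only inspects the filesystem rather than reading configs/eval_config.yml.

```python
from pathlib import Path


def check_eval_paths(load_from, cache_ref):
    """Return a list of problems with the two paths set in the eval config.

    Hypothetical helper: `load_from` is the unzipped checkpoint directory,
    `cache_ref` is the downloaded FID stats file.
    """
    problems = []
    ckpt_dir = Path(load_from)
    if not ckpt_dir.is_dir():
        problems.append(f"load_from is not a directory: {ckpt_dir}")
    elif not any(ckpt_dir.iterdir()):
        problems.append(f"load_from is empty (was the zip extracted?): {ckpt_dir}")
    stats = Path(cache_ref)
    if not stats.is_file():
        problems.append(f"fid.cache_ref is not a file: {stats}")
    return problems


# Replace the placeholders with the values you put in configs/eval_config.yml.
for problem in check_eval_paths("<your_ckpt_dir>", "<your_fid_stats_file>"):
    print(problem)
```

An empty result means both paths look plausible; it does not verify that the checkpoint matches model.model_str.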

Setup

Data Preparation

Before training, you need to download the ImageNet dataset and extract it to your desired location. The dataset should have the following structure:

imagenet/
├── train/
│   ├── n01440764/
│   ├── n01443537/
│   └── ...
└── val/
    ├── n01440764/
    ├── n01443537/
    └── ...
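A quick way to confirm the layout above before training is a small standard-library check. This is a sketch under assumptions: check_imagenet_layout is a hypothetical helper, and the rule that train/ and val/ hold the same class folders is inferred from the tree shown above rather than from the repo's data loader.

```python
from pathlib import Path


def check_imagenet_layout(root):
    """Return a list of problems found under `root`.

    Assumes train/ and val/ each contain one sub-directory per class
    (WordNet IDs such as n01440764), as in the tree above.
    """
    problems = []
    root = Path(root)
    class_sets = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing split directory: {split_dir}")
            continue
        classes = {p.name for p in split_dir.iterdir() if p.is_dir()}
        if not classes:
            problems.append(f"no class folders in {split_dir}")
        class_sets[split] = classes
    # Assumption: both splits should expose the same set of class folders.
    if len(class_sets) == 2 and class_sets["train"] != class_sets["val"]:
        problems.append("train/ and val/ contain different class folders")
    return problems


# Point this at the directory you later use as the dataset root.
for problem in check_imagenet_layout("imagenet"):
    print(problem)
```

The check looks only at directory names, so it runs quickly even on a full ImageNet copy; it does not validate the image files themselves.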

Configuration Setup

After data preparation, set your dataset root, FID cache reference, and WandB project in the config files:

1. Update Config Files

Edit your config file (e.g., configs/pMF_B_16_config.yml) and replace the placeholder values:

dataset:
    root: YOUR_DATA_ROOT  # Path to your dataset, only for training config

fid:
    cache_ref: YOUR_FID_CACHE_REF  # Path to your FID statistics file

logging:
    wandb_project: 'YOUR PROJECT'  # Your WandB project name

2. Available Config Files

  • configs/pMF_B_16_config.yml - Configuration for pMF-B/16 model training (recommended)
  • configs/pMF_B_32_config.yml - Configuration for pMF-B/32 model training
  • configs/pMF_L_16_config.yml - Configuration for pMF-L/16 model training
  • configs/pMF_L_32_config.yml - Configuration for pMF-L/32 model training
  • configs/default.py - Default configuration (Python format, used as base)

Configuration Hierarchy: The system uses a hierarchical approach where pMF_B_16_config.yml and eval_config.yml override specific parameters from default.py. This allows you to customize only the parameters you need while keeping sensible defaults. Make sure to update both the dataset root path and the FID cache reference path according to your data preparation output.

Training

Run the following command to launch training:

bash scripts/launch.sh JOB_NAME

Note: Update the environment variables in scripts/train.sh before running:

  • DATA_ROOT: Path to your prepared data directory
  • LOG_DIR: Directory where training logs are saved

Config System

The training system uses two config files:

  • configs/default.py - Base configuration with all default hyperparameters
  • configs/pMF_B_16_config.yml - Model-specific overrides for pMF-B/16 training

The system merges these files, allowing you to customize only the parameters you need.
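To illustrate what such a merge typically looks like, here is a minimal sketch of a recursive override on plain dictionaries. It is not the repo's actual loader: merge_config is a hypothetical function, and the default values (including the batch_size key) are invented for the example.

```python
import copy


def merge_config(defaults, overrides):
    """Recursively overlay `overrides` on `defaults` and return a new dict."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge key by key instead of replacing.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical values for illustration only; not the repo's real defaults.
defaults = {
    "training": {"num_epochs": 100, "batch_size": 256},
    "model": {"model_str": "pmfDiT_B_16", "noise_scale": 2.0},
}
overrides = {"training": {"num_epochs": 80}, "model": {"noise_scale": 1.0}}

config = merge_config(defaults, overrides)
print(config["training"])  # {'num_epochs': 80, 'batch_size': 256}
print(config["model"])     # {'model_str': 'pmfDiT_B_16', 'noise_scale': 1.0}
```

Keys absent from the override file keep their default values, which is why a model-specific YAML file can stay short.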

Customizing Training

To create a custom experiment:

  1. Create a new config file (e.g., configs/my_exp_config.yml)
  2. Update the launch script to use your config:
    # In launch.sh, change the config line to:
    --config=configs/load_config.py:my_exp

Example custom config:

training:
    num_epochs: 80                  # Train for fewer epochs

model:
    model_str: pmfDiT_B_16               # Use pMF-B/16 model
    noise_scale: 1.0                 # Set noise scale

For more details on configuration options, refer to configs/default.py and configs/pMF_B_16_config.yml.
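The naming in the steps above suggests that the part after the colon in --config selects a YAML file named <name>_config.yml under configs/. The sketch below spells out that assumed convention; config_file_for and split_config_flag are hypothetical helpers, and the real resolution logic lives in configs/load_config.py, which may differ.

```python
from pathlib import Path

CONFIG_DIR = Path("configs")


def split_config_flag(flag_value):
    """Split 'configs/load_config.py:my_exp' into (loader script, experiment name)."""
    script, sep, name = flag_value.partition(":")
    if not sep or not name:
        raise ValueError(f"expected '<script>:<name>', got {flag_value!r}")
    return script, name


def config_file_for(name):
    """Map an experiment name to its YAML file (assumed naming convention)."""
    if "/" in name or name.endswith(".yml"):
        raise ValueError(f"expected a bare experiment name, got {name!r}")
    return CONFIG_DIR / f"{name}_config.yml"


script, name = split_config_flag("configs/load_config.py:my_exp")
print(script)                           # configs/load_config.py
print(config_file_for(name).as_posix())  # configs/my_exp_config.yml
```

Under this reading, the experiment name for the provided pMF-B/16 config would be pMF_B_16.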

License

This repo is under the MIT license. See LICENSE for details.

Citation

If you find this work useful in your research, please consider citing our paper :)

@article{pixelmeanflows,
  title={One-step Latent-free Image Generation with Pixel Mean Flows},
  author={Lu, Yiyang and Lu, Susie and Sun, Qiao and Zhao, Hanhong and Jiang, Zhicheng and Wang, Xianbang and Li, Tianhong and Geng, Zhengyang and He, Kaiming},
  journal={arXiv preprint arXiv:2601.22158},
  year={2026}
}

Contributors

This repository is a collaborative effort by Kaiming He, Hanhong Zhao, Qiao Sun and Yiyang Lu, developed in support of several research projects, including MeanFlow, improved MeanFlow, and BiFlow.

Acknowledgement

We gratefully acknowledge the Google TPU Research Cloud (TRC) for granting TPU access. We hope this work will serve as a useful resource for the open-source community.
