This is a DFloat11 losslessly compressed version of the original OmniGen2 model. It reduces model size by 32% compared to the original BFloat16 model, while maintaining bit-identical outputs and supporting efficient GPU inference.
🔥🔥🔥 Thanks to DFloat11 compression, OmniGen2 can now run smoothly on a single 16GB GPU without any quality loss. 🔥🔥🔥
We apply Huffman coding to losslessly compress the exponent bits of BFloat16 model weights, which are highly compressible (their 8 bits carry only ~2.6 bits of actual information). To enable fast inference, we implement a highly efficient CUDA kernel that performs on-the-fly weight decompression directly on the GPU.
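As an illustrative sketch (not the actual DFloat11 implementation), the following estimates the Shannon entropy of the 8 exponent bits of BFloat16 values. Trained weights are roughly Gaussian-distributed, so a standard normal sample serves as a stand-in; the measured entropy lands near the ~2.6 bits quoted above, which is what makes Huffman coding effective here:

```python
import numpy as np

# Illustrative sketch, not the DFloat11 implementation: measure how much
# information the 8 exponent bits of BFloat16 weights actually carry.
# Trained weights are roughly Gaussian, so sample a standard normal.
rng = np.random.default_rng(0)
weights = rng.standard_normal(1_000_000).astype(np.float32)

# BFloat16 is the top 16 bits of float32: 1 sign, 8 exponent, 7 mantissa bits.
bits = weights.view(np.uint32)
exponents = ((bits >> 23) & 0xFF).astype(np.uint8)  # extract the exponent byte

# Shannon entropy of the exponent distribution, in bits per value.
counts = np.bincount(exponents, minlength=256)
probs = counts[counts > 0] / counts.sum()
entropy = -(probs * np.log2(probs)).sum()
print(f"exponent entropy: {entropy:.2f} bits (out of 8)")
```

Because the entropy is far below 8 bits, an entropy code such as Huffman coding can shrink the exponent field substantially with zero information loss, which is the core of DFloat11's lossless compression.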
The result is a model that is ~32% smaller, delivers bit-identical outputs, and achieves performance comparable to the original BFloat16 model.
Learn more in our research paper.
| Metric | OmniGen2 (BFloat16) | OmniGen2 (DFloat11) |
|---|---|---|
| Model Size | 16.23 GB | 11.11 GB |
| Peak GPU Memory (1024×1024 image generation) | 18.41 GB | 14.36 GB |
| Generation Time (A100 GPU) | 25 seconds | 27 seconds |
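The headline figures in the table can be cross-checked with a couple of lines of arithmetic (the 2.6-bit exponent-entropy estimate comes from the description above):

```python
# Sanity check of the numbers in the table above:
# a 16.23 GB BFloat16 checkpoint compressed losslessly to 11.11 GB.
bf16_gb, df11_gb = 16.23, 11.11
reduction = 1 - df11_gb / bf16_gb
print(f"size reduction: {reduction:.1%}")  # ~31.5%, i.e. roughly 32%

# Consistent with the information-theoretic estimate: of BFloat16's 16 bits
# (1 sign + 8 exponent + 7 mantissa), the exponent carries only ~2.6 bits,
# so a lossless code needs roughly 1 + 2.6 + 7 = 10.6 bits per weight.
est_bits = 1 + 2.6 + 7
print(f"estimated compressed size: {est_bits / 16:.1%} of BFloat16")
```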
Requires a CUDA-compatible GPU with at least 16GB of VRAM.
```bash
# 1. Clone the repo
git clone https://github.com/LeanModels/OmniGen2-DFloat11.git
cd OmniGen2-DFloat11

# 2. (Optional) Create a clean Python environment
conda create -n omnigen2 python=3.11
conda activate omnigen2

# 3. Install dependencies
# 3.1 Install PyTorch (choose the correct CUDA version)
pip install torch==2.6.0 torchvision --extra-index-url https://download.pytorch.org/whl/cu124

# 3.2 Install other required packages
pip install -r requirements.txt

# Note: version 2.7.4.post1 is pinned for compatibility with CUDA 12.4.
# Feel free to use a newer version if you use CUDA 12.6 or once this compatibility issue is fixed.
# OmniGen2 runs even without flash-attn, though we recommend installing it for best performance.
pip install flash-attn==2.7.4.post1 --no-build-isolation
```

Alternatively, install PyTorch and the other dependencies from domestic mirrors:

```bash
# Install PyTorch from a domestic mirror
pip install torch==2.6.0 torchvision --index-url https://mirror.sjtu.edu.cn/pytorch-wheels/cu124

# Install other dependencies from the Tsinghua mirror
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

# Note: version 2.7.4.post1 is pinned for compatibility with CUDA 12.4.
# Feel free to use a newer version if you use CUDA 12.6 or once this compatibility issue is fixed.
# OmniGen2 runs even without flash-attn, though we recommend installing it for best performance.
pip install flash-attn==2.7.4.post1 --no-build-isolation -i https://pypi.tuna.tsinghua.edu.cn/simple
```

The following examples automatically download the DFloat11 OmniGen2 model and use the GPU to generate or edit images, or to generate text.
```bash
# Visual understanding
bash example_understanding.sh

# Text-to-image generation
bash example_t2i.sh

# Instruction-guided image editing
bash example_edit.sh

# In-context generation
bash example_in_context_generation.sh
```

Gradio Demo:
```bash
# For generating images only
pip install gradio
python app.py

# Optional: share the demo with a public link (requires access to Hugging Face)
python app.py --share
```

```bash
# For generating images or text
pip install gradio
python app_chat.py
```

- Paper: 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
- GitHub: https://github.com/LeanModels/DFloat11
- HuggingFace: https://huggingface.co/DFloat11
OmniGen2 is a powerful and efficient generative model. Unlike OmniGen v1, OmniGen2 features two distinct decoding pathways for text and image modalities, utilizing unshared parameters and a decoupled image tokenizer. OmniGen2 has competitive performance across four primary capabilities:
- Visual Understanding: Inherits the robust ability to interpret and analyze image content from its Qwen-VL-2.5 foundation.
- Text-to-Image Generation: Creates high-fidelity and aesthetically pleasing images from textual prompts.
- Instruction-guided Image Editing: Executes complex, instruction-based image modifications with high precision, achieving state-of-the-art performance among open-source models.
- In-context Generation: A versatile capability to process and flexibly combine diverse inputs—including humans, reference objects, and scenes—to produce novel and coherent visual outputs.
As an open-source project, OmniGen2 provides a powerful yet resource-efficient foundation for researchers and developers exploring the frontiers of controllable and personalized generative AI.
Demonstration of OmniGen2's image editing capabilities.
Demonstration of OmniGen2's in-context generation capabilities.

