# PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models

Official GitHub repo for PrismLayers.
In this work, we release the first open, ultra-high-fidelity dataset of multi-layer transparent images, PrismLayers (200K), together with its high-quality subset PrismLayersPro (20K). Both come with accurate alpha mattes and are synthesized via our bottom-up approach, MultilayerFlux.
This project provides three core components:

- **ART+**, a new-generation multi-layer image generation model trained on our synthesized data.
- **TIPS**, a Transparent Image Quality Assessment model trained on over 100K transparent image pairs.
- An **artifact filter** for multi-layer transparent images that automatically removes low-quality generations.
## Table of Contents

- Environment Setup
- Checkpoints
- Demo
  - ART+
  - Artifact Classifier
  - TIPS
## Environment Setup

```bash
conda create -n multilayer python=3.10 -y
conda activate multilayer
pip3 install torch==2.4.0 torchvision==0.19.0
pip install diffusers==0.31.0 transformers==4.44.0 accelerate==0.34.2 peft==0.12.0 datasets==2.20.0
pip install wandb==0.17.7 einops==0.8.0 sentencepiece==0.2.0 mmengine==0.10.4 prodigyopt==1.0
huggingface-cli login
```

## Checkpoints

Create a directory named `checkpoints` and download the following checkpoints into it.
| Variable | Description | Action Required |
|---|---|---|
| `base_ckpt` | Pretrained ART model | Download from Google Drive |
| `artplus_PrismLayersPro_ckpt` | ART2 model fine-tuned at 1024 resolution with high-quality data from the PrismLayersPro dataset | Download from Google Drive |
| `artplus_PrismLayersPro100k_ckpt` | ART2 model fine-tuned at 1024 resolution with ultra-high-quality data from the PrismLayersPro100k dataset | Download from Google Drive |
| `transp_vae_ckpt` | Multi-layer transparency decoder checkpoint | Download from Google Drive |
| `pre_fuse_lora_dir` | LoRA weights to be fused initially | Download from Google Drive |
| `classifier_ckpt` | Artifact filter for multi-layer transparent images | Download from Google Drive |
| `TIPS_ckpt` | Transparent Image Quality Assessment model | Download from Google Drive |
You can choose either `artplus_PrismLayersPro_ckpt` or `artplus_PrismLayersPro100k_ckpt` as `artplus_ckpt`.

The downloaded checkpoints for ART+'s inference should be organized as follows:
```text
checkpoints/
├── artplus_ckpt/
│   ├── layer_pe.pth
│   └── pytorch_lora_weights.safetensors
├── base_ckpt/
│   └── pytorch_lora_weights.safetensors
├── pre_fuse_lora/
│   └── pytorch_lora_weights.safetensors
└── transparent_decoder_ckpt.pt
```
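Before running inference, it can help to verify that the directory matches the tree above. The following is a minimal sketch (not part of the official codebase) that reports any expected file that is missing:

```python
from pathlib import Path

# Expected files for ART+ inference, mirroring the checkpoint tree above.
EXPECTED = [
    "artplus_ckpt/layer_pe.pth",
    "artplus_ckpt/pytorch_lora_weights.safetensors",
    "base_ckpt/pytorch_lora_weights.safetensors",
    "pre_fuse_lora/pytorch_lora_weights.safetensors",
    "transparent_decoder_ckpt.pt",
]

def missing_checkpoints(root="checkpoints"):
    """Return the expected checkpoint files that are absent under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]

if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoint files:", *missing, sep="\n  ")
    else:
        print("All checkpoints in place.")
```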
## Demo

### ART+

Use our prepared layout with style-guidance caption in `inference_layout/demo/test_1.json`:

```bash
python multi_layer_gen/demo.py -c multi_layer_gen/demo.yaml -i inference_layout/test_1.json
```

- Change the `save_dir` in `multi_layer_gen/demo.yaml`.
#### `inference_layout` format

```json
[
  {
    "whole_image": {
      "caption": "This is a [STYLE_NAME] style image. xxx",
      "height": 1024,
      "width": 1024
    },
    "layout": [
      [0, 0, 1023, 1023],
      [0, 0, 1023, 1023],
      [x11, y11, x12, y12],
      [x21, y21, x22, y22],
      ...
    ]
  },
  ...
]
```
- Box coordinates range from 0 to `image_size - 1`.
- `STYLE_NAME` options: 3D, Japanese anime, cartoon, watercolor painting, steampunk, pixel art, melting gold, furry, metal-textured, pop art, wood carving, melting silver, toy, Pokemon, sand painting, line draw, kid crayon drawing, neon graffiti, doodle art, papercut art, ink
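A layout file following the format above can be assembled programmatically. The sketch below builds a minimal example; the caption text and the per-layer boxes are illustrative values chosen by us, not shipped with the repo:

```python
import json

# Build a minimal inference_layout file following the format above.
# By convention the first two boxes cover the whole canvas; the rest are
# per-layer bounding boxes (here, arbitrary example coordinates).
size = 1024
sample = {
    "whole_image": {
        "caption": "This is a watercolor painting style image. A pink shiny star over a calm lake.",
        "height": size,
        "width": size,
    },
    "layout": [
        [0, 0, size - 1, size - 1],  # whole-canvas box
        [0, 0, size - 1, size - 1],  # whole-canvas box
        [128, 128, 511, 511],        # example per-layer box
        [512, 256, 895, 767],        # example per-layer box
    ],
}

# Sanity check: every box must stay inside [0, image_size - 1].
for x1, y1, x2, y2 in sample["layout"]:
    assert 0 <= x1 <= x2 <= size - 1 and 0 <= y1 <= y2 <= size - 1

with open("my_layout.json", "w") as f:
    json.dump([sample], f, indent=2)
print("wrote my_layout.json")
```

Pass the resulting file to `multi_layer_gen/demo.py` via `-i` as shown above.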
### Artifact Classifier

**Env**

```bash
pip install hpsv2
wget https://dl.fbaipublicfiles.com/mmf/clip/bpe_simple_vocab_16e6.txt.gz
mv bpe_simple_vocab_16e6.txt.gz [site-packages_path]/hpsv2/src/open_clip/bpe_simple_vocab_16e6.txt.gz
```

The classifier outputs values between 0 and 1, where higher scores indicate more severe artifacts. We use 0.18 as the threshold.
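The 0.18 threshold can be applied to the classifier's output when filtering generations. This is a hypothetical sketch: we assume the inference script writes a JSON result with a `"score"` field, which is not confirmed by the repo docs, so check the actual output schema first:

```python
import json

# Hypothetical assumption: the classifier's output JSON looks like
# {"score": 0.42}; verify the real key name in the produced file.
ARTIFACT_THRESHOLD = 0.18  # threshold stated above

def is_artifact(result_path, threshold=ARTIFACT_THRESHOLD):
    """True if the classifier score (higher = more severe) exceeds the threshold."""
    with open(result_path) as f:
        return json.load(f)["score"] > threshold
```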
**Inference**

```bash
cd artifact_classifier
python inference.py --image_path merged.png --classifier_weights [ckpt_file_path] --output_file merged.json
```

### TIPS

The TIPS score ranges from 0 to 1, with higher scores indicating better transparent image quality.
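TIPS scores can likewise be used to keep only sufficiently high-quality layers. As with the artifact classifier, the `"score"` key and the 0.5 cutoff below are our assumptions for illustration, not values documented by the repo:

```python
import json

# Hypothetical assumption: TIPS/inference.py writes JSON like
# {"score": 0.9}; verify the key against the real result.json.
def filter_by_tips(result_paths, min_score=0.5):
    """Return paths whose TIPS score (higher = better) is at least min_score."""
    kept = []
    for path in result_paths:
        with open(path) as f:
            if json.load(f)["score"] >= min_score:
                kept.append(path)
    return kept
```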
**Inference**

```bash
cd TIPS
python inference.py --img_path test.png --caption "A pink shiny star" --checkpoint [ckpt_file_path] --output_file result.json
```

## Citation

Please kindly cite our paper if you use our code, data, models, or results:
```bibtex
@article{chen2025prismlayers,
  title={PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models},
  author={Chen, Junwen and Jiang, Heyang and Wang, Yanbin and Wu, Keming and Li, Ji and Zhang, Chao and Yanai, Keiji and Chen, Dong and Yuan, Yuhui},
  journal={arXiv preprint arXiv:2505.22523},
  year={2025}
}
```
