Xiaole Xian*♰1 · Zhichao Liao*♰2 · Qingyu Li3 · Wenyu Qin3 · Pengfei Wan3 · Weicheng Xie1✉ · Long Zeng2✉ · Linlin Shen1 · Pingfa Feng2
1Shenzhen University · 2Tsinghua University · 3Kuaishou Technology
*Equal contributions · ♰ Internship at KwaiVGI, Kuaishou Technology · ✉Corresponding authors
- 2025/04/02: 🔥 We released the technical report on arXiv.
- Technical report
- Inference code
- Pre-trained weights for inference [SDv1.5] on Hugging Face Models 🤗
- Demo deployment on Hugging Face Spaces 🤗
- Training code
(Thanks for your attention! The checkpoints and code are coming soon!)
Fine-tuning a pre-trained Text-to-Image (T2I) model on a tailored portrait dataset is the mainstream method for text-to-portrait customization. However, existing methods often severely alter the original model’s behavior (e.g., changes in identity (ID) and layout) while customizing portrait attributes. To address this issue, we propose SPF-Portrait, a pioneering approach that learns only the customized target semantics while minimizing disruption to the original model. In SPF-Portrait, we design a dual-path contrastive learning pipeline, which introduces the original model as a behavioral alignment reference for the conventional fine-tuning path. During contrastive learning, we propose a novel Semantic-Aware Fine Control Map that indicates the response intensity of the regions affected by the target semantics, and we use it to spatially guide the alignment between the two paths. The map adaptively balances behavioral alignment across different regions against responsiveness to the target semantics. Furthermore, we propose a novel response enhancement mechanism that reinforces the presentation of the target semantics while mitigating the representation discrepancy inherent in direct cross-modal supervision. Through these strategies, we achieve incremental learning of customized target semantics for pure text-to-portrait customization. Extensive experiments show that SPF-Portrait achieves state-of-the-art performance.
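To make the idea of spatially guided alignment more concrete, here is a minimal toy sketch. It is not the released SPF-Portrait code and not the exact loss from the technical report. It assumes the control map holds values in [0, 1], with 1 marking a region that responds strongly to the target semantics, and that alignment to the original model is simply down-weighted by `1 - map` in those regions. The function name `weighted_alignment_loss` and the tiny 2x2 feature maps are hypothetical and chosen only for illustration.

```python
def weighted_alignment_loss(ref_feat, tuned_feat, control_map):
    """Toy spatially weighted alignment between two feature maps.

    ref_feat:    features from the frozen original model (reference path).
    tuned_feat:  features from the model being fine-tuned.
    control_map: values in [0, 1]; 1 means "target-semantic region"
                 (assumption: alignment is relaxed there via 1 - map).
    All three arguments are 2D lists of floats with the same shape.
    """
    total = 0.0
    count = 0
    for ref_row, tuned_row, map_row in zip(ref_feat, tuned_feat, control_map):
        for r, t, m in zip(ref_row, tuned_row, map_row):
            # Squared difference, scaled down where the map is high.
            total += (1.0 - m) * (t - r) ** 2
            count += 1
    return total / count


# Hypothetical 2x2 feature maps: the tuned path differs in one location.
ref = [[1.0, 1.0], [1.0, 1.0]]
tuned = [[1.0, 3.0], [1.0, 1.0]]

# With an all-zero map, every location is aligned to the reference.
uniform_loss = weighted_alignment_loss(ref, tuned, [[0.0, 0.0], [0.0, 0.0]])

# With the changed location marked as a target-semantic region,
# the difference there no longer contributes to the loss.
masked_loss = weighted_alignment_loss(ref, tuned, [[0.0, 1.0], [0.0, 0.0]])
```

In this toy setting, `uniform_loss` penalizes the changed location while `masked_loss` ignores it, which mirrors the intent described above: keep the fine-tuned model close to the original everywhere except where the target attribute should appear.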
🔥 SPF-Portrait adapts a T2I model to human attributes without polluting its original capabilities.
🔥 For more results, visit our homepage
If you find SPF-Portrait useful for your research, please 🌟 this repo and cite our work using the following BibTeX:
@article{xian2025spf,
title={SPF-Portrait: Towards Pure Portrait Customization with Semantic Pollution-Free Fine-tuning},
author={Xian, Xiaole and Liao, Zhichao and Li, Qingyu and Qin, Wenyu and Wan, Pengfei and Xie, Weicheng and Zeng, Long and Shen, Linlin and Feng, Pingfa},
journal={arXiv preprint arXiv:2504.00396},
year={2025}
}

