This is the official repository for our paper:
HOComp: Interaction-Aware Human-Object Composition

Preprint available on arXiv
HOComp is a novel framework for harmonizing foreground objects into human-centric backgrounds.
By leveraging a Flux.1 Kontext base model and a novel Sequence Concatenation strategy, the method achieves precise control over human-object interactions with high fidelity.
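The core idea of sequence concatenation can be sketched as follows: both images are patchified into token sequences and joined along the sequence axis, so a transformer's self-attention can relate every foreground token to every background token. This is a minimal illustrative sketch, not the actual HOComp implementation; the function names, patch size, and shapes are assumptions.

```python
import numpy as np

def patchify(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W, C) image into flattened (N, patch*patch*C) tokens.

    Hypothetical helper for illustration; assumes H and W are divisible
    by the patch size.
    """
    h, w, c = image.shape
    return (
        image.reshape(h // patch, patch, w // patch, patch, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(-1, patch * patch * c)
    )

def concat_sequences(bg: np.ndarray, fg: np.ndarray, patch: int = 16) -> np.ndarray:
    """Concatenate background and foreground tokens into one joint sequence.

    In a real model, a type/position embedding would mark which region each
    token came from before the sequence is fed to the attention layers.
    """
    return np.concatenate([patchify(bg, patch), patchify(fg, patch)], axis=0)

bg = np.zeros((64, 64, 3), dtype=np.float32)   # background: 4x4 = 16 patches
fg = np.zeros((32, 32, 3), dtype=np.float32)   # foreground: 2x2 = 4 patches
seq = concat_sequences(bg, fg)
print(seq.shape)  # (20, 768): 20 tokens of 16*16*3 = 768 values each
```

Because the two token sets share one sequence, a single attention pass can condition the generated interaction on both images at once, rather than fusing them only at the pixel level.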
To generate a specific interaction, provide the background and foreground images, the interaction prompt, and a bounding box for the foreground:
```shell
python run_inference.py \
    --prompt "A young man holding a vintage camera" \
    --bg_path "examples/background.jpg" \
    --fg_path "examples/camera.png" \
    --box "[300 300 700 700]"
```

If you find our work helpful, please consider citing:
```bibtex
@article{liang2025hocomp,
  title={HOComp: Interaction-Aware Human-Object Composition},
  author={Dong Liang and Jinyuan Jia and Yuhao Liu and Rynson W. H. Lau},
  journal={arXiv preprint arXiv:2507.16813},
  year={2025}
}
```