Clone repository
git clone https://github.com/sony/CLIPSep.git
cd CLIPSep/clipsep

Install the dependencies:
conda env create -f environment.yml
conda activate clipsep

We provide a script to download the datasets used in our paper and the pre-trained networks. The datasets and network checkpoints are downloaded to the CLIPSep/clipsep/data and CLIPSep/clipsep/exp/vggsound directories, respectively.
For the MUSIC dataset, use the script in the CLIPSep/music directory.
For the VGGSound dataset, use the script in the CLIPSep/vggsound directory.
The pretrained CLIPSep-NIT model can be found here.
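
Before training or evaluating, it can help to confirm that the download step produced the two directories named above. The following is a minimal sketch, not part of the repository: the helper name `missing_dirs` and the constant `EXPECTED_DIRS` are hypothetical, and the check only tests that the directories exist, not that their contents are complete.

```python
from pathlib import Path

# Directories named in this README, relative to the CLIPSep repository root.
# The constant and helper names are illustrative, not part of the CLIPSep code.
EXPECTED_DIRS = ("clipsep/data", "clipsep/exp/vggsound")


def missing_dirs(repo_root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in expected if not (root / d).is_dir()]
```

Calling `missing_dirs("CLIPSep")` from the directory that contains the clone returns an empty list when both directories are present, and otherwise lists the ones that are missing.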
Train CLIPSep-NIT model
python train.py -o exp/vggsound/clipsep_nit -t data/vggsound/train.csv -v data/vggsound/val.csv --image_model clipsepnit

Run inference with a text query
OMP_NUM_THREADS=1 python infer.py -o exp/vggsound/clipsep_nit/ -i "demo/audio/hvCj8Dk0Su4.wav" --text_query "playing bagpipes" -f "exp/vggsound/clipsep_nit/hvCj8Dk0Su4/playing bagpipes.wav"

Evaluate on MUSIC + VGGSound
OMP_NUM_THREADS=1 python evaluate.py -o exp/vggsound/clipsep_nit/ -l exp/vggsound/clipsep_nit/eval_woPIT_MUISC_VGGS.txt -t data/MUSIC/solo/test.csv -t2 data/vggsound/test-good-no-music.csv --no-pit --prompt_ens

Evaluate on VGGSoundClean + VGGSound
OMP_NUM_THREADS=1 python evaluate.py -o exp/vggsound/clipsep_nit/ -l exp/vggsound/clipsep_nit/eval_woPIT_VGGS_VGGSN.txt -t data/vggsound/test-good.csv -t2 data/vggsound/test-no-music.csv --no-pit --prompt_ens

Visualize results on VGGSoundClean + VGGSound
OMP_NUM_THREADS=1 python visualize.py -o exp/vggsound/clipsep_nit/ -t data/vggsound/test-good.csv -t2 data/vggsound/test-no-music.csv --vis_dir exp/vggsound/clipsep_nit/visualization_VGGSGood_VGGS

If you find this work useful for your research, please cite our paper:
@inproceedings{dong2023clipsep,
title={CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos},
author={Hao-Wen Dong and Naoya Takahashi and Yuki Mitsufuji and Julian McAuley and Taylor Berg-Kirkpatrick},
booktitle={Proceedings of International Conference on Learning Representations (ICLR)},
year={2023}
}
