[feature] add a frontend module in wespeaker and support wavlm#344
Conversation
force-pushed from b9b8fb2 to a85085f
with torch.cuda.amp.autocast(enabled=configs['enable_amp']):
    features, _ = model.module.frontend(wavs, wavs_len)
I don't think it is necessary to add the amp context here, since no PyTorch model is involved.
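For context, a minimal sketch of the point being made (the `forward_step` helper and the `Linear` stand-in are illustrative, not code from this PR): autocast only pays off around learnable PyTorch ops, so it belongs at the model forward, not around plain feature extraction.

```python
import torch

# Illustrative stand-in for the speaker model (not the PR's actual model)
model = torch.nn.Linear(80, 192)

def forward_step(feats, enable_amp=False):
    # autocast belongs here, where model ops can run in fp16 on GPU;
    # fbank extraction upstream involves no model and gains nothing from amp
    with torch.cuda.amp.autocast(enabled=enable_amp):
        return model(feats)

feats = torch.randn(4, 80)   # plain tensor ops: no amp context needed
out = forward_step(feats)    # shape: (4, 192)
```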
def spec_aug(feats, num_t_mask=1, num_f_mask=1, max_t=10, max_f=8, prob=0.6):
    # feats batch: (B,T,F)
    # do spec_aug on all batch samples using the same group of params randomly
    # TODO (hongji): do spec_aug on each sample separately
I think you can directly use the implementation in https://pytorch.org/audio/master/generated/torchaudio.transforms.FrequencyMasking.html#torchaudio.transforms.FrequencyMasking
Good idea. I will try it later.
Hello @JiJiJiang, I have listed some comments. Besides, there seems to be no independent recipe with run.sh.
Do you have any checkpoint for wavlm+ecapa-tdnn?
Sorry, I have lost access to my exp dir, as well as the checkpoint.
…-e2e#344)
* [feature] add a frontend module in wespeaker and support wavlm
* update .gitignore
* update wavlm configs
* update wespeaker/frontend/__init__.py
* [fix] remove trailing whitespaces
* [fix] fix lint errors
* [fix] fix lint errors
* [fix] fix lint errors
* [fix] fix spelling mistakes
* update run.sh
* update wavlm configs and add run_wavlm.sh
* update README.md
All pre-trained models and configs on the pretrained page can be loaded and used normally after this update!