Beatboxing classification using the Amateur Vocal Percussion (AVP) dataset. This is the code used for our final report of the MSCI 446 project.
Install conda and create the environment from the provided file:
conda env create -f environment.yml
Download the dataset here and unzip it.
We use the onset labels to split up the audio clips into individual utterances. To run this pre-processing, do:
python preprocess.py --root <DATASET_ROOT_DIR> --val_ratio <VALIDATION_SET_RATIO>
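The splitting step can be sketched as below. This is only an illustration, not the contents of preprocess.py: the function names `split_by_onsets` and `train_val_split` are hypothetical, and it assumes that each utterance runs from its onset to the next onset (or to the end of the clip) and that the validation set is drawn by a random shuffle.

```python
import numpy as np


def split_by_onsets(audio, sample_rate, onsets_sec):
    """Cut a mono waveform into one segment per onset.

    Assumption: each segment ends where the next onset begins,
    and the last segment runs to the end of the clip.
    """
    starts = [int(round(t * sample_rate)) for t in sorted(onsets_sec)]
    ends = starts[1:] + [len(audio)]
    return [audio[s:e] for s, e in zip(starts, ends) if e > s]


def train_val_split(items, val_ratio, seed=0):
    """Randomly hold out a fraction `val_ratio` of the items for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(items))
    n_val = int(round(len(items) * val_ratio))
    val = [items[i] for i in order[:n_val]]
    train = [items[i] for i in order[n_val:]]
    return train, val


# Toy example: a 3-second "clip" at 1 kHz with three labelled onsets.
sr = 1000
audio = np.arange(3000, dtype=np.float32)
segments = split_by_onsets(audio, sr, [0.5, 1.2, 2.0])
train, val = train_val_split(segments, val_ratio=0.33)
```

In this toy example the audio before the first onset is discarded; whether the real script keeps or drops it is not something this sketch tries to reproduce.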
To run training, set up config.yaml with the correct dataset path and desired hyperparameters, then run:
python train.py config.yaml
To produce our final model, we used Weights & Biases to run a hyperparameter sweep over the learning rate, weight decay, and eps parameters. To run the sweep, first log in by running "wandb login" and entering your wandb API key. Then create the sweep:
wandb sweep sweep.yaml
A sweep ID will be printed in the terminal. To start training models, run:
wandb agent <SWEEP_ID>
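For orientation, a sweep over these three hyperparameters might be configured roughly as follows. It is written here as a Python dictionary in the shape Weights & Biases expects for a sweep definition. Everything specific in it is an assumption: the search method, the metric name `val_loss`, the parameter keys, and all ranges are placeholders, and the actual values live in sweep.yaml.

```python
# Hypothetical sweep definition; the real settings are in sweep.yaml.
sweep_config = {
    "method": "random",  # assumed search strategy
    "metric": {"name": "val_loss", "goal": "minimize"},  # assumed metric name
    "parameters": {
        # Assumed ranges, sampled on a log scale.
        "lr": {"distribution": "log_uniform_values", "min": 1e-5, "max": 1e-2},
        "weight_decay": {"distribution": "log_uniform_values", "min": 1e-6, "max": 1e-2},
        # Assumed discrete choices for the optimizer's eps.
        "eps": {"values": [1e-8, 1e-7, 1e-6]},
    },
}

# A dictionary like this can also be registered from Python with
# wandb.sweep(sweep_config, project="<PROJECT_NAME>") instead of the CLI.
swept_parameters = sorted(sweep_config["parameters"])
```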
The rest of our experiments were done in the MFCC_learning.ipynb notebook. This includes all non-CNN supervised learning and all unsupervised learning.