Tackling Low-Resource Sign Language Translation: UPC at WMT-SLT 22

This repository contains the implementation for the WMT-SLT22 UPC team submission. The paper will be linked and available soon.

First steps

Clone this repository, create the conda environment and install Fairseq:

git clone -b wmt-slt22 git@github.com:mt-upc/fairseq.git
cd fairseq

conda env create -f ./examples/sign_language/environment.yml
conda activate sign-language

pip install --editable .

The execution of scripts is managed with Task. Please follow the installation instructions in the official documentation.

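We recommend the following command, which installs the Task binary into the bin directory of the conda environment (replace path-to-env with the location of your conda environments):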

sh -c "$(curl --location https://taskfile.dev/install.sh)" -- -d -b path-to-env/sign-language/bin

Pre-processing steps

The tasks for the pre-processing steps are defined in Taskfile.yml. Task names follow the pattern (challenge):dataset:(partition):task, where task is one of:

download: downloads the dataset
extract: decompresses the dataset
convert2video: converts frames to video. Necessary for the phoenix dataset.
videos_to_25fps: converts videos from any frame rate to 25 fps. Necessary for FocusNews.
extract_mediapipe: extracts MediaPipe poses following the pose-format library.
generate_tsv: generates the tsv files for the dataset necessary in Fairseq.
train_sentencepipece: trains the sentencepiece model with the provided dataset data.
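
To give a feel for the kind of file generate_tsv produces, here is a minimal sketch that writes a tab-separated manifest with Python's csv module. The column names (id, signs_file, translation) and the helper functions are assumptions made for illustration only; the columns actually expected by Fairseq are defined by the scripts in this repository.

```python
import csv
import io

# Hypothetical column layout, chosen for illustration; the real manifest
# columns are defined by the repository's own generate_tsv scripts.
COLUMNS = ["id", "signs_file", "translation"]


def write_manifest(rows, stream):
    """Write rows (dicts keyed by COLUMNS) to stream as tab-separated text."""
    writer = csv.DictWriter(
        stream, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


def manifest_to_text(rows):
    """Return the manifest as a string, which is convenient for inspection."""
    buffer = io.StringIO()
    write_manifest(rows, buffer)
    return buffer.getvalue()
```

Passing an open file instead of a StringIO buffer to write_manifest would store the manifest on disk, one row per sample.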

Tip: you can create an .env file that contains all local variables such as paths, WandB project, etc.
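
As a rough sketch of what such a file could hold and how its KEY=VALUE lines can be read, the snippet below parses .env-style text in Python. The variable names DATA_DIR and WANDB_PROJECT are hypothetical examples, not names required by this repository, and parse_env is an illustrative helper rather than part of the codebase.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        variables[key.strip()] = value.strip().strip('"').strip("'")
    return variables


# Hypothetical contents of a local .env file.
EXAMPLE_ENV = """\
# local settings
DATA_DIR=/data/slt
WANDB_PROJECT="slt-experiments"
"""
```

Calling parse_env(EXAMPLE_ENV) yields a dictionary with one entry per variable, which makes it easy to check that every path is set before launching a task.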

Once the environment is set up, run any of the tasks with:

task (challenge):dataset:(partition):task
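
The naming pattern can be sketched in Python as follows, assuming that the parenthesised parts (challenge and partition) are optional and are simply left out when they do not apply. The helper names and the example values "wmt_slt22", "focusnews" and "train" are hypothetical; check Taskfile.yml for the task names that actually exist.

```python
def build_task_name(dataset, task, challenge=None, partition=None):
    """Join the name components with ':', skipping the optional ones that are unset."""
    parts = [challenge, dataset, partition, task]
    return ":".join(part for part in parts if part)


def build_command(dataset, task, challenge=None, partition=None):
    """Return the argument list for invoking the Task runner on that task."""
    return ["task", build_task_name(dataset, task, challenge, partition)]
```

For example, build_command("phoenix", "download") gives ["task", "phoenix:download"], which could be passed to subprocess.run.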

Training

Training is launched with the train.sh script. Each experiment must have a corresponding .yaml file; the ones we used are in the configs folder. The script creates a folder named after the experiment and saves the checkpoints, the logs and the wandb files there.
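
The relationship between an experiment name, its .yaml file and its output folder can be sketched as below. This is not the logic of train.sh itself; prepare_experiment and the directory arguments are hypothetical and only illustrate the convention of one config file and one output folder per experiment.

```python
from pathlib import Path


def prepare_experiment(name, configs_dir, output_root):
    """Locate <configs_dir>/<name>.yaml and create <output_root>/<name>.

    Returns the config path and the experiment folder, where checkpoints,
    logs and wandb files would then be written.
    """
    config_path = Path(configs_dir) / f"{name}.yaml"
    if not config_path.is_file():
        raise FileNotFoundError(f"No config found for experiment '{name}': {config_path}")
    experiment_dir = Path(output_root) / name
    experiment_dir.mkdir(parents=True, exist_ok=True)
    return config_path, experiment_dir
```

Failing early when the .yaml file is missing avoids creating an empty experiment folder for a misspelled experiment name.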

Test

As with the pre-processing steps, we have created tasks to generate the predictions. The tasks are called generate and generate_no_target.

Citations

Some scripts from this repository use the GNU Parallel software.

Tange, Ole. (2022). GNU Parallel 20220722 ('Roe vs Wade'). Zenodo. https://doi.org/10.5281/zenodo.6891516

If you use this code, please cite the following paper:

@inproceedings{EMNLP WMT-SLT 2022,
  author = {Laia Tarrés and Gerard Ion Gállego and Xavier Giró-i-Nieto and Jordi Torres},
  title = {Tackling Low-Resource Sign Language Translation: UPC at WMT-SLT 22},
  booktitle = {},
  year = {2022}
}

Check the original Fairseq README to learn how to use this toolkit.
