NeurST aims to make it easy to build and train end-to-end speech translation models, and is carefully designed for extensibility and scalability. We believe this design lowers the barrier for NLP researchers to get started. In addition, NeurST allows researchers to train custom models for translation, summarization, and other tasks.
NeurST is based on TensorFlow 2; a PyTorch version is under development.
NeurST uses LightSeq 2.0 for fast training. LightSeq 2.0 is an efficient training acceleration library for Transformers, implemented in CUDA. It enables highly efficient computation of modern NLP models such as BERT and Transformer. Currently, only the fused encoder and decoder layers have been integrated; further accelerated modules such as the loss layer and optimizer are coming. On the standard machine translation task with batch size 4096, NeurST achieves speedup ratios of 1.52 (single GPU) and 1.33 (eight GPUs) over the TensorFlow implementation of Transformer. With batch size 512, the speedup ratios are 2.09 and 1.56.
A comparison of training speeds at different batch sizes is shown below.
- 8 GPUs
- 1 GPU
NeurST provides reference implementations of various models, including:
- Transformer (self-attention) networks
- Attention Is All You Need (Vaswani et al., 2017)
- Pay Less Attention With Lightweight and Dynamic Convolutions (Wu et al., 2019)
- CTNMT (Transformer with a BERT-enhanced encoder) from Towards Making the Most of BERT in Neural Machine Translation (Yang et al., 2020); see the examples in CTNMT
- Prune-Tune: Finding Sparse Structures for Domain Specific NMT
NeurST provides several strong and reproducible benchmarks for various tasks:
- Text Translation
- Speech-to-Text Translation
- Weight Pruning
- Quantization Aware Training
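To give a flavor of the weight-pruning task (in the spirit of Prune-Tune's sparse structures), here is a minimal magnitude-pruning sketch in NumPy. This is illustrative only, not NeurST's implementation; the function name and interface are our own for this example:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of `weights`.

    Returns the pruned weights and the binary mask (1 = kept).
    Illustrative sketch only; not part of the NeurST API.
    """
    w = np.asarray(weights, dtype=np.float64)
    k = int(sparsity * w.size)  # number of weights to drop
    if k == 0:
        return w.copy(), np.ones_like(w)
    # Threshold at the k-th smallest absolute value; ties may prune extra weights.
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    mask = (np.abs(w) > threshold).astype(w.dtype)
    return w * mask, mask
```

In practice, such a mask is applied after each optimizer step so the remaining weights keep training while the pruned positions stay at zero.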
- multi-GPU (distributed) training on one machine or across multiple machines
- mixed precision training (trains faster with less GPU memory)
- multiple search algorithms implemented:
- beam search
- sampling (unconstrained, top-k and top-p)
- large mini-batch training even on a single GPU via delayed updates (gradient accumulation)
- TensorFlow SavedModel export for TensorFlow Serving
- TensorFlow XLA support for speeding up training
- extensible: easily register new datasets, models, criterions, tasks, optimizers and learning rate schedulers
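As a concrete illustration of the top-k and top-p (nucleus) sampling mentioned above, here is a minimal NumPy sketch of the filtering step. This is a generic description of the technique, not NeurST's code; the helper names are our own:

```python
import numpy as np

def top_k_top_p_filter(logits, k=0, p=1.0):
    """Mask logits outside the top-k set and/or the top-p (nucleus) set.

    k=0 disables top-k filtering; p=1.0 disables top-p filtering.
    Illustrative sketch only; not part of the NeurST API.
    """
    logits = np.asarray(logits, dtype=np.float64).copy()
    if k > 0:
        kth = np.sort(logits)[-k]          # k-th largest logit
        logits[logits < kth] = -np.inf
    if p < 1.0:
        order = np.argsort(logits)[::-1]   # indices sorted by descending logit
        probs = np.exp(logits[order] - np.max(logits))
        probs /= probs.sum()
        # keep the smallest prefix whose cumulative probability reaches p
        cutoff = np.searchsorted(np.cumsum(probs), p) + 1
        logits[order[cutoff:]] = -np.inf
    return logits

def sample(logits, rng, k=0, p=1.0):
    """Draw one token id from the filtered distribution."""
    filtered = top_k_top_p_filter(logits, k=k, p=p)
    probs = np.exp(filtered - np.max(filtered))
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)
```

Unconstrained sampling corresponds to `k=0, p=1.0`, i.e. drawing directly from the full softmax distribution.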
- Python version >= 3.6
- TensorFlow >= 2.3.0
Install NeurST from source:
git clone https://github.com/bytedance/neurst.git
cd neurst/
pip3 install -e .
If an ImportError occurs at runtime, manually install the missing packages.
To enable LightSeq for fast training, choose the right PyPI package for your TensorFlow and CUDA versions. For example, if your platform has TensorFlow 2.4.1 with CUDA 11.0, install LightSeq with:
git checkout lightseq
pip install lightseq-tf2.4.1-cuda11.0.221==2.0.1
If you train with multiple GPUs, use Horovod as the distribution strategy, for example:
python3 -m neurst.cli.run_exp \
--config_paths wmt14_en_de/training_args.yml,wmt14_en_de/translation_bpe.yml \
--hparams_set transformer_base \
--model_dir wmt14_en_de/benchmark_base \
--enable_xla --distribution_strategy horovod
@misc{zhao2020neurst,
title={NeurST: Neural Speech Translation Toolkit},
author={Chengqi Zhao and Mingxuan Wang and Lei Li},
year={2020},
eprint={2012.10018},
archivePrefix={arXiv},
}
For any questions or suggestions, please feel free to contact us: [email protected], [email protected].
We thank Bairen Yi, Zherui Liu, Yulu Jia, Yibo Zhu, Jiaze Chen, Jiangtao Feng, Zewei Sun for their kind help.

