Bayesian Compression of Recurrent Neural Networks

This repo contains the code for our EMNLP 2018 paper Bayesian Compression for Natural Language Processing and our NeurIPS 2018 CDNNRIA Workshop paper Bayesian Sparsification of Gated Recurrent Neural Networks. We showed that Variational Dropout leads to extremely sparse solutions in recurrent neural networks.

The code uses Lasagne + Theano.

Launch text classification experiments

Scripts for the different datasets (IMDB, AGNews) are stored in the Experiments folder.

cd Experiments/<dataset>
python main.py CONFIG

CONFIG is a 6-character string defining which model to train. Each character of CONFIG is one of the following (a sketch of what the L option means follows this list):

  • L: probabilistic weight with log-uniform prior and normal approximate posterior
  • D: deterministic learnable weight
  • C: constant weight with all elements equal to 1.
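
A minimal NumPy sketch of what an L weight does (our illustration, not the repo's Theano/Lasagne code; the function names and pruning threshold are our assumptions): under Sparse Variational Dropout each weight has a fully factorized normal approximate posterior N(theta, sigma^2) and a log-uniform prior, the weight is sampled via the reparameterization trick during training, and the KL term is computed with the approximation from Molchanov et al. (2017), on which Sparse VD is based.

import numpy as np

def sample_l_weight(theta, log_sigma2, rng):
    # Reparameterized sample: w = theta + sigma * eps, eps ~ N(0, 1).
    sigma = np.exp(0.5 * log_sigma2)
    return theta + sigma * rng.standard_normal(theta.shape)

def neg_kl_approx(theta, log_sigma2):
    # Approximation of -KL(q || log-uniform prior), constants from
    # Molchanov et al. (2017); log_alpha = log(sigma^2 / theta^2).
    k1, k2, k3 = 0.63576, 1.87320, 1.48695
    log_alpha = log_sigma2 - 2.0 * np.log(np.abs(theta) + 1e-8)
    sigmoid = 1.0 / (1.0 + np.exp(-(k2 + k3 * log_alpha)))
    return k1 * sigmoid - 0.5 * np.log1p(np.exp(-log_alpha)) - k1

rng = np.random.default_rng(0)
theta = rng.standard_normal((4, 4))
log_sigma2 = np.full((4, 4), -5.0)
w = sample_l_weight(theta, log_sigma2, rng)          # stochastic weight for a forward pass
kl = -neg_kl_approx(theta, log_sigma2).sum()         # KL term added to the training loss
log_alpha = log_sigma2 - 2.0 * np.log(np.abs(theta))
keep_mask = log_alpha < 3.0  # a common pruning rule: drop weights with large alpha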

The 6 characters stand for (a small decoding example follows the list):

  • CONFIG[0]: embedding layer, individual weights
  • CONFIG[1]: embedding layer, group variables for vocabulary elements
  • CONFIG[2]: LSTM layer, individual weights
  • CONFIG[3]: LSTM layer, group variables for preactivation of gates
  • CONFIG[4]: LSTM layer, group variables for input and hidden neurons
  • CONFIG[5]: fully-connected layer, individual weights
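
As a quick check, here is a hypothetical helper (not part of the repo) that spells out a 6-character CONFIG position by position:

LAYERS = [
    "embedding layer, individual weights",
    "embedding layer, group variables for vocabulary elements",
    "LSTM layer, individual weights",
    "LSTM layer, group variables for preactivation of gates",
    "LSTM layer, group variables for input and hidden neurons",
    "fully-connected layer, individual weights",
]
KINDS = {"L": "probabilistic (Sparse VD)", "D": "deterministic", "C": "constant 1"}

def describe(config):
    assert len(config) == 6 and set(config) <= set(KINDS)
    for char, layer in zip(config, LAYERS):
        print(f"{char}: {layer} -> {KINDS[char]}")

describe("LLLCCL")  # SparseVD-Voc from the EMNLP 2018 paper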

Configs for the paper Bayesian Compression for Natural Language Processing:

  • Original: DCDCCD
  • SparseVD: LCLCCL
  • SparseVD-Voc: LLLCCL

Configs for the paper Bayesian Sparsification of Gated Recurrent Neural Networks:

  • Original: DCDCCD
  • SparseVD W: LCLCCL
  • SparseVD W+N: LLLCLL
  • SparseVD W+G+N: LLLLLL
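
For example, to train the fully sparsified SparseVD W+G+N model on IMDB (assuming the dataset folder is named IMDB, as listed above):

cd Experiments/IMDB
python main.py LLLLLL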

During training, accuracy and compression rates are printed to the output stream. The results reported in the papers were obtained with Python 3.6.3, Theano 1.0.1, and Lasagne 0.2.dev1.
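
One possible way to set up a matching environment with pip (our assumption, not an official recipe from the authors; Lasagne 0.2.dev1 is the development snapshot installed from the GitHub master branch):

pip install theano==1.0.1
pip install https://github.com/Lasagne/Lasagne/archive/master.zip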

Launch language generation experiments

Scripts for char-level and word-level tasks on the PTB dataset are stored in the Experiments folder (Char_PTB, Word_PTB).

cd Experiments/<dataset>
python main.py CONFIG

CONFIG is a 4-character string defining which model to train. Each character of CONFIG is one of the following:

  • L: probabilistic weight with log-uniform prior and normal approximate posterior
  • D: deterministic learnable weight
  • C: constant weight with all elements equal to 1.

There are also two additional options for group sparsification of neurons in the LSTM layer; in the configs below they appear only at CONFIG[2]:

  • R: L option ONLY for hidden neurons
  • I: L option ONLY for input neurons

The 4 characters stand for (a decoding example follows the list):

  • CONFIG[0]: LSTM layer, individual weights
  • CONFIG[1]: LSTM layer, group variables for preactivation of gates
  • CONFIG[2]: LSTM layer, group variables for input and hidden neurons
  • CONFIG[3]: fully-connected layer, individual weights
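
Mirroring the 6-character example above, a hypothetical decoder (not part of the repo) for the 4-character CONFIG, including the R / I options that apply only to CONFIG[2]:

LAYERS4 = [
    "LSTM layer, individual weights",
    "LSTM layer, group variables for preactivation of gates",
    "LSTM layer, group variables for input and hidden neurons",
    "fully-connected layer, individual weights",
]
KINDS4 = {
    "L": "probabilistic (Sparse VD)",
    "D": "deterministic",
    "C": "constant 1",
    "R": "probabilistic for hidden neurons only",
    "I": "probabilistic for input neurons only",
}

def describe4(config):
    assert len(config) == 4
    assert all(c in "LDC" for c in config[:2] + config[3]) and config[2] in KINDS4
    for char, layer in zip(config, LAYERS4):
        print(f"{char}: {layer} -> {KINDS4[char]}")

describe4("LCRL")  # SparseVD W+N from the workshop paper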

Configs for the paper Bayesian Compression for Natural Language Processing:

  • Original: DCCD
  • SparseVD: LCCL
  • SparseVD-Voc: LCIL

Configs for the paper Bayesian Sparsification of Gated Recurrent Neural Networks:

  • Original: DCCD
  • SparseVD W: LCCL
  • SparseVD W+N: LCRL
  • SparseVD W+G+N: LLRL
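
For example, to train the SparseVD W+N model on char-level PTB:

cd Experiments/Char_PTB
python main.py LCRL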

During training, accuracy and compression rates are printed to the output stream. The results reported in the papers were obtained with Python 2.7.13, Theano 0.9.0.dev-RELEASE, and Lasagne 0.2.dev1.

Citation

If you find this code useful, please cite one of our papers:

@InProceedings{SparseBayesianRNN,
  author    = "Chirkova, Nadezhda and Lobacheva, Ekaterina and Vetrov, Dmitry",
  title     = "Bayesian Compression for Natural Language Processing",
  booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
  year      = "2018",
  publisher = "Association for Computational Linguistics",
  pages     = "2910--2915",
  location  = "Brussels, Belgium",
  url       = "http://aclweb.org/anthology/D18-1319"
}

@InProceedings{SparseBayesianGatedRNN,
  author    = "Lobacheva, Ekaterina and Chirkova, Nadezhda and Vetrov, Dmitry",
  title     = "Bayesian Sparsification of Gated Recurrent Neural Networks",
  booktitle = "Proceedings of the Workshop on Compact Deep Neural Networks with industrial applications, NeurIPS 2018",
  year      = "2018"
}
