Skip to content

Peacer68/FMSeq

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

FMSeq

Official Codebase for Enable Fast Sampling for Seq2Seq Text Diffusion.

Performance Workflow

Performance: Picture on the left shows BLEU Scores of different models for the paraphrase task on the QQP dataset. Our FMSeq beats all the models when using a single sampling step and achieves comparable performance to DiffuSeq (2000 steps) with only 10 steps. Workflow: Picture on the right shows workflow of FMSeq. We utilize embedding to map the discrete token space into a continuous space. The forward process diffuses the target embedding along a linear path, and the model fits the velocity of the target part conditioned on clean source embedding and noisy target embedding.

Dataset

Prepare datasets and put them under the datasets folder. Take datasets/CommonsenseConversation/train.jsonl as an example. We use four datasets in our paper.

Task Datasets Source
Open-domain Dialogue CommonsenseConversation download
Question Generation Quasar-T download
Text Simplification Wiki-alignment download
Paraphrase QQP-Official download
Machine Translation iwslt14-de-en download

Training

For Non-MT (Machine Translation) tasks, run:

cd scripts
# qqp:
bash train_qqp.sh
# others: modify learning_steps, dataset, data_dir, notes

For MT tasks, run:

cd scripts
bash train_de2en.sh

The trained checkpoints are provided here: link of ckpt

Decoding

cd scripts
bash run_decode.sh
# core parameters: step and td

Evaluation

cd scripts
bash eval.sh
# you can eval single file or multiple file which are in the same folder (mbr in default)

Citation

Please add the citation if our paper or code helps you.

@inproceedings{liu2024enable,
  title={Enable Fast Sampling for Seq2Seq Text Diffusion},
  author={Liu, Pan and Tian, Xiaohua and Lin, Zhouhan},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},
  pages={8495--8505},
  year={2024}
}

Acknowledgements

This implementation is based on DiffuSeq

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors