Yun-Yen Chuang · Hung-Min Hsu · Kevin Lin · Chen-Sheng Gu · Ling-Zhen Li · Ray-I Chang · Hung-yi Lee
[Slide]
[Poster]
This project is based on our paper accepted at the 38th Conference on Neural Information Processing Systems (NeurIPS 2024), titled "Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration". You can find the paper here.
Comparison between an S2S-Diffusion model (i.e., DiffuSeq) and the proposed Meta-DiffuB. Shades of color represent different amounts of imposed noise.
Unlike prior works, which impose a fixed noise schedule, we introduce a novel scheduler-exploiter framework, Meta-DiffuB, which achieves trainable noise scheduling inspired by Meta Exploration. Our scheduler model schedules contextualized noise, enhancing the training and generation of the S2S-Diffusion model and yielding state-of-the-art (SOTA) performance compared with previous S2S-Diffusion models, as detailed in Section 4 of the paper.
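To give a rough feel for the idea, here is a toy sketch of fixed versus contextualized noise scheduling. This is our own illustration, not the paper's actual scheduler model: the linear beta schedule and the per-sentence `difficulty` scaling are assumptions made purely for demonstration.

```python
# Toy sketch: fixed vs. contextualized noise scheduling.
# The schedule shape and the "difficulty" scaling are illustrative
# assumptions, NOT the scheduler learned by Meta-DiffuB.

def fixed_schedule(num_steps):
    """A standard linear beta schedule: identical for every sentence."""
    beta_min, beta_max = 1e-4, 0.02
    return [beta_min + (beta_max - beta_min) * t / (num_steps - 1)
            for t in range(num_steps)]

def contextualized_schedule(num_steps, difficulty):
    """Hypothetical contextualized scheduler: modulates the noise by a
    per-sentence 'difficulty' score in [0, 1] (imagined as being
    predicted from the source context)."""
    base = fixed_schedule(num_steps)
    return [b * (0.5 + difficulty) for b in base]

# An "easy" sentence receives less noise than a "hard" one at every step.
easy = contextualized_schedule(10, difficulty=0.1)
hard = contextualized_schedule(10, difficulty=0.9)
assert all(e < h for e, h in zip(easy, hard))
```

The point of the sketch is only the contrast: a fixed schedule treats all sentences identically, while a trainable scheduler can adapt the amount of noise to each sentence's context.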
Our implementation is based on Python 3.8, PyTorch 1.11, and Fairseq 0.10.2. The following commands install the dependencies and this package in a Conda environment:

```shell
conda install pytorch==1.11.0 -c pytorch
pip install -e .
```
After confirming that fairseq is installed, replace the corresponding code in the installed environment with the code from the fairseq and fairseq_cli folders that we provide.
For the non-translation tasks, we follow the DiffuSeq dataset settings.
Prepare datasets and put them under the datasets folder.
Take datasets/WA/train.jsonl as an example. We use four datasets in our paper.
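Each dataset file stores one example per line in JSON Lines format. To the best of our knowledge, DiffuSeq-style files hold a source/target pair under `src` and `trg` keys; treat the exact key names as an assumption and check them against your downloaded files. A minimal loader sketch:

```python
import json
import os
import tempfile

def load_jsonl(path):
    """Read a DiffuSeq-style .jsonl file into a list of dicts.
    Assumes one JSON object per line with 'src' and 'trg' fields
    (key names assumed; verify against the actual dataset)."""
    examples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                examples.append(json.loads(line))
    return examples

# Tiny self-contained demo with a synthetic file:
path = os.path.join(tempfile.gettempdir(), "demo_train.jsonl")
with open(path, "w", encoding="utf-8") as f:
    f.write(json.dumps({"src": "what is diffusion ?",
                        "trg": "a noising process ."}) + "\n")

data = load_jsonl(path)
print(data[0]["src"])  # -> what is diffusion ?
```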
| Task | Dataset | Training Samples | Source | Used in Meta-DiffuB |
|---|---|---|---|---|
| Open-domain Dialogue | Commonsense Conversation | 3382k | CCM | download |
| Question Generation | Quasar-T | 117k | OpenQA | download |
| Text Simplification | Wiki-Auto | 677k | Wiki-auto | download |
| Paraphrase | Quora Question Pairs | 144k | Kaggle | download |
For the translation task, we follow the instructions of Fairseq to preprocess the translation datasets. We then apply knowledge distillation using Transformer models trained on the same datasets. To binarize the distilled and tokenized datasets, run the following command (taking the IWSLT14 De-En dataset as an example):
```shell
fairseq-preprocess \
    --source-lang de --target-lang en \
    --trainpref {PATH-TO-YOUR-DATASET}/train \
    --validpref {PATH-TO-YOUR-DATASET}/valid \
    --testpref {PATH-TO-YOUR-DATASET}/test \
    --destdir data-bin/iwslt14_de_en_distill \
    --joined-dictionary \
    --workers 20
```
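After binarization, `fairseq-preprocess` writes per-language dictionaries and indexed binary shards into `--destdir`. As a small sanity check, here is a helper of our own (not part of this repo); the file-name pattern is assumed from Fairseq's usual `{split}.{src}-{tgt}.{lang}.bin/.idx` convention, so verify it against your Fairseq version.

```python
import os

def expected_databin_files(src, tgt, splits=("train", "valid", "test")):
    """Files fairseq-preprocess typically emits into --destdir.
    The naming convention is an assumption; check your Fairseq version."""
    files = [f"dict.{src}.txt", f"dict.{tgt}.txt"]
    for split in splits:
        for lang in (src, tgt):
            files.append(f"{split}.{src}-{tgt}.{lang}.bin")
            files.append(f"{split}.{src}-{tgt}.{lang}.idx")
    return files

def missing_files(destdir, src, tgt):
    """Return the expected files that are absent from destdir."""
    return [f for f in expected_databin_files(src, tgt)
            if not os.path.exists(os.path.join(destdir, f))]

# e.g. missing_files("data-bin/iwslt14_de_en_distill", "de", "en")
# should be an empty list after a successful preprocessing run.
```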
All training, inference, and evaluation scripts are located in the {model_type}/scripts directory. For example, to train Meta-DiffuB-Difformer on the QQP dataset, simply run:

```shell
bash scripts/qqp/train.sh
```

To run inference and evaluate Meta-DiffuB-Difformer on the QQP dataset, run:

```shell
bash scripts/qqp/evaluate.sh
```

For Meta-DiffuB-DiffuSeq, a different approach is required: instead of bash scripts, Jupyter Notebook files are used for training, inference, and evaluation. Specifically:

- To train Meta-DiffuB-DiffuSeq, execute scripts/Train.ipynb in Jupyter Notebook.
- To run inference, execute scripts/Inference.ipynb in Jupyter Notebook.
- To evaluate the model, execute scripts/Evaluate.ipynb in Jupyter Notebook.
You can modify the parameters in the .ipynb files (such as the dataset) to fit your specific usage scenario.
We also provide the code of the other S2S-Diffusion models that we ran for our experiments.