patrickpynadath1/dab

Code base for "Controlled LLM Decoding via Discrete Auto-regressive Biasing" by Patrick Pynadath and Ruqi Zhang.

This paper studies the application of gradient-based discrete sampling to the classic problem of decoding-time controlled text generation. We show that gradient-based discrete sampling allows for quick and stable sampling of fluent, constraint-satisfying sequences. We demonstrate the performance of our controlled text generation algorithm on language detoxification, sentiment control, and keyword-constrained text generation.

[Figure: DAB overview]

Installation and Setup

Our code base is built on the BOLT code base from "BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases". After cloning the repository, run the following commands:

cd ./transformers 
pip install -e .
cd ..

This points the transformers package at the modified version in the repository, which contains the decoding algorithms for both DAB and BOLT. If you are curious, the decoding algorithms can be found in this Python file. We implement our algorithm in a similar manner to previous work to ensure fair comparisons.
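To check that the editable install took effect, you can confirm that `transformers` resolves to a path inside the cloned repository. The snippet below is a minimal sketch, not part of the repository: the helper names `package_origin` and `is_inside` are hypothetical, and it assumes you run it from the repository root.

```python
import importlib.util
from pathlib import Path


def package_origin(name: str):
    """Return the file a top-level package would be imported from, or None if unavailable."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).resolve()


def is_inside(path, root) -> bool:
    """Return True if `path` is `root` itself or lies somewhere under it."""
    path, root = Path(path).resolve(), Path(root).resolve()
    return root == path or root in path.parents


if __name__ == "__main__":
    repo_root = Path.cwd()  # assumption: run from the root of the cloned repository
    origin = package_origin("transformers")
    if origin is None:
        print("transformers is not installed")
    elif is_inside(origin, repo_root):
        print(f"OK: transformers resolves to the local copy at {origin}")
    else:
        print(f"Warning: transformers resolves to {origin}, outside {repo_root}")
```

If the warning is printed, another installation of transformers is probably shadowing the modified copy, and the DAB and BOLT decoding algorithms would not be available.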

Model Checkpoints

For comparison purposes, we use the same model checkpoints as BOLT for both sentiment control and language detoxification. The BOLT authors make these checkpoints available at this link.

Sentiment Experiment

To run the sentiment-control experiment, use the following command:

python main.py --exp sentiment dlp

Language Detoxification

python main.py --exp detoxify dlp

Keyword-constrained Text Generation

python main.py --exp keywords dlp
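To run the three experiments back to back, a small driver script can build the same commands shown above. This is a sketch and not part of the repository: `build_command` and `run_all` are hypothetical helpers, the trailing `dlp` argument is copied verbatim from the commands above, and the script assumes it is launched from the directory containing `main.py`.

```python
import subprocess
import sys

# Experiment names taken from the commands in this README.
EXPERIMENTS = ("sentiment", "detoxify", "keywords")


def build_command(exp: str, python: str = sys.executable):
    """Build the argument list for one experiment, mirroring the README commands."""
    if exp not in EXPERIMENTS:
        raise ValueError(f"unknown experiment: {exp!r}")
    return [python, "main.py", "--exp", exp, "dlp"]


def run_all(dry_run: bool = True):
    """Run each experiment in sequence; with dry_run=True, only print the commands."""
    for exp in EXPERIMENTS:
        cmd = build_command(exp)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises on the first failing run


if __name__ == "__main__":
    run_all(dry_run=True)  # switch to dry_run=False to actually launch the runs
```

The dry-run default prints the commands without executing them, which makes it easy to confirm what will be launched before committing to long runs.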
