Uncertainty Drives Social Bias in Quantized Large Language Models

This repository contains code to replicate the experiments performed in the paper Uncertainty Drives Social Bias in Quantized Large Language Models by Stanley Hua, Sanae Lotfi and Irene Chen.

We perform a large-scale study of social bias in quantized large language models. On 13 curated datasets, we evaluate 5 quantization methods (RTN/AWQ/GPTQ/SmoothQuant) on 10 open-source models (LLaMA/Qwen/Mistral) ranging from 0.5B to 14B parameters. We find that uncertain responses are the most likely to change after quantization, that social groups experience these changes asymmetrically, and that response flipping can be widespread even when dataset-aggregate metrics do not change. Unsurprisingly, 8-bit quantization leads to smaller bias changes than 4-bit quantization, and quantization disrupts prior model rankings on bias. On the other hand, we find no evidence that larger (14B) models are any more robust to this phenomenon than smaller (0.5B) models.
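To make the "flipping without aggregate change" point concrete, here is a minimal toy sketch. The labels are made up for illustration (1 = biased response, 0 = unbiased response), and the function names `biased_rate` and `flip_rate` are hypothetical; they are not the paper's metric definitions or this repository's code.

```python
def biased_rate(responses):
    """Dataset-aggregate metric: fraction of responses labeled biased (1)."""
    return sum(responses) / len(responses)


def flip_rate(before, after):
    """Fraction of questions whose response label changed after quantization."""
    if len(before) != len(after):
        raise ValueError("before/after must cover the same questions")
    flips = sum(b != a for b, a in zip(before, after))
    return flips / len(before)


# Toy labels for four questions (illustrative only, not real results)
before = [1, 1, 0, 0]   # full-precision model
after = [0, 1, 1, 0]    # quantized model

aggregate_before = biased_rate(before)
aggregate_after = biased_rate(after)
flipped = flip_rate(before, after)

print(f"aggregate before={aggregate_before}, after={aggregate_after}, flip rate={flipped}")
```

In this toy case the aggregate metric is identical before and after, yet half of the individual responses changed, which is why a per-response comparison is needed alongside the aggregate.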

We hope our work challenges the research community to think carefully about deploying quantized LLMs and to consider the varied impacts these subtle choices have on different members of society. Furthermore, we hope that our work can serve as an example that inspires improved standards and rigor in benchmarking efforts for measuring social bias in LLMs.

💴 Datasets

Style	Capability	Dataset	Questions
Closed	1	CEB-Recognition	1,600
Closed	1	CEB-Jigsaw	1,500
Closed	2	CEB-Adult	1,000
Closed	2	CEB-Credit	1,000
Closed	3	BiasLens-Choices	10,917
Closed	3	SocialStigmaQA	10,360
Closed	3	BBQ	29,238
Closed	3	IAT	13,858
Closed	3	StereoSet-Intersentence	2,123
Open	3	BiasLens-GenWhy	10,972
Open	3	CEB-Continuation	800
Open	3	CEB-Conversation	800
Open	3	FMT10K-IM	1,655
		Total	85,823

In closed-ended datasets, a response is selected from a fixed set of options. We score each choice by the geometric mean of its token probabilities and use these scores to select a response. In open-ended datasets, a text response is generated with greedy decoding and later evaluated asynchronously using LLaMA Guard 8B.
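The closed-ended scoring can be sketched as follows. This is a minimal illustration, not the repository's implementation: it assumes per-token log-probabilities for each choice are already available, and that the highest-scoring choice is selected. The names `geometric_mean_prob` and `select_choice` are hypothetical.

```python
import math


def geometric_mean_prob(token_logprobs):
    """Geometric mean of token probabilities, computed as exp(mean log-prob)."""
    if not token_logprobs:
        raise ValueError("each choice needs at least one token")
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def select_choice(choice_logprobs):
    """Return (index of the highest-scoring choice, list of per-choice scores)."""
    scores = [geometric_mean_prob(lp) for lp in choice_logprobs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy log-probabilities (made up for illustration, not model output)
choice_logprobs = [
    [math.log(0.9), math.log(0.1)],                  # geometric mean = sqrt(0.09) = 0.3
    [math.log(0.5), math.log(0.5), math.log(0.5)],   # geometric mean = 0.5
]
best, scores = select_choice(choice_logprobs)
print(best, scores)
```

Averaging in log space normalizes for length, so a choice is not penalized simply for spanning more tokens than its alternatives.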

🔧 Quickstart

Package Installation via Pixi

# Get repository
git clone https://github.com/stan-hua/PostTrainingBiasBenchmark
cd PostTrainingBiasBenchmark

# Install pixi (a faster package manager alternative to conda)
curl -fsSL https://pixi.sh/install.sh | sh

# Install dependencies
# NOTE: -e specifies the environment
# NOTE: The following environments are available
#       `vllm`: for performing inference with vLLM
#       `analysis`: for performing analysis and generating plots
#       `quantizer`: for quantizing models locally
#       `simpo`: for performing SimPO experiment
pixi shell -e vllm

(Optional) Registering your OpenAI key

NOTE: Most of our code is designed to run models locally. One exception is the use of OpenAI models to extract social groups from datasets.

echo 'export OPENAI_KEY="[ENTER HERE]"' >> ~/.bashrc
source ~/.bashrc

🏃 How to Run

Generate LLM responses

# Activate environment
pixi shell -e vllm

# Option 1. In shell
MODEL_NICKNAME="llama3.1-8b-instruct"    # shorthand defined in config.py / MODEL_INFO
python -m scripts.benchmark generate ${MODEL_NICKNAME};

# Option 2. In a SLURM batch job
# NOTE: Modify sbatch script to run specified models
sbatch slurm/generate_responses.sh

Use LLaMA-Guard to evaluate safety of open-ended responses

# Option 1. In shell
MODEL_NICKNAME="llama3.1-8b-instruct"    # shorthand defined in config.py / MODEL_INFO
python -m scripts.benchmark bias_evaluate ${MODEL_NICKNAME};

# Option 2. In a SLURM batch job
# NOTE: Modify sbatch script to evaluate specified models
sbatch slurm/evaluate_responses.sh

Reproduce paper figures and tables

# Option 1. In a SLURM batch job
sbatch slurm/create_paper_figures.sh

Adding models

To add a new model, please update MODEL_INFO in config.py.

Example: "Meta-Llama-3.1-8B-Instruct-GPTQ-4bit"

1. In `MODEL_INFO['model_group']`, append "llama3.1-8b-instruct"
2. In `MODEL_INFO['model_path_to_name']`, provide a mapping from the HuggingFace / local path to a model shorthand.
   NOTE: The shorthand should follow the standard: `[original_model]-[q_method]-[bit_configuration]`.
   e.g., {"Meta-Llama-3.1-8B-Instruct-GPTQ-4bit": "llama3.1-8b-instruct-gptq-w4a16"},
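The two steps above can be sketched in Python. This is a hedged illustration only: the `MODEL_INFO` dictionary below is a trimmed-down, hypothetical stand-in (the real one in config.py may hold more keys or a different layout), and `make_nickname` / `register_model` are helper names invented for this example.

```python
# Hypothetical stand-in for MODEL_INFO in config.py.
# Assumption: 'model_group' is a list and 'model_path_to_name' is a dict.
MODEL_INFO = {
    "model_group": [],
    "model_path_to_name": {},
}


def make_nickname(original_model, q_method, bit_configuration):
    """Build a shorthand following [original_model]-[q_method]-[bit_configuration]."""
    return f"{original_model}-{q_method}-{bit_configuration}"


def register_model(model_info, original_model, model_path, q_method, bit_configuration):
    """Apply both steps: add the base model group, then map path to shorthand."""
    # Step 1: append the base model shorthand once
    if original_model not in model_info["model_group"]:
        model_info["model_group"].append(original_model)
    # Step 2: map the HuggingFace / local path to the quantized model shorthand
    nickname = make_nickname(original_model, q_method, bit_configuration)
    model_info["model_path_to_name"][model_path] = nickname
    return nickname


nickname = register_model(
    MODEL_INFO,
    original_model="llama3.1-8b-instruct",
    model_path="Meta-Llama-3.1-8B-Instruct-GPTQ-4bit",
    q_method="gptq",
    bit_configuration="w4a16",
)
print(nickname)
```

In the repository itself you would edit config.py directly; the helper only shows how the naming standard composes the shorthand from its three parts.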

👏 Acknowledgements

Special thanks to the authors of the CEB Benchmark, whose code base served as the starting point for this repository.

Citation

If you find our work useful, please consider citing our paper!

@article{YourName,
  title={Your Title},
  author={Your team},
  journal={Location},
  year={Year}
}

🌲 About the repository

To guide contributors, we provide 1-line explanations describing important folders in the repository.

./
├── data/                   # Data directory
│   ├── closed_datasets/         # Closed-ended datasets
│   ├── open_datasets/           # Open-ended datasets
│   └── save_data/              # Saved artifacts from inference
│       ├── llm_generations/        # Contains responses generated by each model
│       ├── analysis/               # Contains analysis related data
│       └── models/                 # Contains local models
├── scripts/                # Contains scripts to run
├── slurm/                  # Contains scripts for running on SLURM server
├── src/
│   ├── bin/                # Contains command-line script for renaming models
│   └── utils/              # Contains code for LLM inference and evaluation
└── config.py               # Contains global constants
