This is an official implementation of our paper, *Proving membership in LLM pretraining data via data watermarks* (ACL Findings 2024). If you use this repository, please cite:
@inproceedings{wei2024provingmembershipllmpretraining,
    title = "Proving membership in {LLM} pretraining data via data watermarks",
    author = "Wei, Johnny and
      Wang, Ryan and
      Jia, Robin",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2024",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand and virtual meeting",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-acl.788",
    doi = "10.18653/v1/2024.findings-acl.788",
    pages = "13306--13320",
}

Note that the `gpt-neox` folder in this repository is a near-identical clone of the GPT-NeoX repository by EleutherAI, found here: https://github.com/EleutherAI/gpt-neox.
- Ensure that Conda is installed
- Run the following commands, which create a new Conda environment and install PyTorch and the CUDA dependencies:
conda create -n hubble python=3.8
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
conda install cudatoolkit=11.7 -c conda-forge # this is probably not necessary anymore
conda install -c conda-forge cudatoolkit-dev

`cd` into the `gpt-neox` directory and run the following commands, which install the GPT-NeoX dependencies:
pip install -r requirements/requirements.txt
pip install -r requirements/requirements-wandb.txt # optional, if logging using WandB
# optional: if you want to use FlashAttention
# for the next line, ssh into any machine with cuda version > 11.6
export CUDA_HOME=<path_to_your_conda>/envs/hubble # replace this with your conda environment path
pip install -r ./requirements/requirements-flashattention.txt
pip install triton

To watermark pre-training data, you first need a base pre-training dataset (e.g., a shard from the Pile) stored in jsonl format. Ensure that this file is stored in the `data` directory.
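The scripts presumably expect a Pile-style schema, i.e., one JSON object per line with the document text under a `"text"` key. As a quick, hypothetical sanity check (the exact fields in your shard may differ), you can inspect the first document:

```python
# Hypothetical sanity check: inspect the first document of the shard.
# Assumes a Pile-style schema with the document text under a "text" key.
import json

with open("data/pile1e8_orig.jsonl") as f:
    first_doc = json.loads(next(f))

print(first_doc.keys())          # e.g. dict_keys(['text', 'meta']) for Pile shards
print(first_doc["text"][:200])   # preview the document text
```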
`cd` to the root directory. The following command will insert a watermark of 10 characters into 32 documents of the base pre-training dataset `data/pile1e8_orig.jsonl`:
bash perturb_data.sh

To run the Unicode perturbations instead, set `exp_name` to `"unicode_properties"`.
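For intuition, the random-sequence watermark can be pictured as appending the same random string to a subset of documents. The sketch below is a simplified illustration of this idea, not the actual logic in `perturb_data.sh`; the output filename and the sampling choice are made up:

```python
# Simplified illustration of random-sequence watermarking -- not the actual
# implementation in perturb_data.sh. Output path and sampling are hypothetical.
import json
import random
import string

random.seed(0)
watermark = "".join(random.choices(string.ascii_letters + string.digits, k=10))  # 10-character watermark

with open("data/pile1e8_orig.jsonl") as f:
    docs = [json.loads(line) for line in f]

# Watermark 32 randomly chosen documents by appending the same random string.
for doc in random.sample(docs, k=32):
    doc["text"] = doc["text"] + " " + watermark

with open("data/pile1e8_watermarked.jsonl", "w") as f:  # hypothetical output path
    for doc in docs:
        f.write(json.dumps(doc) + "\n")
```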
`cd` to the root directory. The following command will tokenize the watermarked data:

bash tokenize_data.sh
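As a rough illustration only (the script itself runs GPT-NeoX's preprocessing pipeline, which produces binary training files), tokenization maps each watermarked document to a fixed sequence of token ids. The tokenizer checkpoint and example string below are assumptions:

```python
# Rough illustration only -- tokenize_data.sh runs GPT-NeoX's own preprocessing.
# The tokenizer checkpoint and the example document are assumptions.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")  # NeoX-style tokenizer used by Pythia

doc = "Some document text ... Xq7Lz0bNp2"  # hypothetical watermarked document
print(tokenizer(doc).input_ids)
```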
`cd` to the root directory. The following command will pre-train the model using the watermarked data:

bash pretrain.sh

By default, the code will run a 70M-parameter model with Pythia configs for 1 training step, for demo purposes. The following is a list of important settings to change (an illustrative config sketch follows the lists below):
In the model configs:
- global_num_gpus: the number of GPUs to use
- train_batch_size: the total batch size across all GPUs
- train_micro_batch_size_per_gpu: the batch size per GPU
- gradient_accumulation_steps: the number of steps to accumulate gradients over
- train_iters: the number of steps to train for
- seq_len: the sequence length of the model
In the setup configs:
- data_path: the path to the data (tokenized already)
- save: the path to save the model
- include: allows you to specify which GPUs to use, e.g. by setting it to the string "localhost:0,1,2,3"
- master_port: the port to use for training. Different runs on the same machine should use different ports
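For concreteness, a set of overrides touching the fields above might look roughly like the following. All values and paths are placeholders; with pure data parallelism, `train_batch_size` should equal `train_micro_batch_size_per_gpu` × `gradient_accumulation_steps` × `global_num_gpus`:

```yaml
# Illustrative values and placeholder paths only -- adapt to your hardware and data.
# Model configs:
"global_num_gpus": 4
"train_batch_size": 1024            # 32 * 8 * 4
"train_micro_batch_size_per_gpu": 32
"gradient_accumulation_steps": 8
"train_iters": 1000
"seq_len": 2048

# Setup configs:
"data_path": "data/pile1e8_watermarked_tokenized"   # placeholder for the tokenized data prefix
"save": "checkpoints/pythia70m_watermarked"          # placeholder checkpoint directory
"include": "localhost:0,1,2,3"
"master_port": 29501
```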
`cd` to the root directory. The following command will convert the model to the Hugging Face format:
bash convert_neox_to_hf.sh
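After conversion, the checkpoint should load with the standard Hugging Face `transformers` API; the path below is a placeholder for wherever `convert_neox_to_hf.sh` writes the converted model:

```python
# Placeholder path -- point this at the directory written by convert_neox_to_hf.sh.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("checkpoints/hf_model")
tokenizer = AutoTokenizer.from_pretrained("checkpoints/hf_model")

out = model.generate(**tokenizer("Hello", return_tensors="pt"), max_new_tokens=10)
print(tokenizer.decode(out[0]))
```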
`cd` to the root directory. The following command will run inference with the HF-converted model:

bash score_model.sh

For Unicode experiments, change `score_model.py` to call `calculate_scores_unicode_properties` instead of `calculate_scores_unstealthy`. Note that in order to run hypothesis testing for Unicode watermarks, you must have prepared the watermarked data with the Unicode perturbation (by setting `exp_name` to `"unicode_properties"` when running `bash perturb_data.sh`).
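For intuition, the hypothesis test behind the scoring can be sketched as follows. This is a simplified illustration rather than the exact logic in `score_model.py`; the model path, the watermark string, and the number of null samples are all placeholders:

```python
# Simplified sketch of the watermark hypothesis test -- not the exact logic in
# score_model.py. Compares the model's loss on the inserted watermark against a
# null distribution of losses on random watermarks that were never inserted.
import random
import string

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "checkpoints/hf_model"  # placeholder for the converted HF model
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH).eval()

def avg_token_loss(text: str) -> float:
    """Average cross-entropy the model assigns to `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        return model(ids, labels=ids).loss.item()

def random_watermark(length: int = 10) -> str:
    return "".join(random.choices(string.ascii_letters + string.digits, k=length))

inserted_watermark = "Xq7Lz0bNp2"  # placeholder: the 10-character watermark that was inserted
null_losses = torch.tensor([avg_token_loss(random_watermark()) for _ in range(200)])
observed = avg_token_loss(inserted_watermark)

# A watermark the model trained on should have unusually low loss relative to
# the null distribution, i.e. a large negative z-score.
z = ((observed - null_losses.mean()) / null_losses.std()).item()
print(f"observed loss: {observed:.4f}, z-score: {z:.2f}")
```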