JHU-CLSP/BloomScrub


BloomScrub: Certified Mitigation of Worst-Case LLM Copyright Infringement

This repository contains the implementation and experimental code for BloomScrub, a method for mitigating copyright infringement in large language model outputs through iterative quote detection and rewriting.

Paper: Certified Mitigation of Worst-Case LLM Copyright Infringement (EMNLP 2025)

ArXiv: arXiv:2504.16046

Authors: Jingyu Zhang, Jiacan Yu, Marc Marone, Benjamin Van Durme, Daniel Khashabi


Overview

BloomScrub addresses worst-case copyright risks in LLM generations by:

  1. Detecting verbatim quotes from copyrighted sources using Bloom filters (via Data Portraits/QUIP)
  2. Dynamically rewriting detected quotes using LLM-based paraphrasing
  3. Iteratively repeating this process until outputs fall within safe risk thresholds
  4. Abstaining from answering when rewriting cannot reduce risk sufficiently

This repository extends the CoTaEval benchmark for evaluating copyright takedown methods.
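
Step 1 of this pipeline can be pictured with a toy Bloom filter over word n-grams. This is a from-scratch illustration, not the Data Portraits/QUIP implementation: the hash construction, filter size, and n-gram width below are arbitrary choices made for the example.

```python
import hashlib

class TinyBloomFilter:
    """Minimal Bloom filter: k hash functions over a fixed-size bit array."""
    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive k positions by salting a cryptographic hash (illustrative choice).
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

def ngrams(text, n=6):
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

# Index a "copyrighted" source, then flag which n-grams of a generation hit the filter.
source = "the quick brown fox jumps over the lazy dog near the river bank"
bf = TinyBloomFilter()
for g in ngrams(source):
    bf.add(g)

generation = "witnesses said the quick brown fox jumps over the lazy dog at dusk"
flagged = [g for g in ngrams(generation) if g in bf]  # verbatim 6-gram overlaps
```

Bloom filter hits are one-sided: a miss guarantees the n-gram is absent from the indexed corpus, while a hit is correct up to a small false-positive rate, which is what makes the quote detection cheap at scale.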


What's New in This Repo

This repository builds on CoTaEval by adding:

1. BloomScrub Rewriting Methods

  • Multiple rewriting strategies for copyright mitigation
  • Iterative rewriting with configurable rounds
  • Dynamic rewriting with risk thresholds
  • Abstention mechanism for high-risk cases

2. New Datasets

  • NewsSpan: NYT articles dataset for copyright evaluation
  • Gigaword/NYT: Additional news article datasets
  • Processing scripts and evaluation data

3. Enhanced Evaluation

  • Question generation for utility evaluation
  • Longform evaluation metrics (relevance, faithfulness, hallucination)
  • Integration with external LLM judges (Llama 3.3, GPT-4, etc.)

4. CAD-FAISS

  • Context-aware decoding with FAISS-based retrieval
  • Efficient similarity search for copyright filtering
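
The retrieval half of CAD-FAISS can be sketched without FAISS itself: normalized embeddings plus inner-product search. Here random vectors stand in for the MiniLM-L6-v2 sentence embeddings the index is presumably built from; this only illustrates what the index computes, not how src/build_faiss.py builds it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for sentence embeddings of the source corpus (384-d, as for MiniLM-L6-v2).
corpus = rng.normal(size=(100, 384)).astype("float32")
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)  # normalize: inner product = cosine

# A query close to document 7 (document embedding plus a little noise).
query = corpus[7] + 0.01 * rng.normal(size=384).astype("float32")
query /= np.linalg.norm(query)

scores = corpus @ query            # brute-force inner-product search (what IndexFlatIP does)
top_k = np.argsort(-scores)[:5]    # indices of the 5 nearest source documents
```

The retrieved neighbors are then the source passages the decoding step is conditioned against.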

5. Utilities

  • QUIP API integration for quote detection
  • Infinigram API for frequency analysis
  • PTB tokenizer utilities
  • Helper functions for data processing

Installation

1. Create Conda Environment

conda env create -f environment.yml
conda activate cotaeval

2. Install Modified Transformers for R-CAD

cd cad/transformers_cad
pip install -e .
cd ../..

3. Install Additional Dependencies

pip install sentence-transformers faiss-cpu

4. Set Up QUIP Server (for BloomScrub)

BloomScrub requires a running QUIP server for quote detection. See data-portraits/memfree_usage.md for setup instructions.


Quick Start: Running BloomScrub

Basic Infringement Evaluation

python main.py \
  --model_name llama3.1-8b-newsspan-fullft-ep2 \
  --num_test 1000 \
  --context_len 200 \
  --completion_len 200 \
  --datatype newsspan \
  --eval_infringement \
  --no_context \
  --intervention quip-rewrite-mix \
  -r 5 \
  -t 50 \
  --rewrite_model llama3.1-8b-inst \
  --dynamic_rewrite

Key Parameters

  • --intervention: Rewriting method (see below)
  • -r, --rewrite_times: Maximum number of rewriting iterations (default: 5)
  • -t, --abstain_threshold: Character threshold for abstention (default: 50)
  • --dynamic_rewrite: Enable dynamic rewriting (stop iterating once the longest detected quote falls below the abstention threshold)
  • --rewrite_model: Model to use for rewriting (default: same as base model)
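
How these parameters interact can be sketched as a detect-rewrite-abstain loop. The quote detector and rewriter below are toy stubs standing in for the QUIP lookup and the LLM rewriter; only the control flow is meant to mirror the real implementation.

```python
def bloomscrub(text, detect_quotes, rewrite, max_rounds=5, abstain_threshold=50,
               dynamic=True):
    """Sketch of the loop controlled by -r (max_rounds), -t (abstain_threshold),
    and --dynamic_rewrite (dynamic)."""
    for _ in range(max_rounds):
        longest = max((len(q) for q in detect_quotes(text)), default=0)
        if dynamic and longest <= abstain_threshold:
            return text                      # below the risk threshold: stop early
        text = rewrite(text, detect_quotes(text))
    longest = max((len(q) for q in detect_quotes(text)), default=0)
    return None if longest > abstain_threshold else text  # None = abstain

# Toy stubs: each rewrite halves the longest detected "quote".
state = {"quote_len": 160}
detect = lambda t: ["x" * state["quote_len"]]
def halve(t, quotes):
    state["quote_len"] //= 2
    return t + "."

result = bloomscrub("draft", detect, halve)                         # converges, returned
stuck = bloomscrub("draft", lambda t: ["x" * 200], lambda t, q: t)  # never improves: None
```

With --dynamic_rewrite off, all max_rounds rewrites run regardless; abstention is checked only at the end.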

BloomScrub Rewriting Methods

BloomScrub supports multiple rewriting strategies:

Core Methods

  1. quip-rewrite-mix (Recommended)

    • First iteration: General paraphrasing
    • Subsequent iterations: Targeted rewriting of QUIP-detected quotes
    • Balances fluency and copyright mitigation
  2. quip-rewrite-para

    • Paraphrases text with special focus on quoted segments
    • Uses natural language instructions to highlight quotes
  3. quip-rewrite

    • Directly instructs model to rewrite verbatim quotes
    • Provides explicit quoted segments to remove
  4. rewrite-paraphrase

    • General paraphrasing without quote detection
    • Baseline method for comparison

Advanced Options

  • quip-rewrite-paramulti: Handles multiple quoted segments with highlighting
  • quip-rewrite-multiple: Provides numbered list of quoted segments
  • rewrite: System prompt-based rewriting
  • rewrite-no-minor: Emphasizes substantial changes
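
As an illustration of how the quote-aware variants feed detections to the rewriter, the following assembles an instruction with the quoted segments as a numbered list, in the spirit of quip-rewrite-multiple. The wording is hypothetical, not the repository's actual prompt template.

```python
def build_rewrite_prompt(text, quotes):
    """Assemble a rewriting instruction listing the detected quotes
    (illustrative wording only)."""
    listing = "\n".join(f'{i}. "{q}"' for i, q in enumerate(quotes, 1))
    return (
        "Rewrite the following text so it no longer contains the verbatim "
        "quoted segments listed below, while preserving its meaning.\n\n"
        f"Text:\n{text}\n\nQuoted segments:\n{listing}\n\nRewritten text:"
    )

prompt = build_rewrite_prompt("The fox jumped over the fence.", ["fox jumped"])
```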

Configuration

# Run 5 rounds of rewriting, abstain if longest quote > 50 chars
python main.py \
  --intervention quip-rewrite-mix \
  -r 5 \
  -t 50 \
  --dynamic_rewrite \
  ...

# Use different model for rewriting
python main.py \
  --model_name llama3.1-8b-newsspan-fullft-ep2 \
  --rewrite_model llama3.1-8b-inst \
  --intervention quip-rewrite-mix \
  ...

Datasets

NewsSpan

Primary dataset for BloomScrub experiments:

# Full NewsSpan dataset (1000 examples)
python main.py --datatype newsspan ...

# Small debug set (100 examples)
python main.py --datatype newsspan100 ...

Other Datasets

# NewsQA (original CoTaEval)
python main.py --datatype newsqa ...

# Gigaword/NYT
python main.py --datatype nyt100 ...
python main.py --datatype nyt1000 ...

Evaluation

1. Infringement Metrics

Evaluate copyright risk using multiple metrics:

python main.py \
  --eval_infringement \
  --model_name llama3.1-8b-newsspan-fullft-ep2 \
  --datatype newsspan \
  --intervention quip-rewrite-mix \
  -r 5 -t 50 --dynamic_rewrite

Metrics computed:

  • ROUGE-1, ROUGE-L
  • Longest Common Subsequence (character and word)
  • Average Common Subsequence
  • Levenshtein Distance
  • Semantic Similarity
  • MinHash Similarity
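
Several of these metrics are short dynamic programs. For instance, a minimal character-level longest common subsequence looks like the following (the repository's process.py may differ in details such as normalization and tokenization):

```python
def lcs_length(a, b):
    """Character-level longest common subsequence, O(len(a) * len(b)) time
    with O(len(b)) memory via a rolling DP row."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, 1):
            # Diagonal + 1 on a match, else best of top/left neighbors.
            curr.append(prev[j - 1] + 1 if ca == cb else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[len(b)]
```

A long common subsequence between a generation and a source article is the kind of worst-case overlap the abstention threshold is meant to bound.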

2. Process Results

# Compute all metrics
python process.py \
  --input_dir res/output_res \
  --output_dir res/output_res_processed \
  --file_name <output_csv_file>

# Compute win rates
python winrate_compute.py \
  --data_type newsspan \
  --model_name llama3.1_8b \
  --scenario memorization

3. Utility Evaluation

Longform Quality

Evaluate relevance, faithfulness, and hallucination:

# Relevance
bash scripts/longform_eval/relevance.sh

# Faithfulness
bash scripts/longform_eval/faithfulness.sh

# Hallucination
bash scripts/longform_eval/hallucination.sh

These scripts use an external judge model (e.g., Llama 3.3-70B) to evaluate response quality.

Parse Judge Responses

python src/parse_judge_response.py \
  <dataset_path_1> <dataset_path_2> ... \
  --output_name relevance_eval

Experiment Scripts

Pre-configured scripts for reproducing paper experiments:

Infringement Evaluation on NewsSpan

# Baseline (no intervention)
bash scripts/inf_newsspan/none.sh

# System prompts
bash scripts/inf_newsspan/sysp-dbrx.sh

# MemFree (n-gram filtering)
bash scripts/inf_newsspan/memfree-n3.sh
bash scripts/inf_newsspan/memfree-n6.sh

# CAD-FAISS
bash scripts/inf_newsspan/cad-faiss-a1.sh
bash scripts/inf_newsspan/cad-faiss-a3.sh

# BloomScrub variants
bash scripts/inf_newsspan/quip-rewrite-mix_r5_t50_lm3rm.sh
bash scripts/inf_newsspan/quip-rewrite-para_r5_t50_lm3rm.sh
bash scripts/inf_newsspan/rewrite-paraphrase_r5_t50_lm3rm.sh

Question Generation

bash scripts/run_question_gen.sh

Key Files and Directories

Core Implementation

  • main.py: Main entry point for infringement evaluation
  • lib/decoding_intervention.py: Rewriting methods and CAD implementation
  • lib/eval.py: Evaluation functions for different datasets
  • process.py: Metric computation
  • winrate_compute.py: Win rate analysis

Utilities

  • src/quip_api.py: QUIP API integration for quote detection
  • src/build_faiss.py: FAISS index building for CAD-FAISS
  • src/parse_judge_response.py: Parse LLM judge evaluations
  • infinigram_api.py: Infinigram API for frequency analysis
  • jack_utils.py: General utility functions

Data Processing

  • src/process_newsspan.py: NewsSpan dataset processing
  • src/cond_perplexity.py: Perplexity-based filtering
  • src/mapping_normalized_to_original.py: Text normalization utilities

Evaluation Templates

  • prompt_templates/: JSON templates for evaluation prompts
    • relevance_eval.json: Relevance scoring
    • faithfulness_eval.json: Faithfulness scoring
    • hallucination_eval.json: Hallucination detection
    • question_gen.json: Question generation

Experiment Scripts

  • scripts/inf_newsspan/: Infringement evaluation scripts
  • scripts/longform_eval/: Utility evaluation scripts
  • scripts/create_bf_newsspan.sh: Bloom filter creation

Data

  • eval_data/newsspan/: NewsSpan evaluation data
  • eval_data/gigaword/: NYT/Gigaword data
  • eval_data/newsqa/: Original CoTaEval NewsQA data

Configuration

Model Paths

Edit modeltype2path in main.py to configure model paths:

modeltype2path = {
    'llama3.1-8b-inst': "meta-llama/Llama-3.1-8B-Instruct",
    'llama3.1-8b-newsspan-fullft-ep2': '/path/to/finetuned/model',
    ...
}

Environment Variables

Set in your shell or scripts:

export PROJ_DIR=/path/to/copyright-at-scale
export DATA_DIR=/path/to/data
export MODEL_DIR=/path/to/models

Common Issues and Solutions

QUIP Server Not Running

Error: QUIP API Request failed with exception: Connection refused

Solution: Start the QUIP server:

cd data-portraits
python easy_redis.py --start-from-dir /path/to/bloom_filter

FAISS Index Not Found

Error: FileNotFoundError: faiss_index/newsspan_minilm-l6-v2/faiss.index

Solution: Build the FAISS index:

python src/build_faiss.py \
  --dataset newsspan \
  --output_dir faiss_index/newsspan_minilm-l6-v2

Citation

If you use this code or the BloomScrub method, please cite:

@inproceedings{zhang2025bloomscrub,
  title={Certified Mitigation of Worst-Case LLM Copyright Infringement},
  author={Zhang, Jingyu and Yu, Jiacan and Marone, Marc and Van Durme, Benjamin and Khashabi, Daniel},
  booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
  year={2025},
  url={https://aclanthology.org/2025.emnlp-main.1784.pdf}
}

Acknowledgments

This repository is built on the CoTaEval benchmark. We thank the original authors for their excellent work and for making their code publicly available. See COTAEVAL_ORIGINAL.md for the original CoTaEval documentation.


Contact

For questions or issues, please open a GitHub issue or email the authors (contact details are in the paper).
