
Chain-of-Code Collapse: Reasoning Failures in LLMs via Adversarial Prompting in Code Generation

📄 Paper: https://arxiv.org/pdf/2506.06971

📎 Dataset Download Link


Figure: Overview of BREAK-THE-CHAIN

🧠 Overview

BREAK-THE-CHAIN introduces a new framework to test the reasoning robustness of Large Language Models (LLMs) by applying adversarial, yet meaning-preserving, modifications to code generation tasks. Despite strong performance on clean prompts, we find that LLMs are highly sensitive to superficial prompt changes, revealing brittle reasoning under surface-level linguistic shifts.

We generate 700 perturbed prompts from 100 LeetCode-style problems using 7 distinct transformation types, and evaluate 9 popular LLMs (Claude, Gemini, DeepSeek, Qwen, LLaMA) on their reasoning resilience.
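As a rough illustration of the perturbation idea, the sketch below composes a meta-prompt that asks a model to rewrite a clean problem under one of the seven transformation types. The one-line transformation descriptions and the `build_rewrite_request` helper are illustrative assumptions paraphrased from the transformation names, not the repository's actual generation pipeline:

```python
# Illustrative sketch only (not the repository's actual pipeline): compose a
# meta-prompt asking an LLM to rewrite a clean coding problem under one of
# the seven semantics-preserving perturbation types.

# Assumed one-line descriptions, paraphrased from the transformation names.
PERTURBATIONS = {
    "storytelling": "Retell the problem as a short narrative without changing its requirements.",
    "gamification": "Frame the problem as a game or quest while keeping inputs, outputs, and constraints identical.",
    "distracting_constraints": "Add irrelevant side conditions that do not affect the correct solution.",
    "domain_shift": "Move the problem into a different real-world domain while preserving the underlying task.",
    "example_perturbation": "Replace the worked examples with different but equally valid input/output pairs.",
    "negation_objective": "Restate the objective through a logically equivalent negated formulation.",
    "soft_negation": "Soften the objective's wording with negated phrasing that keeps the same meaning.",
}

def build_rewrite_request(clean_problem: str, perturbation: str) -> str:
    """Return a meta-prompt asking a rewriting model to apply one perturbation type."""
    instruction = PERTURBATIONS[perturbation]
    return (
        "Rewrite the following coding problem. "
        f"{instruction} Do not change what a correct solution must compute.\n\n"
        f"Problem:\n{clean_problem}"
    )

if __name__ == "__main__":
    clean = "Given a 2D integer array, return the final matrix sum score."
    print(build_rewrite_request(clean, "gamification"))
    # The rewritten problem produced by the rewriting model is what the
    # evaluated LLMs are then asked to solve.
```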


💡 Key Contributions

  • 🔧 Introduced 7 adversarial perturbation types that retain problem semantics while changing prompt structure:

    • Storytelling, Gamification, Distracting Constraints, Domain Shift, Example Perturbation, Negation Objective, Soft Negation
  • 📊 Evaluated 9 leading LLMs on 700 total instances (100 clean problems × 7 transformations) using the Pass@1 metric and difficulty stratification (see the Pass@1 sketch after this list).

  • 📉 Found severe reasoning failures:

    • Claude-3.7-Sonnet dropped by 54.3% under Domain Shift.
    • Claude-3.7-Sonnet dropped by 42.1% under Distracting Constraints.
  • 📈 Discovered accuracy gains in certain cases:

    • Qwen2.5-Coder improved by 24.5% with Example Perturbation.
    • LLaMA-3.1-Instruct improved by 35.3% with Storytelling.
    • Gemini-2.0-Flash gained 12.0% under Example Perturbation.
  • 🧪 Released a perturbed benchmark dataset and evaluation scripts for robust testing of LLM reasoning behavior.
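For reference, Pass@1 here is the share of problems whose single generated solution passes all of that problem's test cases. A minimal sketch of the computation, assuming a simple per-problem result record (not the repository's actual schema):

```python
def pass_at_1(results: list[dict]) -> float:
    """Pass@1 with one generation per problem: the fraction of problems
    whose single sampled solution passes every test case.

    Each entry is assumed to look like {"problem_id": ..., "passed": bool}.
    """
    if not results:
        return 0.0
    return sum(r["passed"] for r in results) / len(results)

# Example: 9 of 10 single-attempt solutions pass -> Pass@1 = 90.0%
demo = [{"problem_id": i, "passed": i != 3} for i in range(10)]
print(f"Pass@1 = {pass_at_1(demo):.1%}")
```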


🔬 Perturbation Example: Gamification

| Clean Prompt | Gamified Prompt |
| --- | --- |
| "Given a 2D integer array, return the final matrix sum score" | "In the realm of Azura, brave adventurers collect treasure maps..." |

📁 See all transformed prompts in the downloaded dataset.


📊 Results Snapshot

| Model | Clean | Storytelling | Gamification | Distracting Constraints |
| --- | --- | --- | --- | --- |
| Gemini 2.5 Flash | 95.0% | 97.4% (+2.4%) | 96.9% (+1.9%) | 95.5% (+0.5%) |
| Claude 3.7 Sonnet | 90.0% | 63.4% (-26.6%) | 50.0% (-40.0%) | 47.9% (-42.1%) |
| LLaMA 3.1 Instruct | 19.0% | 44.7% (+25.7%) | 37.8% (+18.8%) | 37.6% (+18.6%) |

🧠 Natural-sounding rewrites such as storytelling can improve performance, while distracting constraints severely hurt reasoning ability. Claude-3.7-Sonnet is a notable exception: despite its strong clean accuracy, it drops under every perturbation.


📥 Run

Access all clean and modified prompts here:
📎 Dataset Download Link

1. Download the folders and place them into the corresponding directories:

  • modified_data → data_modified/
  • clean_data → data/

2. For Batch Runs:

  • For clean input testing:

```bash
python run_script_main.py
```

  • For perturbed input testing:

```bash
python run_script_main_perturbation.py
```

‼️ Make sure to enter your Anthropic and Gemini API keys where it says ENTER_KEY in main.py and main_perturbation.py.
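As a rough guide, those keys typically initialize the Anthropic and Gemini clients along the lines of the sketch below, assuming the standard `anthropic` and `google-generativeai` Python packages; the exact variable names and locations in `main.py` and `main_perturbation.py` may differ:

```python
# Illustrative only: typical client initialization with API keys.
# Replace ENTER_KEY with your own keys, or export them as environment
# variables as shown here.
import os

import anthropic
import google.generativeai as genai

# Assumed placeholder handling; check main.py / main_perturbation.py for the
# actual ENTER_KEY locations.
anthropic_client = anthropic.Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY", "ENTER_KEY")
)
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY", "ENTER_KEY"))
```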


📝 Note

We use the lcb_runner/ folder directly from the LiveCodeBench repository to ensure smooth and accurate evaluation.

📜 Citation

If you use this dataset or findings, please cite:

@misc{roh2025breakthechain,
  title        = {{BREAK-THE-CHAIN: Adversarial Prompting in Code Generation}},
  author       = {Jaechul Roh and Varun Gandhi and Shivani Anilkumar and Arin Garg},
  year         = {2025},
  howpublished = {\url{https://github.com/jrohsc/685_Project/tree/main/LiveCodeBench}},
  note         = {UMass Amherst CS685 Advanced NLP Project},
}
