This repo provides the code and data associated with the EMNLP 2025 paper "LLMs are Better Than You Think: Label-Guided In-Context Learning for Named Entity Recognition".
git clone https://github.com/bflashcp3f/deer.git
cd deer
Using Conda (recommended):
conda env create -f environment.yaml
conda activate deer
Using pip:
python -m venv deer_env
source deer_env/bin/activate
pip install -r requirements.txt
Set your OPENAI_API_KEY (or TOGETHER_API_KEY) as an environment variable.
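As a minimal sketch of the environment-variable step, the snippet below exports a key for the current shell session and then reports which of the two variables are visible to child processes, without printing their values. The key strings are placeholders (assumptions), not real credentials; replace them with your own.

```shell
# Placeholder value (assumption): substitute your real key.
export OPENAI_API_KEY="your-openai-key"
# Uncomment if you use the Together key instead:
# export TOGETHER_API_KEY="your-together-key"

# Report which keys are exported, without echoing their values.
for name in OPENAI_API_KEY TOGETHER_API_KEY; do
  if printenv "$name" > /dev/null; then
    echo "$name is set"
  else
    echo "$name is not set"
  fi
done
```

An `export` only lasts for the current session; adding the line to your shell profile is one common way to make it persistent.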
pip install -e .
Download the preprocessed datasets from Google Drive and extract them to the appropriate directories:
# Download deer_data.tar.gz from the Google Drive link above
# Then extract the data
tar -xzf deer_data.tar.gz
# The data should be organized as follows:
data/
├── ncbi/
├── conll03/
├── bc2gm/
├── ontonotes/
└── tweetner7/
Each dataset directory contains train, validation/dev, and test splits in JSONL format, along with pre-computed embeddings.
- NCBI: Biomedical entity recognition
- CoNLL-03: Popular NER benchmark (Person, Location, Organization, Misc)
- bc2gm: Gene mention detection in biomedical text
- OntoNotes: 18 entity types across multiple domains
- TweetNER7: Social media NER with 7 entity types
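To confirm that the archive was extracted into the layout shown above, a quick check along these lines may help. This is only a sketch: `check_deer_data` is a hypothetical helper, not part of the repository, and it assumes the five directory names listed in the tree under `data/`. It does not inspect the files inside each directory.

```shell
# Hypothetical helper (not part of the repo): report which expected
# dataset directories exist under a data root (default: ./data).
check_deer_data() {
  root="${1:-data}"
  missing=0
  for name in ncbi conll03 bc2gm ontonotes tweetner7; do
    if [ -d "$root/$name" ]; then
      echo "found:   $root/$name"
    else
      echo "missing: $root/$name"
      missing=$((missing + 1))
    fi
  done
  # Succeed only when every expected directory is present.
  [ "$missing" -eq 0 ]
}

check_deer_data data || echo "Some datasets are missing; re-check the extraction step."
```

Run it from the repository root after extracting the archive. A non-zero exit status from the function means at least one directory was not found.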
See the scripts/ directory for dataset-specific examples:
# Run in-context learning step of DEER on NCBI
bash scripts/ncbi/run_deer_icl.sh 8 openai text-embedding-3-small openai gpt-4o-mini-2024-07-18 64 1.0 1.0 0.01
# Run error reflection step of DEER on NCBI
bash scripts/ncbi/run_deer_er.sh 8 deer openai text-embedding-3-small openai gpt-4o-mini-2024-07-18 64 1.0 1.0 0.01 1 0.75 0.75 0.95
Detailed evaluation results can be found in the Jupyter notebooks under the notebooks/ directory.
If you use DEER in your research, please cite:
@inproceedings{bai-etal-2025-llms,
title = "{LLM}s are Better Than You Think: Label-Guided In-Context Learning for Named Entity Recognition",
author = "Bai, Fan and
Hassanzadeh, Hamid and
Saeedi, Ardavan and
Dredze, Mark",
booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
year = "2025",
publisher = "Association for Computational Linguistics",
}
This project is licensed under the MIT License - see the LICENSE file for details.
For questions and feedback, please open an issue on GitHub.