autoqrels is a tool for automatically inferring query relevance assessments (qrels).
Currently, it supports the one-shot labeling approach (1SL) presented in MacAvaney and Soldaini, One-Shot Labeling for Automatic Relevance Estimation, SIGIR 2023.
This package adheres to the ir-measures API, which means it can
be directly used by various tools, such as PyTerrier.
You can install autoqrels using pip:
pip install autoqrels

You can also work with the repository locally:
git clone https://github.com/seanmacavaney/autoqrels.git
cd autoqrels
python setup.py develop

The primary interface in autoqrels is autoqrels.Labeler. A Labeler exposes a
method, infer_qrels(run, qrels), which returns a new set of qrels that covers the
provided run:
- run is a Pandas DataFrame with the columns query_id (str), doc_id (str), and score (float)
- qrels is a Pandas DataFrame with the columns query_id (str), doc_id (str), and relevance (int)
- The return value is a Pandas DataFrame with the columns query_id (str), doc_id (str), and relevance (float)
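
The column layout above can be sketched with plain Pandas. The snippet below builds a small run and qrels and passes them through toy_infer_qrels, a hypothetical stand-in that is not part of autoqrels: it only illustrates the shape of the inputs and of the returned frame. Filling unjudged documents with a constant is an assumption made for illustration; a real Labeler estimates those values.

```python
import pandas as pd

# A run: one row per retrieved document, with its retrieval score.
run = pd.DataFrame({
    "query_id": ["q1", "q1", "q1"],
    "doc_id": ["d1", "d2", "d3"],
    "score": [12.5, 11.0, 9.75],
})

# Known qrels: integer relevance labels for the judged documents only.
qrels = pd.DataFrame({
    "query_id": ["q1"],
    "doc_id": ["d1"],
    "relevance": [1],
})

def toy_infer_qrels(run, qrels, fill=0.0):
    """Hypothetical stand-in for Labeler.infer_qrels (illustration only).

    Returns one row per (query_id, doc_id) pair in the run, with a float
    relevance column. Judged pairs keep their label; unjudged pairs get
    the constant `fill`, where a real labeler would infer a value.
    """
    merged = run[["query_id", "doc_id"]].merge(
        qrels, on=["query_id", "doc_id"], how="left"
    )
    merged["relevance"] = merged["relevance"].fillna(fill).astype(float)
    return merged

inferred = toy_infer_qrels(run, qrels)
print(inferred)
```

The point of the sketch is the contract: the result covers every document in the run, not only the ones that were judged.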
Labelers also expose several measure definitions compatible with ir_measures:
labeler.SDCG@k,
labeler.RBP(p=persistence),
labeler.P@k.
These measures calculate the corresponding effectiveness, supplementing the
provided qrels with the labeler's inferred qrels. See the ir-measures documentation
for more details.
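
To make the role of float-valued relevance concrete, here is a minimal sketch of how two of these measures could be computed by hand from inferred relevance values. It uses the textbook RBP formula and a "soft" precision that averages the relevance values in the top k. The function names are hypothetical, and treating P@k as a mean of float values is an assumption: this is not the code that autoqrels or ir-measures runs.

```python
def rbp(gains, p=0.8):
    """Rank-biased precision: (1 - p) * sum_i gain_i * p**(i - 1), ranks from 1."""
    return (1.0 - p) * sum(g * p ** i for i, g in enumerate(gains))

def soft_precision_at_k(gains, k):
    """Mean relevance of the top k results; missing ranks count as 0."""
    top = list(gains[:k])
    top += [0.0] * (k - len(top))
    return sum(top) / k

# Relevance values of a ranked list, best-scored document first.
# Values in [0, 1], as an inferred (float) relevance might be.
gains = [1.0, 0.5, 0.0]

rbp_score = rbp(gains, p=0.8)            # approximately 0.28
p_at_2 = soft_precision_at_k(gains, k=2)
print(rbp_score, p_at_2)
```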
We'll now explore the available Labeler implementations.
Reproduction: See repro instructions in repro/oneshot.
One-shot labelers work over a single known relevant document per query. An error is raised if multiple relevant documents are provided.
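
Because of this constraint, it can help to check the qrels before handing them to a one-shot labeler. The following is a minimal sketch using Pandas. The helper check_one_relevant_per_query and the min_rel threshold are hypothetical and not part of autoqrels; what counts as "relevant" (here, relevance of at least 1) is an assumption.

```python
import pandas as pd

def check_one_relevant_per_query(qrels, min_rel=1):
    """Return the sorted query_ids that have more than one relevant document.

    Hypothetical helper: a document is treated as relevant when its
    relevance is at least `min_rel` (an assumption for this sketch).
    """
    relevant = qrels[qrels["relevance"] >= min_rel]
    counts = relevant.groupby("query_id")["doc_id"].nunique()
    return sorted(counts[counts > 1].index)

qrels = pd.DataFrame({
    "query_id": ["q1", "q2", "q2", "q3"],
    "doc_id": ["d1", "d4", "d7", "d9"],
    "relevance": [1, 1, 2, 0],
})

offending = check_one_relevant_per_query(qrels)
print(offending)
```

Queries reported by such a check would need to be reduced to a single relevant document (or dropped) before use.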
Example:
import autoqrels
import ir_datasets
dataset = ir_datasets.load('msmarco-passage/trec-dl-2019')
duot5 = autoqrels.oneshot.DuoT5(dataset=dataset, cache_path='data/duot5.cache.json.gz')
# measures:
duot5.SDCG@10
duot5.P@10
duot5.RBP

If you use this work, please cite:
@inproceedings{autoqrels,
author = {MacAvaney, Sean and Soldaini, Luca},
title = {One-Shot Labeling for Automatic Relevance Estimation},
booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
year = {2023},
url = {https://arxiv.org/abs/2302.11266}
}