Added DocumentRetrievalEvaluator to Azure AI Evaluation to support evaluation of document search #39929

Merged
singankit merged 20 commits into main from abhahn/document_retrieval_evaluator on Apr 19, 2025

Conversation

@abhahn (Contributor) commented Mar 4, 2025

Description

This PR adds a new class, DocumentRetrievalEvaluator, which computes document retrieval evaluation metrics for a set of input documents, measured against a set of input ground-truth documents.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@github-actions github-actions bot added the Evaluation Issues related to the client library for Azure AI Evaluation label Mar 4, 2025
@azure-sdk (Collaborator)

API change check

API changes are not detected in this pull request.

@abhahn abhahn marked this pull request as ready for review March 18, 2025 20:41
Copilot AI review requested due to automatic review settings March 18, 2025 20:41
@abhahn abhahn requested a review from a team as a code owner March 18, 2025 20:41
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR introduces a new evaluator class, DocumentRetrievalEvaluator, to compute document retrieval metrics such as NDCG, XDCG, fidelity, and top-K relevance for document search queries. Key changes include:

  • Adding an __init__.py file that exposes DocumentRetrievalEvaluator.
  • Implementing DocumentRetrievalEvaluator with methods to compute metrics and perform input validation.
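To make the metric names in the overview concrete, here is a minimal sketch of how NDCG@k is commonly computed for one query. It is not the code from this PR: the function names (`dcg`, `ndcg_at_k`), the input shapes, the exponential gain `2^label - 1`, and the choice to score unlabeled documents as 0 are all assumptions made for illustration.

```python
import math
from typing import Dict, List


def dcg(labels: List[float]) -> float:
    """Discounted cumulative gain: exponential gain, log2 rank discount."""
    # rank is 0-based, so the first result is divided by log2(2) == 1.
    return sum((2 ** label - 1) / math.log2(rank + 2) for rank, label in enumerate(labels))


def ndcg_at_k(ground_truth: Dict[str, int], retrieved_ids: List[str], k: int = 3) -> float:
    """NDCG@k for one query.

    ground_truth maps document id -> relevance label (hypothetical shape).
    retrieved_ids is the ranked result list, best match first.
    """
    # Assumption: a retrieved document with no ground-truth label scores 0.
    gains = [ground_truth.get(doc_id, 0) for doc_id in retrieved_ids[:k]]
    # The ideal ranking places the highest labels first.
    ideal = sorted(ground_truth.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    labels = {"doc_a": 3, "doc_b": 2, "doc_c": 0}
    print(ndcg_at_k(labels, ["doc_a", "doc_b", "doc_c"]))  # ideal order
    print(ndcg_at_k(labels, ["doc_c", "doc_b", "doc_a"]))  # reversed order scores lower
```

A ranking that matches the ideal order scores 1.0, and any ranking that pushes highly labeled documents down the list scores lower, which is the property a retrieval evaluator relies on.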

Reviewed Changes

Copilot reviewed 2 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_document_retrieval/__init__.py | Exposes the DocumentRetrievalEvaluator class |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_document_retrieval/_document_retrieval.py | Implements the evaluator methods and metric computations |

Files not reviewed (2)
  • sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_document_retrieval/input.schema: Language not supported
  • sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_document_retrieval/metrics.schema: Language not supported
Comments suppressed due to low confidence (3)

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_document_retrieval/_document_retrieval.py:42

  • The call to super().__init__() is unnecessary since DocumentRetrievalEvaluator does not extend a base class. Consider removing it to avoid confusion.
super().__init__()
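As background for this suppressed comment, the sketch below shows what `super().__init__()` does in a class with and without an explicit base. The class names are hypothetical and do not come from the PR. It only illustrates general Python behavior: without an explicit base the call resolves to `object.__init__` and has no effect.

```python
class PlainEvaluator:
    """Hypothetical class with no explicit base class."""

    def __init__(self, threshold: float = 0.5) -> None:
        # Resolves to object.__init__, which does nothing here.
        super().__init__()
        self.threshold = threshold


class BaseEvaluator:
    """Hypothetical base class that performs real setup."""

    def __init__(self) -> None:
        self.initialized = True


class DerivedEvaluator(BaseEvaluator):
    def __init__(self, threshold: float = 0.5) -> None:
        # Here the call matters: it runs BaseEvaluator.__init__.
        super().__init__()
        self.threshold = threshold


if __name__ == "__main__":
    print(PlainEvaluator.__mro__)          # PlainEvaluator, then object
    print(DerivedEvaluator().initialized)  # True
```

Keeping the call is harmless in the first case and required in the second, so whether to remove it depends on whether the class is expected to gain a base class later.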

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_document_retrieval/_document_retrieval.py:166

  • The output key 'ratioholes' does not match the TypedDict definition which specifies 'holes_ratio'. Consider updating it for consistency.
"ratioholes": 0,

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_document_retrieval/_document_retrieval.py:211

  • The key 'ratioholes' is inconsistent with the TypedDict definition that uses 'holes_ratio'. Update it accordingly.
"ratioholes": ratioholes,
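The two comments above describe a result key that differs from the declared TypedDict field. The sketch below shows why that kind of mismatch is easy to miss: a TypedDict is not enforced at runtime, so only a static type checker or an explicit key comparison will flag it. The `RetrievalMetrics` type, its two fields, the helper names, and the definition of a "hole" as a retrieved document without a ground-truth label are assumptions for illustration, not the PR's actual definitions.

```python
from typing import Dict, List, Set, TypedDict


class RetrievalMetrics(TypedDict):
    """Hypothetical result type with only the two hole-related fields."""

    holes: int
    holes_ratio: float


def unexpected_keys(result: Dict[str, object]) -> Set[str]:
    """Keys present in result but not declared on RetrievalMetrics."""
    return set(result) - set(RetrievalMetrics.__annotations__)


def missing_keys(result: Dict[str, object]) -> Set[str]:
    """Keys declared on RetrievalMetrics but absent from result."""
    return set(RetrievalMetrics.__annotations__) - set(result)


def compute_holes(ground_truth_ids: Set[str], retrieved_ids: List[str]) -> RetrievalMetrics:
    # Assumption: a "hole" is a retrieved document that has no ground-truth label.
    holes = sum(1 for doc_id in retrieved_ids if doc_id not in ground_truth_ids)
    ratio = holes / len(retrieved_ids) if retrieved_ids else 0.0
    return {"holes": holes, "holes_ratio": ratio}


if __name__ == "__main__":
    bad = {"holes": 0, "ratioholes": 0}  # key spelled as in the review comment
    print(unexpected_keys(bad))  # {'ratioholes'}
    print(missing_keys(bad))     # {'holes_ratio'}
    print(compute_holes({"a", "b"}, ["a", "x", "y", "b"]))
```

A unit test that compares the returned keys with the declared annotations, as `unexpected_keys` and `missing_keys` do here, is one way to catch this class of inconsistency before review.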

@singankit singankit enabled auto-merge (squash) April 18, 2025 22:16
@singankit singankit merged commit 1100a73 into main Apr 19, 2025
19 checks passed
@singankit singankit deleted the abhahn/document_retrieval_evaluator branch April 19, 2025 01:11
cRui861 pushed a commit that referenced this pull request May 14, 2025
…aluation of document search (#39929)

* Added new evaluator code for Azure AI Evaluation

* Added TypedDict for input validation and created json schema specs for input and output schemas

* Added a temporary hack to make the example runnable; updated schema

* Implementation improvements to align with applied science recommendations

* Added docstrings and cleaned up input schema file

* Updates based on in-person feedback

* Addressed comments from the PR and SDK review

* small fix for threshold dict update

* Updates to support complex object inputs in DocumentRetrievalEvaluator

* Silence cspell errors for metric names'

* Updates to cspell.json

* Some updates for style enforcement; removed json schema files

* Reformatted with black

* Added tests, addressed a few comments and handled some edge cases

* Updates to tests and a few code fixes

* Docstring updates and added samples

* PR comments

* A few small test updates

---------

Co-authored-by: Abby Hartman <[email protected]>

Labels

Evaluation Issues related to the client library for Azure AI Evaluation

6 participants