Conversation

@n-poulsen (Contributor)

Evaluation refactor

Improvements to the evaluation code, as well as new tests to ensure that mAP scores match the pycocotools implementation.
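To make the parity check concrete, here is a hedged sketch of what such a test could look like. `compute_map` and the file names are hypothetical stand-ins, not the actual DeepLabCut API; only the pycocotools side uses real library calls:

```python
# Hypothetical parity test: a sketch, not the actual test code from this PR.
# Assumes a `compute_map(gt, preds)` helper standing in for DeepLabCut's metric API.
import numpy as np
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval


def pycocotools_map(gt_file: str, results_file: str) -> float:
    """Reference mAP from pycocotools for keypoint predictions in COCO format."""
    coco_gt = COCO(gt_file)
    coco_dt = coco_gt.loadRes(results_file)
    evaluation = COCOeval(coco_gt, coco_dt, iouType="keypoints")
    evaluation.evaluate()
    evaluation.accumulate()
    evaluation.summarize()
    return evaluation.stats[0]  # AP averaged over OKS thresholds 0.50:0.95


def test_map_matches_pycocotools():
    reference = pycocotools_map("gt.json", "predictions.json")
    ours = compute_map("gt.json", "predictions.json")  # hypothetical DLC helper
    np.testing.assert_allclose(ours, reference, atol=1e-6)
```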

Change list:

  • Moved all metric computation code to a deeplabcut/core/metrics folder (metrics are computed with NumPy)
  • Cleaned up the metric computation code so that prediction/ground-truth matching always happens (see the matching sketch after this list)
    • Refactored so that no out-of-memory (OOM) errors should occur, even on very large datasets (>60k images); see the streaming-accumulation sketch after this list
  • Multi-animal RMSE: only compute RMSE using (ground-truth, detection) matches with non-zero RMSE
  • Added compute_detection_rmse to compute "detection" RMSE, matching the DeepLabCut 2.X implementation (sketched below)
  • Fixed the bug for PAF models documented in #2631 (Evaluation error with PAF heads: ValueError: matrix contains invalid numeric entries); see the cost-matrix guard below
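To illustrate the matching step, here is a hedged sketch of ground-truth/prediction matching with the Hungarian algorithm, scoring RMSE only over matched pairs. Function and variable names are illustrative, not the PR's actual code:

```python
# Illustrative sketch of matched-pair RMSE (not the PR's actual implementation).
import numpy as np
from scipy.optimize import linear_sum_assignment


def matched_pair_rmse(gt: np.ndarray, pred: np.ndarray) -> float:
    """RMSE over (ground truth, detection) matches for one image.

    gt:   (num_gt_animals, num_keypoints, 2) ground-truth keypoints
    pred: (num_pred_animals, num_keypoints, 2) predicted keypoints
    """
    # Cost matrix: mean keypoint distance between each (GT, prediction) pair
    cost = np.nanmean(
        np.linalg.norm(gt[:, None] - pred[None, :], axis=-1), axis=-1
    )
    # Hungarian matching; NaN costs are replaced so the solver sees finite values
    rows, cols = linear_sum_assignment(np.nan_to_num(cost, nan=1e9))
    errors = np.linalg.norm(gt[rows] - pred[cols], axis=-1)
    return float(np.sqrt(np.nanmean(errors**2)))
```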
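And a hedged sketch of how streaming accumulation can keep memory bounded on large datasets: errors are folded into running sums image by image instead of stacking every prediction into one giant array. The class name and interface are assumptions, not the PR's code:

```python
# Illustrative sketch of OOM-safe metric accumulation (names are hypothetical).
import numpy as np


class RunningRMSE:
    """Accumulates squared errors image-by-image, so memory use stays constant
    regardless of dataset size (e.g. >60k images)."""

    def __init__(self) -> None:
        self.sum_sq = 0.0
        self.count = 0

    def update(self, gt_xy: np.ndarray, pred_xy: np.ndarray) -> None:
        # Per-keypoint squared Euclidean error for one image; NaNs mark
        # unannotated keypoints and are skipped.
        sq_dist = np.sum((gt_xy - pred_xy) ** 2, axis=-1)
        valid = ~np.isnan(sq_dist)
        self.sum_sq += float(sq_dist[valid].sum())
        self.count += int(valid.sum())

    def compute(self) -> float:
        return float(np.sqrt(self.sum_sq / self.count)) if self.count else float("nan")
```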
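For the "detection" RMSE, one plausible reading of the DeepLabCut 2.X behaviour is that each ground-truth keypoint is scored against the closest detection of the same bodypart, ignoring animal identity. The sketch below encodes that assumption; it is not the PR's actual compute_detection_rmse:

```python
# Hedged sketch of a "detection" RMSE in the DeepLabCut 2.X spirit (an
# assumption about its semantics, not the PR's actual implementation).
import numpy as np


def detection_rmse(gt: np.ndarray, detections: list[np.ndarray]) -> float:
    """
    gt:         (num_animals, num_bodyparts, 2) ground-truth keypoints
    detections: list of length num_bodyparts, each (num_detections_b, 2)
    """
    sq_errors = []
    for b, candidates in enumerate(detections):
        if len(candidates) == 0:
            continue
        for animal_gt in gt[:, b]:
            if np.any(np.isnan(animal_gt)):
                continue  # skip unannotated keypoints
            # Closest detection of this bodypart, regardless of animal identity
            d2 = np.sum((candidates - animal_gt) ** 2, axis=-1)
            sq_errors.append(d2.min())
    return float(np.sqrt(np.mean(sq_errors))) if sq_errors else float("nan")
```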
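On the PAF bug: "ValueError: matrix contains invalid numeric entries" is the error scipy.optimize.linear_sum_assignment raises when its cost matrix contains NaN or inf. A plausible guard, which is an assumption and not necessarily the PR's exact fix, is to sanitize the PAF cost matrix before assignment:

```python
# Plausible guard against #2631 (an assumption, not necessarily the PR's fix):
# sanitize the cost matrix before Hungarian assignment.
import numpy as np
from scipy.optimize import linear_sum_assignment


def safe_assignment(cost: np.ndarray):
    # Replace non-finite affinity costs with a large finite penalty, so the
    # solver never sees invalid entries; such pairs become effectively unmatchable.
    finite = np.isfinite(cost)
    if not finite.all():
        fill = cost[finite].max() + 1e6 if finite.any() else 1e6
        cost = np.where(finite, cost, fill)
    return linear_sum_assignment(cost)
```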

@n-poulsen requested review from MMathisLab and jeylau on July 19, 2024 at 12:12
@MMathisLab (Member) left a comment:

lgtm, although I did not test

@MMathisLab (Member) left a comment:

but see suggested changes to docstrings

@MMathisLab (Member) left a comment:

maybe in main docs we need to add information about these new metrics as well @n-poulsen
