Skip to content
This repository was archived by the owner on Apr 8, 2025. It is now read-only.

Add possibility to do nested cross validation#508

Merged
Timoeller merged 14 commits intodeepset-ai:masterfrom
PhilipMay:nested_xval
Sep 14, 2020
Merged

Add possibility to do nested cross validation#508
Timoeller merged 14 commits intodeepset-ai:masterfrom
PhilipMay:nested_xval

Conversation

@PhilipMay
Copy link
Copy Markdown
Contributor

@PhilipMay PhilipMay commented Sep 1, 2020

This adds the possibility to do nested cross validation.

The PR introduces a new class DataSiloForNestedCrossVal that inherits from DataSiloForCrossVal.
It does nested cross validation:

  • other cross validation splits the data to a rest set and a test set
  • inner cross validation splits the rest data to train and dev

This is related to #507

Todo

  • write test
  • discuss with @Timoeller (and @johann-petrak)
  • add quest. answering
  • add doc
  • add number of folds pram for outer and inner folds

@Timoeller
Copy link
Copy Markdown
Contributor

After some offline discussion with @PhilipMay we are going forward with the nested Crossvalidation.

We need a way to handle the variance in performance that comes from different restarts (be it different seeds or slightly different training data).

This PR adds a solution in a structured way, so glad to have it in FARM.

@PhilipMay
Copy link
Copy Markdown
Contributor Author

An other plus for this nested cross validation is that the train and dev datasets are also created in a stratified way.

@PhilipMay PhilipMay force-pushed the nested_xval branch 2 times, most recently from 9c9a968 to d694bb4 Compare September 7, 2020 08:12
@PhilipMay
Copy link
Copy Markdown
Contributor Author

Rebased on master.

Copy link
Copy Markdown
Contributor

@Timoeller Timoeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking well but could be a bit better integrated into existing code.
I made some comments along the way - open to discussion

@PhilipMay
Copy link
Copy Markdown
Contributor Author

@Timoeller I made your suggested changes.
I hope the integration into the existing class is fine with you.

I had a look on nested cross validation for question answering.
But to be honest the whole thing is a bit too fiddly for me.
I'm not deep enough in the question answering topic and would build many bugs I guess.

Maybe you could just add this later when you / someone else wants / needs it?

IMO this is good to go to be merged.

@Timoeller Timoeller self-requested a review September 14, 2020 15:16
Copy link
Copy Markdown
Contributor

@Timoeller Timoeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, thanks for integrating the proposed changes.

@Timoeller Timoeller merged commit 3685035 into deepset-ai:master Sep 14, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants