This repository was archived by the owner on Jun 3, 2025. It is now read-only.

Conversation

@natuan (Contributor) commented Jun 10, 2022

Currently the two trainers in sparseml transformers cannot both extend functionality from the upstream HuggingFace Trainer and share common code. This PR fixes that; for now it enables them to share the logic for saving the best model after a specified epoch and for removing unused columns.
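As a rough illustration of that structure, a shared mixin could sit between the task-specific trainers and the upstream Trainer. This is a minimal sketch only; the class, method, and argument names below are hypothetical and not taken from this PR.

```python
# Hypothetical sketch of a shared-trainer pattern; names are illustrative.
from transformers import Trainer


class SharedTrainerMixin:
    """Holds logic common to the task-specific trainers, e.g. only tracking
    the best model after a specified epoch."""

    def __init__(self, *args, best_model_after_epoch=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.best_model_after_epoch = best_model_after_epoch

    def _should_save_best_model(self):
        # Skip best-model bookkeeping until the configured epoch is reached.
        if self.best_model_after_epoch is None:
            return True
        return (
            self.state.epoch is not None
            and self.state.epoch >= self.best_model_after_epoch
        )


class QuestionAnsweringTrainer(SharedTrainerMixin, Trainer):
    # Inherits shared behavior from the mixin and training internals from the
    # upstream HuggingFace Trainer.
    pass
```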

Additionally, this PR creates common training args for the different training flows and lets them share the distill teacher and recipe args. The same could be done for data and model args when needed.
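A minimal sketch of what such shared args might look like, assuming a dataclass that extends HuggingFace's TrainingArguments; the field names are assumptions for illustration, not the exact args in this PR.

```python
# Hedged sketch of shared training arguments; field names are illustrative.
from dataclasses import dataclass, field
from typing import Optional

from transformers import TrainingArguments


@dataclass
class SharedTrainingArguments(TrainingArguments):
    """Training args common to QA, text classification, and other flows."""

    recipe: Optional[str] = field(
        default=None,
        metadata={"help": "Path or stub of the sparsification recipe to apply"},
    )
    distill_teacher: Optional[str] = field(
        default=None,
        metadata={"help": "Path to a teacher model used for distillation"},
    )
```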

Qualification: tested loading the best model after a specified epoch with the QA flow.

@KSGulin (Contributor) left a comment


Looks good @natuan. Just had a few comments.

@rahul-tuli (Member) commented

Could we update the description with the code for loading the best model for QA before landing?

rahul-tuli previously approved these changes Aug 3, 2022
@natuan force-pushed the transformers_refactor branch from f193ccc to 15e8b3b on August 9, 2022 at 22:05
@anmarques (Member) left a comment


LGTM!

@rahul-tuli (Member) left a comment


@natuan merged commit 3ee8043 into main on Aug 10, 2022
@natuan deleted the transformers_refactor branch on August 10, 2022 at 17:14
