This repository was archived by the owner on Jun 4, 2025. It is now read-only.

Conversation


@KSGulin KSGulin commented Oct 14, 2022

Note: currently targeted at a copy of the upstream release branch for easy comparison. The diff is intended to land on our main branch.

This PR brings our existing transformers integration up to the updated v4.23.1 upstream. See here for the HuggingFace release notes.

Meant to land concurrently with neuralmagic/sparseml#1081

Testing Plan

  • Integration tests (both cadences)
  • SparseML QA docs training and export commands
  • One training command each from the text classification and token classification docs

@KSGulin KSGulin self-assigned this Oct 14, 2022

HuggingFaceDocBuilderDev commented Oct 14, 2022

The documentation is not available anymore as the PR was closed or merged.

@KSGulin KSGulin requested review from a team, Chibukach, DaltheCow and corey-nm and removed request for a team October 14, 2022 10:43
@KSGulin KSGulin changed the title from Upgrade to transformers release V4.23 to Upgrade to transformers release V4.23.1 Oct 14, 2022

@corey-nm corey-nm Oct 14, 2022


Yeah I'm almost positive we could hot patch this stuff in instead of having repo forks.

Something like

import transformers

from transformers.models.bert.modeling_bert import BertSelfAttention as _BertSelfAttention

class PatchedBertSelfAttention(_BertSelfAttention):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)  # keep the upstream init
        self.attention_scores_matmul = ...  # e.g. swap in a quantizable matmul wrapper
    def forward(self, *args, **kwargs):
        ...  # forward pass using the wrapped matmul

# replace the upstream class so newly constructed BERT models pick up the patch
transformers.models.bert.modeling_bert.BertSelfAttention = PatchedBertSelfAttention

Author

@KSGulin KSGulin Oct 14, 2022


Yeah, I think now that most of the implementation has been moved to the sparseml side, there's definitely potential to explore here. One thing to keep in mind is that every time we've upgraded, the HF repo has had changes that broke our integration, so the hot patch would need to be easy to debug and amend. But in general I'm all for this.
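
One way to keep such a hot patch easy to debug and amend would be to gate it on the installed transformers version, so an upstream change fails loudly instead of silently breaking the integration. A minimal sketch of that idea (patch_bert_self_attention and _VERIFIED_VERSIONS are hypothetical names; PatchedBertSelfAttention refers to the example above):

import transformers
from transformers.models.bert import modeling_bert

# versions this patch has been verified against (hypothetical list)
_VERIFIED_VERSIONS = {"4.23.0", "4.23.1"}

def patch_bert_self_attention(patched_cls):
    # fail loudly on an unverified upstream release instead of silently
    # running a patch that may no longer match BertSelfAttention
    if transformers.__version__ not in _VERIFIED_VERSIONS:
        raise RuntimeError(
            f"hot patch not verified against transformers=={transformers.__version__}; "
            "re-check BertSelfAttention upstream and update _VERIFIED_VERSIONS"
        )
    modeling_bert.BertSelfAttention = patched_cls

Registering the patch would then be a single call at import time, e.g. patch_bert_self_attention(PatchedBertSelfAttention).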


What was the reasoning for adding this class? Was this functionality not here before?

Author


This is a file I noted for myself to remove before pushing and did not follow through on.

@KSGulin KSGulin force-pushed the upstream_sync_4.23 branch from 9282e09 to 6133554 on October 14, 2022 17:38
bfineran and others added 3 commits October 14, 2022 18:17
Disable FP16 on QAT start (#12)

* Override LRScheduler when using LRModifiers

* Disable FP16 on QAT start

* keep wrapped scaler object for training after disabling

Using QATMatMul in DistilBERT model class (#41)

Removed double quantization of output of context layer. (#45)

Fix DataParallel validation forward signatures (#47)

* Fix: DataParallel validation forward signatures

* Update: generalize forward_fn selection

Best model after epoch (#46)

fix scaler check for non fp16 mode in trainer (#38)

Mobilebert QAT (#55)

* Remove duplicate quantization of vocabulary.

enable a QATWrapper for non-parameterized matmuls in BERT self attention (#9)
update Zoo stub loading for SparseZoo 1.1 refactor (#54)

add flag to signal NM integration is active (#32)

Add recipe_name to file names
@KSGulin KSGulin force-pushed the upstream_sync_4.23 branch from 6133554 to 7548869 on October 14, 2022 18:18
Author

KSGulin commented Oct 14, 2022

Committed the sin of editing history and pared down our fork changes to 3 commits.

@bfineran bfineran requested a review from anmarques October 14, 2022 18:25
f"files downloaded from {val}. Found {framework_file_names}. Check "
"if the given stub is for a transformers repo model"
)
framework_dir_path = Path(framework_file_paths[0]).parent.absolute()


Will all the framework_file_paths have the same parent? Is that why we can just use the 1st one?

Comment on lines +277 to +279
return tuple(
[_download_dataclass_zoo_stub_files(output) for output in outputs]
)


not sure if i like this better but thought i'd add

Suggested change
return tuple(
[_download_dataclass_zoo_stub_files(output) for output in outputs]
)
return tuple(map(_download_dataclass_zoo_stub_files, outputs))

self.wrap_qat = True
self.qat_wrapper_kwargs = {
"num_inputs": 2,
"num_outputs": 0,


Is this supposed to be here? Don't see it in others

@bfineran

@corey-nm good comments. Going to keep changes to existing flows out of scope for this PR as it's meant to be a rebase only; let's dig in together offline.

@bfineran bfineran merged commit 46f78c1 into upstream-v4.23-release-copy Oct 18, 2022
@DaltheCow DaltheCow removed their request for review October 18, 2022 14:36
KSGulin added a commit that referenced this pull request Jun 19, 2023
* Add recipe_name to default file names

* Upgrade to transformers release V4.30.2 (#62)

* Update trainer and model flows to accommodate sparseml

Disable FP16 on QAT start (#12)

* Override LRScheduler when using LRModifiers

* Disable FP16 on QAT start

* keep wrapped scaler object for training after disabling

Using QATMatMul in DistilBERT model class (#41)

Removed double quantization of output of context layer. (#45)

Fix DataParallel validation forward signatures (#47)

* Fix: DataParallel validation forward signatures

* Update: generalize forward_fn selection

Best model after epoch (#46)

fix scaler check for non fp16 mode in trainer (#38)

Mobilebert QAT (#55)

* Remove duplicate quantization of vocabulary.

enable a QATWrapper for non-parameterized matmuls in BERT self attention (#9)

* Utils and auxiliary changes

update Zoo stub loading for SparseZoo 1.1 refactor (#54)

add flag to signal NM integration is active (#32)

Add recipe_name to file names

* Fix errors introduced in manual cherry-pick upgrade

Co-authored-by: Benjamin Fineran <[email protected]>

* update build versions for NM fork pypi push (#74)

* fix nightly package name (#75)

* add make build command (#76)

* add GHA workflow files to build nightly and release packages (#77)

* add GHA workflow files to build nightly and release packages

* fix name

---------

Co-authored-by: dhuang <[email protected]>

* bump up version to 1.6.0 (#79)

Co-authored-by: dhuang <[email protected]>

---------

Co-authored-by: Konstantin <[email protected]>
Co-authored-by: Konstantin Gulin <[email protected]>
Co-authored-by: dhuangnm <[email protected]>
Co-authored-by: dhuang <[email protected]>
dsikka pushed a commit that referenced this pull request Aug 17, 2023
(same commit message as above)

dsikka pushed a commit that referenced this pull request Aug 17, 2023
(same commit message as above)

bfineran added a commit that referenced this pull request Oct 26, 2023
(same commit message as above)

bfineran added a commit that referenced this pull request Oct 27, 2023
(previous commits: same message as above)

minor improvements for build workflow files (#83)

Co-authored-by: dhuang <[email protected]>

fix minor issue (#84)

Co-authored-by: dhuang <[email protected]>

OPT with quantizable MatMuls (#85)

fix a minor issue for release build (#86)

Co-authored-by: dhuang <[email protected]>

update version in version.py

Testmo (#91)

* improve GHA workflow files to build nightly and release, and report status to testmo

* clean up

* report exit code

* Assign value to exit_code

---------

Co-authored-by: dhuang <[email protected]>

Update trainer.py - fix DistributedSampler import (#93)

DistributedSampler is used but not imported in `trainer.py`

Research/llama/bmm quantization (#94)

* Quantize attention matmuls

* Quantize attention matmuls

bump base transformers version
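
For reference, the #93 fix noted above (DistributedSampler used but not imported in trainer.py) presumably comes down to adding the standard torch import; a minimal sketch, assuming the usual torch location:

from torch.utils.data.distributed import DistributedSampler
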
@dbogunowicz dbogunowicz deleted the upstream_sync_4.23 branch December 5, 2023 10:29