[Tests] Add attentions_option to ModelTesterMixin #15909
Conversation
The documentation is not available anymore as the PR was closed or merged.
Thanks! I am not super familiar with this codebase, so I am not going to approve it, but LGTM!
LysandreJik left a comment:
LGTM, thank you @NielsRogge
Pinging @ydshieh as well for a second review.
You'll need to fix the failing tests as well.
Thanks for this work @NielsRogge! I left a review comment (question) about And then a minor question here: it is a bit strange (to me) to have
@LysandreJik I've fixed the failing tests; the remaining failing test seems unrelated.
ydshieh left a comment:
LGTM!
Feel free to merge if every test passes after a rebase.
Force-pushed from 2192fef to 904d957.
What does this PR do?
The library is called HuggingFace Transformers, I know... but we recently added non-Transformer-based models that don't use attention (namely ConvNeXT and PoolFormer). Soon, we'll also have ResNet in the library (which will back Transformer-based models such as DETR).
As these models don't use attention, they currently need to overwrite 3 tests:

- `test_attention_outputs`
- `test_retain_grad_hidden_states_attentions`
- `test_model_outputs_equivalence`

This PR adds a `has_attentions` attribute to `ModelTesterMixin`, which can be set to `False` for such models. The mixin then makes sure these tests still run properly, without taking attention into account.
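To illustrate the idea, here is a minimal, self-contained sketch of the pattern (not the actual transformers test code — the `get_model_outputs` helper, the simplified dict outputs, and the example test classes are hypothetical stand-ins): a shared mixin carries a `has_attentions` class attribute, and attention-free model testers flip it to `False` so the shared test adapts instead of being copied and overwritten.

```python
import unittest


class ModelTesterMixin:
    # Shared test logic for all models. Testers for attention-free
    # architectures (e.g. a ConvNeXT-style net) set has_attentions = False.
    has_attentions = True

    def test_model_outputs(self):
        outputs = self.get_model_outputs()
        # Hidden states are checked for every model...
        self.assertIn("hidden_states", outputs)
        # ...but attention maps only when the model actually produces them.
        if self.has_attentions:
            self.assertIn("attentions", outputs)
        else:
            self.assertNotIn("attentions", outputs)


class AttentionModelTest(ModelTesterMixin, unittest.TestCase):
    # Transformer-style model: keeps the default has_attentions = True.
    def get_model_outputs(self):
        return {"hidden_states": [1.0], "attentions": [0.5]}


class ConvModelTest(ModelTesterMixin, unittest.TestCase):
    # Attention-free model: opts out with a single class attribute
    # instead of overriding the whole test method.
    has_attentions = False

    def get_model_outputs(self):
        return {"hidden_states": [1.0]}
```

The design choice is that the flag lives on the mixin with a default of `True`, so existing Transformer-based model testers need no changes at all.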