Fix contrastive search to correctly handle input with padding #33507
Conversation
gante left a comment
@ducviet00 thank you for identifying the underlying numerical issue, proposing a very reasonable fix, and implementing it with a test 💛
I agree with the suggested fix: by masking `cosine_matrix` with a large negative value when the corresponding tokens are masked, `degeneration_penalty` will never be related to the masked tokens and, therefore, masked tokens will not have an impact on `contrastive_score`.
I've added a few minor suggestions.
@gante Thanks for your feedback! I initially thought encoder-decoder models wouldn't be affected, but I'll push an update with support for encoder-decoder models based on your suggestions soon.
@gante
It should be:
You can check the code inside
src/transformers/generation/utils.py (Outdated)
This code initializes a default mask and then updates it based on the model type.
- For encoder-decoder models, if `model_kwargs` contains a `decoder_attention_mask` (and it is not None), `cosine_matrix_mask` is set to this mask. If `decoder_attention_mask` is missing, it falls back to the default mask.
- For decoder-only models, `cosine_matrix_mask` is set to the `attention_mask` from `model_kwargs`.

Please let me know if there are any additional logic checks that need to be added.
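For reference, a minimal sketch of the logic described above (illustrative only, not the exact merged code; it assumes access to `input_ids`, `model_kwargs`, and `self.config.is_encoder_decoder` inside the contrastive search loop):

```python
import torch

# Default mask: treat every position as a real (non-padded) token.
cosine_matrix_mask = torch.ones_like(input_ids, dtype=torch.long)

if self.config.is_encoder_decoder:
    # Encoder-decoder models: use decoder_attention_mask when it is present
    # and not None, otherwise keep the default all-ones mask.
    if (
        "decoder_attention_mask" in model_kwargs
        and model_kwargs["decoder_attention_mask"] is not None
    ):
        cosine_matrix_mask = model_kwargs["decoder_attention_mask"]
else:
    # Decoder-only models: padding is described by the regular attention_mask.
    cosine_matrix_mask = model_kwargs["attention_mask"]
```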
Hi @gante
gante left a comment
Thank you for iterating 💛
LysandreJik left a comment
Very impressive change and tests! Thanks for your PR @ducviet00
@ducviet00 thank you for making
…gface#33507)
* fix: handle padding in contrastive search for decoder-only models
* fix: handle padding in contrastive search for encoder-decoder models
* tests: move padding contrastive test to test_util, add t5 test
* fix: handle if model_kwargs["decoder_attention_mask"] is None
* refactor: improve padding input contrastive search generation tests
* chore: _ranking_fast to use LongTensor for cosine_matrix_mask
What does this PR do?
This PR fixes contrastive search to correctly handle input padding in decoder-only & encoder-decoder models.
Details
I encountered the issue and discovered that the Contrastive Search implementation is highly sensitive to padded tokens.
For example, with the prompt `The whispered legends of the haunted mansion spoke`, the output from Hugging Face without padding is:

`The whispered legends of the haunted mansion spoke of the "souls of the dead" who were "falling out of the sky" and "falling into the sea."\n`

However, with padding tokens, the output becomes:

`The whispered legends of the haunted mansion spoke of the "soul of the dead" and the "blood of the dead."\nThe ghost of Dr. H. P. Lovecraft was a man`

You can check the Colab notebook that demonstrates the issue here.
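A rough reproduction sketch of the comparison above (the model name, padding length, and generation settings here are my assumptions; the linked Colab is the authoritative demonstration):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2", padding_side="left")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The whispered legends of the haunted mansion spoke"

# Tokenize the prompt alone, and left-padded to a fixed length.
unpadded = tokenizer(prompt, return_tensors="pt")
padded = tokenizer(prompt, return_tensors="pt", padding="max_length", max_length=32)

for name, batch in [("no padding", unpadded), ("with padding", padded)]:
    # penalty_alpha > 0 together with top_k > 1 activates contrastive search.
    out = model.generate(**batch, penalty_alpha=0.6, top_k=4, max_new_tokens=40)
    print(name, "->", tokenizer.decode(out[0], skip_special_tokens=True))
```

Before the fix, the two calls can produce noticeably different continuations; after the fix they are expected to match.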
How did I fix the issue
The issue arises when `input_ids` contain padded tokens; the `hidden_states` of the model then include embeddings for these padded tokens if the model doesn't eliminate them during processing. The `_ranking_fast` function also calculates values based on these padded tokens, leading to incorrect outputs. This is critical because it significantly degrades the model's performance.

To fix this, I created a `cosine_matrix_mask` based on the `attention_mask` and penalized the `cosine_matrix` using this mask (ignoring padding positions) by applying large negative values. After that, I padded the `cosine_matrix_mask` with ones to match the output length.
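As an illustration of the masking step, here is a sketch of the idea behind the fix (the function name `penalize_cosine_matrix` is mine, not part of the PR, and the shapes follow the description above rather than the exact implementation):

```python
import torch


def penalize_cosine_matrix(
    cosine_matrix: torch.Tensor, cosine_matrix_mask: torch.LongTensor
) -> torch.Tensor:
    """Mask out padded positions before computing the degeneration penalty."""
    # cosine_matrix: [batch * top_k, context_len], cosine similarity between each
    # candidate token's hidden state and every previous hidden state.
    # cosine_matrix_mask: same shape, 1 for real tokens and 0 for padding.
    mask_value = torch.finfo(cosine_matrix.dtype).min  # the "large negative value"
    cosine_matrix = cosine_matrix.masked_fill(cosine_matrix_mask == 0, mask_value)
    # The penalty is the max over the context dimension, so masked (padded)
    # positions can never be selected as the degeneration penalty.
    degeneration_penalty, _ = torch.max(cosine_matrix, dim=-1)
    return degeneration_penalty
```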
Who can review?
@gante @ArthurZucker @amyeroberts @Rocketknight1