Fix Bug in Flax Seq2Seq Models #16021

sanchit-gandhi · 2022-03-09T16:06:42Z

This PR makes two changes to each of FlaxEncoderDecoderModel and FlaxSpeechEncoderDecoderModel:

1. Amends the input docstrings to remove incorrect information about the model "shifting tokens right for denoising". In Flax, decoder_input_ids are obtained by shifting the target labels right outside of the seq2seq model, not inside it as the docstrings stated.
2. Raises a ValueError if decoder_input_ids are not provided. The current behaviour allows decoder_input_ids to be omitted, in which case they default to None. If decoder_attention_mask and decoder_position_ids are omitted as well, the model tries to build them from decoder_input_ids=None using JAX functions, which raises an error.
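As a rough illustration of the first point, shifting the labels right outside the model could look like the sketch below. The helper name `shift_tokens_right`, the token ids, and the use of -100 as the ignored-label value are assumptions made for this example, not code taken from this PR. NumPy is used so the snippet runs without JAX installed; the calls used here (`full`, `concatenate`, `where`) also exist in `jax.numpy`.

```python
import numpy as np


def shift_tokens_right(labels, pad_token_id, decoder_start_token_id):
    """Hypothetical helper: build decoder_input_ids by shifting labels one step right."""
    labels = np.asarray(labels)
    # Prepend the decoder start token to every sequence and drop the last label.
    start = np.full((labels.shape[0], 1), decoder_start_token_id, dtype=labels.dtype)
    shifted = np.concatenate([start, labels[:, :-1]], axis=1)
    # Assumption: -100 marks ignored label positions; replace it with the pad id,
    # because -100 is not a valid token id for the decoder embedding.
    return np.where(shifted == -100, pad_token_id, shifted)


# Illustrative values only: 1 = start token, 2 = end token, 0 = pad token.
labels = np.array([[5, 6, 7, 2], [8, 2, -100, -100]])
decoder_input_ids = shift_tokens_right(labels, pad_token_id=0, decoder_start_token_id=1)
print(decoder_input_ids)
```

The resulting array is what would then be passed to the model as decoder_input_ids, since the model does not perform this shift itself.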

The following code snippet triggers the error described in item 2:

from transformers import FlaxSpeechEncoderDecoderModel
import jax.numpy as jnp
model = FlaxSpeechEncoderDecoderModel.from_encoder_decoder_pretrained(
    'hf-internal-testing/tiny-random-wav2vec2',
    'hf-internal-testing/tiny-random-gpt2',
    encoder_from_pt=True, decoder_from_pt=True,
)
inputs = jnp.ones((2, 5000), dtype=jnp.float32)
# decoder_input_ids is omitted here, so it defaults to None
outputs = model(inputs)

Output:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/sanchitgandhi/transformers/src/transformers/models/speech_encoder_decoder/modeling_flax_speech_encoder_decoder.py", line 688, in __call__
    decoder_attention_mask = jnp.ones_like(decoder_input_ids)
  File "/Users/sanchitgandhi/venv/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 3706, in ones_like
    _check_arraylike("ones_like", a)
  File "/Users/sanchitgandhi/venv/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 570, in _check_arraylike
    raise TypeError(msg.format(fun_name, type(arg), pos))
TypeError: ones_like requires ndarray or scalar arguments, got <class 'NoneType'> at position 0.
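The guard this PR adds can be sketched in isolation as follows. This is a simplified stand-in, not the model's actual `__call__`: the function name `prepare_decoder_inputs` is hypothetical, and NumPy replaces `jax.numpy` so the snippet runs without JAX (the calls `ones_like`, `arange`, and `broadcast_to` exist in both). The idea is to fail early with a clear ValueError instead of letting None reach the array functions.

```python
import numpy as np


def prepare_decoder_inputs(decoder_input_ids, decoder_attention_mask=None, decoder_position_ids=None):
    """Hypothetical stand-in for the decoder input handling described in this PR."""
    # Fail early with an explicit message rather than a TypeError deep inside JAX.
    if decoder_input_ids is None:
        raise ValueError(
            "`decoder_input_ids` can not be `None`. For sequence to sequence training, "
            "`decoder_input_ids` must be specified as an input argument."
        )
    decoder_input_ids = np.asarray(decoder_input_ids)
    # Default mask: attend to every decoder position.
    if decoder_attention_mask is None:
        decoder_attention_mask = np.ones_like(decoder_input_ids)
    # Default positions: 0..sequence_length-1, repeated for each batch element.
    if decoder_position_ids is None:
        batch_size, sequence_length = decoder_input_ids.shape
        decoder_position_ids = np.broadcast_to(
            np.arange(sequence_length)[None, :], (batch_size, sequence_length)
        )
    return decoder_input_ids, decoder_attention_mask, decoder_position_ids


ids, mask, positions = prepare_decoder_inputs(np.array([[1, 5, 6], [1, 8, 2]]))

try:
    prepare_decoder_inputs(None)
    raised = False
except ValueError:
    raised = True
```

With this check in place, omitting decoder_input_ids produces a message that names the missing argument, which is easier to act on than the `ones_like` traceback.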

HuggingFaceDocBuilderDev · 2022-03-09T16:11:45Z

The documentation is not available anymore as the PR was closed or merged.

patil-suraj

Thank you for fixing this!

patil-suraj · 2022-03-10T10:46:17Z

src/transformers/models/encoder_decoder/modeling_flax_encoder_decoder.py

maybe write it this way

Suggested change

- "For sequence to sequence training, `decoder_position_ids` must be specified as an input argument."
+ "`decoder_input_ids` can not be `None`. For sequence to sequence training, `decoder_input_ids` must be specified as an input argument."

patil-suraj · 2022-03-10T10:46:21Z

src/transformers/models/speech_encoder_decoder/modeling_flax_speech_encoder_decoder.py

Think we should keep this comment but just mention that this needs to be done outside of the model and does not happen automatically. Same for the FlaxEncoderDecoder model.

patil-suraj · 2022-03-10T10:47:00Z

src/transformers/models/speech_encoder_decoder/modeling_flax_speech_encoder_decoder.py

same comment as above.

mishig25 mentioned this pull request Mar 9, 2022

Doc builder fix push 2 #16024

Closed

5 tasks

patil-suraj approved these changes Mar 10, 2022

sanchit-gandhi added 2 commits March 10, 2022 12:32

Fix Bug in Flax Seq2Seq Models

80d6051

incorporate suggested changes

f6e8345

sanchit-gandhi force-pushed the flax-enc-dec branch from a564383 to f6e8345 March 10, 2022 11:32

sanchit-gandhi merged commit 741e493 into huggingface:master Mar 10, 2022

sanchit-gandhi deleted the flax-enc-dec branch March 10, 2022 16:00
