ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds Transformers Translation Tutorial Repro #24254
Description
System Info
Context
Hello there!
First and foremost, congrats on the Transformers translation tutorial. 👍
It serves as a spark for building English-to-many translation models!
I'm following it along, mostly reproducing it in a Jupyter notebook with TF for Mac with GPU enabled, using the following dependency versions:

```
tensorflow-macos==2.9.0
tensorflow-metal==0.5.0
transformers==4.29.2
```

*NOTE: the tensorflow-macos dependencies are pinned to ensure GPU training.*
Who can help?
@ArthurZucker @younesbelkada
@gante maybe?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Issue Description
I'm hitting the following error when fitting the model to fine-tune a checkpoint loaded from the `TFAutoModelForSeq2SeqLM` auto class:

```python
with tf.device('/device:GPU:0'):
    model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=1, callbacks=callbacks)
```
It returns:

```
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds

Call arguments received by layer "decoder" (type TFT5MainLayer):
  • self=None
  • input_ids=None
  • attention_mask=None
  • encoder_hidden_states=tf.Tensor(shape=(32, 96, 512), dtype=float32)
  • encoder_attention_mask=tf.Tensor(shape=(32, 96), dtype=int32)
  • inputs_embeds=None
  • head_mask=None
  • encoder_head_mask=None
  • past_key_values=None
  • use_cache=True
  • output_attentions=False
  • output_hidden_states=False
  • return_dict=True
  • training=False

Call arguments received by layer "tft5_for_conditional_generation" (type TFT5ForConditionalGeneration):
  • self={'input_ids': 'tf.Tensor(shape=(32, 96), dtype=int64)', 'attention_mask': 'tf.Tensor(shape=(32, 96), dtype=int64)'}
  • input_ids=None
  • attention_mask=None
  • decoder_input_ids=None
  • decoder_attention_mask=None
  • head_mask=None
  • decoder_head_mask=None
  • encoder_outputs=None
  • past_key_values=None
  • inputs_embeds=None
  • decoder_inputs_embeds=None
  • labels=None
  • use_cache=None
  • output_attentions=None
  • output_hidden_states=None
  • return_dict=None
  • training=False
```
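For context on why the decoder asks for `decoder_input_ids`: as far as I understand, when a batch contains `labels`, T5 builds `decoder_input_ids` itself by shifting the labels one position to the right; when the batch carries no `labels` (note `labels=None` in the trace above), nothing can be shifted and this error is raised. A minimal pure-Python sketch of that shift step, with made-up token IDs (the `-100` ignore index is the one used for loss masking):

```python
def shift_tokens_right(labels, decoder_start_token_id, pad_token_id):
    """Build decoder_input_ids from labels: prepend the decoder start
    token, drop the last label, and replace the -100 ignore index
    (loss-masked padding) with the pad token."""
    shifted = [decoder_start_token_id] + labels[:-1]
    return [pad_token_id if tok == -100 else tok for tok in shifted]

# Labels for one target sentence, padded with -100 for the loss:
labels = [262, 42, 1, -100]
print(shift_tokens_right(labels, decoder_start_token_id=0, pad_token_id=0))
# [0, 262, 42, 1]
```

So the error seems to say less "the model is broken" and more "no labels reached the model during `fit`".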
Backtrace
Tried:
- Removing the callbacks: the model trains, but of course it is neither pushed to the Hub nor are the metrics computed.
- Followed Loading from AutoModel gives ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds #16234, this comment, and ensured that I'm using `AutoTokenizer`. That hinted the problem could be related to `TFAutoModelForSeq2SeqLM`, but
  ```python
  model = TFAutoModelForSeq2SeqLM.from_pretrained(checkpoint)
  ```
  seems to work correctly, so I assume the pre-trained model is loaded fine.
- Also followed PushToHubCallback is hanging on the training completion #21116 and added the `save_strategy="no"` argument to `PushToHubCallback`, but the error persisted.
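In case it helps narrow things down: my understanding is that the batches yielded by `tf_train_set` must include a `labels` key for the shift above to happen, which is what the tutorial's `DataCollatorForSeq2Seq` produces. A minimal pure-Python sketch of that collation step (token IDs, pad ID, and feature values are made up for illustration, not taken from my actual setup):

```python
def collate(features, pad_token_id=0, label_pad=-100):
    """Pad a list of tokenized examples to a batch, padding labels
    with -100 so padded positions are ignored by the loss."""
    max_in = max(len(f["input_ids"]) for f in features)
    max_lab = max(len(f["labels"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        pad_n = max_in - len(f["input_ids"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * pad_n)
        batch["attention_mask"].append([1] * len(f["input_ids"]) + [0] * pad_n)
        batch["labels"].append(f["labels"] + [label_pad] * (max_lab - len(f["labels"])))
    return batch

batch = collate([
    {"input_ids": [5, 6], "labels": [7]},
    {"input_ids": [5], "labels": [7, 8]},
])
print(batch["labels"])
# [[7, -100], [7, 8]]
```

If the dataset passed to `model.fit` drops this `labels` key somewhere along the way, the `decoder_input_ids` error above would be the expected symptom.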
Expected behavior
The trained model should be uploaded to the Hub.
Instead, the folder appears empty and there is the error above.
Hypothesis
At this point, my guess is that once I load the model I have to redefine whatever the verbose error trace is pointing at?
Any help on how to do this, please? :) Or how can I fix it? Do I have to define a specific Trainer? Any idea where I can find this in the docs?