[Pytorch] Unexpected task example translation : text-generation instead of Translation in model card and Hub #25931

@SoyGema

System Info

Hello there!
Thanks for making the translation example with PyTorch.
🙏🙏 The documentation is amazing and the script is very well structured! 🙏🙏

- `transformers` version: 4.32.0.dev0
- Platform: macOS-13.4.1-arm64-arm-64bit
- Python version: 3.10.10
- Huggingface_hub version: 0.16.4
- Safetensors version: 0.3.2
- Accelerate version: 0.21.0
- Accelerate config:    not found
- PyTorch version (GPU?): 2.0.1 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed

Who can help?

@patil-suraj

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Context

Fine-tuning an English-Hindi translation model with t5-small and the opus100 dataset,
running the run_translation.py example from the transformers repository.

I made a small modification to make the dataset a little smaller for end-to-end testing.

I checked the recommendations in README.md for T5-family models:

  • 1. Add the --source_prefix flag
  • 2. Change 3 flags accordingly: --source_lang, --target_lang and --source_prefix
python run_translation.py \
    --model_name_or_path t5-small \
    --do_train \
    --do_eval \
    --source_lang en \
    --target_lang hi \
    --source_prefix "translate English to Hindi: " \
    --dataset_name opus100 \
    --dataset_config_name en-hi \
    --output_dir=/tmp/english-hindi \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --overwrite_output_dir \
    --num_train_epochs=3 \
    --push_to_hub=True \
    --predict_with_generate=True \
    --report_to all \
    --do_predict
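
For a quick end-to-end run, the dataset can also be shrunk without editing the script. The sketch below assembles a subset of the flags from the command above and appends the `--max_train_samples`, `--max_eval_samples` and `--max_predict_samples` options that the example script exposes. The helper name `with_sample_limits` and the sample counts (500/100/100) are hypothetical, chosen only for illustration; this is not necessarily the modification used in this report.

```python
import shlex

# A subset of the flags from the command above, as an argument list.
base_args = [
    "python", "run_translation.py",
    "--model_name_or_path", "t5-small",
    "--do_train", "--do_eval", "--do_predict",
    "--source_lang", "en",
    "--target_lang", "hi",
    "--source_prefix", "translate English to Hindi: ",
    "--dataset_name", "opus100",
    "--dataset_config_name", "en-hi",
    "--output_dir", "/tmp/english-hindi",
    "--predict_with_generate",
]


def with_sample_limits(args, n_train, n_eval, n_predict):
    """Return a copy of args with per-split sample limits appended.

    Hypothetical helper: the limits truncate each split for smoke testing.
    """
    return args + [
        "--max_train_samples", str(n_train),
        "--max_eval_samples", str(n_eval),
        "--max_predict_samples", str(n_predict),
    ]


# Illustrative sizes only; pick whatever fits the machine at hand.
smoke_test_args = with_sample_limits(base_args, 500, 100, 100)

# shlex.join quotes the prefix (it contains spaces) so the line can be pasted into a shell.
print(shlex.join(smoke_test_args))
```

Building the command as a list avoids the missing line-continuation problem that multi-line shell commands are prone to.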

The model trains correctly and is also connected to W&B.
Trace from the model card step, once the model is trained:

[INFO|modelcard.py:452] 2023-09-02 23:08:32,386 >> Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Sequence-to-sequence Language Modeling', 'type': 'text2text-generation'}, 'dataset': {'name': 'opus100', 'type': 'opus100', 'config': 'en-hi', 'split': 'validation', 'args': 'en-hi'}}
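
One way to read this log line is to check which fields the printed result actually carries. The sketch below copies the dictionary from the trace and compares it against a list of required fields. The list `REQUIRED_FIELDS` is an assumption about what the model card code expects (a `metrics` entry next to `task` and `dataset`); it is not confirmed anywhere in this report.

```python
# The result dictionary exactly as printed in the log line above.
dropped_result = {
    "task": {
        "name": "Sequence-to-sequence Language Modeling",
        "type": "text2text-generation",
    },
    "dataset": {
        "name": "opus100",
        "type": "opus100",
        "config": "en-hi",
        "split": "validation",
        "args": "en-hi",
    },
}

# Assumption: the fields a result needs before it is kept in the card metadata.
REQUIRED_FIELDS = ("task", "dataset", "metrics")


def missing_fields(result, required=REQUIRED_FIELDS):
    """List the required top-level fields that are absent from result."""
    return [field for field in required if field not in result]


print(missing_fields(dropped_result))  # ['metrics']
print(dropped_result["task"]["type"])  # text2text-generation
```

Under that assumption the result is dropped because no metrics were attached to it. Note also that the task type recorded in the trace is `text2text-generation`, not `translation`.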

The model is pushed to the Hub.

Expected behavior

  • Correct task recognition and inference: somehow the model is uploaded to the Hub as a text-generation task and not as a translation task.
    Inference shows text-generation as well, and the model card seems to point to that too.
    While searching, I read the forum, but I think the thread there refers to the BLEU generation metric and not the task (if I'm understanding correctly). I also checked the Tasks docs, which I think explain how to add a task rather than change one (please let me know if I should follow this path), and the Troubleshoot page, but couldn't find anything.
[Screenshot: text-generation-instead-translation]
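
As a possible workaround, the task shown on the Hub can be set explicitly through the `pipeline_tag` field in the YAML front matter of the model card (README.md). The sketch below is a minimal, standard-library-only way to insert or replace that field in the README text before pushing it again. The function name `set_pipeline_tag` and the sample card are hypothetical, and the sketch assumes Unix line endings and front matter delimited by `---` lines.

```python
import re


def set_pipeline_tag(readme_text, tag="translation"):
    """Insert or replace `pipeline_tag` in a README's YAML front matter."""
    line = f"pipeline_tag: {tag}"
    # Front matter is the block between a leading '---' line and the next '---' line.
    match = re.match(r"\A---\n(.*?)^---\n", readme_text, flags=re.DOTALL | re.MULTILINE)
    if match is None:
        # No front matter yet: create one that holds only the tag.
        return f"---\n{line}\n---\n{readme_text}"
    block = match.group(1)  # empty, or ends with a newline
    body = readme_text[match.end():]
    if re.search(r"^pipeline_tag:.*$", block, flags=re.MULTILINE):
        # Replace the existing tag in place.
        block = re.sub(r"^pipeline_tag:.*$", lambda _m: line, block, flags=re.MULTILINE)
    else:
        # Append the tag as the last front matter entry.
        block = f"{block}{line}\n"
    return f"---\n{block}---\n{body}"


# Hypothetical card content, for illustration only.
card = "---\nlicense: apache-2.0\ntags:\n- generated_from_trainer\n---\n# english-hindi\n"
updated = set_pipeline_tag(card)
print(updated)
```

This only edits text locally; uploading the changed README to the model repository is a separate step.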

Tangential note:
I'm aware that the BLEU score is 0. I tried modifying some logic in the compute_metrics function, and I also tried another language that computed BLEU well; however, that model was also loaded as text-generation. If further experiments can prove a hypothesis I have about this logic and BLEU (that it affects languages with non-Latin alphabets), I will let you know, but I ran those experiments only to test whether the task issue was somehow related.

[Screenshot 2023-09-03 at 10.14.15]

Any help with clarifying this and pointing the model to the translation task would be much appreciated.
If a change to the script or docs comes out of this, I'm happy to contribute.
Thanks for making transformers 🤖, for the time dedicated to this issue 🕞, and have a nice day 🤗!
