The 54b MoE! #19593
Description
System Info
Hello,
I (and I believe many others) am intrigued by the Mixture of Experts model.
I've looked at the only documentation I could find here: https://github.com/facebookresearch/fairseq/blob/nllb/examples/nllb/modeling/README.md
However, the arguments for generation / evaluation are unclear to me :)
I will be starting a data analysis job shortly and I see some possible applications for the 54b model.
Surely there are other models, but I believe many enthusiasts are looking forward to trying translation with this one. I'm just looking for starting arguments for translation, e.g. English to German (eng → de).
I saw the model is coming to huggingface at some point so definitely looking forward to that.
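In the meantime, smaller NLLB-200 checkpoints are already on the Hugging Face Hub, so a starting point for eng → de could look like the sketch below. This assumes the public `facebook/nllb-200-distilled-600M` checkpoint and the standard `transformers` seq2seq API; the 54B MoE checkpoint would be swapped in once it is published. NLLB addresses languages by FLORES-200 codes (`eng_Latn`, `deu_Latn`), not plain ISO codes, hence the small mapping helper.

```python
# Minimal sketch: English -> German translation with an NLLB-200 checkpoint.
# Assumptions: the public facebook/nllb-200-distilled-600M checkpoint and the
# standard Hugging Face `transformers` AutoTokenizer / AutoModelForSeq2SeqLM API.

# NLLB uses FLORES-200 language codes rather than plain ISO tags.
FLORES_CODES = {
    "en": "eng_Latn",
    "eng": "eng_Latn",
    "de": "deu_Latn",
    "deu": "deu_Latn",
}

def flores_code(lang: str) -> str:
    """Map a short language tag (e.g. 'eng', 'de') to its FLORES-200 code."""
    try:
        return FLORES_CODES[lang.lower()]
    except KeyError:
        raise ValueError(f"No FLORES-200 code known for {lang!r}")

def translate(text: str, src: str = "eng", tgt: str = "de") -> str:
    # Imports kept local so the code-mapping helper works without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    model_name = "facebook/nllb-200-distilled-600M"  # smaller public checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang=flores_code(src))
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(text, return_tensors="pt")
    # Force the decoder to begin with the target-language token so generation
    # is steered toward German output.
    generated = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(flores_code(tgt)),
        max_length=128,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

if __name__ == "__main__":
    print(translate("Hello, world!", src="eng", tgt="de"))
```

The same `translate` call should work for any other FLORES-200 pair by extending the mapping table.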
I am also interested in running human evaluations of the 3.3B model, the 54B MoE model, DeepL, and others to see how far the models have come :)
Thank you so much for your work.
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)