The 54b MoE!

### System Info

Hello,

I am (and I believe many others are) intrigued by the Mixture of Experts (MoE) model.

I've looked at the only documentation I could find here: https://github.com/facebookresearch/fairseq/blob/nllb/examples/nllb/modeling/README.md

However, the arguments for generation / evaluation are unclear to me :)

I will be starting a data analysis job shortly and I see some possible applications for the 54b model.

Sure, there are other models, but I believe many enthusiasts are looking forward to trying translations with this one. I'm just looking for starting arguments for translating, e.g. English to German.

I saw the model is coming to Hugging Face at some point, so I'm definitely looking forward to that.

I am also interested in running human evaluations of the 3.3b model, the 54b MoE model, DeepL and others to see how far the models have come :)

Thank you so much for your work. 

### Who can help?

_No response_

### Information

- [ ] The official example scripts
- [ ] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

### Reproduction

-

### Expected behavior

-

The 54b MoE! #19593
