Pretrained models for masked LM do not work as expected #74

@AnandA777

Description

I have been trying to use the pretrained DebertaV2ForMaskedLM following the example code, but it does not work as expected. The following BERT code (for which the example code is essentially identical) works correctly:

import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
labels = tokenizer("The capital of France is Paris.", return_tensors="pt")["input_ids"]
outputs = model(**inputs, labels=labels)
word_indices = torch.argmax(outputs["logits"], dim=2)[0]
print(tokenizer.decode(word_indices))  # prints ". the capital of france is paris.."
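For clarity, the last two lines above are just a greedy decode: take the argmax over the vocabulary dimension of the logits at each position. A toy illustration of that step alone, with a hypothetical four-word vocabulary and hand-built logits instead of a model (torch only, no download needed):

```python
import torch

# Toy vocabulary and hand-built "logits" of shape (batch=1, seq_len=3, vocab=4),
# standing in for outputs["logits"] in the snippet above.
vocab = ["paris", "london", "capital", "the"]
logits = torch.tensor([[
    [0.1, 0.2, 0.3, 2.0],   # position 0 -> highest score at index 3 ("the")
    [0.5, 0.1, 3.0, 0.2],   # position 1 -> highest score at index 2 ("capital")
    [4.0, 0.3, 0.1, 0.2],   # position 2 -> highest score at index 0 ("paris")
]])

# Greedy pick: argmax over the vocab dimension (dim=2), then drop the batch dim.
word_indices = torch.argmax(logits, dim=2)[0]
decoded = " ".join(vocab[i] for i in word_indices.tolist())
print(decoded)  # prints "the capital paris"
```

With a well-trained MLM head the real logits peak at sensible tokens, which is why the BERT snippet round-trips the sentence; "nonsense output" from this decode suggests the logits themselves are off.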

However, substituting either of the following pairs for the first two lines does not work:

# Option 1: DeBERTa v2
tokenizer = DebertaV2Tokenizer.from_pretrained("microsoft/deberta-v2-xlarge")
model = DebertaV2ForMaskedLM.from_pretrained("microsoft/deberta-v2-xlarge")

# Option 2: DeBERTa (v1)
tokenizer = DebertaTokenizer.from_pretrained("microsoft/deberta-base")
model = DebertaForMaskedLM.from_pretrained("microsoft/deberta-base")

Using either of these options results in nonsense output. Is the documentation missing something?

I am using:

Python 3.9.6
torch 1.9.0
transformers 4.12.5
