Pretrained models for masked LM do not work as expected #74

@AnandA777

Description

I have been trying to use the pretrained DebertaV2ForMaskedLM following the example code, but it does not work as expected. The following BERT code (for which the example code is essentially identical) works correctly:

import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
labels = tokenizer("The capital of France is Paris.", return_tensors="pt")["input_ids"]
outputs = model(**inputs, labels=labels)
word_indices = torch.argmax(outputs["logits"], dim=2)[0]
print(tokenizer.decode(word_indices))  # prints ". the capital of france is paris.."
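For clarity, the last two lines above are just a greedy decode: take the argmax over the vocabulary dimension of the logits at each position. A toy illustration of that step alone, with a hypothetical four-word vocabulary and hand-built logits instead of a model (torch only, no download needed):

```python
import torch

# Toy vocabulary and hand-built "logits" of shape (batch=1, seq_len=3, vocab=4),
# standing in for outputs["logits"] in the snippet above.
vocab = ["paris", "london", "capital", "the"]
logits = torch.tensor([[
    [0.1, 0.2, 0.3, 2.0],   # position 0 -> highest score at index 3 ("the")
    [0.5, 0.1, 3.0, 0.2],   # position 1 -> highest score at index 2 ("capital")
    [4.0, 0.3, 0.1, 0.2],   # position 2 -> highest score at index 0 ("paris")
]])

# Greedy pick: argmax over the vocab dimension (dim=2), then drop the batch dim.
word_indices = torch.argmax(logits, dim=2)[0]
decoded = " ".join(vocab[i] for i in word_indices.tolist())
print(decoded)  # prints "the capital paris"
```

With a well-trained MLM head the real logits peak at sensible tokens, which is why the BERT snippet round-trips the sentence; "nonsense output" from this decode suggests the logits themselves are off.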

However, substituting either of the following pairs for the first two lines does not work:

# Option 1: DeBERTa v2
tokenizer = DebertaV2Tokenizer.from_pretrained("microsoft/deberta-v2-xlarge")
model = DebertaV2ForMaskedLM.from_pretrained("microsoft/deberta-v2-xlarge")

# Option 2: DeBERTa (v1)
tokenizer = DebertaTokenizer.from_pretrained("microsoft/deberta-base")
model = DebertaForMaskedLM.from_pretrained("microsoft/deberta-base")

Using either of these options results in nonsense output. Is the documentation missing something?

I am using:

Python 3.9.6
torch 1.9.0
transformers 4.12.5
