Pretrained models for masked LM do not work as expected #74
Open
Description
I have been trying to use the pretrained DebertaV2ForMaskedLM based on the example code, but it is not working. The following BERT code (for which the example code looks basically identical) works as expected:
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
labels = tokenizer("The capital of France is Paris.", return_tensors="pt")["input_ids"]
outputs = model(**inputs, labels=labels)
word_indices = torch.argmax(outputs["logits"], dim=2)[0]
print(tokenizer.decode(word_indices))  # prints ". the capital of france is paris.."
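The decoding step above can be sketched on a toy example to show what it does: torch.argmax over dim=2 picks, at every sequence position, the vocabulary index with the highest logit. The tiny vocabulary and logits here are hypothetical stand-ins for the tokenizer's vocab and the model's output.

```python
import torch

# Hypothetical toy vocabulary standing in for the tokenizer's vocab.
vocab = ["the", "capital", "of", "france", "is", "paris", "."]

# Fake logits: batch of 1, 3 sequence positions, vocab size 7.
logits = torch.tensor([[
    [0.1, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1],  # highest score at index 1
    [0.1, 0.1, 0.1, 0.9, 0.1, 0.1, 0.1],  # highest score at index 3
    [0.1, 0.1, 0.1, 0.1, 0.1, 0.9, 0.1],  # highest score at index 5
]])

# Same step as in the snippet above: take the argmax over the vocab
# dimension at each position, then map the indices back to strings.
word_indices = torch.argmax(logits, dim=2)[0]
print([vocab[i] for i in word_indices])  # ['capital', 'france', 'paris']
```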
However, substituting either of the following pairs in place of the tokenizer and model lines does not work:
# Option 1: DeBERTa v2
tokenizer = DebertaV2Tokenizer.from_pretrained("microsoft/deberta-v2-xlarge")
model = DebertaV2ForMaskedLM.from_pretrained("microsoft/deberta-v2-xlarge")

# Option 2: DeBERTa v1
tokenizer = DebertaTokenizer.from_pretrained("microsoft/deberta-base")
model = DebertaForMaskedLM.from_pretrained("microsoft/deberta-base")
Using either of these options results in nonsense output. Is the documentation missing something?
I am using:
Python 3.9.6
torch 1.9.0
transformers 4.12.5
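One way to narrow this down is to ask from_pretrained which weights it could not find in the checkpoint. This is a sketch using the output_loading_info flag of transformers' from_pretrained; it assumes network access to the Hugging Face hub to download microsoft/deberta-base. If the masked-LM head weights appear in missing_keys, they were freshly initialized at random, which would explain nonsense predictions.

```python
from transformers import DebertaForMaskedLM

# Report which parameter names in the model had no matching weights
# in the downloaded checkpoint.
model, loading_info = DebertaForMaskedLM.from_pretrained(
    "microsoft/deberta-base", output_loading_info=True
)
print(loading_info["missing_keys"])
```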