DebertaForMaskedLM cannot load the parameters in the MLM head from microsoft/deberta-base #15216
Closed
Description
My environment: transformers 4.15.0
Code to reproduce:
from transformers import AutoModelForMaskedLM
model = AutoModelForMaskedLM.from_pretrained("microsoft/deberta-base")
Loading the model produces the following warning:
Some weights of the model checkpoint at microsoft/deberta-base were not used when initializing DebertaForMaskedLM: ['deberta.embeddings.position_embeddings.weight', 'lm_predictions.lm_head.dense.bias', 'lm_predictions.lm_head.dense.weight', 'lm_predictions.lm_head.bias', 'lm_predictions.lm_head.LayerNorm.weight', 'lm_predictions.lm_head.LayerNorm.bias']
- This IS expected if you are initializing DebertaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DebertaForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of DebertaForMaskedLM were not initialized from the model checkpoint at /home/tfangaa/Downloads/ptlm/deberta-base/ and are newly initialized: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
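It seems that the microsoft/deberta-base checkpoint does not contain MLM head weights under the names DebertaForMaskedLM expects: the checkpoint stores its head under lm_predictions.lm_head.*, while the model looks for cls.predictions.*. As a result the head is newly initialized, and DebertaForMaskedLM cannot be used directly for masked-token prediction.
Is this a bug in the DebertaForMaskedLM class or in the checkpoint provided by Microsoft? Thanks!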
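To make the naming mismatch easier to see, here is a minimal sketch that compares the two key lists from the warning by stripping their head prefixes. The key names are copied from the warning; the prefix lists and the helper `strip_head_prefix` are my own assumptions for illustration, not part of transformers. Matching suffixes only suggest that the parameters might correspond; this does not check shapes or show that the tensors are actually compatible.

```python
# Key names copied verbatim from the warning above.
unused_checkpoint_keys = [
    "deberta.embeddings.position_embeddings.weight",
    "lm_predictions.lm_head.dense.bias",
    "lm_predictions.lm_head.dense.weight",
    "lm_predictions.lm_head.bias",
    "lm_predictions.lm_head.LayerNorm.weight",
    "lm_predictions.lm_head.LayerNorm.bias",
]
newly_initialized_keys = [
    "cls.predictions.bias",
    "cls.predictions.transform.dense.weight",
    "cls.predictions.transform.LayerNorm.weight",
    "cls.predictions.transform.dense.bias",
    "cls.predictions.decoder.weight",
    "cls.predictions.transform.LayerNorm.bias",
]

# Assumed head prefixes (hypothetical grouping, inferred only from the names).
# Longer prefixes come first so they are matched before their shorter parents.
CHECKPOINT_HEAD_PREFIXES = ("lm_predictions.lm_head.",)
MODEL_HEAD_PREFIXES = ("cls.predictions.transform.", "cls.predictions.")


def strip_head_prefix(keys, prefixes):
    """Return the set of key suffixes left after removing the first matching prefix.

    Keys that match none of the prefixes (e.g. embedding weights) are skipped.
    """
    suffixes = set()
    for key in keys:
        for prefix in prefixes:
            if key.startswith(prefix):
                suffixes.add(key[len(prefix):])
                break
    return suffixes


checkpoint_suffixes = strip_head_prefix(unused_checkpoint_keys, CHECKPOINT_HEAD_PREFIXES)
model_suffixes = strip_head_prefix(newly_initialized_keys, MODEL_HEAD_PREFIXES)

shared = sorted(checkpoint_suffixes & model_suffixes)
only_in_model = sorted(model_suffixes - checkpoint_suffixes)
only_in_checkpoint = sorted(checkpoint_suffixes - model_suffixes)

print("Suffixes present on both sides:", shared)
print("Expected by the model but absent from the checkpoint:", only_in_model)
print("In the checkpoint head but not expected by the model:", only_in_checkpoint)
```

If this name comparison is a fair reading, the dense, LayerNorm, and bias parameters exist in the checkpoint under a different prefix, and only the decoder weight has no counterpart in the listed checkpoint keys.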