Outputs are not consistent when using DeBERTa for inference #27586
Closed
Description
System Info
- `transformers` version: 4.34.1
- Platform: Windows-10-10.0.22621-SP0
- Python version: 3.8.18
- Huggingface_hub version: 0.17.3
- Safetensors version: 0.4.0
- Accelerate version: 0.24.0
- Accelerate config: not found
- PyTorch version (GPU?): 2.1.0+cu121 (True)
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
I am running a simple mask-filling task on DeBERTa and find that the output logits for the same input sentence vary on every run (reloading the model from disk each time).
I have called eval() and set manual_seed(), but they do not help, and the outputs differ so much that random seeds alone do not seem to explain it.
Even the official example script shows the same problem. By contrast, it works fine when I keep the model in memory and feed the same input to it twice.
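For reference, the seeding mentioned above behaves as expected in isolation; this is a minimal sketch of `torch.manual_seed` determinism, separate from the reproduction script:

```python
import torch

def draw(seed):
    # Seeding torch's global RNG immediately before a random op
    # makes that op reproducible across runs.
    torch.manual_seed(seed)
    return torch.randn(3)

print(torch.equal(draw(0), draw(0)))  # True: same seed, same draw
```

So seeding itself works; the inconsistency must come from somewhere else in the load/inference path.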
from transformers import AutoTokenizer, DebertaForMaskedLM
import torch
tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-base', cache_dir='model_cache')
model = DebertaForMaskedLM.from_pretrained('microsoft/deberta-base', cache_dir='model_cache').eval()
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
logits = model(**inputs).logits
# retrieve index of [MASK]
mask_token_index = (inputs.input_ids == tokenizer.mask_token_id)[0].nonzero(as_tuple=True)[0]
predicted_token_id = logits[0, mask_token_index].argmax(axis=-1)
print(tokenizer.decode(predicted_token_id))
labels = tokenizer("The capital of France is Paris.", return_tensors="pt")["input_ids"]
# mask labels of non-[MASK] tokens
labels = torch.where(inputs.input_ids == tokenizer.mask_token_id, labels, -100)
outputs = model(**inputs, labels=labels)
print(round(outputs.loss.item(), 2))
# Running the same code a second time, without reloading the model from
# disk, produces consistent outputs.
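One mechanism that could produce exactly this pattern (an assumption on my part, not something confirmed above): if `from_pretrained` cannot find some weights in the checkpoint, it initializes them randomly on every load, so each reload behaves like an independent draw while an in-memory model stays fixed. A minimal sketch with a hypothetical stand-in layer:

```python
import torch
from torch import nn

# Two independent constructions of the same layer stand in for two
# reloads in which some weights are randomly initialized. Dimensions
# are arbitrary, chosen only for illustration.
torch.manual_seed(0)  # seeding once does not help: each "load" re-draws
head_a = nn.Linear(16, 32)
head_b = nn.Linear(16, 32)

x = torch.ones(1, 16)
with torch.no_grad():
    # The two "loads" disagree even in eval mode with no dropout.
    print(torch.allclose(head_a(x), head_b(x)))  # False: independent inits
```

If this is what is happening, `from_pretrained` should also log a warning listing the weights that were newly initialized.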
Expected behavior
Feeding the same input should produce the same outputs, regardless of whether the model was reloaded from disk.