RAG generate function uses input_ids even when context_input_ids are given. #7871
Closed
Description
Environment info
- transformers version: 3.3.1
- Platform: Linux-5.4.0-51-generic-x86_64-with-debian-buster-sid
- Python version: 3.6.8
- PyTorch version (GPU?): 1.6.0 (True)
- Tensorflow version (GPU?): 2.3.1 (False)
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No
Who can help
I think @patrickvonplaten has been looking into RAG issues.
Information
Model I am using: RagTokenForGeneration
The problem arises when using:
- [x] the official example scripts: (give details below)
- [ ] my own modified scripts: (give details below)
The tasks I am working on is:
- [ ] an official GLUE/SQUaD task: (give the name)
- [x] my own task or dataset: (give details below)
To reproduce
One can use the RAG demo currently in a PR, but the issue will happen in any case:
- Load a RagTokenForGeneration model.
- Generate the context_input_ids (in the demo this is done with a forward pass).
- Call the generate function without passing input_ids, which is supposed to be an optional input.
- The function computes the batch_size from input_ids and breaks, since input_ids is None at this line:

transformers/src/transformers/modeling_rag.py
Line 1311 in 9f7b2b2
batch_size = input_ids.shape[0]
import torch

from transformers import RagConfig, RagRetriever, RagTokenForGeneration, RagTokenizer

query = "My question"
tokenizer = RagTokenizer.from_pretrained("facebook/rag-token-nq")
rag_conf = RagConfig.from_pretrained("facebook/rag-token-nq")
# `dataset` is the custom indexed dataset built earlier in the demo
retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-nq",
    question_encoder_tokenizer=tokenizer.question_encoder,
    generator_tokenizer=tokenizer.generator,
    index_name="custom",
    indexed_dataset=dataset,
)
model = RagTokenForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever)
device = "cuda:0"
model.to(device)  # model must live on the same device as its inputs
input_ids = tokenizer(query, return_tensors="pt").input_ids.to(device)
with torch.no_grad():
    # retrieve support docs
    retrieved_outputs = model(input_ids, labels=None, output_retrieved=True)
    dl_scores = retrieved_outputs.doc_scores[0].tolist()
    dp_scores = retrieved_outputs.doc_scores.softmax(dim=-1)[0].tolist()
    doc_dicts = retriever.index.get_doc_dicts(retrieved_outputs.retrieved_doc_ids)[0]
    support_docs = [
        {"score": ls, "proba": ns, "title": ti, "text": te}
        for ls, ns, ti, te in zip(dl_scores, dp_scores, doc_dicts["title"], doc_dicts["text"])
    ]
    # generate answers (no input_ids passed, which triggers the crash)
    generated_ids = model.generate(
        context_input_ids=retrieved_outputs.context_input_ids,
        context_attention_mask=retrieved_outputs.context_attention_mask,
        doc_scores=retrieved_outputs.doc_scores,
        num_beams=4,
        num_return_sequences=4,
        min_length=2,
        max_length=64,
        length_penalty=1.0,
    )
Expected behavior
The batch size should be obtained differently when input_ids is None. For instance, batch_size = doc_scores.shape[0], since doc_scores has shape (batch_size, n_docs).
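The fallback could look like the sketch below (the helper name and signature are hypothetical, not the actual transformers API): doc_scores has shape (batch_size, n_docs) and context_input_ids has shape (batch_size * n_docs, seq_len), so either one can recover the batch size when input_ids is absent.

```python
from collections import namedtuple

# Minimal stand-in for a torch.Tensor; only the .shape attribute matters here.
FakeTensor = namedtuple("FakeTensor", ["shape"])


def infer_batch_size(input_ids=None, doc_scores=None, context_input_ids=None, n_docs=None):
    """Hypothetical helper sketching the proposed fallback.

    doc_scores: (batch_size, n_docs); context_input_ids: (batch_size * n_docs, seq_len).
    """
    if input_ids is not None:
        return input_ids.shape[0]
    if doc_scores is not None:
        return doc_scores.shape[0]
    if context_input_ids is not None and n_docs:
        return context_input_ids.shape[0] // n_docs
    raise ValueError("need input_ids, doc_scores, or context_input_ids with n_docs")


# doc_scores for a batch of 2 questions with 5 retrieved docs each
print(infer_batch_size(doc_scores=FakeTensor(shape=(2, 5))))  # 2
print(infer_batch_size(context_input_ids=FakeTensor(shape=(10, 300)), n_docs=5))  # 2
```

Any of these shapes is already available inside generate at the point where batch_size is computed, so the None access on input_ids is avoidable.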