This repository was archived by the owner on Apr 8, 2025. It is now read-only.

QA: Returning answers that are longer than context #333

@tholor


Describe the bug
When the answer is longer than context_window_size, the QuestionAnsweringHead returns invalid, negative offsets. This is only relevant at inference time, but it can then cause problems in downstream usage (e.g. in haystack).

Expected behavior
The context should always cover the full answer.

Details
We should add a check in create_context() that enlarges the context window in such cases. The current implementation:

def create_context(self, ans_start_ch, ans_end_ch, clear_text):
    if ans_start_ch == 0 and ans_end_ch == 0:
        return None, 0, 0
    else:
        len_text = len(clear_text)
        midpoint = int((ans_end_ch - ans_start_ch) / 2) + ans_start_ch
        half_window = int(self.context_window_size / 2)
        context_start_ch = midpoint - half_window
        context_end_ch = midpoint + half_window
        # if we have part of the context window overlapping start or end of the passage,
        # we'll trim it and use the additional chars on the other side of the answer
        overhang_start = max(0, -context_start_ch)
        overhang_end = max(0, context_end_ch - len_text)
        context_start_ch -= overhang_end
        context_start_ch = max(0, context_start_ch)
        context_end_ch += overhang_start
        context_end_ch = min(len_text, context_end_ch)
        context_string = clear_text[context_start_ch: context_end_ch]
        return context_string, context_start_ch, context_end_ch
@staticmethod
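
One possible shape for the proposed check, as a minimal sketch rather than the actual FARM fix: the function below is a standalone variant of create_context() that takes context_window_size as a parameter (an assumption made for illustration, in FARM it is an attribute of the head) and treats ans_end_ch as an exclusive end offset (also an assumption). It widens the window to at least the answer length, then clamps the boundaries so the answer is always inside the context.

```python
def create_context(ans_start_ch, ans_end_ch, clear_text, context_window_size):
    """Return (context_string, context_start_ch, context_end_ch).

    Hypothetical standalone variant of the method shown in this issue.
    """
    if ans_start_ch == 0 and ans_end_ch == 0:
        return None, 0, 0

    len_text = len(clear_text)
    # Assumed fix: never use a window that is smaller than the answer itself.
    window = max(context_window_size, ans_end_ch - ans_start_ch)
    midpoint = (ans_end_ch - ans_start_ch) // 2 + ans_start_ch
    half_window = window // 2
    context_start_ch = midpoint - half_window
    context_end_ch = midpoint + half_window

    # Shift the window if it overlaps the start or end of the passage.
    overhang_start = max(0, -context_start_ch)
    overhang_end = max(0, context_end_ch - len_text)
    context_start_ch = max(0, context_start_ch - overhang_end)
    context_end_ch = min(len_text, context_end_ch + overhang_start)

    # Integer halving can leave the window one character short,
    # so clamp outward until the whole answer is covered.
    context_start_ch = min(context_start_ch, ans_start_ch)
    context_end_ch = max(context_end_ch, min(len_text, ans_end_ch))

    context_string = clear_text[context_start_ch:context_end_ch]
    return context_string, context_start_ch, context_end_ch


if __name__ == "__main__":
    text = "a" * 50 + "b" * 100 + "c" * 50
    # Answer spans 100 characters, window size is only 20.
    ctx, start, end = create_context(50, 150, text, context_window_size=20)
    # The answer offset relative to the context is no longer negative.
    print(50 - start, len(ctx))
```

With this change, an answer that fits inside the window is handled exactly as before; only over-long answers get an enlarged context.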

System:

  • FARM version: 0.4.2

Metadata

Labels

bug (Something isn't working), task: QA (Question Answering)
