QA: Returning answers that are longer than context

**Describe the bug**
When the answer is longer than `context_window_size`, the `QuestionAnsweringHead` returns invalid, negative offsets. This only matters at inference time, but it can cause problems in downstream usage (e.g. in haystack).

**Expected behavior**
The context should always cover the full answer.

**Details**
We should add a check in `create_context()` that enlarges the context in such cases:
https://github.com/deepset-ai/FARM/blob/89826c1f4269a5c0d0b5a6193ece94a3401766d9/farm/modeling/prediction_head.py#L1262-L1282

**System:**
 - FARM version: 0.4.2

    def create_context(self, ans_start_ch, ans_end_ch, clear_text):
        if ans_start_ch == 0 and ans_end_ch == 0:
            return None, 0, 0
        else:
            len_text = len(clear_text)
            midpoint = int((ans_end_ch - ans_start_ch) / 2) + ans_start_ch
            half_window = int(self.context_window_size / 2)
            context_start_ch = midpoint - half_window
            context_end_ch = midpoint + half_window
            # if we have part of the context window overlapping start or end of the passage,
            # we'll trim it and use the additional chars on the other side of the answer
            overhang_start = max(0, -context_start_ch)
            overhang_end = max(0, context_end_ch - len_text)
            context_start_ch -= overhang_end
            context_start_ch = max(0, context_start_ch)
            context_end_ch += overhang_start
            context_end_ch = min(len_text, context_end_ch)
            context_string = clear_text[context_start_ch: context_end_ch]
            return context_string, context_start_ch, context_end_ch

    @staticmethod
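
One possible shape for that check, as a minimal sketch rather than the actual fix: the function below is a standalone copy of `create_context()` that takes `context_window_size` as a parameter (an assumption, so it runs outside the class) and treats `ans_start_ch` / `ans_end_ch` as Python slice boundaries (also an assumption). The only new logic is the last step, which widens the window whenever it would cut into the answer.

```python
def create_context(ans_start_ch, ans_end_ch, clear_text, context_window_size):
    """Return (context_string, context_start_ch, context_end_ch).

    Standalone sketch: `context_window_size` is passed in explicitly here,
    whereas the original method reads it from `self`.
    """
    if ans_start_ch == 0 and ans_end_ch == 0:
        # No answer: same behaviour as the original method.
        return None, 0, 0

    len_text = len(clear_text)
    midpoint = (ans_end_ch - ans_start_ch) // 2 + ans_start_ch
    half_window = context_window_size // 2
    context_start_ch = midpoint - half_window
    context_end_ch = midpoint + half_window

    # If the window overlaps the start or end of the passage, trim it and
    # spend the spare characters on the other side of the answer.
    overhang_start = max(0, -context_start_ch)
    overhang_end = max(0, context_end_ch - len_text)
    context_start_ch -= overhang_end
    context_end_ch += overhang_start

    # Proposed check: never let the window be narrower than the answer,
    # so offsets computed relative to the context cannot become negative.
    context_start_ch = min(context_start_ch, ans_start_ch)
    context_end_ch = max(context_end_ch, ans_end_ch)

    # Clamp to the passage boundaries.
    context_start_ch = max(0, context_start_ch)
    context_end_ch = min(len_text, context_end_ch)

    context_string = clear_text[context_start_ch:context_end_ch]
    return context_string, context_start_ch, context_end_ch
```

With this variant, an answer spanning characters 10 to 40 and a `context_window_size` of 10 yields a context that starts at or before 10 and ends at or after 40, instead of a 10-character window centred on the answer's midpoint. Answers shorter than the window behave as before.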

QA: Returning answers that are longer than context #333

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

QA: Returning answers that are longer than context #333

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions