tholor
left a comment
This would fix the inferencer, but don't we have similar issues in other use cases (e.g. training deepset-ai/haystack#488)? Maybe a similar fix inside DataSilo could tackle this?
I thought so too, and tried corrupted files in normal training (using the DataSilo). I could not get QA processing to fail with these corrupted files. I can try corrupting the files further to see whether we can reproduce DataSilo errors. Will report back here.
OK, I found some strange behaviour when the context is either empty or does not contain the answer. I opened a separate issue and assigned @bogdankostic to it. I would prefer to merge this solution for the inferencer now and work on the DataSilo in a separate PR.
tholor
left a comment
OK, sounds good. Let's tackle that separately.
fixes #454
This only happens during multiprocessing.
When one multiprocessing chunk contains only invalid examples, we run into errors. This PR catches these errors and logs an error message. It does not solve the root cause.
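For readers who want the general shape of this kind of workaround, here is a minimal sketch. It is not the code in this PR: `featurize_chunk` and `process_chunks` are hypothetical names, the validity check (a non-empty `text` field) is an assumption, and the chunks are processed sequentially to keep the example self-contained, where a real setup would hand each chunk to a worker process.

```python
import logging

logger = logging.getLogger(__name__)


def featurize_chunk(chunk):
    """Hypothetical worker: turn raw dicts into features, skipping invalid ones."""
    features = []
    for example in chunk:
        text = example.get("text")
        if not isinstance(text, str) or not text.strip():
            continue  # assumed notion of "invalid": missing or empty text
        features.append({"text": text, "n_tokens": len(text.split())})
    if not features:
        # Nothing usable is left, so the chunk as a whole fails.
        raise ValueError("chunk contains only invalid examples")
    return features


def process_chunks(chunks):
    """Featurize every chunk; log and skip chunks that fail entirely."""
    results, failed = [], []
    for index, chunk in enumerate(chunks):
        try:
            results.extend(featurize_chunk(chunk))
        except ValueError as err:
            # Report the problem instead of crashing the whole run.
            logger.error("Skipping chunk %d: %s", index, err)
            failed.append(index)
    return results, failed
```

As in the PR, this only reports the failing chunk and carries on; why the examples are invalid in the first place is left untouched.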
Invalid examples can be: