This repository was archived by the owner on Apr 8, 2025. It is now read-only.

examples/natural_questions.py fails #520

@ftesser

Describe the bug
Running the examples/natural_questions.py script produces the error below.
The error is raised from the model.inference_from_dicts method.

Error message

"""
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/fabio/src/git_repositories/FARM/farm/infer.py", line 569, in _create_datasets_chunkwise
    dataset, tensor_names, baskets = processor.dataset_from_dicts(dicts, indices, return_baskets=True)
  File "/home/fabio/src/git_repositories/FARM/farm/data_handler/processor.py", line 361, in dataset_from_dicts
    id_external = self._id_from_dict(d)
  File "/home/fabio/src/git_repositories/FARM/farm/data_handler/processor.py", line 403, in _id_from_dict
    ext_id = try_get(ID_NAMES, d["qas"][0])
  File "/home/fabio/src/git_repositories/FARM/farm/utils.py", line 432, in try_get
    ret = dictionary[key]
TypeError: string indices must be integers
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/fabio/src/git_repositories/FARM/examples/natural_questions.py", line 142, in <module>
    question_answering()
  File "/home/fabio/src/git_repositories/FARM/examples/natural_questions.py", line 135, in question_answering
    result = model.inference_from_dicts(dicts=QA_input, return_json=False) # result is a list of QAPred objects
  File "/home/fabio/src/git_repositories/FARM/farm/infer.py", line 696, in inference_from_dicts
    return Inferencer.inference_from_dicts(self, dicts, return_json=return_json,
  File "/home/fabio/src/git_repositories/FARM/farm/infer.py", line 474, in inference_from_dicts
    return list(predictions)
  File "/home/fabio/src/git_repositories/FARM/farm/infer.py", line 545, in _inference_with_multiprocessing
    for dataset, tensor_names, baskets in results:
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 865, in next
    raise value
TypeError: string indices must be integers
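
The last frames of the worker traceback suggest that `d["qas"][0]` is a plain string rather than a dict, so looking it up by a string key fails. The snippet below is a minimal sketch of that mechanism only. `lookup_first` is a hypothetical, simplified stand-in for the lookup shown in the traceback (not FARM's actual `try_get`), and the contents of `ID_NAMES` and the sample question are assumptions made for illustration.

```python
# Simplified stand-in for the key lookup seen in the traceback.
# This is NOT FARM's implementation; it only reproduces the error class.
def lookup_first(keys, dictionary):
    """Return the value of the first key found, or None if no key matches."""
    for key in keys:
        try:
            return dictionary[key]
        except KeyError:
            continue
    return None


ID_NAMES = ["example_id", "id"]  # hypothetical key names

qa_as_dict = {"question": "Who wrote FARM?", "id": "q1"}  # dict-style entry
qa_as_str = "Who wrote FARM?"  # plain-string entry

# A dict entry works: the "id" key is found.
found_id = lookup_first(ID_NAMES, qa_as_dict)

# A string entry fails: str does not accept string indices, and the
# resulting TypeError is not caught by the KeyError handler.
try:
    lookup_first(ID_NAMES, qa_as_str)
    error_type = None
except TypeError as exc:
    error_type = type(exc).__name__

print(found_id, error_type)
```

If this reading is right, any "qas" list that holds bare question strings would hit the same error inside the worker process, which the pool then re-raises in the parent.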

Expected behavior
The script runs without errors.

Additional context
It seems that QA_input in the "qas" + "context" format is not handled here.
In fact, if I use the "questions" + "text" keys in the QA_input dictionary instead, no error is thrown.
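
As a possible workaround until the "qas" + "context" format is handled, the input can be converted to the "questions" + "text" keys before calling `model.inference_from_dicts`. The sketch below is only an illustration: `to_questions_text` is a hypothetical helper that is not part of FARM, and the assumption that a "qas" entry may be either a plain string or a dict with a "question" key is mine. Only the two key pairs themselves come from the report above.

```python
def to_questions_text(dicts):
    """Convert {"qas", "context"} inputs to {"questions", "text"} inputs.

    Hypothetical helper, not part of FARM. Dicts that do not use the
    "qas" + "context" keys are passed through unchanged.
    """
    converted = []
    for d in dicts:
        if "qas" in d and "context" in d:
            # Assumption: each entry is a question string or a dict
            # that carries the question under a "question" key.
            questions = [
                qa["question"] if isinstance(qa, dict) else qa
                for qa in d["qas"]
            ]
            converted.append({"questions": questions, "text": d["context"]})
        else:
            converted.append(d)
    return converted


# Placeholder input in the format that triggers the error.
QA_input = [{"qas": ["What is FARM?"], "context": "FARM is a framework."}]
QA_converted = to_questions_text(QA_input)
print(QA_converted)
```

The converted list would then be passed as `dicts=QA_converted` in place of `QA_input`. This sidesteps the failing code path rather than fixing it.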

To Reproduce
Run examples/natural_questions.py. To save time, the same error is obtained by running the script only from step 9 onward, which starts with this comment:

# 9. Since training on the whole NQ corpus requires substantial compute resources we trained and uploaded a model on s3

System:

  • OS: Linux Ubuntu 20.04.1 LTS
  • GPU/CPU: both
  • FARM version: 0.4.7

Labels: bug
