Can't insert all my data into sqlite database. #56
Description
Hello,
I used GROBID to convert 2657 PDF files to XML and then ran this command: #!python -m paperetl.file /Users/kellytsorb/paperetl/file/XML_files /Users/kellytsorb/paperetl/SQLite
It inserts the XML files into the database that the command creates, but only 549 of them are inserted and I don't know why. Some of the papers that are missing now were inserted without problems in the past, when I tried a smaller batch. Is there a limit on the number of articles that can be inserted into the database?
Process Process-2:
Traceback (most recent call last):
  File "/Users/kellytsorb/anaconda3/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/Users/kellytsorb/anaconda3/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/kellytsorb/anaconda3/lib/python3.11/site-packages/paperetl/file/execute.py", line 94, in process
    for result in Execute.parse(*params):
  File "/Users/kellytsorb/anaconda3/lib/python3.11/site-packages/paperetl/file/execute.py", line 74, in parse
    yield TEI.parse(stream, source)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kellytsorb/anaconda3/lib/python3.11/site-packages/paperetl/file/tei.py", line 37, in parse
    title = soup.title.text
            ^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'text'
Total articles inserted: 549
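The traceback suggests the worker process stopped at an XML file in which no title element was found (soup.title was None), which may be why the remaining files were never inserted. As a rough sketch for narrowing this down, the following lists the XML files in a directory that have no title element at all or that cannot be parsed. It is not part of paperetl: the helper names has_title and files_without_title are made up for illustration, and it uses Python's standard xml.etree.ElementTree rather than the parser paperetl uses, so the results are only an approximation.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def has_title(path):
    """Return True if the file parses as XML and contains a <title> element.

    Hypothetical helper, not part of paperetl.
    """
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError:
        # Empty or malformed files are treated as having no title.
        return False
    for element in root.iter():
        # Drop the namespace prefix, e.g. "{http://www.tei-c.org/ns/1.0}title" -> "title"
        tag = element.tag.rsplit("}", 1)[-1]
        if tag == "title":
            return True
    return False


def files_without_title(directory):
    """Return the sorted list of *.xml files in directory that lack a <title>."""
    return sorted(p for p in Path(directory).glob("*.xml") if not has_title(p))
```

Calling files_without_title("/Users/kellytsorb/paperetl/file/XML_files") and printing the result should show which files are worth inspecting or moving out of the input directory before rerunning the command.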
Thank you in advance!