Fix Pinboard RSS parsing valid links as None #822

Merged 1 commit on Aug 4, 2021

overhacked
Contributor

Summary

Fixes #821.

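item.find(p) returns either an ElementTree.Element or None. The lambda on line 24 of archivebox/parsers/pinboard_rss.py tests that return value for truthiness, and an Element is falsy when it has no child elements (see ElementTree.py line 207). A <link> element that contains only text is therefore treated as missing, and the lambda returns None even though the link is valid.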

Further, returning a Link with url=None violates an assertion in index/schema.py, which crashes the archivebox add command.
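
The pitfall described above can be reproduced outside ArchiveBox. The following is a minimal sketch, not the parser's actual code: the feed snippet `ITEM_XML` and the helper `find_text` are hypothetical names introduced for illustration, and real Pinboard RSS items carry more fields. It shows that an element can be found and still have no children, and that comparing against `None` explicitly avoids dropping it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down feed item used only for this illustration.
ITEM_XML = "<item><title>Example</title><link>https://example.com/</link></item>"


def find_text(item, path):
    """Return the stripped text of the first match, or None if it is absent."""
    element = item.find(path)
    # Compare against None explicitly. A found element with no child
    # elements has len() == 0, so a plain truthiness test on it would
    # wrongly treat the element as missing.
    if element is None or element.text is None:
        return None
    return element.text.strip()


item = ET.fromstring(ITEM_XML)
link = item.find("link")

print(link is not None)            # True: the <link> element was found
print(len(link))                   # 0: it holds text but no child elements
print(find_text(item, "link"))     # https://example.com/
print(find_text(item, "missing"))  # None
```

The sketch deliberately inspects `len(link)` rather than calling `bool(link)`, because truth-testing an `Element` is deprecated in recent Python versions and its result is expected to change.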

Changes these areas

  • Bugfixes
  • Feature behavior
  • Command line interface
  • Configuration options
  • Internal architecture
  • Snapshot data layout on disk

`item.find(p)` returns either an `ElementTree.Element` or `None`.  The
[lambda on line 24][lambda] coerces the return value to a bool, which is
`False` if the `<link>` element has no children (see
[`ElementTree.py` line 207][etbooldef]), so the lambda returns `None`.

Further, returning a `Link` with `url=None` violates
[an assertion in `index/schema.py`][assertion], which crashes
the `archivebox add` command.

[lambda]: https://github.com/ArchiveBox/ArchiveBox/blob/3d54b1321bf8c56627aaa50efcc809cd99caee52/archivebox/parsers/pinboard_rss.py#L24
[etbooldef]: https://github.com/python/cpython/blob/3d8993a744813c5144851da5347d7b4b1885f234/Lib/xml/etree/ElementTree.py#L207
[assertion]: https://github.com/ArchiveBox/ArchiveBox/blob/3d54b1321bf8c56627aaa50efcc809cd99caee52/archivebox/index/schema.py#L165
@pirate pirate merged commit 2e5937d into ArchiveBox:dev Aug 4, 2021
Contributor

cdzombak commented Nov 15, 2021

@pirate is there a chance of getting a new release / Docker image with this fix?

Edit: nvm, I see a dev tag is available so I'll use that.

Development

Successfully merging this pull request may close these issues.

Bug: Crash during Pinboard RSS import