
Database-independent json export and import #885

Merged
lonvia merged 18 commits into komoot:master from lonvia:json-dump-exchange-format
May 8, 2025


@lonvia lonvia commented May 7, 2025

This PR replaces the existing backend-dependent json dump code with a generic json dumper that writes a generalised dump format directly from the PhotonDoc. It then adds an import function to read these files back in. Both export and import take into account the settings for -languages, -extra-tags, -countries and -import-geometry-column. That means that, on the one side, you can produce dumps of different degrees of detail and, on the other side, apply simple filtering to a large dump on import.

The dump file remains a line-delimited json format. Photon guarantees that it dumps one document per line, so that it is easy to use tools like grep to pre-filter the data when saving or importing the file. On import, however, it can handle any concatenated json.
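To illustrate the difference between line-delimited and concatenated json, here is a minimal Python sketch of a reader that accepts both. It is not Photon's importer (which is written in Java); the function name `iter_concatenated_json` and the sample documents are made up for illustration.

```python
import json
from typing import Iterator


def iter_concatenated_json(text: str) -> Iterator[object]:
    """Yield every json document found in `text`.

    Documents may be separated by newlines, by other whitespace,
    or by nothing at all (plain concatenation).
    """
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while pos < end:
        # raw_decode() does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode() returns the parsed document and the index
        # of the first character after it.
        doc, pos = decoder.raw_decode(text, pos)
        yield doc


# Hypothetical sample: one document per line, then two on the same line.
sample = '{"a": 1}\n{"b": 2}{"c": [3, 4]}\n'
docs = list(iter_concatenated_json(sample))
```

A writer that sticks to one document per line keeps grep-style filtering possible, while a reader like this one stays tolerant of files that were reformatted or stitched together by other tools.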

There is now also support for a special CountryInfo document, which can contain country-level information like country names. When available, Photon can use this information to set the country names in a place document. That saves a lot of duplication in the dump file. Having the country names on each place document separately is supported as well, but country info takes precedence.
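The precedence rule can be sketched roughly as follows. This is only an illustration in Python: the keys `country_code` and `country_names`, the shape of the lookup table and the function name are assumptions, not the actual dump format or Photon's code. The sketch also assumes that "takes precedence" means the shared names replace the per-place names entirely rather than being merged with them.

```python
from typing import Dict

Names = Dict[str, str]


def resolve_country_names(place: dict, country_info: Dict[str, Names]) -> Names:
    """Pick the country names for a place document.

    `country_info` maps a lower-case country code to the names taken
    from a country-level document (hypothetical structure).
    """
    code = (place.get("country_code") or "").lower()
    shared = country_info.get(code)
    if shared:
        # Country-level information wins over names stored on the place.
        return dict(shared)
    # Fall back to names carried by the place document itself, if any.
    return dict(place.get("country_names") or {})


# Hypothetical data for illustration only.
country_info = {"de": {"default": "Deutschland", "en": "Germany"}}
with_info = resolve_country_names(
    {"country_code": "DE", "country_names": {"default": "Allemagne"}},
    country_info,
)
without_info = resolve_country_names(
    {"country_code": "fr", "country_names": {"default": "France"}},
    country_info,
)
```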

Dumping the full planet takes about 3.5 hours for the standard configuration and produces a 20GB bzipped file. A full dump with all features takes 9.5h and comes to about 33GB. Reading the full dump takes about 11h. There is clearly room for improvement here. To some extent, bzipping might have been the limiting factor.

Supersedes #438.
Closes #868.
Closes #291.
Fixes #411.

@lonvia lonvia force-pushed the json-dump-exchange-format branch from ab4f843 to 21ac674 Compare May 7, 2025 21:22
@lonvia lonvia merged commit 630459e into komoot:master May 8, 2025
4 checks passed
@lonvia lonvia deleted the json-dump-exchange-format branch May 8, 2025 07:39
@lonvia lonvia mentioned this pull request Aug 25, 2025

Development

Successfully merging this pull request may close these issues.

Build indexes for structured search on an existing database
Excluding context in mappings breaks reindexing
Provide json dump importer
