Database-independent json export and import by lonvia · Pull Request #885 · komoot/photon

lonvia · 2025-05-07T16:07:15Z

This PR replaces the existing backend-dependent json dump code with a generic json dumper that writes a generalised dump format directly from the PhotonDoc. It then adds an import function to read these files back in. Both export and import take into account the settings for -languages, -extra-tags, -countries and -import-geometry-column. That means that you can, on the one side, produce dumps with different degrees of detail and, on the other side, apply simple filtering to a large dump when importing it.
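To illustrate the kind of filtering meant here, the following is a minimal Python sketch and not Photon's actual (Java) code. The field names `country_code` and `names`, and the function `filter_document`, are hypothetical and only stand in for whatever the dump format really uses.

```python
def filter_document(doc, countries=None, languages=None):
    """Return a reduced copy of doc, or None if the document is filtered out.

    Assumption: 'country_code' and 'names' are hypothetical field names,
    and 'names' maps a language code to a name.
    """
    # Country filter: skip documents from countries that were not requested.
    if countries is not None and doc.get("country_code") not in countries:
        return None

    # Shallow copy so that the caller's document is left untouched.
    result = dict(doc)

    # Language filter: keep only names in the requested languages.
    if languages is not None and "names" in doc:
        result["names"] = {
            lang: value
            for lang, value in doc["names"].items()
            if lang in languages
        }
    return result
```

The same function could run on either side: while writing, to produce a smaller dump, or while reading, to import only part of a large one.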

The dump file remains a line-delimited json format. Photon guarantees that it writes one document per line, so that it is easy to use tools like grep to pre-filter the data when saving or importing the file. On import, however, it can handle any concatenated json.
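The difference between the two sides can be sketched in Python as follows. This is an illustration of the general technique, not Photon's implementation: the writer emits exactly one document per line, while the reader accepts any sequence of concatenated json values regardless of where the line breaks fall. The function names are made up for this example.

```python
import json


def dump_lines(docs):
    """Serialise documents as line-delimited json, one document per line."""
    # json.dumps without 'indent' never emits a raw newline, so each
    # document is guaranteed to stay on a single line.
    return "".join(json.dumps(doc, ensure_ascii=False) + "\n" for doc in docs)


def iter_concatenated_json(text):
    """Yield every top-level json value in text, in order of appearance."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not accept leading whitespace, so skip it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        yield value
```

A reader like this also parses a file that was pretty-printed or had its line breaks removed, which is why the one-document-per-line guarantee only matters for line-based tools such as grep.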

There is now also support for a special CountryInfo document which can contain country-level information like country names. When available, Photon can use this information to set the country names in a place document. That saves a lot of duplication in the dump file. Having the country names on each place document separately is supported as well, but country info takes precedence.
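The precedence rule can be sketched in Python like this. It is only an illustration under assumptions: `country_code` and `country_names` are hypothetical field names, and `country_info` stands for a lookup table built from the CountryInfo document.

```python
def resolve_country_names(place, country_info):
    """Pick the country names for a place document.

    Assumption: 'country_info' maps a country code to a dict of names and
    'country_code' / 'country_names' are hypothetical place fields.
    """
    code = place.get("country_code")
    shared = country_info.get(code) if code is not None else None
    if shared:
        # Names from the country info take precedence when available.
        return shared
    # Otherwise fall back to names stored on the place document itself.
    return place.get("country_names", {})
```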

Dumping the full planet takes about 3.5 hours for the standard configuration and produces a 20GB bzipped file. A full dump with all features takes 9.5h and is about 33GB. Reading the full dump takes about 11h. There is clearly room for improvement here. To some extent, bzipping might have been the limiting factor.
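Because the dump is both bzipped and line-delimited, it can be pre-filtered as a stream without unpacking it to disk first. The following Python sketch shows one way to do that with the standard `bz2` module. The function name and the substring match are assumptions for this example and mimic a plain grep rather than anything Photon provides.

```python
import bz2


def matching_lines(source, needle):
    """Yield the lines of a bzipped, line-delimited dump that contain needle.

    'source' may be a file name or an open binary file object.
    """
    # Text mode decompresses on the fly, so only a small part of the
    # file is held in memory at any time.
    with bz2.open(source, mode="rt", encoding="utf-8") as stream:
        for line in stream:
            if needle in line:
                yield line.rstrip("\n")
```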

Supersedes #438.
Closes #868.
Closes #291.
Fixes #411.

lonvia added 18 commits May 1, 2025 09:29

unify json dumpers to create a common format

3bc02d8

start importing of dumps

7b9627a

adapt to changes in multi-doc handling

ae8e0bc

basic json import complete

de088ba

add mixin of country names and processing of lang-dept address

b1de7ee

add support for parsing addresslines

99cb512

rename osm_type/id to more generic object_type/id

cb97898

generalize tag_key/value into categories

2df3272

make geometry import optional when loading from file

8a37ea1

implement extraTags config and introduce ALL option

83f002d

filter useful when importing simple docs

ae5bf79

export of geometries

af820e8

remove linked place ID from PhotonDoc

d6c3582

Linked places should never be imported.

enable country filtering with file import

cddc47a

tests for json exporter

6ed8723

add tests for json reader

c33fbe5

fix formatting

b0a215f

downgrade JsonUnit for Java 11 compatibility

21ac674

lonvia force-pushed the json-dump-exchange-format branch from ab4f843 to 21ac674 Compare May 7, 2025 21:22

lonvia merged commit 630459e into komoot:master May 8, 2025
4 checks passed

lonvia deleted the json-dump-exchange-format branch May 8, 2025 07:39

lonvia mentioned this pull request May 8, 2025

Add option to import json dump file (#291) #438

Closed

lonvia mentioned this pull request Aug 25, 2025

Importing other sources #53

Closed
