Accept URLs for describe, validate, and convert input #98

Merged
tschaub merged 1 commit into main from urls
Oct 13, 2023

tschaub commented Oct 13, 2023

This adds support for using URLs in addition to file paths as the input for the describe, validate, and convert commands.

# gpq describe https://github.com/opengeospatial/geoparquet/raw/v1.0.0/examples/example.parquet
╭────────────────────┬────────┬────────────┬────────────┬─────────────┬──────────┬───────────────────────┬───────────────────────────┬──────────────────────────╮
│ COLUMN             │ TYPE   │ ANNOTATION │ REPETITION │ COMPRESSION │ ENCODING │ GEOMETRY TYPES        │ BOUNDS                    │ DETAIL                   │
├────────────────────┼────────┼────────────┼────────────┼─────────────┼──────────┼───────────────────────┼───────────────────────────┼──────────────────────────┤
│ pop_est            │ double │            │ 0..1       │ snappy      │          │                       │                           │                          │
│ continent          │ binary │ string     │ 0..1       │ snappy      │          │                       │                           │                          │
│ name               │ binary │ string     │ 0..1       │ snappy      │          │                       │                           │                          │
│ iso_a3             │ binary │ string     │ 0..1       │ snappy      │          │                       │                           │                          │
│ gdp_md_est         │ int64  │            │ 0..1       │ snappy      │          │                       │                           │                          │
│ geometry           │ binary │            │ 0..1       │ snappy      │ WKB      │ Polygon, MultiPolygon │ [-180, -90, 180, 83.6451] │  edges │ planar          │
│                    │        │            │            │             │          │                       │                           │  crs   │ WGS 84 (CRS84)  │
├────────────────────┼────────┴────────────┴────────────┴─────────────┴──────────┴───────────────────────┴───────────────────────────┴──────────────────────────┤
│ Rows               │ 5                                                                                                                                        │
│ Row Groups         │ 1                                                                                                                                        │
│ GeoParquet Version │ 1.0.0                                                                                                                                    │
╰────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

You can also validate a file at a URL, with or without the --metadata-only flag (with the flag, validation skips scanning all geometries):

# gpq validate https://github.com/opengeospatial/geoparquet/raw/v1.0.0/examples/example.parquet --metadata-only
Summary: Passed 16 checks.

Metadata and schema checks only.  Skipped 4 data scanning checks.

 ✓ file must include a "geo" metadata key
 ✓ metadata must be a JSON object
 ✓ metadata must include a "version" string
 ✓ metadata must include a "primary_column" string
 ✓ metadata must include a "columns" object
 ✓ column metadata must include the "primary_column" name
 ✓ column metadata must include a valid "encoding" string
 ✓ column metadata must include a "geometry_types" list
 ✓ optional "crs" must be null or a PROJJSON object
 ✓ optional "orientation" must be a valid string
 ✓ optional "edges" must be a valid string
 ✓ optional "bbox" must be an array of 4 or 6 numbers
 ✓ optional "epoch" must be a number
 ✓ geometry columns must not be grouped
 ✓ geometry columns must be stored using the BYTE_ARRAY parquet type
 ✓ geometry columns must be required or optional, not repeated

And the same works for the convert command:

# gpq convert https://github.com/opengeospatial/geoparquet/raw/v1.0.0/examples/example.parquet example.geojson

This doesn't yet add support for reading from blob storage. I'll add that separately.
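
To illustrate the distinction between file paths, HTTP(S) URLs, and not-yet-supported blob storage locations, here is a minimal Go sketch of how an input argument could be classified. It is not the code from this PR: `classifyInput`, `inputKind`, and the `kind*` constants are hypothetical names, and treating other schemes such as `s3://` as "unsupported" is an assumption about how such inputs might be handled.

```go
package main

import (
	"fmt"
	"net/url"
)

// inputKind is a hypothetical label for how an input argument might be read.
type inputKind string

const (
	kindFile        inputKind = "file"        // local file path
	kindHTTP        inputKind = "http"        // http:// or https:// URL
	kindUnsupported inputKind = "unsupported" // e.g. blob storage schemes (assumption)
)

// classifyInput is a hypothetical helper, not gpq's actual implementation.
// It decides whether an input string looks like an HTTP(S) URL or a file path.
func classifyInput(input string) inputKind {
	u, err := url.Parse(input)
	if err != nil {
		// Anything that does not parse as a URL is treated as a file path.
		return kindFile
	}
	switch {
	case u.Scheme == "http" || u.Scheme == "https":
		return kindHTTP
	case len(u.Scheme) <= 1:
		// No scheme, or a single letter that is most likely a Windows
		// drive such as C:\data\example.parquet.
		return kindFile
	default:
		// Other schemes (s3://, gs://, ...) would need separate support.
		return kindUnsupported
	}
}

func main() {
	inputs := []string{
		"https://github.com/opengeospatial/geoparquet/raw/v1.0.0/examples/example.parquet",
		"example.parquet",
		"/tmp/example.parquet",
		"s3://bucket/example.parquet", // hypothetical blob storage location
	}
	for _, in := range inputs {
		fmt.Printf("%s -> %s\n", in, classifyInput(in))
	}
}
```

Parsing with `net/url` rather than checking for an `https://` prefix means the scheme comparison is case-insensitive, since `url.Parse` lowercases the scheme.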

Fixes #93.

Linked issue that merging this pull request may close: describe and validate remote geoparquet files
