New tool alert: geoparquet-io is both a stand-alone CLI[1] tool and a Python library for reading, transforming, and writing GeoParquet files. Remember, GeoParquet is a cloud-native[2] geodata format (basically, the geo-flavour of the Parquet[3] format).
geoparquet-io is open source, licensed under Apache 2.0, and its version 1.0 is currently in beta. The CLI tool and the Python library can be installed using pipx and pip, respectively, or using uv.
With geoparquet-io, you can convert other formats to GeoParquet, inspect and optimize existing GeoParquet files, add a spatial index, partition large files using one of several partitioning strategies, and more. The tool also supports building pipelines of operations in Python, or chaining operations in the CLI by passing intermediate results along via Unix pipes. The project website features performance tips and best practices. When you convert data to GeoParquet with geoparquet-io, these best practices are applied automatically.
The extract tool looks particularly interesting: it can extract geodata directly from, for example, a WFS[4], an Esri ArcGIS Feature Service, or a BigQuery[5] table and write it to GeoParquet.
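As an aside on the "cloud-native" label used above: the idea of a partial download can be illustrated with a short sketch. It is not geoparquet-io code and does not use its API. It relies only on Python's standard library and on the fact that a (non-encrypted) Parquet file ends with a 4-byte little-endian footer length followed by the magic bytes `PAR1`. The function names and any URL you pass in are hypothetical and chosen for illustration.

```python
import struct
import urllib.request

PARQUET_MAGIC = b"PAR1"
TAIL_SIZE = 8  # 4-byte little-endian footer length + 4-byte magic


def tail_range_header(n_bytes: int = TAIL_SIZE) -> dict[str, str]:
    """Build a Range header asking for only the last n_bytes of a resource."""
    return {"Range": f"bytes=-{n_bytes}"}


def footer_length(tail: bytes) -> int:
    """Decode the footer metadata size from the last 8 bytes of a Parquet file."""
    if len(tail) != TAIL_SIZE or tail[4:] != PARQUET_MAGIC:
        raise ValueError("not the tail of a Parquet file")
    (length,) = struct.unpack("<I", tail[:4])
    return length


def fetch_footer_length(url: str) -> int:
    """Download just the last 8 bytes of a remote Parquet file and decode them."""
    request = urllib.request.Request(url, headers=tail_range_header())
    with urllib.request.urlopen(request) as response:
        # 206 Partial Content means the server honoured the range request.
        if response.status != 206:
            raise RuntimeError("server ignored the Range header")
        return footer_length(response.read())
```

A reader would typically follow up with a second range request for the footer itself, which holds the schema and row-group statistics, and then fetch only the columns and row groups it actually needs.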
Footnotes
[1] Command-line interface.
[2] I.e., taking full advantage of the cloud computing model; in the data world, typically by enabling partial downloads of file contents through so-called HTTP range requests.
[3] An open-source, columnar storage file format designed for efficient data processing.
[4] Web Feature Service, an OGC standard interface for querying and retrieving vector geodata over the web.
[5] Google’s data warehouse (DWH) Platform-as-a-Service offering.