GeoParquet 2.0

Exciting news from the #cloudnative and #cloudnativegeo world: GEOMETRY and GEOGRAPHY become native data types in both #Parquet and #Iceberg.
Published: February 13, 2025

On the heels of this, news from the #cloudnative and #cloudnativegeo world:

The Parquet specification has officially adopted geospatial guidance, enabling native storage of GEOMETRY and GEOGRAPHY types

and

Iceberg 3 now includes GEOMETRY and GEOGRAPHY as part of its official specification

What does this mean? GEOMETRY and GEOGRAPHY are now native logical data types in both Apache Parquet¹ and Apache Iceberg², just like, e.g., DATE or TIMESTAMP. This is a big step toward mainstreaming “geo” in the cloud-native³ data community.
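To make this a bit more tangible: as far as I understand the Parquet geospatial definitions, the values in a GEOMETRY or GEOGRAPHY column are encoded as Well-Known Binary (WKB). Here is a minimal sketch of what that looks like for a single 2D point, using only the Python standard library. The function names `point_to_wkb` and `wkb_to_point` are made up for illustration and are not part of any Parquet or Iceberg API.

```python
import struct


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian Well-Known Binary (WKB)."""
    # Layout: 1 byte for the byte order (1 = little-endian),
    # a uint32 geometry type (1 = Point), then two float64 coordinates.
    return struct.pack("<BIdd", 1, 1, x, y)


def wkb_to_point(wkb: bytes) -> tuple[float, float]:
    """Decode a 2D WKB point, honouring the byte-order flag."""
    endian = "<" if wkb[0] == 1 else ">"
    (geom_type,) = struct.unpack_from(endian + "I", wkb, 1)
    if geom_type != 1:
        raise ValueError(f"not a 2D point (WKB type {geom_type})")
    return struct.unpack_from(endian + "dd", wkb, 5)


wkb = point_to_wkb(16.37, 48.21)  # x = longitude, y = latitude
print(len(wkb), wkb_to_point(wkb))
```

The difference the native types make is that the file format itself now declares such a column as geospatial, so a reader no longer has to treat it as an opaque byte array and rely on extra file-level metadata to interpret it.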

Even more interesting will be the next steps toward making the cloud-native paradigm work better for spatial data. A rough roadmap and more details are available from CNG (the Cloud-Native Geospatial Forum).
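For readers new to the cloud-native idea, here is a rough sketch of the partial-download pattern, assuming a plain, unencrypted Parquet file served by an HTTP server that honours `Range` headers. A Parquet file ends with a 4-byte little-endian footer length followed by the magic bytes `PAR1`, so a client can fetch just the last 8 bytes, learn how large the footer metadata is, and then fetch only that. The helper names (`fetch_range`, `footer_length`, `footer_range`, `read_footer`) are hypothetical, and real readers add caching, retries and error handling.

```python
import struct
import urllib.request

PARQUET_MAGIC = b"PAR1"


def fetch_range(url: str, range_spec: str) -> bytes:
    """Fetch part of a remote file via an HTTP range request, e.g. 'bytes=-8'."""
    req = urllib.request.Request(url, headers={"Range": range_spec})
    with urllib.request.urlopen(req) as resp:
        # Note: a server that ignores Range answers 200 with the whole file.
        return resp.read()


def footer_length(tail: bytes) -> int:
    """Parse the last 8 bytes of a Parquet file: footer length + magic."""
    if len(tail) != 8 or tail[4:] != PARQUET_MAGIC:
        raise ValueError("not the tail of an unencrypted Parquet file")
    return struct.unpack("<I", tail[:4])[0]


def footer_range(tail: bytes) -> str:
    """Range header value covering the footer metadata plus the 8-byte tail."""
    return f"bytes=-{footer_length(tail) + 8}"


def read_footer(url: str) -> bytes:
    """Two small requests instead of downloading the whole file."""
    tail = fetch_range(url, "bytes=-8")
    return fetch_range(url, footer_range(tail))
```

Calling `read_footer` with the URL of a publicly readable Parquet file would transfer only the footer, which is where a reader finds the schema and the row-group layout it needs to decide which further byte ranges to request.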

Footnotes

  1. an open-source, columnar storage file format designed for efficient data processing

  2. an open-source, high-performance table format for large datasets in data lakes

  3. i.e., taking full advantage of the cloud computing model; in the data world, typically by enabling partial downloads of file contents through so-called HTTP range requests