Raster-enabling Apache Hop

Stefan Ziegler continues his series on #geoenabling #ApacheHop, this time adding #raster support to his hop-gdal-plugin. Stefan walks through an #ETL pipeline that computes building heights from LiDAR and vector data using new raster transforms (Raster Clip, Raster Zonal Stats) built on the newer #GDAL tool structure.

Published: April 3, 2026

Stefan Ziegler’s efforts to geo-enable Apache Hop1 (previously) continue: after adding vector data support through OGR2 reader and writer plug-ins, a geometry inspector, some INTERLIS3 integration, and more, Stefan is now tackling raster data. In Let’s Hop 6: Pixel für Pixel (alternatively: on LinkedIn), he describes4 how he added several raster transforms to his hop-gdal-plugin using the newer GDAL tool structure.5

To demonstrate the new capabilities, Stefan walks through a pipeline that computes building heights from LiDAR6 and vector data: an OGR reader ingests building footprints from a GeoPackage7 file, while a Raster Clip transform clips the corresponding extent from an externally hosted COG8 file. The two streams are joined9, and a Raster Zonal Stats transform10 calculates the average and maximum height per building. The result is written to a GeoPackage file via an OGR output.

Demo workflow (source: Stefan Ziegler)
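
For readers unfamiliar with zonal statistics, the idea behind the last processing step can be sketched in a few lines of plain Python. This is not Stefan's implementation and does not use the hop-gdal-plugin or GDAL. The function name `zonal_stats`, the tiny sample grids, and the assumption that the raster already holds heights above ground and that the footprints have already been rasterized into zone ids are all made up for illustration.

```python
from collections import defaultdict


def zonal_stats(heights, zones, background=0, nodata=None):
    """Aggregate raster values per zone.

    heights: 2D list of cell values (assumed: height above ground in metres).
    zones:   2D list of the same shape; each cell holds a building id,
             or `background` where no building covers the cell.
    Returns {zone_id: {"count": n, "mean": m, "max": x}}.
    """
    values_per_zone = defaultdict(list)
    for height_row, zone_row in zip(heights, zones):
        for height, zone in zip(height_row, zone_row):
            if zone == background:
                continue  # cell is not inside any building footprint
            if nodata is not None and height == nodata:
                continue  # skip cells without a valid measurement
            values_per_zone[zone].append(height)

    return {
        zone: {
            "count": len(values),
            "mean": sum(values) / len(values),
            "max": max(values),
        }
        for zone, values in values_per_zone.items()
    }


# Hypothetical 4x4 height raster and matching zone raster (two buildings).
heights = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 6.0, 8.0, 0.0],
    [0.0, 7.0, 9.0, 0.0],
    [3.0, 5.0, 0.0, 0.0],
]
zones = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [2, 2, 0, 0],
]

stats = zonal_stats(heights, zones)
print(stats[1])  # {'count': 4, 'mean': 7.5, 'max': 9.0}
print(stats[2])  # {'count': 2, 'mean': 4.0, 'max': 5.0}
```

In a real pipeline the zones come from vector polygons rather than a ready-made grid, so a tool has to decide which cells count as inside a footprint (for example, by cell centre or by any overlap). That choice affects the mean more than the maximum.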

Nice to see the geospatial capabilities of Hop further expanded. To recap Stefan’s series11:

Stefan’s INTERLIS made easy series is at 61 instalments. Let’s see how far “Let’s Hop” will go.13

Footnotes

  1. If you need a reminder: Apache Hop is an open-source ETL (extract-transform-load) solution that – in its own words and certainly in the hopes of some stakeholders – “aims to be the future of data integration.”↩︎

  2. OGR is the vector data library within GDAL supporting a wide range of vector formats.↩︎

  3. INTERLIS is a Swiss standard for geodata modelling and data exchange.↩︎

  4. in German↩︎

  5. Think, for example, gdal raster hillshade instead of the very broad gdal_translate or gdalwarp.↩︎

  6. Light Detection and Ranging, an active remote sensing technique that uses laser pulses to generate 3D point clouds of the Earth’s surface or objects.↩︎

  7. An open SQLite-based container format for geospatial data, defined by the OGC.↩︎

  8. Cloud-optimized GeoTIFF.↩︎

  9. Here, Hop feels somewhat unwieldy, I would say.↩︎

  10. Here, Stefan finds the user interface somewhat overwhelming.↩︎

  11. The articles are all on LinkedIn, as well, but here I use the more sovereign links to Stefan’s blog.↩︎

  12. While some articles have English titles, they are all written in German.↩︎

  13. 🤞– which, of course, is not a sustainable strategy.↩︎