Does your STAC[1] contain assets that depend on each other, for example a digital elevation model and a stream network derived from it? Spatialnode has published an article on STACD, a STAC extension that works with DAGs[2] so that the catalog is aware of the workflows connecting different STAC assets. As a result, an update to an upstream STAC asset can trigger re-computation of the downstream assets that depend on it.
From Spatialnode:
STACD extends STAC with three main capabilities:
- Complete Lineage Tracking: Every dataset records not just its immediate parents, but the entire chain of algorithms, input datasets, and parameters used to create it. (…)
- Algorithm Versioning: STACD introduces a formal way to describe algorithms and their versions. When you update a terrain classification algorithm or switch from one machine learning model to another, the system tracks which datasets need to be recomputed.
- Selective Re-Computation: Instead of rerunning everything, STACD identifies exactly which parts of a workflow are affected by changes. (…)
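To make the idea of selective re-computation concrete, here is a minimal sketch in plain Python. It is not the STACD API and does not use Airflow; the asset names, the `PARENTS` mapping, and the `affected_assets` helper are all hypothetical and chosen for illustration only. It assumes lineage is stored as a mapping from each asset to its direct upstream assets, and uses the standard-library `graphlib` module (Python 3.9+) to order the work.

```python
from collections import deque
from graphlib import TopologicalSorter

# Hypothetical lineage: each asset maps to the upstream assets it is derived from.
PARENTS = {
    "dem": set(),
    "landcover": set(),
    "slope": {"dem"},
    "streams": {"dem"},
    "watersheds": {"streams"},
    "erosion_risk": {"slope", "landcover"},
}


def affected_assets(parents, changed):
    """Return the assets downstream of `changed`, in a valid recomputation order."""
    # Invert the parent map so we can walk from an asset to its direct children.
    children = {name: set() for name in parents}
    for child, upstream in parents.items():
        for up in upstream:
            children[up].add(child)

    # Breadth-first walk to collect every transitive descendant of the changed asset.
    stale = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in stale:
                stale.add(child)
                queue.append(child)

    # Topologically sort the whole graph (parents before children),
    # then keep only the stale assets so they are rebuilt in dependency order.
    order = TopologicalSorter(parents).static_order()
    return [name for name in order if name in stale]
```

Calling `affected_assets(PARENTS, "dem")` returns the four DEM-derived assets with `streams` ahead of `watersheds` and `slope` ahead of `erosion_risk`, while `landcover` is left untouched. A change to `landcover` marks only `erosion_risk` for recomputation.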
The article links to an open-access conference paper, “STACD: STAC Extension with DAGs for Geospatial Data and Algorithm Management” by Laud et al. Spatialnode also provides a reference implementation built on the Apache Airflow framework.
Footnotes
[1] SpatioTemporal Asset Catalog
[2] A “DAG” or “Directed Acyclic Graph” is a concept that can be used to describe a workflow, for example in data manipulation or transformation (“directed acyclic” simply means that the graph has a direction of “flow” and that it is impossible to loop back when traversing the graph).
