Geospatial Cloud Native Data Overview: Mapping the Future

Kailin Opaleychuk

FME Version

  • FME 2024.0

Introduction

Cloud-native is a modern approach to managing spatial and non-spatial datasets in which the data formats are optimized for deployment and hosting in cloud environments. By taking advantage of cloud architecture, cloud-native formats reduce the volume of data exchanged between providers and users across networks, saving time, money and storage.

The concept behind cloud-native is simply to ‘read what you need’. This usually involves some form of indexing or tiling, so that a query covering a small extent can extract just the relevant part of a large dataset. Because cloud-native standards enable these partial reads by design, data publishers can rely on them to optimize their datasets without much additional configuration.
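
The ‘read what you need’ idea can be sketched in a few lines of Python. This is a minimal illustration only, not how any particular format or FME reader works: it packs a few hypothetical tiles into one byte buffer, keeps an index of byte offsets, and reads a single tile by seeking straight to it, much as a client would issue an HTTP range request against cloud storage. The names `build_blob` and `read_tile`, and the tile contents, are invented for the example.

```python
import io

def build_blob(tiles):
    """Pack tiles into one byte string and return (blob, index).

    index maps tile id -> (offset, length), which is all a reader
    needs to fetch one tile without downloading the whole file.
    """
    blob = bytearray()
    index = {}
    for tile_id, payload in tiles.items():
        index[tile_id] = (len(blob), len(payload))
        blob.extend(payload)
    return bytes(blob), index

def read_tile(stream, index, tile_id):
    """Read only the bytes of one tile (stand-in for an HTTP range request)."""
    offset, length = index[tile_id]
    stream.seek(offset)
    return stream.read(length)

# Hypothetical tiles keyed by (row, column).
tiles = {(0, 0): b"north-west", (0, 1): b"north-east", (1, 0): b"south-west"}
blob, index = build_blob(tiles)
stream = io.BytesIO(blob)  # stand-in for a remote object in cloud storage
print(read_tile(stream, index, (0, 1)))
```

The index is small compared with the data, so a client can fetch it first and then request only the byte ranges it needs.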

FME can help integrate cloud-native datasets stored on various cloud platforms, including Amazon S3, Azure and Google Cloud. Safe Software has added support for five key cloud-native formats to FME Form 2024.0.

 

Formats

SpatioTemporal Asset Catalog (STAC)

SpatioTemporal Asset Catalog is a specification for describing and indexing cloud-based assets that relate to a geographic area or time. The metadata is typically expressed as JSON, organized into Catalogs, Collections and Items, with each Item linking to its Assets. Although STAC was developed around raster data, it also supports vector products. For example, the Items in a STAC Collection can reference Assets such as GeoPackage layers or COG bands.
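
As a rough sketch of how a client might use this metadata, the Python below filters two heavily trimmed STAC Items by bounding box and date, then collects the links to their Assets. The Item ids, coordinates and `example.com` URLs are invented, and real Items carry more fields (such as `geometry`, `links` and `stac_version`); the helper names are hypothetical too.

```python
from datetime import datetime

# Two hypothetical, trimmed STAC Items (ids and hrefs are invented).
items = [
    {
        "type": "Feature",
        "id": "scene-a",
        "bbox": [-123.3, 49.0, -122.5, 49.4],
        "properties": {"datetime": "2024-01-15T00:00:00Z"},
        "assets": {"red": {"href": "https://example.com/scene-a/red.tif"}},
    },
    {
        "type": "Feature",
        "id": "scene-b",
        "bbox": [-80.0, 43.0, -79.0, 44.0],
        "properties": {"datetime": "2024-03-01T00:00:00Z"},
        "assets": {"red": {"href": "https://example.com/scene-b/red.tif"}},
    },
]

def bbox_intersects(a, b):
    """True when two [min_x, min_y, max_x, max_y] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def parse_time(text):
    """Parse a timestamp that ends in 'Z' (UTC)."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))

def find_asset_hrefs(items, bbox, start, end):
    """Return asset links for Items that overlap bbox and fall within [start, end]."""
    hrefs = []
    for item in items:
        when = parse_time(item["properties"]["datetime"])
        in_time = parse_time(start) <= when <= parse_time(end)
        if in_time and bbox_intersects(item["bbox"], bbox):
            hrefs.extend(asset["href"] for asset in item["assets"].values())
    return hrefs

print(find_asset_hrefs(items, [-124.0, 48.0, -122.0, 50.0],
                       "2024-01-01T00:00:00Z", "2024-02-01T00:00:00Z"))
```

Only the lightweight JSON is inspected during the search; the assets themselves are fetched afterwards, and only for the Items that matched.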

 

Cloud-Optimized GeoTIFF (COG)

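Cloud-Optimized GeoTIFF builds on the GeoTIFF raster specification. A COG is a valid GeoTIFF whose contents are organized into internal tiles and overviews (raster pyramids), usually compressed, so that a client can request only the tiles and resolution level it needs.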
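
To illustrate why tiling and pyramiding matter, here is a small Python sketch of the arithmetic a reader might do before requesting data. It assumes square tiles of 512 pixels and overviews at factors of 2, 4, 8 and 16; both are assumptions for the example, since real files declare their own tile size and overview levels, and the function names are hypothetical.

```python
def tiles_for_window(col_off, row_off, width, height, tile_size=512):
    """Return (tile_row, tile_col) pairs that cover a pixel window."""
    first_col = col_off // tile_size
    last_col = (col_off + width - 1) // tile_size
    first_row = row_off // tile_size
    last_row = (row_off + height - 1) // tile_size
    return [(r, c)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]

def pick_overview(full_res_width, target_width, factors=(1, 2, 4, 8, 16)):
    """Pick the coarsest overview factor that still offers target_width pixels."""
    usable = [f for f in factors if full_res_width // f >= target_width]
    return max(usable) if usable else 1

# A 100 x 100 pixel window that straddles a tile corner needs four tiles.
print(tiles_for_window(500, 500, 100, 100))
# Drawing a 10000-pixel-wide raster on a 1200-pixel-wide map.
print(pick_overview(10000, 1200))
```

A viewer zoomed out over the whole raster reads a handful of tiles from a coarse overview, while a viewer zoomed in reads a handful of tiles at full resolution; in both cases most of the file is never transferred.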

 

GeoParquet

GeoParquet is a cloud-friendly vector format built on the Apache Parquet standard. As a result, GeoParquet benefits from the mature set of applications, libraries and tools originally designed for Parquet. The format is column-oriented and supports a wide range of geometry types.
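
The benefit of a column-oriented layout can be shown with a toy table in plain Python. This is only a conceptual sketch and does not use a Parquet library: the column names and values are invented, and geometries are written as WKT purely for readability.

```python
# A tiny table stored column by column, as column-oriented formats do.
# Column names and values are invented for the illustration.
columns = {
    "name": ["Park A", "Park B", "Park C"],
    "area_ha": [12.5, 3.1, 40.0],
    "geometry": ["POINT (1 1)", "POINT (2 5)", "POINT (9 9)"],  # WKT for readability
}

def read_columns(table, wanted):
    """Return only the requested columns; the others are never touched."""
    return {name: table[name] for name in wanted}

def to_rows(table):
    """Re-assemble rows from the selected columns."""
    names = list(table)
    return [dict(zip(names, values)) for values in zip(*table.values())]

# An attribute-only query can skip the (often large) geometry column entirely.
subset = read_columns(columns, ["name", "area_ha"])
print(to_rows(subset))
```

In a row-oriented file the same query would have to read past every geometry to reach the next record's attributes.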

 

FlatGeobuf

FlatGeobuf is a vector format built on Google’s FlatBuffers serialization library. A single file holds a header, an optional spatial index and the features themselves. The index is not required, but when present it lets a reader locate the features in an area of interest, reducing the amount of data that needs to be transferred over a potentially slow network.
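
The role of the index can be sketched in Python as a lookup from bounding boxes to byte ranges. This is a simplified stand-in: FlatGeobuf's actual index is a packed Hilbert R-tree, whereas the example scans a flat list, and the index entries, offsets and function names are all invented.

```python
def intersects(a, b):
    """True when two (min_x, min_y, max_x, max_y) boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

# Hypothetical index entries: (feature bbox, byte offset, byte length).
index = [
    ((0, 0, 10, 10), 0, 120),
    ((5, 5, 20, 20), 120, 80),
    ((50, 50, 60, 60), 200, 300),
]

def byte_ranges(index, query_bbox):
    """Return (start, end) byte ranges, end exclusive, for features whose
    bbox overlaps the query. Ranges that touch are merged so that fewer
    requests are needed."""
    ranges = sorted((off, off + length)
                    for bbox, off, length in index
                    if intersects(bbox, query_bbox))
    merged = []
    for start, end in ranges:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

print(byte_ranges(index, (8, 8, 12, 12)))
```

Here the query touches the first two features, which sit next to each other in the file, so a single request for 200 bytes replaces a download of all 500.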

 

Zarr

Inspired by data cube formats such as NetCDF and HDF, Zarr stores chunked, multidimensional raster arrays and time series in a layout optimized for the web. In Zarr, each time step represents a separate band with its own properties.
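
Zarr splits each array into chunks that are stored as separate objects, so a reader only fetches the chunks that overlap its selection. The Python sketch below works out which chunks those are. It assumes a hypothetical array with dimensions (time, y, x), chunks of 1 x 256 x 256, and dot-separated chunk keys as used by default in version 2 of the Zarr specification; the function name is invented.

```python
from itertools import product

def chunk_keys(selection, chunk_shape):
    """Return the keys of the chunks touched by a selection.

    selection: one (start, stop) pair per dimension, stop exclusive.
    chunk_shape: chunk length per dimension.
    """
    per_dim = [range(start // size, (stop - 1) // size + 1)
               for (start, stop), size in zip(selection, chunk_shape)]
    return [".".join(str(i) for i in idx) for idx in product(*per_dim)]

# Time step 3, rows 200-299, columns 0-99 of the hypothetical array.
print(chunk_keys([(3, 4), (200, 300), (0, 100)], (1, 256, 256)))
```

With one time step per chunk along the first dimension, reading a single time step never touches the data for any other.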

 

Cloud-Optimized Point Cloud (COPC)

A Cloud-Optimized Point Cloud stores point clouds in a layout optimized for the web, allowing you to read what you need from large 3D data volumes. The format builds on the LAS/LAZ specification: a COPC file is a LAZ file whose points are organized in an octree.
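
The following Python sketch shows how a spatial query can be narrowed down with an octree, the kind of structure COPC uses to organize points. It is an illustration under simplifying assumptions: the root is a cube, every possible node is assumed to exist (a real reader would consult the file's hierarchy to see which nodes are present), and the `level-x-y-z` key notation and function names are chosen for the example.

```python
def node_bounds(root_min, root_size, level, x, y, z):
    """Bounds of octree node (level, x, y, z) inside a cubic root."""
    size = root_size / (2 ** level)
    lo = (root_min[0] + x * size, root_min[1] + y * size, root_min[2] + z * size)
    hi = tuple(v + size for v in lo)
    return lo, hi

def nodes_for_query(root_min, root_size, q_min, q_max, max_level):
    """List keys 'level-x-y-z' of the nodes that overlap the query box."""
    keys = []
    stack = [(0, 0, 0, 0)]
    while stack:
        level, x, y, z = stack.pop()
        lo, hi = node_bounds(root_min, root_size, level, x, y, z)
        if any(hi[i] < q_min[i] or lo[i] > q_max[i] for i in range(3)):
            continue  # node lies outside the query: skip its whole subtree
        keys.append(f"{level}-{x}-{y}-{z}")
        if level < max_level:
            for dx in (0, 1):
                for dy in (0, 1):
                    for dz in (0, 1):
                        stack.append((level + 1, 2 * x + dx, 2 * y + dy, 2 * z + dz))
    return sorted(keys)

# A 100 m cube queried for a small box near its origin, down to level 1.
print(nodes_for_query((0, 0, 0), 100.0, (10, 10, 10), (20, 20, 20), max_level=1))
```

Of the nine nodes in the first two levels, only two overlap the query, so the other seven (and everything beneath them) are never requested. Limiting `max_level` also gives a coarser, lighter preview of the same area.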

 

 
