Spatial Analysis on Unbounded Data Streams

Liz Sanderson
Introduction

When working with location-enabled streams, understanding the relationship between points in the incoming stream and other features is a key workflow. Proximity analysis can help efficiently filter features, clean data, and detect events in real-time.

(Diagram: proximity analysis with buffers on an unbounded data stream)

Examples

Create Buffers

Buffers enable areas to be created around either the incoming stream data or the feature you are comparing against. Once in place, buffers make it easy to determine if a feature is within a certain proximity of another feature.
 

Scenario

A city is monitoring the location of all of its operational vehicles (e.g., SUVs, refuse vehicles, maintenance vans). Two of these vehicles carry hazardous material and cannot go near schools. The sensors on the vehicles report their location every five seconds. Decision-makers in the city wish to receive alerts when a hazardous vehicle comes within 100 m, 200 m, or 300 m of a school.
 

Workflow

The workspace connects to the message broker; in this case, we are using Kafka. The JSON-formatted payload, which represents the location of the vehicle, is then flattened into individual attributes, and the attribute containing the WKT geometry is set as the geometry of the feature. Since we are only interested in hazardous vehicles, a Tester checks the vehicle IDs and lets only those vehicles through.
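The flatten-and-filter steps can be sketched in plain Python. This is a minimal illustration, not the FME transformers themselves; the message shape, attribute names, and vehicle IDs below are hypothetical:

```python
import json
import re

# Hypothetical IDs of the two hazardous vehicles; in the workspace,
# a Tester transformer performs this check.
HAZARDOUS_IDS = {"TRUCK-07", "TRUCK-12"}

def flatten(payload, prefix=""):
    """Flatten nested JSON into dot-separated attribute names."""
    flat = {}
    for key, value in payload.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat

def parse_wkt_point(wkt):
    """Extract (x, y) from a WKT 'POINT (x y)' string."""
    match = re.match(r"POINT\s*\(\s*([-\d.]+)\s+([-\d.]+)\s*\)", wkt)
    if not match:
        raise ValueError(f"Not a WKT point: {wkt}")
    return float(match.group(1)), float(match.group(2))

# One incoming Kafka message (hypothetical schema)
message = '{"vehicle": {"id": "TRUCK-07"}, "geometry": "POINT (491222.5 5459334.1)"}'
attrs = flatten(json.loads(message))
if attrs["vehicle.id"] in HAZARDOUS_IDS:
    attrs["_geometry"] = parse_wkt_point(attrs["geometry"])
```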

The schools for the entire city are also read, and buffers of 100 m, 200 m, and 300 m are created around them. The hazardous vehicles' locations are then compared against the buffers using the SpatialFilter. If a point falls within a buffer, the data is written to a database, where it can trigger downstream business processes.
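For circular buffers around point features, the SpatialFilter test reduces to a distance comparison. A minimal sketch, assuming projected coordinates in metres (the school name and coordinates are made up):

```python
import math

# Hypothetical school locations in a projected CRS (metres)
SCHOOLS = {"Maple Elementary": (0.0, 0.0)}
RINGS = (100.0, 200.0, 300.0)  # buffer distances from the scenario

def ring_hit(vehicle_xy, schools=SCHOOLS, rings=RINGS):
    """Return (school, ring) for the tightest buffer the vehicle falls
    inside, or None if it is clear of every buffer."""
    hits = []
    for name, (sx, sy) in schools.items():
        d = math.hypot(vehicle_xy[0] - sx, vehicle_xy[1] - sy)
        for r in rings:
            if d <= r:
                hits.append((name, r))
                break  # the smallest containing ring is enough
    return min(hits, key=lambda h: h[1]) if hits else None
```

A hit at the 100 m ring would warrant the most urgent alert; a None result means nothing is written to the database.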

 

Dynamic Geofences

The functionality of buffers can be extended by using dynamic buffers, or geofences, whose locations change over time. Proximity comparisons with incoming features are calculated based on the geofence's location at that moment.
 

Scenario

A ride-hailing company wishes to optimize its business by lowering wait times for potential riders who request a driver. The app frequently tracks the locations of both drivers and users. If a driver is not within 1 km of a user, drivers will be redistributed to be closer to app users.
 

Workflow

The workspace connects to the driver locations stream; in this case, our message broker is Kafka. The JSON-formatted payload, which represents the location of every vehicle, is then flattened into individual attributes, and the attribute containing the WKT geometry is set as the geometry of the feature.

The TimeWindower groups the drivers into 5-minute windows. Using the Sorter and Sampler, grouped by the window_id attribute, the most recent location of each driver in each window is kept. For each window, the Google BigQuery table that holds the current locations of all app users is queried and buffered. The SpatialFilter then checks whether any drivers fall outside the user buffers, and their vehicle IDs are passed into a separate driver redistribution workflow.
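The windowing and "latest fix per driver" logic can be sketched as follows. This is an illustrative stand-in for the TimeWindower, Sorter, and Sampler steps, assuming Unix timestamps and projected coordinates in metres; the driver IDs and locations are hypothetical:

```python
import math

WINDOW = 300  # 5-minute windows, in seconds

def latest_per_driver(events):
    """events: iterable of (timestamp, driver_id, x, y).
    Keep only the most recent fix per driver per time window."""
    latest = {}
    for ts, driver, x, y in events:
        key = (ts // WINDOW, driver)  # window_id + driver
        if key not in latest or ts > latest[key][0]:
            latest[key] = (ts, x, y)
    return latest

def drivers_outside(latest, users, radius=1000.0):
    """IDs of drivers with no user within `radius` metres --
    the candidates for redistribution."""
    out = set()
    for (window_id, driver), (ts, x, y) in latest.items():
        if all(math.hypot(x - ux, y - uy) > radius for ux, uy in users):
            out.add(driver)
    return out

events = [(10, "d1", 0.0, 0.0), (20, "d1", 5000.0, 0.0), (15, "d2", 100.0, 0.0)]
users = [(0.0, 0.0)]
to_redistribute = drivers_outside(latest_per_driver(events), users)
```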

For more information, and to see this workflow in action, see our video on Stream Processing: Geofencing.

Snap Data to a Network

At best, GPS data has an accuracy of a couple of meters. To perform network analysis, points need to be snapped to the network. Doing this on an unbounded stream is beneficial because the data can be filtered and cleaned before it is committed to the data store.
 

Scenario

An insurance company is trialing a new program in which drivers pay for insurance based on the distance they travel. There are 50,000 participants in the trial, and each has a sensor in their vehicle that tracks its location. The sensors report their location every five seconds.

The insurance company only wants to analyze vehicle movement on the primary routes (highway and main arterial roads). Vehicles on minor routes need to be filtered out, and then vehicles on the major routes need to be snapped to the road network so network analysis can be carried out downstream.
 

Workflow

The workspace connects to the message broker; in this case, we are using Kafka. The JSON-formatted payload, which represents the location of the vehicle, is then flattened into individual attributes, and the attribute containing the WKT geometry is set as the geometry of the feature.

A TimeWindower transformer is then used to sample the data so the vehicle's location is reported at a 30-second interval instead of every 5 seconds. The road network is also read, and using the NeighborFinder, vehicles within 5 m of the main routes are snapped to the network. Attributes from the road (speed limit, road name) are transferred to the vehicle point.
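The snap itself is a point-to-segment projection with a distance tolerance. A minimal sketch of that geometry, assuming the road network is a list of straight segments in a projected CRS (metres); the tolerance matches the 5 m used above:

```python
import math

def snap_to_segment(p, a, b):
    """Project point p onto segment a-b; return (snapped_point, distance)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    # Clamp the projection parameter to [0, 1] so we stay on the segment
    t = 0.0 if seg_len2 == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    sx, sy = ax + t * dx, ay + t * dy
    return (sx, sy), math.hypot(px - sx, py - sy)

def snap_to_network(p, segments, tolerance=5.0):
    """NeighborFinder-style snap: move p onto the closest road segment
    within `tolerance` metres, or return None (vehicle filtered out)."""
    best = min((snap_to_segment(p, a, b) for a, b in segments), key=lambda s: s[1])
    return best[0] if best[1] <= tolerance else None

road = [((0.0, 0.0), (100.0, 0.0))]  # one hypothetical road segment
```

Points returned as None are vehicles on minor routes (or bad GPS fixes) and drop out of the stream before loading.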

For more information, and to see this workflow in action, see our video on Stream Processing: Snap Data to a Network.

Calculate Distance

Being able to calculate the distance between a target feature and another feature is an important workflow when working with unbounded streams. Rather than storing all of the data from a stream, you can simply assess whether the distance between two features meets a threshold and, if it does, either store the data or trigger an event.
 

Scenario

A utility company wants to build a workflow that enables decision-makers to see which vehicles are closest to a new power outage so they can dispatch a crew to fix it. When a power outage occurs, the closest two vehicles need to be identified and an email sent to a decision-maker with a list of these vehicles.
 

Workflow

The workspace connects to the message broker; in this case, we are using Kafka. The JSON-formatted payload, which represents the power outage event, is then flattened into individual attributes.

The current locations of all vehicles are retrieved from Google BigQuery, and the NeighborFinder is then used to identify the two vehicles closest to the power outage event. These vehicles are then passed downstream via the FMEFlowJobSubmitter (formerly FMEServerJobSubmitter) transformer to another FME Flow process, which sends the email to decision-makers. An FME Flow (formerly FME Server) Automations writer could also be used.
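Finding the two closest vehicles is a k-nearest-neighbors query. A minimal sketch, assuming Euclidean distance in a projected CRS; the vehicle IDs and coordinates are hypothetical:

```python
import math

def closest_vehicles(outage_xy, vehicles, n=2):
    """Return the IDs of the n vehicles nearest the outage,
    closest first -- the NeighborFinder step with two neighbors."""
    ranked = sorted(
        vehicles.items(),
        key=lambda item: math.hypot(item[1][0] - outage_xy[0], item[1][1] - outage_xy[1]),
    )
    return [vehicle_id for vehicle_id, _ in ranked[:n]]

# Hypothetical fleet snapshot (as would be retrieved from BigQuery)
fleet = {"van-1": (10.0, 0.0), "van-2": (1.0, 0.0), "van-3": (5.0, 0.0)}
dispatch_list = closest_vehicles((0.0, 0.0), fleet)
```

The resulting list is what would be handed to the downstream FME Flow process for the notification email.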
