Data QA: Identifying Invalid Spatial Relationships

Liz Sanderson

FME Version

  • FME 2021.2


An invalid relationship is a spatial association between features that is illogical in some way; for example, a bicycle path that runs through a lake, or a street lighting column that appears inside a building footprint (like below):


Whether such a relationship is truly invalid depends on both the user's interpretation of the data and any special circumstances. For example, the bicycle path might run through the lake on a wooden walkway, or the street lighting column might genuinely be on the roof of a building or underneath an overhang (as at least one of the above appears to be).

Therefore FME can highlight possible problems, but only the end user can decide what is wrong and what is right, and how this should be fixed.

Invalid relationships also include any special rules that an organization may have for situations that are not invalid by nature. For example, a mapping organization may decree that a bicycle path is incorrect if it passes through an area of land not owned by the city. Or street lighting columns might be invalid if they are more than 50 metres apart, representing the distance at which lighting would become unacceptably dim (like below):
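To make the street lighting rule concrete, here is a minimal sketch in plain Python (not an FME workspace) that flags any column whose nearest neighbour is more than 50 metres away. The column IDs, coordinates, and function names are hypothetical, and the coordinates are assumed to be in a projected coordinate system measured in metres.

```python
import math

MAX_SPACING_M = 50.0  # threshold taken from the example rule above


def nearest_neighbour_distance(point, others):
    """Return the distance from `point` to the closest point in `others`."""
    return min(math.dist(point, other) for other in others)


def find_isolated_columns(columns, max_spacing=MAX_SPACING_M):
    """Return IDs of columns whose nearest neighbour is farther than max_spacing."""
    isolated = []
    for col_id, position in columns.items():
        others = [p for other_id, p in columns.items() if other_id != col_id]
        # A single column has no neighbour to compare against, so it is skipped.
        if others and nearest_neighbour_distance(position, others) > max_spacing:
            isolated.append(col_id)
    return isolated


# Hypothetical column positions along a street, in metres.
columns = {
    "A": (0.0, 0.0),
    "B": (40.0, 0.0),
    "C": (75.0, 0.0),
    "D": (140.0, 0.0),
}

print(find_isolated_columns(columns))  # only "D" is more than 50 m from its nearest neighbour
```

As with the other checks in this article, a flagged column is only a candidate problem; whether it is truly wrong is still a decision for the end user.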


Because there are various relationships that can be tested, there are various transformers in FME that can be used to test them. The following example and notes cover just a few of these.


Source Data

The source datasets for this example are a set of lines (in three Esri Shapefiles) representing cycle routes in the city of Vancouver, and a single polygon (again in Esri Shapefile format) representing the extents of the city of Vancouver.

The dataset looks like this in the FME Data Inspector:


The scenario here is that we wish to check whether any cycle route erroneously falls outside the city boundary; for example, because it passes through a harbour or creek, or because it partially crosses the boundary into a different municipality.


Step-by-Step Instructions

Part 1: Locating Invalid Cycle Paths

Follow these steps to learn how to identify cycle path features that have an invalid relationship with the city boundary.

1. Start FME Workbench and begin with an empty canvas

Select Readers > Add Reader from the menubar. In the dialog that opens set the data format to Esri Shapefile. Since both source datasets are Shapefile format we can use the same reader to read both of them. Hold shift to 


2. Clip features to the city boundary
The bicycle paths are long features and it would not be particularly useful to simply identify which lines overlap the city boundary; instead it's necessary to clip out the actual invalid parts of each line.

So, add a Clipper transformer to the workspace. Connect the VancouverLandBoundary dataset to the Clipper port and the bicycle path feature types to the Clippee port:


3. Run the workspace and inspect the output
Run the workspace with Feature Caching enabled and inspect the Clipper:Outside port.
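The Clipper:Outside port carries the parts of each Clippee line that lie outside the Clipper polygon. As a rough illustration of that geometry outside FME, the plain Python sketch below splits each line segment where it crosses the polygon boundary and keeps the pieces whose midpoint is outside. It is not how the Clipper is implemented; the function names and the square test polygon are hypothetical, and the sketch assumes a single polygon ring with no holes and ignores segments that run exactly along an edge.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if `point` lies inside the polygon ring."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def crossing_parameter(p, q, a, b):
    """Return t in (0, 1) where segment p-q crosses edge a-b, else None."""
    rx, ry = q[0] - p[0], q[1] - p[1]
    sx, sy = b[0] - a[0], b[1] - a[1]
    denom = rx * sy - ry * sx
    if denom == 0:  # parallel or collinear: ignored in this sketch
        return None
    t = ((a[0] - p[0]) * sy - (a[1] - p[1]) * sx) / denom
    u = ((a[0] - p[0]) * ry - (a[1] - p[1]) * rx) / denom
    return t if 0 < t < 1 and 0 <= u <= 1 else None


def point_at(p, q, t):
    """Point at parameter t along segment p-q."""
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))


def outside_pieces(line, polygon):
    """Return the (start, end) pieces of `line` that fall outside `polygon`."""
    pieces = []
    n = len(polygon)
    for p, q in zip(line, line[1:]):
        cuts = {0.0, 1.0}
        for i in range(n):
            t = crossing_parameter(p, q, polygon[i], polygon[(i + 1) % n])
            if t is not None:
                cuts.add(t)
        ordered = sorted(cuts)
        for t0, t1 in zip(ordered, ordered[1:]):
            midpoint = point_at(p, q, (t0 + t1) / 2)
            if not point_in_polygon(midpoint, polygon):
                pieces.append((point_at(p, q, t0), point_at(p, q, t1)))
    return pieces


# Hypothetical 10 x 10 "city boundary" and a path that leaves it.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
path = [(5, 5), (15, 5), (15, 8)]
for start, end in outside_pieces(path, square):
    print(start, "->", end)
```

Returning only the outside pieces, rather than a yes/no overlap flag per line, mirrors the reason given above for clipping: long paths are only useful to review once the offending parts have been isolated.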

It's clear that there are several locations where the bicycle paths extend beyond the Vancouver land boundary, but it's not so clear which dataset is correct unless we can see a background map.

So in the FME Data Inspector select Tools > FME Options and select a background map to display. Alternatively select File > Add Dataset, set the format to GeoTIFF (Geo-referenced Tagged Image File Format) and select the GeoTIFF files mentioned in the Downloads section.


Now we can see whether the cycle path data or the land boundary data is correct (in the above screenshot the cycle path is obviously correct). Examine all clipped sections of bike path to see if any need fixing.


Part 2: Counting Invalid Cycle Paths

Counting the number of bad features is quite easy because we have already filtered them out. For example, even the Workbench feature counts show us there are 28 bad pieces. But it would be useful to know how many cycle paths have a problem and how many bad pieces there are per cycle path.

Storing such a count in an attribute is simple with the StatisticsCalculator transformer.

4. Add two StatisticsCalculators and connect them to the Clipper:Outside port
Make sure that it is the Summary output port that is connected onward from the first StatisticsCalculator, and the Complete output port from the second.

Open the parameters dialog for the first StatisticsCalculator. This will be used to tell us how many bad pieces there are in each path. So set the Group By parameter to PathId.

Next select PathId as the Attribute to Analyze. In truth it doesn't really matter which attribute we select, since we only want a count of features.

Click under the Total Count column to add it as a Statistic to Analyze. That will provide a count of the bad sections for each path.


Click OK to close the dialog.
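The grouping performed by the first StatisticsCalculator can be pictured with a short plain Python sketch: one count per PathId value. The feature list below is a hypothetical stand-in for the features leaving the Clipper:Outside port; only the PathId attribute name comes from the steps above, and the values are made up.

```python
from collections import Counter

# Hypothetical stand-ins for the features leaving the Clipper:Outside port.
outside_features = [
    {"PathId": 12},
    {"PathId": 12},
    {"PathId": 40},
    {"PathId": 12},
    {"PathId": 7},
    {"PathId": 40},
]


def count_pieces_per_path(features, group_by="PathId"):
    """Count features per group, similar in spirit to Total Count with Group By."""
    return Counter(feature[group_by] for feature in features)


bad_pieces_per_path = count_pieces_per_path(outside_features)

# One line per path, comparable to one summary feature per group.
for path_id, total in sorted(bad_pieces_per_path.items()):
    print(f"PathId {path_id}: {total} bad piece(s)")

# The number of distinct groups is the number of paths with a problem.
print(f"{len(bad_pieces_per_path)} paths have at least one bad piece")
```

The sketch also shows why the choice of Attribute to Analyze does not matter for a count: only the number of features in each group is used, not any attribute value.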


Part 3: Fixing Invalid Cycle Paths

Fixing invalid features like this is generally not possible using FME, because of the need for user validation of the bad features. In fact, if you examine the output from this example you'll see that in all cases the "bad" sections of cycle paths are caused by either a) passing over a bridge, or b) incorrect land boundary geometry.



Notes

1: Instead of adding the GeoTIFF data to the Data Inspector we could turn on a background map, or even add the GeoTIFF as an automatic backdrop.

2: To find problem point features, like the street lighting columns located inside a building, follow the instructions for Point-in-Polygon processing under the Common GIS Operations tutorial.

3: To find problem features such as incorrectly overlapping areas, check the Slivers and Overlaps article in this tutorial, or the Extracting Polygon Intersections article in the Common GIS Operations tutorial.


Data Attribution

The data used here originates from open data made available by the City of Vancouver, British Columbia. It contains information licensed under the Open Government Licence - Vancouver.
