Data QA: Identifying Self-Intersections with FME

Liz Sanderson

FME Version

  • FME 2021.2

Introduction

Self-intersections (also known as "loops", "bowties" or "fishtails") are features whose boundary twists around such that it intersects itself, causing a loop:

selfintersection1.png


The left-hand example shows where the end point of a polygon did not meet the start point, causing the polygon to close itself with a loop.

The right-hand feature is an example of points that are perhaps out of sequence. It comes very close to being a line that directly reverses back on itself and very close to being a spike, but the layout actually causes a loop to form.

In most cases, self-intersections are not as obvious as these. The left-hand example is often very much smaller - to the point of being invisible - and the right-hand example often has angles that are so acute it looks like a single line.
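To make the definition concrete, here is a minimal Python sketch of a self-intersection test for a closed ring. It is an illustration only, not how the GeometryValidator works internally. The function names and sample coordinates are assumptions, and it only detects edges that properly cross; touching vertices and collinear overlaps are ignored.

```python
def orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p): >0 left turn, <0 right turn."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a single interior point."""
    d1, d2 = orientation(a, b, c), orientation(a, b, d)
    d3, d4 = orientation(c, d, a), orientation(c, d, b)
    return d1 * d2 < 0 and d3 * d4 < 0


def has_self_intersection(ring):
    """ring is a closed list of (x, y) vertices (first vertex == last vertex)."""
    edges = list(zip(ring[:-1], ring[1:]))
    n = len(edges)
    for i in range(n):
        # Start at i + 2: neighbouring edges share a vertex and cannot properly cross.
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edges share the closing vertex
            if segments_cross(*edges[i], *edges[j]):
                return True
    return False


# Invented sample rings: a valid square and a "bowtie" with two vertices out of sequence.
square = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
bowtie = [(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)]

print(has_self_intersection(square))
print(has_self_intersection(bowtie))
```

The pairwise comparison is quadratic in the number of edges, which is fine for illustrating the idea on a single building outline but is not meant to reflect a production algorithm.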

FME incorporates the ability to locate and fix self-intersections using the GeometryValidator transformer.

In this example, we will look at identifying and fixing self-intersections in a dataset containing building outlines.

There are a lot of buildings so (for the sake of simplicity) we'll say that a user has reported two buildings that contain problems. They have the ID numbers 102521333 and 301873712 and it is up to us to investigate.
  

Step-by-Step Instructions

Locating Self-Intersections

Follow these steps to learn how to locate self-intersecting features with a GeometryValidator transformer.
 

1. Start FME Workbench and begin with an empty canvas. Select Readers > Add Reader from the menu bar.

Set the data format to OpenStreetMap (OSM) XML and select the attached OSM dataset as the source. Set the Workflow Options to Single Merged Feature Type (to make sure all building objects are read as a single layer) and click OK to add the reader.

selfintersection3.png

 

2. Add a Tester transformer.
Set up the Tester to pass features where ID=102521333 or ID=301873712:

selfintersection4.png

NB: We could just pass all buildings into the GeometryValidator and ignore the features that emerge from its Passed port, but for the purposes of this exercise the workspace runs faster and the errors are easier to spot when the suspected features are already isolated.
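The logic of the Tester in step 2 can be sketched in Python as follows. Here features are represented as plain dictionaries with an "ID" key, which is an assumption made for illustration; the function name and the third sample ID are also invented.

```python
# The two building IDs reported by the user.
SUSPECT_IDS = {"102521333", "301873712"}


def split_by_id(features, suspect_ids=SUSPECT_IDS):
    """Route each feature to 'passed' if its ID matches either suspect ID, else 'failed'."""
    passed, failed = [], []
    for feature in features:
        # Compare as strings so numeric and text IDs are treated alike.
        if str(feature.get("ID")) in suspect_ids:
            passed.append(feature)
        else:
            failed.append(feature)
    return passed, failed


# Invented sample input: two suspect buildings and one unrelated building.
buildings = [{"ID": "102521333"}, {"ID": "999"}, {"ID": 301873712}]
passed, failed = split_by_id(buildings)
print(len(passed), len(failed))
```

As with the Tester, only the features in the passed list would continue on to the validation step.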
 

3. Place a GeometryValidator transformer connected to the Tester:Passed port.
Open the parameters and select Self-Intersection in 2D from the drop-down list for Issues to Detect. Set Attempt Repair to No. Connect Inspector transformers to the GeometryValidator output ports and run the workspace. The Failed and IssueLocations output will look like this:

selfintersection5b.png

Map tiles by Stamen Design, under CC-BY-3.0. Data by OpenStreetMap, under CC-BY-SA.


The left-hand building, if it is a single polygon, has obvious problems: its outer perimeter crosses itself at several locations. Querying the feature confirms that it is indeed a single polygon with a single perimeter.

The right-hand building has less obvious problems. You will need to zoom in very closely to the top right-hand (north-east) corner of the building to see the problem:

selfintersection6.png


Incidentally, that little offset is 0.7mm (about 0.03") so you can see how a tiny mistake like that can produce a problem geometry. We now have all of the features that contain a self-intersection, plus a point feature marking the location of each self-intersection.
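The idea of emitting a point for each self-intersection can be sketched in Python. This is an assumed, simplified analogue of the IssueLocations output, not the transformer's actual algorithm. The names and coordinates are invented, and only proper edge crossings are reported.

```python
def cross(o, a, b):
    """Cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def crossing_point(a, b, c, d):
    """Return the point where segments ab and cd properly cross, or None."""
    d1, d2 = cross(a, b, c), cross(a, b, d)
    d3, d4 = cross(c, d, a), cross(c, d, b)
    if d1 * d2 < 0 and d3 * d4 < 0:
        # cross(c, d, P) varies linearly along ab from d3 to d4; solve for zero.
        t = d3 / (d3 - d4)
        return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
    return None


def issue_locations(ring):
    """List the crossing points of a closed ring (first vertex == last vertex)."""
    edges = list(zip(ring[:-1], ring[1:]))
    n = len(edges)
    points = []
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # these two edges meet at the closing vertex
            p = crossing_point(*edges[i], *edges[j])
            if p is not None:
                points.append(p)
    return points


# Invented sample ring: a bowtie whose two diagonals cross in the middle.
bowtie = [(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)]
print(issue_locations(bowtie))
```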
 

Counting Self-Intersections

Counting self-intersections is very simple with the StatisticsCalculator transformer. The only decision is whether to count the number of problem features, or the number of self-intersections, which is not necessarily the same thing.

4. Place a StatisticsCalculator transformer.
To count invalid features, connect it to the GeometryValidator:Failed output port. To count self-intersections, connect it to the GeometryValidator:IssueLocations output port.


5. Set up the parameters for the StatisticsCalculator transformer.
Select any attribute in the Attribute to Analyze parameter (it doesn't matter which one). Now set the Total Count Attribute to a new attribute name such as NumberBadFeatures or NumberSelfIntersections, whichever is appropriate.
2021-11-08_13-38-43.png


6. Attach an Inspector transformer to a StatisticsCalculator output port. The Summary port outputs a single feature; the Complete port outputs all of the original features.
Run the workspace. You should now have a count of the number of bad features (2) or the number of self-intersections (4).
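The difference between the two counts can be illustrated with a short Python sketch. The records below are hypothetical: the feature IDs "A" and "B", the coordinates, and the three-plus-one split between features are invented, chosen only so that the totals match the two bad features and four self-intersections reported above.

```python
from collections import Counter

# Hypothetical issue-location records: one per self-intersection, each tagged
# with the feature it belongs to. All values here are invented.
issue_locations = [
    {"feature_id": "A", "x": 1.0, "y": 1.0},
    {"feature_id": "A", "x": 3.0, "y": 1.0},
    {"feature_id": "A", "x": 5.0, "y": 1.0},
    {"feature_id": "B", "x": 9.0, "y": 4.0},
]

# Number of self-intersections found on each feature.
per_feature = Counter(rec["feature_id"] for rec in issue_locations)

number_bad_features = len(per_feature)                   # distinct problem features
number_self_intersections = sum(per_feature.values())    # every issue point

print(number_bad_features, number_self_intersections)
```

A single feature can contribute several issue points, which is why the two totals differ.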

 

Fixing Self-Intersections

There are a number of ways in which a self-intersection could be fixed. For example, a gap could be opened at the point of intersection so that the two pieces of linework don't cross:

selfintersection7.png
 

Another solution is to divide the feature into two (or more) polygons, splitting it at the intersection point. In fact, this is the approach that FME takes.
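The splitting idea can be sketched in Python for the simplest case. This is an assumption-laden illustration rather than FME's repair algorithm: it handles only the first proper crossing it finds, the function names are invented, and the sample ring is made up.

```python
def cross(o, a, b):
    """Cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def crossing_point(a, b, c, d):
    """Return the point where segments ab and cd properly cross, or None."""
    d1, d2 = cross(a, b, c), cross(a, b, d)
    d3, d4 = cross(c, d, a), cross(c, d, b)
    if d1 * d2 < 0 and d3 * d4 < 0:
        t = d3 / (d3 - d4)
        return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
    return None


def split_at_first_crossing(ring):
    """Split a closed ring into two closed rings at its first proper crossing.

    Returns [ring] unchanged if no crossing is found.
    """
    verts = ring[:-1]  # drop the duplicated closing vertex
    n = len(verts)
    for i in range(n):
        a, b = verts[i], verts[(i + 1) % n]
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # these two edges meet at the closing vertex
            c, d = verts[j], verts[(j + 1) % n]
            p = crossing_point(a, b, c, d)
            if p is not None:
                # One loop runs from the crossing along vertices i+1..j and back.
                loop1 = [p] + verts[i + 1 : j + 1] + [p]
                # The other loop takes the remaining vertices.
                loop2 = [p] + verts[j + 1 :] + verts[: i + 1] + [p]
                return [loop1, loop2]
    return [ring]


# Invented sample ring: a bowtie that splits into two triangles.
bowtie = [(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)]
parts = split_at_first_crossing(bowtie)
print(parts)
```

A ring with several crossings would need the split applied repeatedly to each resulting part; that is left out to keep the sketch short.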
 

7. Check the GeometryValidator transformer's parameters
Set the Attempt Repair parameter to Yes.
 

8. [Optional] Reassign the Coordinate System
To work around a fault with the GeometryValidator (it now drops the coordinate system of failed features), place a CoordinateSystemSetter transformer between the GeometryValidator:Failed port and the connected Inspector transformer. Open the parameters and set the Coordinate System parameter to LL84.
 

9. Re-run the workspace and inspect the output
Take a look at the Repaired output. Notice that FME has split each original single-polygon feature into a multi-part feature consisting of several polygons/outlines:

selfintersection8.png


We now have a set of data that has been cleaned of self-intersections. If automatic cleaning is not desirable, the intersection points can instead be used to identify locations to check and resolve manually.

 

Data Attribution

OpenStreetMap Datasets: © OpenStreetMap contributors. See http://www.openstreetmap.org/copyright
