Data QA: Identifying Short Line Features

Liz Sanderson

FME Version

  • FME 2021.2

Introduction

Short lines are those whose length is less than a specified tolerance. They can be found by measuring the length of each line and then applying a test condition with a filter transformer.

Testing for short lines is a good QA test because lines less than a certain length are usually indicative of problems such as duplicate vertices, invalid network edges, dangles (overshoots), and generally poor linework.

For example, the arrows below mark two extremely short lines, dangling from where the road network was clipped against a polygon boundary that didn't quite match:

qa-shortlines-1.png

It's very simple to count how many of these bad features exist. However, as discussed below, fixing lines like this automatically is more difficult.

 

Source Data

The source dataset for this example is a set of lines (in an AutoCAD DWG dataset) representing roads in the city of Vancouver.

The dataset looks like this in the FME Data Inspector:

qa-shortlines-2.png

The scenario here is that we wish to create a proper edge/node network, but recognize that we should add some QA checks to ensure that no bad linework is being used.

 

Step-by-Step Instructions

Part 1: Locating Short Lines

Follow these steps to learn how to identify short line features.
 

1. Start FME Workbench and create a new workspace

Select Readers > Add Reader from the menubar.

Set the data format to Autodesk AutoCAD DWG/DXF. Select the attached dwg file as the source dataset. Click the Parameters button and set Group Entities By to "Attribute Schema".

Click OK and OK again to add the reader.
 

2. Calculating length of the lines
The source dataset is made up of line features. To test their length we first have to measure them using a LengthCalculator transformer.

So, add a LengthCalculator transformer. Connect all of the reader feature types to the LengthCalculator input port.

qa-shortlines-3.png


3. Filtering features based on length
Assessing the length of a feature requires a transformer that can filter features based on a particular condition. The most common transformer for this, and the easiest to use, is the Tester.

So, add a Tester transformer and connect it to the LengthCalculator:Output port.
 

4. Set up Tester parameters
How short a line needs to be before it becomes suspect is a subjective decision. However, a road link shorter than it is wide appears to be an obvious candidate. Wikipedia tells us that an average road lane is 3 metres wide. Most roads in Vancouver are made up of two lanes. Therefore we should flag all road segments that are less than six metres in length.

So, open the Tester parameters dialog. Set a test condition to check for road features whose measured length is less than 6:

qa-shortlines-4.png
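
For readers who want to see the logic outside of Workbench, the measure-then-test idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how FME implements the LengthCalculator or Tester; the function names and the sample coordinates are made up for the example, and lengths are assumed to be in the same ground units as the tolerance.

```python
import math

def line_length(coords):
    """Sum the straight-line distances between consecutive (x, y) vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def split_by_length(lines, tolerance=6.0):
    """Return (short, ok) lists, loosely mirroring the Tester's Passed/Failed ports."""
    short, ok = [], []
    for coords in lines:
        if line_length(coords) < tolerance:
            short.append(coords)
        else:
            ok.append(coords)
    return short, ok

# Hypothetical sample lines, not taken from the Vancouver dataset
roads = [
    [(0, 0), (3, 4)],          # a single 3-4-5 segment
    [(0, 0), (30, 40)],        # a much longer segment
    [(0, 0), (1, 0), (1, 1)],  # two unit-length segments
]
short, ok = split_by_length(roads)
```

Here the first and third lines fall under the six-metre tolerance and would be flagged; the second would not.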

 

5. Inspect the results
Run the workspace and inspect the output by clicking on the green magnifying glass next to each Tester output port. When the output appears, adjust the feature colors in the FME Data Inspector to make the short lines stand out more.

qa-shortlines-5.png

Examine the red features to see if these really are problem features and why. You'll find there are isolated problems spread throughout the city.
 

6. Identifying the type of roads affected
Let's pretend we want to identify the type of roads that are affected; i.e. are they Private, Residential, Secondary, etc. The first step to doing so is to expose that information as an attribute.

So, click the gear button on any of the source feature types to open its properties dialog. Click the Format Attributes tab and locate the attribute fme_feature_type.

qa-shortlines-6.png

Put a check-mark against the attribute to expose it (make it available within the workspace).
 

7. Inspecting by type of road
Add an Inspector to the canvas. Connect it to the Tester:Passed port. Open the parameters dialog for the Inspector.

Click the browse button for the Group By setting and select the newly exposed fme_feature_type attribute.

Re-run the workspace. Each type of road will be separated into its own layer in the Data Inspector. You'll see that of the 14 bad features, 4 are arterial roads, 7 are residential, and 3 are secondary.

 

Part 2: Counting Short Lines

Counting the number of bad features is quite easy because we have already filtered them out. For example, even the Workbench feature counts show us the numbers involved:

qa-shortlines-7.png

Storing the count in an attribute is simple with the StatisticsCalculator transformer.

Follow these steps to learn how to count short line features.
 

8. Create a count stored in an attribute
Add a StatisticsCalculator between the Tester:Passed port and its Inspector transformer. Open the parameters dialog.

First, select _length as the Attribute to Analyze. In truth, it doesn't really matter which attribute we select, since we only want a count of features.

Delete all the values from the Calculate Attributes field and then add "BadFeatures" under the Total Count field. That will provide a count of the bad features. Click OK to close the dialog.
2021-12-07_15-34-46.png

Re-run the workspace. This time the output should include an attribute that denotes how many bad features there are of each type.

NB: If you connected the StatisticsCalculator:Summary output port to the Inspector, there will only be a single output feature. To get all output features, ensure that the Complete port is connected.
 

9. Group the count based on road type
If you want a count of the bad features (as an attribute) based on the road type, re-open the StatisticsCalculator parameters dialog and set the Group By parameter to the attribute fme_feature_type.

Run the workspace and you will see the number of bad features for each type of road.
2021-12-07_15-36-29.png
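
The grouped count can also be pictured as a small Python sketch. It assumes each feature is a plain dictionary carrying the `_length` and `fme_feature_type` attributes used above; the function name and the sample values are hypothetical, and this is not a description of the StatisticsCalculator's internals.

```python
from collections import Counter

def count_short_by_type(features, tolerance=6.0):
    """Attach a per-road-type count of short lines to each short feature."""
    short = [f for f in features if f["_length"] < tolerance]
    counts = Counter(f["fme_feature_type"] for f in short)
    # Every short feature keeps its attributes and gains the group count,
    # similar in spirit to reading features from the Complete port.
    return [dict(f, BadFeatures=counts[f["fme_feature_type"]]) for f in short]

# Hypothetical features, not the real dataset
features = [
    {"fme_feature_type": "Residential", "_length": 2.5},
    {"fme_feature_type": "Residential", "_length": 4.0},
    {"fme_feature_type": "Arterial", "_length": 5.9},
    {"fme_feature_type": "Arterial", "_length": 120.0},
]
result = count_short_by_type(features)
```

Each returned feature carries the count for its own road type, so the two Residential features both report a count of 2.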

 

Part 3: Fixing Short Lines

There is no simple solution that will fix short lines automatically because there are a number of different ways in which short lines can occur. The simplest case to tackle is a short line that should be merged into its neighbor using the LineCombiner transformer. However, as the following example shows, trying to fix short lines automatically can sometimes introduce more errors than it repairs and can become quite a complex process.

12. Add a LineCombiner transformer
Add a LineCombiner transformer to the workspace (known as the LineJoiner before FME 2018) and insert it between the reader feature types and the LengthCalculator transformer:

qa-shortlines-8.png

 

13. Set up the LineCombiner parameters
Open the LineCombiner transformer's parameters dialog. This transformer has a number of parameters to control how it operates. One danger here is unwanted connections, since there is no way to tell it that only the short features should be joined.

So, set the Combine On Attributes parameter to "HBLOCK" and the Consider Node Elevation parameter, under the Advanced drop-down parameters, to "Yes".

Now lines will not be connected unless they are part of the same road, and not if they pass over each other at a different elevation.
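
To make the idea of attribute-restricted joining more concrete, here is a rough Python sketch. It is a simplified stand-in, not the LineCombiner's actual algorithm: within each group it joins two lines wherever exactly two line ends meet at a node. The function name, the dictionary layout, and the HBLOCK values are assumptions made for the example. Nodes are compared as whole coordinate tuples, so with (x, y, z) tuples two ends at different elevations would not match.

```python
from collections import defaultdict

def combine_lines(features, key):
    """Join lines end to end within each group defined by the attribute `key`."""
    groups = defaultdict(list)
    for f in features:
        groups[f[key]].append(list(f["coords"]))

    result = []
    for value, lines in groups.items():
        merged = True
        while merged:
            merged = False
            # Record which lines have an end at each node
            ends = defaultdict(list)
            for i, line in enumerate(lines):
                ends[line[0]].append(i)
                ends[line[-1]].append(i)
            for node, idxs in ends.items():
                # Join only where exactly two different lines meet
                if len(idxs) == 2 and idxs[0] != idxs[1]:
                    a, b = lines[idxs[0]], lines[idxs[1]]
                    if a[-1] != node:
                        a = a[::-1]  # orient a so that it ends at the node
                    if b[0] != node:
                        b = b[::-1]  # orient b so that it starts at the node
                    rest = [l for j, l in enumerate(lines) if j not in idxs]
                    lines = rest + [a + b[1:]]
                    merged = True
                    break
        result.extend({key: value, "coords": l} for l in lines)
    return result

# Hypothetical features: two pieces of one block plus a side street
features = [
    {"HBLOCK": "100 MAIN ST", "coords": [(0, 0), (5, 0)]},
    {"HBLOCK": "100 MAIN ST", "coords": [(5, 0), (60, 0)]},
    {"HBLOCK": "200 OAK ST", "coords": [(5, 0), (5, 40)]},
]
combined = combine_lines(features, "HBLOCK")
```

In this sample the two Main Street pieces are joined even though Oak Street meets them at the same node, because each group is processed on its own.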
 

14. Fix inadvertently joined lines
Even making use of the LineCombiner parameters, some lines may get inadvertently connected together. For example, two segments in the same block that should be split by an incoming connection are now joined together:

qa-shortlines-10.png

 

To avoid this, add an Intersector transformer between the LineCombiner and LengthCalculator transformers.

qa-shortlines-9.png
 

15. Re-run the workspace

You will see that some of the problem features have been cleaned up. For example, look for the feature with a StreetId of 11255. Before running the cleanup workspace it was an unnecessary short section. Afterwards, the LineCombiner and Intersector have merged the feature and intersected the block at the junction with StreetId 11342.

However, other features have been unnecessarily intersected (see StreetId 13545):

qa-shortlines-11.png

So this is a case where the intersection solves some problems, but causes others.
 

16. Re-join erroneously intersected features
What we need to do now is re-join features that were erroneously intersected, leaving those that were intersected because they were part of an original merge. This is where things start to get more complex, but we can tell one feature from another by comparing the number of parts it was broken into, to the number of parts it had after the LineCombiner.

Open the parameters dialog for the LineCombiner. Check the box labeled Generate List and enter a list name (such as MyList, for example). Set Add to List to All Attributes. This will create an FME list - an attribute with multiple values - that tells us which (and how many) road segments are being joined.
 

17. Add ListElementCounter and Tester transformers, and a second LineCombiner - all connected at the end of the workspace like so:

qa-shortlines-13.png

Open the ListElementCounter parameters dialog and set the List Attribute to MyList{}. That will record the number of parts to each feature originally joined together.

Open the Tester parameters dialog. Set up a test condition for _element_count = _segments. This tells us whether to join features back together: if the LineCombiner and Intersector had no effect on the number of parts, then the feature should not have been altered and needs to be re-joined.

Finally, open the parameters dialog for the second LineCombiner and set the Group By parameter to StreetId. This joins lines together based on StreetId, putting back together streets that were unnecessarily split.
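
The test-and-regroup logic of this step can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: features are plain dictionaries carrying the `_element_count`, `_segments`, and `StreetId` attributes named above, while the function names and all sample values are hypothetical.

```python
from collections import defaultdict

def select_for_rejoin(features):
    """Split features on the condition _element_count = _segments."""
    passed, failed = [], []
    for f in features:
        if f["_element_count"] == f["_segments"]:
            passed.append(f)
        else:
            failed.append(f)
    return passed, failed

def group_by_street(features):
    """Collect the pieces that share a StreetId so they can be joined again."""
    groups = defaultdict(list)
    for f in features:
        groups[f["StreetId"]].append(f)
    return dict(groups)

# Hypothetical pieces coming out of the Intersector
pieces = [
    {"StreetId": 1001, "_element_count": 2, "_segments": 2},
    {"StreetId": 1001, "_element_count": 2, "_segments": 2},
    {"StreetId": 1002, "_element_count": 3, "_segments": 2},
]
passed, failed = select_for_rejoin(pieces)
to_rejoin = group_by_street(passed)
```

In this sample, the two pieces of street 1001 pass the test and end up in one group ready to be joined, while the piece of street 1002 is left as it is.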
 

18. Run the workspace and inspect the outputs

You may - if you wish - add a second LengthCalculator/Tester combination to see whether there are any remaining short lines:
2021-12-07_16-16-27.png
 

Notice that the output now has most of the original short lines fixed.

You may also have noticed that there are now more short lines than originally! This is because this technique has split up lines at intersections where there was previously no node. In other words, it's found lines that had been incorrectly noded and fixed that too - revealing some short lines in the process.

 

Data Attribution

The data used here originates from open data made available by the City of Vancouver, British Columbia. It contains information licensed under the Open Government License - Vancouver.

 
