FME Version
Files
-
- 70 KB
- Download
Introduction
There are multiple ways of joining data together using FME. This example will look at using the FeatureMerger transformer. The FeatureMerger copies and merges the attributes/geometry from one feature with another feature by carrying out a non-spatial join using key values. Please read the FeatureMerger documentation for a more detailed description of what this transformer does as well as details about each parameter.
If you are used to working with SQL, try using the FeatureJoiner transformer instead. The FeatureMerger handles joining data differently, in that by default it only creates a single match per source feature. The FeatureJoiner supports multiple matches.
Using the FeatureMerger
The FeatureMerger receives two streams of features via its input port:
- Requestor: Requestors are the features that will receive new attributes and/or geometry.
- Supplier: Suppliers provide attributes and/or geometry to be merged onto the Requestors.
Key things to remember:
- If the Requestor already has an attribute that the Supplier also has, the Requestor's original value for that attribute can be preserved or overridden.
- A single Supplier may be used by many Requestors.
- Many Suppliers can be merged on to a single Requestor.
- When attribute names conflict, you can choose whether Requestor or Supplier attribute values are maintained by using the Conflict Resolution parameter and whether null values follow the same behavior by using the Ignore Nulls parameter.
Many Requestors to One Supplier:
The People table is the Requestor and the Jobs table is the Supplier. Since each person is a unique value the Jobs table is merged successfully onto each record in the Person table. They were merged with the key Job.
One Requestor to Many Suppliers (Process Duplicate Suppliers Disabled):
However, now we have switched the tables. The Jobs table is now the Requestor and the People Table is the Supplier. Only Joe is merged with the IT job since it is the first occurrence of an IT job, the remaining people with IT jobs were sent to the rejected port. This is because there are duplicate suppliers. To process duplicate suppliers we need to create a list:
One Requestor to Many Suppliers (Process Duplicate Suppliers Enabled):
Now to handle the duplicate suppliers coming from the People Table we need to create a list. This is done by enabling the Generate List parameter in the FeatureMerger. For this example, we created a list name called SameJob and only created the list with the attribute Name. So now when we run the translation and inspect the features, in the Feature Information Window we can see a list for each person with the same job. Alternatively, you can explode the list using a ListExploder after the FeatureMerger to see each list item in a table.
.
How to Use the FeatureMerger
Now that we have discussed some theory and hypothetical FeatureMerger scenarios, let's look at an example using the FeatureMerger.
Scenario: We want to merge voting divisions and locations data with the voter results for mayor, but we want to keep the geometry. To do this we will use a FeatureMerger transformer.
1. Open FME Workbench
In a blank workspace, add a GML (Geography Markup Language) reader and browse to the ElectionVoting.gml dataset, add both the VotingDivisions and VotingPlaces Feature Types.
Add a second reader to the canvas. This time choose Microsoft Excel as the Format and browse to the ElectionResults.xlsx dataset. Only add the Mayor sheet.
2. Merge Features
Add a FeatureMerger transformer to the canvas. Connect the VotingDivisions to the Requestor input port. Then connect the Mayor Feature Type to the Supplier input port on the FeatureMerger.
VotingDivisions connected to the Requestor port, Mayor connected to Supplier port
In the FeatureMerger parameters set the Join On to Division for both the Requestor and the Supplier.
FeatureMerger Join On parameter Requestor and Supplier set to Division
Then under Merge Parameters set the Reject Null and Missing Keys to Yes because we don’t want any values without a Division associated with it. This will send any features to the <Rejected> port that there is no value for Division on any of the tables we are trying to merge.
We want to leave Process Duplicate Suppliers unchecked because there should be only one record for each division. When we run the translation we should double check the UnusedSuppliers output. This is a form of QA test to see if there are either duplicate or different values that we should investigate. The rest of the parameters can just be left as the defaults.
Reject Null and Missing Keys set to Yes in the FeatureMerger
3. Run the Translation
Turn on Run with Feature Caching (2018.0+) or Run with Full Inspection (2017.1 or older) then run the translation. You should get 55 Merged features.
55 features output through the Merged port on the FeatureMerger.
4. Inspect Output
Inspect the Merged feature output, we should now have the geometry of the Divisions as well as the attributes from Mayor.
FeatureMerger Merged output port inspected in the Data Inspector.
5. Merge Voting Places
Next, we will merge the VotingPlaces with the Merged output from the FeatureMerger to add all of the VotingPlaces attributes to each Division.
To do this, add a second FeatureMerger to the canvas. Connect the Merged output port on the first FeatureMerger to the Requestor input port on the second FeatureMerger. Then connect VotingPlaces to the Supplier input port on the second FeatureMerger. To prevent your connections from crossing, right click on the Requestor port on the second FeatureMerger and select Move Down. This will move the port down and clean up the connections.
Second FeatureMerger connected to the Voting Places and the Merged output port on the first FeatureMerger
In the parameters of FeatureMerger_2, set the Join On to Division for both the Requestor and the Supplier and set the Comparison Mode to String. Leave the rest of the parameters as the defaults.
Run the translation with Feature Caching or Run with Full Inspection. It looks like we have five features going to the <Rejected> port. Let’s inspect them and find out why they were rejected.
If you add a base map, these five features are definitely within our area of interest but the fme_rejection_code is EXTRA_REFERENCE_FEATURE. This means that we have multiple VotingPlaces within a single VotingDivision. We want all the VotingPlaces though, so do to this we will create a list.
In the parameters of the FeatureMerger_2, under the Merge Parameters enable Generate List. Then for List Name call it VotingLocations, then change Add To List to All Attributes.
Generate List parameter in FeatureMerger_2
6. Run the Translation
Run the translation again. Now there won’t be any <Rejected> features. Inspect the Merged output port on the FeatureMerger_2 to see the list we just created.
Click on the Divison at the bottom of the top half of the peninsula, Division 17. This one contains two VotingPlaces. In the Feature Information Window, you can see that a list has been created for these two features.
Inspected Merge output port on FeatureMerger_2 with the features as a list
7. Explode List
We want to be able to read the list in a table format, instead of only inside the Feature Information Window.
Back in FME Workbench, add a ListExploder transformer to the canvas and connect it to the Merged output port on the FeatureMerger_2. In the ListExploder parameters set the List Attribute to VotingLocations{} and the Element Index Attribute to List_Index.
ListExploder attributes set to VotingLocations{}
8. Remove Attributes
We have a bunch of GML attributes that we don’t need. Add an AttributeRemover transformer to the Elements output port on the ListExploder. Remove all attributes that start with gml_ then click OK.
ListExploder and AttributeRemover parameters
9. Inspect Final Output
Re-run the translation to see the changes. 55 features will be output from the AttributeRemover. The extra five features were the features that were sent to the rejected features, that we have then added by creating a list. We then exploded the list to create a feature for each.
When you inspect the output, each of the Divisions that have more than one VotingPlace within them will be a darker color. Also looking at the Table View for Division 17, each VotingPlace now has the Mayor results merged with it as well.
Final output showing the geometry of the Divisions and the exploded list as a table
Additional Resources
Merging or Joining Spreadsheet or Database Data
Merging or Joining Spatial Data
Data Attribution
The data used here originates from open data made available by the City of Vancouver, British Columbia. It contains information licensed under the Open Government License - Vancouver.
Comments
0 comments
Please sign in to leave a comment.