Files
Introduction
This tutorial will work through several exercises to introduce the core concepts used in typical tabular workflows and the associated FME transformers that can be utilized to perform merging and joining operations. Each exercise builds upon the lessons from the previous exercise.
It can be difficult to determine which of the merging and joining transformers is most applicable to a given dataset. The article "Merging or Joining Spreadsheet or Database Data" provides a summary of the various merge and join transformers and where they might be applied.
We will generate a report that enhances Water Meter data for an inspection program by supplementing it with additional attributes (fields) from Municipal Lot/Property data.
The exercise files are in the ZIP folder in the article's files section.
Step-by-step Instructions
Exercise 1: FeatureJoiner
1. Inspect water_meters
Open the starting workspace Merging and Joining Tabular\Workspaces\Exercise1-4Begin.fmw in FME Workbench. This workspace already has the readers and writers set up for us, but we will be adding in the transformers.
First, let’s view the starting water_meter dataset. This is the dataset we will base our entire project around. Enable Data Caching (formerly Feature Caching), then run the workspace. Inspect the output of the water_meters reader feature type. There isn’t much in this dataset that would be useful for an inspector in the field. Note the PID attribute, which will serve as a key to join additional tables to this water meter dataset.
2. Add Lot Zone Codes for Each Water Meter
The public.CoS-lot table comes from the Training PostgreSQL database. It contains the neighborhood zones for each property that we want to add to the water meters dataset.
To connect to the Safe Software provided training database, use the following credentials when setting up your connection. For detailed instructions on how to set up a database connection, please see Getting Started with FME Form: Working with Databases.
Please be aware that this is a public database, and the data available can change at any time. This database is for training purposes only and should only be used during training courses and articles:
- Name: Training
- Host: postgis.train.safe.com
- Port: 5432
- Database: fmedata
- Username: fmedata
- Password: fmedata
To join the datasets together, we will use the FeatureJoiner transformer. Add a FeatureJoiner to the canvas, connect the water_meters reader feature type to the Left input port, and connect the public to the Right input port.CoS-lot reader feature type to the Right input port.
Open the FeatureJoiner parameters and set the Join Mode to Left to retain all of the water meter features. Then use the PID attribute for both the Left and Right join attributes.
A Left join will retain all the water meter features, whether there is a matching CoS-lot feature or not:
3. Inspect Joined Datasets
Run the workspace and inspect the output from the FeatureJoiner's Joined port. Notice how the attributes from the Lot table are now joined to the water meter data.
The screenshot above shows the attributes from the Lot table, prefixed with a _, to highlight which table they came from. Your data will not have this.
Note: We could have used the DatabaseJoiner instead, but it is slower at 2 minutes vs. 6.5 seconds with the FeatureJoiner in this scenario. You may consider using the DatabaseJoiner to join smaller tables (e.g., lookups) or to join an indexed column. In this case, since the column is not indexed, the FeatureJoiner will perform better. For more information, see the DatabaseJoiner documentation.
Exercise 2: FeatureMerger
The source water meter data contains a FOLIO attribute with only one value. However, the field inspectors are requesting that we replace the existing FOLIO value with all the FOLIO numbers for each water meter. We'll need to perform a one-to-many (1:M) join. Let's do this with the Property Details table and a FeatureMerger.
4. Get All FOLIO Values for Each Water Meter
To retrieve all FOLIO values for each water meter, we will use the FeatureMerger transformer. Add a FeatureMerger to the canvas and connect the Requestor input port to the Joined output port on the FeatureJoiner. Then connect the Supplier input port to the public.CoS-property-detail reader feature type.
In the FeatureMerger parameters, use the PID as the join key for both the Requestor and Supplier. Then enable Process Duplicate Suppliers to ensure all FOLIO values are matched with their corresponding PID. Set the Number of Suppliers Attribute to _fMerger_matches. We’ll use this attribute to help validate the features with multiple FOLIO numbers later on.
Next, enable Generate List and set the List Name to _folio. Click on the ellipsis next to Selected Attributes and select FOLIO. Click OK to close the parameters.
5. Inspect _fMerger_matches
Run the workspace and inspect the Merged output port on the FeatureMerger. Find a feature with a value greater than 1 for the _fMerger_matches attribute. Select the feature in the Table view, and then open the Feature Information Window. You should see a list attribute called _folio{}.FOLIO, which will contain one FOLIO number per list element.
6. Concatenate FOLIO Numbers
We need all of these FOLIO numbers to be contained within a single attribute. Add a ListConcatenator to the canvas and connect it to the Merged output port of the FeatureMerger. In the parameters, set the List Attribute to _folio{}.FOLIO and set the Separator Character to a comma ( , ). Finally, set the Destination to FOLIO; this will overwrite the FOLIO attribute with the newly concatenated list of FOLIO numbers.
If you were to rerun the workspace, you’ll see the FOLIO attribute now has the attributes concatenated, separated by a comma.
Exercise 3: DatabaseJoiner
7. Lookup Zone Descriptions
From the previous Exercise, notice that some water meters don't have matching property details (inspect the Unmerged Requestor port in the FeatureMerger). We want to continue working with all our water meter features in the next step of the workflow.
Add a DatabaseJoiner to the canvas, and connect it to both the Output port from the ListConcatenator and the Unmerged Requestor output port on the FeatureJoiner to ensure we're performing a Left Join from the previous step.
In the DatabaseJoiner parameters, set the Format to Microsoft Excel, then browse to the ZoneCode.xlsx dataset. Click on the ellipsis next to Table and select Zone Code Description.
For the Join On parameter, the Feature Attribute is from the data that is imported into the DatabaseJoiner from the workspace. The Table Field is from the table we are reading in using the DatabaseJoiner. Use the ZONE_CODE attribute for both parameters to perform the join.
Unlike the FeatureJoiner, you can filter out any attributes you do not want to merge into your dataset. For the Fields to Add parameter, add the ZONE_DESCRIPTION. The DatabaseJoiner also allows for greater control over cardinality. For this example, since we know that there should only be one ZONE_CODE in the lookup table for each ZONE_CODE, we’ll select Must Match Exactly One (1:1) as the Cardinality.
Finally, so we know which table the data is coming from, set the Accumulation Mode to Prefix Joined and set the Prefix parameter to LOT_.
8. Inspect Joined Zones
Run the workspace and inspect the Joined output port on the DatabaseJoiner. You should see LOT_ZONE_DESCRIPTION merged into the data, along with an attribute called _matched_records that indicates the number of matched records resulting from the join.
Exercise 4: Obtain Addresses & Output Validation Report
The field inspectors require the lot addresses for each water meter. We'll use the FeatureJoiner again to do a quick join with Addresses-SURREY.csv.
9. Join Addresses
Add a FeatureJoiner to the canvas and connect the Left input port to the Joined output port on the DatabaseJoiner. Then connect the Right input port to the Addresses-SURREY reader feature type.
In the FeatureJoiner_2 parameters, set the Join Mode to Inner. This will allow us to filter out any water meters or PIDs without an associated address. Then set PID as the join key for both Left and Right tables.
10. Inspect Joined Addresses
Run the workspace and inspect the Joined output port on the FeatureJoiner_2, which contains the water meter information for our field inspectors. The information from the Unjoined Left output port includes the water meters that lack addresses.
Note that the workspace may fail if a feature is missing the right key. You can either set Workspace: Rejected Feature Handling to Continue or by connecting a Logger to the <Rejected> output port on the FeatureJoiner_2.
11. Output Results
Connect the Joined output port on the FeatureJoiner_2 to the AttributeManager. The AttributeManager was set up ahead of time to remove the attributes we don’t need and clean up others. Take a moment to review the changes in the AttributeManager.
The AttributeManager is connected to the @Substring(@Value(POSTAL_CODE),0,3). Water Meters writer feature type. This writer's feature type is creating a worksheet for the Water Meters - Field Inspection Microsoft Excel Workbook. Each postal code FSA (the first three characters of a postal code) will have its own worksheet with the water meter data listed.
Now connect the UnjoinedLeft output port on the FeatureJoiner_2 to the Missing Addresses writer feature type. This is writing a validation report for all water meters that lack addresses to the same Water Meters - Field Inspection workbook for the data custodians to review at a later date.
Run the workspace, and inspect the output Water Meters - Field Inspection INLINEQUERIER.xlsx file.
If you have time, see Optional: Build an HTML Report, Publish to FME Flow and Create a Workspace App at the bottom of this article. You will create an HTML report that will be handy for field inspectors, and use FME Flow to build an App to generate it.
Save and close the workspace. We will start with a different workspace for Exercise 5.
Exercise 5: InlineQuerier
We can also join datasets as we did in Exercises 1-4 using an InlineQuerier transformer. The InlineQuerier can perform SQL queries on any dataset already read into your workspace. As you’ll see in this exercise, the InlineQuerier can help simplify your workspace for those who are comfortable working with SQL.
12. Add an InlineQuerier Transformer
Open Exercise5Begin.fmw in FME Workbench. This workspace uses the same data we used in the previous exercises, but it only includes the readers and writers.
Add an InlineQuerier to the canvas and connect all of the reader feature types to the Connect Input port (there should be five reader feature types connected in total).
13. Create Output Port SQL Queries
After connecting the reader feature types, a new input port was created for each dataset, but there are no output ports. We will need to modify the parameters to create these ports.
Open the InlineQuerier parameters, and notice that all of the input datasets are listed at the top. Under Outputs, enter the following parameters:
| Output Port | SQL Query |
|---|---|
| water_meters+lots+addresses |
|
| missing_addresses |
|
14. Connect Writer Feature Types
One final step is to connect the newly created output ports. Connect the water_meters+lots+addresses output port to the @Substring(@Value(POSTAL_CODE),0,3).Water Meters writer feature type. Then connect the missing_addresses to the Missing Addresses writer feature type.
The SQL queries perform the exact same joins as we did in Exercises 1 - 4, but are condensed into a single transformer.
Exercise 6: SQLExecutor & SQLCreator
We’ll be modifying our original workspace from Exercises 1-4 again, this time demonstrating how the SQLExecutor can be used to leverage the database instead of reading all your features into your workspace.
When you have multiple tables being read in from a database, consider using the SQLCreator or SQLExecutor transformers. These transformers allow you to query the database directly, often improving performance without having the need to read entire tables into your workspace. Learn more about leveraging the power of databases in Tutorial: Let the Database Do the Work.
15. Generate the Water Meter Report using an SQLCreator
Open Exercise6Begin.fmw in FME Workbench. The two writer feature types from the previous exercises have been added to the canvas. For this exercise, all the source tables are provided in an SQLite database in Merging and Joining Tabular\Data\CoS Data.sqlite
Add an SQLCreator to the canvas. The SQLCreator lets you create a workspace from a database query.
Open the SQLCreator parameters, select SQLite as the Format, and navigate to Merging and Joining Tabular\Data\CoS Data.sqlite.
Click the ellipsis next to the SQL Statement parameter. Let’s create a SQL query that replicates the joins performed earlier in Exercises 1-4, joining the water_meters table with the lot, zone_code_description and addresses tables:
SELECT *
FROM "water_meters" m
LEFT JoiN "lot" l
ON m."PID"=l."PID"
LEFT JOIN "zone_code_description" z
ON l."ZONE_CODE"=z."ZONE_CODE"
INNER JOIN "addresses" a
ON m."PID"=a."PID";Click OK to exit from the SQL Statement dialog and back to the SQLCreator Parameters. Click on the ellipsis next to the Attributes to Expose parameter. Click on the Populate from SQL Query… button in the resulting dialog. This will generate the attributes that will be output by the SQLCreator Result output port from the SQL Query above.
Connect the Result output port from the SQLCreator to the @Substring(@Value(POSTAL_CODE),0,3).Water Meters writer feature type for the water meter inspection report.
16. Generate the Data Validation Report Using an SQLExecutor
Let’s create the Data Validation Report for missing addresses using the SQLExecutor transformer. The SQLExecutor enables midstream database queries.
Add an SQLExecutor to the canvas and connect it to the SQLCreator <Schema> output port–we’re simply using the schema feature as an initiator for the SQLExecutor.
Open the SQLExecutor parameters, select SQLite as the Format, and navigate to Merging and Joining Tabular\Data\CoS Data.sqlite.
Click the ellipsis next to the SQL Statement parameter. Let’s create a SQL statement to query for water meters without associated addresses. We’ll do this by joining the water meters table with the addresses table and determining which features have no postal codes:
SELECT *
FROM "water_meters" m
LEFT JOIN "addresses" a
ON m.PID=a.PID
WHERE a."POSTAL_CODE" IS NULL;Click OK to exit from the SQL Statement dialog and back to the SQLCreator Parameters. Click on the ellipsis next to the Attributes to Expose parameter. Click on the Populate from SQL Query… button in the resulting dialog. This will generate the attributes that will be output by the SQLExecutor Result output port from the SQL Query above.
Connect the Result output port from the SQL Executor to the Missing Addresses writer feature type.
Run the workspace, and inspect the output Water Meters - Field Inspection SQL.xlsx file.
OPTIONAL EXERCISE: Build an HTML Report, Publish to FME Flow, and Create a Workspace App
This is the optional section of Exercise 4. In this section, we will learn how to use FME to generate an HTML report from an uploaded Water Meters dataset. We will then stream the results (HTML report) of a workspace using an FME Flow Workspace App, as shown in the image below.
1. Create Geometry
Open the Exercise1-4Complete.fmw workspace to begin or continue in the workspace you were building in Exercise 4. We're going to use the VertexCreator to create point geometry from the coordinate values that originated in the water_meters dataset (JSON) and were carried through the workflow. Add a VertexCreator to the canvas and connect it to the UnjoinedLeft output port on the FeatureJoiner_2.
In the VertexCreator parameters, set the X Value to json_geometry.coordinate{0} and the Y Value to json_geometry.coordindate{1}.
2. Define Coordinate System
Now that we have created the point geometry, we need to define the coordinate system. These points were recorded using UTM83-10. Add a CoordinateSystemSetter to the canvas and connect it to the VertexCreator. In the parameters, set the Coordinate System to UTM83-10.
3. Create Label for Pop-Up Box
Next, we’ll create a label to use in the pop-up box that will appear when you click on a point in the HTML Report. Add an AttributeCreator to the canvas and connect it to the CoordinateSystemSetter. Create a new attribute called Label. Since some features in this dataset lack an image URL, we will use a conditional statement to set the label attribute. Click on the drop-down arrow for Attribute Value and select Conditional Value.
For the Condition Statement, double-click on the box next to If to open a dialog similar to the Tester transformer. Set the following condition:
IMAGE Attribute Has a Value AND IMAGE NOT <enter in a single space> Set the Attribute Value to:
<img src='@Value(IMAGE)' alt='@Value(PID)' width ='320' height= '240'>
<br>PID: @Value(PID)
<br>Meter Code: @Value(METER_CODE)Then click OK. Set the Else Attribute Value to:
<br>PID: @Value(PID)
<br>Meter Code: @Value(METER_CODE)4. Create HTML Report
Now that we have our label created, we can create the HTML report. Add an HTMLReportGenerator to the canvas and connect it to the AttributeCreator. Set the first Page element to Map (Esri Leaflet), then click anywhere to update the parameters on the right. Set the Label Attribute to the Label attribute we just created, and then set the Escape HTML to No.
Next, add a Table Page element by clicking on the plus sign (+), and click anywhere to update the parameters on the right. Configure the table as follows:
| Column Contents | Column Name |
|---|---|
| FACILITYID | ID |
| LOT_ZONE_DESCRIPTION | Zone |
5. Add HTML Writer
Add an HTML writer to the canvas and browse to the Merging and Joining Tabular Data\Output folder. Name the file WaterMeterInspection.html.
Connect the HTMLReportGenerator to the HTML writer feature type.
6. Publish to FME Flow
-
To publish a workspace from FME Workbench to FME Flow, follow the generic instructions below using the specific details listed in the article. If the article does not have a specific repository name listed, we recommend creating a Training repository. If a specific service is not listed in the article, please use the default Job Submitter.
1. Click Publish
From the File menu, click Publish to FME Flow, or from the toolbar ribbon, click Publish.
Toolbar Ribbon:

File Menu:
Deploy Menu (2025.1+):
2. Add a Web Connection
In the Publish to FME Flow wizard, a FME Flow connection needs to be selected or created. To create an FME Flow connection, click the drop-down under FME Flow Connection and select Connect to FME Flow.

The following instructions are for a basic connection, but if your FME Flow is set up for Microsoft Azure, SAML, or Windows Credentials, please see the Using Web Connections to Connect to FME Flow documentation.
An FME Flow Connection dialog will appear; here, we will set up the connection. The connection information and credentials can be obtained from your FME Flow Administrator. If you have not yet logged into FME Flow after installing, please do so prior to continuing, as you will be prompted to change the password from the default:
Username: admin
Password: admin-
• Web Service: FME Flow
• Server URL: http://localhost
• Connection Name: Training FME Flow
• Authentication: Basic
• Username: your username
• Password: your password
Click OK. FME will try to authenticate the connection. Once authenticated, this connection will be available whenever you publish a workspace. The connection will only need to be created once.
-
• Web Service: FME Server
• Connection Name: Training FME Server
• Server URL: http://localhost
• Username: your username
• Password: your password
Click Authenticate. FME will try to authenticate the connection. Once authenticated, this connection will be available whenever you publish a workspace. The connection will only need to be created once.
Click Next to move to the Create Repository page of the dialog
3. Create Repository
Repositories are used to store groups of related items that have been published to FME Flow (similar to directories). Items such as workspaces, source data, custom formats, custom transformers, and templates can be published to a repository.
If you need to create a new repository, click on the New button next to Repository Name. Enter the repository name and an optional description. Click OK to close the Create New Repository dialog.

To select an existing repository, click the drop-down next to Repository Name and select the desired repository.

4. Upload Files
Some workspaces require additional files to run correctly on FME Flow, so these need to be uploaded along with the workspace. To upload files, enable Upload data files at the bottom of the dialog.

When files are uploaded through the publishing menu, they are stored within the repository. If you want to upload folder files (such as geodatabases) or to specify where the files stored, the Select Files button allows you to do that. For detailed instructions on how to use the Select Files button, see the documentation.
5. Register Services
The final step before publishing a workspace is to register services. Services control how the data will be output after the workspace is run.

Services return results in different forms:
• Data Download Service returns results as a downloadable zip file
• Data Streaming Service returns results as a data stream
• Job Submitter Service accepts and runs workspace job requests
• KML Network Link returns a KML Network Link that can be used in Google Earth
• Notification Service allows for event-driven messaging
Typically, the Job Submitter is always enabled, but multiple services can be enabled. Once services have been enabled, click Publish.
6. Review Translation Log
To confirm whether or not the workspace was published successfully, check the Translation Log. The Translation log will show which repository you published to, which files were included, and a quick link to run the workspace.

-
Create a New repository called MergingAndJoining, then confirm that Upload Files is enabled. Click Next to continue.
Upload the Training PostgreSQL database by enabling the check box, then click Next.
Finally, select the Data Streaming Service, then click Publish. You can leave the Data Submitter Service enabled as well.
The workspace is now published to FME Flow.
7. Build Workspace App
Open FME Flow by either opening a web browser and typing in http://localhost (or your Named Server) or by going to Start > FME Flow > FME Flow Web Interface (in Windows).
Log in using the admin/FMELearnings credentials (or your FME Flow Administrator-supplied credentials).
Expand Flow Apps on the left menu bar, then go to Create Workspace App.
On the Create Workspace App page, set the Name to WaterMeters, then set the Title to Water Meter Report. Change the Repository to MergingAndJoining. Confirm that the Service is set to Data Streaming, then click OK to create the app. If you have extra time, customize the app in the Customize section.
After clicking OK, a URL will appear. Click on the URL to open the app.
Additional Resources
Webinar: Data Integration Basics | Merging & Joining Data: the exercises in this tutorial were inspired by a demo from this webinar
Tutorial: Merging or Joining Spreadsheet or Database Data
The FeatureJoiner Transformer tutorial
The FeatureMerger Transformer tutorial
The InlineQuerier Transformer tutorial
The SQLExecutor and SQLCreator Transformers tutorial
Using the SQLExecutor to do a SQL Join
Data Attribution
The data used here originates from open data made available by the City of Surrey, British Columbia. It contains information licensed under the Open Government License - Surrey.