Updating a PostgreSQL Database Using UPSERT

Files

UpdatePostgreSQLUsingUPSERT.zip
- 200 KB
- Download

Introduction

FME can UPSERT updates to PostgreSQL and PostGIS databases. UPSERT allows you to add a new row if it doesn’t already exist, or update an existing row in a single action. To use UPSERT, your database must have a unique ID. In this tutorial, we will read from an existing PostgreSQL table, add a new row, modify a couple of other rows, and write them back to the same table using UPSERT.

Step-by-step Instructions

Part 1: Seed Data

Before you begin, you will need to seed the data into your personal PostgreSQL database or one you’ve created, as described in Creating a PostgreSQL/PostGIS Training Database with FME.

1. Add a CSV Reader

Open a blank workspace in the FME workbench and add a CSV reader to the canvas. Browse to the Original.csv dataset available for download from the Files section of this article. Click OK to add the reader.

2. Write to PostgreSQL

Add a PostgreSQL writer to the canvas and connect to the FMETraining database created in Creating a PostgreSQL/PostGIS Training Database with FME. Set the Table Definition to Automatic and click OK.

In the Feature Type dialog, set the Table Name to CellSignals and the Table Qualifier to public. We can leave Table Handling to Create If Needed, since there isn’t currently a CellSignals table in the FMETraining database.

Click OK, then connect to the public.CellSignals writer feature type to the CSV reader feature type.

Reopen the public.CellSignals writer feature type parameters, and switch to the User Attribute tab. Change the Attribute Definition to Manual. Then, for the featureid attribute, click the Index drop-down and select Primary Key. To UPSERT, a primary key must be identified.

3. Run the Workspace

Run the workspace; it will create the CellSignals table in the training database, giving us a blank slate to demonstrate UPSERTs.

Disable the CSV reader and the PostgreSQL writer as we no longer need them. If you need to reset the table, you can rerun this part of the workspace. Before rerunning this section, change the Table Handling to Drop and Create.

Part 2: UPSERT into PostgreSQL

1. Add a PostgreSQL Reader

Continuing in the same workspace as Part 1, but in a blank section, add a PostgreSQL reader to the canvas and connect to the training database (or your own database where you seeded the data).

In the parameters, select the public.CellSignals table, then click OK three times to finish adding the reader.

2. Add the Updated CSV

Next, we need to read the CSV containing the updates. Add another CSV reader to the canvas and browse to the Updates.csv dataset.

In the parameters, change the Feature Type Name(s) to From File Name(s), then click OK twice to finish adding the reader.

3. Detect Changes

Using the ChangeDetector transformer, we can detect any changes to the original dataset (public.CellSignals) and then assign a database operation to control how the data gets written.

Add a ChangeDetector to the canvas and connect the Original input port to the public.CellSignals reader feature type, then connect the Revised input port to the Updated CSV reader feature type.

In the parameters, set the Update Detection Key Attributes to featureid. Next, set Attribute Matching Strategy to Match Selected Attributes, then for Selected Attributes select:

code
featureid
num_measures
power
quality
recorded_tstamp
stationid

Finally, disable Check Geometry, then click OK.

4. Set UPSERT

Now that we have identified which attributes have been updated or inserted, we need to tell FME to set the fme_db_operation to UPSERT. Add an AttributeCreator to the canvas and connect it to both the Inserted and Updated output ports on the ChangeDetector.

In the parameters, create a new attribute named fme_db_operation and set its value to UPSERT. So we can see exactly which records we’ve updated or inserted, let’s set the num_measures attribute to 999.

5. Write to PostgreSQL

Add another PostgreSQL writer to the canvas and connect to the FMETraining database. Set the Table Definition to Copy from Reader and then click OK.

In the Select Feature Type dialog, select the public.CellSignals reader.

Connect the writer feature type to the AttributeCreator and the Deleted output port on the Change Detector.

Next, in the PostgreSQL writer parameters, set the Feature Operation to fme_db_operation, then set the Table Handling to Use Existing. Now, for Row Selection, select featureid for Match Columns. This attribute needs to be unique for each record. Then click OK.

Note: If you are writing to an existing table, confirm that the Match Column attribute is the primary key.

6. Run the Workspace and Inspect the Output

Now run the workspace, and then inspect the output in Visual Preview. If you sort the num_measures attribute by descending, you’ll see all of the attributes that were UPSERTed, as indicated by the value of 999.

Data Attribution

The data used here originates from open data made available by the City of Vancouver, British Columbia. It contains information licensed under the Open Government License - Vancouver.