Bulk Edits PostgreSQL: Push Data from the Database to an Application in Real-Time with Webhooks

Files

postgis-bulk-trigger.zip
- 50 KB
- Download
postgis-bulk-trigger.fsproject
- 300 KB
- Download

As of FME 2024.0, support for Esri ArcGIS Online (AGOL), ArcGIS Enterprise Portal, and ArcGIS Server Feature Service formats has transitioned to a unified approach using the new Esri ArcGIS Feature Service (Format). This format replaces the legacy reader/writer formats for each of the three services.

Starting in FME 2026.1, the legacy formats will be hidden in the Quick Add menu. Instead, equivalent functionality is now provided by the downloadable Esri ArcGIS Connector package, available on FME Hub. The package also includes new web services and key transformers such as the ArcGISOnlineConnector, ArcGISAttachmentConnector, and ArcGISBranchVersionManager. The package can be installed directly through the Quick Add menu in FME Workbench. Once installed, the new Esri Feature Service format will appear in the Gallery as a unified method for interacting with all three types of Esri ArcGIS Feature Services.

For details on this transition, including guidance on updating existing workspaces, please refer to the article: Working with Esri ArcGIS Feature Services in FME.

Introduction

With FME Flow, you can manage PostgreSQL database changes using triggers and Automations to push data in real-time. It’s possible for database triggers to send jobs directly to workspaces on FME Flow using workspace webhooks. However, using this method misses out on the additional functionality that Automations has to offer, like webhook triggers, multi-workspace chaining, email notifications, and much more. By combining the power of FME Flow with PostgreSQL database triggers, the options for applying these procedures to your organization’s workflows are endless.

So why should you care about bulk database triggers? Bulk database operations are great for editing large datasets in a shorter time than singular row edits. Imagine you have a central internal database with external application datasets that need to be synchronized. It would require a lot of time and manual work to update the application dataset whenever a change occurs in the central database. When using database triggers with FME Flow, Automations will do all the work for you in an efficient and streamlined process.

Requirements

PostgreSQL 12.x or newer
- PostGIS extension
- PL/Python (Python 3)
ArcGIS Online account (minimum Creator license). If you do not have a licensed AGOL account, you can use a Shapefile instead for Parts 5 and 6.

Scenario

1. Parks data will be loaded into a temporary table in our PostGIS database using an FME Flow App.

2. A trigger function in the database will detect any edits made on the temporary table and populate the main table by emulating bulk inserts/updates/deletes.

3. After, another database function will be called to send a JSON message to our FME Flow webhook containing the IDs of every edited record.

4. As soon as the Automation webhook trigger receives the message, it passes it into the logger workspace and inserts the changed records into an ArcGIS Online feature service.

The workspaces, Automation, and SQL scripts used throughout this article are meant to be relatively basic to illustrate the potential of FME Flow for your database applications. The code snippets and scripts can be customized to fit your environment. While a spatial database is being used in this article, the same methods can be applied to non-spatial data.

Step-By-Step Instructions

This tutorial was created and tested with a previous FME Form and Flow version, PostgreSQL 12.4, and Python 3.7. If you are using different versions, the user interface, terminology, and functionality may vary, but the fundamentals remain largely the same. Note that we will be using the PostgreSQL plpython3u extension for a Python 3 interpreter, rather than plpythonu (Python 2).

The PostgreSQL writer supports only bulk inserts. We will emulate bulk updates and deletes with database functions and a temporary table. There are several methods for emulating bulk database operations, including the use of the COPY statement, temporary or intermediate tables, and statement-level triggers. This method allows the database to do the heavy lifting and provides significantly faster table edits.

1) Set up PostgreSQL and Python

The first step is to install PostgreSQL and use it as the central database. For this demonstration, we will be using PostgreSQL 12. After installing PostgreSQL, open StackBuilder to install PostGIS and PL/Python. To learn more about the Stack Builder utility, please read the documentation.

Select the PostGIS and EDB Language Pack. While you can use Python from a different source, the EDB Language Pack is one of the easiest and most straightforward installations. Please be aware that the EDB Language Pack is not supported for PostgreSQL 13 (as of November 2020). To learn more about PL/Python and PostgreSQL version compatibility with different operating systems, please consult the PostgreSQL documentation for more details. It is crucial that you follow these steps while running the Command Prompt as an administrator to set up your system environment variables; otherwise, PostgreSQL won’t be able to locate the Python interpreter.

In postgres_bulk_script.sql, create a new database in your PostgreSQL server and run CREATE EXTENSION postgis to spatially enable the database. Run CREATE EXTENSION plpython3u to enable PL/Python.

CREATE EXTENSION postgis;
CREATE EXTENSION plpython3u;

If you encounter the error “The specified module could not be found” when attempting to add plpython3u to your database, you may need to verify that your environment variables and Python have been set up properly. The error message is slightly misleading; what it actually means is “Cannot find dependency DLL python3x.dll in PATH”.

2) Set up the Automation

Now we will create an Automation to receive the incoming JSON-formatted messages from the PostgreSQL database trigger. The message will be passed into our logger workspace.

Open FME Flow and create a new Automation. Add a Webhook Trigger to the canvas. Save the Automation to generate a webhook URL, which will be used in the database trigger function.

3) Create Database Trigger and Request Functions

Download and unzip postgis-bulk-trigger.zip. Open the SQL script in a text editor. Copy and run the CREATE TABLE statements with the PostgreSQL query tool to create the main and temporary tables. Note that a geometry column is included with EPSG:4326 (LL84) as the coordinate system.

CREATE TABLE bulk_example
(
  parkid INTEGER PRIMARY KEY,
  name TEXT,
  facilities TEXT,
  washrooms TEXT,
  streetnumber TEXT,
  streetname TEXT,
  neighbourhood TEXT,
  hectare REAL,
  geom geometry(Geometry,4326)
);

In step 4 of the script, we are initiating the HTTP request function that will be called by the trigger function (step 5). The send_request function uses PL/Python to perform the following:

Takes the FME Flow URL and custom message as arguments
Imports the Python libraries required to parse the URL and make a web request
Parses the URL and extracts the necessary components
Sends a POST request to FME Flow with the custom message as a JSON-formatted content payload (the header specifies the payload type, which in our case is application/json)
Gets the response back from FME Flow and returns it as a string

Before running step 5, replace the URL parameter on line 90 with the webhook URL from the previous section. This part of the script sets up the trigger function using PL/pgSQL to perform the following:

Declares the database operation text variable
Checks the temporary table to see what database action is associated with the rows
Based on the database operation, edit the main bulk_example table with the corresponding bulk action (INSERT, UPDATE, or DELETE)

Run steps 4 to 7 in the query tool. Step 6 of the script binds bulk_function to the bulk_example_temp table. Note that this trigger function emulates bulk operations by executing per statement rather than per row. Triggers and functions can be viewed and managed in pgAdmin by going into Schema > public > Trigger Functions, and Schema > public > Tables > [Table Name] > Triggers (binded trigger functions).

4)Set up the Data Loader Workspace

For demonstration purposes, we will use a simple FME Flow App to submit jobs to our data editing workspace and perform various bulk operations on our PostGIS database. The Server App will be configured later.

Open the postgis_bulk_loader.fmw workspace.

Edit the PostGIS writer to connect to your own database.

Note that in the PostGIS writer parameters, the bulk_http() function is being called after the writer finishes inserting the records into the temporary table to send the JSON message of edited rows to our FME Flow webhook.

The source shapefile and destination PostGIS database published parameters can be removed if they were added. When you are ready, publish the loader workspace to FME Flow in a new repository named ‘PostGIS-Bulk’ as a Job Submitter service. Include the database connection and parks.zip file (upload to a shared resource folder).

While the loader workspace and Server App are not particularly practical in their current states, imagine a scenario where we have a Server App and non-FME users who can make edits without needing direct access to FME Flow or the database.

5) Create an ArcGIS Online Change Log Feature Service

For this section, we will use ArcGIS Online to demonstrate a synchronized external application that monitors changes to the table without requiring direct access to the database. Log in to your ArcGIS Online account and create a new hosted point feature service. Name the layer ParksChangeLog. In the layer’s Data > Fields, add four new fields:

id (integer)
Name (string)
RowAction (string)
DateEdited (date)

You may find it useful to create a simple ArcGIS Online web app to display the feature service.

6) Set up the ArcGIS Online Logger Workspace

Open the postgis_bulk_logger.fmw workspace.

Open the FeatureReader and edit the connection parameters to connect to your PostGIS bulk_example_temp table. Connect the table output port to the DateTimeStamper. Edit the ArcGIS Online Feature Service writer parameters to connect to your ArcGIS Online account. Make sure the output geometry is set to arcgisonline_point, and the layer name is 'ParksChangeLog'. Manually change the user attribute types to match the settings below:

If you don’t have a licensed ArcGIS Online account, please use the Shapefile writer instead. Enable the Shapefile writer and disable the AGOL writer. For the writer destination, enter $(FME_SHAREDRESOURCE_DATA) to write to the shared resource data folder on FME Flow. The name of the shapefile will be based on when the logger workspace is run.

The logger workspace will be placed after the webhook trigger in our Automation to perform the following:

Gets the JSON message of edited row IDs and the database operation from the webhook trigger
Reads records from the temporary table where the IDs match
After reading the rows, SQL is executed to truncate the temporary table
Logs the current date and time and converts it to a compatible format that ArcGIS Online can read (we are using ISO datetime)
Removes and renames attributes to match the AGOL feature service layer
Writes point features to the changelog feature service

When finished configuring the logger workspace, publish it to FME Flow as a Job Submitter in the same repository as the data loader workspace, along with your ArcGIS Online web connection. You do not need to re-upload the database connection.

7) Set up the FME Flow App

Complete the following to configure the Server App:

In the FME Flow workspace repository, open postgis-bulk-loader.
Under Advanced, click Create an FME Flow App.
Name the app ‘postgis-bulk-loader-app’, and enter ‘PostGIS Bulk Edit Triggers and FME Flow’ for the title.
Click OK when you are done configuring the app. Open the Server App URL.

8) Complete the Automation and Run the Examples

Open the Automation created in Part 2. Add a Workspace Action to the canvas and select the postgis_bulk_logger.fmw workspace.

Open the Loader FME Flow App and run the examples in order. You will be redirected to a job status page that displays the current status of the job, including when it is running and whether it has been completed/failed. Check the bulk_example table in PostGIS and the Automation log to make sure the edits successfully went through. If you added the webhook message content log action to the Automation, the JSON-formatted message sent from the database trigger will also appear in the logs.

Check your ArcGIS Online feature service/shapefile to verify that the edited records were created.

Now that you’re equipped with the fundamentals of PostgreSQL database triggers and how to use them with FME Flow, start spending less time doing manual work and let FME Flow do the work for you! To learn about making single-edit database triggers with PostgreSQL, please see Single Edits PostgreSQL: Push data from the Database to an Application in Real-Time with Webhooks.

Additional Resources

Getting Started with Automations

Single Edits PostgreSQL: Push data from the Database to an Application in Real-Time with Webhooks

Routing Data Between Workspaces in Automations

Tutorial: Update Databases Using an FME Flow Workspace App

Data Attribution

The data used here originates from open data made available by the City of Vancouver, British Columbia. It contains information licensed under the Open Government License - Vancouver.

Search