Using Append to Write Big Data to ArcGIS Online in FME

Liz Sanderson

Introduction

In FME, there are many ways to connect to Esri ArcGIS Online, including readers, writers, and transformers, all of which are covered in detail in the tutorial Getting Started with ArcGIS Online and Portal. If you are updating data in ArcGIS Online, then the ArcGIS Online Feature Service Writer is your friend. But what if you have tens or hundreds of thousands of records? Behind the scenes, the writer uses the ArcGIS REST API to add 1000 rows at a time, which can be a performance bottleneck.
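
To get a feel for that overhead, the short sketch below counts the add requests needed at 1000 rows per request. The 300,000-row figure is the approximate size of the timetable used later in this article, and the function name is only illustrative.

```python
import math

def requests_needed(row_count: int, batch_size: int = 1000) -> int:
    """Number of add requests when rows are sent in fixed-size batches."""
    return math.ceil(row_count / batch_size)

print(requests_needed(300_000))  # 300
```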

In this article, we will walk through how to use the Esri Append REST API to connect to ArcGIS Online, which is useful for writing large datasets containing tens of thousands of records or more.

Append is a faster API but can be more complicated. It is best suited for regular ArcGIS Online updates where the number of rows is in the tens or hundreds of thousands. The principle of using Append is to:

  • Truncate existing data
  • Append data to a dataset from a table


Before we truncate, we also need to prepare our data:

  • Delete any old, uploaded data
  • Generate a zip file then upload

 

About the Author

This is a guest article by Andrew Shakes (FME Certified Professional) of KiwiRail, New Zealand. KiwiRail maintains the New Zealand rail network and operates freight and scenic trains. Train timetables in the Auckland and Wellington metro rail networks are used by KiwiRail for managing track maintenance and upgrade projects. These timetables normally look only a few months ahead, but KiwiRail plans projects years in advance. KiwiRail uses FME to forecast timetables two years ahead, which means a lot of rows of data. ArcGIS Online is used for querying and visualizing the information, but train timetables change regularly, so the data needs to be updated.

 

Step-by-Step Instructions

The FME workspace created in this example includes a few main parts: deleting outdated data in ArcGIS Online, generating and uploading a table, performing Truncate and Append operations, and monitoring the append using looping. Download the attached workspace template to follow along.
 

Part 1: Delete old uploaded data (if needed)

Before we upload a file to ArcGIS Online, it pays to remove any copies you have previously uploaded to avoid duplication errors or making your administrator grumpy with lots of old bulky zip file content. You may choose to do this at the end of your update process, but it's good practice to run this at the beginning in case the Append process fails and prevents your clean-up process from running.

We will assume we have run the process before, so there’s an existing uploaded file we need to remove. This portion of the workspace looks like this and includes two ArcGISOnlineConnector transformers:

01 Data prep workspace.png

1. Create a Web Connection to ArcGIS Online
If you haven’t already done so, follow these steps to set up a Web Connection to ArcGIS Online.

Tip: Ensure the web connection you are using is authorized to delete content. I opted for using an admin connection (with all the content owned by the admin) just to be sure.

2. Add a Creator
Add a Creator transformer to start off the workspace.

3. Add an ArcGISOnlineConnector
Add an ArcGISOnlineConnector. In the parameters, set the Request Action to ‘List’. This will list the contents of the folder in ArcGIS Online that we have set up to store the zip file.

02 ArcGISOnlineConnector parameters.png

In "Attributes to Add", select these Output Attributes:
03 Attributes to add.png

4. Add a Tester
Add a Tester to check if the existing ArcGIS Online layer name (Title) matches the one you’re uploading (you might need to come back to set these parameters later).

Screenshot 2022-12-02 at 11.01.08 AM.png

5. Add a second ArcGISOnlineConnector
Add a second ArcGISOnlineConnector, and connect it to the Passed output port on the Tester. Set this transformer to delete the matching item, identified by its _id.

04 ArcGISOnlineConnector parameters.png
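
Outside FME, the List, Tester, and delete steps boil down to the logic below. This is a minimal sketch in Python, not the workspace itself: the folder listing is represented as a list of dictionaries, and the keys `id` and `title` are assumptions standing in for the connector's _id and Title attributes.

```python
from typing import Dict, List

def ids_to_delete(folder_items: List[Dict[str, str]], title: str) -> List[str]:
    """Return the IDs of listed items whose title matches the upload title."""
    return [item["id"] for item in folder_items if item.get("title") == title]

# Hypothetical folder listing, for illustration only.
listing = [
    {"id": "abc123", "title": "AT_predicted_services"},
    {"id": "def456", "title": "Some other item"},
]
print(ids_to_delete(listing, "AT_predicted_services"))  # ['abc123']
```

An empty result means there is nothing to delete, which is the case the next step has to cope with.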

6. Add a FeatureJoiner
It's good practice to author your workspace to handle a situation where no previously uploaded file exists. You can use a FeatureJoiner with a Left Join to ensure the process will continue regardless, but ensure the old copy is deleted if one exists.

Connect the second ArcGISOnlineConnector to the "Right" port, and the first ArcGISOnlineConnector to the "Left" port.

05 Left join.png

Set the parameters as follows:
Screenshot 2022-12-02 at 11.04.14 AM.png
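
The effect of a left join can be sketched in plain Python: every Left row continues, whether or not a matching Right row (the delete result) exists. The join key and attribute names are hypothetical, and letting left values win on conflicts is a choice made for this sketch rather than a statement about the FeatureJoiner's settings.

```python
from typing import Any, Dict, List

def left_join(left: List[Dict[str, Any]],
              right: List[Dict[str, Any]],
              key: str) -> List[Dict[str, Any]]:
    """Keep every left row; merge in a matching right row when one exists."""
    right_by_key = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(row[key], {})
        # Left values overwrite right values that share a name (sketch choice).
        joined.append({**match, **row})
    return joined
```

With an empty Right input, `left_join` returns the Left rows unchanged, so downstream steps still run.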

7. Add an AttributeKeeper
Add an AttributeKeeper and keep only "_creation_instance". This will clean up the workspace by keeping only the necessary attributes.
 

Part 2: Generate a table, then upload

Now that the workspace has deleted the old upload, it’s time to prepare and upload the new zip file.

In this example, a train timetable is processed from Auckland Transport’s open data GTFS to make it a bit easier to work with in ArcGIS Online – I’ve simplified this into a semicolon-delimited CSV file, which is attached to this article. There are over 300,000 rows when we predict 2 years ahead.

The Append REST documentation covers support for CSV uploads, but I prefer to use Esri’s File Geodatabase format as it does the schema mapping automatically, which can be a bit tricky through the API.

1. Add a CSV Reader
Add a CSV reader, and read in the source CSV file attached to this article.

2. Add a FeatureWriter
To write the rows read from the CSV to File Geodatabase, add a FeatureWriter and set it to write to Esri Geodatabase (File Geodb Open API).

Tips:

  • It’s good practice to enable ‘Overwrite Existing Geodatabase’ in the feature writer parameters, so you are certain it isn’t uploading an old version of the data.
  • Make sure the schema of the feature type matches the ArcGIS Online layer
  • Make sure the geodatabase is zipped, and the geodatabase name matches the zip file name, so ArcGIS Online is happy to consume it
  • Give the feature type (gdb feature class) a different name to the layer name, so you don’t get it confused. I’ve called mine AT_predicted_services

 

Zip file name          Geodatabase name       Feature type           Title (layer name)
GDB_AT_Timetable.zip   GDB_AT_Timetable.gdb   AGOL_AT_gtfs_uploads   AT_predicted_services
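
If you ever need to produce the zip outside FME, the naming rule above can be sketched with Python's standard library. This is an illustration under my own assumptions, not part of the workspace: the function name is made up, and it simply keeps the .gdb folder at the root of a zip file that shares its base name.

```python
import os
import zipfile

def zip_geodatabase(gdb_path: str) -> str:
    """Zip a .gdb folder into <name>.zip beside it, with <name>.gdb at the archive root."""
    gdb_path = os.path.abspath(gdb_path)
    parent, gdb_name = os.path.split(gdb_path)
    zip_path = os.path.join(parent, os.path.splitext(gdb_name)[0] + ".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subfolders, files in os.walk(gdb_path):
            for file_name in files:
                full_path = os.path.join(folder, file_name)
                # Paths are stored relative to the parent, so entries start with "<name>.gdb/".
                archive.write(full_path, os.path.relpath(full_path, parent))
    return zip_path
```

For example, `zip_geodatabase("GDB_AT_Timetable.gdb")` writes GDB_AT_Timetable.zip next to the geodatabase folder.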


This part of the workspace looks like this:
Screenshot 2022-11-29 at 10.58.51 AM.png

3. Add a FeatureMerger
Add a FeatureMerger, and connect the FeatureWriter to Requestor and the AttributeKeeper to Supplier.

FeatureMerger.png

In the parameters, set the join value for both the Requestor and the Supplier to the constant 1, so that every feature matches. This merges the output of the delete process with the output of the File Geodatabase writer, so the upload waits until both have finished.

Screenshot 2022-12-02 at 11.11.02 AM.png

4. Add an ArcGISOnlineConnector
Add a third ArcGISOnlineConnector to upload the zip file. Ensure "Attributes to Add" includes _id_private_url.

AGOLConnector 3.png

Tips:

  • I recommend you upload a zipped File Geodatabase.
  • I recommend you create a specific folder in your ArcGIS Online Content for the zip file.
  • Go back to the first ArcGISOnlineConnector (the one in the delete process in Part 1) and set the same folder there.
  • While you are there, also check that the Tester uses the same name as the Title.
  • At least 1 tag is mandatory.

 

Part 3: Using the Esri Truncate and Append REST APIs

The Truncate REST API is much faster than using the ArcGIS Online Feature Service writer's delete feature operation. Like a database operation, it truncates all the records in the table instantly.

After truncating, we will append the contents of the table by calling the Append REST API. The Append operation should take less than a minute to append 300,000 rows.

1. Add an HTTPCaller
After the last ArcGISOnlineConnector, add an HTTPCaller.

HTTPCaller_3.png

Configure it as follows to truncate the table. Be careful to truncate the correct table. It uses an administration URL in ArcGIS Online – different to the portal URLs used for append, delete, or addfeatures that you may be familiar with. You will need to use an admin connection to truncate.

If you need to, check out the Esri Truncate documentation.

In the example, we’re using the same admin web connection, which allows us to execute the Truncate API.

HTTPCaller_3 parameters.png

Tips:

  • To get the URL of the service you want to truncate, explore the services directory.
    • Generate an admin token
    • Then navigate to your organization’s ArcGIS Online services directory: https://services9.arcgis.com/<organisationid>/ArcGIS/admin/services


In this example, we’re truncating feature layer number 1 (it may be number 0 in yours).
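
For readers who think in code, here is a rough sketch of how the truncate request is put together. It only builds the URL and parameters; authentication is left to the web connection and nothing is sent. The URL pattern (service name, then FeatureServer, layer number, and truncate) is my reading of Esri's Truncate documentation, and the base URL and service name below are placeholders, so copy the real values from your own services directory.

```python
from typing import Dict, Tuple

def build_truncate_request(admin_services_url: str,
                           service_name: str,
                           layer_id: int) -> Tuple[str, Dict[str, str]]:
    """Build the URL and form parameters for a truncate call (nothing is sent)."""
    base = admin_services_url.rstrip("/")
    url = f"{base}/{service_name}/FeatureServer/{layer_id}/truncate"
    params = {"f": "json"}  # ask for a JSON response
    return url, params

# Placeholder values for illustration only.
url, params = build_truncate_request(
    "https://example.com/admin/services", "MyTimetableService", 1
)
print(url)
```

Keeping the layer number as an explicit argument makes it harder to truncate the wrong table by accident.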

2. Add a second HTTPCaller
Add another HTTPCaller to call the Append REST API.

HTTPCaller append.png

Use the web connection to authenticate, and set the Query String Parameters as follows:
HTTPCaller append parameters.png

The append operation is actioned on the (parent) feature service, with the layer number specified in the ‘layers’ query string. In this example, it is layer number 1, but it is normally layer 0 if you have a single layer.
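
As a rough guide to what the query string carries, the sketch below assembles a set of append parameters in Python. Apart from ‘layers’, which is mentioned above, the parameter names (`appendItemId`, `appendUploadFormat`, `layerMappings`) and the `filegdb` value reflect my understanding of Esri's Append documentation and are assumptions here; check them against the screenshot and the official documentation before relying on them.

```python
import json
from typing import Dict

def build_append_params(item_id: str, layer_id: int, source_table: str) -> Dict[str, str]:
    """Assemble query parameters for an append from an uploaded File Geodatabase item."""
    return {
        "f": "json",
        # Target layer(s) in the feature service (assumed JSON-array syntax).
        "layers": json.dumps([layer_id]),
        # ID of the uploaded zip item (assumed parameter name).
        "appendItemId": item_id,
        # Format of the uploaded data (assumed value for File Geodatabase).
        "appendUploadFormat": "filegdb",
        # Map the geodatabase table onto the target layer (assumed structure).
        "layerMappings": json.dumps(
            [{"id": layer_id, "sourceTableName": source_table}]
        ),
    }

# Hypothetical item ID; the table name is the feature type from Part 2.
print(build_append_params("abc123", 1, "AGOL_AT_gtfs_uploads"))
```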
 

Part 4: Monitor the Append Operation

I recommend configuring the FME workspace to monitor the append operation to identify any issues. The final part of the workspace looks like this:

Monitor.png

1. Add a JSONExtractor
Add a JSONExtractor to retrieve the job URL for monitoring progress.
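
The equivalent of this extraction step in Python is a one-key lookup in the JSON response. The key name `statusUrl` is an assumption about the shape of the append response, so it is exposed as a parameter, and the sample response below is made up.

```python
import json

def extract_status_url(response_body: str, key: str = "statusUrl") -> str:
    """Pull the job status URL out of the append response body."""
    payload = json.loads(response_body)
    if key not in payload:
        # A missing key usually means the request itself was rejected.
        raise ValueError(f"No '{key}' in append response: {payload}")
    return payload[key]

# Made-up response for illustration only.
sample = '{"statusUrl": "https://example.com/jobs/123"}'
print(extract_status_url(sample))  # https://example.com/jobs/123
```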

2. Create or copy the CheckAppendStatus looping custom transformer
A looping custom transformer is used to check the status every second using an HTTPCaller. Don’t forget to set a time limit.

Check out this webinar for more information on looping.

The custom transformer looks like this:

Custom transformer.png

The HTTPCaller within the loop requests status updates from the job URL every second.

HTTPCaller within loop.png

Under Concurrency/Looping Options, ensure that Maximum Number of Concurrent HTTP Requests is set to 1.

The JSONExtractor retrieves the appendstatus, then a Tester checks whether the status is completed or failed, or whether the time limit has been exceeded.

Tester.png

Make sure you add 1 second to the elapsed-time counter before looping back. This is done in the ExpressionEvaluator.
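
The looping logic can be sketched in Python as a simple poll with a time limit. This is an outline of the idea rather than the custom transformer itself: `fetch_status` is a hypothetical callable that returns the current status text from the job URL, and the status values are compared case-insensitively because their exact spelling is an assumption.

```python
import time
from typing import Callable

def check_append_status(fetch_status: Callable[[], str],
                        time_limit_s: float = 300,
                        interval_s: float = 1.0,
                        sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the job completes, fails, or the time limit is exceeded."""
    elapsed = 0.0
    while True:
        status = fetch_status().lower()
        if status in ("completed", "failed"):
            return status
        if elapsed >= time_limit_s:
            return "timeout"
        sleep(interval_s)
        elapsed += interval_s  # the "+1 second" step before looping back
```

Passing `sleep` in as an argument makes the loop easy to try out without actually waiting, for example with `sleep=lambda seconds: None`.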

3. Add a TestFilter
Add a TestFilter to check the appendstatus for logging and potentially terminating the workspace. Configure it as follows.

Screenshot 2022-12-02 at 11.36.44 AM.png

4. Add a Logger and a Terminator
After the TestFilter "Completed" port, add a Logger to log that the workspace was successful. After the "Failed" and "Time out" ports, add a Terminator to stop the workspace and log that the workspace failed to append.

You are ready to run the workspace.
 

Data Attribution

The data used here originates from data made available by Auckland Transport. It contains information licensed under the Creative Commons Licence 4.0.
