Parallel Processing in FME

Liz Sanderson

Introduction

Parallel processing is the division of work over a number of processes. Modern computers have multiple CPU cores and spreading work across these cores makes the most of the available computing power.
Parallel processing in FME means the FME Engine divides its work into multiple worker processes. The workspace author decides when to use parallel processing and how many processes should be created. The operating system then decides how resources are allocated to those processes.
To use parallel processing within a workflow, place the section you want to run in parallel inside a custom transformer. See the Step-By-Step Instructions below for details.

Processing whole workspaces in parallel requires FME Flow with multiple engines. For more information, see Job Orchestration with Automations or Getting Started with the Split-Merge Block.
 

When To Use Parallel Processing

Parallel processing is most effective in two specific scenarios.

  1. A small number of groups, each with a large amount of processing to do. Parallel processing is less effective where there are a large number of small groups. 
  2. A large number of small tasks are being offloaded elsewhere. 
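
One way to see why the first scenario favors a few large groups is a toy cost model. The sketch below is not FME's scheduler. It assumes that every group pays a fixed startup cost to launch its own process (the hypothetical startup_s), that at most a fixed number of processes run at once, and that groups are dispatched in order to whichever slot frees up first. All numbers are made up for illustration.

```python
import heapq

def parallel_elapsed(group_costs, workers, startup_s):
    """Estimate elapsed seconds when each group runs in its own process.

    Assumption: every group pays startup_s to launch its process, and at
    most `workers` processes run at once (greedy, in the order given).
    """
    # Each heap entry is the time at which a worker slot becomes free.
    slots = [0.0] * workers
    heapq.heapify(slots)
    for cost in group_costs:
        free_at = heapq.heappop(slots)
        heapq.heappush(slots, free_at + startup_s + cost)
    return max(slots)

def serial_elapsed(group_costs):
    """Elapsed seconds when all groups run one after another in one process."""
    return float(sum(group_costs))

few_large = [10.0] * 4     # 4 groups, 10 s of work each
many_small = [0.1] * 400   # 400 groups, 0.1 s each (same total work)

for name, costs in (("few large", few_large), ("many small", many_small)):
    print(name, serial_elapsed(costs),
          parallel_elapsed(costs, workers=4, startup_s=1.0))
```

Under these assumed numbers, the four large groups finish well ahead of the serial run, while the 400 small groups take longer in parallel than in series because the per-process startup cost dominates.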

 

Step-By-Step Instructions

1. Open ParallelProcessing-Begin.fmwt Template Workspace
In FME Workbench, open the ParallelProcessing-Begin.fmwt template workspace, which is available to download from the Files section of this article. 
Workspace.png
This workspace creates digital elevation models (DEMs) from a series of contour lines. Since DEM creation is slow and the RasterDEMGenerator already uses Group Processing to process the DEMs, this workflow is a good candidate for parallel processing. 
GroupBy.png

Run the workspace and note how long it takes to run. On this computer, it took 25.1 seconds, with the single CPU taking 23.4 seconds. Depending on your hardware, the elapsed time may differ, and parallel processing may not be the best option. 
PreStats.png

2. Create Custom Transformer
Click and drag a box around all of the transformers, excluding the Inspector transformer. Right-click on one of the highlighted transformers and select Create Custom Transformer or press Ctrl+T (Cmd+T) on your keyboard. 
CreateCustom.png

In the popup dialog, name the custom transformer DEMCreator, then click OK. Note that if you try to name the transformer DEMGenerator, the name will not be accepted because a transformer with that name already exists in FME. 
CustomName.png

For more information on creating custom transformers, see the documentation.

3. Expose Attributes 
After clicking OK in the Create Custom Transformer dialog, the canvas should have automatically switched to the new DEMCreator tab. All of the parallel processing parameters are contained within the custom transformer. 
CustomWorkflow.png

After creating the custom transformer, you’ll notice that the 3DForcer has turned red, indicating that it is incomplete. Since the 3DForcer uses the Elevation attribute as a parameter, we need to expose that attribute within the custom transformer. Double-click on the green 3DForcer_Input custom transformer input, then in the Edit Transformer Input dialog, enable Elevation and click OK. After clicking OK, the 3DForcer should be valid. 
ExposeAttributes.png

4. Enable Parallel Processing
While still on the DEMCreator canvas, in the Navigator, expand Transformer Parameters, then double-click on Parallel Processing. Change Parallel Processing to Moderate. 
SetParallel.png

Moderate will enable processes based on the number of cores available. For example, if your computer has 4 cores, 4 processes will be enabled. 
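
As a rough illustration of the rule above, the following sketch reports how many processes Moderate would allow on the machine where it runs. The function name moderate_process_count is hypothetical, and os.cpu_count() reports logical cores, which may not match how FME counts cores on your hardware.

```python
import os

def moderate_process_count(cores=None):
    """Moderate: one process per available core, per the description above."""
    if cores is None:
        # os.cpu_count() can return None when the count is undetermined.
        cores = os.cpu_count() or 1
    return cores

print(moderate_process_count(4))   # the 4-core example from the text
print(moderate_process_count())    # this machine (logical cores)
```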

5. Remove Existing Group By Parameters
Before we can run the workspace with parallel processing, we need to set up the custom transformer’s Group By parameter. Before doing that, however, we need to disable any Group By that is set on the individual transformers inside it. 
For this workflow, only the RasterDEMGenerator is using Group By. Open the RasterDEMGenerator parameters, and disable Group Processing. 
DisableGrou.png

6. Enable Custom Transformer Group By
The final step is to set Group By on the custom transformer. Switch back to the Main canvas tab. 
WorkflowWithCustom.png

Double-click on the DEMCreator custom transformer to open the parameters. Set Group By to the fme_feature_type attribute, then click OK. 
Note that any exposed attributes will also show up in custom transformer parameters. 
Prompt.png

Now, when the workspace is run, the custom transformer will create a separate process for each fme_feature_type. Several processes will run in parallel (with Moderate processing, one for each CPU core).
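
The group-then-dispatch pattern described here can be sketched outside FME. The example below is only an analogy: it buckets made-up features by fme_feature_type and hands each group to a bounded worker pool. It uses threads rather than separate processes to keep the example short and portable, the function process_group is a hypothetical stand-in for the work done inside DEMCreator, and the feature type names are invented.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def group_by(features, key):
    """Bucket feature dicts by the value of one attribute."""
    groups = defaultdict(list)
    for feature in features:
        groups[feature[key]].append(feature)
    return dict(groups)

def process_group(item):
    """Hypothetical stand-in for the DEMCreator work on one group."""
    name, members = item
    return name, len(members)

def run_grouped(features, key, max_workers):
    groups = group_by(features, key)
    # One task per group; at most max_workers run at the same time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(process_group, groups.items()))

# Made-up contour features; the feature type names are illustrative only.
features = [
    {"fme_feature_type": "tile_a", "Elevation": 10},
    {"fme_feature_type": "tile_a", "Elevation": 20},
    {"fme_feature_type": "tile_b", "Elevation": 15},
]
print(run_grouped(features, "fme_feature_type", max_workers=4))
```

Each group is processed independently of the others, which is why the Group By attribute should separate the data into sets that do not need to see one another.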
 
7. Run the Workspace
Run the workspace and note the elapsed time. With parallel processing, it should run more quickly. On this computer, it took 20.3 seconds, with each CPU taking 1.9 seconds. 
PostStats.png
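
To put the two elapsed times reported in this article side by side, the short calculation below derives the time saved and the speedup. It uses only the 25.1 and 20.3 second figures from this tutorial's test machine; your own numbers will differ.

```python
serial_elapsed_s = 25.1    # elapsed time before parallel processing
parallel_elapsed_s = 20.3  # elapsed time with parallel processing

saved_s = round(serial_elapsed_s - parallel_elapsed_s, 1)
speedup = round(serial_elapsed_s / parallel_elapsed_s, 2)
percent_faster = round(100 * (serial_elapsed_s - parallel_elapsed_s) / serial_elapsed_s, 1)

print(saved_s, speedup, percent_faster)
```

The elapsed-time gain is much smaller than the drop in per-CPU time might suggest, which is consistent with the earlier caution that the benefit depends on your hardware and on how the work is grouped.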


For more information and a different parallel processing example, see the Process Data in Bulk with Custom Transformer Parallel Processing module of the Design for Performance course in the FME Academy. 
 

Data Attribution

The data used here originates from open data made available by the City of Vancouver, British Columbia. It contains information licensed under the Open Government License - Vancouver.
