Dynamic Workspaces: Data Driven Parallel Processing in FME Flow Automations

Files

- DynamicWorkspacesProject.fsproject (300 KB)
- DynamicWorkspaces.zip (30 KB)

Introduction

This article walks through how to use dynamic workspaces in FME Flow (formerly FME Server) automations. Dynamic workspaces allow the workspace that is to be executed to be specified at run time rather than at “author time”. With this capability, an automation can invoke any workspace that matches a template workspace, without the automation having to be stopped. This example demonstrates two enterprise integration patterns: the splitter pattern and the content enricher pattern. For more information on patterns, see Getting Started with Enterprise Integration Patterns.

In this scenario, you are a retail company that purchases items from hundreds or thousands of different manufacturers. You want to retrieve details from invoices, process them, and then automatically send an email to each manufacturer to complete the purchases. Every time a manufacturer is added or removed, the workflow needs to be updated, but the automation cannot be stopped for editing. Dynamic workspaces solve this problem.

Step-by-step Instructions

To solve this problem, we need to implement two different patterns: the splitter pattern and the content enricher pattern.

Part 1: Splitter Pattern

In this workflow, we implement the splitter pattern with a simple workspace. This workspace decodes the invoice, which is represented by an input JSON file. The workspace ensures the invoice is valid JSON (you could also set this up for XML, CSV, or any data source supported by FME), then fragments the JSON to its constituent invoice items before writing out to the Automations writer.

In effect, this workspace turns a single order message into many messages (one for each item in the invoice). It splits the invoice into its constituent items, which can then be processed separately and in parallel.

Splitter Pattern Overview

The splitter pattern addresses the common challenge of:
"How can we process a message if it contains multiple elements, each of which may have to be processed in a different way?"

Splitter

Splitter image by Enterprise Integration Patterns, under CC-BY-4.0

The splitter pattern breaks out the composite message into a series of individual messages, each containing data related to one item.

Splitter Pattern Implementation

To implement the splitter pattern on our input JSON file containing the invoice information, we can do the following steps:

1. Add a Text File Reader to Read the Invoice File

In a blank workspace, add a Text File reader and read in the Invoice.json dataset. This JSON dataset is just a placeholder, as we will use a Resource or Network Directory (formerly Directory Watch) trigger in our FME Flow Automation later on.

The top-level key in the JSON file is invoice_items, which holds an array of items. Each item has the keys Item, Price, Manufacturer, Code, and Quantity, which will be translated into attributes. For example:

{
  "invoice_items": [
    { "Item": "Macbook Pro",
      "Price": 3299.00,
      "Manufacturer": "Apple",
      "Code": "MBPA",
      "Quantity": 1}
  ]
}
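
Before an invoice is dropped into the watched folder, it can help to confirm that it parses as JSON and that every item carries the expected keys. The Python sketch below is not part of the FME workspace; the function name validate_invoice is hypothetical, and the key names are taken from the sample invoice.

```python
import json

# Keys each invoice item is expected to carry, taken from the sample invoice.
REQUIRED_KEYS = {"Item", "Price", "Manufacturer", "Code", "Quantity"}


def validate_invoice(text):
    """Return a list of problems found in the invoice text (empty if none)."""
    try:
        invoice = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]
    if not isinstance(invoice, dict):
        return ["top level is not a JSON object"]
    items = invoice.get("invoice_items")
    if not isinstance(items, list):
        return ["missing invoice_items array"]
    problems = []
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"item {index} is not a JSON object")
            continue
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            problems.append(f"item {index} is missing {sorted(missing)}")
    return problems


sample = """
{
  "invoice_items": [
    {"Item": "Macbook Pro", "Price": 3299.00, "Manufacturer": "Apple",
     "Code": "MBPA", "Quantity": 1}
  ]
}
"""
print(validate_invoice(sample))  # prints []
```

An empty list means the file is safe to hand to the workspace; anything else describes what to fix first.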

2. Modify the User Parameter

When adding source data, a user parameter is automatically created. We need to modify this parameter so that it is easy to identify in FME Flow later. In the Navigator pane, expand User Parameters, right-click on [SourceDataset_TEXTLINE], and select Manage User Parameters.

In the User Parameters dialog, set the following:

Parameter Identifier: Invoice
Prompt: Invoice:

3. Add a JSONFragmenter to Split the Invoice into Invoice Items

Add a JSONFragmenter to the canvas, connect it to the Text File reader feature type and set the following parameters in the dialog box.

JSON Attribute: text_line_data
JSON Query: json["invoice_items"][*]
Reject Features which Produce No Fragments: No
Flatten Query Results into Attributes: Yes

The JSON Query selects every element of the invoice_items array, so each invoice item becomes its own feature. Flatten Query Results into Attributes turns each item's key-value pairs into attributes.

The final step is to expose the attributes. Click on the ellipsis next to Attributes to Expose, and manually enter the following:

Code
Manufacturer
Quantity
Price
Item
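
As a rough illustration of what the JSONFragmenter step does, the following Python sketch splits one invoice document into one flat record per item. It is only an analogy for the transformer, not how FME implements it; the function name split_invoice and the second invoice item are hypothetical.

```python
import json


def split_invoice(text_line_data):
    """Turn one invoice document into one flat record per invoice item."""
    invoice = json.loads(text_line_data)
    # Comparable in spirit to the query json["invoice_items"][*]:
    # every element of the array becomes its own record.
    return [dict(item) for item in invoice.get("invoice_items", [])]


invoice_text = json.dumps({
    "invoice_items": [
        {"Item": "Macbook Pro", "Price": 3299.00, "Manufacturer": "Apple",
         "Code": "MBPA", "Quantity": 1},
        # Hypothetical second item, added for illustration only.
        {"Item": "Galaxy Tab", "Price": 499.00, "Manufacturer": "Samsung",
         "Code": "GTAB", "Quantity": 2},
    ]
})

for record in split_invoice(invoice_text):
    print(record["Manufacturer"], record["Code"], record["Quantity"])
```

Each printed line corresponds to one message that would be passed on to the rest of the automation.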

4. Write Invoice Items to Automations

The final step is to use the Automations writer to write each invoice item to the automations framework. We set the exposed attributes so they are available throughout the automation. Add an FME Flow Automations writer to the canvas, then set the Feature Type Definition to Automatic.

In the Feature Type dialog, set the Feature Type Name to InvoiceItem.

5. Publish to FME Flow
Save the workspace as DecodeInvoice.fmw, then publish it to FME Flow. Create a new repository called DynamicWorkspaces and register the workspace with the Job Submitter service.

Part 2: Content Enricher Pattern

The content enricher pattern is a common pattern used when a message doesn't contain all the required information necessary for downstream processing.

Here we have an invoice item message that contains the manufacturer, but we also need to know the name of the workspace that is to be run for that manufacturer. FME's ability to pull information from virtually every system is key here, along with the Automations writer, which makes it easy for us to "enrich" the content of a message.

To do this, we need to create a simple workspace that joins the manufacturer's name to the corresponding workspace name. We use a CSV file here for demo purposes, but it could be any other format supported by FME. When a new manufacturer is added, we update the CSV file. This means anyone can update the file with the new workspace name with little to no FME knowledge.

Content Enricher Pattern Overview

The content enricher pattern addresses the common challenge of:
"How do we communicate with another system if the message originator does not have all the required data items available?"

Message Router Pattern image by Enterprise Integration Patterns, under CC-BY-4.0.

Use the Content Enricher pattern to access an external data source in order to augment a message with missing information.

Content Enricher Pattern Implementation

To implement the Content Enricher pattern to attach the workspace name for the manufacturer, we do the following steps:

1. Create User Parameters

In a blank workspace, we will create two user parameters: ManufacturerName and WorkspaceMapping.

In the Navigator window, right-click on User Parameters and select Manage User Parameters. In the User Parameters dialog enter the following:

Type: Text
Parameter Identifier: ManufacturerName
Prompt: Manufacturer
Published: Yes
Required: Yes

Then create a second user parameter with the following parameters:

Type: File/Folder/URL
Parameter Identifier: WorkspaceMapping
Prompt: Manufacturer Workspace Mapping:
Default Value: C:\Users\Administrator\Documents\DynamicWorkspaces\manufacturerWorkspaceMap.csv (or the path where the CSV file is saved)

2. Create Manufacturer Attribute

Now we need to create the Manufacturer attribute. First, add a Creator to the canvas, then connect an AttributeCreator. In the AttributeCreator, create an attribute named Manufacturer and set its value to:

@LowerCase($(ManufacturerName))

This converts whatever value is passed to the ManufacturerName user parameter to lowercase.

3. Join Workspace Mapping File

Using a DatabaseJoiner, we will now join the manufacturer attribute with the manufacturer column in the WorkspaceMapping CSV file to get the workspaces that will be run. To do this, add a DatabaseJoiner to the canvas and connect it to the AttributeCreator. In the parameters, set the following:

Format: CSV (Comma Separated Value)
Dataset: $(WorkspaceMapping) [User Parameter]
Table: click the ellipsis and select CSV
Feature Attribute: Manufacturer
Table Field: manufacturer
Fields to Add: workspace
Cardinality: Must Match Exactly One (1:1)
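
The lookup that the DatabaseJoiner performs can be pictured with a short Python sketch. The column names manufacturer and workspace come from the settings above, but the CSV rows shown here are assumed contents for manufacturerWorkspaceMap.csv, and get_workspace is a hypothetical helper rather than anything FME provides.

```python
import csv
import io

# Assumed contents for manufacturerWorkspaceMap.csv; only the column names
# (manufacturer, workspace) come from the DatabaseJoiner settings.
MAPPING_CSV = """manufacturer,workspace
apple,appleOrder.fmw
mattel,mattelOrder.fmw
hasbro,hasbroOrder.fmw
samsung,samsungOrder.fmw
"""


def get_workspace(manufacturer_name, mapping_file):
    """Return the mapped workspace, or None unless exactly one row matches."""
    key = manufacturer_name.lower()  # mirrors @LowerCase($(ManufacturerName))
    matches = [row["workspace"] for row in csv.DictReader(mapping_file)
               if row["manufacturer"] == key]
    return matches[0] if len(matches) == 1 else None


print(get_workspace("Apple", io.StringIO(MAPPING_CSV)))  # prints appleOrder.fmw
print(get_workspace("Lego", io.StringIO(MAPPING_CSV)))   # prints None
```

Returning None for zero or multiple matches is a simplification of the 1:1 cardinality setting; in the workspace, only features leaving the Joined port reach the Automations writer.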

4. Write to Automations

The final step is to use the Automations writer so that these attributes can be used throughout the automation. Add an FME Flow Automations writer to the canvas and set the Feature Type Definition to Automatic.

In the Feature Type dialog, set the Feature Type Name to ManufacturerInfo. Connect the writer feature type to the Joined output port on the DatabaseJoiner.

5. Publish to FME Flow

Save the workspace as getWorkspace.fmw, then publish it to the same DynamicWorkspaces repository used in Part 1. Enable Upload data files, then click Select Files.

In the Select Files to Upload dialog, click Select Location. Then in the Select Upload Location dialog, select “Upload to a shared resource folder”. Navigate to Data > DynamicWorkspaces and create a new folder called WorkspaceMapping. Ensure the WorkspaceMapping folder is selected and click OK.

Next, select Add Files, browse to manufacturerWorkspaceMap.csv, and click OK.

Part 3: Dynamic Workspaces: Manufacturer Workspaces

Here, we address the challenge of how to run a specific workspace for a manufacturer. In this scenario, we have thousands of manufacturers, which makes it impractical to add a separate Run Workspace action for each manufacturer.

Here we are going to use dynamic workspaces in FME Flow (2020.1 and later), along with a simple control file. The control file is a simple lookup from the manufacturer name to the workspace with specific manufacturer actions.

Using this, we will be able to run any single workspace from a large collection of workspaces. To illustrate how this is done, we will first create the workspace. Since this is a demonstration, we will create one workspace and then duplicate it under different file names. In a real situation, these workspaces would each do something different to satisfy the communication requirements of the different manufacturers. The only things that need to be the same across all of the workspaces are the user parameters and the output feature types from the Automations writer.

Manufacturer Workspace Implementation

Implementing the manufacturer workspace is no different than implementing other workspaces in FME. The one requirement is that the user parameters, and the Automations writer output feature types when an Automations writer is used, are consistent across all manufacturer workspaces.

The steps for building our manufacturer workspace that will be run dynamically in our automation are given below.

1. Create User Parameters
In a blank workspace, we will need to create three user parameters: Manufacturer_Item_Number, Description, and Quantity.

In the Navigator window, right-click on User Parameters and select Manage User Parameters. In the User Parameters dialog, enter the following:

Type: Number
Parameter Identifier: Manufacturer_Item_Number
Prompt: Item Number:
Published: Yes
Required: Yes
Number Configuration:
Lower Limit - None
Upper Limit - None
Numeric Precision: Float

Create the second user parameter with the following:

Type: Text
Parameter Identifier: Description
Prompt: Description:
Published: Yes
Required: Yes

Create the third user parameter with the following:

Type: Number
Parameter Identifier: Quantity
Prompt: Quantity:
Configuration:
Lower Limit - Greater than or equal to value
Value: 0
Upper Limit - Less than or equal to value
Value: 0
Published: Yes
Required: Yes
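
Because every manufacturer workspace has to publish the same parameters as the template, a simple consistency check can be useful when many workspaces are maintained by different people. The Python sketch below only compares parameter identifiers; the per-workspace parameter lists are made-up inputs, and how they would be obtained from FME Flow is outside the scope of this sketch.

```python
# Parameter identifiers every manufacturer workspace is expected to publish.
TEMPLATE_PARAMETERS = {"Manufacturer_Item_Number", "Description", "Quantity"}


def matches_template(published_parameters):
    """True when a workspace publishes exactly the template's identifiers."""
    return set(published_parameters) == TEMPLATE_PARAMETERS


# Hypothetical parameter lists, one per workspace, for illustration only.
workspaces = {
    "appleOrder.fmw": ["Manufacturer_Item_Number", "Description", "Quantity"],
    "mattelOrder.fmw": ["Manufacturer_Item_Number", "Description"],
}

for name, parameters in workspaces.items():
    print(name, matches_template(parameters))
```

In this made-up input, mattelOrder.fmw is missing Quantity, so it would not be a safe target for the dynamic workspace action.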

2. Create Simple Workspace

Add a Creator and attach a Logger. In this demo, we are only really interested in the published parameters, but in reality, you would have a more complicated workspace.

3. Publish to FME Flow

The workspace is ready to be published to FME Flow. Before publishing, save the workspace as appleOrder.fmw, as we will be duplicating it in the next step.

Publish to the DynamicWorkspaces repository and register with the Job Submitter service.

4. Duplicate Workspace

To save time, since this is just a demonstration, we will duplicate the workspace saved in the previous step. Optionally, you can download the other workspaces (mattelOrder.fmw, hasbroOrder.fmw, samsungOrder.fmw).

In your file browser, copy appleOrder.fmw and rename the copy to mattelOrder.fmw. Open the workspace and publish it to FME Flow.

Repeat the steps and create hasbroOrder.fmw and samsungOrder.fmw.

You should now have four order workspaces published to FME Flow.

Part 4: Dynamic Workspaces

With everything published to FME Flow, we can now create the automation that will run all of these workspaces.

1. Open FME Flow

Open and log into the FME Flow Web UI. In the side menu bar, expand Automations and click on Create Automation. Close the Getting Started dialog if it pops up.

2. Set Up the Resource or Network Directory Trigger
To trigger our automation, we will use a Resource or Network Directory trigger. Drag and drop a trigger to the automations canvas. In the parameters, set the following:

Trigger: Resource or Network Directory (updated).
Directory to Watch: $(FME_SHAREDRESOURCE_DATA)/DynamicWorkspaces/
Events to Watch for: Create, Modify
Poll Interval: 1 Minute
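
Conceptually, a polling directory trigger compares the folder's contents between two points in time and reports what was created or modified. The Python sketch below illustrates that idea with a temporary folder; it is not how FME Flow implements the trigger, and the function names are hypothetical.

```python
import os
import tempfile


def snapshot(directory):
    """Map each file name in the directory to its last-modified time."""
    return {entry.name: entry.stat().st_mtime_ns
            for entry in os.scandir(directory) if entry.is_file()}


def detect_events(before, after):
    """Compare two snapshots and report Create and Modify events."""
    events = []
    for name, mtime in sorted(after.items()):
        if name not in before:
            events.append(("Create", name))
        elif before[name] != mtime:
            events.append(("Modify", name))
    return events


with tempfile.TemporaryDirectory() as watched:
    before = snapshot(watched)
    with open(os.path.join(watched, "Invoice.json"), "w") as handle:
        handle.write('{"invoice_items": []}')
    after = snapshot(watched)
    print(detect_events(before, after))  # prints [('Create', 'Invoice.json')]
```

A real poller would take a new snapshot once per poll interval, which is why an uploaded invoice can take up to a minute to be picked up with the settings above.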

3. Run DecodeInvoice.fmw (Implementing the Splitter Pattern)
Now we will run the first workspace called DecodeInvoice.fmw. Add an action to the canvas and connect it to the Resource or Network Directory trigger. In the parameters, set the Action to Run Workspace, the Repository to DynamicWorkspaces, and the Workspace to DecodeInvoice.fmw.

This is where we can begin to control our automation externally without needing to pause the automation to change anything if we add another vendor/workspace. Click the drop-down arrow next to the Invoice parameter. Expand Directory and select File Path. This will read the JSON file that we will put into the DynamicWorkspaces folder to trigger the automation.

After clicking Apply, an InvoiceItem output port will appear under the DecodeInvoice action. This port is created from the Automations writer we set up in the workspace, and it allows the attributes to be carried through the rest of the automation.

4. Run getWorkspace.fmw (Implementing the Content Enricher Pattern)
The next step is to add the workspace called getWorkspace.fmw, which joins the manufacturer of each invoice item with the lookup table containing all of the order workspace names.

Add another action to the canvas and connect it to the InvoiceItem output port on the DecodeInvoice action.

In the parameters, set the Action to Run Workspace, the Repository to DynamicWorkspaces, and the Workspace to getWorkspace.fmw.

Next, click the drop-down arrow next to Manufacturer, then expand Workspace > InvoiceItem and select Manufacturer. Then, for Manufacturer Workspace Mapping, click the ellipsis button, browse to Data > DynamicWorkspaces > WorkspaceMapping, and select the manufacturerWorkspaceMap.csv file that was uploaded in Part 2.

5. Run Manufacturer Workspaces Using Dynamic Workspace Capability
Now we will add the dynamic workspace. Add an action component to the canvas and connect it to the ManufacturerInfo output port on the getWorkspace action.

In the Action parameters, set the Action to Run a Dynamic Workspace and set the Repository to DynamicWorkspaces. For Workspace, we need to point to the value returned from the lookup table containing the workspace names. Click the drop-down arrow next to Workspace, expand Workspace > ManufacturerInfo, and then select workspace. This attribute comes from the getWorkspace.fmw Automations writer.

Next, we will need to set up the user parameters. In the dialog, select Import Parameters. Set DynamicWorkspaces as the Template Repository and appleOrder.fmw as the Template Workspace. After clicking OK, the user parameters from the appleOrder.fmw workspace will be available.

We need to map the parameters to the values from the input JSON file. Click on the drop-down next to Item Number and expand Workspace > InvoiceItem, then select Code. Repeat for Description and Quantity, setting them to Item and Quantity, respectively.
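
To make the parameter mapping and the parallel fan-out concrete, here is a minimal Python sketch of the same flow. It is only an analogy: run_dynamic_workspace is a stand-in that reports which job would be submitted rather than calling FME Flow, the lookup dictionary stands in for the output of getWorkspace.fmw, and the second invoice item is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical lookup, standing in for the output of getWorkspace.fmw.
WORKSPACE_FOR = {"apple": "appleOrder.fmw", "samsung": "samsungOrder.fmw"}


def build_parameters(invoice_item):
    """Map InvoiceItem attributes onto the template's user parameters."""
    return {
        "Manufacturer_Item_Number": invoice_item["Code"],
        "Description": invoice_item["Item"],
        "Quantity": invoice_item["Quantity"],
    }


def run_dynamic_workspace(invoice_item):
    """Stand-in for the dynamic workspace action; it only reports the job."""
    workspace = WORKSPACE_FOR[invoice_item["Manufacturer"].lower()]
    return workspace, build_parameters(invoice_item)


invoice_items = [
    {"Item": "Macbook Pro", "Manufacturer": "Apple", "Code": "MBPA", "Quantity": 1},
    # Hypothetical second item, added for illustration only.
    {"Item": "Galaxy Tab", "Manufacturer": "Samsung", "Code": "GTAB", "Quantity": 2},
]

# Each item is independent, so the jobs can be handled in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    for workspace, parameters in pool.map(run_dynamic_workspace, invoice_items):
        print(workspace, parameters)
```

The workspace name is data, not code: supporting another manufacturer changes the lookup, while the mapping function and the dispatch loop stay the same.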

6. Trigger Automation
With the automation built, we can now save it and start it.

Once the automation is started, navigate to Files & Connections on the side menu bar, expand it and then select Resources.

In the Resources, browse to Data > DynamicWorkspaces. In the DynamicWorkspaces folder, click on the Upload drop-down and select File(s). Select the Invoice.json file. Wait 1 minute.

After a minute has passed, navigate to Jobs > Completed on the side menu bar. You will see that the DecodeInvoice.fmw workspace ran once, which triggered getWorkspace.fmw once for each invoice item, followed by the order workspace that manufacturerWorkspaceMap.csv lists for that item's manufacturer. Note that appleOrder.fmw ran twice because there are two entries for Apple in our Invoice.json file.

To add more order workspaces, publish the CompanyNameOrder.fmw workspace for the new company to the DynamicWorkspaces repository, and then add the company and workspace name to the mapping file. The automation does not need to be stopped.
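
Updating the mapping file can be done by hand, but the step is simple enough to script. The following Python sketch appends one row to the mapping text and refuses duplicates. It assumes the file uses the manufacturer and workspace columns and stores manufacturer names in lowercase; the helper name add_manufacturer and the Lego example are hypothetical.

```python
import csv
import io


def add_manufacturer(mapping_text, manufacturer, workspace):
    """Return the mapping CSV text with one more row, rejecting duplicates."""
    key = manufacturer.lower()  # assumption: names are stored in lowercase
    rows = list(csv.DictReader(io.StringIO(mapping_text)))
    if any(row["manufacturer"] == key for row in rows):
        raise ValueError(f"{key} is already mapped")
    rows.append({"manufacturer": key, "workspace": workspace})
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=["manufacturer", "workspace"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return output.getvalue()


mapping = "manufacturer,workspace\napple,appleOrder.fmw\n"
print(add_manufacturer(mapping, "Lego", "legoOrder.fmw"))
```

Rejecting duplicates keeps the lookup unambiguous, which matters because the join in getWorkspace.fmw expects exactly one match per manufacturer.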
