Introduction
The Split-Merge Block adds a new level of advanced functionality to FME Flow Automations, so it’s important to understand how different configurations affect the Block's behavior. This guide takes a deeper look at how the Split-Merge Block works, how to get the most out of it, and how to troubleshoot it.
Prerequisites
Please note that this is an advanced Automation authoring article. It’s imperative to have a working knowledge of the Split-Merge Block, Automation Writer and output keys. For quick introductions, please read the Split-Merge Block guide, Automation Writer documentation, Output Keys documentation, and Building Integrations with the FME Flow Automation Writer.
Automation Design Considerations With the Split-Merge Block
1. Build your workspaces with the Automation Writer and Split-Merge Block in mind.
When building advanced Automations, carefully consider the following:
- Do I need to pass information between workspaces? If so, use the Automation Writer.
- Can these jobs make use of parallelism? Or, do I need to scale down my jobs after an Automation Writer so I can run downstream actions? If so, use the Split-Merge Block.
2. Parallel jobs may be accessing the same data source or destination at the same time. Consider the data sources and destinations in your Automation.
Single-user data formats may run into issues with parallel jobs, especially if the jobs write to the same file. Similarly, jobs may run into rate limit problems if they interact with the same web service. Multi-user source and destination formats like databases and Enterprise Geodatabases are often ideal for parallel jobs that read from or write to the same dataset. Be careful that database transactions do not interfere with each other across parallel jobs.
3. When designing the Automation, assess the number of jobs that will be generated by Automation Writers.
If your FME Flow only has a few Engines, they may become a bottleneck if your Automation splits out to hundreds or thousands of jobs. If your current Engine count cannot handle a high job count, look for ways to filter or aggregate the output features before sending them out of the Automation Writers.
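A rough back-of-the-envelope check can help decide whether filtering or aggregating is needed. The sketch below is not FME functionality: the function name and the numbers are illustrative. It assumes that every job takes about the same time and that all Engines are free to work on this Automation.

```python
import math
from typing import Tuple


def estimate_waves(job_count: int, engine_count: int,
                   avg_job_seconds: float) -> Tuple[int, float]:
    """Estimate how many 'waves' of jobs the Engines must work through.

    Assumes each Engine runs one job at a time and all jobs take
    roughly avg_job_seconds. Returns (waves, total_seconds).
    """
    if engine_count < 1:
        raise ValueError("at least one Engine is required")
    waves = math.ceil(job_count / engine_count)
    return waves, waves * avg_job_seconds


# Hypothetical figures: 1000 features leave the Automation Writer,
# 4 Engines are available, and each job takes about 30 seconds.
waves, seconds = estimate_waves(1000, 4, 30.0)
print(waves, seconds)  # 250 7500.0
```

If the estimate is far longer than the workflow can tolerate, that is a hint to reduce the number of features leaving the Automation Writer or to add Engines.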
4. Consider using Queue Control to choose which Engines the parallel jobs will run on.
If your Automation scales up to a large number of jobs, it could overwhelm your job queue and cause a backlog of jobs. If you want to reserve one or more Engines for high-priority jobs, make use of queue control and job routing.
Output Port Behavior
Since the Split-Merge Block treats all contained workspaces as a single action, one input message corresponds to one output message. After the jobs in a Split-Merge Block finish running, the Block releases a failure message only if:
- Workspace(s) are connected to the Block’s failure port, AND
- At least one job from these connected workspaces reports a failure.
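The two conditions above can be read as a small decision rule. The sketch below models that rule only. The class and function names are hypothetical and are not part of FME Flow, and the workspace names are used purely as sample data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobResult:
    workspace: str
    failed: bool
    # True if this workspace's failure output is wired to the Block's failure port
    failure_port_connected: bool


def block_output(jobs: List[JobResult]) -> str:
    """Return 'failure' only if a job wired to the failure port failed."""
    for job in jobs:
        if job.failure_port_connected and job.failed:
            return "failure"
    return "success"


# A failing workspace that is not wired to the failure port is not checked.
unmonitored = [
    JobResult("Process Parcels", failed=True, failure_port_connected=False),
    JobResult("Report Generator", failed=False, failure_port_connected=True),
]
print(block_output(unmonitored))  # success

# Wiring every workspace to the failure port monitors all of them.
monitored = [
    JobResult("Process Parcels", failed=True, failure_port_connected=True),
    JobResult("Report Generator", failed=False, failure_port_connected=True),
]
print(block_output(monitored))  # failure
```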
There are a variety of ways you can control this behavior based on how you connect workspaces to the Split-Merge Block’s output ports. Below are a few examples of how you can configure this output behavior.
Connect One or More Workspaces to Both Output Ports
The rule above applies whether you have just one workspace (1), multiple workspaces in parallel (2), or multiple chained workspaces (3) connected to the Block’s failure port. These three configurations are the most common.
In (1), the Split-Merge Block will only release a failure message if Report Generator fails. Since Process Parcels isn’t connected to the Block’s failure port, the Block will not check if this job fails. If you want to monitor all contained workspaces for failures, make sure all these actions’ failure outputs are connected to the Split-Merge Block’s failure port, like example (3).
Note that in (2), connecting multiple workspaces to the success port is only necessary if you’re running workspaces in parallel and want the Split-Merge Block to wait for all the parallel jobs to finish before releasing the output message (much like the Merge Action).
Connect Workspace(s) To Only the Success Port
If your workspace(s) are only connected to the Split-Merge Block’s success port, it will always release a success message regardless of whether the jobs succeed or fail. This is because the failure port on the Block is not connected to any workspace action and won’t respond to a failure.
Note: In the example above, if Check Requests fails or outputs zero work order requests, Process Order will not run. However, the Split-Merge Block will report that it “Received 0 notification(s) for successful split-merge action” and still run downstream actions connected to the Block’s success port. If you wish to receive failure notifications, we recommend a configuration like example (3) above.
No Connections to Either Output Port
This is still considered a valid configuration and will always release a success message regardless of whether any job succeeds or fails.
Connect an Automation Writer To the Output Ports
This is a niche use case. If you connect an Automation Writer to the Block’s success port, the final message output from the Split-Merge Block will contain the attribute information of one feature. If you connect the Automation Writer to the Block’s failure port, any feature that’s routed here will cause the Block to report a failure.
This scenario should not be used if more than one feature could be routed out of the Automation Writer. See an example use case below called “Using an Automation Writer Output as a Failure Condition”.
Tip: To guarantee that only a single feature is output from the Automation Writer, consider using the Aggregator transformer.
Aggregating Output Keys from Multiple Ports
Another great use case for the Split-Merge Block is merging output keys from multiple ports. This is useful when you need keys from both a job status output port and an Automation Writer port.
For example, if you want to send a summary email that includes job status, finish time, and attribute values you can utilize a Split-Merge Block to merge the output keys from multiple ports to send a single email (or log a single message) with details about the job and attributes. In the example below, the workspace outputs summary statistics through the Automation Writer, and we want to include this information along with the job status information. By connecting both output ports to the Split-Merge Block, the Email Action has access to both of these sets of keys.
As with the tip in the previous section, this is analogous to using the default settings of an Aggregator in an FME workspace: only the attribute values from the first feature to arrive at the Automation Writer feature type are retained. If multiple features are sent to the Automation Writer and you want all of their values to be available in the next action, concatenate the attribute in the workspace, either with an Aggregator (specifying Attributes to Concatenate) or with a StringConcatenator before the writer feature type.
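To make the difference concrete, here is a minimal Python sketch of the two behaviors: keeping only the first feature's attributes versus concatenating one attribute across all features. It imitates the idea only and does not use any FME API. The function names and the attribute names (parcel_id, status) are made up for illustration.

```python
from typing import Dict, List

Feature = Dict[str, str]


def first_feature_only(features: List[Feature]) -> Feature:
    """Keep only the attributes of the first feature to arrive."""
    return dict(features[0]) if features else {}


def concatenate_attribute(features: List[Feature], name: str,
                          separator: str = ", ") -> Feature:
    """Keep the first feature's attributes, but join one attribute
    across all features into a single value."""
    merged = first_feature_only(features)
    if features:
        merged[name] = separator.join(f[name] for f in features if name in f)
    return merged


features = [
    {"parcel_id": "A1", "status": "ok"},
    {"parcel_id": "B2", "status": "ok"},
]
print(first_feature_only(features))                  # only A1 survives
print(concatenate_attribute(features, "parcel_id"))  # parcel_id becomes "A1, B2"
```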
Special Use Cases
Imitating External Actions in a Split-Merge Block
External Actions in Automations let you send messages to external clients or within FME Flow. However, External Actions cannot be used inside a Split-Merge Block. For many External Actions, though, an equivalent transformer exists: for example, the Emailer for the Email Action, the HTTPCaller for the HTTP Action, the FTPCaller for the FTP Action, and so forth. Substituting a workspace that uses the equivalent transformer provides a workaround for performing an external action within a Split-Merge Block.
Consider this scenario in the screenshot above: a user wishes to receive a single email notification after all files have successfully uploaded to an FTP site. A Split-Merge Block is needed to wait for all the FTP uploads to finish before sending the email. They substitute the FTP Action with a workspace that uploads the files with the FTPCaller.
Note: One caveat for this workaround is that running a workspace requires using an Engine. External Actions do not use Engines.
Using an Automation Writer Output as a Failure Condition
It’s possible to make a Split-Merge Block report a failure by routing output features with the Automation Writer. This can be useful in validation workflows where the job always succeeds regardless of whether it receives good or bad data.
Consider the Automation in the screenshot below where the geometries of datasets are being validated. Downstream Work should not be run if any invalid features are detected. If any features are invalid, an HTML report on the geometry issues is generated and routed from the “HTML Report” Automation Writer to the Block’s failure port. This causes the Split-Merge Block to fail.
Since we’re routing an Automation Writer message to the Block’s output port, we have access to the message’s output keys, including the HTML Report attribute. This information can be added to the connected Email Action body.
The key takeaways from this scenario are:
- The failure port of the Geometry Checker workspace isn’t a useful failure condition since the job succeeds even when it receives bad data.
- The Automation Writer can route information about bad features to the failure port of the Split-Merge Block to trigger a Block-wide failure and prevent Downstream Work from running.
- Split-Merge Blocks only release a single message when they finish, so the Geometry Checker workspace was designed to guarantee that only one feature, the HTML report, will be output from the Automation Writer.
- Attribute information from this single message can be used in downstream actions such as the body of an email notification.
Troubleshooting & FAQs
Having trouble conceptualizing how jobs are scaled up by the Automation Writer and down with the Split-Merge Block?
Think of one-in-one-out logic. A workspace downstream of an Automation Writer runs once for every feature the writer outputs, so N features start N jobs. In contrast, a Split-Merge Block releases one output message for each event or message that enters the Block’s input, no matter how many jobs run inside it. For a refresher, please review our introductory article Getting Started with the Split-Merge Block.
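The counting can be sketched in a few lines of Python. This is a toy model, not FME code. It assumes a Block that contains one workspace with an Automation Writer feeding a second workspace, and the function name is hypothetical.

```python
from typing import Tuple


def count_jobs_and_messages(input_messages: int,
                            features_per_message: int) -> Tuple[int, int]:
    """Return (downstream_jobs, block_output_messages).

    Each message entering the Block makes the Automation Writer emit
    features_per_message features; each feature starts one downstream job.
    The Block releases exactly one message per input message.
    """
    downstream_jobs = 0
    block_output_messages = 0
    for _ in range(input_messages):
        downstream_jobs += features_per_message  # scale up: N features -> N jobs
        block_output_messages += 1               # scale down: one message out
    return downstream_jobs, block_output_messages


print(count_jobs_and_messages(1, 5))  # (5, 1)
print(count_jobs_and_messages(3, 4))  # (12, 3)
```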
What Automation components can I use in the Split-Merge Block?
Currently, you can only use the Workspace and Dynamic Workspace actions inside a Split-Merge Block. However, you can still connect workspaces inside the Block to other types of Actions outside the Block. You cannot include Triggers inside a Block, but you may want to consider looking at the Merge Action for this type of functionality.
When creating the Automation, should I create a Split-Merge Block first and add workspaces into it, or should I create the workspaces first and then create a Split-Merge Block to contain them?
Both methods are completely fine, but keep a close eye on the input port. The Split-Merge Block won’t automatically reconnect components at the input or output ports when moving the Block to encircle a set of workspaces.
What can I connect to the input port of a Split-Merge Block?
You can connect almost any Automation component to a Split-Merge Block input port including Triggers, Actions, External Actions, and Automation Writers. However, you cannot connect the output of one Block directly to the input of another Block.
Do I need to connect all success ports of workspace actions to the Split-Merge Block’s success port?
In typical use cases, only the last workspace(s) in the Block should have their success outputs connected to the Block’s success port. For uncommon exceptions, please review the Output Port Behavior section of this article.
If my Split-Merge Block runs 10 jobs, but job #3 fails, what happens to the remaining 7 jobs?
The Block is designed to wait until all jobs are complete regardless of whether they succeed or fail. The remaining 7 jobs will still run. The resulting output message from the Split-Merge Block will depend on your output port configuration between the actions and the Block.
If two different workspaces’ jobs fail in my Split-Merge Block, will I receive more than one message?
No, the Split-Merge Block will only release a single failure message.
If a job fails but succeeds when retried (from the workspace retry settings), is that still considered a failure?
Yes, this counts as a failure notification even if a retry eventually succeeds.
My Split-Merge Block is still outputting a success message even though one or more jobs are failing in the Block. Why is this happening?
This is likely because the failure port of the failing workspace isn’t connected to the failure port of the Block. Please review the Output Port Behavior section of this article.
Why isn’t the output message of the Split-Merge Block a combination of every message?
The “merge” in Split-Merge Block refers to scaling the contained jobs back down to a single output message, not to combining every event message into one. Think of it like an Aggregator output that only has the attributes of one feature/job.
Why doesn’t the Split-Merge Block output the same message every time?
Certain jobs may finish before others in an inconsistent order, and the Split-Merge Block only releases information about the final job.
Can I access Automation Writer keys in downstream actions from the Split-Merge Block?
Yes, but note that only one set of Automation Writer message values is accessible, based on the last job that ran inside the Block. If the Automation Writer releases more than one message, your workflow cannot rely on receiving the same Automation Writer output from the Split-Merge Block on every run.
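The effect of job ordering can be illustrated with a small model. The sketch assumes that the "last job" is simply the one with the latest finish time. The function name, job names, timings, and message contents are invented for the example.

```python
from typing import Dict, List, Tuple

# (job name, finish time in seconds, message released by that job)
Job = Tuple[str, float, Dict[str, str]]


def released_message(jobs: List[Job]) -> Dict[str, str]:
    """Return the message of the job that finished last."""
    last_job = max(jobs, key=lambda job: job[1])
    return last_job[2]


# Same two jobs, but they finish in a different order on each run.
run_1 = [("Job A", 12.0, {"report": "A"}), ("Job B", 9.5, {"report": "B"})]
run_2 = [("Job A", 8.0, {"report": "A"}), ("Job B", 11.0, {"report": "B"})]

print(released_message(run_1))  # {'report': 'A'}
print(released_message(run_2))  # {'report': 'B'}
```

Designing the workspace so that only one feature reaches the Automation Writer removes this dependence on finish order.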
Can I use looping with a Split-Merge Block?
Yes, but be careful to avoid an infinite loop.
Additional Resources
Getting Started with the Split-Merge Block
Job Orchestration with Automations