FME Flow on Kubernetes: Utilizing Engine Assignment and Job Routing

Introduction

Engines can now be assigned to queues based on either engine properties or engine names. Engine properties, by default, include the OS, the FME Flow build, the license type, etc. Users are also able to add their own engine properties.

Jobs are routed to queues based on user-defined rules. The traditional repository-based routing method is still available, but there is also the option to route based on workspace statistics. Workspace statistics track runtime information such as peak memory usage, % CPU utilization, and more. See our blog for an introduction to Analyzing Job Statistics in FME Flow.

Property-based engine assignment is particularly useful for containerized deployments of FME Flow, such as Docker and Kubernetes. Engine names are more likely to change in these deployments, so assigning engines to queues by name is fragile, whereas engine properties stay the same when a pod is recreated.

We recommend that the engine properties parameter be defined in values.yaml, and that queues be managed through the FME Flow Web UI, as described in this article. For information on all of the parameters available, please refer to our GitHub.

This article walks through an example that combines FME Flow Engine Management with Kubernetes Node Selectors to make the best use of the cluster infrastructure, enabling FME Flow jobs to be processed faster.
Engine deployment groups will be assigned to nodes of different types (general-purpose, compute-optimized, and memory-optimized). Then, FME Flow job routing rules will be configured to ensure that workspaces are processed on the correct engine based on their statistics.

This article assumes existing knowledge of how to deploy and manage an FME Flow Kubernetes Cluster.

Configure Nodes

Kubernetes allows pods to be constrained to run on a particular set of node(s).
You may want to do this to take advantage of different node instance types and sizes. For example, this would allow you to run FME Flow jobs that are more memory-intensive on a node type that is suited to those workloads.

The recommended approach to do this is to attach labels to a node and use node selectors when scheduling pods.

You can follow the Kubernetes documentation for details on how to do this: Attach a label to the node.

In this example, I have 3 nodes, labelled according to their VM type:

Screen Shot 2021-04-23 at 10.23.27 AM.png

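The label key is property, and the label value is generalPurpose, memoryOptimized, or computeOptimized. Label keys and values are case-sensitive, so they must match the nodeSelector entries in values.yaml exactly.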
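As a rough sketch of the labelling step, the Python snippet below builds the `kubectl label nodes` commands for three nodes. The node names are hypothetical placeholders (use the names reported by `kubectl get nodes`), and the lowercase key `property` is assumed so that it matches the nodeSelector key used in values.yaml later in this article.

```python
import shlex

# Hypothetical node names; replace with the names from `kubectl get nodes`.
NODE_LABELS = {
    "aks-general-0": "generalPurpose",
    "aks-memory-0": "memoryOptimized",
    "aks-compute-0": "computeOptimized",
}

# Assumed label key; it must match the nodeSelector key in values.yaml.
LABEL_KEY = "property"


def label_command(node: str, value: str, key: str = LABEL_KEY) -> list[str]:
    """Build the argument list for `kubectl label nodes <node> <key>=<value>`."""
    return ["kubectl", "label", "nodes", node, f"{key}={value}"]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for node, value in NODE_LABELS.items():
        print(shlex.join(label_command(node, value)))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; each printed line can be pasted into a shell that has access to the cluster.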

Here are the nodes as shown in the Azure Portal:

Screen Shot 2021-04-23 at 9.41.14 AM.png

There are three different sizes of nodes:

The Dv2 and DSv2-series feature powerful CPUs and optimal CPU-to-memory configurations, making them suitable for most production workloads.
The Eav4-series sizes are ideal for memory-intensive enterprise applications.
The Fsv2-series is compute-optimized, with a high CPU-to-memory ratio that suits CPU-intensive workloads.

Configure the FME Flow Deployment

Once the nodes are labelled, Kubernetes needs to know which node(s) to schedule engine pods onto. This is done using a nodeSelector in the values.yaml.

In order to set up Engine Assignment Rules in FME Flow, the engineProperties parameter needs to be set.
The container resources can also be configured for each engine group. The Kubernetes scheduler uses the resource requests to decide where to place a pod, and it will not place a pod on a node whose remaining capacity is smaller than those requests. For more information, see: Managing Resources for Containers.

Below is an example of 3 different engine deployment groups, designed to run on different nodes and process different jobs.

You can see how the engineProperties, nodeSelector, and resources are configured differently for each engine deployment group.

engines:
   # General-purpose engines, scheduled on nodes labelled property=generalPurpose
   - name: "standard-group-1"
     engines: 1
     type: "STANDARD"
     engineProperties: "generalPurpose"
     labels: {}
     affinity: {}
     nodeSelector:
       property: generalPurpose
     tolerations: []
     resources:
       requests:
         memory: 2Gi
         cpu: 500m
   # Memory-intensive engines: larger memory request, memory-optimized nodes
   - name: "standard-group-2"
     engines: 1
     type: "STANDARD"
     engineProperties: "memoryOptimized"
     labels: {}
     affinity: {}
     nodeSelector:
       property: memoryOptimized
     tolerations: []
     resources:
       requests:
         memory: 4Gi
         cpu: 500m
   # Compute-intensive engines: larger CPU request, compute-optimized nodes
   - name: "standard-group-3"
     engines: 1
     type: "STANDARD"
     engineProperties: "computeOptimized"
     labels: {}
     affinity: {}
     nodeSelector:
       property: computeOptimized
     tolerations: []
     resources:
       requests:
         memory: 1Gi
         cpu: 1000m

Applying this values.yaml file to the FME Flow deployment results in the engine pods being scheduled onto the correct nodes:

Screen Shot 2021-04-23 at 11.13.10 AM.png
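To reason about why each engine pod lands where it does, here is a deliberately simplified Python model of the two checks discussed in this section: the node must carry every label in the nodeSelector, and it must have enough unreserved CPU and memory for the requests. This is only a sketch; the real Kubernetes scheduler considers many more factors (taints, affinity, scoring), and the node capacities in the usage example are made-up numbers.

```python
from dataclasses import dataclass


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as "500m" or "2" to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity with a Mi or Gi suffix to bytes."""
    units = {"Mi": 2**20, "Gi": 2**30}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: treat the value as plain bytes


@dataclass
class Node:
    name: str
    labels: dict[str, str]
    free_cpu: float    # cores not yet requested by other pods
    free_memory: int   # bytes not yet requested by other pods


@dataclass
class EngineGroup:
    name: str
    node_selector: dict[str, str]
    cpu_request: str
    memory_request: str


def can_schedule(group: EngineGroup, node: Node) -> bool:
    """True if the node has every selector label and room for the requests."""
    labels_match = all(
        node.labels.get(key) == value for key, value in group.node_selector.items()
    )
    fits = (
        parse_cpu(group.cpu_request) <= node.free_cpu
        and parse_memory(group.memory_request) <= node.free_memory
    )
    return labels_match and fits


if __name__ == "__main__":
    # Hypothetical free capacities, for illustration only.
    nodes = [
        Node("general", {"property": "generalPurpose"}, 2.0, 7 * 2**30),
        Node("memory", {"property": "memoryOptimized"}, 2.0, 16 * 2**30),
        Node("compute", {"property": "computeOptimized"}, 2.0, 4 * 2**30),
    ]
    group = EngineGroup(
        "standard-group-2", {"property": "memoryOptimized"}, "500m", "4Gi"
    )
    print([node.name for node in nodes if can_schedule(group, node)])
```

If an engine pod stays in the Pending state after deployment, checking these same two conditions by hand (labels, then requests against free capacity) is usually a good first step.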

Configure Engine Assignment Rules

Once FME Flow has been deployed, you will see the newly defined engine properties on the engines page in the FME Flow Web UI:

Screen Shot 2021-04-22 at 7.56.09 PM.png

Next, create queues that correspond to the engine properties and to how you’d like to route your jobs. In this example, I’m using the Default queue for general-purpose FME Flow jobs, and have created 2 new queues for compute-intensive and memory-intensive workflows:

Screen Shot 2021-04-22 at 7.50.00 PM.png

Assign engines to the newly created queues on the Engine Assignment Rules tab.

Add a property that matches the engine properties defined in the values.yaml (you can refer back to the Engines tab to check this). Assign this property to a Queue:

Screen Shot 2021-04-22 at 7.53.26 PM.png

For this example, we have 3 Engine Assignment Rules:

Screen Shot 2021-04-23 at 10.46.58 AM.png
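Conceptually, each Engine Assignment Rule pairs an engine property with a queue. The short Python sketch below models that lookup. The property names come from values.yaml, but the pairing with the queue names (Default, Compute Intensive, Memory Intensive) is an assumption made for illustration, as are the extra property values in the usage example. It is not how FME Flow implements assignment internally.

```python
# Assumed rules: engine property -> queue name (illustrative pairing only).
ASSIGNMENT_RULES = {
    "generalPurpose": "Default",
    "computeOptimized": "Compute Intensive",
    "memoryOptimized": "Memory Intensive",
}


def queues_for_engine(engine_properties: set[str]) -> set[str]:
    """Return every queue whose rule property appears on the engine."""
    return {
        queue
        for prop, queue in ASSIGNMENT_RULES.items()
        if prop in engine_properties
    }


if __name__ == "__main__":
    # An engine carries default properties (OS, license type, ...) plus the
    # user-defined one from values.yaml; the default values here are hypothetical.
    engine = {"linux", "STANDARD", "memoryOptimized"}
    print(queues_for_engine(engine))
```

Because the lookup depends only on properties, a recreated pod with a new engine name still ends up in the same queue.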

Configure Job Routing Rules

At this point, every job run in FME Flow is sent to the Default queue, whose engines run on the general-purpose node.

We will configure Job Routing Rules based on workspace statistics, so that workspaces get processed on the most suitable node. Ideally, workspaces approaching 100% CPU utilization will be processed on compute-optimized nodes, and those with the highest peak memory usage will be processed on memory-optimized nodes.

Here we can see the metrics of the workspaces that have been run on the general-purpose node:

Screen Shot 2021-04-22 at 8.55.26 PM.png

Job Routing Rules can be created based on workspace repository or workspace statistics and will be evaluated top-down.

On this FME Flow deployment, three rules have been created:

Screen Shot 2021-04-23 at 10.51.46 AM.png

If a workspace's statistics show CPU utilization above 85%, it will be routed to the Compute Intensive queue.
If a workspace's statistics report peak memory usage of 100 MB or more, it will be routed to the Memory Intensive queue. Workspaces from the RasterProcessing repository will also be routed to this queue, on the assumption that they are likely to require more memory to process.

Any workspace that does not match one of these rules will be routed to the Default queue.
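The top-down evaluation of these rules can be sketched in Python as follows. The thresholds and queue names are taken from the rules described above; the rule order, the assumption that the first matching rule decides the queue, and the repository and workspace names in the usage example are illustrative guesses rather than documented FME Flow behaviour.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Job:
    repository: str
    workspace: str
    cpu_percent: Optional[float] = None      # None until statistics exist
    peak_memory_mb: Optional[float] = None   # None until statistics exist


# Ordered (predicate, queue) pairs, checked from top to bottom.
ROUTING_RULES: list[tuple[Callable[[Job], bool], str]] = [
    (lambda j: j.cpu_percent is not None and j.cpu_percent > 85,
     "Compute Intensive"),
    (lambda j: j.peak_memory_mb is not None and j.peak_memory_mb >= 100,
     "Memory Intensive"),
    (lambda j: j.repository == "RasterProcessing",
     "Memory Intensive"),
]


def route(job: Job, default_queue: str = "Default") -> str:
    """Return the queue of the first matching rule, else the default queue."""
    for matches, queue in ROUTING_RULES:
        if matches(job):
            return queue
    return default_queue


if __name__ == "__main__":
    # Hypothetical jobs, for illustration only.
    jobs = [
        Job("Samples", "cpu_heavy.fmw", cpu_percent=92.0, peak_memory_mb=40.0),
        Job("Samples", "mem_heavy.fmw", cpu_percent=30.0, peak_memory_mb=250.0),
        Job("RasterProcessing", "new_raster.fmw"),
        Job("Samples", "light.fmw", cpu_percent=30.0, peak_memory_mb=40.0),
    ]
    for job in jobs:
        print(job.workspace, "->", route(job))
```

Note that under this ordering a workspace that exceeds both thresholds goes to the Compute Intensive queue, so the order of the rules matters when a workspace could match more than one.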

Running the workspaces again shows that they have been processed by the correct queues:

Screen Shot 2021-04-22 at 8.57.53 PM.png

The workspaces highlighted in red were routed to a different queue than before. Each of these workspaces was processed faster on the optimized node.

Additional Resources:

Documentation: Defining FME Engines and Queue Control Properties
