Why would you use a Kubernetes deployment for FME Flow?

Introduction

FME Flow (formerly FME Server) has been available on Kubernetes since 2019. As awareness of container and serverless compute services has grown, more customers are using Kubernetes for their FME Flow deployments or asking about it.
Kubernetes is still new to many people, so this article explains the benefits of using it for FME Flow.

What is Kubernetes?

Kubernetes is an open-source container orchestration tool that automates the deployment, scaling, and management of containerized applications. If you’re not familiar with containers, check out our blog article from when we introduced FME Flow for Docker.

Kubernetes is a complex technology. If you are new to it or are just getting started, we recommend familiarizing yourself with the Kubernetes Concepts before trying to deploy FME Flow on it.

Kubernetes solves several issues that arise with manual container management and deployment. Like Docker Compose, Kubernetes takes a declarative approach, meaning that you describe how FME Flow should be installed and configured in a YAML file. We have already created these YAML files so that Kubernetes will deploy FME Flow with the correct containers, services, networking, resources, etc.

What is Helm?

Helm is a package manager for Kubernetes applications. It is the tool we use to install, upgrade, and manage FME Flow on Kubernetes. The YAML files mentioned above are grouped into a Helm chart. Our Helm Charts are available on GitHub.

Any variables or parameters an FME Flow administrator wishes to change are defined in a values.yaml file and applied using Helm. This file contains the chart's configuration values and describes the desired state of FME Flow. The list of supported configurable parameters and their default values can be found on our GitHub.

The FME Flow administrator can save their values.yaml file and reuse it, which makes deploying FME Flow on Kubernetes (with Helm) quicker, easier, and repeatable.
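To illustrate how a saved values file relates to the chart defaults, here is a minimal Python sketch of the merge behavior: nested maps are merged key by key, while scalars and lists in the override replace the default. The key names (`deployment`, `hostname`, `coreReplicas`, `engines`, `replicas`) are hypothetical and chosen only for illustration; the real parameter names are the ones listed on our GitHub.

```python
from copy import deepcopy


def merge_values(defaults: dict, overrides: dict) -> dict:
    """Return a copy of defaults with overrides applied on top."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested maps are merged recursively, key by key.
            merged[key] = merge_values(merged[key], value)
        else:
            # Scalars and lists replace the default outright.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical keys, for illustration only.
chart_defaults = {
    "deployment": {"hostname": "localhost", "coreReplicas": 1},
    "engines": {"replicas": 2},
}
admin_overrides = {
    "deployment": {"hostname": "fmeflow.example.com"},
    "engines": {"replicas": 4},
}

effective = merge_values(chart_defaults, admin_overrides)
print(effective)
```

Only the keys the administrator overrides change; everything else keeps the chart default, which is why a small saved values file is enough to reproduce a deployment.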

What are the benefits of using Kubernetes?

The Kubernetes documentation has a good overview of what Kubernetes is and what it can do. Some of those features can be beneficial to an FME Flow installation:

Service discovery and load balancing

Kubernetes can load balance traffic across a deployment so that it stays stable. If you have multiple FME Flow core pods running (each of which contains the core and web application server containers), network traffic is distributed between those pods so that no single pod gets overwhelmed. Kubernetes only sends traffic to pods that are running and ready.
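The idea of only routing to running and ready pods can be sketched in a few lines of Python. This is a simplified model, not how Kubernetes is implemented: the round-robin strategy is an assumption (the actual distribution depends on how the cluster's networking is configured), and the `Pod` class and pod names are hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Pod:
    name: str
    running: bool
    ready: bool


def route_requests(pods, request_count):
    """Assign each request to the next eligible pod in turn."""
    # Pods that are not both running and ready never receive traffic.
    eligible = [p for p in pods if p.running and p.ready]
    if not eligible:
        raise RuntimeError("no ready pods available")
    rotation = cycle(eligible)
    return [next(rotation).name for _ in range(request_count)]


core_pods = [
    Pod("core-0", running=True, ready=True),
    Pod("core-1", running=True, ready=False),  # still starting up
    Pod("core-2", running=True, ready=True),
]

assignments = route_requests(core_pods, 4)
print(assignments)
```

Here `core-1` is skipped until it reports ready, so requests alternate between `core-0` and `core-2`.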

Service discovery makes scaled FME Flow deployments easier. In a traditional deployment, adding distributed components (most likely engines) may require extra networking or firewall configuration to ensure that all FME Flow components can still communicate. With Kubernetes, when you add more nodes (hosts) or scale pods (FME Flow engines or cores), it takes care of the networking and communication between pods for you. This benefits FME Flow users who distribute or scale their engines.

Storage Orchestration

Kubernetes allows you to automatically mount a storage system of your choice, such as local storage, public cloud providers, and more. This is where the FME Flow System Share will reside. You need to set up your storage provider before installing FME Flow.

Automated rollouts and rollbacks

In Kubernetes, you describe the desired state for your deployed containers, and Kubernetes changes the actual state to the desired state at a controlled rate. For example, you can have Kubernetes create new containers for your deployment, remove existing containers, and adopt all their resources to the new containers. This can be useful for managing FME Flow engines (scaling, queues, etc.) and for minor version upgrades (bug fixes), but it should not be used for major FME Flow upgrades. Going from 2024.0 to 2024.1 is considered a major upgrade and is therefore not supported by helm upgrade. Instead, we recommend launching the new version of FME Flow in a new namespace and swapping the ingress hostname once the new version is ready.

The helm upgrade command does work for minor FME Flow upgrades, for example going from 2024.0.1 to 2024.0.2.
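As a rough pre-flight check, the distinction between major and minor upgrades could be sketched as below. The rule used here is inferred only from the two examples in this article (a change in the first two version components is major, a change in the third alone is minor), and `parse_version` and `upgrade_kind` are hypothetical helper names, not part of any FME Flow or Helm tooling.

```python
def parse_version(version: str) -> tuple:
    """Split a version like '2024.0.1' into a tuple of integers."""
    parts = [int(p) for p in version.split(".")]
    # Pad so that "2024.0" is treated as 2024.0.0.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def upgrade_kind(current: str, target: str) -> str:
    """Classify an upgrade as 'major', 'minor', or 'not an upgrade'."""
    cur, tgt = parse_version(current), parse_version(target)
    if tgt <= cur:
        return "not an upgrade"
    if tgt[:2] != cur[:2]:
        # Release changed: deploy into a new namespace instead of upgrading in place.
        return "major"
    # Only the patch component changed: an in-place upgrade is appropriate.
    return "minor"


print(upgrade_kind("2024.0", "2024.1"))
print(upgrade_kind("2024.0.1", "2024.0.2"))
```

A check like this could sit in a deployment script so that an in-place upgrade is only attempted when the result is "minor".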

Automatic bin packing

You provide Kubernetes with a cluster of nodes that it can use to run containerized tasks, and you tell Kubernetes how much CPU and memory (RAM) each container needs. Kubernetes then fits containers onto your nodes to make the best use of your resources. For FME Flow, we recommend setting these values on your engines so that Kubernetes only schedules engine pods on a node with enough free resources. This ensures that your containers have sufficient resources to complete their work.

You will need to make sure that you have enough nodes, or large enough nodes, to run your FME Flow deployment. Otherwise, pods will be stuck in a pending state until Kubernetes can successfully schedule them.
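The effect of resource requests on scheduling can be illustrated with a simple first-fit placement in Python. This is a deliberately simplified model: the real Kubernetes scheduler filters and scores nodes rather than taking the first one that fits, and the node sizes, engine requests, and class names below are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu: float      # CPU cores still unallocated
    free_memory: float   # GiB of memory still unallocated
    pods: list = field(default_factory=list)


@dataclass
class PodRequest:
    name: str
    cpu: float
    memory: float


def schedule(pods, nodes):
    """Place each pod on the first node with room; return the names left pending."""
    pending = []
    for pod in pods:
        for node in nodes:
            if node.free_cpu >= pod.cpu and node.free_memory >= pod.memory:
                # Reserve the requested resources on this node.
                node.free_cpu -= pod.cpu
                node.free_memory -= pod.memory
                node.pods.append(pod.name)
                break
        else:
            # No node had enough free CPU and memory.
            pending.append(pod.name)
    return pending


nodes = [Node("node-a", 4.0, 16.0), Node("node-b", 2.0, 8.0)]
engines = [PodRequest(f"engine-{i}", cpu=2.0, memory=8.0) for i in range(4)]

pending = schedule(engines, nodes)
print({n.name: n.pods for n in nodes}, "pending:", pending)
```

With these numbers, three engines fit and the fourth stays pending until a node with 2 CPU cores and 8 GiB free is added.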

Self-healing

Kubernetes restarts containers that fail, replaces containers, kills containers that don’t respond to your user-defined health check, and doesn’t advertise them to clients until they’re ready to serve. In the FME Flow Helm Chart, we have already defined the liveness and readiness probes for the pods.
This makes FME Flow running on Kubernetes fault-tolerant, as any pods or containers that get into a bad state will get restarted or replaced. If an engine pod went down, Kubernetes wouldn’t advertise that pod as available to the core, so you don’t have to worry about jobs running on unhealthy containers. Failed jobs in FME Flow are re-queued and will get processed on the next available engine.
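The self-healing behavior follows a reconcile pattern: compare what is running with what was requested, then replace whatever is unhealthy. The Python sketch below is a simplified model of that pattern only. In a real cluster, failed containers are usually restarted in place and pods are replaced by their controllers; the `EnginePod` class and the pod names are hypothetical.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class EnginePod:
    name: str
    healthy: bool = True


def reconcile(pods, desired_replicas, name_counter):
    """Drop unhealthy pods and add replacements until the desired count is met."""
    survivors = [p for p in pods if p.healthy]
    while len(survivors) < desired_replicas:
        # Create a replacement for each missing or failed pod.
        survivors.append(EnginePod(f"engine-{next(name_counter)}"))
    # Trim any excess if the desired count was lowered.
    return survivors[:desired_replicas]


names = count(3)  # next unused engine index
running = [
    EnginePod("engine-0"),
    EnginePod("engine-1", healthy=False),  # failed its health check
    EnginePod("engine-2"),
]

running = reconcile(running, desired_replicas=3, name_counter=names)
print([p.name for p in running])
```

The failed `engine-1` is dropped and a fresh pod takes its place, so the deployment returns to three healthy engines without manual intervention.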

If Kubernetes is running as a service on a public cloud provider (for example, AWS EKS), nodes in a Kubernetes cluster are provided through auto-scaling groups, allowing you to scale nodes easily. This also has the benefit that if a node goes down or is accidentally removed, the auto-scaling group starts a new one to replace it. Kubernetes then schedules any missing pods onto the replacement node as soon as it is available, minimizing downtime for the application.
This makes Kubernetes a good solution if you’re looking to deploy a large-scale, highly available FME Flow.

Who should be using Kubernetes?

Kubernetes has a steeper learning curve than Docker (and Docker Compose), so we do not recommend this deployment to users without experience managing Kubernetes deployments.
If you want to start with Kubernetes, we recommend learning about the technology before deploying FME Flow on it. Many resources and training courses are available online.

FME Flow containers are built on Ubuntu images, so this would not be a good deployment option for someone who needs Windows-based format support (e.g., Esri).

This deployment type would be best suited to organizations with in-house expertise in Kubernetes that want to take advantage of the benefits.

Where can you deploy Kubernetes?

One of Kubernetes's benefits is that it can be used anywhere. Many Cloud Providers, such as AWS, Azure, and Google Cloud Platform, support Kubernetes; however, there may be differences in deployment between them. One example of this would be setting up the volumes for FME Flow to use. Our documentation has instructions for the major Cloud Service Providers.

It is also possible to deploy Kubernetes on-premises (or locally for testing), and we have had success using Minikube and kind.

FME Flow on Kubernetes has no special licensing requirements, as licensing is done through the FME Flow web UI. The easily scalable nature of Kubernetes makes it well suited for Dynamic Engines.

FME Flow pricing can be found here.
Documentation on licensing can be found here.
