Introduction
The FME Flow REST API V3 /healthcheck request returns the operational status of FME Flow's critical components. A request can be made to check for either “Liveness” or “Readiness”.
This article provides details on which component checks are performed for each of these requests and tips for investigating failures.
Why Use Health Checks?
A health check request can be used by any person, service, or application that needs to verify that FME Flow is available and operational. Some examples include:
- An FME Flow Administrator, for automated monitoring and notifications.
- A load balancer, to determine where jobs should be routed.
- A webhook service, before sending event data.
Liveness
FME Flow is considered live when all of its critical components are running and responsive.
A liveness check is performed by default when the “ready” parameter is not set, or when it is set to false. For example:
http://<hostname>/fmerest/v3/healthcheck
or
http://<hostname>/fmerest/v3/healthcheck?textResponse=false&ready=false
A liveness check verifies that the following components are active and responsive:
Component | Process | Log File |
---|---|---|
Configuration | FMEConfiguration.exe | core > fmeconfiguration.log |
Resources | FMEMountPoint.exe | core > fmesharedresource.log |
Connections | FMEConnection.exe | core > fmeconnection.log |
Scheduler | FMEScheduler.exe | core > fmescheduler.log |
Notifier | FMENotifier.exe | core > subscribers > fmesubscribers.log |
Relayer | FMERelayer.exe | core > publishers > fmepublishers.log |
Cleanup | FMECleanup.exe | service > fmeprocessmonitorcleanup.log |
Tomcat | FMEFlow_ApplicationServer.exe | tomcat > catalina.log |
If all listed components are responsive, the liveness check will return a 200 status code and message:
OK
If one or more of the listed components are not responsive, the liveness check will return a 500 status code and message:
An FME Flow component is unavailable.
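The liveness request above can also be scripted for automated monitoring. The following is a minimal sketch in Python using only the standard library. The function names `healthcheck_url` and `is_healthy` are illustrative and not part of FME Flow, and the sketch assumes the endpoint is reachable over plain HTTP from the machine running the script.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode


def healthcheck_url(hostname: str, ready: bool = False) -> str:
    """Build a V3 health check URL; ready=False requests a liveness check."""
    query = urlencode({"textResponse": "false", "ready": str(ready).lower()})
    return f"http://{hostname}/fmerest/v3/healthcheck?{query}"


def is_healthy(hostname: str, ready: bool = False, timeout: float = 5.0) -> bool:
    """Return True on a 200 response, False on any other outcome."""
    url = healthcheck_url(hostname, ready)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # urllib raises HTTPError for 4xx/5xx responses, including the 500
        # returned when a component is unavailable.
        return False
    except OSError:
        # Covers URLError, refused connections, and timeouts.
        return False
```

Calling `is_healthy("<hostname>")` performs a liveness check; a False result only signals that something is wrong, not which component is affected.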
Readiness
FME Flow is considered ready when all of its critical components are running and responsive and the system is accepting jobs.
A readiness check is performed when the “ready” parameter is set to true. For example:
http://<hostname>/fmerest/v3/healthcheck?textResponse=false&ready=true
A readiness check verifies that the following components are active and responsive:
Component | Process | Log File |
---|---|---|
Configuration | FMEConfiguration.exe | core > fmeconfiguration.log |
Resources | FMEMountPoint.exe | core > fmesharedresource.log |
Connections | FMEConnection.exe | core > fmeconnection.log |
Scheduler | FMEScheduler.exe | core > fmescheduler.log |
Notifier | FMENotifier.exe | core > subscribers > fmesubscribers.log |
Relayer | FMERelayer.exe | core > publishers > fmepublishers.log |
Cleanup | FMECleanup.exe | service > fmeprocessmonitorcleanup.log |
Tomcat | FMEFlow_ApplicationServer.exe | tomcat > catalina.log |
Queue | memurai.exe or redis-server.exe * | queue > localhost_queue.log |
Post-install scripts have successfully executed | N/A | installation |
* Note: memurai.exe is used by FME Flow 2023+ (win64). redis-server.exe is used by all other FME Flow versions.
If all listed components are responsive and the queue is active, a successful readiness check will return a 200 status code and message:
OK
If one or more of the listed components are not responsive and/or the queue is not active, a readiness check will return a 500 status code and message:
FME Flow is Not Ready.
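A service that sends work to FME Flow (for example a webhook sender) may want to wait until a readiness check passes before proceeding. The sketch below shows one possible retry loop. `wait_until_ready` is a hypothetical helper, and `check` stands for any function that returns True when the ready=true request returns a 200 status code. The attempt count and delay are arbitrary defaults, not recommended values.

```python
import time
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool],
    attempts: int = 10,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() up to `attempts` times, pausing between failed tries.

    Returns True as soon as a check passes, or False if every attempt fails.
    """
    for attempt in range(attempts):
        if check():
            return True
        # Do not pause after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

The `sleep` parameter exists only so the loop can be exercised without real delays. In normal use it can be left at its default.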
How to Investigate a Health Check Failure (REST API V3)
In the FME Flow REST API V3, health check requests are only a pass or fail test to alert us to a problem. When a failed status is returned, the response does not tell us which component(s) are causing the failure. When we receive a failure message, we must do further investigation to identify which component is inactive or unresponsive.
Log Files
Start a log file investigation by looking for ERROR or FATAL messages in the fmeserver.log and fmeprocessmonitorcore.log (core folder). Proceed to check each log file associated with each listed component process. Note that log folders are divided into current and old folders. If a process is crashing, the error messages may be found in the old folder.
The FME Flow Debugging Toolbox: Log Files contains instructions for using FME Flow log files as a troubleshooting tool.
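The search for ERROR and FATAL messages can be partly automated. The sketch below walks a log directory recursively, so it covers both the current and old folders. It assumes the logs are plain-text files ending in .log and that the severity words appear literally in each message line. `find_problem_lines` and the `log_root` argument are illustrative names, and the location of the log directory depends on the installation.

```python
from pathlib import Path
from typing import Iterator, Tuple

# Severity words named in this article; matching is a plain substring test.
SEVERITIES = ("ERROR", "FATAL")


def find_problem_lines(log_root: Path) -> Iterator[Tuple[Path, int, str]]:
    """Yield (file, line number, text) for each line containing a severity word."""
    for log_file in sorted(log_root.rglob("*.log")):
        # errors="replace" keeps the scan going if a log has odd bytes.
        with log_file.open(encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if any(word in line for word in SEVERITIES):
                    yield log_file, line_number, line.rstrip("\n")
```

Because the match is a simple substring test, it may also flag lines that merely mention the word ERROR, so treat the output as a starting point for reading the logs rather than a diagnosis.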
Processes
Ensure that each FME Flow component process is running.
Instructions for investigating FME Flow Processes can be found in the FME Flow Debugging Toolbox: System Tools.
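On a Windows host, one way to compare the running processes against the component tables in this article is to parse the output of `tasklist /FO CSV /NH`. The sketch below is only an illustration: `running_from_tasklist` and `missing_processes` are hypothetical helpers, the process names are copied from the tables above, and a distributed installation may run some of these processes on other machines.

```python
import csv
import io
from typing import Iterable, List, Set

# Process names copied from the component tables in this article.
EXPECTED_PROCESSES = [
    "FMEConfiguration.exe",
    "FMEMountPoint.exe",
    "FMEConnection.exe",
    "FMEScheduler.exe",
    "FMENotifier.exe",
    "FMERelayer.exe",
    "FMECleanup.exe",
    "FMEFlow_ApplicationServer.exe",
]
# Either queue process satisfies the Queue component.
QUEUE_PROCESSES = {"memurai.exe", "redis-server.exe"}


def running_from_tasklist(csv_text: str) -> Set[str]:
    """Parse `tasklist /FO CSV /NH` output into a set of lowercase image names."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0].lower() for row in reader if row}


def missing_processes(running: Iterable[str]) -> List[str]:
    """Return the expected process names that are absent from `running`."""
    running_lower = {name.lower() for name in running}
    missing = [n for n in EXPECTED_PROCESSES if n.lower() not in running_lower]
    if not any(name in running_lower for name in QUEUE_PROCESSES):
        missing.append("memurai.exe or redis-server.exe")
    return missing
```

The CSV text could be captured with `subprocess.run(["tasklist", "/FO", "CSV", "/NH"], capture_output=True, text=True).stdout` and passed to `running_from_tasklist`.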
How to Investigate a Health Check Failure (REST API V4)
NOTE: FME Flow REST API V4 is still in technical preview. We do not recommend using V4 in production workflows until it is officially released.
In the FME Flow REST API V4, health check requests can return more detailed information by using the includeDetails parameter. This is useful after a general health check fails.
Health check request in V4:
http://<hostname>/fmeapiv4/healthcheck/liveness
Health check request in V4 with the includeDetails parameter:
https://<hostname>/fmeapiv4/healthcheck/liveness?includeDetails=true
The response will include a list of all FME Flow components and their respective statuses:
{
  "status": "ok",
  "message": "FME Flow is healthy.",
  "components": [
    { "component": "resources", "status": "up" },
    { "component": "relayer", "status": "up" },
    { "component": "notifier", "status": "up" },
    { "component": "cleanup", "status": "up" },
    { "component": "configuration", "status": "up" },
    { "component": "scheduler", "status": "up" },
    { "component": "core", "status": "up" },
    { "component": "connections", "status": "up" }
  ]
}
Any components listed as "down" should be further investigated.
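A detailed response like the one above can be filtered programmatically. The following sketch parses the JSON body and returns the names of components that are not reported as "up". `down_components` is a hypothetical helper, and the field names are taken from the sample response in this article. Because V4 is in technical preview, the response shape may change.

```python
import json
from typing import List


def down_components(response_body: str) -> List[str]:
    """Return the names of components whose status is not "up"."""
    payload = json.loads(response_body)
    # Assumption: anything other than "up" needs investigation. The article
    # itself only names "down" as the failing value.
    return [
        item["component"]
        for item in payload.get("components", [])
        if item.get("status") != "up"
    ]
```

An empty list means every listed component reported "up". It is also returned when the response has no "components" field, for example if includeDetails was not set.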