Full Guide: FME Flow Troubleshooting Guide
Are you unable to start your FME Flow (formerly FME Server) Engines? Are you seeing Engines going missing from the Web UI, or even experiencing an issue with the engine after a certain job is run?
Please read below for some common troubleshooting tips, questions and resources.
Content Overview
- Initial Troubleshooting
- Common Issues
- “FME Flow license does not allow more than maximum of 1 FME Engine(s)”
- “My Engine host is not listed in the Web UI so I cannot change the engine count”
- “Engines are missing from the Web UI after hitting a Python Interpreter mismatch error”
- “FME Flow Jobs are not processing!”
- "My Engines are missing and in the fmeserver.log I see ‘Failed to connect to Job Queue’"
- “FME Flow Engine continuously restarts with ERROR 'Missing or expired workspace chaining context’”
- “My FME Flow Web UI has become unresponsive after navigating to the Engines page”
- “After an uninstall/reinstall of the Engine Service there are duplicate hosts in the FME Flow Web UI”
- "My Engines went missing and the FME Flow Engine service disappeared."
- "After killing the FMEEngine.exe process in Task Manager, my distributed engines are still disconnecting intermittently."
- "My engines are crashing and the job logs end abruptly with a message saying a .dll could not be loaded"
- Are you still experiencing issues?
- Have ideas on how to improve this?
Initial Troubleshooting
FME Flow Engine issues vary, and so do the troubleshooting steps you should take. In all cases, start by reviewing the log files:
<FMEFlowSystemShare>\resources\logs\core\current\fmeserver.log
<FMEFlowSystemShare>\resources\logs\engine\current\*.log
<FMEFlowSystemShare>\resources\logs\engine\current\jobs\job_xx.log
A common error found in the <FMEFlowSystemShare>\resources\logs\engine\current\*.log file is:
'FME Engine failed to register with FME Flow '<your FME Flow host>' on port 7070'
If you are receiving this error, the troubleshooting within this section will typically resolve it.
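As a rough illustration of the log review described above, the following Python sketch searches every .log file under a folder for the error messages quoted in this article. The function name scan_logs and the pattern list are illustrative assumptions, not part of FME Flow; adjust both to suit your environment.

```python
from pathlib import Path

# Error fragments quoted in this article; extend the list as needed.
KNOWN_ERRORS = [
    "failed to register with FME Flow",
    "Failed to connect to Job Queue",
    "Missing or expired workspace chaining context",
    "could not be loaded",
]


def scan_logs(log_dir, patterns=KNOWN_ERRORS):
    """Return (file name, line number, text) for each log line containing a pattern."""
    lowered = [p.lower() for p in patterns]
    hits = []
    # rglob also searches subfolders such as core\current and engine\current.
    for path in sorted(Path(log_dir).rglob("*.log")):
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(p in line.lower() for p in lowered):
                    hits.append((path.name, number, line.strip()))
    return hits
```

To use it, call scan_logs with the path to your <FMEFlowSystemShare>\resources\logs folder and print the returned tuples. The match is a plain case-insensitive substring search, so it will not catch messages that are worded differently in your version.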
If this is a new installation and you are unable to get your Engines up, start by checking:
- Is FME Flow licensed?
- Once FME Flow is installed you must license it before Engines are available [Learn More].
- Is the FME Flow Engines Service running?
- If this is not an express install you should make sure this service is running under a service account with read/write permissions to both the install directory and system share.
- Are the correct ports open?
- FME Flow Engines establish communication with the FME Flow Core over port 7070. Once this connection is established, another random port is opened for dedicated communication. Make sure port 7070 is open and, if these are distributed engines, consider defining FME_SERVER_PORT_POOL in fmeFlowConfig.txt. The range should cover the ports the engines can be reassigned to, and a matching firewall exception should be created for it.
- Can the engines communicate with the System Database?
- If these are distributed engines, are the System Database Connection details specified in the fmeDatabaseConfig.txt correct e.g. DB_JDBC_URL, DB_USERNAME and DB_PASSWORD?
- Note: In older versions of FME Flow this information may be in the fmeCommonConfig.txt and/or fmeFlowWebApplicationConfig.txt.
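The port check above can be sketched in Python as a simple TCP connection test, run from the Engine host against the Core host. The helper name can_connect is an illustrative assumption. A successful connection only shows that the port is reachable, not that engine registration will succeed.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, can_connect("<your FME Flow host>", 7070) returning False from a distributed Engine host suggests a firewall or routing problem between that host and the Core.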
If your Engines were successfully connected but are now missing from the Web UI, start by checking:
- Is the FME Flow Engines Service running?
- If this is not an express install you should make sure this service is running under a service account with read/write permissions to both the install directory and system share.
- Are the correct ports open?
- FME Flow Engines establish communication with the FME Flow Core over port 7070. Once this connection is established, another random port is opened for dedicated communication. Make sure port 7070 is open and, if these are distributed engines, consider defining FME_SERVER_PORT_POOL in fmeFlowConfig.txt. The range should cover the ports the engines can be reassigned to, and a matching firewall exception should be created for it.
- Can the engines communicate with the System Database?
- If these are distributed engines, are the System Database Connection details specified in the fmeDatabaseConfig.txt correct e.g. DB_JDBC_URL, DB_USERNAME and DB_PASSWORD?
- Note: In older versions of FME Flow this information may be in the fmeCommonConfig.txt and/or fmeFlowWebApplicationConfig.txt.
- On the Engine host, open Windows Task Manager. Under the Details tab, do you see any FMEEngine.exe processes running? If yes, do the engines reappear in the Web UI once you kill these processes?
- Are distributed engines in the same time zone as the core?
- What was the last job to run on the engine?
- Did the job contain python, or use any third-party libraries?
- Make sure that these libraries have been uploaded to the Engine as a resource [Learn More].
- Is there enough memory on the engine host?
- Is the Engine crashing during the job run?
- Go to <FMEFlowSystemShare>\resources\logs\engine\current\jobs and check if there are multiple logs for the job ID. If yes, this indicates your job is being submitted through Job Recovery.
- In fmeFlowConfig.txt update the MAX_TRANSACTION_RESULT_SUCCESSES and MAX_TRANSACTION_RESULT_FAILURES to 1 and restart FME Flow. After making this change does the problem still persist?
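The check for multiple logs per job ID can be sketched as follows. This is a minimal example that assumes job log file names begin with job_<id> and that any repeat logs for the same job share that prefix with an extra suffix; the exact naming in your version may differ, so treat the pattern as a placeholder. The function name is also illustrative.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: job log names start with "job_<id>", optionally followed by a suffix.
JOB_LOG_NAME = re.compile(r"^job_(\d+)")


def jobs_with_multiple_logs(jobs_dir):
    """Map job ID -> sorted log file names, only for IDs that have more than one log."""
    by_id = defaultdict(list)
    # rglob also searches any subfolders of the jobs directory.
    for path in Path(jobs_dir).rglob("*.log"):
        match = JOB_LOG_NAME.match(path.name)
        if match:
            by_id[int(match.group(1))].append(path.name)
    return {job_id: sorted(names) for job_id, names in by_id.items() if len(names) > 1}
```

Pass it the path to <FMEFlowSystemShare>\resources\logs\engine\current\jobs. Any job ID in the result is a candidate for having been resubmitted through Job Recovery and is worth opening by hand.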
Common Issues
“FME Flow license does not allow more than maximum of 1 FME Engine(s)”
After installation, FME Flow continually tries to launch two engines; however, I am only licensed for a single engine. This results in errors reported in the fmeserver.log and subsequently in System Events.
To resolve this error, go to Engine Management > Engines > Hosts in the FME Flow Web UI and reduce the engine count from two to one. If the count reverts to two, reduce the count to zero first, and then raise it back to one. Please see this article for more information.
“My Engine host is not listed in the Web UI so I cannot change the engine count”
This can happen when you have installed a distributed FME Engine Service on a separate machine. To resolve this issue make sure that the appropriate ports are open and the Engine host can connect to the System Database. Please see this article for more information.
“Engines are missing from the Web UI after hitting a Python Interpreter mismatch error”
Once a Python Interpreter has been loaded onto an FME Engine, it is kept until the Engine restarts. If a job that requires a different interpreter is submitted to that Engine, the Engine should go through a normal restart and the job will then be resubmitted. If the Engine does not restart after being shut down due to a mismatch, in fmeFlowConfig.txt (fmeServerConfig.txt in older versions) update MAX_TRANSACTION_RESULT_SUCCESSES and MAX_TRANSACTION_RESULT_FAILURES to 1 and restart FME Flow. This sets the Engines to restart after every job, so a Python interpreter mismatch can no longer occur.
“FME Flow Jobs are not processing!”
There are a number of reasons jobs may not run, not all related to the FME Flow Engines. Please see this article for more information.
"My Engines are missing and in the fmeserver.log I see ‘Failed to connect to Job Queue’."
This issue arises when the memurai.exe (FME Flow 2023+ win64) or redis-server.exe (all other FME Flow versions) process does not start running. Usually, this is because the queue folder is missing from <FMEFlowSystemShare>\Resources\Logs, or the account running the FME Flow Services does not have permission to write to this folder. Please see this article for more information.
“FME Flow Engine continuously restarts with ERROR 'Missing or expired workspace chaining context’”
Using the FMEFlowJobSubmitter, when a child engine is launched a job context is started that associates this engine process with the engine running the parent job. In some circumstances, the chaining context can be broken and the child engine will endlessly restart. Please see this article for more information.
“My FME Flow Web UI has become unresponsive after navigating to the Engines page”
When FME Flow has multiple engine hosts, if one of them is disconnected and you try to load the Engines page of the Web UI and then navigate away before the attempt to connect to this engine host has timed out, your FME Flow will become unresponsive. Please see this article for more information.
“After an uninstall/reinstall of the Engine Service there are duplicate hosts in the FME Flow Web UI”
This can occur if a different case was used when specifying the hostname in the installer. To resolve this issue, stop the Engine Service on the host, then remove one of the hosts from the list in the Web UI. Please see this article for more information.
"My Engines went missing and the FME Flow Engine service disappeared."
This can happen when antivirus software quarantines the FMEProcessMonitorEngines.exe process run by the startEngine.bat or stopEngine.bat files. If this occurs, make sure you add exceptions for "C:/Program Files/FMEFlow/Utilities/jre/bin/FMEProcessMonitorEngines.exe" and "Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\FME Flow Engines" (in the Registry Editor). Restore these files to their correct locations if needed. Please see this article for more information.
"After killing the FMEEngine.exe processes in Task Manager, my distributed engines are still disconnecting intermittently."
The FME Engine waits indefinitely for requests and never shuts down due to a lack of incoming requests. However, in some setups, a network monitor shuts down connections that remain inactive beyond a preset period of time. This can cause the Engine to enter a hung state. To prevent this from occurring, set the RECEIVE_TIMEOUT directive in the fmeEngineConfig.txt file to a non-zero value in milliseconds, <timeout_period_ms>. Setting this directive to a non-zero value forces the Engine to terminate itself after it has not received a translation request for the specified time, breaking it out of the hung state. We recommend testing first, and only setting this value if you find that your engines disconnect from the core and do not recover.
Example setting timeout to 300 seconds (5 minutes):
- Run Notepad as Administrator and open <InstallDir>\Server\fmeEngineConfig.txt
- Find the RECEIVE_TIMEOUT directive and set it to 300000
- Save the file and restart FME Flow to apply the change
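If you prefer to script the edit, the steps above can be sketched in Python. This is a hedged example, not a supported tool: it assumes each directive sits on its own line as NAME=value or NAME value, and the helper name set_directive is illustrative. It works on text only, so back up the real file before writing anything to it.

```python
import re


def set_directive(config_text, name, value):
    """Return config_text with the first uncommented `name` directive set to value.

    Assumes one directive per line, separated from its value by '=' or whitespace.
    Raises KeyError if no such line exists.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(name)}(?:[ \t]*=[ \t]*|[ \t]+)).*$",
        re.MULTILINE,
    )
    # A function replacement keeps digits in `value` from being read as group references.
    updated, count = pattern.subn(lambda m: f"{m.group(1)}{value}", config_text, count=1)
    if count == 0:
        raise KeyError(name)
    return updated


timeout_ms = 300 * 1000  # 300 seconds expressed in milliseconds

# Made-up fragment for illustration, not the real fmeEngineConfig.txt.
sample = "# sample fragment\nRECEIVE_TIMEOUT=0\n"
print(set_directive(sample, "RECEIVE_TIMEOUT", timeout_ms))
```

Lines that start with a comment character are left alone because the pattern requires the directive name at the start of the line. The same helper could be applied to other directives mentioned in this article, such as MAX_TRANSACTION_RESULT_SUCCESSES, under the same format assumption.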
"My engines are crashing and the job logs end abruptly with a message saying a .dll could not be loaded"
For example:
200 2023-09-24 19:49:20 | Library 'C:\Program Files\FMEFlow\Server\fme\plugins/mylibrary.dll' was found but could not be loaded. Ensure that all the dependent modules exist for this library
201 2023-09-24 19:49:21 | Module 'mylibrary' is unavailable for use with this FME edition
(end of log file)
Versions of Microsoft Windows Server prior to 2022 can only load a certain number of custom libraries (DLLs). Custom libraries that are loaded by jobs continue to accumulate on the engines until a restart. As suggested within this article, configuring FME Flow to restart the engines after every job will clear custom libraries from memory and help prevent this issue. In fmeFlowConfig.txt (fmeServerConfig.txt in older versions), update MAX_TRANSACTION_RESULT_SUCCESSES and MAX_TRANSACTION_RESULT_FAILURES to 1 and restart FME Flow.
If the problem persists, there may be a single workspace that is loading a large number of custom libraries at once. In addition to restarting the engine after every job, split the workspace up into several, so that the libraries are loaded and cleared across several jobs on the engines instead of a single job.
Are you still experiencing issues?
Please consider posting to the FME Community Q&A if you are still experiencing issues that are not addressed in this article. There are also different support channels available. When contacting support please share a description of the issue, details of your FME Flow architecture and the core and engine log folders located in Resources > Logs.
Have ideas on how to improve this?
You can add ideas or product suggestions to our Ideas Exchange.