Symptom
When running FME Server on a Linux-based system, including Docker and Kubernetes, you may notice an accumulation of core dump files generated by Java. These will be hs_err_pid.log files, core.pid files, or both.
Because these files are not visible in the FME Server web UI, you will most likely become aware of them after noticing that your FME Server System Share has grown significantly in size. Listing the contents of the Repositories folder will show them.
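One way to confirm the symptom from a shell is to search the Repositories folder for both file types. The following is a minimal sketch: the function name list_java_dumps is hypothetical, the file name patterns are assumptions based on the names described above, and the example path must be replaced with the location of your own System Share.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the path of every Java error log or core file
# found under the given directory (searched recursively).
list_java_dumps() {
  local dir="$1"
  # hs_err_pid*.log : JVM error logs; core.<digits> : core dumps (assumed naming)
  find "$dir" -type f \( -name 'hs_err_pid*.log' -o -name 'core.[0-9]*' \) -print
}

# Example call (the path is an assumption; use your own Repositories folder):
# list_java_dumps /data/fmeserverdata/repositories
```

Piping the output to `wc -l` gives a quick count of how many files have accumulated.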
Cause
Due to historical issues with engines not shutting down correctly, resulting in the loss of engines for some customers, FME Server uses a more aggressive shutdown procedure in order to terminate engines so that they will restart successfully and engine capacity is not reduced.
As a result, if Java has been loaded onto an engine, it will likely generate a core dump file when that engine is shut down. This happens when an engine is restarted because it has reached the maximum number of successful or failed translations, or approximately 90 seconds after the engine becomes idle. You may see references to forcible engine shutdowns in the fmeprocessmonitorengine.log file (this is expected).
Java interprets the FME Server engine shutdown command as a crash, leading to the creation of those files, even though the FME Server engine process is being intentionally killed.
JDBC formats used in workspaces rely on Java and are often associated with this issue.
Note: Please ensure that these files are being generated as a result of engine shutdown and that there isn’t a genuine crash. If any job log files stop without completing or there are shutdown warnings in the log file, you may be experiencing a genuine crash that needs to be investigated. If in doubt, please contact support.
Resolution
The resolution depends on which files are being generated in the Repositories folder.
The following resolutions may be intermediate steps or workarounds while we explore alternative ways to shut down FME Server engines that aren’t interpreted by Java as a crash.
If you are seeing hs_err_pid files in the Repositories folder
If you are seeing core.pid files in the Repositories folder
If you are seeing these files generated in a Docker installation of FME Server
If you are seeing these files generated in a Kubernetes installation of FME Server
If you are seeing hs_err_pid files in the Repositories folder:
hs_err_pid files are controlled by the JAVA_TOOL_OPTIONS environment variable. In FME Server 2020.2.3+ this variable has been set so that these files will be generated in the engine temp folder and are cleaned up very quickly. This means that these files will no longer accumulate in the FME Server System Share.
If you are on an older version of FME Server, you may be able to set this value yourself:
JAVA_TOOL_OPTIONS = -XX:ErrorFile=<FME_TEMP>/hs_err_pid%p.log
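On Linux, one possible way to apply this is to export the variable in the environment that launches the engine. This is only a sketch: FME_TEMP_DIR is a hypothetical variable standing in for the <FME_TEMP> placeholder, its default value here is an assumption, and where the engine's environment is actually configured depends on your installation.

```shell
# Hypothetical variable standing in for the <FME_TEMP> placeholder;
# the default path is an assumption, not a documented FME Server location.
FME_TEMP_DIR="${FME_TEMP_DIR:-/tmp/fmeengines}"

# %p is replaced by the JVM with the process ID when it writes the error file.
export JAVA_TOOL_OPTIONS="-XX:ErrorFile=${FME_TEMP_DIR}/hs_err_pid%p.log"

# Print the effective value so it can be checked before starting the engine.
printf '%s\n' "$JAVA_TOOL_OPTIONS"
```

The variable only affects Java processes started from an environment where it is set, so it needs to be in place before the engine starts.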
If you are seeing core.pid files in the Repositories folder:
core.pid files are controlled by ulimits. Setting the core file size ulimit to 0 suppresses the creation of these files.
If you run ulimit -a in your FME Server engine environment, the "core file size" entry in the output shows the current setting.
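To look at the core file size limit on its own rather than the full ulimit -a listing, something like the following can be run in the same environment as the engine. It is a minimal sketch that relies only on the shell's built-in ulimit.

```shell
# Print the soft and hard core file size limits for the current shell.
# 0 means core files are suppressed; "unlimited" means there is no cap.
echo "soft core limit: $(ulimit -S -c)"
echo "hard core limit: $(ulimit -H -c)"
```

The values reported are those of the shell that runs the commands, so they reflect the engine only if run in the engine's environment.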
To set this value you can look up how to set ulimits for your operating system. Bear in mind there are hard and soft ulimits, so ensure that you set the correct one.
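As an illustration of the soft and hard distinction, the shell built-in can lower both limits for the current shell and anything started from it. This is a sketch only. It does not persist across sessions, and how to make the setting permanent for the engine depends on your operating system and on how the engine is started.

```shell
# Lower the soft limit first: this is the limit that is actually enforced.
ulimit -S -c 0

# Then lower the hard limit, which is the ceiling for the soft limit.
# An unprivileged process cannot raise a hard limit again once it is lowered.
ulimit -H -c 0

# Confirm both values.
echo "soft=$(ulimit -S -c) hard=$(ulimit -H -c)"
```

Setting only the soft limit is enough to suppress core files, but a process can raise its own soft limit back up to the hard limit, which is why it matters which one you set.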
If you are seeing these files generated in a Docker installation of FME Server:
You can edit the fmeserverengine service to include the settings for ulimits and JAVA_TOOL_OPTIONS. You may not need to set both of these (see above to determine which settings you need).
You can set these in the Docker Compose file, under the fmeserverengine service:
fmeserverengine:
  image: 'safesoftware/fmeserver-engine:20252'
  volumes:
    - 'fmeserver:/data/fmeserverdata'
  restart: always
  depends_on:
    - fmeservercore
  environment:
    - EXTERNALPORT=${EXTERNALPORT:-443}
    - WEBPROTOCOL=${WEBPROTOCOL:-https}
  # Write hs_err_pid files to the engine temp folder instead of the System Share
  command: bash -c "export JAVA_TOOL_OPTIONS=\"-XX:ErrorFile='/tmp/fmeengines/$${HOSTNAME}/hs_err_pid_%p.log'\" && /fmeengine/start_engine.sh"
  # Suppress core.pid files by setting the core file size limit to 0
  ulimits:
    core:
      hard: 0
      soft: 0
  networks:
    - database
    - web
If you are seeing these files generated in a Kubernetes installation of FME Server:
Unfortunately, Kubernetes does not support these settings in the same way that Docker does. In 2020.2.3+ you should find that hs_err_pid files do not get generated, but core.pid files may still get created in the Repositories folder.
While we investigate alternative ways of shutting down FME Server engines (tracked internally as FMESERVER-16211), our recommendation is to create a regular cleanup process that deletes any core.pid files in the Repositories folder.
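A cleanup process of this kind could be as simple as a small script run on a schedule. The sketch below is an illustration, not an official tool: the function name clean_core_files is hypothetical, the core.<digits> file name pattern is an assumption, and the example path must be replaced with your own Repositories folder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: deletes core.<digits> files under the given directory
# (searched recursively) and prints the path of each file it removes.
clean_core_files() {
  local dir="$1"
  find "$dir" -type f -name 'core.[0-9]*' -print -delete
}

# Example call (the path is an assumption; use your own Repositories folder):
# clean_core_files /data/fmeserverdata/repositories
```

Before scheduling it, for example with cron, it is worth running the same find command without -delete to check that the pattern does not match any legitimate files in your repositories.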
If you are using JDBC format(s), you may consider swapping to ODBC instead.