FME Version
Introduction
When integrating with log management services, one of the first things you need to identify no matter which service you plan to stream logs to is which logs to stream. The FME Flow (formerly FME Server) documentation provides a full overview of the key FME Flow log files and this is a good place to start. The sections below go into a bit more depth on some of the key log files that you might want to watch.
Tomcat Log Files
There are two main Tomcat log files you need to be aware of the Catalina Log and the Access Log.
Catalina Log
Information in this log file is related to the health of the FME Flow web application server. All errors that Tomcat raises are logged here, and any unexpected errors that are not correctly handled by the servlet are logged here. The following status levels are used in the Catalina log file:
- INFO - Informational message, related to the server activity
- WARNING - Warning messages
- SEVERE - Captures exception and Error
Access Log
Tomcat’s access logs show every HTTP request made to FME Flow. For each request the HTTP request status, method, and total processing time are detailed.
Metrics
Datadog has an excellent article that walks you through the key metrics for monitoring Tomcat. I have pulled out a few metrics below that we have run to.
Description |
Log File |
Example Log Pattern |
The number of errors generated by server requests. Provides a high level warning of potential issues. |
tomcat\<server-name>\ catalina.*.log |
09-Mar-2020 08:16:27.988 SEVERE [ContainerBackgroundProcessor[StandardEngine[Catalina]]] org.apache.catalina.startup.HostConfig.deployWARs Error waiting for multi-thread deployment of WAR files to complete |
Out of memory error. See this when Tomcat runs out of memory. |
tomcat\<server-name>\ catalina.*.log |
org.apache.catalina.core.StandardWrapperValve invoke SEVERE: Servlet.service() for servlet [jsp] in context with path [/sample] threw exception [javax.servlet.ServletException: java.lang.OutOfMemoryError: Java heap space] with root cause java.lang.OutOfMemoryError: Java heap space |
Number of requests. If there is more traffic than usual it might mean someone is trying to exploit the server. |
tomcat\<server-name>\ localhost_access_log.*.txt |
[11/Mar/2020:18:43:38 +0000] "POST /fmeserver/login/now HTTP/1.1" 200 5401 31 31 |
The time to process a single request. A spike in processing time might mean a page isn’t loading or an associated process taking too long to complete. |
tomcat\<server-name>\ localhost_access_log.*.txt |
[11/Mar/2020:18:42:37 +0000] "GET /fmerest/v3/passwordreset/status HTTP/1.1" 200 15 1906 1906 |
Server-side errors (5xx) indicates the server failed to process a request |
tomcat\<server-name>\ localhost_access_log.*.txt |
[11/Mar/2020:18:42:37 +0000] "GET /fmerest/v3/passwordreset/status HTTP/1.1" 500 15 1906 1906 |
FME Flow Log Files
The messages in all FME Flow log files are structured the same way. For each log entry, a status is returned:-
- INFORM - Important information that should be logged under normal conditions.
- WARN - This might be a problem or it might not.
- ERROR - Definitely a problem that should be investigated.
- FATAL - Overall application or system failure that should be investigated immediately.
These statuses can be evaluated and then alerts triggered. Here is an overview of several key metrics that you might want to monitor. Note, the FME Engine log files (Job logs) are separate from these logs and are not covered in this article.
Description |
Log File |
Example Log Pattern |
Database: If FME Flow is unable to communicate with the database | core/current/fmeserver.log (could show up in any of the FME Server logs) |
Mon-16-Mar-2020 05:30:21.810 PM ERROR Thread-0 org.postgresql.util.PSQLException: Connection to localhost:7082 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections. |
Licensing: If the FME Engines are failing to register with the core. | engines/current/ fmeprocessmonitorengine.log | Mon-16-Mar-2020 07:38:26.979 PM INFORM Thread-306 fmesrvr-monitor_Engine3 FME Engine failed to register with FME Server 'fmesrvr-monitor' on port 7070. Failed to verify message authenticity |
Active Directory is in use and users are not correctly importing (problem with sync). | core/current/fmeserver.log | Fri-11-May-2018 10:15:58.277 AM ERROR RequestHandler-Thread 408010 : (Active Directory) Exception: "LDAPException(resultCode=82 (local error), errorMessage='An error occurred while attempting to initialize the JAAS login context for GSSAPI authentication: javax.security.auth.login.LoginException: null (68) caused by KrbException: null (68) caused by KrbException: Identifier doesn't match expected value (906)')" |
Active Directory connection fails | core/current/fmeserver.log | Fri-11-May-2018 10:15:58.277 AM ERROR RequestHandler-Thread 408010 : (Active Directory) Exception: "LDAPException(resultCode=82 (local error), errorMessage='An error occurred while attempting to initialize the JAAS login context for GSSAPI authentication: javax.security.auth.login.LoginException: null (68) caused by KrbException: null (68) caused by KrbException: Identifier doesn't match expected value (906)') |
The FME Flow queue is offline. | core/current/fmeserver.log | Tue-31-Mar-2020 02:20:43.530 PM ERROR Engine-fmesrvr-monitor-fmesrvr-monitor_Engine2 402902 : Failed to connect to Job Queue. Please ensure Job Queue is started. |
Process Crashes: If there are unhandled exceptions and memory errors. | core/ current / fmeprocessmonitorcore.log | Watch for WARN messages which will alert you to the fact there is an issue that the system is trying to respond to, e.g. Tue-31-Mar-2020 02:17:02.167 PM WARN Queue 393562 : Process "Queue" ended unexpectedly. Being restarted on attempt 1... Tue-31-Mar-2020 02:48:27.294 PM WARN Subscriber_workspace 393564 : Process "Subscriber_workspace" ended unexpectedly and has reached its start attempts limit of 20. |
Publishers/Subscribers: They often depend on 3rd party components that are likely to throw errors if they go down. | e.g. core/current/publishers /s3watch.log | Tue-24-Mar-2020 06:22:09.978 PM ERROR main An error occurred: The AWS Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId; Request ID: 261FCF4AAC888A83; S3 Extended Request ID: +o4VCv3c3Mj8SO/ssFgJeldhNTTQ=) |
Comments
0 comments
Please sign in to leave a comment.