Performance
This page walks you through steps to improve the performance of your Log4j configuration so that it best serves your particular use case.
Logging is an interplay between the logging API (i.e., Log4j API), where the programmer publishes logs, and a logging implementation (i.e., Log4j Core), where published logs get consumed: filtered, enriched, encoded, and written to files, databases, network sockets, etc. Both parties can have a dramatic impact on performance. Hence, we will discuss the performance optimization of each individually:
Using Log4j API efficiently
Log4j API bundles a rich set of features to either totally avoid or minimize expensive computations whenever possible. We will walk you through these features with examples.
Remember that a logging API and a logging implementation are two different things. You can use Log4j API in combination with a logging implementation other than Log4j Core (e.g., Logback). The tips shared in this section are logging implementation agnostic.
Don’t use string concatenation
If you are using String concatenation while logging, you are doing something very wrong and dangerous!
-
Don’t use String concatenation to format arguments! This circumvents the handling of arguments by message type and layout. More importantly, this approach is prone to attacks! Imagine userId being provided by the user with the following content:
placeholders for non-existing args to trigger failure: {} {} {dangerousLookup}
/* BAD! */ LOGGER.info("failed for user ID: " + userId);
-
Use message parameters. Parameterized messages allow safe encoding of parameters and avoid formatting entirely if the message is filtered out. For instance, if the level of the log statement is not enabled for the logger, no formatting will take place.
/* GOOD */ LOGGER.info("failed for user ID `{}`", userId);
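The difference can be made visible with a small, JDK-only sketch. `TinyLogger` and `CountingUserId` are hypothetical stand-ins written for this illustration (they are not Log4j classes, and the `{}` substitution is deliberately naive); the sketch only shows that concatenation renders the argument before the logger is even called, while a parameterized call lets the logger skip rendering when the level is disabled.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ParameterizedLoggingDemo {

    /** Returns the toString() call count after a concatenated and after a parameterized call. */
    static int[] run() {
        TinyLogger logger = new TinyLogger(false); // INFO is disabled
        CountingUserId userId = new CountingUserId();
        CountingUserId.TO_STRING_CALLS.set(0);

        // Concatenation: userId.toString() runs before info() is entered.
        logger.info("failed for user ID: " + userId);
        int afterConcatenation = CountingUserId.TO_STRING_CALLS.get();

        // Parameterized: the logger decides whether to render the argument at all.
        logger.info("failed for user ID `{}`", userId);
        int afterParameterized = CountingUserId.TO_STRING_CALLS.get();

        return new int[] {afterConcatenation, afterParameterized};
    }

    public static void main(String[] args) {
        int[] calls = run();
        System.out.println("toString() calls after concatenation: " + calls[0]);
        System.out.println("toString() calls after parameterized call: " + calls[1]);
    }
}

/** Hypothetical stand-in for a logger; not part of Log4j. */
final class TinyLogger {

    private final boolean infoEnabled;
    private final StringBuilder output = new StringBuilder();

    TinyLogger(boolean infoEnabled) {
        this.infoEnabled = infoEnabled;
    }

    /** Receives an already formatted message; the caller has paid for the formatting. */
    void info(String message) {
        if (infoEnabled) {
            output.append(message).append('\n');
        }
    }

    /** Formats the message only after the level check has passed. */
    void info(String pattern, Object arg) {
        if (infoEnabled) {
            // Naive substitution, sufficient for this single-argument sketch.
            output.append(pattern.replace("{}", String.valueOf(arg))).append('\n');
        }
    }

    String output() {
        return output.toString();
    }
}

/** Hypothetical argument that counts how often it is rendered to a string. */
final class CountingUserId {

    static final AtomicInteger TO_STRING_CALLS = new AtomicInteger();

    @Override
    public String toString() {
        TO_STRING_CALLS.incrementAndGet();
        return "u-42";
    }
}
```

With the level disabled, the only rendering of the argument comes from the concatenated call.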
Use Suppliers to pass computationally expensive arguments
If one or more arguments of the log statement are computationally expensive, it is wasteful to evaluate them when their results may be discarded. Consider the following example:
/* BAD! */ LOGGER.info("failed for user ID `{}` and role `{}`", userId, db.findUserRoleById(userId));
The database query (i.e., db.findUserRoleById(userId)) can be a significant bottleneck if the created log event will be discarded anyway: maybe the INFO level or the associated marker is not accepted for this package, or some other filtering applies.
-
The old-school way of solving this problem is to level-guard the log statement:
/* BAD! */ if (LOGGER.isInfoEnabled()) { LOGGER.info(...); }
While this works for cases where the message is dropped due to an insufficient level, it does not cover other filtering cases; e.g., the level may be enabled while the associated marker is not accepted.
-
Use Suppliers to pass arguments containing computationally expensive items:
/* GOOD */ LOGGER.info("failed for user ID `{}` and role `{}`", () -> userId, () -> db.findUserRoleById(userId));
-
Use a Supplier to pass the message and its arguments containing computationally expensive items:
/* GOOD */ LOGGER.info(() -> new ParameterizedMessage("failed for user ID `{}` and role `{}`", userId, db.findUserRoleById(userId)));
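The options above can be compared in a JDK-only sketch. Everything here is an assumption made for illustration: `TinyLogger` is a hypothetical logger with a level check plus one extra filter (think of a marker filter), and `findUserRoleById` merely counts its invocations instead of querying a database. The sketch shows that a level guard still pays for the query when a later filter drops the event, while suppliers are only invoked once the event is accepted.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class SupplierLoggingDemo {

    /** Counts how often the simulated database query runs. */
    static final AtomicInteger QUERIES = new AtomicInteger();

    static final String PATTERN = "failed for user ID `{}` and role `{}`";

    /** Hypothetical expensive lookup standing in for db.findUserRoleById(userId). */
    static String findUserRoleById(String userId) {
        QUERIES.incrementAndGet();
        return "admin";
    }

    /** Hypothetical logger with a level check plus one more filter; not part of Log4j. */
    static final class TinyLogger {
        private final boolean infoEnabled;
        private final boolean filterAccepts;
        final StringBuilder output = new StringBuilder();

        TinyLogger(boolean infoEnabled, boolean filterAccepts) {
            this.infoEnabled = infoEnabled;
            this.filterAccepts = filterAccepts;
        }

        boolean isInfoEnabled() {
            return infoEnabled;
        }

        /** Eager variant: the arguments were evaluated before this method was entered. */
        void info(String pattern, Object... args) {
            if (infoEnabled && filterAccepts) {
                write(pattern, args);
            }
        }

        /** Lazy variant: suppliers are invoked only after all filters accepted the event. */
        void info(String pattern, Supplier<?>... suppliers) {
            if (infoEnabled && filterAccepts) {
                Object[] args = new Object[suppliers.length];
                for (int i = 0; i < suppliers.length; i++) {
                    args[i] = suppliers[i].get();
                }
                write(pattern, args);
            }
        }

        /** Replaces each `{}` with the next argument, left to right. */
        private void write(String pattern, Object[] args) {
            int from = 0;
            for (Object arg : args) {
                int at = pattern.indexOf("{}", from);
                if (at < 0) {
                    break;
                }
                output.append(pattern, from, at).append(arg);
                from = at + 2;
            }
            output.append(pattern, from, pattern.length()).append('\n');
        }
    }

    /** Returns the query count after the level-guarded call and after the supplier call. */
    static int[] run() {
        String userId = "u-42";
        TinyLogger logger = new TinyLogger(true, false); // level enabled, filter rejects
        QUERIES.set(0);

        if (logger.isInfoEnabled()) { // the guard passes, so the query runs for nothing
            logger.info(PATTERN, userId, findUserRoleById(userId));
        }
        int afterLevelGuard = QUERIES.get();

        logger.info(PATTERN, () -> userId, () -> findUserRoleById(userId));
        int afterSuppliers = QUERIES.get();

        return new int[] {afterLevelGuard, afterSuppliers};
    }

    /** Returns what an accepting logger writes for the supplier-based call. */
    static String acceptedOutput() {
        String userId = "u-42";
        TinyLogger logger = new TinyLogger(true, true);
        logger.info(PATTERN, () -> userId, () -> findUserRoleById(userId));
        return logger.output.toString();
    }

    public static void main(String[] args) {
        int[] queries = run();
        System.out.println("queries after level guard: " + queries[0]);
        System.out.println("queries after suppliers:   " + queries[1]);
        System.out.print(acceptedOutput());
    }
}
```

The two `info` overloads mirror the shape of the eager and supplier-based calls shown above; the compiler picks the supplier variant for lambda arguments because a lambda cannot target `Object`.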
Tuning Log4j Core for performance
The sections below walk you through a set of features that can have a significant impact on the performance of Log4j Core.
Extra tuning of any application moves you away from the defaults and adds to the maintenance load. You are strongly advised to measure your application’s overall performance first and then, only if Log4j turns out to be a significant bottleneck, tune it carefully. When you do, we also recommend re-evaluating your assumptions on a regular basis to check whether they still hold. Remember, premature optimization is the root of all evil.
Remember that a logging API and a logging implementation are two different things. You can use Log4j Core in combination with a logging API other than Log4j API (e.g., SLF4J, JUL, JPL). The tips shared in this section are logging API agnostic.
Layouts
Layouts are responsible for encoding a log event in a certain format (human-readable text, JSON, etc.) and they can have a significant impact on your overall logging performance.
Location information
Several layouts offer directives to include the location information: the caller class, method, file, and line. Log4j takes a snapshot of the stack, and walks the stack trace to find the location information. This is an expensive operation and should be avoided in performance-sensitive setups.
Note that the caller class of the location information and the logger name are two different things. In most setups just using the logger name – which doesn’t incur any overhead to obtain while logging! – is a sufficient and zero-cost substitute for the caller class.
Example demonstrating that the logger name can be a substitute for the caller name
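A minimal, JDK-only sketch of such a situation follows. The class names are hypothetical, the logger name is modelled as a plain constant, and `StackWalker` stands in for the stack walk a location lookup has to perform; it is not a statement about how Log4j resolves locations internally.

```java
final class PaymentService {

    /** Logger name: known once, up front; reading it while logging costs nothing. */
    static final String LOGGER_NAME = PaymentService.class.getName();

    static final class PaymentTransaction {

        /** Caller class: has to be discovered by walking the stack on every call. */
        static String callerClassName() {
            return StackWalker.getInstance()
                    .walk(frames -> frames
                            .findFirst() // top-most frame: this method
                            .map(StackWalker.StackFrame::getClassName)
                            .orElseThrow());
        }
    }

    public static void main(String[] args) {
        System.out.println("logger name:  " + LOGGER_NAME);
        System.out.println("caller class: " + PaymentTransaction.callerClassName());
    }
}
```

The caller class differs from the logger name only by the nested class suffix, so the logger name alone already points to the right source file.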
In the above example, if the caller class (which is expensive to compute!) is omitted in the layout, the produced log line will still be likely to contain sufficient information to trace back the source by just looking at the logger name.
Asynchronous loggers need to capture the location information before passing the log message to another thread; otherwise the location information will be lost after that point. Due to associated performance implications, asynchronous loggers and asynchronous appenders do not include location information by default. You can override the default behaviour in your asynchronous logger or asynchronous appender configuration.
Even if a layout is configured not to request location information, it might use it if the information is already available. This is always the case if the location information is captured at build time using the Log4j Transform Maven Plugin.
Layout efficiency
Not all layouts are designed with the same performance considerations in mind. The following layouts are known to be well-optimized for performance-sensitive workloads:
- JSON Template Layout
-
It encodes log events into JSON according to the structure described by the template provided. Its output can safely be ingested into several log storage solutions: Elasticsearch, Google Cloud Logging, Graylog, Logstash, etc.
- Pattern Layout
-
It encodes log events into human-readable text according to the pattern provided.
Asynchronous logging
Asynchronous logging is useful for dealing with bursts of events. The application thread does the minimum amount of work needed to capture all required information in a log event, and then puts this log event on a queue for later processing by a background thread. As long as the queue is large enough, application threads spend very little time on the logging call and return to the business logic quickly.
Trade-offs
There are certain trade-offs associated with asynchronous logging:
Benefits
- Higher peak throughput
-
Applications that occasionally need to log bursts of messages can take advantage of asynchronous logging. It can prevent or dampen latency spikes by shortening the wait time until the next message can be logged. If the queue size is large enough to handle the burst, asynchronous logging will prevent your application from falling behind during a sudden increase of activity.
- Lower logging latency
-
Logger method calls return faster, since most of the work is done on the I/O thread.
Drawbacks
- Lower sustainable throughput
-
If the sustained rate at which your application is logging messages is faster than the maximum sustained throughput of the underlying appender, the queue will fill up and the application will end up logging at the speed of the slowest appender. If this happens, consider selecting a faster appender, or logging less. If neither of these is an option, you may get better throughput and fewer latency spikes by logging synchronously.
- Error handling
-
If a problem happens during the logging process and an exception is thrown, it is harder for an asynchronous setup to signal this problem to the application. This can partly be alleviated by configuring an exception handler, but this may still not cover all cases.
If logging is part of your business logic, e.g., you are using Log4j as an audit logging framework, we recommend logging those audit messages synchronously.
See mixed synchronous/asynchronous loggers on how to log some messages synchronously.
- Stateful messages
-
Most Message implementations take a snapshot of the formatted message on the calling thread (cf. log4j.async.formatMessagesInBackground). The log message will not change even if the arguments of the logging call are modified later.
There are some exceptions to this rule. MapMessage and StructuredDataMessage, for example, are mutable by design: fields can be added to these messages after the message object was created. These messages should not be modified after they are logged with asynchronous loggers or asynchronous appenders.
Similarly, custom Message implementations should be designed with asynchronous use in mind, and either take a snapshot of their parameters at construction time, or document their thread-safety characteristics (see AsynchronouslyFormattable).
- Computational overhead
-
If your application is running in an environment where CPU resources are scarce, like a VM with a single vCPU, starting another thread is not likely to give better performance.
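The stateful messages drawback listed above can be illustrated with a single-threaded, JDK-only simulation. This is a sketch under simplifying assumptions: a plain `Map` plays the role of a mutable message, an `ArrayDeque` plays the role of the asynchronous queue, and formatting "later" simply means after the caller has modified the map. No Log4j classes are involved.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

final class StatefulMessageDemo {

    /** Returns what gets formatted for the shared message and for the snapshot. */
    static String[] run() {
        // Stands in for the queue between the application and the background thread.
        Queue<Map<String, String>> queue = new ArrayDeque<>();

        Map<String, String> message = new LinkedHashMap<>();
        message.put("user", "alice");

        queue.add(message);                      // risky: the mutable object itself
        queue.add(new LinkedHashMap<>(message)); // safe: a snapshot taken at log time

        // The caller keeps using the message after the logging call returned.
        message.put("user", "bob");

        // Formatting happens later, as it would on a background thread.
        String shared = String.valueOf(queue.remove());
        String snapshot = String.valueOf(queue.remove());
        return new String[] {shared, snapshot};
    }

    public static void main(String[] args) {
        String[] formatted = run();
        System.out.println("shared message: " + formatted[0]);
        System.out.println("snapshot:       " + formatted[1]);
    }
}
```

The shared message reflects the later modification, whereas the snapshot still shows the state at the time of the logging call.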
Asynchronous logging strategies
Log4j provides the following strategies for asynchronous logging:
Asynchronous logger
Asynchronous loggers use the LMAX Disruptor messaging library to consume log events.
Their aim is to return from a log() call to the application as soon as possible.
Asynchronous appender
The asynchronous appender accepts references to other appenders and causes log events to be written to them on a separate thread.
The backing queue uses ArrayBlockingQueue by default, though it can be replaced with a better-performing one suitable for your use case.
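The general idea of handing events to a separate thread through a bounded queue can be sketched with the JDK alone. `AsyncAppenderSketch` is a hypothetical class written for this illustration and is not how Log4j implements its asynchronous appender; the in-memory list stands in for a real destination such as a file, and the shutdown sentinel is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch of an asynchronous appender; not Log4j's implementation. */
final class AsyncAppenderSketch {

    /** Sentinel compared by identity, so no real event can be mistaken for it. */
    private static final String SHUTDOWN = new String("shutdown");

    private final BlockingQueue<String> queue;
    private final List<String> written = new ArrayList<>(); // stands in for a file
    private final Thread worker;

    AsyncAppenderSketch(int queueCapacity) {
        this.queue = new ArrayBlockingQueue<>(queueCapacity);
        this.worker = new Thread(this::drain, "async-appender-sketch");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    /** Application thread: hands the event over; blocks only while the queue is full. */
    void append(String event) throws InterruptedException {
        queue.put(event);
    }

    /** Background thread: takes events in order and writes them to the destination. */
    private void drain() {
        try {
            for (String event = queue.take(); event != SHUTDOWN; event = queue.take()) {
                written.add(event);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Lets the worker finish all pending events, then waits for it to exit. */
    void stop() throws InterruptedException {
        queue.put(SHUTDOWN);
        worker.join();
    }

    /** Logs ten events through a queue of capacity four and returns what was written. */
    static List<String> demo() {
        AsyncAppenderSketch appender = new AsyncAppenderSketch(4);
        try {
            for (int i = 1; i <= 10; i++) {
                appender.append("event " + i);
            }
            appender.stop();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return appender.written; // safe to read: the worker has terminated
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because the queue is bounded, `append` blocks once the worker falls behind, which is the point at which the application ends up logging at the speed of the destination.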
Garbage-free logging
Garbage collection pauses are a common cause of latency spikes, and for many systems significant effort is spent on controlling these pauses.
Log4j allocates temporary LogEvent, String, char[], byte[], etc. objects during steady state logging.
This contributes to pressure on the garbage collector and increases the frequency with which garbage collection pauses occur.
In garbage-free mode, Log4j buffers and reuses objects to lessen this pressure.
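The reuse idea can be sketched with the JDK alone. The class below is a hypothetical illustration of buffering and reusing one object per thread; it does not reproduce what Log4j does internally, and the buffer size and output format are arbitrary choices.

```java
final class ReusableBufferDemo {

    /** One buffer per thread, allocated once and reused for every event. */
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(() -> new StringBuilder(256));

    /**
     * Encodes an event into the calling thread's reusable buffer.
     * The returned buffer is overwritten by the next call on the same thread,
     * so its content must be consumed before encoding another event.
     */
    static StringBuilder encode(String level, CharSequence message) {
        StringBuilder buffer = BUFFER.get();
        buffer.setLength(0); // reset instead of allocating a new builder
        buffer.append(level).append(' ').append(message).append('\n');
        return buffer;
    }

    public static void main(String[] args) {
        System.out.print(encode("INFO", "first event"));
        System.out.print(encode("WARN", "second event"));
    }
}
```

The trade-off of this style is that callers must not hold on to the returned buffer, which is why such reuse is usually kept internal to the logging pipeline.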
See Garbage-free logging for details.