Apache Log4j is a logging utility offered as part of Apache Logging Services. Numerous commercial projects depend on Log4j, including systems that send data to Splunk and some Splunk products and apps. A large share of the Java-based software on the Internet relies on it. You can use Splunk to ingest the logs Log4j produces and gain valuable insights into how those applications perform.
A severe zero-day vulnerability in Log4j was disclosed in December 2021. If you are here for details on that vulnerability, feel free to skip to the section near the bottom titled The Dec 2021 Security Issue.
What is Log4j?
Log4j is one of the most widely used logging frameworks for Java-based applications. Ceki Gülcü initially developed it in 2001, and the Apache Log4j team later wrote a new version called Log4j 2. It is part of Apache Logging Services. The project details for Log4j can be found here.
Log4j lets developers choose which log statements to output. Instead of logging everything, the developer controls which messages reach the log, typically by assigning each statement a level such as DEBUG, INFO, WARN, or ERROR.
Log4j is configured through files written in XML, JSON, YAML, or properties format, or programmatically in Java code.
How Splunk ingests Log4j
Ingest Log4j data via monitor or batch inputs when possible. You can find details on configuring monitor and batch inputs here.
Some developers configure Log4j to send events to Syslog instead of writing their own log files. In that case, configure the Syslog server to write the data to disk and then use a monitor input to ingest the data. Another possibility is writing an app-parser for Splunk Connect for Syslog.
Log4j is a pretrained sourcetype in Splunk
If your data uses the common standard output produced by J2EE servers, then this pretrained sourcetype is an option. However, expect to adjust the timestamp recognition and field extractions before that sourcetype yields useful data.
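To see what timestamp and field extraction involves for this kind of data, here is a minimal Python sketch. It is not Splunk configuration. It assumes a hypothetical Log4j pattern layout of the form `%d{ISO8601} %-5p [%t] %c - %m%n`; your applications may well use a different layout, in which case the regular expression needs to change. The function name and sample values are illustrative only.

```python
import re
from datetime import datetime

# Assumed layout (hypothetical): %d{ISO8601} %-5p [%t] %c - %m%n
# Example: 2021-12-09 14:03:22,123 ERROR [main] com.example.OrderService - Payment failed
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+"          # level is padded to 5 characters by %-5p
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<logger>\S+) - "
    r"(?P<message>.*)$"
)


def parse_log4j_line(line):
    """Return a dict of extracted fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None  # e.g. a stack-trace continuation line
    fields = match.groupdict()
    # Log4j's ISO8601 format separates milliseconds with a comma.
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%Y-%m-%d %H:%M:%S,%f"
    )
    return fields
```

Lines that return `None` are usually continuation lines from a stack trace, which is why multiline handling matters for this data.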
The recommended method for ingesting new Log4j data is the same as for any new Splunk data source. Gather representative events of the data, watching especially for multiline Java dumps such as stack traces, and then use some method to preview the data (Data Preview, DSP, Edge Processor, whatever you have available by the time you read this). Again, watch for multiline events, multiple timestamps, and events that change shape.
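The multiline concern can be illustrated with a short Python sketch. It assumes, hypothetically, that every new event begins with a `YYYY-MM-DD HH:MM:SS` timestamp and that any other line (such as a Java stack-trace frame) belongs to the previous event. The sample lines and names are invented for illustration.

```python
import re

# Assumption: a new event always starts with a timestamp like 2021-12-09 14:03:22
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

SAMPLE = [
    "2021-12-09 14:03:22,123 ERROR [main] com.example.App - Request failed\n",
    "java.lang.IllegalStateException: boom\n",
    "\tat com.example.App.run(App.java:42)\n",
    "2021-12-09 14:03:23,001 INFO  [main] com.example.App - Recovered\n",
]


def group_events(lines):
    """Merge continuation lines into the event that precedes them."""
    events = []
    for line in lines:
        line = line.rstrip("\n")
        if EVENT_START.match(line) or not events:
            # A timestamp (or the very first line) opens a new event.
            events.append(line)
        else:
            # Anything else is treated as part of the current event.
            events[-1] += "\n" + line
    return events
```

Calling `group_events(SAMPLE)` turns the four raw lines into two events, with the exception and its stack frame attached to the ERROR event. In Splunk itself, event breaking of this kind is normally handled by props.conf settings such as `LINE_BREAKER` or `BREAK_ONLY_BEFORE` rather than by custom code; the sketch only shows the logic you would be checking in a data preview.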
Information from Log4j
Log4j output holds a wide range of information for different use cases. Depending on what the developer chose to log, it can contain the transactions between users and servers or between servers and back-end systems. It may also include debugging information, server information, or anything else the developer felt like logging. As such, use cases range from security to application monitoring.
How Splunk uses Log4j Itself
Apache Logging Services, particularly Log4j, is standard in apps created with Java. Splunk products based on Java often use Log4j, including DFS, ITSI, UBA, and other apps. See The Dec 2021 Security Issue section below for details.
The Dec 2021 Security Issue
On December 9, 2021, a zero-day vulnerability in Apache Log4j 2 (CVE-2021-44228, commonly called Log4Shell) was reported. This set off a widespread scramble by software producers, hosting providers, and customers to correct the issue before it caused catastrophic problems. The Apache Software Foundation assigned it the maximum Common Vulnerability Scoring System (CVSS) rating of 10 out of 10. A second major vulnerability (CVE-2021-45046) was announced on December 14 and given a 9 out of 10 rating.
Mainstream media and technology news sites covered these Log4j vulnerabilities widely. Often, when people think of Log4j, it is because of these security concerns.
Short Version: Splunk Enterprise (except for DFS customers), Splunk Cloud, and Splunk Enterprise Security were not susceptible to these vulnerabilities.
Several Splunk Apps were impacted, including DSP, IT Essentials Work, IT Service Intelligence, and Splunk UBA using the OVA.
Splunk search is well suited to hunting for signs of Log4j exploitation. Splunk has published a Log4Shell Overview and Resources page with many suggestions for hunting malicious Log4j behavior.
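As a rough illustration of what such hunting looks for, the Python sketch below scans raw log lines for the `${jndi:` lookup string that Log4Shell exploit attempts typically contain. It is not a Splunk search and is not taken from Splunk's resources page; the function name and sample hostnames are hypothetical. It also only catches the plain form, so obfuscated variants built from nested lookups (for example `${${lower:j}ndi:...}`) will slip past it.

```python
import re

# Plain (non-obfuscated) JNDI lookup, e.g. ${jndi:ldap://host/path}
JNDI_RE = re.compile(r"\$\{jndi:(?P<scheme>[a-z]+)://", re.IGNORECASE)


def find_jndi_lookups(lines):
    """Return (line_number, scheme, line) for each line with a JNDI lookup."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = JNDI_RE.search(line)
        if match:
            hits.append((number, match.group("scheme").lower(), line))
    return hits
```

A simple pattern like this is a starting point for triage, not proof of compromise: a hit shows that someone sent the string, not that the lookup was executed.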
In this post we covered what Log4j is, how to ingest its data into Splunk, and the security issue from late 2021. Log4j is still widely used in computing, and Splunk users are well positioned to gain insights from that data.