How do we find what data is truly important when all of it seems important?
Finding missing data in Splunk can feel like searching for a needle in a haystack. While we’d like to think we have all our data accounted for, that’s rarely the case. The problem often starts because we’re in such a hurry to get value from a new Splunk deployment that we onboard all of our data at once.
In this guide, we show you how to find missing field values and data in Splunk quickly. We’ll also show you some common pitfalls that cause missing Splunk data in the first place so you can avoid them going forward.
Splunk has wonderful charts, graphs, and even d3.js visualizations to present data in an easily understandable fashion. Often, these graphical representations are what users focus on. Decisions are made and budgets determined based on how the data appears in these visualizations. It’s safe to say that the accuracy of the data behind these visuals needs to be spot on.
The Problem: Inaccurate Data
Splunk brings value through its visualization features. However, for the visuals to be meaningful, the data has to be accurate and complete.
This highlights a challenge: focusing on visualizations often masks incomplete data. Pie charts appear to have all the data, representing it as a “full” circle even when field values are missing. However, that pie chart of external connections to our network is inaccurate if it’s missing one of our firewalls. For example, a security control of “3 failures within 60 minutes per user” is compromised when a third of the data isn’t arriving in Splunk. Let’s take a look at some steps to find that missing data.
The Solution: Finding Your Missing Field Values
There are four reasons your data could be missing and it’s up to you to figure out which one is the culprit. Let’s walk through each one below.
1. Missing Splunk Data Caused by Typos in the Input
There is always the possibility that even though the inputs look correct, there may be a typo that you originally missed. There may also be a configuration that is taking precedence over the one you just wrote. The best way to check is to use btool on the Splunk server configured to receive the data. This command-line interface (CLI) command checks the configuration files, merges the settings that apply to the same stanza heading, and returns them in order of precedence.
When looking for settings that relate to the inputs configured for a data source, this simple command can be run:
./splunk btool <conf_file_prefix> list --app=<app> --debug | grep <string>

Here, <string> is a keyword from the input you are looking for; grepping for it will help you quickly locate the settings that apply to that particular input.
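For instance, suppose you expect a monitor input like the hypothetical stanza below (the app name, index, and sourcetype here are illustrative, not from any real deployment). A one-character typo in the monitored path is enough to silently collect nothing:

```ini
# Expected stanza in $SPLUNK_HOME/etc/apps/my_app/local/inputs.conf
[monitor:///var/log/secure/readthisfile.log]
index = os_linux
sourcetype = linux_secure

# A typo such as "readthisfle.log" in the stanza path would cause Splunk
# to monitor a file that does not exist, with no obvious error in the UI.
```

Running `./splunk btool inputs list --debug | grep readthisfile` shows which configuration file contributed the stanza Splunk actually loaded — or returns nothing at all, which is itself a strong hint that the stanza you think you wrote never made it into the merged configuration.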
2. Missing Splunk Data Caused by User Permissions
More often than not, the issue preventing Splunk from reading the log data is that the user running Splunk doesn’t have the correct permissions to read the file or folder where the log data is stored. This can be fixed by adding the user running Splunk to the group assigned to the file on the server configured to send data to Splunk, and then making sure that group has the ability to read the file. On a Linux host, if you wanted Splunk to read, for example, /var/log/secure/readthisfile.log, you would navigate to the /var/log/secure folder from the command line using the following command:

cd /var/log/secure

Once there, you would list the file’s permissions with this command:

ls -l readthisfile.log
This will return a result that looks similar to the line below:

-rwxr----- creator reader /var/log/secure/readthisfile.log

Here, creator, the user that owns the file, can read, write, and execute it; reader, the group that owns the file, can read it; and all other users cannot read, write, or execute it.
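You can reproduce this permission layout on a scratch file to see how the mode string maps to owner, group, and other access. The path below is just a sandbox stand-in for the real log file:

```shell
# Create a scratch file and give it owner read/write, group read, others nothing.
mkdir -p /tmp/permdemo
touch /tmp/permdemo/readthisfile.log
chmod 640 /tmp/permdemo/readthisfile.log

# The first column of ls -l shows the mode string: -rw-r-----
ls -l /tmp/permdemo/readthisfile.log
```

A mode of 640 grants the owning group read access, which is all the splunk user needs once it is a member of the file’s group.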
Now, in this example, if the user running Splunk is splunk, you can check which groups the splunk user belongs to by running either of the following commands:

id splunk
groups splunk
If the results show that the splunk user is not a member of the reader group, a user with sudo access (or root) can add it to the reader group using the following command:

sudo usermod -a -G reader splunk
3. Missing Splunk Data From Logs
If the Splunk platform’s internal logs are accessible from the Splunk GUI, an admin user can use the following command to check for errors or warnings:
index=_internal (log_level=error OR log_level=warn*)
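To narrow down where the errors are coming from, the same internal search can be extended to group events by the Splunk component that logged them. This is a sketch; adjust the time range and thresholds to your environment:

```
index=_internal (log_level=ERROR OR log_level=WARN*)
| stats count BY component, log_level
| sort - count
```

A component such as TailingProcessor or TcpInputProc dominating the error counts points you straight at file monitoring or network input problems, respectively.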
As a bonus, if your firewall or proxy logs are sent to Splunk and they capture traffic between the data source and the receiving Splunk server, you can search those logs for the IP address and/or hostname of the sending or receiving server to find out whether data is being blocked in transit. On a Linux host, the following commands can also tell you which ports are open:
sudo lsof -i -P -n | grep LISTEN
sudo netstat -tulpn | grep LISTEN
sudo lsof -i:22    ## see a specific port, such as 22
sudo nmap -sTU -O localhost
4. Missing Splunk Data Caused by User Error
When all else fails, check manually with your team to see if they know where the data is. Time is of the essence, so here’s how to do it quickly.
Step 1: Create a list of all the data coming into Splunk. Using an account that can search all the indexes, run the following:
| metadata type=sourcetypes index=* | fields sourcetype, totalCount | sort - totalCount
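A variant of the same metadata search can also flag sourcetypes that have gone quiet, by converting each sourcetype’s most recent event time into a readable timestamp. This is a sketch — the 24-hour threshold is an assumption you should tune to each source’s expected cadence:

```
| metadata type=sourcetypes index=*
| eval lastSeen=strftime(recentTime, "%Y-%m-%d %H:%M:%S")
| where recentTime < relative_time(now(), "-24h")
| table sourcetype, totalCount, lastSeen
```

Anything on this list was sending data at some point and has since stopped — often a faster lead than asking teams what’s missing.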
Step 2: Export the table from the previous step. (Good thing there’s an export button in the Splunk UI!)
Step 3: Send the results to your major teams and ask them, “What’s missing from this list?” When you’re thinking about teams to send this to, think Networking Team, Security Team, Windows Operations, Unix Operations, Applications Teams, etc.
Step 4: Gather a list of which systems and system types are missing and investigate whether this is data that you can onboard.
Missing Splunk Field Value Example
Networking looks at your list of sources and realizes it is missing the Juniper VPN. The Networking team then sends those logs to a syslog server while the Splunk team loads the configs that handle parsing and search.
How to Avoid Missing Data and Field Values in Splunk
1. Identify your use cases
Here are the use cases you’ll need to understand to kick off your missing data search:
- Where are your employees spending most of their time?
- What reports do they have to create manually every month?
- What can be automated using Splunk?
Find the blind spots
- Where are your organizational blind spots?
- Do you know which servers are experiencing the most activity?
- Are the most active servers the ones you thought they would be?
Clarity on systems
- Are you planning for a major expansion or system adoption?
- Do you have enough resources to accommodate the number of users?
- Is access limited to only those users who need it?
- Do you have an effective means of capacity planning?
Look at the ROI
- Can you cut costs?
- Which systems are over- or undersized?
- Do you need more bandwidth?
These and other questions are a good place to start to help you categorize your data needs quickly. Though you will probably not identify all your use cases at once, you will most likely uncover the most pressing issues on the first pass.
2. Prioritize your use cases
Once you have identified the questions you would like to answer, you must arrange your data into categories based on their priority. The easiest grouping is:
- Needs: The things that will benefit the largest group of people and will potentially save your organization money in the long run. The needs are really what bring value to the way the business is run.
- Wants: The things that a subset of users will be delighted to have, but they could continue to function without.
- Luxuries: The cool-to-have things that satisfy a very specific niche request.
These categories will help you segment the use cases into tasks that you should focus on immediately.
3. Identify your data sources
Once you have identified and prioritized the questions you would like to answer, you must identify which data will help you answer those questions. Make sure to consider which data sources will help you satisfy several use cases at once. This will help you correctly determine the size of your daily license and make sure you only focus on the data sources you need to address the needs and wants of your organization.
4. Identify your heaviest users
By creating a list of people who need access to each data source, you can correctly determine how large an environment is needed to support all the data sources you plan to bring in. It also helps when determining each user’s level of access. If a data source is widely popular, it may behoove you to create a dashboard and/or report to quickly disseminate important information that the users may need. It will also help determine how you expand the environment.
By taking these four steps, users will not only feel that their needs are being heard; they will also feel empowered to identify further use cases for future expansion. It frees up their time to focus on more complicated tasks and can mean the difference between being proactive and being reactive. Taking the organization’s greatest needs into account can mean the difference between users adopting a Splunk implementation as their own or discarding it as just another tool.
There’s Your Needle in the Haystack
Collecting and maintaining the correct data sets can be a difficult task. From collaborating with many teams to finding the needle in the haystack of missing data, you’ve got your work cut out for you.
If you found this helpful…
You don’t have to master Splunk by yourself in order to get the most value out of it. Small, day-to-day optimizations of your environment can make all the difference in how you understand and use the data in your Splunk environment to manage all the work on your plate.
Cue Atlas Assessment: Instantly see where your Splunk environment is excelling and opportunities for improvement. From download to results, the whole process takes less than 30 minutes using the button below: