The Needle in the Haystack: Missing Data in Splunk

Splunk has wonderful charts, graphs, and even d3.js visualizations that present data in an easily understandable fashion. Often, these graphical representations of the data are what users focus on. Decisions are made and budgets are set based on how the data appears in these visualizations. It’s safe to say that the data behind these visuals needs to be spot on.

Visualize Your Data

Splunk brings value through its visualization features. However, for the visuals to be meaningful, the data has to be accurate and complete. This highlights a challenge: focusing on visualizations often masks incomplete data. A pie chart always renders as a “full” circle, so it appears to contain all the data even when we are missing data in Splunk. That pie chart of external connections to our network is inaccurate if it’s missing one of our firewalls. Likewise, a security control such as “3 fails within 60 minutes per user” is compromised when a third of the data isn’t arriving in Splunk. Let’s take a look at some steps to find that missing data.

Figure 1 - Pie chart missing data in Splunk
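To make the security-control example concrete, here is a minimal Python sketch of how one missing source can hide an alert. It reads the control as three failures within any 60-minute span per user, which is an assumption about its exact definition. The user name, firewall names, and timestamps are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=60)  # assumed meaning of "within 60 minutes"
THRESHOLD = 3                   # "3 fails"

def users_over_threshold(events):
    """Return users with THRESHOLD or more failures inside any WINDOW."""
    by_user = defaultdict(list)
    for user, ts, _source in events:
        by_user[user].append(ts)
    flagged = set()
    for user, times in by_user.items():
        times.sort()
        # Check every run of THRESHOLD consecutive failures for this user.
        for i in range(len(times) - THRESHOLD + 1):
            if times[i + THRESHOLD - 1] - times[i] <= WINDOW:
                flagged.add(user)
                break
    return flagged

# Hypothetical failed logins: one user, one failure seen by each firewall.
t0 = datetime(2024, 1, 1, 9, 0)
events = [
    ("alice", t0, "fw-a"),
    ("alice", t0 + timedelta(minutes=10), "fw-b"),
    ("alice", t0 + timedelta(minutes=20), "fw-c"),
]

complete = users_over_threshold(events)
# Same data, but the third firewall's logs never arrived.
partial = users_over_threshold([e for e in events if e[2] != "fw-c"])

print(complete)  # {'alice'}
print(partial)   # set()
```

With all three firewalls reporting, the user is flagged. Drop one firewall and the same activity stays under the threshold, with nothing in the output to suggest that data is missing.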


Find Your Missing Data

Step 1: Create a list of all the data coming into Splunk. Using an account that can search all the indexes, run the following:

| metadata type=sourcetypes index=* | fields sourcetype, totalCount | sort - totalCount



Figure 2 - Metadata in Splunk


Step 2: Export the table that resulted from the previous step. (Good thing there’s an export button in the Splunk UI!)

Step 3: Send the results to your major teams and ask them, “What’s missing from this list?” When you’re thinking about teams to send this to, think Networking Team, Security Team, Windows Operations, Unix Operations, Applications Teams, etc.

Step 4: Gather a list of which systems and system types are missing and investigate. Is this data that you can onboard?

Example: The Networking team looks at your list of sourcetypes and realizes the Juniper VPN is missing. Networking then sends the FW logs to a syslog server, while the Splunk team loads the configs that handle parsing and search.
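Once the teams have answered, the comparison in Step 4 can be partly automated. The following is a rough Python sketch, assuming the export from Step 2 is a CSV with the `sourcetype` and `totalCount` columns produced by the search in Step 1. The expected list, the inline sample data, and the `juniper:vpn` sourcetype name are hypothetical placeholders, not values from a real environment.

```python
import csv
import io

def missing_sourcetypes(export_csv, expected):
    """Return expected sourcetypes that are absent from the export or have no events."""
    seen = set()
    for row in csv.DictReader(export_csv):
        if int(row["totalCount"]) > 0:
            seen.add(row["sourcetype"])
    return sorted(set(expected) - seen)

# Stand-in for the file exported in Step 2 (counts are made up).
export = io.StringIO(
    "sourcetype,totalCount\n"
    "cisco:asa,120000\n"
    "WinEventLog,98000\n"
    "linux_secure,4500\n"
)

# Hypothetical list assembled from the teams' answers in Step 3.
expected = ["cisco:asa", "WinEventLog", "linux_secure", "juniper:vpn"]

gaps = missing_sourcetypes(export, expected)
print(gaps)  # ['juniper:vpn']
```

With a real export, an open file handle can be passed in place of the `io.StringIO` object. The result is only as good as the expected list, so the conversations in Step 3 still matter.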

Figure 3 - Pie chart showing all sources in Splunk


There’s Your Needle

Collecting and maintaining the correct data sets can be a difficult task. From collaborating with many teams to finding the needle in the haystack of missing data, you’ve got your work cut out for you.

At Kinney Group, we’ve spent years finding the proverbial Splunk needle amongst a ton of data. Ensuring that you are ingesting the right data in the right way is one of our Splunk specialties. If you have trouble finding missing data or spinning up the right Splunk visualizations, we’re here to help!

