
How to Use the CIM to Normalize Splunk Data

 

Written by: The Kinney Group Team | Last Updated: September 26, 2022

Originally Published: September 16, 2022

In previous blogs “Dude, Where’s My Data” (Part One and Part Two), we focused on the essential steps of onboarding your data into Splunk. But if those guidelines didn’t populate the data into the dashboards properly, you may need to explore Common Information Model (CIM) compliance to normalize that data and make it functional. 

In this post, we’ll walk you through what CIM is and how CIM compliance works so you can get usable data into your dashboards in no time.

What is CIM in Splunk?

The Common Information Model (CIM) is the way Splunk identifies, categorizes, and recognizes data. Different sources often use different names for the same piece of data, and Splunk uses the CIM to treat those names as equivalent so the underlying data can be found and correlated together.

The Splunk CIM Data Model

The CIM data model is a way for Splunk to normalize your data by mapping common data types into a simplified, shared model. For example, imagine you are standing in the check-out line at the grocery store. You hear terms like “Climbing Pinky”, “Knock-Out”, and “English Tea.” What are these people talking about? The answer is roses: different names for the same thing.

The same principle is in effect in the CIM. It allows Splunk end users and apps to search common fields across many sourcetypes. Instead of different names for roses, different sourcetypes may use “user_ID” or “username” or “Login_ID” for the field that identifies the entity using a particular system: the “user.” The CIM takes these different names for the same function or entity and maps them to one common name across all Splunk data. Using the CIM is a way of normalizing data for maximum efficiency at search time.

How Splunk CIM Data Models Work

Much of the work behind normalizing data is already done for you. The CIM has a library of models that already have common data types normalized. Most apps come CIM-ready and take advantage of these models in their dashboards and searches. Here is a list of data models included in the CIM:

  • Alerts
  • Application State
  • Authentication
  • Certificates
  • Databases
  • Data Loss Prevention
  • Email
  • Interprocess Messaging
  • Intrusion Detection
  • Inventory
  • Java Virtual Machines
  • Malware
  • Network Resolution (DNS)
  • Network Sessions
  • Network Traffic
  • Performance
  • Ticket Management
  • Updates
  • Vulnerabilities
  • Web

The CIM is not restricted to just what is in the listed models. You can add new fields to the model as needed. For example, a new user field might be “system_user” which could be added under “user” in the model. The process is as follows:

  • Extract Data Fields: Find the field of data you want to add to the data model.
  • Normalize: Add the data to the CIM in the appropriate model.
  • Tag: Add a tag to that field and data so that it can be found across all searches
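
The three steps above can be pictured with a small Python sketch. Splunk performs them through its own configuration rather than through code like this, so treat it purely as an illustration. The raw log line, the regular expression, the “system_user” field, and the “authentication” tag are all assumptions made up for the example.

```python
import re

# Hypothetical raw event; the format is invented for this illustration.
RAW_EVENT = "2022-09-16T10:15:00 login system_user=jdoe result=OK"

def extract_fields(raw):
    """Step 1: pull key=value pairs out of the raw event text."""
    return dict(re.findall(r"(\w+)=(\S+)", raw))

def normalize(fields, aliases):
    """Step 2: copy each source-specific field to its common name."""
    normalized = dict(fields)
    for source_name, common_name in aliases.items():
        if source_name in fields:
            normalized[common_name] = fields[source_name]
    return normalized

def tag(fields, tags):
    """Step 3: attach tags so the event can be found with similar events."""
    tagged = dict(fields)
    tagged["tags"] = sorted(tags)
    return tagged

event = extract_fields(RAW_EVENT)
event = normalize(event, {"system_user": "user"})  # assumed alias
event = tag(event, {"authentication"})             # assumed tag
print(event)
```

The original “system_user” field is kept next to the new “user” field, so nothing from the source event is lost by normalizing it.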

How to Use CIM in Splunk

Making data CIM compliant is easier than you might think. Here are the four steps to making your data CIM compliant:

  1. Ensure the CIM is installed in your Splunk environment.
  2. Ensure your data has the proper sourcetype. 
  3. Extract fields from your data.  
  4. Create an alias in the CIM.

Here is an example of creating an alias:

username AS user

With this alias in place, events that contain the “username” field are also returned by Splunk searches on “user.” With all the terms for “user” in the CIM, a single search for “user” will return every variant, such as “userID,” “system_user,” “username,” etc., as long as they are in the CIM. Otherwise, a search might have to look something like this:

index=network "user" OR "userID" OR "system_user" OR "username"

The CIM normalizes these terms so that all items in the network index that have an identifier for the entity using the system can be returned with just the term “user.”
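
To see why that matters for searching, here is a minimal Python sketch of the idea. It is not a model of Splunk's actual search engine. The sample events and the ALIASES table are hypothetical; in Splunk the aliases would be defined in configuration and applied at search time.

```python
# Hypothetical events from three sources, each naming the user differently.
EVENTS = [
    {"sourcetype": "vpn", "userID": "jdoe"},
    {"sourcetype": "mail", "username": "asmith"},
    {"sourcetype": "hr_app", "system_user": "jdoe"},
]

# Assumed alias table: source field name -> common field name.
ALIASES = {"userID": "user", "username": "user", "system_user": "user"}

def apply_aliases(event):
    """Return a copy of the event with the common field added."""
    result = dict(event)
    for source_name, common_name in ALIASES.items():
        if source_name in event:
            result[common_name] = event[source_name]
    return result

def search_without_aliases(events, value):
    """Without normalization, every possible field name is checked by hand."""
    names = ("user", "userID", "system_user", "username")
    return [e for e in events if any(e.get(n) == value for n in names)]

def search_with_aliases(events, value):
    """After aliasing, one field name covers every source."""
    return [e for e in map(apply_aliases, events) if e.get("user") == value]

matches = search_with_aliases(EVENTS, "jdoe")
print([m["sourcetype"] for m in matches])
```

Both functions find the same events. The difference is that the second one does not need to know how each source names its user field.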

Figure 2 – Fields of Authentication event datasets

The CIM can also make use of calculated fields. In this example, the “action” field has several possible outcomes, and a source may record them with its own values. A calculated field rewrites the raw value into a common one, so a search on “action” sees consistent results. Take a look at this example:

action=if(action="OK","success","failure")

In this way, a raw value of “OK” is reported as “success” and any other value as “failure,” so a search on the “action” field works with the same values no matter how each source records the outcome.
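
The logic of that expression can be mirrored in a few lines of Python. This is only a sketch of what the calculation does to each event, and the sample raw values are made up for illustration.

```python
def normalize_action(event):
    """Mirror if(action="OK","success","failure") for a single event."""
    result = dict(event)
    # "OK" becomes "success"; every other value becomes "failure".
    result["action"] = "success" if event.get("action") == "OK" else "failure"
    return result

# Hypothetical raw values as a source might log them.
raw_events = [{"action": "OK"}, {"action": "DENIED"}, {"action": "TIMEOUT"}]
normalized = [normalize_action(e) for e in raw_events]
print([e["action"] for e in normalized])
```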

Figure 3 – Tags used with Authentication event datasets

Lastly, tag your data fields to match them up with a data set within the CIM.

Need Help Simplifying Your Data?

Kinney Group can help jump-start your dashboards by helping you make your data CIM compliant. Our team has real-world experience in matching your data types, extracting fields, and putting them into the CIM so that your data can work for you instead of you working your data to get the critical results you need.

If you found this helpful… 

You don’t have to master Splunk by yourself in order to get the most value out of it. Small, day-to-day optimizations of your environment can make all the difference in how you understand and use the data in your Splunk environment to manage all the work on your plate.

Cue Atlas Assessment: a customized report to show you where your Splunk environment is excelling and opportunities for improvement. Once you download the app, you’ll get your report in just 30 minutes.
