
Using the ai Command

Search Command Of The Week: ai

Written by: Robert Caldwell | Last Updated: June 27, 2025
Originally Published: June 27, 2025

Splunk Search Processing Language (SPL) serves as the backbone for data analysis and security operations in Splunk. This powerful query language enables organizations to extract meaningful insights from machine-generated data. With the ai command, Splunk users can interact with generative AI directly through the same SPL syntax they already use for their analysis.

Understanding AI Search Commands

The ai search command builds on the Machine Learning Toolkit (MLTK) to bring generative AI capabilities directly to your data streams. It lets analysts apply GenAI-driven pattern recognition to uncover insights that might otherwise remain hidden in complex datasets. Organizations can use this command to analyze historical and current data trends together, enabling strategic responses to emerging patterns and operational challenges.

 

This command requires both the Splunk Machine Learning Toolkit (MLTK) and the Python for Scientific Computing add-on. Since security is a common concern when AI is involved, the MLTK gives you the flexibility to point the command at your own LLM API or a popular hosted API of your choice. If security, privacy, flexibility, or pricing is a concern, we recommend using your own LLM.

Benefits of Using the AI Search Command

The ai search command delivers significant advantages for daily Splunk operations:

  • Pattern Recognition: You can use this command to automatically identify complex patterns and correlations in your data that traditional searches could miss. Security and operations teams can use this to respond to threats and outages more effectively.
  • Reduced Analysis Time: The command can significantly decrease investigation time through automated intelligence, letting you focus on high-priority tasks instead of manually searching for patterns.
  • Automatic Annotation: LLMs provide automated conclusions instead of requiring you to manually note what is found, and as your prompts and the context you feed them improve, the annotations become more precise and explicit with every use.

Basic Syntax

The ai command uses a syntax that follows this structure:

| ai prompt="<written-prompt [{field-name}]>" [provider=<provider_name> model=<model_name>]

You prompt your LLM of choice and supply data fields for context with braces ({}). The command feeds the values of those fields to the LLM along with your prompt and returns a result. If you have a default provider and model configured, only the prompt is required. Supported providers include OpenAI, Gemini, Bedrock, Ollama, and more; you can even point the command at a private LLM. The provider and model arguments name the provider and model you wish to prompt.
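As a minimal sketch (assuming a default provider and model are already configured, and using a hypothetical test event built with makeresults), the simplest invocation needs nothing but the prompt:

| makeresults
| eval status="ERROR: forwarder connection refused on port 9997"
| ai prompt="Explain this Splunk error message in one sentence: {status}"

Here {status} is expanded to the value of the status field before the prompt is sent to the configured LLM.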

Example: If I wanted to prompt OpenAI's gpt-4o-mini with data, I could use the following:

index=wineventlogs
| dedup eventCode
| ai prompt="I want to ingest into Splunk only Windows event logs which hold value to my business sector of finance, by event code. Here are the event codes I am currently ingesting: {eventCode}. Provide suggestions and explanations as to why certain event codes should be ingested or not." provider=OpenAI model=gpt-4o-mini

This search takes one of each event code within the Windows event logs and feeds it to OpenAI's gpt-4o-mini within the context of the prompt. The model then generates a response for each event code, including an explanation of whether I should continue to index it given that I work in the finance sector. We can use this insight to decide which event codes to keep. We could even feed the returned results back into an LLM within the same search to generate further analysis, a technique called multi-chain prompting.

AI Command Features

#1: Row-By-Row Prompting

Row-by-row prompting runs the ai command against each event individually. This approach enables granular, detailed responses for specific data points, which analysts can use to gain insights into individual events or records. The pattern proves particularly effective for analyzing individual security events or error messages: each record receives tailored analysis based on its specific attributes. Response quality holds up as data volume grows, but runtime grows with it, since every row triggers its own LLM call.

 

For example, a SOC analyst wants to review a series of files quarantined as malware by their security system. In this case, row-by-row analysis gives us detailed reasoning for each verdict.

| inputlookup quarantined_malware.csv
| ai prompt="A file with the name (unknown) was quarantined because of {reason}. It was flagged as being capable of {threat_tags}. What is the likelihood that this file is malicious? Respond with a 'Yes' or 'No' answer followed by an explanation as to why. Link me any documentation or references that back up this conclusion."

With this search, we are feeding the AI each filename, the reason it was quarantined, and the threats it could be capable of (virus, worm, Trojan, ransomware, etc.). Each prompt will incorporate specific details about each file so it can be reviewed. The SOC Analyst can use the results returned by the AI to aid in determining the chances those quarantined files are threats. 
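Because every row triggers its own LLM call, runtime (and any API cost) scales with row count. One way to keep that in check, sketched here with standard SPL commands against the same hypothetical lookup, is to cap the rows before prompting:

| inputlookup quarantined_malware.csv
| head 25
| ai prompt="A file was quarantined because of {reason} and flagged as capable of {threat_tags}. Is this file likely malicious? Answer 'Yes' or 'No' with a brief explanation."

Swapping head for a sort-then-head pipeline lets you prioritize the most recent or highest-severity rows first, so the budgeted LLM calls go to the records that matter most.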

#2: Multi-Chain Prompting

Multi-chain prompting connects multiple ai commands sequentially, so subsequent prompts can reference previous AI results. If a prompt returns information I want to send to an AI again, I can reference {ai_result_1}, the field returned by the first prompt. This approach enables step-by-step investigations and deeper analytical insights. You can even mix different models in the same analytical chain, leveraging specialized models for specific analytical steps.

Let's take our row-by-row example and expand upon it with a second ai command:

| inputlookup quarantined_malware.csv
| ai prompt="A file with the name (unknown) was quarantined because of {reason}. It was flagged as being capable of {threat_tags}. What is the likelihood that this file is malicious? Respond with a 'Yes' or 'No' answer followed by an explanation as to why. Link me any documentation or references that back up this conclusion."
| search ai_result_1="Yes*"
| ai prompt="I've determined that this file found in my network is malicious: (unknown) because of {ai_result_1}. How can I ensure that there are no traces of the threat left in my network? List out a series of tools, applications, and actions I should take to ensure that this file is fully removed from my network."
In this example, we take the results of our first prompt and use them, along with the filename field value for that result, to generate a second prompt. This second call then generates strategies for fully eradicating the malware if it exists elsewhere in the network.
#3: Summarization Approach

Summarization consolidates the entire dataset before it is sent to the LLM. We use this to identify patterns and trends across the dataset, giving us the big picture. To do this, we gather all relevant data points under one value, then send them to the AI in a single call. This method reduces runtime while keeping the full scope of information. The most common way to do this is with "| stats values(<field>) as <field>", which consolidates all values of that field into a multivalue field before prompting the LLM. You can also use a wildcard instead of a field name if you plan on using many fields.

In this example, I want a holistic idea of the data I am currently bringing in for an audit. I want to ensure that it has a relevant purpose to my organization’s needs: 
| tstats count by index sourcetype
| eval data = "Index: ".index." - Sourcetype: ".sourcetype
| stats values(data) as data
| ai prompt="Tell me about the data I have in Splunk. Tell me how this data can be used in security, operations, or mission. {data}"
This search aggregates all the values of the field created with the eval command, called "data". The field holds values that look like "Index: _internal - Sourcetype: scheduler", collected into a list that is sent with my prompt to the AI. The model then returns an explanation of what these index and sourcetype pairs are used for. I can use this for strategic planning on what I should keep and what I should stop ingesting.
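The wildcard form mentioned earlier can be sketched like this (the index, sourcetype, and field names here are hypothetical; stats values(*) as * collapses every remaining field into a multivalue field before the single prompt):

index=firewall sourcetype=pan:traffic
| fields action, dest_port, rule
| stats values(*) as *
| ai prompt="Summarize this firewall activity and flag anything unusual: actions={action}, destination ports={dest_port}, rules={rule}"

The fields command keeps the wildcard from sweeping in internal fields you don't want in the prompt, so only the three named fields reach the LLM.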

Conclusion

The ai search command transforms your Splunk operations by bringing GenAI platforms directly into your searches. Its adaptive capabilities empower analysts to reshape how we handle our data. Organizations can use the ai command to respond effectively to security threats, operational challenges, and mission risks.

Our key takeaways today are: 

  • AI Integration Enhances SPL Capabilities: Seamlessly combine machine learning with your existing Splunk functionality. Your team can leverage generated prompt results as values for further analysis and data transformations, maximizing your investment in Splunk infrastructure.
  • Reduce Operational Overhead: Minimize the time analysts spend on routine tasks and enable them to focus on strategic initiatives while AI handles repetitive processes. This approach optimizes resource allocation and improves overall productivity.
  • Flexible AI Search Solutions for Every Enterprise: The AI search command supports businesses across all sectors and scales, adapting to diverse organizational needs and data types to help drive measurable outcomes. Whether you operate a small business or lead an international corporation, spanning industries from forestry to IT, the AI search command integrates seamlessly into any role you need it to fulfill. 

To access more Splunk searches, check out Atlas Search Library, which is part of the Atlas Platform. Specifically, Atlas Search Library offers a curated list of optimized searches. These searches empower Splunk users without requiring SPL knowledge. Furthermore, you can create, customize, and maintain your own search library. By doing so, you ensure your users get the most from using Splunk.
