Splunk Subsearches: Correlating Events for Enhanced Insights

Written by: Michael Tolbert | Last Updated: February 5, 2024 | Originally Published: July 21, 2023

A Splunk subsearch is an analytic technique for correlating events in data and discovering key activities that are occurring in your computing environment. Did you know that with a Splunk subsearch, you can write SPL queries that identify the most frequent shoppers on an online store and explore their purchase history? In this article, we will explore Splunk subsearches and their applications, with step-by-step guidance and real-world use cases.

What is a Splunk subsearch?

A Splunk subsearch enables users to narrow down their search results by embedding a secondary search, referred to as a subsearch, in the main search query. The subsearch is run first and acts as a filter, feeding its resulting fields back to the main search query for processing. Because the values the subsearch passes into your main search are computed dynamically, the results can be different each time the main search is run. This provides powerful capabilities for correlating events across multiple data sources and uncovering critical patterns and connections in your events.
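To make the "subsearch runs first, then filters the main search" idea concrete, here is a rough Python model of it. This is only an illustrative sketch, not how Splunk is implemented: the sample events, the function names, and the exact shape of the rendered filter string are assumptions chosen for this example.

```python
from collections import Counter

# Hypothetical sample events; the field names mirror the article's examples.
EVENTS = [
    {"host": "web01", "sourcetype": "my_sourcetype", "status": 200},
    {"host": "web01", "sourcetype": "my_sourcetype", "status": 500},
    {"host": "web02", "sourcetype": "my_sourcetype", "status": 200},
    {"host": "db01", "sourcetype": "other_sourcetype", "status": 200},
]


def run_subsearch(events, sourcetype, field, limit=1):
    """Model of a subsearch that returns the top values of one field.

    Returns one single-field row per value, most frequent first.
    """
    counts = Counter(e[field] for e in events if e["sourcetype"] == sourcetype)
    return [{field: value} for value, _ in counts.most_common(limit)]


def expand_to_filter(rows):
    """Render subsearch rows as an OR-of-ANDs filter expression (assumed shape)."""
    clauses = []
    for row in rows:
        terms = " AND ".join(f'{k}="{v}"' for k, v in row.items())
        clauses.append(f"( {terms} )")
    return "( " + " OR ".join(clauses) + " )"


def run_main_search(events, rows):
    """Keep events that match every field of at least one subsearch row."""
    return [
        e for e in events
        if any(all(e.get(k) == v for k, v in row.items()) for row in rows)
    ]


rows = run_subsearch(EVENTS, "my_sourcetype", "host")  # step 1: subsearch runs first
filter_expression = expand_to_filter(rows)             # step 2: rows become a filter
matched = run_main_search(EVENTS, rows)                # step 3: main search applies it
print(filter_expression)
print(len(matched))
```

Because the filter is rebuilt from whatever the subsearch returns, a change in the underlying events changes which host the main search keeps, without anyone editing the query.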

The Benefits of Splunk subsearches

Splunk subsearches offer a host of advantages for data analysis across multiple data sources. Some key benefits are as follows:

Benefit #1 – Monitoring with Limited Information:

In many cases, interpreting events depends on information that doesn’t exist in a single data source. A subsearch can solve this dilemma by bringing correlating information from multiple data sources into your main search.

Benefit #2 – Better Event Recognition:

By leveraging subsearches, users can create searches that give them a comprehensive view of their data relationships and dependencies.

Benefit #3 – Correlating Events with Different Timestamps:

Splunk subsearches are run first within a main search and can be configured to run over a different time range from the main search.

Types of Splunk subsearches

Splunk subsearches offer several approaches to fine-tune your data analysis:

Type #1 – Basic Subsearches: Perform simple filtering to narrow down search results.

Type #2 – Correlating Subsearches: Link events between different data sources for comprehensive insights.

Type #3 – Time-based Subsearches: Analyze events based on specific time ranges or intervals.

Type #4 – Nested or Multiple Subsearches: Use more than one subsearch in a main search, or nest a subsearch within another subsearch.

Type #5 – Subsearches in Other Commands: Use a subsearch in the join, append, and appendcols commands.

How to Use Splunk Basic Subsearches

Step 1: Create the subsearch first, deciding which field needs to be returned to the main search for processing. Use square brackets around your subsearch, for example: [search sourcetype=my_sourcetype | top limit=1 host | fields host]. This subsearch returns to the main search a single host value that represents the top host in that sourcetype.

Step 2: Combine the main search with the subsearch and use transforming commands (like stats) to render results. Your main search might look as follows: index=* sourcetype=my_sourcetype [search sourcetype=my_sourcetype | top limit=1 host | fields host] | stats values(host) as TopHost

Step 3: Run the search and validate the results.

[Splunk Tip: When using subsearches, by default the maximum number of results returned from a subsearch is 10,000 and the maximum runtime for a subsearch is 60 seconds.]

Use Case Examples for Splunk Subsearches

Use Case #1: Enhancing Customer Insights

Scenario: Utilize subsearches to identify the most frequent shoppers and their purchase history on an online store.

Tools: Splunk, Buttercup Games data

Step 1: Identify the most frequent shopper on the Buttercup Games online store.

sourcetype=access_* status=200 action=purchase

| top limit=1 clientip

Step 2: Explore the purchase history of the VIP shopper.

sourcetype=access_* status=200 action=purchase clientip=87.194.216.51

| stats count, dc(productId), values(productId) by clientip

Step 3: Combine both steps into one search made up of a main search and a subsearch.

sourcetype=access_* status=200 action=purchase [

search sourcetype=access_* status=200 action=purchase

| top limit=1 clientip

| table clientip

]

| stats count, distinct_count(productId), values(productId) by clientip
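The combined search above can be read as two stages: find the top shopper, then summarize only that shopper's purchases. The following Python sketch models those two stages on a handful of made-up events. The sample data, the IP addresses, and the function names are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical purchase events standing in for the Buttercup Games access logs.
PURCHASES = [
    {"clientip": "10.0.0.1", "productId": "A-1"},
    {"clientip": "10.0.0.1", "productId": "B-2"},
    {"clientip": "10.0.0.2", "productId": "A-1"},
    {"clientip": "10.0.0.1", "productId": "A-1"},
]


def top_client(events):
    """Subsearch stage: the single most frequent clientip."""
    return Counter(e["clientip"] for e in events).most_common(1)[0][0]


def purchase_summary(events, clientip):
    """Main search stage: purchase count, distinct products, and product list."""
    products = [e["productId"] for e in events if e["clientip"] == clientip]
    return {
        "clientip": clientip,
        "count": len(products),
        "distinct_products": len(set(products)),
        "products": sorted(set(products)),
    }


vip = top_client(PURCHASES)                # runs first, like the subsearch
summary = purchase_summary(PURCHASES, vip) # filtered by the subsearch result
print(summary)
```

The benefit over the two separate searches in Steps 1 and 2 is that the client IP is never hard-coded, so the summary follows whichever shopper is currently the most frequent.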

Use Case #2: Anomaly Detection

Scenario: Use a subsearch to detect anomalous events in your data.

Tools: Splunk, data source logs, and a lookup table containing the expected matching values.

Step 1: Set up a search that parses the values to inspect for anomalies.

sourcetype=my_source_events

| stats count by events_to_inspect

| rename events_to_inspect as suspected_anomalies

| search NOT [ | inputlookup known_matches.csv ]

| table suspected_anomalies

Step 2: Configure a lookup table named known_matches.csv with the field names known_values and matched_return and their associated values.

known_values, matched_return

value A, value A

value B, value B

value C, value C

Etc.

Step 3: Create the main search and subsearch.

sourcetype=my_source_events

| stats count by events_to_inspect

| rename events_to_inspect as suspected_anomalies

| search NOT [

| inputlookup known_matches.csv

| fields matched_return

| rename matched_return as suspected_anomalies

]

| table suspected_anomalies

Splunk Tip: Additional commands may be required to handle case sensitivity for the known_values and matched_return fields.
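The exclusion logic in this use case, including one possible way to handle the case sensitivity mentioned in the tip, can be sketched in Python as follows. This is an illustrative model only: the observed values, the function names, and the choice to lowercase both sides before comparing are assumptions, not part of the Splunk search above.

```python
import csv
import io

# Hypothetical lookup contents following the layout of known_matches.csv.
LOOKUP_CSV = """known_values,matched_return
value A,value A
value B,value B
value C,value C
"""

# Hypothetical observed values, as the stats step might produce them.
OBSERVED = ["value A", "VALUE B", "value X", "value C"]


def load_known(csv_text, field="matched_return"):
    """Read one lookup column, lowercased so that matching ignores case."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[field].strip().lower() for row in reader}


def find_anomalies(observed, known):
    """Keep observed values that are absent from the known set (the NOT step)."""
    return [v for v in observed if v.strip().lower() not in known]


known = load_known(LOOKUP_CSV)
anomalies = find_anomalies(OBSERVED, known)
print(anomalies)
```

Without the lowercasing step, "VALUE B" would be reported as an anomaly even though it is a known value, which is the kind of false positive the tip warns about.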

Conclusion

In conclusion, Splunk subsearches offer a powerful feature for event correlation and data analysis. By employing these techniques, users can streamline their search process, gain deeper insights, and resolve issues swiftly. Armed with the knowledge of different subsearch types and practical use cases, you can leverage the full potential of Splunk subsearches to transform your data analysis endeavors. Explore these techniques and practice them to master the art of event correlation and enhance your decision-making processes.
