
Using the spath Command

Written by: Carlos Diez | Last Updated: October 23, 2024
Splunk Search Command Of The Week: spath
 
 

Originally Published: October 23, 2024

Splunk Processing Language (SPL) is the heart of Splunk’s search capabilities, enabling users to extract meaningful insights from vast datasets. Among the many useful commands within SPL, the spath command stands out when dealing with structured data formats like JSON and XML. This command is essential because it allows Splunk users to navigate and extract fields from complex, nested data formats. 

Understanding the spath Command

The spath command is a versatile tool for parsing and extracting information from JSON and XML data. Often, raw data ingested into Splunk comes in structured formats where values are nested within key-value pairs or arrays. The spath command simplifies the process of extracting these nested fields, allowing users to drill down into the data without needing to manually untangle it. 

In essence, the command takes raw structured data and makes it readable and actionable for further analysis. This can be incredibly useful when dealing with logs, metrics, or events where the data format is not flat but hierarchical. 

 

Proper Syntax

To effectively use the spath command, understanding its syntax is critical. Here’s a breakdown of the basic structure: 

				
					| spath [input=<field>] [path=<path_expression>] [output=<new_field>] 
				
			
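  • input: This optional argument names the field that contains the JSON or XML data. If omitted, spath reads from the _raw field. 
  • path: This is the location of the value you want to extract from the nested structure. Location steps are separated by periods (for example, transaction.status). Array elements are addressed with curly braces, such as items{0} for a single element or items{} for every element, and XML attributes with {@attribute}. The same path syntax applies to JSON and XML, although array index numbering differs between the two formats. 
  • output: Use this argument to name the new field that will store the extracted value. If you set path without output, the new field takes the path as its name. 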

For example: 

				
					| spath input=data path=order.items{0}.price output=first_item_price 
				
			

This query extracts the price of the first item from an order, placing the result in a new field called first_item_price. 
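
To try this without indexing any data, the search below builds a single test event with makeresults. It is a minimal sketch: the field name data and the order JSON are made-up sample values, and the quotes inside the eval string are escaped with backslashes. Note that spath addresses array elements with curly braces, so {0} selects one element of a JSON array and {} selects all of them.

```spl
| makeresults
| eval data="{\"order\": {\"items\": [{\"price\": 9.99}, {\"price\": 4.50}]}}"
| spath input=data path=order.items{0}.price output=first_item_price
| spath input=data path=order.items{}.price output=all_item_prices
| table first_item_price all_item_prices
```

The second spath call uses the {} form, so all_item_prices should come back as a multivalue field holding every price in the array, which is handy when you do not know the array length in advance.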

Benefits of Using the spath Command

Using the spath command in your Splunk searches offers several advantages. Here are three key benefits: 

  • Efficient Data Parsing: Easily extract nested information from complex JSON/XML structures without needing custom scripts. 
  • Improved Search Performance: By specifying the exact data path, you extract only the fields you need instead of every field in the event, which keeps the search lean. 
  • Enhanced Data Insights: With spath, you can access and analyze previously hidden or difficult-to-reach data, giving you deeper insights into your logs or events. 
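
As a quick illustration of the parsing benefit, spath can also be run with no arguments, in which case it extracts every field it finds in _raw and names each one after its full path. The sketch below uses a made-up sample event built with makeresults. Keep in mind that in this mode spath only examines the beginning of each event (5,000 characters by default, governed by the extraction_cutoff setting in limits.conf), so very large events may need an explicit path or a higher limit.

```spl
| makeresults
| eval _raw="{\"transaction\": {\"id\": \"1234\", \"status\": \"complete\", \"amount\": 250.00}}"
| spath
| table transaction.id transaction.status transaction.amount
```

The extracted field names contain periods, so wrap them in single quotes when you reference them in eval or where, for example 'transaction.status'.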

Usage of the spath Command

The power of the spath command shines when put into practical use. Let’s look at a couple of examples to see how it works in real-world scenarios. 

EXAMPLE #1: Parsing a JSON Log File

Suppose you have JSON logs that record transactions, and you want to extract the status and total amount of a specific transaction. The JSON structure might look like this: 

				
					{ 
  "transaction": { 
    "id": "1234", 
    "status": "complete",
    "amount": 250.00 
  } 
} 
				
			

To extract the status and amount fields, you can use: 

				
					index=transactions 
| spath input=_raw path=transaction.status output=transaction_status 
| spath input=_raw path=transaction.amount output=transaction_amount 
				
			

This search will pull out the transaction’s status and amount, placing them into the transaction_status and transaction_amount fields. 
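
Once the fields exist, they behave like any other Splunk field. The following sketch reproduces the example against a generated test event rather than the transactions index, then filters on the extracted values. The event contents and the threshold of 100 are illustrative assumptions only.

```spl
| makeresults
| eval _raw="{\"transaction\": {\"id\": \"1234\", \"status\": \"complete\", \"amount\": 250.00}}"
| spath path=transaction.status output=transaction_status
| spath path=transaction.amount output=transaction_amount
| where transaction_status="complete" AND tonumber(transaction_amount) > 100
| table transaction_status transaction_amount
```

Because input defaults to _raw, the input argument is left out here. The tonumber call is a defensive choice so the comparison is numeric even if the amount arrives as a string.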

EXAMPLE #2: Querying Splunk Common Information Model (CIM) Data

The Splunk Common Information Model (CIM) normalizes field names across data sources, but the raw events behind it often arrive as structured data. Let’s say your firewall events are logged in JSON with CIM-style field names, and you need to extract the source IP and the action taken by the firewall. 

The spath command allows you to target these fields efficiently: 

				
					index=firewall_logs 
| spath input=_raw path=src_ip output=source_ip 
| spath input=_raw path=action output=firewall_action 
				
			

This query will give you the source IP and firewall action for each log event. 
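
A natural next step is to summarize the extracted fields. The sketch below generates three hypothetical firewall events (the IP addresses and action values are invented for illustration), extracts the same two fields, and counts events per action and source.

```spl
| makeresults count=3
| streamstats count AS n
| eval _raw=case(
    n==1, "{\"src_ip\": \"10.0.0.1\", \"action\": \"blocked\"}",
    n==2, "{\"src_ip\": \"10.0.0.2\", \"action\": \"allowed\"}",
    n==3, "{\"src_ip\": \"10.0.0.1\", \"action\": \"blocked\"}")
| spath path=src_ip output=source_ip
| spath path=action output=firewall_action
| stats count BY firewall_action source_ip
```

Against real data, replace the first three commands with your own base search, such as index=firewall_logs.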

Conclusion

The spath command in Splunk is an indispensable tool for working with structured data like JSON and XML. It enables efficient data extraction, improves search performance, and enhances data insights by allowing users to access deeply nested information quickly and effectively. 

Key Takeaways: 

  • The spath command simplifies parsing and extracting fields from complex data formats. 
  • It improves search performance by allowing users to target specific data paths. 
  • spath is versatile and essential for working with structured data in both everyday tasks and more complex searches. 

Using spath is an excellent way to make your Splunk searches more efficient, especially when dealing with structured data formats. 

 

To access more Splunk searches, check out Atlas Search Library, which is part of the Atlas Platform. Specifically, Atlas Search Library offers a curated list of optimized searches. These searches empower Splunk users without requiring SPL knowledge. Furthermore, you can create, customize, and maintain your own search library. By doing so, you ensure your users get the most from using Splunk.
