Skip to content
Article

SQL in Splunk: Here’s Everything You Need to Know

KGI Avatar
 

Written by: Michael Simko | Last Updated:

 
December 18, 2023
 
Splunk SQL
 
 

Originally Published:

 
February 10, 2023

Let’s explore how to perform SQL-style functions in Splunk Processing Language (SPL). In this post, we’ll look at the most common SQL commands and map those to SPL.

What is SQL?

Structured Query Language (SQL) is used to interact with databases. Everyone from business analysts to researchers uses SQL to CRUD (Create, Read, Update, Delete) records in databases. Many–perhaps most–Splunk users are familiar with SQL. Translating their SQL knowledge to Splunk empowers their search skills

What role does SPL play in SQL?

The Splunk Processing Language (SPL) is how Splunk users explore data stored in Splunk. SPL is, as the name implies, unique to Splunk. It is a pipe-separated language, where each element processes the results set in turn. SPL is one of the main advantages of Splunk, allowing the discrete search of massive sets of data. SPL has many commands used to enhance the search experience and is what powers searches, reports, and dashboards.

How does SPL 2 interact with SQL?

The confusing-named product of all time, SPL2 is a unified search language that searches differently than SPL. It is not a replacement for SPL; ergo, I bristle at the name since SPL is not going away. Splunk Edge Processor and the forthcoming Splunk Search Experience use SPL2.

SPL2 is closer to SQL and should be an easier transition. We’ll do a separate SPL2 post once the products go live (they are not currently GA as of Jan 2023). We focus on SPL in this post.

Splunk SQL Command Examples

Although there isn’t a direct way to query Splunk using standard SQL commands, there are similar commands you can use to get the job done. In this section, we’ll go through the most common/valuable SQL commands and offer suggestions on methods to use in SPL.

1. SQL Command: SELECT

The SQL SELECT statement retrieves data from a database.

Example SQL Statement: SELECT CustomerName,City,Country FROM OurSales;

In SPL, we can mimic this with the search command. Search has a unique property where the “| search” portion is not required if this is the initial element of the search.

Example SPL Command: index=OurSales | fields CustomerName,City,Country

2. SQL Clause: WHERE

The SQL WHERE clause filters to specific conditions.

Example SQL Statement: SELECT CustomerName,City FROM OurSales WHERE Country=’France’;
In SPL we have a few tricks to match this. To match the exact string above we add the Field and Value as a requirement in our search:

Example SPL Statement: index=OurSales Country=”France”| fields CustomerName,City

Another SPL command we can use later in our pipelines is the | where command, which brings more flexibility, including the ability to compare the values of multiple fields.

3. SQL Operator: LIKE

The SQL LIKE operator enhances the WHERE clause to search for patterns instead of literals. The exact syntax changes depending on your database, but we’ll go with the typical characters of percent and underscore.

Example SQL Statement: SELECT CustomerName,Country FROM OurSales WHERE City LIKE ’%polis;
SPL again has a pair of ways to handle this. To match the example, we can use a wildcard statement in the search command, as * in SPL means any characters. We can switch to using the WHERE command to handle specific elements such as a _ which means a single character.

SPL Example: index=OurSales City=”*polis” | fields CustomerName,Country

SPL Example: index=OurSales | fields CustomerName,Country | <some cool thing that justifies doing more> | where City like “%polis”

SPL Example with generated data: | windbag | fields lang, sample | where sample like “%mund të%”

4. SQL Clause: HAVING

HAVING is a workaround to have aggregate functions in a WHERE clause.

Example SQL: SELECT COUNT(Stores), Country FROM OurSales HAVING COUNT(Stores) > 10;
SPL most easily can have a filter like this using the where command.
SPL Example: index=OurSales | stats count(Stores) as StoreCount by Country | where StoreCount > 10

5. SQL Keyword: INNER JOIN

Inner Joins return the overlap between two data sets. In SQL, that means two sets of tables.

SQL Example:
SELECT Orders.OrderID, Customers.CustomerID
FROM ORDERS
INNER JOIN Customers ON Vendors.VendorID = Customers.CustomerID

Splunk SPL supports inner joins using the join command. The default on the command is to perform an inner join.

SPL Example: index=orders | table ProductID, OrderNum, Price, CustomerID | join type=left CustomerID [search index=customers | table CustomerID, CustomerName, City, Country]

6. SQL Keyword: Left Join

Left Joins all data in the first table with the records in the second table that match the values specified in the join command.
SQL Example:
SELECT Orders.OrderID, Customers.CustomerID
FROM ORDERS
INNER JOIN Customers ON Orders.OrderID = Customers.CustomerID
| ORDER BY Customers.CustomerID

SPL supports left joins using the join command with the type set as left.

SPL Example: index=orders | table ProductID, OrderNum, Price, CustomerID | join type=left CustomerID [search index=customers | table CustomerID, CustomerName, City, Country]

7. SQL Keyword: Outer Join

A SQL Outer Join takes all data that matches the search request from either table.

SQL Example:
Select Customers.CustomerName, Orders.OrderID
FROM Customers
FULL OUTER JOIN Orders
ON Customers.CustomerID=Orders.CustomerID;

Splunk SPL performs this without doing a join, which is good because joins are inefficient. In SPL, you use an OR statement and a filter.

SPL Example: index=orders OR index=customers CustomerID=*

8. SQL Operator: UNION

The SQL UNION operator combines the result set of two or more SELECT statements. In the SQL operator, the columns must be similar and be in the same order.

SQL Example:
SELECT Country FROM Clients
UNION
Select Country FROM Providers
Order BY Country;

There are a few methods to mimic the SQL UNION command.

SPL append commands (append, appendcols) can match the combining of similar datasets. The standard search command is the best choice depending on the use case.

SPL Example: index=Clients OR index=Providers Country=*
The SPL union command may perform similar features but can append non-similar datasets.

Example: SPL Append connecting different searches (and then put in stats and make it prettier)
sourcetype=access_c* action=purchase| timechart count as Purchased | append [search sourcetype=linux_secure process=sshd “Failed password” | timechart count as Failed] | timechart first(*) as *

I prefer to use the SPL union command to mimic SQL union.

Example: SPL union connecting different searches from my college football poll app.
SPL: | [|inputlookup fcs_teams.csv][|inputlookup fbs_teams.csv]

9. SQL Command: DELETE

Splunk Indexes are immutable. You don’t get to delete data from indexes and free up disk space. Instead, admins may mark data non-searchable, but if you keep that data for three years, the impact on storage may be significant.
To use the SPL delete command, add that role to your account, craft a search that returns all the data you want to be removed, then append “| delete” to the command.

10. SQL Command: UPDATE

No. Data in Splunk indexes are immutable.

Angry disclaimer: You can delete items in either csv lookups or KV Store collections. It feels like cheating to provide that as an example when we’ve been discussing indexes.

Master the Transition from SQL to SPL

Splunk SPL is a powerful search language with so many commands that it may feel overwhelming. For users with SQL experience, there are mappings for performing the SQL actions you want in SPL. If you are moving to Splunk and not dabbling, then it makes sense to learn the SPL commands so you have performant searching and avoid the SQL limitations.

Splunk Pro Tip: While it is tempting only to use the most SQL-like commands in SPL, be sure to gauge the impact of your choices. The search feature of SPL often offers better performance than SQL-like commands such as join.

If you found this helpful…

You don’t have to master Splunk by yourself in order to get the most value out of it. Small, day-to-day optimizations of your environment can make all the difference in how you understand and use the data in your Splunk environment to manage all the work on your plate.

Cue Atlas Assessment: Instantly see where your Splunk environment is excelling and opportunities for improvement. From download to results, the whole process takes less than 30 minutes using the button below:

Helpful? Don't forget to share this post!
LinkedIn
Reddit
Email
Facebook