Skip to content
Article

A Beginner’s Guide to Regular Expressions in Splunk

 

Written by: Kinney Group | Last Updated:

 
November 14, 2022
 
 
 

Originally Published:

 
November 16, 2020

No one likes mismatched data. Especially data that’s hard to filter and pair up with patterned data. A Regular Expression (regex) in Splunk is a way to search through text to find pattern matches in your data. Regex is a great filtering tool that allows you to conduct advanced pattern matching. In Splunk, regex also allows you to conduct field extractions on the fly.

Let’s get started on some of the basics of regex!

How to Use Regex

The erex command

When using regular expression in Splunk, use the erex command to extract data from a field when you do not know the regular expression to use.

Syntax for the command:

| erex <thefieldname> examples=“exampletext1,exampletext2”

Let’s take a look at an example.

In this screenshot, we are in my index of CVEs. I want to have Splunk learn a new regex for extracting all of the CVE names that populate in this index, like the example CVE number that I have highlighted here:

a CVE index
Figure 1 – a CVE index with an example CVE number highlighted

Next, by using the erex command, you can see in the job inspector that Splunk has ‘successfully learned regex’ for extracting the CVE numbers. I have sorted them into a table, to show that other CVE_Number fields were extracted:

a search job inspector window in front of CVE_Number table
Figure 2 – the job inspector window shows that Splunk has extracted CVE_Number fields

The rex Commands

When using regular expression in Splunk, use the rex command to either extract fields using regular expression-named groups or replace or substitute characters in a field using those expressions.

Syntax for the command:

| rex field=field_to_rex_from “FrontAnchor(?<new_field_name>{characters}+)BackAnchor”

Let’s take a look at an example.

This SPL allows you to extract from the field of useragent and create a new field called WebVersion:

an SPL window
Figure 3 – this SPL uses rex to extract from “useragent” and create “WebVersion”

As you can see, a new field of WebVersion is extracted:

a window displaying WebVersion and its data
Figure 4 – the new field in WebVersion

Splunk Pro Tip: There’s a super simple way to run searches simply—even with limited knowledge of SPL— using Search Library in the Atlas app on Splunkbase. You’ll get access to thousands of pre-configured Splunk searches developed by Splunk Experts across the globe. Simply find a search string that matches what you’re looking for, copy it, and use right in your own Splunk environment. Try speeding up your regex search right now using these SPL templates, completely free.

Atlas Search - Contextual

Run a pre-Configured Search for Free

New call-to-action

 

The Basics of Regex

The Main Rules

^ = match beginning of the line

$ = match end of the line

Regex Flags

/g = global matches (match all), don’t return after first match

/m = multi-line

/gm = global and multi-line are set

/i = case insensitive

Setting Characters

\w = word character

\W = not a word character

\s = white space

\S = not white space

\d = a digit

\D = not a digit

\. = the period key

Setting Options

* = zero or more

+ = 1 or more

? = optional, zero or 1

| = acts as an “or” expression

\ = escape special characters

( ) = allows for character groupings, wraps the regex sets

Some Examples

\d{4} = match 4 digits in a row of a digit equal to [0-9]

\d{4,5} = match 4 digits in a row or 5 digits in a row whose values are [0-9]

[a-z] = match between a-z

[A-Z] = match between A-Z

[0-9] = match between 0-9

(t|T) = match a lowercase “t” or uppercase “T”

(t|T)he = look for the word “the” or “The”

Regex Examples

If you’re looking for a phone number, try out this regex setup:

\d{10} = match 10 digits in a row

OR

\d {3}-?\d{3}-?\d{4} = match a number that may have been written with dashes 123-456-7890

OR

\d{3}[.-]?\d{3}[.-]?\d{4} = match a phone number that may have dashes or periods as separators

OR

(\d{3})[.-]?(\d{3})[.-]?(\d{4}) = using parentheses allows for character grouping. When you group, you can assign names to the groups and label one. For example, you can label the first group as “area code”.

 

If you’re looking for a IP address, try out this regex setup:

\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} = searches for digits that are 1-3 in length, separated by periods.

Use regex101.com to practice your RegEx:

a practice search at regex101.com
Figure 5 – a practice search entered into regex101.com

We’re Your Regex(pert)

Using regex can be a powerful tool for extracting specific strings. It is a skill set that’s quick to pick up and master, and learning it can take your Splunk skills to the next level. There are plenty of self-tutorials, classes, books, and videos available via open sources to help you learn to use regular expressions.

If you’d like more information about how to leverage regular expressions in your Splunk environment, reach out to our team of experts by filling out the form below. We’re here to help!

New call-to-action
Helpful? Don't forget to share this post!
Share on linkedin
LinkedIn
Share on reddit
Reddit
Share on email
Email
Share on twitter
Twitter
Share on facebook
Facebook

No comment yet, add your voice below!


Add a Comment

Your email address will not be published. Required fields are marked *