Splunk Search Command of the Week: join

When searching across your data, you may find yourself trying to pull fields and values from two different data sources. Let's say you're trying to match IP information from one index against another index that contains CIDRs, or you're comparing values against a lookup because you really need to find values that match or don't match.

In these cases, there is a command we can use to achieve the results we want.

join

If you're looking to pull your data from two different sources, the join command is the one for you.

WARNING: the join command should not be used lightly. While on the surface it seems like a solution that could be applied to everything, there are a few things we need to know.

  • Join requires a subsearch. This means a second search inside our main search will run and retrieve results first, and those results are then applied to the results of the main search
  • The subsearch is limited to returning ONLY the first 50,000 results
  • Search times are not reduced. If you build a complicated subsearch that takes a long time to complete, it will always take a long time to complete, and you will still have to wait for the main search to finish

How to Use join

Now that we know what to prepare with join, let’s take a look at the syntax:

| join type=<left|inner> <matching field> [subsearch]

Type = there are two types of joins, left and inner

  • A left join produces ALL of the results from the main search joined with matching results from the subsearch
  • An inner join produces only results where the main search and subsearch match

Matching field = a field whose name is the same in both searches and correlates between the two data sets
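Conceptually, the two join types work like this. Below is a rough Python sketch (not Splunk's implementation) using made-up result sets, where ip is the matching field:

```python
# Sketch of left vs. inner join semantics over two small result sets.
main = [
    {"ip": "10.0.0.1", "host": "web01"},
    {"ip": "10.0.0.2", "host": "web02"},
]
sub = [
    {"ip": "10.0.0.1", "listed": True},
]

def join(main_rows, sub_rows, key, how="left"):
    # Index the subsearch results by the matching field, keeping the
    # first match per key value.
    lookup = {}
    for row in sub_rows:
        lookup.setdefault(row[key], row)
    out = []
    for row in main_rows:
        match = lookup.get(row[key])
        if match is not None:
            out.append({**row, **match})
        elif how == "left":
            out.append(dict(row))  # left join keeps unmatched main rows
    return out

left_rows = join(main, sub, "ip", how="left")    # all main rows survive
inner_rows = join(main, sub, "ip", how="inner")  # only matching rows survive
```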


join In Action

Let's look at a sample search that draws a simple picture of what you can do with join.

index=test
| dedup ip
| eval temp_value=0
| table ip temp_value
| join type=left ip
[|inputlookup blacklist.csv | rename ip_address as ip | eval temp_value=1 | table ip temp_value]
| table ip temp_value
| where temp_value=0

In this search, we are looking for IP addresses that are not found on our IP blacklist. First, we create a temporary value that assigns a zero to every IP address in our data. Then, we use join to bring in the IPs from our blacklist; every IP that matches between the two has its temp_value changed from 0 to 1. Finally, we filter with "where temp_value=0" to keep only the IPs that did not match.
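For intuition, here is a rough Python analogue of that SPL (the IPs and blacklist contents are invented): every event starts with temp_value=0, matches against the blacklist flip it to 1, and the final filter keeps the zeroes.

```python
# Rough Python analogue of the join-based blacklist check above.
events = [{"ip": ip, "temp_value": 0} for ip in ["1.1.1.1", "2.2.2.2", "3.3.3.3"]]
blacklist = {"2.2.2.2"}  # stand-in for blacklist.csv's ip_address column

for event in events:
    if event["ip"] in blacklist:
        event["temp_value"] = 1  # matched the subsearch results

# "| where temp_value=0": keep only IPs not on the blacklist.
clean = [e["ip"] for e in events if e["temp_value"] == 0]
# clean == ["1.1.1.1", "3.3.3.3"]
```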

Join can be a very powerful tool for building coherent tables of data from multiple sources. However, we want to use it responsibly, so we don’t accidentally clog up our environment. Whenever possible, try to find alternative solutions before using the join command.

Ask the Experts

Our Splunk Search Command of the Week series is created by our Expertise on Demand (EOD) experts. Every day, our team of Splunk certified professionals works with customers through Splunk troubleshooting support, including Splunk search command best practices. If you're interested in learning more about our EOD service or chatting with our team of experts, fill out the form below!


Splunk Search Command of the Week: chart

 

This week, let's chat about the chart command.

The chart command is a transforming search command that puts your data into a graphical visualization. Like the stats command, the chart command can perform statistical functions such as count, avg, min, and max. The chart command is most useful when you want to build your chart from fields that do not involve time. Timechart and chart are similar; however, when you use the timechart command, your chart's x-axis always represents time. With the chart command, you can put any field you specify on the x-axis by using the over clause.
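To make the over/by idea concrete, here is a small Python sketch (not Splunk internals) of what "chart count over Age by IMDb" conceptually computes: one count per (x-axis value, series value) pair. The sample rows are invented:

```python
from collections import defaultdict

# Sketch: chart count over Age by IMDb == count per (over value, by value) pair.
rows = [
    {"Age": "7+", "IMDb": 7.5},
    {"Age": "7+", "IMDb": 8.0},
    {"Age": "18+", "IMDb": 7.5},
]

table = defaultdict(int)
for row in rows:
    # Key each count by (x-axis value, series value).
    table[(row["Age"], row["IMDb"])] += 1
```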

 

Chart in Action

Let's check out this dataset reviewing the IMDb ratings of Netflix TV shows and movies.

 

Over and By Clause

Here’s an example of chart command and the over clause in action.

 

Figure 1 – Chart command and the over clause

Notice that the x-axis is represented by the Age field. This is a product of using the over clause and letting Splunk know that you want Age to be on the x-axis.  The chart command also allows you to manipulate the y-axis by using the by clause.


Here is an example of using the over clause and the by clause together. You can see the chart broken down over Age by IMDb which is the ratings of those movies in that specific age group.

 

Figure 2 – Chart command and the over clause and by clause

 

Remove NULL and OTHER

The legend on the right-hand side has all the ratings in different colors. You’ll also see two values you may not necessarily be interested in… NULL and OTHER. Chart and timechart commands automatically filter results to include the ten highest values while the surplus values are grouped into the OTHER category. In this particular search, our results are skewed by the NULL and OTHER values.

To remove the NULL and OTHER values, use these two arguments: useother=f and usenull=f. After applying useother=f and usenull=f, you get the results you see below. You can see how much better and cleaner the data looks without the OTHER and NULL values.
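Conceptually, the top-ten-plus-OTHER behavior looks like the following Python sketch (made-up data, top 2 instead of top 10 to keep it short); useother=f and usenull=f simply drop those two buckets:

```python
from collections import Counter

# Sketch of chart/timechart series limiting: keep the top N series,
# fold the rest into OTHER, and bucket missing values as NULL.
values = ["8.1", "7.5", "8.1", None, "6.0", "8.1", "7.5", "5.5"]
N = 2  # Splunk's default is 10; 2 keeps the example small

counts = Counter("NULL" if v is None else v for v in values)
top = dict(counts.most_common(N))
other = sum(c for k, c in counts.items() if k not in top and k != "NULL")
series = {**top, "OTHER": other, "NULL": counts.get("NULL", 0)}

# With useother=f and usenull=f, only the top-N series remain:
filtered = {k: c for k, c in series.items() if k not in ("OTHER", "NULL")}
```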

 

Figure 3 – Remove NULL and OTHER from your chart legend

 

The Limit Argument

If you want to adjust the number of series Splunk returns, use the limit argument. With limit, specify how many series values you'd like Splunk to return. If you want Splunk to return an unlimited number of values, use limit=0. Let's take a look at this in action. After applying a limit argument of 20, this is what Splunk brings back.

 

Figure 4 – Chart command series limit of 20

Next, let's see what an unlimited number of values looks like.

 

Figure 5 – Chart command series unlimited

There you have it. Splunk has brought back all of the IMDb ratings associated with the movies in each age group. Now, you’ve seen chart command in action and its visualization options.


Splunk Search Command of the Week: timechart

 

STATS commands are some of the most used commands in Splunk for good reason. They make pulling data from your Splunk environment quick and easy to understand. But what if you wanted to take your STATS command one step further and see a time breakdown of that data?

We’ve got you covered. In this quick post, we’ll show you how to use the timechart command in Splunk, which timescales you can use, and the agg clauses that can help you further parse through your data.

STATS Command vs. timechart Command

On the surface, it may appear that timechart works exactly like the STATS command. However, it is important to note that there are a few key differences with timechart:

  • Timechart calculates statistics like STATS does, including functions like count, sum, and average. However, it bins the events into buckets of time designated by a time span
  • Timechart formats the results into an x and y chart where time is the x-axis (first column) and the y-axis (remaining columns) is a specified field

Understanding these differences will prepare you to use the timechart command in Splunk without confusing the use cases.

How To Use timechart in Splunk

Now, let’s take a look at the syntax of a common use of the timechart command.

|timechart span=<time value> agg() by <field>

Splunk Tip: The by clause allows you to split your data, and it is optional for the timechart command.

Span = this will need to be a period of time like hours (1hr), minutes (1min), or days (1d)

Timescale   Syntax                              Example
seconds     s | sec | secs | second | seconds   5s
minutes     m | min | mins | minute | minutes   30m
hours       h | hr | hrs | hour | hours         12h
days        d | day | days                      5d
weeks       w | week | weeks                    2w
months      mon | month | months                3mon

Agg() = this is our statistical function; examples are count(), sum(), and avg()

Function   Definition
count()    Counts the number of entries per timespan.
sum()      Finds the total sum per timespan.
avg()      Finds the average value per timespan.
min()      Finds the minimum value per timespan.
max()      Finds the maximum value per timespan.

By using the timechart search command, we can quickly paint a picture of activity over periods of time rather than the total for the entire time range.
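The binning idea is simple to sketch in Python (made-up epoch timestamps, not Splunk internals): snap each timestamp down to the start of its span bucket, then aggregate per bucket:

```python
from collections import Counter

# Sketch of timechart's core idea: snap each event's epoch timestamp
# down to the start of its span bucket, then count events per bucket.
SPAN = 3600  # span=1h, expressed in seconds
timestamps = [1700000100, 1700001200, 1700003800, 1700007300]

buckets = Counter((t // SPAN) * SPAN for t in timestamps)
# Each key is the start of an hour-long bucket; each value is the count
# for that bucket (the y-axis value of a "timechart span=1h count").
```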


Splunk timechart Examples & Use Cases

Let’s take a look at a couple of timechart examples.

1. Find the number of saved searches run throughout the day.

index=_internal sourcetype="scheduler" search_type=scheduled | timechart span=1hr count
Figure 1 – Saved search statistics using timechart

2. Find the number of successful purchases per day by genre.

index=tutorial sourcetype=access_combined_wcookie action=purchase status=200 | timechart span=1d count by categoryId
Figure 2 – Breakdown of purchases per day using timechart

3. Find the total login attempts per user.

index=_audit action="login attempt" | timechart span=1hr count by user

The beautiful part about timechart is that it gives us great insight into daily, weekly, or even hourly activity within our environment. When we start visualizing the results from timechart, we can easily find spikes, lulls, or other anomalies that need further investigation.

If you found this helpful… 

You don’t have to master Splunk by yourself in order to get the most value out of it. Small, day-to-day optimizations of your environment can make all the difference in how you understand and use the data in your Splunk environment to manage all the work on your plate.

Cue Atlas Assessment 30-day free trial: a customized report to show you where your Splunk environment is excelling and opportunities for improvement. You’ll get your report in just 30 minutes.


Splunk Search Command of the Week: lookup

 

Lookups are a vital part of Splunk. This Splunk search command can be used to enrich data and provide critical insights into the events users are ingesting. Whether it be blacklisted IPs, geo-locations, or product information, you can utilize lookups to find outstanding issues or suspicious events in your environment.

Once your lookups are in Splunk, how do you tie them to your event data? Great question: there are several ways to do this. You might already be familiar with using the Splunk search command join to create a subsearch and using inputlookup to bring in the information from the lookup. But what if I told you there was a much easier, much more efficient way to do this?

Ta-da: the lookup command. By using the lookup search command, you no longer have to write subsearches or use the join command at all. Instead, we can use this one-stop-shop command to easily integrate our lookup information into our data.

How To Use lookup

Let’s look at the syntax…

| lookup <lookup_name> <correlating field> OUTPUT <field> <field> … <field>

  • <lookup_name> – the name of your lookup
  • <correlating field> – a field (or field values) that matches between both the event data and the lookup
  • OUTPUT – the fields listed after it are the fields we bring over from the lookup
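Conceptually, the lookup command behaves like the following Python sketch (the CSV contents and field names here are invented for illustration): build a map keyed on the correlating field, then copy the OUTPUT fields onto each matching event.

```python
import csv
import io

# Sketch of "| lookup prices.csv productId OUTPUT product_name price":
# enrich events by matching a field against a lookup table.
prices_csv = """productId,product_name,price
WC-SH-G04,Tablecloth,24.99
DB-SG-G01,Mug,6.99
"""
# Index the lookup rows by the correlating field.
prices = {row["productId"]: row for row in csv.DictReader(io.StringIO(prices_csv))}

events = [{"productId": "WC-SH-G04", "action": "purchase"}]
for event in events:
    match = prices.get(event["productId"], {})
    # Copy over only the OUTPUT fields.
    event["product_name"] = match.get("product_name")
    event["price"] = match.get("price")
```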

Then, check out these fields…

Figure 1 – List of field values

 

Example: This particular data set is product purchasing information from a web storefront. As we can see, there are a lot of good fields pertaining to the total sales of a product. In this case, we have…

  • productId
  • action
  • status

However, there are a few fields not listed that would really paint the full picture of sales performance. Think of fields like product_name and price. Fortunately, you can add a lookup that contains that information.

Figure 2 – Adding lookup to fields

 


 

lookup Results 

Once we've ingested the lookup into Splunk, we can use the lookup command to start bringing that data over to our event data. Check out this search to do just that.

<base_search>|lookup prices.csv productId OUTPUT product_name price

Then, run the search and take a look back at your fields. You can see that product_name and price are now fields that we can manipulate and search on.

Figure 3 – Lookup results in searchable fields

Finally, with the lookup search command, you can see your data integrated with your lookup information.

Ask the Experts

If you enjoyed this article, you may be wondering what other use cases there are for lookup. Take a look at Splunk Search Command of the Week: iplocation to read how to include geolocation information in your data.


How To Use The Splunk iplocation Command (+ Expert Tips)

 

Splunk is full of hidden gems. One of those gems is the Splunk search command iplocation. By utilizing particular database files, iplocation can add geolocation information to the IP address values within your data. If you are ingesting data that contains an external IP address field, such as web storefront or VPN access logs, you can find location information such as the country, city, and region to which the IP address belongs.

In this post, we'll cover how to use the iplocation command in Splunk and some helpful tips you'll need along the way.

How To Use the iplocation Command in Splunk

|iplocation <ip_field>

Step 1: Type the iplocation command into your search bar.

|iplocation

Here is sample data that was ingested containing external IPs under the field name clientip.

Figure 1 – iplocation sample data

Step 2: Add the field that you want to use.

In this example, we're using clientip because these are the IP addresses we want to use the command for.

| iplocation clientip
Figure 2 – Add clientip to your search

Splunk Tip: The iplocation command's allfields argument is false by default. When you add allfields=true to the search, it adds a few more fields to the columns.


Step 3: Run a command using the fields that are present in the iplocation search.

If we look at our interesting fields, we’ll see some new additions.

Figure 3 – Review your interesting fields

Here, we can see City and Country as fields added by the iplocation command. So, we can use another command like stats or table to retrieve more information about those fields.
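Under the hood, iplocation consults a bundled geolocation database; the Python sketch below fakes that database with a tiny made-up dict just to show the shape of the enrichment:

```python
# Sketch only: iplocation reads a bundled MaxMind-style database; here a
# tiny invented dict stands in for it to show the enrichment shape.
GEO_DB = {
    "93.184.216.0": {"City": "Norwell", "Country": "United States"},
}

def iplocation(events, ip_field):
    # Add City/Country fields to each event, like "| iplocation clientip".
    for event in events:
        geo = GEO_DB.get(event.get(ip_field), {})
        event["City"] = geo.get("City")
        event["Country"] = geo.get("Country")
    return events

enriched = iplocation([{"clientip": "93.184.216.0"}], "clientip")
```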

Now that geolocation fields have been added to your fields list, add them to your search.

Figure 4 – Add geolocation fields to your search

Splunk Tip: When using iplocation, the addresses must be external. Internal addresses may cause the command to work incorrectly.

iplocation Results 

Figure 5 – iplocation results

There you have it. As you can see, we have successfully added geographical information to our IP addresses. By using this Splunk search command, you can build heatmaps and cluster map dashboards to visualize activity around the globe.


Splunk Search Command of the Week: STATS

 

Here's the situation: You're a security analyst who's been tasked with finding different attacks on your servers. You need to find various events relating to possible brute force attempts, suspicious web page visits, or even suspicious downloads.

This probably isn't much of a hypothetical; it might be a reality for a lot of people. We get it. Security is incredibly important in the era of technology. Fortunately, Splunk makes it easy to find this information by using the STATS search command.

What is the STATS command in Splunk?

STATS is a Splunk search command that calculates statistics. Those statistical calculations include count, average, minimum, maximum, standard deviation, etc. By using the STATS search command, you can get a high-level calculation of what's happening on your machines.

The STATS command is made up of two parts: aggregation and a by-clause (field). The aggregation part of the command has multiple options to choose from while the by-clause or field is optional.

|stats <aggregation> BY <field>

<aggregation> = count, avg(), max(), sum()
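As a mental model, a search like "| stats count by user" boils down to this Python sketch (the events are made up):

```python
from collections import Counter

# Sketch of "| stats count by user": count events per value of the BY field.
events = [{"user": "admin"}, {"user": "admin"}, {"user": "guest"}]

counts = Counter(e["user"] for e in events)
# counts maps each BY-field value to its event count.
```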

How to Use a STATS Command

Step 1: Find your data.

For this example, we’re using event log data.

Step 2: Run a STATS count.

| stats count

In this command, count is the aggregation. It applies to all of the events in the event log data we pulled in step one.

STATS Use Cases

Let’s take a look at a couple of use cases:

Use Case #1: You want to look at the number of failed login attempts.

index=_audit action="login attempt" info=failed | stats count by user
Figure 1 – Number of failed logins by user

Use Case #2: You want to identify values like the average, shortest, and longest runtime of saved searches.

index=_internal sourcetype="scheduler" search_type=scheduled | stats avg(run_time) min(run_time) max(run_time)
Figure 2 – Average, shortest, and longest runtime of saved searches

 


 

STATS Tips and Tricks

You can use the STATS command to find calculations and perform investigations, but you can also enhance it to make results more readable or to increase discovery in your environment.

Use the 'as' keyword to rename fields in the STATS command. This will enable you to create specific tables without making further tweaks.

index=_internal sourcetype="scheduler" search_type=scheduled | stats avg(run_time) as AVGRUNTIME min(run_time) as MINRUNTIME max(run_time) as MAXRUNTIME

Use the 'values' aggregator to list all unique values found in a field. This helps discover the full range of not just numbers but also words found in the field, showing differences more clearly.

index=_audit info=failed | stats values(action) as actions by user
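Conceptually, values() collects the distinct values of one field per by-clause value, as in this Python sketch with invented events:

```python
# Sketch of "| stats values(action) as actions by user": gather the unique
# values of one field for each value of the BY field.
events = [
    {"user": "admin", "action": "login attempt"},
    {"user": "admin", "action": "edit_user"},
    {"user": "admin", "action": "login attempt"},
]

actions_by_user = {}
for e in events:
    # A set naturally deduplicates, like values() does.
    actions_by_user.setdefault(e["user"], set()).add(e["action"])
```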

STATS Results

STATS can provide a strong overview of the activity within your environment. While STATS is a fairly simple command, it can provide huge insights into your data. When paired with other commands like iplocation or lookup, you can enrich your data to find anomalies such as interactions from certain countries or blacklisted IP addresses.


Splunk Search Command of the Week: mvexpand

 

Data comes in all different formats. On more than a few occasions in Splunk, I've worked with data that contains fields with multiple values. A common example is port numbers on a network. It might look something like this:

Figure 1 – Example: port numbers on a network

When searching across data like this in Splunk, you may not want to find every port value; you may just want to find all information pertaining to "Cal05". That's where the Splunk search command mvexpand comes into play.

To call mvexpand in a search, simply type | mvexpand Ports. This will expand each value of the given field argument into its own event.

mvexpand Use Cases

Let's look at this in real time. My search is index=main sourcetype=random | table Name Status Ports, and my results will look like this:

Figure 2 – Search example without mvexpand

This is great if you need to see all of the data, but what if you only want to see information about port Cal05? We can use mvexpand to separate the values and narrow our results.


mvexpand Results

Let's start by using just the search command mvexpand. Here is our search: index="aodtest" sourcetype="csv" | makemv delim="," Ports | mvexpand Ports | table Name Status Ports


Figure 3 – mvexpand example

Great! Now every port has its own event. Next, we can narrow our results to a specific value. Let's search: index="aodtest" sourcetype="csv" | makemv delim="," Ports | mvexpand Ports | search Ports="Cal05" | table Name Status Ports
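The makemv / mvexpand / search pipeline boils down to split, flatten, filter, as in this Python sketch (the rows are made up):

```python
# Sketch of the makemv -> mvexpand -> search pipeline over invented rows.
rows = [{"Name": "server1", "Status": "Up", "Ports": "Cal01,Cal05"}]

expanded = []
for row in rows:
    # makemv delim=",": split the field; mvexpand: one event per value.
    for port in row["Ports"].split(","):
        expanded.append({**row, "Ports": port.strip()})

# search Ports="Cal05": keep only the events with the value we care about.
cal05 = [r for r in expanded if r["Ports"] == "Cal05"]
```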

Figure 4 – mvexpand specified example

As you can see from the results above, we were able to take all our server information and pull out a specific port and find the information related to that value. Problem solved! Your data is consolidated and more user-friendly thanks to the Splunk search command, mvexpand.

Please note: if you run into a situation where your data has multiple multivalue fields, you will need to use additional commands like mvzip and makemv.


Splunk Search Command of the Week: Transaction

 

Have you ever needed to see how long a server has been down? Or maybe find the duration of processing calls? Instead of trudging through a bunch of complicated eval statements or subtracting different time intervals, Splunk has made it simple with an all-in-one Splunk search command: Transaction.

What is the Transaction command in Splunk?

The transaction command allows Splunk users to correlate related events, based on different constraints, into single transactional events. Transactions usually include information such as the duration between events and the number of events (eventcount).

How to Use Transaction

Using the transaction command is a lot simpler than it might seem. To use it in a Splunk search, just follow this format:

|transaction

And that’s it. That’s the only requirement for using this command. However, to get the most accurate results, it would be best to add a few more items to the line:

| transaction <field> maxevents=<#> startswith="<value>" endswith="<value>"

This is a solid foundation for most use cases. Let's break it down:

<field> – a field that correlates between the events; something to match events on

maxevents – the maximum number of events in each transaction

startswith – events containing this term will start off the transaction event

endswith – events containing this term will close off the transaction event

Splunk Transaction Example

Let’s look at an example. I have a list of different servers that generate a status event and a timestamp:

Figure 1 – Servers that generate a status event and a timestamp

Now, what I want to do is create transactions between these events to find the duration in which a server was down. To do this, I’ll want to write a line similar to this:

| transaction server maxevents=2 startswith="Down" endswith="Up"

Transaction Results

Here's what each part of that transaction search is doing…

server – the field we match on

maxevents=2 – we ONLY want to see a single Down and Up event pair

startswith="Down" – the Down event starts the transaction so we can find the duration a server has been down

endswith="Up" – this closes off the transaction, indicating the server is back up
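The pairing logic behind this transaction can be sketched in Python (made-up, time-sorted events; not Splunk's implementation): remember when each server went Down, then close the transaction at that server's next Up and compute the duration:

```python
# Sketch of "| transaction server maxevents=2 startswith=Down endswith=Up".
events = [
    {"server": "srv1", "status": "Down", "_time": 100},
    {"server": "srv1", "status": "Up",   "_time": 460},
    {"server": "srv2", "status": "Down", "_time": 200},
    {"server": "srv2", "status": "Up",   "_time": 230},
]

open_down = {}   # server -> time it went down (startswith="Down")
transactions = []
for e in events:
    if e["status"] == "Down":
        open_down[e["server"]] = e["_time"]
    elif e["status"] == "Up" and e["server"] in open_down:  # endswith="Up"
        start = open_down.pop(e["server"])
        transactions.append({
            "server": e["server"],
            "duration": e["_time"] - start,  # seconds the server was down
            "eventcount": 2,                 # one Down + one Up (maxevents=2)
        })
```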

If you found this helpful… 


Cue Atlas Assessment: a customized report to show you where your Splunk environment is excelling and opportunities for improvement. Once you download the app, you’ll get your report in just 30 minutes.
