What is the Sort Command in Splunk?
The Splunk SPL sort command controls the order of search results. Imagine you have a spreadsheet of data and you want to control the order of its rows: that’s the sort command in Splunk. Going beyond a simple single-column spreadsheet sort, Splunk’s sort lets you order by multiple fields, ascending or descending, and combinations of both. In this post, we’ll explore sort, look at some of the greatness of the command, and provide caveats on when to use it to avoid surprises. In addition, we’ll go through the reverse command.
Example of the Sort Command:
| windbag | dedup lang | sort 75 - lang, +sample | table lang, sample, _raw
What did we do there?
- We made a data cube (the windbag and dedup segments).
- We limited the results to 75 with the optional argument in sort.
- We then set a default direction of descending (overriding the default of ascending).
- We sorted by the field lang.
- We then added a secondary field to sort with, but we want that one in ascending order.
Let’s break down the elements of the sort command in deeper detail before looking at some more interesting use cases.
Elements of the Sort Command:
[Your Data Cube] | sort <limit> <optional direction> field1, <field2>, <field3>,…
The Data Cube: Sort needs all the data in the search, so it runs on the search head after all data returns. Run all your distributable streaming commands (eval, rex, spath, fields, etc.) before running any centralized or dataset processing commands. Why? Because distributable streaming commands run on the indexers, and a single search has only one search head but may have dozens* of indexers.
* Large environments often have hundreds and even thousands of indexers.
Of course, make those early data cubes as tidy as possible. If you only need specific results, use filters in your initial search to make that first result as small as possible. Clean data sets are where speed lives.
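As a rough sketch of a tidy data cube, the search below filters in the base search and trims the field list before sort ever sees the results. The index name web is hypothetical, and the host and status fields assume web access data with those fields already extracted; substitute your own names.

```spl
index=web sourcetype=access_combined status>=500
| fields _time, host, status
| sort 100 host, -_time
| table _time, host, status
```

The filtering and the fields command can run on the indexers, so the search head only has to sort the rows and fields that survive.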
The pipe character ( | ) separates the different processes in a search. Splunk works by having each command operate on the result set handed to it by the command before it. Determine where the ordering needs to happen in the search and place the sort at that location, preceded by a pipe.
The data cube can be simple, like our earlier example of | windbag, or dozens of lines long with lots of preprocessing, for example when we pull in Active Directory (AD) results and need to modify the fields for our uses.
Limit: The sort command has a limit to control the number of results returned. Sometimes you only need the first twenty results, and other times you need hundreds of thousands. If you don’t set the limit, sort returns only the first 10,000 results. To return all results, set the limit to 0, which removes the cap.
Caveat: Make sure to keep uncapped sorts near the end of your searches to avoid inefficiency (i.e., it will be slow, and you won’t like it).
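To see the limit in action, the three searches below (each meant to be run on its own) differ only in the limit argument. This is a sketch that reuses the windbag generator and the lang and sample fields from the earlier example.

```spl
| windbag | sort 20 lang | table lang, sample

| windbag | sort lang | table lang, sample

| windbag | sort 0 lang | table lang, sample
```

The first keeps at most 20 results, the second falls back to the 10,000-result default, and the third removes the cap. On a data set with fewer than 10,000 results, the last two should return the same rows; the difference only shows up on larger result sets.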
Direction: A basic sort is ascending or descending on one field, yet sort can do more. Unlike in most Splunk commands, the space in our earlier sort example does more than aid readability.
If we put a space between the + or - and the first field, we set the default order for all the fields that follow. In Example 1, all three fields are sorted in descending order, beginning with FieldAlpha. Then, if multiple entries share the same FieldAlpha value, those entries are sorted by FieldBeta in descending order.
Example 1: …|sort - FieldAlpha, FieldBeta, FieldCharlie
If there is no space between the initial + or - and the field name, then our choice only affects the field that immediately follows. In Example 2, FieldAlpha is sorted descending, but FieldBeta and FieldCharlie are ascending.
Example 2: …|sort -FieldAlpha, FieldBeta, FieldCharlie
What was the difference between the two examples? The space between the direction symbol and the field. Mind blown.
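One way to check the spacing behavior on your own Splunk version is a small generated data set. The sketch below is a test harness we are assuming for illustration, not part of the examples above: makeresults creates six rows, streamstats numbers them in a field called n, and eval derives a two-value field called grp (both field names are made up).

```spl
| makeresults count=6
| streamstats count AS n
| eval grp = n % 2
| sort - grp, n
| table grp, n
```

Run it once as written, then again with the space removed (sort -grp, n), and compare the order of n within each grp value.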
Pairing Sort with Other Commands:
Dedup: One of the caveats of SPL sort is that its limit doesn’t support ‘by’ clauses. Hence, we can restrict the output to fifty results, but not to two results per value of a particular field. One way to get a limited number of results per value is to use the dedup command (see our post on dedup: https://kinneygroup.com/blog/splunk-dedup-command/). Use dedup before the sort command, so the data cube is as small as possible when sort runs.
Example 3: | windbag | fields lang, sample | dedup sample | sort - lang, sample | table lang sample
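If you do want a fixed number of results per value, dedup accepts an optional count before its field list. A minimal sketch, reusing the windbag fields from Example 3 and keeping up to two rows per lang value before sorting:

```spl
| windbag
| fields lang, sample
| dedup 2 lang
| sort -lang, +sample
| table lang sample
```

The explicit -lang and +sample prefixes spell out each field's direction, so the result does not depend on the spacing rule described earlier.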
StreamStats: Another caveat of sort is that it only sorts on fields. If we use that same windbag command and table the fields lang and sample, then our only options for sort order are lexicographical. But what if we want to sort by something that isn’t a field, like the order in which the data arrives? We can use streamstats to number the results as they arrive, then flip that order with sort.
Example 4: | windbag | fields lang, sample | streamstats count | sort -count | table lang sample
The streamstats method from Example 4 also has a shorthand: the reverse command.
The reverse command takes the order of a data cube and flips it. If the results were in time sequence, the cube flips so that the older events come first. reverse is a simple command with no options.
The reverse command is a dataset processing command, meaning it has to have all the records returned before operating.
Example 5: | windbag | fields lang, sample | reverse | table lang sample
In this post, we saw how to control the order in our data cubes using the sort command. We also explored limiting the number of returned results using sort, as well as special cases combining sort with dedup and streamstats. Finally, we discussed where to place sort (as late in the search as possible) so it doesn’t hurt performance. Good luck and good sorting.