How To Use the Splunk dedup Command (+ Examples)


What is the Splunk dedup Command? 

The Splunk dedup command, short for “deduplication”, is an SPL command that eliminates duplicate values in fields, thereby reducing the number of events returned from a search. Typical uses of dedup produce a single event for each host or a pair of events for each sourcetype.


How the dedup Command Works 

Dedup has a pair of modes. We’ll focus on the standard mode, which is a streaming search command (it operates on each event as a search returns the event).  

The first thing to note is that the dedup command returns events, in contrast to the stats commands, which return counts about the data. Outputting events is useful when you want to see the results of several fields or the raw data, but only a limited number of events for each specified field.

When run as a historical search (i.e., against past data), the most recent events are searched first. If the dedup runs in real time, the first events received are searched, which does not guarantee that they are the most recent (data doesn’t always arrive in a tidy order).

Splunk dedup Command Example

Let’s run through an example scenario and explore options and alternatives. I will use the windbag command for these examples since it creates a usable dataset (windbag exists to test UTF-8 in Splunk, but I’ve also found it helpful in debugging data). 

Step 1: The Initial Data Cube 

| windbag 

Result: 100 events. Twenty-five unique values for the field lang, with the highest value having eight events.  

Step 2: Using Dedup to reduce events returned 

Now, let’s limit that to 1 event for each of those values in lang.  

| windbag | dedup lang 

Result: 25 events. Lang still has 25 unique values, but there is only one event for each language specified this time. 

We can also reduce by a combination of fields and even create fields before using dedup. 

Step 3: Cast time into a bin, then reduce fields with lang and time bin 

The windbag data is spread out over the search window, which here is the past 24 hours. Taking advantage of this, we can create another usable field by using bin to put _time into 12-hour buckets. Using bin like this is one way to split the data. Since I ran this at 21:45, I wound up with four buckets (who said this was perfect?), with the middle two buckets having forty-two events each.

| windbag | bin span=12h _time | dedup lang, _time 

Result: 64 events. Twenty-five different lang values, with the highest event count at three.

Step 4: Add a random 1 or 2 to the mix, and dedup off of those three fields. 

The above exercise was one way to divide the data up. This time, we’re going to randomly assign (using random and modulo arithmetic) each event a 1 or 2 for the group, and then use that in a dedup along with the span of 12 hours. 

| windbag | eval group = (random() % 2) + 1 | bin span=12h _time | dedup lang, _time, group 

Result: each run changes. It ranged from seventy-five to eighty-six events across the ten runs I tried.

Step 5: What if we want more than one event per field? 

This time we’ll add an integer behind dedup to give us more results per search. 

| windbag | dedup 2 lang 

Result: Each of the twenty-five lang entries returned two events.  

Step 6: How to Use the Data

Great, so we can reduce our count of events. What can we do with this? Anything you can picture in SPL. We may want a table of different fields. Stats counts based upon fields in the data? Why not? 

index=_internal | dedup 100 host | stats count by component | sort - count 

Result: dedup returned 500 events, which stats then counted by component. In case anyone is wondering, ~80 of that data is the component Metrics (apparently, we need to use this cloud stack more).
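
Since tables came up, here is a minimal sketch of that idea (the component field only exists for Splunk’s internal logs, so expect blanks for other sourcetypes):

index=_internal | dedup 3 sourcetype | table _time, host, sourcetype, component

This keeps three events per sourcetype and lays a few fields out side by side.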

Other dedup Command Options and Considerations

There are several options available for dedup that affect how it operates.  

Note: It may be better to use other SPL commands to meet these requirements, and often dedup works with additional SPL commands to create combinations. 

  • consecutive: This argument removes only events with duplicate combinations of values that are consecutive. By default it’s false, but you can probably see how it’s helpful for trimming repeating values (a quick sketch follows the keepevents example below).
  • keepempty: Allows keeping events where one or more of the specified fields have a null value. The problem this solves may be easier to rectify using fillnull, filldown, or autoregress.
  • keepevents: Keep all events, but remove the selected fields from every event after the first event containing that particular combination.

This option is weird enough to try:  

| windbag | eval group = (random() % 2) + 1 | dedup keepevents=true lang, group 

Then add lang and group to the selected fields. Note how each event has lang and group fields listed under it. Now flip to the last pages of results: the lang and group fields are no longer present for those events. Bonus points if you can tell me why this option exists.
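
Circling back to the consecutive and keepempty options above, here are two quick sketches. The maybe_empty field is made up purely to give keepempty something to chew on.

| windbag | sort lang | dedup consecutive=true lang

With the events sorted, consecutive=true only drops back-to-back repeats of the same lang; on unsorted data, a value that reappears later in the results is kept.

| windbag | eval maybe_empty=if(random() % 2 == 0, "set", null()) | dedup keepempty=true lang, maybe_empty

Here keepempty=true retains the events where maybe_empty is null instead of discarding them.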

  • sortby: A series of sort options exists, which is excellent if your dedup takes place at the end of the search. All options support +/- (ascending or descending). The possible options are field, auto (let dedup figure it out), ip (interpret values as IP addresses), num (numeric order), and str (lexicographical order).
| windbag | bin span=12h _time | dedup lang, _time sortby -lang 

This command sorts descending by language. What is nice is that we don’t have to pass the results to the sort command, which would create an additional intermediate search table.

  • Multivalue Fields: dedup works against multivalue fields. All values of the field must match for two events to be considered duplicates.

  • Alternative Commands: The uniq command works on small datasets to remove any search result that is an exact duplicate of the previous event. The docs for dedup also suggest not running it on _raw, as comparing such a large field requires many calculations to determine whether an event is a duplicate.
  • MLTK Sample Command: The sample command that ships with the Machine Learning Toolkit does a great job of dividing data into samples. If my goal is to separate data and MLTK exists on the box, then the sample command is preferred.
  • Stats Commands: The stats command, and its many derivatives, are faster if your goal is to return uniqueness for a few fields. For example, | windbag | bin span=12h _time | stats max(_time) as timebucket by lang returns the max value of _time for each lang, similar to running dedup after a sort. A comparison sketch follows this list.
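
Because a historical search returns the most recent events first, dedup keeps the latest event for each lang, so this rough equivalent (a sketch) returns one row per language with its newest 12-hour bucket:

| windbag | bin span=12h _time | dedup lang | table lang, _time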

If you found this helpful…

You don’t have to master Splunk by yourself in order to get the most value out of it. Small, day-to-day optimizations of your environment can make all the difference in how you understand and use the data in your Splunk environment to manage all the work on your plate.

Cue Atlas Assessment: a customized report to show you where your Splunk environment is excelling and opportunities for improvement. Once you download the app below, you’ll get your report in just 30 minutes.


Splunk spath Command: How to Extract Structured XML and JSON from Event Data


Your dilemma: You have XML or JSON data indexed in Splunk as standard event-type data.

Sure, you’d prefer to have brought it in as an indexed extraction, but other people onboarded the data before you got to it, and you need to make your dashboards work.

How do you handle this data in Splunk and make it searchable? We could write regular expressions and hope the shape of the data never changes, or we can use the easy button: the spath command.
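
As a quick preview of where that post goes, the general shape looks something like this (a sketch; the index, sourcetype, and JSON path are all made up, so swap in your own):

index=web sourcetype=app_json | spath input=_raw path=response.status output=status | stats count by status

spath walks the JSON (or XML) structure in _raw and pulls the value at the path you name into a normal search-time field.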



What Size Splunk License Do You Need? Here’s How to Estimate It.


What is a Splunk License?

A Splunk license is a file that houses information about your license entitlement. It tells you what your abilities and limitations are within the license, including the amount of data you can index per day.


Types of Splunk Licenses

There are four types of Splunk licenses. Here’s a quick breakdown of each one:

Free Splunk License: Splunk’s free license is a limited version of Splunk Enterprise intended for personal use. It lets Splunk users index data in small volumes of 500MB or less per day and run searches against all public indexes.

Enterprise Splunk License: The Enterprise license gives you access to all of the Splunk Enterprise features, including machine learning and AI, data streaming, and scalable indexing. You can also add users and roles.

Dev/Test or Beta License: If you intend to use a Splunk Beta release, you’ll need a different license for it. Free and Enterprise licenses won’t work.

Forwarder License: This Splunk license forwards unlimited amounts of data and enables security with a login for each user. This type of license is included in the Splunk Enterprise license.

How big of a Splunk license do I need?

Estimating the Splunk data volume within an environment is not an easy task due to several factors: number of devices, logging level set on devices, data types collected per device, user levels on devices, load volumes on devices, volatility of all data sources, not knowing what the end logging level will be, not knowing which events can be discarded, and many more.

As you begin the process of planning and implementing the Splunk environment, understand that the license size can be increased and the Splunk environment can be expanded quickly and easily if Splunk best practices are followed.

Here is a Kinney Group tested and approved, 7-step process on how to determine what size Splunk license is needed:

  1. Identify and prioritize the data types within the environment.
  2. Install the free license version of Splunk.
  3. Take the highest priority data type and start ingesting its data into Splunk, making sure to start adding servers/devices slowly so the data volume does not exceed the license.  If data volumes are too high, pick a couple of servers/devices from the different types, areas, or locations to get a good representation of the servers/devices.
  4. Review the data to ensure that the correct data is coming in. If there is unnecessary data being ingested, that data can be dropped to further optimize the Splunk implementation.
  5. Make any needed adjustments to the Splunk configurations, and then watch the data volume over the next week to see the high, low, and average size of the data per server/device (see the example search after this list).
  6. Take these numbers and calculate them against the total number of servers/devices to find the total data volume for this data type.
  7. Repeat this process for the other data types on your list until you have covered them all.
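
For steps 5 and 6, Splunk’s own license usage log is a handy way to watch daily volume. This sketch assumes you can search the _internal index on the license manager; swap st for h to split by host instead of sourcetype:

index=_internal source=*license_usage.log type=Usage | eval GB=b/1024/1024/1024 | timechart span=1d sum(GB) by st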

How much does a Splunk License cost?

An Enterprise Splunk license starts at $65 per host, per month, billed annually. The majority of the cost of Splunk depends on the amount of data you ingest per day, which, according to TechTarget, can start at $1,800 per GB. Splunk Enterprise is customized to your organization’s needs, so you’ll need to speak to Splunk directly for 100% accurate pricing.

If you found this helpful… 

You don’t have to master Splunk by yourself in order to get the most value out of it. Small, day-to-day optimizations of your environment can make all the difference in how you understand and use the data in your Splunk environment to manage all the work on your plate.

Cue Atlas Assessment: a customized report to show you where your Splunk environment is excelling and opportunities for improvement. Once you download the app, you’ll get your report in just 30 minutes.


Splunk MLTK: What It Is And How It Works

What if there was a tool you could use to automate the time-consuming and nearly impossible parts of your job as a Splunk administrator? There is, and it’s called the Splunk Machine Learning Tool Kit (MLTK). It can run predictive analytics, identify patterns in your data, and even detect anomalies.

In this post, we cover how MLTK works and how you can use the power of artificial intelligence to work more efficiently.

What is the Splunk Machine Learning Tool Kit?

The Splunk Machine Learning Tool Kit (MLTK) is an app that lets Splunk creators deploy SPL commands and custom visualizations that explore and analyze data using machine learning technology.

MLTK is available for both Splunk Enterprise and Splunk Cloud Platform on Splunkbase.

There are three main features of the Splunk MLTK app:

  • Anomaly Detection: By analyzing your past data, Splunk’s machine learning tool can automatically detect abnormalities within your current and future data (see the sketch after this list).
  • Predictive Analytics: Predicting events and transactions is made simple with MLTK so you can make informed decisions in real time.
  • Data Clustering: Clustering data into groups allows MLTK to identify patterns in your data that humans might miss.
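
To make the anomaly detection bullet concrete, here is a minimal sketch using MLTK’s fit and apply commands with its DensityFunction algorithm. The index, sourcetype, and field names are invented, and MLTK must be installed for fit and apply to exist.

index=web sourcetype=access_combined | bin span=10m _time | stats count as requests by _time | fit DensityFunction requests into request_rate_model

A later search can then score fresh data against the saved model and keep only the unusual time buckets:

index=web sourcetype=access_combined earliest=-4h | bin span=10m _time | stats count as requests by _time | apply request_rate_model | where 'IsOutlier(requests)' = 1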

How does the Splunk Machine Learning Tool Kit Work?

In order to work efficiently, the Splunk MLTK app must learn information and then provide that knowledge to the end user. Although the process for how MLTK works is not cut and dried, it can be generally outlined like this:

Step 1: The MLTK collects data

Step 2: The MLTK transforms the data into actionable intelligence

Step 3: The MLTK explores and visualizes that data in the proper context

Step 4: The MLTK models the data

Step 5: The MLTK evaluates the data

Step 6: The MLTK deploys the data

The great thing about the MLTK is that you’re not alone when using it. The Assistants are guided workflows within the MLTK that walk you through the steps and features you’ll need when preparing, building, validating, and deploying models.

Machine Learning & Data Science

The gist of machine learning is to provide systems with the ability to learn. That is, we give the system algorithms to start with, and it can adapt based upon the data, make classifications, and make decisions with little to no human intervention.

The Splunk Machine Learning Toolkit

The MLTK is a Splunk app (free, by the way) that helps you create, validate, manage, and, most importantly, operationalize machine learning models. The MLTK includes a variety of algorithms, including several hundred from the Python for Scientific Computing library, which give you the power to try different algorithms to find the right insights for your data.
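
The usual pattern is to fit a model on historical data and then apply it to new data. A minimal sketch, with made-up lookup files and field names:

| inputlookup server_metrics.csv | fit LinearRegression response_time from cpu_load memory_used into response_time_model

| inputlookup todays_metrics.csv | apply response_time_model

The apply command adds a predicted(response_time) field that you can compare against the real value.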

Two Example Scenarios
  • Resource Management: Predict when we’ll need more capacity
  • Systems Breaking: Identify the items that indicate forthcoming system failures

Looking Forward with Splunk MLTK

We are in a new day and age of IT Operations, where many manual processes can start to be automated with the help of these tools. Putting the power of Splunk’s MLTK into the hands of your IT Operations personnel can empower them to begin a transition to a more automated approach to their everyday work, such as investigating and troubleshooting a problem before you even see its effects. This approach is not mainstream, and may be daunting to some, but now is the time to get a grasp on the next generation of IT Operations.

Want to know what Splunk MLTK can do for you and your organization? You can get access to Kinney Group’s deep bench of Splunk experts, on demand. Check out our Expertise on Demand for Splunk service offering for more information on our various packages, and let us know how we can help unleash the power of Splunk.

Visit www.kinneygroup.com/contact-us or call us at (317) 721-0500.


6 Reasons INGEST_EVAL Can Help (Or Hurt) Your Splunk Environment

Note: This post was written as a review of a session offered at Splunk .conf20. The information was current at the time of writing (Splunk 8.1) and still holds value today with Splunk 9.x.

As a Splunker, you’re constantly faced with questions about what can help or hurt you in Splunk. And you may be asking yourself this question:

“Should I use INGEST_EVAL?”

The answer to this is a solid… maybe.

At a recent Splunk .conf, Richard Morgan and Vladimir Skoryk presented a fantastic session on different capabilities for INGEST_EVAL. When you get a chance, take a look at their presentation recording! In this review, we’ll go through Richard and Vladimir’s presentation and discuss inspiration derived from it. These guys know what they’re talking about; now I’m giving my two cents.

Background

Splunk added the ability to perform index-time eval-style extractions in the 7.2 release. It was in the release notes, but otherwise wasn’t much discussed. It generated more buzz in the 8.1 release, as these index-time eval-style extractions (say that three times quickly) now support the long-awaited index-time lookups.

The purpose of INGEST_EVAL is to allow eval logic on indexed fields. Traditionally in Splunk, we’d hold off on transformations until search time; old-timers may remember Splunk branding using the term “Schema on the Fly.” Waiting for search time is in our DNA. Yet, perhaps the ingest-time adjustments are worth investing in.

Let’s look through the key takeaways on what ingest-time eval provides. Then you can decide whether it’s worth the hassle of doing the prep work to take advantage of it.

1. Selective Routing

Before you try to yank my Splunk certs away: yes, we already have a version of this capability. This is slightly different from the typical method, which can send data to separate indexers, say Splunk internal logs going to a management indexer instead of the common-use one, or security logs to a parent organization’s Splunk instance.

The INGEST_EVAL version allows for selective routing based on whatever you can come up with to use in an eval statement. The example from the presentation uses the match function with a regular expression to send data from select hosts to different indexers. Ideally, this would happen on a heavy forwarder, or any other Splunk Enterprise box, before the data reaches the indexers. Perhaps those security logs are staying on-prem, and the rest of the logs go to Splunk Cloud.

What else could we come up with for this? If data contains a particular string, we can route it to different indexes or indexers. We already have that with transforms. But transforms are reliant upon regex, whereas this could use eval functions. Move high-value transactions off to a separate set of indexers? If a list of special codewords appears, send the event to a different indexer?

Let your imagination run on this, and you’ll find lots of possibilities.
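
As a rough illustration of the idea, here is a hypothetical sketch (not the presenters’ exact configuration): the stanza names, the codeword, and the index names are all made up.

# props.conf
[my_sourcetype]
TRANSFORMS-selective_route = route_by_content

# transforms.conf
[route_by_content]
# Send events containing a special codeword to a restricted index;
# everything else keeps whatever index it was already headed for.
INGEST_EVAL = index=if(match(_raw, "codeword"), "restricted", index)

Routing to a different set of indexers, as in the talk’s host-based example, works along the same lines but assigns _TCP_ROUTING to an output group defined in outputs.conf; treat that as my reading of the presentation rather than a recipe.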

2. Ingest log files with multiple timestamp formats

In the past, we had to dive into the depths of a custom datetime.xml (referenced by DATETIME_CONFIG) and roll our own solution. INGEST_EVAL, along with if/case statements, can handle multiple timestamp formats in the same log. Brilliant. If you have ever had to deal with logs that have multiple timestamp formats (and the owners of those logs who won’t fix their rotten logs), then you’ll be thrilled to see an easy solution.

INGEST_EVAL can look at the data and search for different formats until it finds a match.
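
The shape of it, as a sketch with two invented formats (the stanza would be wired up through a props.conf TRANSFORMS setting, as in the routing example above):

# transforms.conf
[fix_mixed_timestamps]
# Try an ISO-style prefix first, then a day/month/year style, and
# finally fall back to whatever timestamp Splunk already assigned.
INGEST_EVAL = _time=coalesce(strptime(substr(_raw, 1, 19), "%Y-%m-%d %H:%M:%S"), strptime(substr(_raw, 1, 20), "%d/%b/%Y %H:%M:%S"), _time)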

3. Synthesizing dates from raw data mixed with directory names

Sometimes we find data, often IoT or custom syslog data, where the log file only has a timestamp. In these cases, we normally see the syslog server write the file into a directory with a date name. Example: /data/poutinehuntingcyborg/2020-10-31.log 

Using INGEST_EVAL, it’s possible to create a _time value that uses part of the source and part of the raw data to build a timestamp that matches what Splunk expects. A lovely solution to something that wasn’t so easy otherwise.

This simple trick could replace having to use ETL. 
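
For the example path above, the eval might look roughly like this. It’s a sketch that assumes the raw event starts with an HH:MM:SS clock time:

# transforms.conf
[time_from_source_and_raw]
# Pull the date out of the file name, glue it to the clock time at the
# start of the event, and parse the combination into _time.
INGEST_EVAL = _time=strptime(replace(source, ".*/(\d{4}-\d{2}-\d{2})\.log$", "\1")." ".substr(_raw, 1, 8), "%Y-%m-%d %H:%M:%S")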


4. Event Sampling

Using eval’s random function and an if/case statement, it is possible to send along only a percentage of events. Combine it with other eval elements, such as sending on only one in ten login errors or one in one-thousand successful purchases.

By combining multiple eval statements, you could create a sample data set that includes data from multiple countries, different products, and different results. 
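
A sketch of the one-in-ten idea, assuming the queue key can be assigned from INGEST_EVAL the way the presenters describe (nullQueue discards the event):

# transforms.conf
[sample_one_in_ten]
# Keep roughly 10% of events and silently drop the rest.
INGEST_EVAL = queue=if(random() % 10 == 0, "indexQueue", "nullQueue")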

5. Event Sampling combined with Selective Routing

 Whoa. Sample the data, and then send the sample to test, dev, or over to your machine learning environment. This is big.

6. Dropping unwanted data from structured data

Using INGEST_EVAL, we can drop fields that we otherwise don’t need. With indexed extractions for csv and json, each column or element becomes a field. Sometimes we don’t want those extra fields.

Let’s look at an example: an excel spreadsheet exported as csv, where a user has been adding notes that are unneeded in Splunk.

In standard Splunk ingest, those notes become fields in Splunk and we have to use SPL to remove them from our searches. How often does a csv dump contain hundreds of fields, but we only care about four? (Answer: often).

Using INGEST_EVAL, we can onboard only the columns or elements that we want, and the rest poof away. Not only does this save disk space, but it makes for cleaner searching.

Splunk Pro Tip: This type of work can be a considerable resource expense when executing it in-house. The experts at Kinney Group have several years of experience architecting, creating, and solving in Splunk. With Expertise on Demand, you’ll have access to some of the best and brightest minds to walk you through simple and tough problems as they come up.


My Final Thoughts

Back to our question… “Should I use INGEST_EVAL?” Again, it’s a solid maybe.

If you need to keep licensing down by only ingesting what you need, then sure. If you need to modify data beyond what sed or a regex can perform, then give it a try. INGEST_EVAL isn’t for every Splunk admin, but not every admin hunts down blogs like this.


Michael Simko’s Top Five Recommended Sessions at Splunk’s .conf20


One of my favorite times of the fall is the annual Splunk user conference. The pandemic has thrown lots of conferences into disarray. The Las Vegas .conf may be off, but virtual .conf is on — and is free. And yes, free as in free, not free like someone tried to give you a dog.

The virtual conference is 20-21 October for AMER, and 21-22 for EMEA and APAC. 

Here are the top five sessions at Splunk .conf20 that I recommend my customers, colleagues, and students attend. There are many more interesting sessions across the Splunk product line and beyond (temperature scanning crowds to find the infected?). 

 

1) PLA1454C – Splunk Connect for Syslog: Extending the Platform 

Splunk Connect for Syslog is an outstanding system for onboarding syslog data into Splunk. Traditionally, Splunk uses a third-party syslog server to write data to disk, and then a Universal Forwarder to read that data and send it to Splunk. This has worked well but requires building the syslog server and understanding enough of the syslog rules to configure the data correctly.

Enter Splunk Connect for Syslog, which handles the syslog configuration, sends the data to Splunk, and for many known sourcetypes makes the onboarding process a snap. 

 

What I like best: This came from engineers looking at a problem and making things better.

 

2) PLA1154C – Advanced pipeline configurations with INGEST_EVAL and CLONE_SOURCETYPE

Eval is a powerful way to create, modify, and mask data within Splunk. Traditionally it is performed at search time. This session shows methods for using INGEST_EVAL to perform eval logic as the data is being onboarded. This helps with event enrichment, removing unwanted fields, event sampling, and many more uses.

 

What I like best: INGEST_EVAL opens a world of more control in Core Splunk.

 


 

3) SEC1392C – Simulated Adversary Techniques Datasets for Splunk

The Splunk Security Research Team has developed test data for simulating attacks and testing defenses in Splunk. In this session, they share this data and explain how to use it to improve attack detection.

 

What I like best: Great test data is hard to come by, much less security test data.

 

4) PLA1129A – What’s new in Splunk Cloud & Enterprise

This session shows off the newest additions to Splunk Cloud and Splunk Enterprise. Each year these sessions show the new features that have arrived either in the last year or in new versions that often coincide with Splunk .conf.

What I like best: New toys to play with.

 

5) SEC1391C – Full Speed Ahead with Risk-Based Alerting (RBA)

I’ve talked to several customers who wanted to use a risk-based alerting (RBA) system for their primary defenses. Traditional methods require lots of tuning to avoid flooding the security staff with too many alerts. RBA is a method to aggregate elements together and then present the findings in an easier-to-consume method.

 

What I like best: Another option on how to approach security response.

 

Bonus Sessions: You didn’t think I could really stop at five, did you?

TRU1537C – Hardened Splunk: A Crash Course in Making Splunk Environments More Secure

TRU1276C – Splunk Dashboard Journey: Past Present and Future

TRU1761C – Master joining your datasets without using join. How to build amazing reports across multiple datasets without sacrificing performance

TRU1143C – Splunk > Clara-fication: Job Inspector

 

Join us!

Our KGI team will be on board for .conf20, and we’re more excited than ever to attend with you. With over 200 virtual sessions at Splunk’s .conf20 event, this year is going to be BIG. With exciting updates to Splunk and a grand reveal of new product features, Kinney Group is ready to help Splunkers along the way.

Keep your ears perked for some big, Splunk-related announcements coming your way from Team KGI this month…


The Needle in the Haystack: Missing Data in Splunk

Splunk has wonderful charts, graphs, and even d3.js visualizations to impart data in an easily understandable fashion. Often, these graphical representations of the data are what users focus on. Decisions are made and budgets determined due to how the data appears in these visualizations. It’s safe to say, the accuracy of the data that supports these visuals needs to be spot on.

Visualize Your Data

Splunk brings value through its visualization features. However, for the visuals to be meaningful, the data has to be accurate and complete. This highlights a challenge: focusing on visualizations often masks incomplete data. Pie charts appear to have all the data, representing it as a “full” circle, even if we are missing data in Splunk. However, that pie chart of external connections to our network is inaccurate if it’s missing one of our firewalls. Likewise, a security control for “3 failures within 60 minutes per user” is compromised when a third of the data isn’t arriving in Splunk. Let’s take a look at some steps to find that missing data…

Figure 1 – Pie chart missing data in Splunk

Find Your Missing Data

Step 1: Create a list of all the data coming into Splunk. Using an account that can search all the indexes, run the following:

| metadata type=sourcetypes index=* | fields sourcetype, totalCount | sort - totalCount

 

 

Figure 2 – Metadata in Splunk

Step 2: Export the table that resulted from the previous step. (Good thing there’s an export button in the Splunk UI!)

Step 3: Send the results to your major teams and ask them, “What’s missing from this list?” When you’re thinking about teams to send this to, think Networking Team, Security Team, Windows Operations, Unix Operations, Applications Teams, etc.

 


 

Step 4: Gather a list of which systems and system types are missing and investigate. Is this data that you can onboard?
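
To help with Step 4, a related spot-check (a sketch; adjust the index list and the time window to taste) uses the metadata command’s recentTime field to list hosts that have gone quiet in the last day:

| metadata type=hosts index=* | where recentTime < relative_time(now(), "-24h") | eval lastSeen=strftime(recentTime, "%F %T") | table host, lastSeen, totalCount | sort + lastSeen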

Example: Networking looks at your list of sources and realizes it is missing the Juniper VPN. The Networking team sends the FW logs to a syslog server while the Splunk team loads the configs that will handle parsing and search.

Figure 3 – Pie chart showing all sources in Splunk

There’s Your Needle

Collecting and maintaining the correct data sets can be a difficult task. From collaborating with many teams to finding the needle in the haystack of missing data, you’ve got your work cut out for you.

At Kinney Group, we’ve spent years finding the proverbial Splunk needle amongst a ton of data. Ensuring that you are ingesting the right data in the right way is one of our Splunk specialties. If you have trouble finding missing data or spinning up the right Splunk visualizations, we’re here to help!


Leverage Splunk at Scale, Fast with Pure Storage

We live in the age of the operational intelligence platform. This technology is undeniably valuable for organizations because it makes machine data accessible, usable, and valuable to everyone. By harnessing platforms like Splunk, organizations can tear down departmental silos enterprise-wide and can creatively ask questions of all their data to maximize opportunities. To make this a reality, hardware and infrastructure is a critical consideration.

Traditional storage approaches for Splunk at scale are under siege. One of the greatest risks to enterprise-wide adoption of Splunk is inadequate, under-sized, or non-performant hardware. Organizations frequently want to repurpose existing and aging hardware, leaving Splunk customers dissatisfied with the implementation, and possibly the entire platform.

 


 

Good news. A superior hardware approach is here: running Splunk on Pure Storage FlashStack.


Imagine an all-flash reference architecture that enables true harnessing of a Splunk deployment (technically, and for the bottom-line):

  • Smarter Computing. 5x greater efficiency at the compute layer
  • Operationally Efficient. 10x greater efficiency in rackspace, power, heating, and cooling compared to an equivalent disk-based solution
  • Uniquely Virtualized. On a 100% virtualized environment with native Pure + VMware High Availability features
  • Smarter Spending. Higher ROI on hardware, so the same budget can be reallocated to further harnessing the power of Splunk at scale

The result is true competitive advantage as an organization achieves improved simplicity and performance while lowering the Total Cost of Ownership (TCO) of enterprise Splunk deployments. Splunk on Pure Storage FlashStack empowers organizations to manage large Splunk instances as they journey toward the analytics-driven, software-defined enterprise.


Two Analytics Platforms Synergize for Holistic Application Monitoring

Pairing Splunk and AppDynamics

We now live in the era of the “software-defined enterprise”. Software applications are the key enablers for commercial businesses and public sector organizations. Applications are no longer just enablers for back-office processes; today, they are the “face of the organization” to customers, partners, and internal co-workers.

The era when customers would tolerate application failures being fixed in hours, days, or weeks is long gone. Today’s constituencies expect applications to be “always on”, and problems identified and resolved in minutes (if not quicker).

The ability to leverage analytics to support critical applications within the software-defined enterprise will define the winners and losers in the market. The power of IT operations analytics holds promise as the enabler for dramatically reducing Mean Time to Repair metrics for critical applications, regardless of where a problem exists.

The paragraphs that follow provide insights into a proven approach for leveraging the power of analytics to identify and solve application problems quickly and to win in the market as a software-defined enterprise.

A One-Two Approach: Winning Against Problematic Application Stacks

Pinpointing problems with large, distributed, and often legacy application stacks is difficult. Troubleshooting and identifying the underlying cause of internal and external customer-facing problems can often take weeks or months. The result for organizations unable to solve application problems is negative. End-user satisfaction goes down, and precious customers can be lost forever. Organizations feel the pressure of hectic customer support war rooms, missed goals, and upset leadership and investors. Time is money; inefficiency and downtime for mission-critical systems mean lost revenue and angry customers.

But, there is hope. It’s a new day in analytics, and several solutions have entered the market recently that attempt to reduce Mean Time to Identify (MTTI) and Mean Time to Repair (MTTR) metrics for application troubleshooting with varying levels of success.

The bottom line: in order for organizations to get the full picture and achieve holistic application stack monitoring, they need to use Splunk and AppDynamics for a cohesive view of their entire application stack. Splunk can natively see across the application stack to point to an issue. Then, AppDynamics can drill down and see into the proverbial “black box” (as illustrated in Figure 1), which is typically the database layer, the application layer, and the UI/Web layer.

Splunk can see around the “black box”, and AppDynamics can see into it.

Figure 1: Splunk can see around the “black box”, and AppDynamics can see into it.

Where Splunk Ends, AppDynamics Begins, and Vice Versa

Splunk and AppDynamics can artistically be woven together to build a cohesive analytics solution for end-to-end application visibility. Here’s how.

Splunk Pros and Cons

Arguably, the most flexible tool to address application stack monitoring is a platform called Splunk. Entering the market in 2005 initially as a type of “Google” for monitoring, Splunk software quickly evolved into a flexible and scalable platform for solving application problems. It also emerged as a platform with a robust and configurable user interface, touting sleek data visualization capabilities. Those qualities have allowed it to become a standardized platform in application stack monitoring teams. How is Splunk better than the rest? There are two main reasons.

First, Splunk’s ability to correlate disparate log sources allows it to identify and find issues in tiered applications. Applications are commonly written in very different languages. Thus, they have few logging similarities in structure, content, or methodologies. For most traditional monitoring tools, configuring data source setups is labor intensive and needs to be aggressively maintained if the application or its environment changes. On the other hand, Splunk excels at dealing with these differences “on the fly”, as it is able to monitor these disparate log sources in real time as the data is consumed. Splunk’s advantage is that it can provide very flexible, reporting-driven schemas as the data is searched. This is important with legacy applications due to limited standardization, especially in the application layer where most of the business logic and “glue” code resides for an application to work.

 


 

Second, Splunk is easy to use for monitoring around an application, particularly in the networking, infrastructure, and Operating System (OS) layers. It has standard configurations that are fast to implement and that let you start deriving technical and business insights quickly. The areas where Splunk is the straightforward solution in IT Operations Analytics include networking, operating systems, storage, hypervisors, compute, load balancers, and firewalls.

Where does Splunk need assistance? With deep application performance monitoring in complex, highly distributed environments. This is because many mission-critical applications cannot be easily updated, and it is often too labor intensive (or impossible) to use the application logs to derive insights into problems. While legacy approaches to solving these monitoring problems are under siege, their existence is a reality as organizations transform. Splunk’s answer to this issue is in Splunkbase, the community for certified apps and add-ons. The Splunk App for Stream can monitor the ingress and egress communication points between the layers in the application stack: database to application, and application to the UI/Web layer. Still, the Splunk App for Stream is deficient when compared to AppDynamics, because monitoring “around” a problem only describes the downstream impacts; it cannot pinpoint the actual problem quickly.


Figure 2: Pairing Splunk and AppDynamics achieves unparalleled visibility into the entire infrastructure (Splunk) while providing unified monitoring of business transactions to pinpoint issues (AppDynamics).

AppDynamics Pros and Cons

AppDynamics entered the marketplace in 2009 with a simple purpose: be the best for addressing deficiencies in application stack monitoring options, particularly for large, distributed, and often legacy application stack monitoring. They monitor the business transactions, which are the backbone of any application. In doing so, they found a common auditing language that transcends database, application, and UI/ Web layers, including full support for legacy applications, provided the application language is one that AppDynamics supports. You can access a list of languages and system requirements here.

A primary AppDynamics differentiator is its innate ability to understand what “normal” looks like in an environment. The platform automatically accounts for all of the discrete calls that run through an application. Then, it can find bottlenecks by identifying application segments, machines, application calls, and even lines of code that are problematic. Unlike other application performance monitoring (APM) tools, AppDynamics can monitor the application from the end user’s point of view.

Regarding business value, what does AppDynamics bring that Splunk cannot? As the application is updated as part of a normal software development cadence, AppDynamics agents will then autodiscover again, saving time on professional services and money on re-customizing monitoring. Conversely, the Splunk App for Stream can require re-customization as application code and topology is updated.

AppDynamics does need some augmentation from its counterpart, Splunk, in looking outside of an application at the full stack. If the underlying problem is not with the code, but with the functionality of the environment, such as storage, networking, compute, or the operating system, AppDynamics cannot do in-depth problem diagnosis on broader infrastructure components. Instead, the traditional approach is that APM teams use several narrowly focused “point tools” to monitor each layer, which causes silos within teams. To skip the silos, cue Splunk. Its sweet spot is as a “single pane of glass” where it can tie together its own visibility and the visibility provided by AppDynamics to identify where in the massive environment the problem lies.

So, where Splunk ends AppDynamics begins, and vice versa.

Skip the Silos: Splunk and AppDynamics Synergize for a Holistic Approach

Splunk and AppDynamics both interact with the application infrastructure in a way that is straightforward to set up, easy to maintain, and can deliver fast time-to-value. By visualizing the output of these two platforms in Splunk, teams achieve a “single pane of glass” monitoring approach that gives the business a real-time, holistic view into distributed, complex application stacks.

Figure 3: Visualizing the output of these two platforms together in Splunk, teams achieve a “single pane of glass” for applications and the infrastructure.

Pairing together the analytics platform synergies of Splunk and AppDynamics to achieve holistic application stack monitoring for the mission will reduce MTTI and MTTR. The organization will observe reliable, sustainable ROI as applications and the environment evolve with the inevitable business transformation. Leveraging machine data in real-time is the cutting edge in analytics and empowers organizations to creatively scrutinize all their data in an automated, continuous, and contextual way to maximize insights and opportunities.

About Kinney Group

Kinney Group is a cloud solutions integrator harnessing the power of IT in the cloud to improve lives. Automation is in Kinney Group’s DNA, enabling the company to integrate the most advanced security, analytics, and infrastructure technologies. We deliver an optimized solution powering IT-driven business processes in the cloud for federal agencies and Fortune 1000 companies.


Pure Storage: the Full Promise of Flash Storage

Buying storage is like watching car ads on television. Oh, a celebrity likes that model, that one has models telling us we call sports by the wrong names, and that one is so testosterone filled that my beard grew just by watching the commercial.

That’s exactly what buying storage is about. Say you want (nay, need) a new truck. You look at the main suppliers: Ford, Chevy/GMC, and Dodge. Then you check out “the new kids” on the block: Toyota and Nissan. You might even look at other options like the ultra-high International, or something so ugly it’s cool like an old Unimog.  The vendors will bring out stats upon stats with what they can do. “We have best in class towing”, “We have the best fuel mileage”, and “We are so cool that after riding in our truck you can wrestle bears.”

So now we come to buying storage. “We are the most sold” – great, but I only want one. “We do 1.2 million IOPS” – okay, that does sound cool. “We are the fastest per BTU” – Gee, that’s something to brag about.

Then every so often a disruptor shows up. Maybe the new trucks run on hydrogen or all-electric. The first disruptors are sweet and expensive. Then affordable versions arrive — that’s when it gets interesting.

In storage that disruptor is flash. The All-flash arrays have shown up in strength and are as awesome to have as you might imagine — assuming you spend your time thinking about storage.

 


 

At first, the old crowd taunts the new trucks and tries to discredit them (all while secretly working on their own). That’s what happened in storage. Now that the big storage boys are all entering the all-flash array space, they have stopped calling it a gimmick.

gimmick, noun — Anything you do better than we do.

Now that the market has become mainstream and acceptable, where do you go for a device? Right to the people who do it best at a price that isn’t premium: Pure Storage.

Why Pure Storage instead of Vendor X, Y, Z, and the traditional vendors that are making their appearances? Our investigation led us to Pure Storage as the best all-flash array: proven track record (who wants to use beta-worthy gear?), huge feature set, simplicity (the user guide is on a business card), awesome colors (people buy vehicles for less valid reasons), and performance in the real world.

Pure delivers this flash power at costs comparable to spinning disk. Flash (the super great new thing) at the cost of the traditional disk. I love my trucks, but if someone wants to upgrade my vehicle, who am I to argue?

Pure Storage handles the toughest workloads, such as database and VDI boot/patch storms, like traditional storage handles file requests. We have customers that put their Pure Storage boxes to the test on heavy workloads all day, every day, and their bottlenecks are no longer storage. The bottleneck is now how fast they can throw data at their storage.


When Pure Storage built their array to handle flash from the ground up, they did more than just slap in flash disks. They looked at what is painful and obscure in storage and made it easy.

Remember when dealers would sap you for routine maintenance to keep all the parts of your vehicle working? With traditional storage, your admins need the same level of knowledge as those dealers’ mechanics. With Pure Storage, all the sensors, the parts to maintain, and the software have built-in know-how. Those trips to the dealer aren’t needed. Pure Storage does the hard work internally: picture your truck giving itself oil changes.

Pure Storage has brought the full promise of flash storage, and we can get the luxury model for the cost of your traditional storage. And as others have said, “It is built Pure tough.”

Note: No trucks were harmed in the authoring of this blog post.
