Defining Data Sprawl in Splunk: Why it Matters, and What it’s Costing You

“Data Sprawl” isn’t really a technical term you’ll find in the Splexicon (Splunk’s glossary). Here at Kinney Group, however, we’ve been around Splunk long enough to identify and define this concept as a real problem in many Splunk environments.

What exactly is Data Sprawl? It’s not one single thing you can point to, but rather a combination of symptoms that generally contribute to poorly performing and difficult-to-manage Splunk implementations. Let’s take a look at each of the three symptoms we use to define Data Sprawl, and break down the impacts to your organization:

  1. Ingesting unused or unneeded data in Splunk
  2. No understanding of why certain data is being collected by Splunk
  3. No visibility into how data is being utilized by Splunk

Ingesting unused or unneeded data in Splunk

When you ingest data you don’t need into Splunk, the obvious impact is on your license usage (if your Splunk license is ingest-based). This may not be terribly concerning if you aren’t pushing your ingest limits, but there are other impacts lurking behind the scenes.

For starters, your Splunk admins could be wasting time managing this data. They may or may not know why the data is being brought into Splunk, but it’s their responsibility to ensure this happens reliably. This is valuable time your Splunk admins could be using to achieve high-value outcomes for your organization rather than fighting fires with data you may not be using.

Additionally, you may be paying for data ingest you don’t need. If you’re still on Splunk’s ingest-based pricing model, and you’re ingesting data you don’t use, there’s a good chance you could lower Splunk license costs by reducing your ingest cap. In many cases, we find that customers have purchased larger licenses than they currently need in order to plan for future growth.

We commonly run into scenarios where data was being brought in for a specific purpose at one point in the past, but is no longer needed. The problem is that no one knows why it’s there, and they’re unsure of the consequences of not bringing this data into Splunk. Having knowledge and understanding of these facts provides control of the Splunk environment, and empowers educated decisions.

No understanding of why certain data is being collected by Splunk

Another common symptom of Data Sprawl is a lack of understanding around why certain data is being collected by Splunk in your environment. Having the ability to store and manage custom metadata about your index and sourcetype pairs, in a sane and logical way, is not a feature that Splunk gives you natively. Without this knowledge, your Splunk administrators may struggle to prioritize how they triage data issues when they arise. Additionally, they may not understand the impact to the organization if the data is no longer coming into Splunk.
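To make the idea of custom metadata concrete, here is a minimal sketch in Python of what a record for one index and sourcetype pair might hold. The `DataDefinition` class, its field names, and the sample values are all hypothetical, chosen only for illustration; this is not how Splunk or Atlas stores this information.

```python
from dataclasses import dataclass, field


@dataclass
class DataDefinition:
    """Hypothetical metadata record for one index/sourcetype pair."""
    index: str
    sourcetype: str
    owner: str = "unknown"            # team or person accountable for the data
    purpose: str = "undocumented"     # why the data is collected
    compliance_required: bool = False
    notes: list[str] = field(default_factory=list)

    @property
    def key(self) -> tuple[str, str]:
        return (self.index, self.sourcetype)


def undocumented(definitions: list[DataDefinition]) -> list[tuple[str, str]]:
    """Return the pairs whose purpose nobody has recorded yet."""
    return [d.key for d in definitions if d.purpose == "undocumented"]


# Illustrative data only.
definitions = [
    DataDefinition("security", "linux_audit", owner="secops",
                   purpose="retained for compliance", compliance_required=True),
    DataDefinition("main", "legacy_app_logs"),
]

print(undocumented(definitions))
```

Even a simple record like this answers the questions an admin needs during triage: who owns the data, why it is there, and whether it can safely stop coming in.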

The key is to empower your Splunk admins and users with the information they need to appropriately make decisions about their Splunk environment. This is much more difficult when we don’t understand why the data is there, who is using it, how frequently it is being used, and how it is being used. (We’ll cover that in more detail later.)

This becomes an even bigger issue with Splunk environments that have scaled fast. As time passes, it becomes easier to lose the context, purpose, and value the data is bringing to your Splunk mission.

Let’s consider a common example we encounter at Kinney Group.

Many organizations must adhere to compliance requirements related to data retention. These requirements may dictate collecting specific logs and retaining them for a period of time. This means that many organizations have audit data coming into Splunk regularly, but that data rarely gets used in searches or dashboards. It’s simply there to meet a compliance requirement.

Understanding the “why” is key for Splunk admins because that data is critical, but the importance of the data to end users is likely minimal.

(If this sounds like your situation, it might be time to consider putting that compliance data to work for you. See how we’re helping customers do this with their compliance data today with Atlas.)

The Atlas Data Management application allows you to add “Data Definitions,” providing clear understanding of what data is doing in your environment.

No visibility into how data is being utilized by Splunk

You’ve spent a lot of time and energy getting your data into Splunk, but now you don’t really know a lot about how it’s being used. This is another common symptom of Data Sprawl. Important decisions about how you spend your time managing Splunk are often based on who screams the loudest when a report doesn’t work. But do your Splunk admins really have the information they need to put their focus in the right place? When they know how often a sourcetype appears in a dashboard or a scheduled search, they have a much clearer picture of how data is being consumed.
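As a rough illustration of that last point, the Python sketch below counts how many searches reference each sourcetype. It assumes you have already exported the SPL text of your scheduled searches and dashboard panels into a dictionary; the search names, the SPL strings, and the simplified regular expression (which only catches literal `sourcetype=value` terms) are assumptions made for this example.

```python
import re
from collections import Counter

# Simplified pattern: matches sourcetype=foo and sourcetype="foo".
# Wildcards, macros, and eventtypes are deliberately ignored in this sketch.
SOURCETYPE_PATTERN = re.compile(r'sourcetype\s*=\s*"?([\w:.\-]+)"?', re.IGNORECASE)


def count_sourcetype_references(searches: dict[str, str]) -> Counter:
    """Count, per sourcetype, how many searches mention it at least once."""
    counts: Counter = Counter()
    for spl in searches.values():
        for sourcetype in set(SOURCETYPE_PATTERN.findall(spl)):
            counts[sourcetype] += 1
    return counts


# Illustrative data only.
searches = {
    "Failed logins (scheduled)":
        'index=security sourcetype=linux_secure "Failed password" | stats count by host',
    "Web errors (dashboard panel)":
        'index=web sourcetype="access_combined" status>=500 | timechart count',
    "Login overview (dashboard panel)":
        'index=security sourcetype=linux_secure | top user',
}
ingested = ["linux_secure", "access_combined", "legacy_app_logs"]

counts = count_sourcetype_references(searches)
unused = [st for st in ingested if counts[st] == 0]

print(counts)
print(unused)
```

A sourcetype that is ingested but never referenced is a candidate for a closer look, not automatic removal, since it may exist for compliance reasons.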

Actively monitoring how data is utilized within Splunk is extremely important because you can understand how to effectively support your existing users and bring light to what Splunk calls “dark data” in your environment. Dark data is all of the unused, unknown, and untapped data generated by an organization that could be a tremendous asset if the organization knew it existed.

Kinney Group’s Atlas platform includes Data Utilization — an application designed to show you exactly what data you’re bringing in, how much of your license that data is using, and if it’s being utilized by your users and admins.

Conclusion

Most organizations may not realize that Data Sprawl is impacting their Splunk environment because it doesn’t usually appear until something bad has happened. While not all symptoms of Data Sprawl are necessarily urgent, they can be indicators that a Splunk environment is growing out of control. If these symptoms go unchecked over a period of time they could lead to bigger, more costly problems down the line.

Knowledge is power when it comes to managing your Splunk environment effectively. Kinney Group has years of experience helping customers keep Data Sprawl in check. In fact, we developed the Atlas platform for just this purpose. Atlas applications are purpose-built to keep Data Sprawl at bay (and a host of other admin headaches) by empowering Splunk admins with the tools they need.

Click here to learn more about the Atlas platform, get a video preview, schedule a demo, or start a free 30-day trial of the platform.

Bridging the Splunk Usability Gap to Achieve Greater Adoption and Expansion

Splunk, the amazing “Data to everything” platform, provides some of the best tools and abilities available to really control, analyze, and take advantage of big data. But you don’t build such a powerful and expansive platform over a decade without it being a bit technical, and even difficult, to fully utilize.

This technical hurdle — that we lovingly call the “Usability Gap” — can stop Splunk adoption in its tracks or stall an existing deployment to its ruin. By clearing the Usability Gap, however, a Splunk environment can prosper and deliver a fantastic return on your investment.

That raises a question: “What is the Usability Gap, and how do I get across?”

How to Recognize the Gap

What exactly makes up the steep cliff sides of the “Usability Gap?” Well, these symptoms can manifest themselves in any Splunk deployment or client ecosystem, and are caused just as much by human elements as technical blockers.

The key to any good Splunk deployment is a properly focused admin. Many admins or admin teams were handed Splunk as an additional responsibility instead of a planned and scoped aspect of their job. This disconnect can lead to under-certified admins who lack the time and experience needed to quickly solve issues and incoming requests from Splunk users.

Splunk users can also be underequipped and undertrained. While formal training is available to users through Splunk Fundamentals certification and other online courses, it may not meet users where they are, and it lacks the benefits of in-person training with real, actionable data. These issues can be big blockers for learning Splunk and increase the time it takes for users to become confident with the system.

If you’re still not sure if you have a Usability Gap issue, check the activity found on the system itself. If your Splunk search heads are getting little action from users and admins, you know for a fact that something is coming between your users and their Splunk goals.

What a Gap Means for You

What are the consequences of a Usability Gap? They are wide ranging and impactful.

With a lack of focus and experience, admins are going to be severely hampered in achieving outcomes with Splunk. When technical issues arise with the complex Splunk ecosystem, or a unique data set requires attention, admins will have to carve out time to not only work on the issue at hand but learn Splunk on-the-fly as well. Without the proper support, progress slows and a lack of Splunk best practices is to be expected in these deployments.

Users without a watchful or knowledgeable eye will be left to their own devices. This can lead to poorly created searches and dashboards, bad RBAC implementation (if implemented at all), or worse, no movement at all. Without a guiding hand and training, the technical nature of Splunk will eventually cause users to misconfigure or slow down the platform, or just not adopt such an imposing tool. These issues together can lead to a peculiar event, where Splunk is labeled as an “IT tool for IT people.” This is far from the truth, but if users are not properly trained, and admins don’t have time to be proactive, only the technically savvy or previously experienced will be able to utilize the investment. While some outcomes will be achieved, many organizations will realize their significant investment isn’t aligned with their outcomes and will drop Splunk altogether, letting all the effort and time invested go to waste.

Mind the (Usability) Gap

Fortunately, there’s an easy answer for solving these problems and bridging the Usability Gap in your environment — the Atlas™ Platform for Splunk. Atlas is geared towards increasing and speeding up Splunk adoption and enabling Splunk admins to do more with their investment. Let’s look at the elements of Atlas that help bridge the Usability Gap!

The Atlas Application Suite, which is a collection of applications and elements that reside on the search head, helps admins improve their deployment, and zero in on giving users a head start with achieving outcomes in Splunk. One such application is the Atlas Search Library.

Search Library gives users an expandable list of Splunk searches that are properly described and tagged for discoverability and learning. Using the Search Library, a Splunk User can create a library of knowledge and outcomes when it comes to the complex nature of Splunk’s Search Processing Language. This greatly accelerates skill sharing and education around SPL — one of Splunk’s biggest roadblocks.

Another element is the Atlas Request Manager. This application greatly increases the usability of Splunk by quickly linking admins and users with a request system built into the fabric of Splunk itself. Admins no longer need to spend time integrating other solutions, and users receive a robust system for asking for help with creating dashboards, Splunk searches, onboarding data, and more, all within Splunk!

Adding a data request is quick and painless thanks to Atlas Request Manager

Last, but certainly not least in bridging the Usability Gap, is Atlas Expertise on Demand. Expertise on Demand (EOD) is a lifeline to Kinney Group’s bench of trusted, Splunk-certified professionals when you need them most. EOD provides help and guidance for achieving outcomes in Splunk, and can lead the charge in educating your admins and users about all things Splunk. With EOD, your admins and users have all the help they need to maximize their Splunk investment.

Wrapping up

The Usability Gap is too big a problem to ignore. Frustrated users, overtaxed Splunk admins, and a clear lack of outcomes await any Splunk team that ignores the clear symptoms and issues presented by the Usability Gap. Hope is not lost, however! The Atlas platform is purpose-built to help you get over the hurdles of adopting and expanding Splunk. With incredible tooling that simplifies searches, closes SPL gaps, and manages requests, not to mention Expertise on Demand, Atlas gives admins the support they need and Splunk users the attention they deserve for education and meeting their Splunk goals!

This just scratches the surface of what Atlas can do for your Splunk journey, so read more about our incredible platform and discover what you are missing!

Meet ES Helper

ES Helper is the purpose-built tool for getting Splunk Enterprise Security over the hump and actionable for Security Teams. Enterprise Security is a complex tool that takes all kinds of data to create its interesting visuals and track its notable events, but due to its complexity it can be off-putting to new or inexperienced Splunk Admins. ES Helper is here to bridge the gap and help those Admins get a head start on utilizing an amazing security tool.

Where are we at?

Every action plan needs a starting place, and ES Helper figures that out for you with automation. With a set of interesting and complex searches, the Atlas Element analyzes your ES Deployment and gives out the ES Utilization Score.

This score immediately tells Splunk and ES Owners where they ‘are’ with their deployment, and what room they have to grow. This is supremely beneficial for tracking growth of the platform and analyzing how well you are utilizing your investment in security, and it simplifies the complex workflows of ES into a digestible format.

What’s next?

After gaining perspective on the status of the Enterprise Security deployment with the ES Utilization Score, the next logical question is to ask how to improve it. ES Helper is right there with you with the ES Datamodel Report. This Report shows how much data is being ingested into Enterprise Security, and furthermore, layers a priority lens over it for context.

The Priority labels are derived from Team Atlas’s investigation into how Splunk ES utilizes the data, the importance of the outcomes tied to the data, and how much bang for the data buck each data point gives Security Analysts. Using this Priority, and the investigation into how filled the datamodels are, Splunk Admins can quickly identify which datamodel should be buffed up with more data to improve data coverage in Enterprise Security.

Lucky for the Admins, selecting the datamodel in Atlas ES Helper quickly identifies any recommended sourcetypes to fill out the datamodel with actionable data.

This workflow enables Admins to go from zero to hero with ES with a clear line of sight on next steps for improving their security monitoring and posture!

What’s Changed?

After updating a datamodel with a whole slew of additional data sources, an Admin may ask what impact they actually had. With ES Helper, Admins can utilize our analysis to get quick results on which dashboards and searches changed, enabling a quick validation check and reward for hard work!

Conclusion

ES Helper speeds up the slow, technical process of improving an Enterprise Security deployment. By fast-tracking a Splunk Admin’s ability to analyze their environment, identify new data sources, and track changes, ES Helper lets Splunk Admins quickly improve their Security CIM and track that improvement. Bringing Expertise on Demand into the mix boosts this effort even further, enabling Admins to meet their security needs ahead of schedule!

Meet Data Utilization

Data Utilization is an excellent companion Element to Data Management. While Data Management is focused on tracking ingests with metadata and awareness alerts, Data Utilization is centered on using automation to help Admins and Users track how Users, Scheduled Searches, and Dashboards are utilizing data being ingested into Splunk.

How is this being used again?

Data Utilization helps Admins quickly identify how data is being used across their environment by users. By tracking how ad-hoc searches and scheduled searches are searching across all data, Data Utilization can highlight active data streams. Furthermore, Data Utilization investigates dashboards that have been used lately, and investigates what data is being utilized on each dashboard load. All of this comes together into an easy-to-understand report.

Admins can change the filter for the search, splitting the data by index for high-level investigations, by index-sourcetype for normal baselines, or by index-sourcetype-source to identify individual data points that slipped through the cracks. Admins can select any one of these findings to learn more about its utilization.
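The three levels of detail can be pictured as the same usage records grouped by progressively longer keys. The following Python sketch illustrates that grouping under our own assumptions: the record layout, the `searches` field, and the sample paths are hypothetical and are not taken from the Data Utilization application.

```python
from collections import defaultdict

# Each granularity is just a longer grouping key over the same records.
GRANULARITIES = {
    "index": ("index",),
    "index-sourcetype": ("index", "sourcetype"),
    "index-sourcetype-source": ("index", "sourcetype", "source"),
}


def summarize(records: list[dict], granularity: str) -> dict[tuple, int]:
    """Total the number of searches touching each group at the chosen level."""
    fields = GRANULARITIES[granularity]
    totals: dict[tuple, int] = defaultdict(int)
    for record in records:
        key = tuple(record[name] for name in fields)
        totals[key] += record["searches"]
    return dict(totals)


# Illustrative data only.
records = [
    {"index": "web", "sourcetype": "access_combined",
     "source": "/var/log/nginx/access.log", "searches": 40},
    {"index": "web", "sourcetype": "access_combined",
     "source": "/var/log/nginx/old_access.log", "searches": 0},
    {"index": "web", "sourcetype": "nginx_error",
     "source": "/var/log/nginx/error.log", "searches": 5},
]

for level in GRANULARITIES:
    print(level, summarize(records, level))
```

In this sample, the unused `old_access.log` source is invisible at the index and index-sourcetype levels and only shows up at the finest granularity.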

Using Data Utilization, Admins can quickly identify who is searching a sourcetype, using what scheduled searches, and on what dashboards, and when! Admins can also inspect the SPL associated with each of these three options!

Make way for the new!

Data Utilization also offers a powerful perspective for Splunk Owners. By analyzing how data is being utilized, Admins can quickly identify any deprecated data streams that could be removed from Splunk. The benefits of this are evident, as it can make room for other ingests for more important use cases, or bring a deployment down below its license level, reducing Splunk operating costs. Another benefit is the reduction in technical debt, as Splunk Admins can now focus on data streams that matter for their users!
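The arithmetic behind that benefit is simple enough to sketch. The Python example below assumes you already know the average daily ingest per index and sourcetype pair and how many searches touched each pair; the function name, the figures, and the 100 GB license are made up for illustration.

```python
def unused_ingest_report(daily_gb: dict, search_counts: dict, license_gb: float) -> dict:
    """Summarize how much daily ingest comes from streams nobody searches."""
    unused = {key: gb for key, gb in daily_gb.items()
              if search_counts.get(key, 0) == 0}
    reclaimable = sum(unused.values())
    return {
        "unused_streams": sorted(unused),
        "reclaimable_gb": reclaimable,
        "license_share_pct": round(100 * reclaimable / license_gb, 1),
    }


# Illustrative figures only (GB per day, keyed by index and sourcetype).
daily_gb = {
    ("web", "access_combined"): 40.0,
    ("main", "legacy_app_logs"): 25.0,
    ("security", "linux_audit"): 10.0,
}
search_counts = {
    ("web", "access_combined"): 120,
    ("security", "linux_audit"): 0,
}

report = unused_ingest_report(daily_gb, search_counts, license_gb=100.0)
print(report)
```

Note that an unsearched stream is not automatically safe to remove. In this sample the audit data might be retained purely for compliance, which is exactly why the reason for collecting each stream should be recorded alongside its usage.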

Conclusion

Data Utilization is a powerful tool, enabling Splunk Admins to quickly come to terms with how their environment is being used by both Users and Scheduled Searches, while empowering Admins to jumpstart discussions for prioritizing data streams. With Data Utilization, Admins can more easily reduce license utilization while increasing visibility.

An Introduction to Atlas

Splunk is an amazing Data Analytics platform, able to receive data from all over your ecosystem, perform crazy effective correlation searches and populate tidy dashboards. But a platform this big and mature obviously has some hurdles for owners to overcome before receiving the full benefits of such a large investment. Atlas can help you clear these Splunk sore spots and get you and your team on the fast track to achieving outcomes and mastering your Splunk Environment and Utilization.

When we say Splunk is a massive platform, we mean it! From Splunk Core, which consists of Search Heads, Indexers, and Forwarders, to Splunk Enterprise Security, Splunk SOAR, and Splunk ITSI, the Splunk portfolio is as complex as the unique and technical requirements demand. And peeking behind the curtain, powering these applications and outcomes are Splunk searches, scheduled alerts, dashboards, data models, lookup tables, KV stores, and the list goes on and on. All these ingredients mix together into what could be an overwhelming dish for many new and old Splunk owners.

Zeroing in on New Splunk Owners, these fresh-faced Splunkers are eager to get results out of their new platform but may be overwhelmed by all the bits and bots of the platform. What should new Splunk Users and Admins do first? Learn Search Processing Language? Get Data into Splunk through a one-time drop or a consistent data feed? Build a simple dashboard? What should Users and Admins learn first? How to control for concurrency in alerting, or how to write better SPL searches to ensure the environment doesn’t tank? These questions can hamper Splunk adoption, and lead to an unorganized and unoptimized Splunk environment.

Mature Splunk Owners are not without issues either. With Admins swapping in and out of management teams, is the Splunk environment under a consistent and knowledgeable enough watch to improve? How are new users and data onboarded to ensure stability? How do Admin teams stay proactive with Concurrency and deprecated data streams? These issues can slow down a once effective Splunk Team, and more importantly slow or stall a Splunk System.

As we promised earlier, Atlas can help with these issues and light the fire of Splunk Adoption and Expansion for your team. But what is Atlas, and how does it look in practice? An Atlas subscription consists of 3 products: 

  • Reference Designs, which is packaged automation to get Splunk operational on unique hardware ecosystems to ensure best practice and better performance
  • Expertise on Demand (EoD), a fantastic service dedicated to live help with achieving Splunk outcomes such as data onboarding or education
  • Atlas Application Suite, a collection of Atlas Elements that help Admins and Users master Splunk

The Atlas Application Suite resides on your Search Head layer and is easily applied like a collection of Splunk Apps. These Elements work together to achieve great outcomes for your Splunk environment and Users. These elements align themselves with common themes that plague Splunk deployments everywhere:

  • Data Sprawl: Keeping track of ingests and ensuring license utilization is spent wisely
  • Search Quality: Poor scheduled searches and concurrency can severely impact performance or results
  • Data Awareness: Ensuring data streams stay healthy and automating reporting for disruptions
  • Usability Gap: Lowering the bar to entry for utilizing and managing Splunk
  • Cloud Migration: Making the bridge to Splunk Cloud quick and easy

Aligned to these themes, Atlas can make short work of the hurdles we mentioned earlier, especially since the Atlas Application Suite resides on Splunk itself!

For new Splunk Owners, Expertise on Demand and the Search Library can guide Users and Admins down the best route for learning Splunk SPL and building dashboards. Scheduling Inspector empowers Splunk Admins to ensure Scheduled Searches are working as expected and Data Management enables Admins to have visibility into how ingest License is being spent.

Mature Splunk Environments can appreciate Atlas’s ability to tune up their environment and give Admins the tools they need to not just manage Data Sprawl but put their Forwarders under a watchful eye and highlight unreliable data streams. Scheduling Assistant will help Admins speed up the environment and reduce errors as they reduce system concurrency, and Expertise on Demand is of course still there to assist with any issues outside the Splunk Team’s scope.

Splunk is a large platform that deserves a well-equipped team of Splunk Admins and a well-supported squad of Splunk Users. With Atlas, Splunk Environments both big and small, new and old, can get more out of Splunk, making it faster, more effective, and able to tackle new goals. With Atlas, you can begin your Splunk Journey on the right foot.

Meet Atlas Forwarder Awareness

If there was a secret sauce for Splunk, the key ingredient would be the platform’s forwarders. Providing users with the ability to automatically send data for indexing, Splunk forwarders are essential to data delivery in Splunk Enterprise and Splunk Cloud environments.

In most Splunk instances, you have multiple forwarders: often hundreds, sometimes even thousands. These forwarders send data to your indexers, where it is processed and stored so that your search heads can search it. However, there has historically been an issue with forwarders: they can silently drop dead.

If you’re looking at your data pipeline in Splunk, your forwarders are on the front line. Forwarders play a pivotal role in ingesting your data; however, they can disappear or unexpectedly fail without you knowing. A missing forwarder may result in an issue as small as temporarily not ingesting data or as large as a dashboard or alert missing key information for weeks.
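The core check for a silently failed forwarder is easy to picture: compare the last time each forwarder was heard from against a tolerance. Here is a minimal Python sketch of that idea, assuming you can already obtain a last-seen timestamp per forwarder; the host names, the 15-minute tolerance, and the function name are assumptions made for this example and do not describe how Atlas implements it.

```python
from datetime import datetime, timedelta, timezone


def find_silent_forwarders(last_seen: dict[str, datetime],
                           now: datetime,
                           max_silence: timedelta = timedelta(minutes=15)) -> list[str]:
    """Return the forwarders that have not reported within the tolerance."""
    return sorted(host for host, seen in last_seen.items()
                  if now - seen > max_silence)


# Illustrative data only: a fixed "now" keeps the example reproducible.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "web-01": now - timedelta(minutes=2),   # healthy
    "web-02": now - timedelta(hours=6),     # quiet for hours
    "db-01": now - timedelta(days=3),       # quiet for days
}

print(find_silent_forwarders(last_seen, now))
```

The right tolerance depends on the data source. A forwarder on a busy web server should report constantly, while one watching a nightly batch job may legitimately be quiet for most of the day.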

To solve this age-old problem in Splunk, we’ve built an application within Atlas, our platform for Splunk, that allows you to have eyes on all of your forwarders in one place.

Atlas’ Forwarder Awareness Application

Atlas Forwarder Awareness is an application that provides visibility into all of your forwarders, their statuses, and any misconfigurations or failures within your environment. Built within the Atlas Application Suite, the Forwarder Awareness tool enables teams to have constant visibility into their forwarders’ health and statuses.

Now teams can quickly determine if a forwarder is missing and take action—immediately.

A Bird’s-Eye View of Your Ecosystem

Atlas Forwarder Awareness brings the Atlas touch to your Splunk and Infrastructure by empowering Admins to quickly group together forwarders by context or server classes. This means that Admins can quickly identify if a critical forwarder is down from the Forwarder Group overview, and further investigate with ease.

By giving this overview, Admins won’t be swamped with false positives or low priorities but will receive actionable information with the appropriate context. Furthermore, Atlas’s automation also tracks uptime on a per Forwarder basis, enabling Admins to identify problematic data streams.

Forwarder Awareness also comes pre-packaged with Alerts that can quickly inform Admins and Group Owners of failures in their critical systems, ensuring fast turnaround time for fixing issues.

Selecting one of these tiles enables Admins to drill down further into the Forwarder Report.

A Clear Dive

On the Forwarder Report, the Admin gets a heads-up display of the most actionable items related to Forwarders: version, SSL status, throughput, receiver count, uptime, status, and more! This enables Admins to understand which action items they need to address, while applying the group filter from the previous page.

Selecting a Forwarder offers a deeper dive into ingest of sourcetypes and throughput over time, while also outlining what sourcetypes the forwarder is responsible for. In fact, the entire Forwarder Awareness Element enables users to search by sourcetype, enabling users to quickly see the status of all forwarders tied to particular outcomes!

Conclusion

Every Splunk instance is at risk of a failed or missing forwarder. With your forwarders being at the front line of your data pipeline, it’s essential to have eyes on them at all times. With Atlas’s Forwarder Awareness Application, you have the visibility you need, and visibility you won’t find anywhere else. Paired with built-in alerting, Splunk Admins powered by Atlas will have all the tools necessary to make the most robust Splunk system out there!

This is just a glimpse into the power of the Atlas platform. Paired with more applications, reference designs, and support services, Atlas enables all Splunk teams to be successful. If you’d like to learn more about the Atlas Platform, let us know in the form below.

Meet Atlas Data Management

Splunk is the data to everything platform, capturing massive volumes of data every day. Users will know, though, that without visibility, it can be difficult to extract the maximum value from Splunk. Too often, insufficient monitoring can lead to serious issues in a Splunk instance: platform underutilization, license overage, and even missing data. Each of these problems translates into a serious cost in financial resources, not to mention the hours of human intervention spent on troubleshooting a Splunk environment.

Atlas makes data management easy.

Figure 1 — the Data Management icon on the Atlas Core homepage

Atlas, Kinney Group’s revolutionary new platform for Splunk, includes the Data Management application, a tool that de-mystifies your license costs and improves Splunk stability with automated reporting. Gone are the days of license overages and data streams quietly dying in the dark—Atlas ensures unparalleled visibility to guarantee efficient use of data resources.

Attack Your Sprawl!

Data Management is the pivotal tool in Atlas for reining in your license and keeping it in check. The Data Inventory dashboard organizes your existing data by sourcetype and index, with visibility into license utilization per sourcetype. With the Atlas feature Data Definitions, Admins can record a plethora of notes on each sourcetype, keeping track of the who, what, when, where, and why of each data point! Data Definitions also automatically pull important Index and Sourcetype information, coalescing many different data points helpful for triaging into one place.

With these two enhancements together, Admins can use Data Management to finally own their Data Ingests into Splunk, with the ability to easily track, update, and be informed on how they are spending their license. This gives Admins the tools they need to course correct or triage issues like never before!

Automate Alerts and Awareness!

Data Management has another trick up its sleeve. On the Data Inventory dashboard, there exists a feature capable of greatly increasing visibility into data pipelines and ensuring critical dashboards and alerts stay accurate. Meet Data Watch!

Each sourcetype on Data Inventory has a bell icon that can be selected to turn on Data Watch, an automated alert interface that turns your needs for stability and visibility into easy-to-track, easy-to-edit alerting!

Using Data Watch, Admins can quickly create alerts for data streams to ensure they are the first to know if a data pipeline falters. Using pattern recognition, Data Watch can alert if the data stream dips below a threshold or comes from fewer sources, creating a technical tripwire to alert based on bad behavior in your pipelines.
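To show the shape of such a tripwire, here is a minimal Python sketch of the two conditions described above: a drop in volume and a drop in the number of reporting sources. It substitutes a plain comparison against a historical average for real pattern recognition, and the function name, the 50% drop ratio, and the sample figures are all our own assumptions rather than Data Watch’s actual logic.

```python
from statistics import mean


def check_stream(history: list[int], current_events: int,
                 current_sources: int, baseline_sources: int,
                 drop_ratio: float = 0.5) -> list[str]:
    """Return the reasons, if any, that a data stream looks unhealthy."""
    alerts = []
    baseline_events = mean(history)
    # Tripwire 1: volume fell below a fraction of the historical average.
    if current_events < baseline_events * drop_ratio:
        alerts.append("volume_drop")
    # Tripwire 2: fewer sources are reporting than expected.
    if current_sources < baseline_sources:
        alerts.append("fewer_sources")
    return alerts


# Illustrative data only: hourly event counts for one sourcetype.
history = [1000, 1100, 900, 1000]

print(check_stream(history, current_events=300, current_sources=3, baseline_sources=4))
print(check_stream(history, current_events=950, current_sources=4, baseline_sources=4))
```

A fixed ratio like this is crude. Streams with strong daily or weekly cycles would need a baseline that compares like with like, such as the same hour on previous days.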

With Data Watch, Admins can quickly expand their horizon on their Splunk Deployment, and thanks to Atlas’s automation, alerts take minimal time to set up and edit!

Conclusion

The “data to everything” platform promises incredible results, but you need a high degree of visibility within a Splunk environment to make that happen. Atlas’s Data Management application provides the transparency you need to ensure your license and ingests are properly tracked to prevent sprawl, and properly managed to prevent errors from seeping into your dashboards. Teams can now collaborate seamlessly with the knowledge that their data streams won’t be hidden or lost, bringing your organization one step closer to getting every insight you can out of your data.

There’s more to come from Atlas! Fill out the form below to stay in touch with Kinney Group.