What is more important for a successful Splunk Enterprise implementation: IOPS, the number of CPU cores, or the amount of memory (RAM)? The answer is IOPS. CPU and memory both matter, but they build on the core requirement of Splunk Enterprise, which is 800 (or preferably more) IOPS.
Taking a step back, though, what are IOPS? IOPS stands for input/output operations per second. Put simply, IOPS measure the number of read or write operations a disk or storage device can perform per second. IOPS are an industry-standard metric for benchmarking storage devices and disks.
How does this storage hardware jargon pertain to Splunk Enterprise? Since Splunk Enterprise collects, indexes, and stores any type of machine data, IOPS lie at the core of how Splunk Enterprise works, because servers are fundamentally machines with physical components. As data is forwarded from a universal forwarder to an indexer, the indexer must parse that data and write it to disk. Then, when the search head looks for the data, the indexer must read that data back from disk and pass it to the search head, where it is formatted and presented for analysis, dashboarding, and reporting. The goal is faster reads and writes from storage. The faster the disk can read and write the data, the less time the CPU spends waiting for data to arrive and be processed, whether during a search or during the parsing and indexing phase.
Here is an example of how an IOPS bottleneck can hurt a Splunk Enterprise implementation. Let us pretend we are passing 100 GB per day to an indexer with 12 cores, but the storage is only providing 200 IOPS. The CPUs will be stuck holding the data while they wait for the disks to be ready, causing a cascade of negative consequences. How? This scenario hurts the environment in two ways. First, it increases your CPU utilization, which creates data center inefficiency: your power consumption increases, producing more heat, which in turn raises your data center cooling costs. Second, since your CPUs are tied up, you need to add three more indexers to handle the amount of data you are trying to ingest with Splunk Enterprise. The lack of IOPS causes trouble not only at indexing time but also at search time, because indexing is always prioritized over searching, so any search that is requested gets queued.
Think about how you get queued in line at the amusement park for the hot new roller coaster. This analogy describes how index-time processing competes with search-time processing. You (“searching”) have been patiently waiting in a treacherous line for two hours. You finally make it to the front. You jump with joy, naturally, and mentally prepare to get strapped in for the ride of your life. Then suddenly, a big group of people (“indexing”) who have VIP passes walk up and hop right on in front of you. Your dreams are shattered as you learn you have to wait another half hour. This example shows how a lack of IOPS pushes searching to a lower priority.
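The VIP-pass effect can be sketched as a toy model. This is only an illustration, not how Splunk Enterprise schedules I/O internally: it assumes the disk has a fixed budget of operations per second, that indexing is always served first, and that searching gets whatever is left. The function name and the workload figures (150 indexing operations and 100 search operations per second) are hypothetical and chosen just to make the effect visible.

```python
def simulate_search_backlog(disk_iops, index_ops_per_sec, search_ops_per_sec, seconds):
    """Toy model: each second the disk serves indexing I/O first, then searches.

    Returns a list with the number of search operations still queued
    at the end of each simulated second.
    """
    index_backlog = 0
    search_backlog = 0
    history = []
    for _ in range(seconds):
        # New work arrives every second.
        index_backlog += index_ops_per_sec
        search_backlog += search_ops_per_sec

        # Indexing (the "VIP group") spends the disk budget first.
        budget = disk_iops
        served_index = min(budget, index_backlog)
        index_backlog -= served_index
        budget -= served_index

        # Searching only gets the operations that are left over.
        served_search = min(budget, search_backlog)
        search_backlog -= served_search

        history.append(search_backlog)
    return history


# Hypothetical workload: 150 indexing ops/s and 100 search ops/s.
slow = simulate_search_backlog(200, 150, 100, seconds=5)
fast = simulate_search_backlog(800, 150, 100, seconds=5)
print("200 IOPS search backlog:", slow)  # [50, 100, 150, 200, 250]
print("800 IOPS search backlog:", fast)  # [0, 0, 0, 0, 0]
```

Under these assumed numbers, the 200 IOPS disk leaves the search queue growing every second, while the 800 IOPS disk has enough headroom to serve both workloads.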
In a perfect world, a successful Splunk Enterprise implementation meets the minimum requirement of 800 IOPS. Remember: at 800 IOPS, the CPUs are no longer bottlenecked. The process does not have to wait for the disks to be available for reads or writes, and it can pass the data straight along to its storage destination. At 800 IOPS, you no longer need the extra servers to support your license. For your installation, you could reduce to a single indexer, although I still recommend having a second one to help your search performance in the long run. Ultimately, you will bring down your hardware, cooling, and energy costs.
Spinning disk has physical limitations and delivers fewer IOPS than all-flash storage arrays, which are now much more affordable for the data center than they were 5 years ago. When you dedicate an all-flash storage array to your Splunk deployment and thus present thousands of IOPS to your indexer, the single indexer that was handling 100 GB a day can handle 200 GB or more. Now the IOPS game changes: your storage is waiting on the CPUs to pass the data, because it can write data faster than the CPUs can process it.
To measure your existing IOPS, which will help you forecast what you might need, you have free, effective options. We recommend Bonnie++ on Linux and Iometer on Windows. Both of these tools can mimic the fully random read and write pattern of Splunk Enterprise. Make your Splunk Enterprise deployment a success by ensuring that your architecture delivers enough IOPS, because that is the most important requirement of a successful deployment.
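To make the idea of measuring IOPS concrete, here is a minimal Python sketch that issues random reads against a file and counts operations per second. It is not a substitute for Bonnie++ or Iometer: it only reads, it runs a single thread, and the operating system's page cache will inflate the result, especially for a freshly written file. The helper names, the 4096-byte block size, and the 16 MB scratch file are assumptions made for illustration; only the 800 IOPS threshold comes from this article.

```python
import os
import random
import tempfile
import time

SPLUNK_MIN_IOPS = 800  # minimum cited in this article


def measure_random_read_iops(path, block_size=4096, duration=1.0):
    """Issue random block-sized reads against `path` for about `duration`
    seconds and return the observed operations per second."""
    blocks = max(1, os.path.getsize(path) // block_size)
    ops = 0
    # buffering=0 bypasses Python's own buffer; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        start = time.perf_counter()
        while time.perf_counter() - start < duration:
            f.seek(random.randrange(blocks) * block_size)
            f.read(block_size)
            ops += 1
        elapsed = time.perf_counter() - start
    return ops / elapsed


def make_scratch_file(size_bytes=16 * 1024 * 1024):
    """Create a temporary file filled with random bytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


if __name__ == "__main__":
    scratch = make_scratch_file()
    try:
        iops = measure_random_read_iops(scratch, duration=0.5)
        verdict = "meets" if iops >= SPLUNK_MIN_IOPS else "is below"
        print(f"Observed about {iops:.0f} read ops/s, which {verdict} "
              f"the {SPLUNK_MIN_IOPS} IOPS minimum (cache effects included).")
    finally:
        os.remove(scratch)
```

For a number you can plan around, point a dedicated benchmarking tool at the actual volume that will hold your indexes, with a test file much larger than the server's RAM.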