When configuring Splunk Edge Processor environments, pipelines serve as the critical middle layer that transforms, filters, and enriches your data streams. Unlike traditional Splunk deployments that rely on props.conf and transforms.conf configurations, Edge Processor leverages powerful SPL2-driven pipelines to manipulate data between source and destination.
What Are Splunk Edge Processor Pipelines?
Edge Processor pipelines provide enterprise-grade data processing capabilities that enable you to:
- Extract specific fields from raw data streams
- Mask sensitive information for compliance and security
- Filter and drop unnecessary data to optimize storage costs
- Transform data formats before indexing
- Route data to appropriate destinations based on business logic
This pipeline-based architecture replaces traditional intermediate forwarder configurations, offering more flexibility and real-time processing power.
If you would like to know more about Edge Processors, see our introductory blog article on the topic.
Alternatively, you can check out our YouTube video about Edge Processors.
SPL2: The Next-Generation Query Language for Data Engineers
SPL2 (Search Processing Language 2) represents Splunk’s evolution in data processing languages, combining the best features of both SPL and SQL. This hybrid approach makes SPL2 incredibly accessible for developers with experience in either query language.
Key SPL2 Advantages:
- Intuitive syntax designed for faster learning curves
- Familiar commands for SPL and SQL practitioners
- Enhanced performance for large-scale data processing
- Built-in optimization for cloud and edge environments
- Streamlined development workflow
Whether you’re migrating from traditional SPL searches or SQL-based data transformations, SPL2’s familiar syntax patterns accelerate your development timeline while providing enterprise-scale processing capabilities.
Why This Guide Matters for Your Data Strategy
This tutorial focuses specifically on cost optimization, time savings, and operational efficiency through strategic SPL2 pipeline implementation. Rather than covering every SPL2 use case (including Ingest Processors and SPL2 application development), we’ll dive deep into the most impactful pipeline commands and data transformation patterns.
You’ll learn practical SPL2 techniques that directly impact your:
- Infrastructure costs through intelligent data filtering
- Processing performance via optimized transformations
- Operational complexity through simplified configuration management
SPL2 Syntax Fundamentals: From SPL to SPL2 Migration
SPL2 inherits much of its command structure from traditional SPL, making it immediately familiar to Splunk professionals. However, several key syntax differences optimize performance and improve code readability.
Universal SPL2 Command Structure:
| command_name <required_arguments> [optional_arguments]
Core Syntax Rules:
- All commands begin with a pipe (|) separator
- Command names are followed by arguments (optional in brackets [], required in angle brackets <>)
- Ellipsis (…) indicates repeatable arguments
- Optional arguments must be before required arguments
Practical Example: The fields Command
The fields command demonstrates essential SPL2 syntax patterns for field selection and filtering:
Syntax Pattern:
| fields [+|-] <field-list>...
Real-World Implementation:
| fields + source, 'host-ip', '*dest*'
This command configuration:
- Includes only specified fields (+ operator)
- Selects the source field
- Captures host-ip (using single quotes for hyphenated fields)
- Matches any field containing “dest” (wildcard pattern)
When migrating existing SPL searches to SPL2 pipelines, pay special attention to field naming conventions and delimiter requirements to avoid parsing errors.
SPL2 Commands for Edge Processor Pipelines
Now that you understand SPL2 fundamentals, let’s dive into the most commonly used pipeline commands that will power your data transformation workflows. While this isn’t an exhaustive reference, these commands cover the most common Edge Processor use cases.
NOTE: All Edge Processor pipelines begin with $pipeline syntax to distinguish them from SPL2 search queries.
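As a sketch of that structure, a minimal pipeline statement ties a source directly to a destination (the $source and $destination variables are bound when you configure the pipeline in Edge Processor):

```spl2
$pipeline = | from $source
            | into $destination;
```

Every pipeline in this article follows this same skeleton, with transformation commands placed between from and into.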
from Command: Sourcing Data
The from command initializes every pipeline by defining your data input source. This mandatory command appears as the first operation in your pipeline workflow.
Syntax:
| from $source
Key Features:
- Automatic inclusion in new pipeline templates
- Flexible filtering by sourcetype, host, or source parameters
- Dataset specification through the $source parameter
- Guided configuration with built-in prompts for source selection
Best Practice: Use specific sourcetype filters to optimize processing performance and reduce unnecessary data ingestion.
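One way to apply this best practice is a where filter immediately after from; note that the sourcetype value below is a placeholder for illustration, not a recommendation for your environment:

```spl2
$pipeline = | from $source
            | where sourcetype == "pan:traffic"
            | into $destination;
```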
into Command: Routing Our Data
The into command defines where processed data flows after pipeline transformations are complete. This terminal command supports both single and multi-destination routing strategies.
Syntax:
| into $destination
| into $destination2
| into $destination3
Routing Capabilities:
- Multiple destinations for data replication and backup
- Conditional routing based on processing results
- Flexible endpoint support (any configured destination)
- Sequential numbering for multi-destination pipelines
Use Case: Send security logs to both Splunk indexers and long-term S3 storage simultaneously.
rex Command: Extracting Fields
The rex command extracts structured fields from unstructured text using PCRE (Perl-Compatible Regular Expressions). Unlike search-time extractions, pipeline rex commands write extracted fields directly to your index.
Syntax:
| rex [field=<field>] [max_match=<int>] [offset_field=<string>] [mode=sed] /<regular-expression>/
Real-World Example:
| rex field=_raw /SRC=(?<source_ip>\d+\.\d+\.\d+\.\d+)/
Here we are extracting the source IP address into a named capture group. When using a pipeline with sample data, the preview pane presents a table of your sample events; notice the new source_ip column populated with the values extracted from your raw data.
Tip: Use regex101.com with sample data to test capture groups before implementing them in production pipelines.
Performance Benefits:
- Index-time extraction eliminates search-time processing overhead
- Immediate field availability for downstream commands
- Reduced query complexity in user searches
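Building on the example above, here is a sketch of extracting both source and destination IPs in a single pass; it assumes your raw events contain adjacent SRC= and DST= key-value pairs, which you should verify against your own sample data:

```spl2
$pipeline = | from $source
            | rex field=_raw /SRC=(?<source_ip>\d+\.\d+\.\d+\.\d+)\s+DST=(?<dest_ip>\d+\.\d+\.\d+\.\d+)/
            | into $destination;
```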
eval Command: Dynamic Field Transformation
The eval command creates new fields or modifies existing ones using mathematical operations, string functions, and conditional logic. This versatile command maintains SPL compatibility while supporting SPL2 syntax enhancements.
Syntax:
| eval <field> = <expression> [, <field> = <expression>] ...
Practical Example:
| eval port = if(port == "", "No Port Found", port)
In this case, we are filling any empty values for our “port” field with the string “No Port Found”. This ensures that any user who searches this data has the added context when looking at this field.
Common Use Cases:
- Data normalization and standardization
- Field concatenation for composite keys
- Hash generation for data privacy
- Mathematical calculations and aggregations
- Conditional field population for data quality
User Experience Benefit: Populate missing fields with meaningful defaults to improve search result clarity.
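As a short sketch combining two of these use cases, the pipeline below fills a missing value and hashes a field for privacy; the user field and the md5 function are illustrative assumptions, so confirm the field exists and the hash function is available in your SPL2 environment:

```spl2
$pipeline = | from $source
            | eval port = if(port == "", "No Port Found", port),
                   user_hash = md5(user)
            | into $destination;
```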
route Command: Intelligent Data Branching
The route command creates conditional processing paths within your pipeline, enabling sophisticated data routing based on field values or complex expressions.
Syntax:
| route <predicate>, [<processing commands> | into <destination>]
Advanced Implementation:
| route network_type == "VPN", [| rex mode=sed field=_raw "s/SRC=\d+\.\d+\.\d+\.\d+/SRC=XXX.XXX.XXX.XXX/g" | into $destination2]
Here, we want to send a subset of our data, events from our VPN network users, to a separate location. Before we do, we want to mask the source IP addresses of those users. The route command matches any event whose network_type field equals "VPN" and sends it through the masking rex command to the second destination instead.
Strategic Advantages:
- Conditional processing reduces computational overhead
- Data segregation for compliance and security requirements
- Multi-tier architectures with specialized data paths
- Performance optimization through selective transformations
Real-World Scenario: Route VPN traffic to a separate destination with IP address masking for privacy compliance, while standard traffic flows to primary indexes unchanged.
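Putting the scenario together, here is a hedged end-to-end sketch: VPN events are masked and routed to a secondary destination, while all other events flow to the primary one. The network_type field and the masking pattern mirror the example above and should be adapted to your data:

```spl2
$pipeline = | from $source
            | route network_type == "VPN", [
                | rex mode=sed field=_raw "s/SRC=\d+\.\d+\.\d+\.\d+/SRC=XXX.XXX.XXX.XXX/g"
                | into $destination2
              ]
            | into $destination;
```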
Pipeline Development Best Practices
#1: Performance Optimization:
- Place route commands after general field extractions
- Use specific source filters to minimize data volume
- Test regex patterns with sample data before production deployment
#2: Operational Efficiency:
- Implement consistent field naming conventions
- Document complex regex patterns for team collaboration
- Monitor pipeline performance metrics regularly
Implementation & Next Steps
Cost Optimization Results
Organizations implementing SPL2 Edge Processor pipelines typically achieve:
- Reduced provisioning for indexer infrastructure
- Faster data onboarding for new sources
- Simplified compliance workflows through automated data handling
- Reduced operational overhead via centralized pipeline management
Getting Started:
- Identify high-volume data sources that would benefit from distributed processing
- Leverage the Pipeline Editor for rapid prototype development
- Start with simple transformations before implementing complex routing logic
- Monitor performance metrics to quantify infrastructure improvements
Conclusion
SPL2 and Edge Processor pipelines represent a fundamental shift toward more efficient, scalable, and cost-effective data processing architectures. While the learning curve requires investment in new syntax patterns and command structures, the performance improvements and operational benefits deliver immediate ROI.
Ready to optimize your Splunk deployment? Start by identifying your highest-volume data sources and experiment with SPL2 pipeline prototypes using the built-in Pipeline Editor. The combination of real-time validation, distributed processing, and sophisticated routing capabilities will transform how your organization handles enterprise data at scale.
To access more Splunk searches, check out Atlas Search Library, which is part of the Atlas Platform. Specifically, Atlas Search Library offers a curated list of optimized searches. These searches empower Splunk users without requiring SPL knowledge. Furthermore, you can create, customize, and maintain your own search library. By doing so, you ensure your users get the most from using Splunk.