Splunk Quick Reference Guide
This guide describes key concepts and features, as well as commonly used commands and functions, for Splunk Cloud and Splunk Enterprise.

Concepts

Field Extractor
Use the Field Extractor tool to automatically generate and validate field extractions at search time using regular expressions or delimiters such as spaces, commas, or other characters.

Tags
A tag is a knowledge object that enables you to search for events that contain particular field values.

Alerts
Alerts are triggered when search results meet specific conditions. You can use alerts on historical and real-time searches. Alerts can be configured to trigger actions, such as sending alert information to designated email addresses or posting alert information to a web resource.
Common Eval Functions

The eval command calculates an expression and puts the resulting value into a field (e.g., "... | eval force = mass * acceleration"). The list below shows some of the functions used with the eval command. You can also use basic arithmetic operators (+ - * / %), string concatenation (e.g., "... | eval name = last . "," . first"), and Boolean operations (AND OR NOT XOR < > <= >= != = == LIKE).

abs(X) Returns the absolute value of X. Example: abs(number)
case(X,"Y",…) Takes pairs of arguments X and Y, where the X arguments are Boolean expressions. When an X expression evaluates to TRUE, the corresponding Y argument is returned. Example: case(error == 404, "Not found", error == 500, "Internal Server Error", error == 200, "OK")
Common Stats Functions

Common statistical functions used with the chart, stats, and timechart commands. Field names can be wildcarded, so avg(*delay) might calculate the average of the delay and xdelay fields.
avg(X) Returns the average of the values of field X.
count(X) Returns the number of occurrences of the field X. To indicate a specific field value to match, format X as eval(field="value").
max(X) Returns the maximum value of the field X. If the values of X are non-numeric, the max is found from alphabetical ordering.
min(X) Returns the minimum value of the field X. If the values of X are non-numeric, the min is found from alphabetical ordering.
perc<X>(Y) Returns the X-th percentile value of the field Y. For example, perc5(total) returns the 5th percentile value of a field "total".
range(X) Returns the difference between the max and min values of the field X.
sumsq(X) Returns the sum of the squares of the values of the field X.
values(X) Returns the list of all distinct values of the field X as a multi-value entry. The order of the values is alphabetical.
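As an illustrative sketch (the "bytes", "status", and "host" fields are assumptions, not from this guide), several stats functions can be combined in a single reporting search:

... | stats count, avg(bytes) as avg_bytes, max(bytes) as max_bytes, values(status) as statuses by host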
Common Eval Functions (cont.)

round(X,Y) Returns X rounded to the number of decimal places specified by Y. The default is to round to an integer. Example: round(3.5)
rtrim(X,Y) Returns X with the characters in Y trimmed from the right side. If Y is not specified, spaces and tabs are trimmed. Example: rtrim(" ZZZZabcZZ ", " Z")
split(X,"Y") Returns X as a multi-valued field, split by delimiter Y. Example: split(address, ";")
validate(X,Y,…) Given pairs of arguments, Boolean expressions X and strings Y, returns the string Y corresponding to the first expression X that evaluates to False, and defaults to NULL if all are True. Example: validate(isint(port), "ERROR: Port is not an integer", port >= 1 AND port <= 65535, "ERROR: Port is out of range")

Search Examples

Reporting (cont.)

Return the average for each hour of any unique field that ends with the string "lay" (e.g., delay, xdelay, relay): … | stats avg(*lay) by date_hour
Return the 20 most common values of the "url" field: … | top limit=20 url
Return the least common values of the "url" field: … | rare url

Advanced Reporting

Compute the overall average duration and add 'avgdur' as a new field to each event where the 'duration' field exists: ... | eventstats avg(duration) as avgdur
Find the cumulative sum of bytes: ... | streamstats sum(bytes) as bytes_total | timechart max(bytes_total)
Find anomalies in the field 'Close_Price' during the last 10 years: sourcetype=nasdaq earliest=-10y | anomalydetection Close_Price
Create a chart showing the count of events with a predicted value and range added to each event in the time series: ... | timechart count | predict count
Compute a five-event simple moving average for the field 'count' and write it to a new field 'smoothed_count': ... | timechart count | trendline sma5(count) as smoothed_count

Group Results

Cluster results together, sort by their "cluster_count" values, and then return the 20 largest clusters (in data size): … | cluster t=0.9 showcount=true | sort limit=20 -cluster_count
Group results that have the same "host" and "cookie", occur within 30 seconds of each other, and do not have a pause greater than 5 seconds between each event into a transaction: … | transaction host cookie maxspan=30s maxpause=5s
Group results with the same IP address (clientip) where the first result contains "signon" and the last result contains "purchase": … | transaction clientip startswith="signon" endswith="purchase"
Order Results

Return the first 20 results: … | head 20
Reverse the order of a result set: … | reverse
Sort results by "ip" value (in ascending order) and then by "url" value (in descending order): … | sort ip, -url
Return the last 20 results in reverse order: … | tail 20

Metrics

List all of the metric names in the "_metrics" metric index: | mcatalog values(metric_name) WHERE index=_metrics
See examples of the metric data points stored in the "_metrics" metric index: | mpreview index=_metrics target_per_timeseries=5
Return the average value of a metric in the "_metrics" metric index, bucketing the results into 30-second time spans: | mstats avg(aws.ec2.CPUUtilization) WHERE index=_metrics span=30s
Splunk's machine learning capabilities enhance data analysis and prediction by providing integrated tools such as the Splunk Machine Learning Toolkit, Streaming ML framework, and Machine Learning Environment. These tools allow users to create predictive models, perform anomaly detection, and automate data insights, which adds significant depth to the analysis of large data sets. By incorporating machine learning, Splunk empowers users to derive predictive insights and detect patterns that are not immediately apparent.
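As a hedged illustration only (the algorithm, field, and model names below are assumptions, not from this guide), the Machine Learning Toolkit extends SPL with commands such as fit and apply:

... | fit DensityFunction response_time into response_time_model

A later search can then score new events against the saved model:

... | apply response_time_model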
In a distributed Splunk environment, the search head acts as the component that directs search requests to multiple search peers or indexers. It handles query management, delegating specific data queries to appropriate indexers (search peers), and retrieving and merging results for the user. This architecture separates the searching functionality from data storage, enabling scalable and efficient processing of distributed data while maintaining response accuracy.
Alerts in Splunk can be configured to automate monitoring and reactive actions by setting them up to trigger when search results meet specific conditions. Alerts can be applied to historical or real-time searches and can initiate actions such as sending alert information via email or posting to a web resource. This automation enables users to proactively manage incidents and respond promptly to critical events without manual intervention.
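A minimal hedged sketch of such an alert in savedsearches.conf (the stanza name, search, threshold, and recipient are assumptions; alerts are typically created in Splunk Web instead):

[Hypothetical 500-errors alert]
search = index=web sourcetype=access_combined status=500
enableSched = 1
cron_schedule = */15 * * * *
dispatch.earliest_time = -15m
dispatch.latest_time = now
counttype = number of events
relation = greater than
quantity = 10
actions = email
action.email.to = [email protected]

This schedules the search every 15 minutes and sends an email when more than 10 matching events are found.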
Datasets in Splunk enable efficient data management by allowing the creation and maintenance of structured data collections like lookups, data models, and table datasets. These datasets provide a curated and focused collection of event data designed for specific business purposes, which aids in optimizing searches by streamlining the data to be processed. For instance, data models, a type of dataset, can be accelerated to improve search performance, making them integral to powering dashboards and generating on-demand reports efficiently.
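As a hedged sketch (the Web data model and its fields follow the Common Information Model, but this particular search is an assumption, not from this guide), an accelerated data model can be queried with tstats for fast, summary-backed reporting:

| tstats count from datamodel=Web where Web.status=404 by Web.url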
Dashboards in Splunk enhance data visualization and interactivity by allowing users to compile panels that contain modules, such as search boxes, fields, and data visualizations. They are connected to saved searches, displaying results from completed searches and supporting data from real-time queries. This feature enables users to interactively explore data patterns and trends, making data analysis more intuitive and actionable.
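A minimal hedged Simple XML sketch of a one-panel dashboard (the index, search, and label are assumptions, not from this guide):

<dashboard>
  <label>Hypothetical Web Errors</label>
  <row>
    <panel>
      <chart>
        <search>
          <query>index=web status>=500 | timechart count by status</query>
          <earliest>-24h</earliest>
          <latest>now</latest>
        </search>
      </chart>
    </panel>
  </row>
</dashboard>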
SPL2 offers several advantages over its predecessor SPL, including improved usability with a more consistent command syntax and the removal of infrequently used commands. This simplifies writing searches and reduces the learning curve for new users, making the language more accessible. SPL2 enhances search effectiveness through a clearer and more uniform structure, ensuring that commands are easier to understand and use.
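Exact SPL2 syntax varies by product surface, so the following is a rough, hedged sketch only; an SPL2 search might read:

from main | where status >= 500 | stats count() by host

where the explicit from clause and function-style aggregations such as count() illustrate the more uniform structure.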
Forwarders and indexers are crucial components in Splunk's data handling process. Forwarders collect data from clients and send it to indexers for processing and storage. Indexers then transform the raw data into events, apply necessary parsing, and store these events in indexes. Furthermore, indexers handle search requests by retrieving relevant data from the indexes. This division of labor ensures efficient data ingestion and retrieval.
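A minimal hedged sketch of a forwarder's outputs.conf pointing at two hypothetical indexers (the hostnames are assumptions; 9997 is the conventional receiving port):

[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = idx1.example.com:9997, idx2.example.com:9997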
Common eval functions in Splunk enhance data transformation by providing robust capabilities to calculate expressions, manipulate string data, and perform numeric conversions. These functions allow for operations such as computing mathematical expressions, formatting strings, and altering data presentation, thus enabling users to adapt raw data into forms suitable for analysis and visualization. By using eval functions, data can be dynamically transformed and enriched, which enhances the overall analytical potential of Splunk.
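As an illustrative sketch (the field names are assumptions, not from this guide), a single eval can chain numeric and string transformations:

... | eval kb = round(bytes / 1024, 2), method = upper(method), label = host . ":" . sourcetype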
Optimizing search performance in a Splunk environment involves several strategies: limiting the dataset to be pulled from disk by partitioning it into distinct indexes, specifying narrow time ranges to reduce data scope, and using precise search terms to filter data effectively. Additional methods include employing post-processing searches in dashboards, leveraging summary indexing, and utilizing data model acceleration. These techniques collectively minimize data handling and processing, resulting in faster and more efficient searches.
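For example (the index, source type, and field names are assumptions), a search that names the index, a narrow time range, and precise filter terms up front lets Splunk discard most events before heavier processing:

index=web sourcetype=access_combined status=404 earliest=-1h latest=now | stats count by uri_path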
Index-time processing in Splunk involves reading data from a source, classifying it into a source type, extracting timestamps, and parsing data into individual events, which are then stored in an index on disk. This process ensures that data is prepared for quick retrieval during searches. On the other hand, search-time processing occurs when a search is initiated; indexed events are retrieved, and fields are extracted from the raw text of these events. The importance lies in how index-time processing prepares data for quick access and transformation during searches, while search-time processing enables dynamic extraction and analysis of data as needed.
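A hedged props.conf sketch contrasting the two stages for a hypothetical source type (all names and patterns are assumptions, not from this guide):

[hypothetical:applog]
# Index time: event breaking and timestamp recognition
SHOULD_LINEMERGE = false
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S

# Search time: fields extracted from _raw when a search runs
EXTRACT-level = level=(?<level>[A-Z]+)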