QUICK REFERENCE GUIDE

Concepts

Events
An event is a set of values associated with a timestamp. It is a single entry of data and can have one or multiple lines. An event can be a text document, a configuration file, an entire stack trace, and so on. This is an example of an event in a web activity log:
173.26.34.223 - - [01/Mar/2015:12:05:27 -0700] "GET /trade/app?action=logout HTTP/1.1" 200 2953
You can also define transactions to search for and group together events that are conceptually related but span a duration of time. Transactions can represent a multistep business-related activity, such as all events related to a single customer session on a retail website.

Metrics
A metric consists of a timestamp, metric name, measure, and dimensions. A measure is a numeric data point, while dimensions help categorize these data points. Sample metric:
Timestamp: 01/Aug/2017 12:05:27
Metric Name: os.cpu.user
Measure: 42.12345
Dimensions: hq:us-west-1, hq:us-east-1
Metrics and events can be searched and correlated together but are stored in different indexes.

Host, Source, and Source Type
A host is the name of the physical or virtual device where an event originates. It can be used to find all data originating from a specific device. A source is the name of the file, directory, data stream, or other input from which a particular event originates. Sources are classified into source types, which can be either well-known formats or formats defined by the user. Some common source types are HTTP web server logs and Windows event logs.
Events with the same source type can come from different sources. For example, events from the file source=/var/log/messages and from a syslog input port source=UDP:514 often share the source type, sourcetype=linux_syslog.

Fields
Fields are searchable name and value pairings that distinguish one event from another. Not all events have the same fields and field values. Using fields, you can write tailored searches to retrieve the specific events that you want. When Splunk software processes events at index-time and search-time, the software extracts fields based on configuration file definitions and user-defined patterns.
Use the Field Extractor tool to automatically generate and validate field extractions at search-time using regular expressions or delimiters such as spaces, commas, or other characters.

Tags
A tag is a knowledge object that enables you to search for events that contain particular field values. You can assign one or more tags to any field/value combination, including event types, hosts, sources, and source types. Use tags to group related field values together, or to track abstract field values such as IP addresses or ID numbers by giving them more descriptive names.

Index-Time and Search-Time
During index-time processing, data is read from a source on a host and is classified into a source type. Timestamps are extracted, and the data is parsed into individual events. Line-breaking rules are applied to segment the events to display in the search results. Each event is written to an index on disk, where the event is later retrieved with a search request.
When a search starts, referred to as search-time, indexed events are retrieved from disk. Fields are extracted from the raw text for the event.

Indexes
When data is added, Splunk software parses the data into individual events, extracts the timestamp, applies line-breaking rules, and stores the events in an index. You can create new indexes for different inputs. By default, data is stored in the "main" index. Events are retrieved from one or more indexes during a search.

Core Features

Search
Search is the primary way users navigate data in Splunk software. You can write a search to retrieve events from an index, use statistical commands to calculate metrics and generate reports, search for specific conditions within a rolling time window, identify patterns in your data, predict future trends, and so on. You transform the events using the Splunk Search Processing Language (SPL™). Searches can be saved as reports and used to power dashboards.

Reports
Reports are saved searches and pivots. You can run reports on an ad hoc basis, schedule reports to run on a regular interval, or set a scheduled report to generate alerts when the results meet particular conditions. Reports can be added to dashboards as dashboard panels.

Dashboards
Dashboards are made up of panels that contain modules such as search boxes, fields, and data visualizations. Dashboard panels are usually connected to saved searches or pivots. They can display the results of completed searches, as well as data from real-time searches.

Alerts
Alerts are triggered when search results meet specific conditions. You can use alerts on historical and real-time searches. Alerts can be configured to trigger actions such as sending alert information to designated email addresses or posting alert information to a web resource.

Additional Features

Data Model
A data model is a hierarchically-organized collection of datasets that Pivot uses to generate reports. Data model objects represent individual datasets, which the data model is composed of.

Pivot
Pivot refers to the table, chart, or other visualization you create using the Pivot Editor. You can map attributes defined by data model objects to data visualizations, without manually writing the searches. Pivots can be saved as reports and used to power dashboards.

Apps
Apps are a collection of configurations, knowledge objects, and customer-designed views and dashboards. Apps extend the Splunk environment to fit the specific needs of organizational teams such as Unix or Windows system administrators, network security specialists, website managers, business analysts, and so on. A single Splunk Enterprise or Splunk Cloud installation can run multiple apps simultaneously.

Distributed Search
A distributed search provides a way to scale your deployment by separating the search management and presentation layer from the indexing and search retrieval layer. You use distributed search to facilitate horizontal scaling for enhanced performance, to control access to indexed data, and to manage geographically dispersed data.

Splunk Components

Forwarders
A Splunk instance that forwards data to another Splunk instance is referred to as a forwarder.

Indexer
An indexer is the Splunk instance that indexes data. The indexer transforms the raw data into events and stores the events into an index. The indexer also searches the indexed data in response to search requests. The search peers are indexers that fulfill search requests from the search head.

Search Head
In a distributed search environment, the search head is the Splunk instance that directs search requests to a set of search peers and merges the results back to the user. If the instance does only search and not indexing, it is usually referred to as a dedicated search head.
Search Processing Language

A Splunk search is a series of commands and arguments. Commands are chained together with a pipe "|" character to indicate that the output of one command feeds into the next command on the right.
search | command1 arguments1 | command2 arguments2 | ...
At the start of the search pipeline is an implied search command to retrieve events from the index. Search requests are written with keywords, quoted phrases, Boolean expressions, wildcards, field name/value pairs, and comparison expressions. The AND operator is implied between search terms. For example:
sourcetype=access_combined error | top 5 uri
This search retrieves indexed web activity events that contain the term "error". For those events, it returns the top 5 most common URI values.
Search commands are used to filter unwanted events, extract more information, calculate values, transform, and statistically analyze the indexed data. Think of the search results retrieved from the index as a dynamically created table. Each indexed event is a row. The field values are columns. Each search command redefines the shape of that table. For example, search commands that filter events will remove rows, and search commands that extract fields will add columns.

Subsearches
A subsearch runs its own search and returns the results to the parent command as the argument value. The subsearch is run first and is contained in square brackets. For example, the following search uses a subsearch to find all syslog events from the user that had the last login error:
sourcetype=syslog [ search login error | return 1 user ]

Optimizing Searches
The key to fast searching is to limit the data that needs to be pulled off disk to an absolute minimum. Then filter that data as early as possible in the search so that processing is done on the minimum data necessary.
Partition data into separate indexes, if you will rarely perform searches across multiple types of data. For example, put web data in one index, and firewall data in another.
Limit the time range to only what is needed. For example -1h not -1w, or earliest=-1d.
Search as specifically as you can. For example, fatal_error not *error*
Filter out results as soon as possible before calculations. Use field-value pairs before the first pipe. For example, ERROR status=404 | … instead of ERROR | search status=404 … Or use filtering commands such as where.
Filter out unnecessary fields as soon as possible in the search.
Postpone commands that process over the entire result set (non-streaming commands) as late as possible in your search. Some of these commands are: dedup, sort, and stats.
Use post-processing searches in dashboards.
Use summary indexing, and report and data model acceleration features.

Machine Learning Commands
The Machine Learning Toolkit delivers additional SPL commands that you can use to apply machine learning to your data. Find out more in the Machine Learning Quick Reference Guide.

Common Search Commands
Command          Description
chart/timechart  Returns results in a tabular output for (time-series) charting.
dedup            Removes subsequent results that match a specified criterion.
eval             Calculates an expression. See COMMON EVAL FUNCTIONS.
fields           Removes fields from search results.
head/tail        Returns the first/last N results.
lookup           Adds field values from an external source.
rename           Renames a field. Use wildcards to specify multiple fields.
rex              Specifies regular expression named groups to extract fields.
search           Filters results to those that match the search expression.
sort             Sorts the search results by the specified fields.
stats            Provides statistics, grouped optionally by fields. See COMMON STATS FUNCTIONS.
mstats           Similar to stats but used on metrics instead of events.
table            Specifies fields to keep in the result set. Retains data in tabular format.
top/rare         Displays the most/least common values of a field.
transaction      Groups search results into transactions.
where            Filters search results using eval expressions. Used to compare two different fields.

Time Modifiers
You can specify a time range to retrieve events inline with your search by using the latest and earliest search modifiers. The relative times are specified with a string of characters to indicate the amount of time (integer and unit) and an optional "snap to" time unit. The syntax is:
[+|-]<integer><unit>@<snap_time_unit>
The search "error earliest=-1d@d latest=-h@h" retrieves events containing "error" that occurred yesterday, snapping to the beginning of the day (00:00:00), and through to the most recent hour of today, snapping on the hour.
The snap to time unit rounds the time down. For example, if it is 11:59:00 and you snap to hours (@h), the time used is 11:00:00 not 12:00:00. You can also snap to specific days of the week using @w0 for Sunday, @w1 for Monday, and so on.
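The snap-to rule from Time Modifiers can be illustrated outside Splunk. The following is a minimal Python sketch, not Splunk code: the helper name snap and the sample clock time are assumptions made for illustration, and only the @h, @d, and @w<n> units are handled. It applies the offset first and then rounds down, as described for "earliest=-1d@d latest=-h@h".

```python
from datetime import datetime, timedelta


def snap(t: datetime, unit: str) -> datetime:
    """Round t DOWN to the start of the given unit (hypothetical helper, not a Splunk API)."""
    if unit == "h":
        return t.replace(minute=0, second=0, microsecond=0)
    midnight = t.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "d":
        return midnight
    if unit.startswith("w") and unit[1:].isdigit():
        # @w0 is Sunday; Python's weekday() numbers Monday as 0, so shift by one.
        target = int(unit[1:])
        days_back = (t.weekday() + 1 - target) % 7
        return midnight - timedelta(days=days_back)
    raise ValueError(f"unsupported snap unit: {unit}")


# Assumed "current time" for the illustration: Wednesday 2015-03-04 11:59:00.
now = datetime(2015, 3, 4, 11, 59, 0)

earliest = snap(now - timedelta(days=1), "d")   # -1d@d: yesterday at 00:00:00
latest = snap(now - timedelta(hours=1), "h")    # -h@h: one hour ago, on the hour

print(earliest, latest, snap(now, "h"), snap(now, "w0"))
```

Snapping never rounds up: with the assumed clock of 11:59:00, snap(now, "h") gives 11:00:00 rather than 12:00:00.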
www.splunk.com
docs.splunk.com

Splunk Inc.
270 Brannan Street
San Francisco, CA 94107

Copyright © 2017 Splunk Inc. All rights reserved. Splunk, Splunk>, Listen to Your Data,
The Engine for Machine Data, Splunk Cloud, Splunk Light and SPL are trademarks and
registered trademarks of Splunk Inc. in the United States and other countries. All other
brand names, product names, or trademarks belong to their respective owners.
Common Eval Functions
The eval command calculates an expression and puts the resulting value into a field (e.g. "...| eval force = mass * acceleration"). The following table lists some of the functions used with the eval command. You can also use basic arithmetic operators (+ - * / %), string concatenation (e.g., "...| eval name = last . "," . first"), and Boolean operations (AND OR NOT XOR < > <= >= != = == LIKE).
Function Description Examples
abs(X) Returns the absolute value of X. abs(number)
case(X,"Y",…) Takes pairs of arguments X and Y, where X arguments are Boolean expressions. When evaluated to TRUE, the arguments return the corresponding Y argument. case(error == 404, "Not found", error == 500, "Internal Server Error", error == 200, "OK")
ceil(X) Ceiling of a number X. ceil(1.9)
cidrmatch("X",Y) Identifies IP addresses that belong to a particular subnet. cidrmatch("123.132.32.0/25",ip)
coalesce(X,…) Returns the first value that is not null. coalesce(null(), "Returned val", null())
cos(X) Calculates the cosine of X. n=cos(0)

exact(X) Evaluates an expression X using double precision floating point arithmetic. exact(3.14*num)
exp(X) Returns e^X. exp(3)
if(X,Y,Z) If X evaluates to TRUE, the result is the second argument Y. If X evaluates to FALSE, the result evaluates to the third argument Z. if(error==200, "OK", "Error")
isbool(X) Returns TRUE if X is Boolean. isbool(field)
isint(X) Returns TRUE if X is an integer. isint(field)
isnull(X) Returns TRUE if X is NULL. isnull(field)
isstr(X) Returns TRUE if X is a string. isstr(field)
len(X) This function returns the character length of a string X. len(field)
like(X,"Y") Returns TRUE if and only if X is like the SQLite pattern in Y. like(field, "addr%")
log(X,Y) Returns the log of the first argument X using the second argument Y as the base. Y defaults to 10. log(number,2)
lower(X) Returns the lowercase of X. lower(username)
ltrim(X,Y) Returns X with the characters in Y trimmed from the left side. Y defaults to spaces and tabs. ltrim(" ZZZabcZZ ", " Z")
match(X,Y) Returns TRUE if X matches the regex pattern Y. match(field, "^\d{1,3}\.\d$")
max(X,…) Returns the maximum. max(delay, mydelay)
md5(X) Returns the MD5 hash of a string value X. md5(field)
min(X,…) Returns the minimum. min(delay, mydelay)
mvcount(X) Returns the number of values of X. mvcount(multifield)
mvfilter(X) Filters a multi-valued field based on the Boolean expression X. mvfilter(match(email, "net$"))
mvindex(X,Y,Z) Returns a subset of the multivalued field X from start position (zero-based) Y to Z (optional). mvindex(multifield, 2)
mvjoin(X,Y) Given a multi-valued field X and string delimiter Y, joins the individual values of X using Y. mvjoin(address, ";")
now() Returns the current time, represented in Unix time. now()
null() This function takes no arguments and returns NULL. null()
nullif(X,Y) Given two arguments, fields X and Y, returns X if the arguments are different. Otherwise returns NULL. nullif(fieldA, fieldB)
random() Returns a pseudo-random number ranging from 0 to 2147483647. random()
relative_time(X,Y) Given epochtime time X and relative time specifier Y, returns the epochtime value of Y applied to X. relative_time(now(),"-1d@d")
replace(X,Y,Z) Returns a string formed by substituting string Z for every occurrence of regex string Y in string X. Returns date with the month and day numbers switched, so if the input was 4/30/2015 the return value would be 30/4/2015: replace(date, "^(\d{1,2})/(\d{1,2})/", "\2/\1/")
round(X,Y) Returns X rounded to the amount of decimal places specified by Y. The default is to round to an integer. round(3.5)
rtrim(X,Y) Returns X with the characters in Y trimmed from the right side. If Y is not specified, spaces and tabs are trimmed. rtrim(" ZZZZabcZZ ", " Z")
searchmatch(X) Returns true if the event matches the search string X. searchmatch("foo AND bar")
split(X,"Y") Returns X as a multi-valued field, split by delimiter Y. split(address, ";")
sqrt(X) Returns the square root of X. sqrt(9)
strftime(X,Y) Returns epochtime value X rendered using the format specified by Y. strftime(_time, "%H:%M")
strptime(X,Y) Given a time represented by a string X, returns value parsed from format Y. strptime(timeStr, "%H:%M")
substr(X,Y,Z) Returns a substring field X from start position (1-based) Y for Z (optional) characters. substr("string", 1, 3)
time() Returns the wall-clock time with microsecond resolution. time()
tonumber(X,Y) Converts input string X to a number, where Y (optional, defaults to 10) defines the base of the number to convert to. tonumber("0A4",16)
tostring(X,Y) Returns a field value of X as a string. If the value of X is a number, it reformats it as a string. If X is a Boolean value, reformats to "True" or "False". If X is a number, the second argument Y is optional and can either be "hex" (convert X to hexadecimal), "commas" (formats X with commas and 2 decimal places), or "duration" (converts seconds X to readable time format HH:MM:SS). This example returns foo=615 and foo2=00:10:15: … | eval foo=615 | eval foo2 = tostring(foo, "duration")
typeof(X) Returns a string representation of the field type. This example returns "NumberStringBoolInvalid": typeof(12)+ typeof("string")+
urldecode(X) Returns the URL X decoded. urldecode("http%3A%2F%2Fwww.splunk.com%2Fdownload%3Fr%3Dheader")
validate(X,Y,…) Given pairs of arguments, Boolean expressions X and strings Y, returns the string Y corresponding to the first expression X that evaluates to False and defaults to NULL if all are True. validate(isint(port), "ERROR: Port is not an integer", port >= 1 AND port <= 65535, "ERROR: Port is out of range")
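The "duration" conversion in the tostring example (615 seconds becoming 00:10:15) is plain integer arithmetic. The following Python sketch shows that arithmetic only. The function name duration_string is hypothetical, and the sketch assumes whole seconds below one day; it does not claim to reproduce how Splunk renders longer durations.

```python
def duration_string(seconds: int) -> str:
    """Format whole seconds as HH:MM:SS (illustrative helper, not a Splunk function)."""
    hours, remainder = divmod(seconds, 3600)   # 3600 seconds per hour
    minutes, secs = divmod(remainder, 60)      # 60 seconds per minute
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


foo = 615
foo2 = duration_string(foo)
print(foo, foo2)  # 615 00:10:15
```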

Common Stats Functions
Common statistical functions used with the chart, stats, and timechart commands. Field names can be wildcarded, so avg(*delay) might calculate the average of the delay and xdelay fields.
avg(X) Returns the average of the values of field X.
count(X) Returns the number of occurrences of the field X. To indicate a specific field value to match, format X as eval(field="value").
dc(X) Returns the count of distinct values of the field X.
earliest(X) Returns the chronologically earliest seen value of X.
latest(X) Returns the chronologically latest seen value of X.
max(X) Returns the maximum value of the field X. If the values of X are non-numeric, the max is found from alphabetical ordering.
median(X) Returns the middle-most value of the field X.
min(X) Returns the minimum value of the field X. If the values of X are non-numeric, the min is found from alphabetical ordering.
mode(X) Returns the most frequent value of the field X.
perc<X>(Y) Returns the X-th percentile value of the field Y. For example, perc5(total) returns the 5th percentile value of a field "total".
range(X) Returns the difference between the max and min values of the field X.
stdev(X) Returns the sample standard deviation of the field X.
stdevp(X) Returns the population standard deviation of the field X.
sum(X) Returns the sum of the values of the field X.
sumsq(X) Returns the sum of the squares of the values of the field X.
values(X) Returns the list of all distinct values of the field X as a multi-value entry. The order of the values is alphabetical.
var(X) Returns the sample variance of the field X.
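The difference between the sample and population variants (stdev and var versus stdevp) is easy to miss. The following Python sketch computes several of the listed statistics on a small made-up list of values using the standard statistics module. The data and the dictionary keys are assumptions chosen to mirror the function names above; this illustrates the definitions and is not Splunk's implementation.

```python
import statistics

# Made-up values standing in for a numeric field such as "delay".
delays = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "avg": statistics.mean(delays),        # arithmetic mean
    "dc": len(set(delays)),                # count of distinct values
    "range": max(delays) - min(delays),    # max minus min
    "sum": sum(delays),
    "sumsq": sum(x * x for x in delays),   # sum of squares
    "stdev": statistics.stdev(delays),     # sample: divides by n - 1
    "stdevp": statistics.pstdev(delays),   # population: divides by n
    "var": statistics.variance(delays),    # sample variance
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

For these values the population standard deviation is 2.0, while the sample standard deviation is slightly larger because it divides by n - 1 instead of n.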
Search Examples

Filter Results
Returns X rounded to the amount of decimal places specified by Y. The default is to round to an integer.
    round(3.5)
Returns X with the characters in Y trimmed from the right side. If Y is not specified, spaces and tabs are trimmed.
    rtrim(" ZZZZabcZZ ", " Z")
Returns true if the event matches the search string X.
    searchmatch("foo AND bar")
Returns X as a multi-valued field, split by delimiter Y.
    split(address, ";")
Given pairs of arguments, Boolean expressions X and strings Y, returns the string Y corresponding to the first expression X that evaluates to False and defaults to NULL if all are True.
    validate(isint(port), "ERROR: Port is not an integer", port >= 1 AND port <= 65535, "ERROR: Port is out of range")

Group Results
Cluster results together, sort by their "cluster_count" values, and then return the 20 largest clusters (in data size).
    … | cluster t=0.9 showcount=true | sort limit=20 -cluster_count
Group results that have the same "host" and "cookie", occur within 30 seconds of each other, and do not have a pause greater than 5 seconds between each event into a transaction.
    … | transaction host cookie maxspan=30s maxpause=5s
Group results with the same IP address (clientip) and where the first result contains "signon", and the last result contains "purchase".
    … | transaction clientip startswith="signon" endswith="purchase"

Order Results
Return the first 20 results.
    … | head 20
Reverse the order of a result set.
    … | reverse
Sort results by "ip" value (in ascending order) and then by "url" value (in descending order).
    … | sort ip, -url
Return the last 20 results in reverse order.
    … | tail 20

Reporting
Return the average and count using a 30 second span of all metrics ending in cpu.percent split by each metric name.
    | mstats avg(_value), count(_value) WHERE metric_name="*.cpu.percent" by metric_name span=30s
Return max(delay) for each value of foo split by the value of bar.
    … | chart max(delay) over foo by bar
Return max(delay) for each value of foo.
    … | chart max(delay) over foo
Count the events by "host".
    … | stats count by host
Create a table showing the count of events and a small line chart.
    … | stats sparkline count by host
Create a timechart of the count of events from "web" sources by "host".
    … | timechart count by host
Calculate the average value of "CPU" each minute for each "host".
    … | timechart span=1m avg(CPU) by host
Return the average for each hour of any unique field that ends with the string "lay" (e.g., delay, xdelay, relay, etc).
    … | stats avg(*lay) by date_hour
Return the 20 most common values of the "url" field.
    … | top limit=20 url
Return the least common values of the "url" field.
    … | rare url

Advanced Reporting
Compute the overall average duration and add 'avgdur' as a new field to each event where the 'duration' field exists.
    ... | eventstats avg(duration) as avgdur
Find the cumulative sum of bytes.
    ... | streamstats sum(bytes) as bytes_total | timechart max(bytes_total)
Find anomalies in the field 'Close_Price' during the last 10 years.
    sourcetype=nasdaq earliest=-10y | anomalydetection Close_Price
Create a chart showing the count of events with a predicted value and range added to each event in the time-series.
    ... | timechart count | predict count
Compute a five-event simple moving average for field 'count' and write it to the new field 'smoothed_count'.
    ... | timechart count | trendline sma5(count) as smoothed_count

Add Fields
Set velocity to distance / time.
    … | eval velocity=distance/time
Extract "from" and "to" fields using regular expressions. If a raw event contains "From: Susan To: David", then from=Susan and to=David.
    … | rex field=_raw "From: (?<from>.*) To: (?<to>.*)"
Save the running total of "count" in a field called "total_count".
    … | accum count as total_count
For each event where 'count' exists, compute the difference between count and its previous value and store the result in 'countdiff'.
    … | delta count as countdiff

Filter Fields
Keep only the "host" and "ip" fields, and display them in that order.
    … | fields + host, ip
Remove the "host" and "ip" fields from the results.
    … | fields - host, ip
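The rex example under Add Fields relies on named capture groups. The same extraction can be tried outside Splunk with Python's re module, as in the minimal sketch below. Python spells named groups (?P<name>...) rather than (?<name>...), and the variable names here are assumptions for illustration.

```python
import re

# Raw event text taken from the rex example.
raw = "From: Susan To: David"

# Same pattern as the rex example, written with Python's (?P<name>...) syntax.
pattern = re.compile(r"From: (?P<from>.*) To: (?P<to>.*)")

match = pattern.search(raw)
fields = match.groupdict() if match else {}

print(fields)  # {'from': 'Susan', 'to': 'David'}
```

The first .* is greedy, but the regex engine backtracks until the literal " To: " can match, so "from" ends up holding only "Susan".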
Lookup Tables (Splunk Enterprise only)
For each event, use the lookup table usertogroup to locate the matching "user" value from the event. Output the group field value to the event.
    … | lookup usertogroup user output group
Read in the usertogroup lookup table that is defined in the transforms.conf file.
    … | inputlookup usertogroup
Write the search results to the lookup file "users.csv".
    … | outputlookup users.csv

Modify Fields
Rename the "_ip" field as "IPAddress".
    … | rename _ip as IPAddress

Multi-Valued Fields
Combine the multiple values of the recipients field into a single value.
    … | nomv recipients
Separate the values of the "recipients" field into multiple field values, displaying the top recipients.
    … | makemv delim="," recipients | top recipients
Create new results for each value of the multivalue field "recipients".
    … | mvexpand recipients
Find the number of recipient values.
    … | eval to_count = mvcount(recipients)
Find the first email address in the recipient field.
    … | eval recipient_first = mvindex(recipient,0)
Find all recipient values that end in .net or .org.
    … | eval netorg_recipients = mvfilter(match(recipient,"\.net$") OR match(recipient,"\.org$"))
Find the index of the first recipient value that matches "\.org$".
    … | eval orgindex = mvfind(recipient, "\.org$")

Regular Expressions (Regexes)
Regular Expressions are useful in multiple areas: search commands regex and rex; eval functions match() and replace(); and in field extraction.

Regex          Note                                   Example                        Explanation
\s             white space                            \d\s\d                         digit space digit
\S             not white space                        \d\S\d                         digit non-whitespace digit
\d             digit                                  \d\d\d-\d\d-\d\d\d\d           SSN
\D             not digit                              \D\D\D                         three non-digits
\w             word character (letter, number, or _)  \w\w\w                         three word chars
\W             not a word character                   \W\W\W                         three non-word chars
[...]          any included character                 [a-z0-9#]                      any char that is a thru z, 0 thru 9, or #
[^...]         no included character                  [^xyz]                         any char but x, y, or z
*              zero or more                           \w*                            zero or more word chars
+              one or more                            \d+                            integer
?              zero or one                            \d\d\d-?\d\d-?\d\d\d\d         SSN with dashes being optional
|              or                                     \w|\d                          word or digit character
(?P<var> ...)  named extraction                       (?P<ssn>\d\d\d-\d\d-\d\d\d\d)  pull out a SSN and assign to 'ssn' field
(?: ... )      logical or atomic grouping             (?:[a-zA-Z]|\d)                alphabetic character OR a digit
^              start of line                          ^\d+                           line begins with at least one digit
$              end of line                            \d+$                           line ends with at least one digit
{...}          number of repetitions                  \d{3,5}                        between 3-5 digits
\              escape                                 \[                             escape the [ character

Common Date and Time Formatting
Use these values for eval functions strftime() and strptime(), and for timestamping event data.

Time
%H  24 hour (leading zeros) (00 to 23)
%I  12 hour (leading zeros) (01 to 12)
%M  Minute (00 to 59)
%S  Second (00 to 61)
%N  subseconds with width (%3N = millisecs, %6N = microsecs, %9N = nanosecs)
%p  AM or PM
%Z  Time zone (EST)
%z  Time zone offset from UTC, in hour and minute: +hhmm or -hhmm. (-0500 for EST)
%s  Seconds since 1/1/1970 (1308677092)

Days
%d  Day of month (leading zeros) (01 to 31)
%j  Day of year (001 to 366)
%w  Weekday (0 to 6)
%a  Abbreviated weekday (Sun)
%A  Weekday (Sunday)

Months
%b  Abbreviated month name (Jan)
%B  Month name (January)
%m  Month number (01 to 12)

Years
%y  Year without century (00 to 99)
%Y  Year (2015)

Examples
%Y-%m-%d                  2014-12-31
%y-%m-%d                  14-12-31
%b %d, %Y                 Jan 24, 2015
%B %d, %Y                 January 24, 2015
q|%d %b '%y = %Y-%m-%d|   q|25 Feb '15 = 2015-02-25|

For more info visit: docs.splunk.com
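The last date formatting example (25 Feb '15 corresponding to 2015-02-25) can be checked with Python's datetime module, whose strptime and strftime accept many of the same format codes. The sketch below is illustrative only: Python does not support every code listed above (for example, %N is not available), and the variable names are assumptions.

```python
from datetime import datetime

# Parse the sample timestamp with the format string from the examples table.
parsed = datetime.strptime("25 Feb '15", "%d %b '%y")

# Render the same moment with two of the other listed formats.
iso_date = parsed.strftime("%Y-%m-%d")
long_date = parsed.strftime("%B %d, %Y")

print(iso_date)   # 2015-02-25
print(long_date)  # February 25, 2015
```

The month names produced and accepted by %b and %B depend on the active locale; the output shown assumes Python's default C locale.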

GDE-Splunk-QuickReferenceGuide-115
