Duration: 3 days
Skill Level: Introductory and beyond
Hands-On Format: This hands-on class uses an approximately 50/50 ratio of hands-on labs to lecture, combining engaging lectures, demos, group activities, and
discussions with machine-based practical student labs and project work.
Course Overview
Are you in charge of creating Splunk knowledge objects for your organization? Then you will benefit from this course, which walks you through the various
knowledge objects and how to create them. Working with Splunk is a comprehensive hands-on course that teaches students how to search, navigate, tag,
build alerts, create simple reports and dashboards in Splunk, and how to use Splunk's Pivot interface.
Working in a hands-on learning environment, students will learn how to use Splunk Analytics to provide an efficient way to search large volumes of data.
Students will learn how to run Basic Searches, Save and Share Search Results, Create Tags and Event Types, Create Reports, Create Different Charts, Perform
Calculations and Format Search Data, and Enrich Data with Lookups. Examples center on financial institution scenarios.
Hands on Lab covering: Identify the contents of search results, Refine searches
Use the timeline
Work with events
Hands on Lab covering: Use the timeline, Work with events
Control a search job
Save search results
Hands on Lab covering: Control a search job, Save search results
End of Module Hands-on Quiz
Edit a dashboard
Hands on Lab covering: Add a pivot report to a dashboard, Edit a dashboard.
End of Module Hands-on Quiz
You can split this functionality across multiple specialized instances of Splunk Enterprise, ranging in number from just a few to
thousands, depending on the quantity of data you're dealing with and other variables in your environment. You might, for example,
create a deployment with many instances that only consume data, several other instances that index the data, and one or more
instances that handle search requests. These specialized instances are known collectively as components. There are several types of
components.
For a typical mid-size deployment, for example, you can deploy lightweight versions of Splunk Enterprise, called forwarders, on the
machines where the data originates. The forwarders consume data locally and then forward the data across the network to another
Splunk Enterprise component, called the indexer. The indexer does the heavy lifting; it indexes the data and runs searches. It should
reside on a machine by itself. The forwarders, on the other hand, can easily co-exist on the machines generating the data, because the
data-consuming function has minimal impact on machine performance.
As you scale up, you can add more forwarders and indexers. For a larger deployment, you might have hundreds of forwarders sending
data to a number of indexers. You can use load balancing on the forwarders, so that they distribute their data across some or all of the
indexers. Not only does load balancing help with scaling, but it also provides a fail-over capability if one of the indexers goes down.
The forwarders automatically switch to sending their data to any indexers that remain alive.
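As a sketch of how this load balancing is configured, each forwarder's outputs.conf lists the target indexers (the host names and port below are placeholders for illustration):

    [tcpout:my_indexers]
    server = indexer1.example.com:9997, indexer2.example.com:9997

When more than one server is listed, the forwarder distributes its data across them and, if one indexer becomes unreachable, automatically sends to those that remain.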
In this diagram, each forwarder load-balances its data across two indexers:
These are the fundamental components and features of a Splunk Enterprise distributed environment:
Indexers.
Forwarders.
Search heads.
Deployment server.
Indexer
A Splunk Enterprise instance that indexes data, transforming raw data into events and placing the results into an index. It also
searches the indexed data in response to search requests.
The indexer also frequently performs the other fundamental Splunk Enterprise functions: data input and search management. In
larger deployments, forwarders handle data input and forward the data to the indexer for indexing. Similarly, although indexers
always perform searches across their own data, in larger deployments, a specialized Splunk Enterprise instance, called a search head,
handles search management and coordinates searches across multiple indexers.
Forwarder
A Splunk Enterprise instance that forwards data to another Splunk Enterprise instance, such as an indexer or another forwarder, or to
a third-party system.
There are three types of forwarders:
A universal forwarder is a dedicated, streamlined version of Splunk Enterprise that contains only the essential components
needed to send data.
A heavy forwarder is a full Splunk Enterprise instance, with some features disabled to achieve a smaller footprint.
A light forwarder is a full Splunk Enterprise instance, with most features disabled to achieve a small footprint. The universal
forwarder supersedes the light forwarder for nearly all purposes. The light forwarder has been deprecated as of Splunk
Enterprise version 6.0.0.
The universal forwarder is the best tool for forwarding data to indexers. Its main limitation is that it forwards only unparsed data. To
send event-based data to indexers, you must use a heavy forwarder.
Search Heads
In a distributed search environment, a Splunk Enterprise instance that handles search management functions, directing search
requests to a set of search peers and then merging the results back to the user.
A Splunk Enterprise instance can function as both a search head and a search peer. A search head that performs only searching, and
not any indexing, is referred to as a dedicated search head.
Search head clusters are groups of search heads that coordinate their activities.
Deployment Server
A Splunk Enterprise instance that acts as a centralized configuration manager, grouping together and collectively managing any
number of Splunk Enterprise instances. Instances that are remotely configured by deployment servers are called deployment clients.
The deployment server downloads updated content, such as configuration files and apps, to deployment clients. Units of such content
are known as deployment apps.
This table summarizes the similarities and differences between the universal forwarder and the heavy forwarder:

Features and capabilities           Universal forwarder     Heavy forwarder
Type of instance                    Dedicated executable    Full Splunk Enterprise instance
Footprint                           Smallest                Smaller than an indexer
Bundles Python?                     No                      Yes
Handles data inputs?                All types               All types
Forwards to other instances?        Yes                     Yes
Forwards to third-party systems?    Yes                     Yes
Serves as intermediate forwarder?   Yes                     Yes
Indexer acknowledgment?             Optional                Optional
Load balancing?                     Yes                     Yes
Data cloning?                       Yes                     Yes
Per-event filtering?                No                      Yes
Event routing?                      No                      Yes
Event parsing?                      No                      Yes
Local indexing?                     No                      Optional, by setting the
                                                            indexAndForward attribute in
                                                            outputs.conf
Searching/alerting?                 No                      Optional
Splunk Web?                         No                      Optional
The HTTP/HTTPS port. This port provides the socket for Splunk Web. It defaults to 8000.
The management port. This port is used to communicate with the splunkd daemon. Splunk Web talks to splunkd on this
port, as does the command line interface and any distributed connections from other servers. This port defaults to 8089.
Your instructor will give you your machine number. Please remember your machine number throughout the training session.
Then please go to Start > All Programs > Splunk Enterprise > Splunk Enterprise
The Splunk web interface should come up.
The login details:
username: admin
password: admin
Splunk Enterprise takes in data from sources you designate and processes it so that you can analyze it. We call this process indexing.
Splunk Enterprise licenses specify how much data you can index per calendar day (from midnight to midnight by the clock on the
license master).
Any host in your Splunk Enterprise infrastructure that performs indexing must be licensed to do so. You can either run a standalone
indexer with a license installed locally, or you can configure one of your Splunk Enterprise instances as a license master and set up a
license pool from which other indexers, configured as license slaves, can draw.
When a license master instance is configured, and license slaves are added to it, the license slaves communicate their usage to the
license master every minute. If the license master is unreachable for any reason, the license slave starts a 72-hour timer.
If the license slave cannot reach the license master for 72 hours, search is blocked on the license slave (although indexing continues).
Users cannot search data in the indexes on the license slave until that slave can reach the license master again.
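As a sketch of how this is set up, each license slave is pointed at its license master in server.conf (the host name below is a placeholder):

    [license]
    master_uri = https://license-master.example.com:8089

The slave then reports its indexing volume to the master every minute, as described above.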
The indexer is the Splunk Enterprise component that creates and manages indexes. The primary functions of an indexer are indexing incoming data and searching the indexed data.
In single-machine deployments consisting of just one Splunk Enterprise instance, the indexer also handles the data input and search
management functions.
For larger-scale needs, indexing is split out from the data input function and sometimes from the search management function as well.
In these larger, distributed deployments, the indexer might reside on its own machine and handle only indexing, along with searching
of its indexed data. In those cases, other Splunk Enterprise components take over the non-indexing roles.
For instance, you might have a set of Windows and Linux machines generating events, which need to go to a central indexer for
consolidation. Usually the best way to do this is to install a lightweight instance of Splunk Enterprise, known as a forwarder, on each
of the event-generating machines. These forwarders handle data input and send the data across the network to the indexer residing on
its own machine.
Similarly, in cases where you have a large amount of indexed data and numerous concurrent users searching on it, it can make sense to
split off the search management function from indexing. In this type of scenario, known as distributed search, one or more search
heads distribute search requests across multiple indexers. The indexers still perform the actual searching of their own indexes, but the
search heads manage the overall search process across all the indexers and present the consolidated search results to the user.
A deployment server uses server classes to determine what content to deploy to groups of deployment clients. The forwarder
management interface offers an easy way to create, edit, and manage server classes.
Collects and indexes log and machine data from any source
Powerful search, analysis and visualization capabilities empower users of all types
Apps provide solutions for security, IT ops, business analysis and more
Enables visibility across on premise, cloud and hybrid environments
Delivers the scale, security and availability to suit any organization
Available as a software or SaaS (Software as a Service) solution
A Splunk App is a prebuilt collection of dashboards, panels and UI elements powered by saved searches and packaged for a specific
technology or use case to make Splunk immediately useful and relevant to different roles.
As an alternative to using Splunk for searching and exploring, you can use Splunk Apps to gain the specific insights you need from
your machine data.
You can also apply user/role based permissions and access controls to Splunk Apps, thus providing a level of control when you are
deploying and sharing Apps across your organization.
Apps can be opened from the Splunk Enterprise Home Page, from the App menu, or from the Apps section of Settings.
Apps
The Apps panel lists the apps that are installed on your Splunk instance that you have permission to view. Select the
app from the list to open it.
For an out-of-the-box Splunk Enterprise installation, you see one App in the workspace: Search & Reporting. When
you have more than one app, you can drag and drop the apps within the workspace to rearrange them.
The Splunk bar in another view, such as the Search & Reporting app's Search view, also includes an App menu next
to the Splunk logo.
The Settings menu lists the configuration pages for Knowledge objects, Distributed environment settings, System and
licensing, Data, and Authentication settings. If you do not see some of these options, you do not have the permissions to
view or edit them.
User menu
The User menu here is called "Administrator" because that is the default user name for a new installation. You can
change this display name by selecting Edit account and changing the Full name. You can also edit the time zone
settings, select a default app for this account, and change the account's password. The User menu is also where you
log out of this Splunk installation.
Messages menu
All system-level error messages are listed here. When there is a new message to review, a notification displays as a count
next to the Messages menu. Click the X to remove the message.
Activity menu
The Activity menu lists shortcuts to the Jobs, Triggered alerts, and System Activity views.
Click Jobs to open the search jobs manager window, where you can view and manage currently running
searches.
Click Triggered Alerts to view scheduled alerts that are triggered. This tutorial does not discuss saving and
scheduling alerts. See "About alerts" in the Alerting Manual.
Click System Activity to see Dashboards about user activity and status of the system.
Help
Click Help to see links to Video Tutorials, Splunk Answers, the Splunk Support Portal, and online Documentation.
Find
Use Find to search for objects within your Splunk Enterprise instance. Find performs case-insensitive matches on the
ID, labels, and descriptions in saved objects. For example, if you type in "error", it returns the saved objects that contain
the term "error".
These saved objects include Reports, Dashboards, Alerts, and Data models. The results appear in the list
separated by the categories where they exist.
You can also run a search for error in the Search & Reporting app by clicking
Open error in search.
Module 3 - Searching
Investigate to learn more about the data you just indexed or to find the root cause of an issue.
Summarize your search results into a report, whether tabular or other visualization format.
Because of this, you might hear us refer to two types of searches: raw event searches and transforming searches.
Raw event searches
Raw event searches are searches that just retrieve events from an index or indexes and are typically done when you want to analyze a
problem. Some examples of these searches include: checking error codes, correlating events, investigating security issues, and
analyzing failures. These searches do not usually include search commands (except the search command itself), and the results are typically a list
of raw events.
Transforming searches
Transforming searches are searches that perform some type of statistical calculation against a set of results. These are searches
where you first retrieve events from an index and then pass them into one or more search commands. These searches will always
require fields and at least one of a set of statistical commands. Some examples include: getting a daily count of error events, counting
the number of times a specific user has logged in, or calculating the 95th percentile of field values.
Information density
Whether you're retrieving raw events or building a report, you should also consider whether you are running a search for sparse or
dense information:
Sparse searches are searches that look for a single event or an event that occurs infrequently within a large
set of data. You've probably heard these referred to as "needle in a haystack" or "rare term" searches. Some
examples of these searches include: searching for a specific and unique IP address or error code.
Dense searches are searches that scan through and report on many events. Some examples of these
searches include: counting the number of errors that occurred or finding all events from a specific host.
The disk represents all of your indexed data: think of it as a table of a certain size, with columns representing fields and rows representing
events. The first intermediate results table shows fewer rows, representing the subset of events retrieved from the index that matched
the search terms "sourcetype=syslog ERROR". The second intermediate results table shows fewer columns, representing the results of
the top command, "top user", which summarizes the events into a list of the top 10 users and displays the user, count, and percentage.
Then, "fields - percent" removes the column that shows the percentage, so you are left with a smaller final results table.
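Putting these stages together, the search described above reads, end to end:

    sourcetype=syslog ERROR | top user | fields - percent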
A search such as error | stats count will find the number of events containing the string error.
A search such as ... | search "error | stats count" would return the raw events containing error, a pipe,
stats, and count, in that order.
Additionally, you want to use quotes around keywords and phrases if you don't want to search for their default meaning, such as
Boolean operators and field/value pairs. For example:
A search for the keyword AND without meaning the Boolean operator: error "AND"
A search for this field/value phrase: error "startswith=foo"
The backslash character (\) is used to escape quotes, pipes, and itself. Backslash escape sequences are still expanded inside quotes.
For example:
The sequence \| as part of a search will send a pipe character to the command, instead of having the pipe
split between commands.
The sequence \" will send a literal quote to the command, for example for searching for a literal quotation
mark or inserting a literal quotation mark into a field using rex.
The \\ sequence will be available as a literal backslash in the command.
If Splunk does not recognize a backslash sequence, it will not alter it.
For example \s in a search string will be available as \s to the command, because \s is not a known escape
sequence.
However, in the search string \\s will be available as \s to the command, because \\ is a known escape
sequence that is converted to \.
Asterisks (*) cannot be searched for by using a backslash to escape the character. Splunk treats the asterisk character as a major
breaker, so it will never be in the index. If you want to search for the asterisk character, you will need to run a post-filtering regex search on your data:
index=_internal | regex ".*\*.*"
Examples
Example 1: myfield is created with the value of 6.
... | eval myfield="6"
Time is crucial for determining what went wrong. You often know when something happened, if not exactly what happened. Looking
at events that happened around the same time can help correlate results and find the root cause.
Searches run with an overly broad time range waste system resources and produce more results than you can handle.
Select time ranges to apply to your search
Use the time range picker to set time boundaries on your searches. You can restrict a search with preset time ranges, create custom
time ranges, specify time ranges based on date or date and time, or work with advanced features in the time range picker. These
options are described in the following sections.
Note: If you are located in a different timezone, time-based searches use the timestamp of the event from the instance that indexed the
data.
The labels for Earliest and Latest update to match your selection.
The preview boxes below the fields update to the time range as you set it.
For these fields, you can type the date into the text box or select the date from a calendar:
You can type the date into the text box or select the date from a calendar.
Hands on Lab
Part 1 - Basic Concepts
There are a few concepts in the Splunk world that will be helpful for you to understand. I'll cover them in a few sentences,
so try to pay attention. If you want more details, see the Concepts section near the end of this document.
Processing at the time the data is indexed: Splunk reads data from a source, such as a file or port, on a host (e.g. "my
machine"), classifies that source into a sourcetype (e.g., "syslog", "access_combined", "apache_error", ...), then extracts
timestamps, breaks up the source into individual events (e.g., log events, alerts, ...), which can be a single line or multiple
lines, and writes each event into an index on disk, for later retrieval with a search.
Processing at the time the data is searched: When a search starts, matching indexed events are retrieved from disk, fields
(e.g., code=404, user=david, ...) are extracted from the event's text, and the event is classified by matching it against eventtype
definitions (e.g., 'error', 'login', ...). The events returned from a search can then be powerfully transformed using Splunk's
search language to generate reports that live on dashboards.
Splunk can eat data from just about any source, including files, directories, ports, and scripts, keeping track of changes to
them as they happen. We're going to start simple and just tell Splunk to index a particular file and not monitor it for
updates:
1. Go to the Splunk Web interface (e.g. http://localhost:8000) and log in, if you haven't already.
2. Click Settings in the upper right-hand corner of Splunk Web.
3. Under Settings, click Add Data.
4. Click Upload Data to upload a file.
5. Click Select File.
6. Browse and find "websample.log" on your Desktop that we previously saved.
7. Accept all the default values and just click Submit.
8. Click Start Searching.
Assuming all goes well, websample.log is now indexed, and all the events are timestamped and searchable.
Splunk comes with several apps, but the only relevant one now is the Search app, which is the interface for generic
searching. (More apps can be downloaded, and advanced users can build them themselves.) After logging into Splunk,
select the Search app and let's get started searching. We'll start out simple and work our way up.
To begin your Splunk search, type in terms you might expect to find in your data. For example, if you want to find events
that might be HTTP 404 errors (i.e., webpage not found), type in the keywords:
http 404
You'll get back all the events that have both HTTP and 404 in their text.
Notice that search terms are implicitly AND'd together. The search was the same as "http AND 404". Let's make the search
narrower:
http 404 "like gecko"
Using quotes tells Splunk to search for a literal phrase like gecko, which returns more specific results than just searching
for like and gecko because they must be adjacent as a phrase.
Splunk supports the Boolean operators AND, OR, and NOT (they must be capitalized), as well as parentheses to enforce
grouping. For example, you can use NOT clauses to get all HTTP error events (i.e., events without a 200 status code) while excluding 403 and 404 errors.
Again, the AND operator is implied between search terms.
Splunk also supports the asterisk (*) wildcard for searching. For example, you can use it to retrieve events that have the 40x and 50x classes of
HTTP status codes.
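As a sketch of what such searches might look like (these exact terms are illustrative, not taken from a specific dataset):

    http NOT 200 NOT (403 OR 404)
    http AND NOT 200 AND NOT (403 OR 404)
    http 40* OR 50*

The first two searches are equivalent, because AND is implied between terms; the third uses the wildcard to match the 40x and 50x status-code classes.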
When you index data, Splunk automatically adds fields (i.e., attributes) to each of your events. It does this based on some
text patterns commonly found in IT data, and intermediate users can add their own extraction rules for pulling out additional
fields.
To narrow results with a search, just add attribute=value to your search:
sourcetype=access_combined status=404
This search is a much more precise version of our first search (i.e., "http 404") because it will only return events that
come from access_combined sources (i.e., webserver events) and that have a status code of 404, which is different from just
having a 404 somewhere in the text. The 404 has to be found where a status code is expected in the event, not just
anywhere. In addition to <attribute>=<value>, you can also use != (not equals), and <, >, >=, and <= for numeric fields.
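For example (illustrative searches using the comparison operators just mentioned):

    sourcetype=access_combined status!=200
    sourcetype=access_combined status>=500

The first returns webserver events whose status field is anything other than 200; the second returns only server-error responses.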
It's your first day of work with the Customer Support team for the online Flower & Gift shop. You're just starting to dig into the Web
access logs for the shop when you receive a call from a customer who complains about trouble buying a gift for
his girlfriend: he keeps hitting a server error when he tries to complete a purchase. He gives you his IP address, 10.2.1.44.
As you type into the search bar, Splunk's search assistant opens.
Search assistant shows you typeahead, or contextual matches and completions for each keyword as
you type it into the search bar. These contextual matches are based on what's in your data. The entries
under matching terms update as you continue to type because the possible completions for your term
change as well.
Hands on Lab
1. Please choose the Data Source LoanStats3a.csv. Remember to click on Search on the Toolbar and then click on the Data Summary button.
2.
3.
4.
BONUS LAB:
1. Without the use of fields, find the status of Not Paid and Not Mortgage.
The timeline is a visual representation of the number of events returned by a search over a selected time range. The timeline is a type
of histogram, where the range is broken up into smaller time intervals (such as seconds, minutes, hours, or days), and the count of
events for each interval appears in column form.
When you use the timeline to display the results of real-time searches, the timeline represents the sliding time range window
covered by the real-time search.
Mouse over a bar to see the count of events. Click a bar to drill down to that time range. Drilling down in this way does not run a
new search; it just filters the results from the previous search. You can use the timeline to highlight patterns or clusters of events or
investigate peaks (spikes in activity) and lows (possible server downtime) in event activity.
Change the timeline format
The timeline is located in the Events tab above the events listing. It shows the count of events over the time range that the search was
run. Here, the timeline shows web access events over the Previous business week.
You can hide the timeline (Hidden) and display a Compact or Full view of it. You can also toggle the timeline scale between linear
(Linear Scale) or logarithmic (Log Scale).
When Full is selected, the timeline is taller and displays the count on the y-axis and time on the x-axis.
Zoom in and zoom out to investigate events
Zoom and selection options are located above the timeline. At first, only the Zoom Out option is available.
The timeline legend is on the top right corner of the timeline. This indicates the scale of the timeline. For example, 1 minute per
column indicates that each column represents a count of events during that minute. Zooming in and out changes the time scale. For
example, if you click Zoom Out the legend will indicate that each column now represents an hour instead of a minute.
When you mouse over and select bars in the timeline, the Zoom to Selection or Deselect options become available.
Mouse over and click on the tallest bar or drag your mouse over a cluster of bars in the timeline. The events list updates to display
only the events that occurred in that selected time range. The time range picker also updates to the selected time range. You can cancel
this selection by clicking Deselect.
When you Zoom to Selection, you filter the results of your previous search for your selected time period. The timeline and events list
update to show the results of the new search.
You cannot Deselect after you have zoomed in to a selected time range, but you can Zoom Out again.
Hands on Lab
Back at the Flower & Gift shop, let's continue with the customer (10.2.1.44) you were assisting. He reported an error while
purchasing a gift for his girlfriend. You confirmed his error, and now you want to find the cause of it.
Continue with the last search, which showed you the customer's failed purchase attempts.
1. Type purchase into the search bar and run the search:
sourcetype="access_combined_wcookie" 10.2.1.44 purchase
When you search for keywords, your search is not case-sensitive, and Splunk retrieves the events that contain those
keywords anywhere in the raw text of the event's data.
Use Boolean operators
If you're familiar with Apache server logs, in this case the access_combined format, you'll notice that
most of these events have an HTTP status of 200, or Successful. These events are not interesting for
you right now, because the customer is reporting a problem.
Splunk supports the Boolean operators: AND, OR, and NOT. When you include
Boolean expressions in your search, the operators have to be capitalized.
2. Use the Boolean NOT operator to quickly remove all of these Successful page
requests. Type in:
sourcetype="access_combined_wcookie" 10.2.1.44 purchase NOT 200
The AND operator is always implied between search terms, so the search in Step 2 is
the same as:
sourcetype="access_combined_wcookie" AND 10.2.1.44 AND purchase NOT 200
You notice that the customer is getting HTTP server (503) and client (404) errors. But he specifically
mentioned a server error, so let's quickly remove events that are irrelevant.
Another way to add Boolean clauses quickly and interactively to your search is to use your search
results. Splunk lets you highlight and select any segment from your search results.
Timeline Usage
Continue with the last search, which showed you the customer's failed purchase attempts.
In the last topic, you really just focused on the search results listed in the events viewer area of
this dashboard. Now, let's take a look at the timeline.
1. Search for:
sourcetype="access_combined_wcookie" 10.2.1.44 purchase NOT 200 NOT 404
The location of each bar on the timeline corresponds to an instance when the events that match your
search occurred. If there are no bars at a time period, no events were found then.
2. Mouse over one of the bars.
A tooltip pops up and displays the number of events that Splunk found during the time span of that bar (1
bar = 1 hour).
The taller the bar, the more events occurred at that time. Often seeing spikes in the number of events
or no events is a good indication that something has happened.
3. Click one of the bars, for example the tallest bar.
This updates your search results to show you only the events in that time span. Splunk does not run the
search when you click on the bar. Instead, it gives you a preview of the results zoomed in to that time
range. You can still select other bars at this point.
One hour is still a wide time period to search, so let's narrow the search down more.
4. Double-click on the same bar.
Splunk runs the search again and retrieves only events during that one hour span you selected.
You should see the same search results in the Event viewer, but notice that the search overrides the
time range picker, which now shows "Custom time". (You'll see more of the time range picker later.) Also,
each bar now represents one minute of time (1 bar = 1 min).
5. Double-click another bar.
Once again, this updates your search to now retrieve events during that one minute span of time. Each
bar represents the number of events for one second of time.
Now, you want to expand your search to see everything else, if anything, that happened during this
second.
6. Without changing the time range, replace your previous search in the search bar with:
*
Splunk supports using the asterisk (*) wildcard to search for "all" or to retrieve events based on parts
of a keyword. Up to now, you've just searched for Web access logs. This search tells Splunk that you
want to see everything that occurred at this time range:
You can:
Edit the job settings. Select this to open the Job Settings dialog, where you can change the job read permissions, extend the
job lifetime, and get a URL for the job that you can use to share the job with others or put a link to the job in your browser's
bookmark bar.
Send the job to the background. Select this if the search job is slow to complete and you would like to run the job in the
background while you work on other Splunk activities (including running a new search job).
Inspect the job. Opens a separate window and displays information and metrics for the search job using the Search Job
Inspector. You can select this action while the search is running or after it completes.
Delete the job. Use this to delete a job that is currently running, is paused, or which has finalized. After you have deleted the
job you can still save the search as a report.
Report: If you would like to make the search available for later use, you can save it as a report. You can run the report again
on an ad hoc basis by finding the report on the Reports listing page and clicking its name.
Dashboard Panel...: Click this if you'd like to generate a dashboard panel based on your search and add it to a new or existing
dashboard.
Alert: Click to define an alert based on your search. Alerts run saved searches in the background (either on a schedule or in
real time). When the search returns results that meet a condition you have set in the alert definition, the alert is triggered.
Event Type: Event types let you classify events that have common characteristics. If the search doesn't include a pipe operator
or a subsearch, you can use this to save it as an event type.
Other search actions
Between the job progress controls and search mode selector are three buttons that enable you to Share, Export, and Print the
results of a search.
Click Share to share the job. When you select this, the job's lifetime is extended to 7 days and read permissions are set to
Everyone.
Click Export to export the results. You can select to output to CSV, raw events, XML, or JSON and specify the number of
results to export.
Click Print to send the results to a printer that has been configured.
Hands on Lab
1. Using your file LoanStats3a.csv, save your last search as an event type.
2. Go to Settings, and click on Event types to view your saved event type.
Understand fields
Use fields in searches
Use the fields sidebar
Hands on Lab covering: Understand Fields, Use fields in searches, Use the fields sidebar
End of Module Hands-on Quiz
Understand fields
Fields exist in machine data in many forms. Often, a field is a value (with a fixed, delimited position on the line) or a name and value
pair, where there is a single value for each field name. A field can be multivalued; that is, it can appear more than once in an event and
have a different value for each appearance.
Some examples of fields are clientip for IP addresses accessing your Web server, _time for the timestamp of an event, and host for
domain name of a server. One of the more common examples of multivalue fields is email address fields. While the From field will
contain only a single email address, the To and Cc fields have one or more email addresses associated with them.
In Splunk Enterprise, fields are searchable name and value pairings that distinguish one event from another because not all events will
have the same fields and field values. Fields let you write more tailored searches to retrieve the specific events that you want.
Use the following syntax to search for a field: fieldname="fieldvalue" . Field names are case sensitive, but field values are not.
1. Go to the Search dashboard and type the following into the search bar:
sourcetype="access_*"
This indicates that you want to retrieve only events from your web access logs and nothing else. Here, sourcetype is a field name
and access_* is a wildcarded field value used to match any Apache web access event. Apache web access logs are formatted as
access_common, access_combined, or access_combined_wcookie.
2. In the Events tab, scroll through the list of events.
If you are familiar with the access_combined format of Apache logs, you recognize some of the information in each event, such as:
To the left of the events list is the Fields sidebar. As Splunk Enterprise retrieves the events that match your search, the Fields sidebar
updates with Selected fields and Interesting fields. These are the fields that Splunk Enterprise extracted from your data.
Selected Fields are the fields that appear in your search results. The default fields host, source, and sourcetype are selected.
You can hide and show the fields sidebar by clicking Hide Fields and Show Fields, respectively.
3. Click All Fields.
The Select Fields dialog box opens, where you can edit the fields to show in the events list.
You see the default fields that Splunk defined. Some of these fields are based on each event's timestamp (everything beginning with
date_*), punctuation (punct), and location (index).
Other field names apply to the web access logs. For example, there are clientip, method, and status. These are not default fields.
They are extracted at search time.
This opens the field summary for the action field. In this set of search results, Splunk Enterprise found five values for action, and the
action field appears in 49.9% of your search results.
Hands on Lab
1. Go back to the Search dashboard and search for web access activity. Select
Other > Yesterday from the time range picker:
sourcetype="access_*"
You were actually using fields all along! Each time you searched for sourcetype=access_*, you told
Splunk to only retrieve events from your web access logs and nothing else.
To search for a particular field, specify the field name and value:
fieldname="fieldvalue"
Here, sourcetype is the field name and access_* is a wildcarded field value used to match all field
values beginning with access_ (which would include access_common, access_combined, and
access_combined_wcookie).
Note: Field names are case sensitive, but field values are not!
2. Scroll through the search results.
If you're familiar with the access_combined format of Apache logs, you will recognize some of the
information in each event, such as:
As Splunk retrieves these events, the Fields sidebar updates with selected fields
and interesting fields. These are the fields that Splunk extracted from your data.
Notice that the default fields host, source, and sourcetype are selected fields and are displayed in your
search results.
Field name      Description
action          what a user does at the online shop
category_id     the types of products the shop sells
product_id      specific catalog names for products
6. From the Available fields list, select action, category_id, and product_id.
7. Click Save.
When you return to the Search view, the fields you selected will be included in your search results if they
exist in that particular event. Different events will have different fields.
The fields sidebar doesn't just show you what fields Splunk has captured from your data. It also displays
how many values exist for each of these fields. For the fields you just selected, there are 2 for action, 5
for category_id, and 9 for product_id. This doesn't mean that these are all the values that exist for each
of the fields--these are just the values that Splunk knows about from the results of your search.
What are some of these values?
8. Under selected fields, click action for the action field.
This opens the field summary for the action field.
This window tells you that, in this set of search results, Splunk found two values for action, purchase
and update, and that the action field appears in 71% of your search results. This means that nearly
three-quarters of the Web access events are related to the purchase of an item or an update
(of the item quantity in the cart, perhaps).
9. Close this window and look at the other two fields you selected, category_id (what
types of products the shop sells) and product_id (specific catalog names for products).
Now you know a little bit more about the information in your data relating to the online Flower and Gift
shop. The online shop sells a selection of flowers, gifts, plants, candy, and balloons. Let's use these
fields, category_id and product_id, to see what people are buying.
Use fields to run more targeted searches
These next two examples compare the results of searching with and without fields.
Example 1
Return to the search you ran to check for errors in your data. Select Other > Yesterday from the
time range picker:
error OR failed OR severe OR (sourcetype=access_* (404 OR 500 OR 503))
Run this search again, but this time, use fields in your search.
The HTTP error codes are values of the status field. Now your search looks like this:
error OR failed OR severe OR (sourcetype=access_* (status=404 OR
status=500 OR status=503))
Notice the difference in the count of events between the two searches--because it's a more targeted
search, the second search returns fewer events.
When you run simple searches based on arbitrary keywords, Splunk matches the raw text of your data.
When you add fields to your search, Splunk looks for events that have those specific field/value pairs.
Example 2
Before you learned about the fields in your data, you might have run this search to see how many times
flowers were purchased from the online shop:
sourcetype=access_* purchase flower*
As you typed in "flower", the search assistant shows you both "flower" and "flowers" in the typeahead.
Since you don't know which one you want, you use the wildcard to match both.
If you scroll through the (many) search results, you'll see that some of the events have
action=update and category_id that have a value other than flowers.
These are not events that you wanted!
Run this search instead. Select Other > Yesterday from the time range picker:
sourcetype=access_* action=purchase category_id=flower*
For the second search, even though you still used the wildcarded word "flower*", there is only one value
of category_id that it matches (FLOWERS).
Notice the difference in the number of events that Splunk retrieved for each search; the second search
returns significantly fewer events. Searches with fields are more targeted and retrieve more exact
matches against your data.
To save your search as a report, click on the Report link. This opens the Save As Report dialog:
Once you click Save, Splunk prompts you to either review Additional Settings for your newly created report (Permissions,
Schedule, Acceleration, and Embed), Add (the report) to Dashboard, View the report, or Continue Editing the search:
The additional settings that can be made to the report are as follows:
Permissions: Allows you to set how the saved report is displayed: by owner, by app, or for all apps. In addition, you can make
the report read only or writeable (can be edited).
Schedule: Allows you to schedule the report (for Splunk to run/refresh it based upon your schedule). For example, an interval
like every week, on Monday at 6 AM, and for a particular time range.
Acceleration: Not all saved reports qualify for acceleration and not all users (not even admins) have the ability to accelerate
reports. Generally speaking, Splunk Enterprise will build a report acceleration summary for the report if it determines that the
report would benefit from summarization (acceleration).
Embed: Report embedding lets you bring the results of your reports to large numbers of report stakeholders. With report
embedding, you can embed scheduled reports in external (non-Splunk) websites, dashboards, and portals. Embedded reports
can display results in the form of event views, tables, charts, maps, single values, or any other visualization type. They use the
same formatting as the originating report. When you embed a saved report, you do this by copying a Splunk generated URL
into an HTML-based web page.
Edit reports
You can easily edit an existing report. You can edit a report's definition (its search string, pivot setup, or result formatting). You can
also edit its description, permissions, schedule, and acceleration settings.
To edit a report's definition
If you want to edit a report's definition, there are two ways to start, depending on whether you're on the Reports listing page or looking
at the report itself.
If you're on the Reports listing page, locate the report you want to edit, go to the Actions column, and click Open in Search
or Open in Pivot (you'll see one or the other depending on which tool you used to create the report).
If you've entered the report to review its results, click Edit and select Open in Search or Open in Pivot (you'll see one or the
other depending on which tool you used to create the report).
A visualization is a representation of data returned from a search. Most visualizations are graphical representations, however, a
visualization can also be non-graphical. In dashboards, a panel contains one or more visualizations. Visualizations available for
simple XML dashboards include:
chart
event listing
map
table
single value
area
bar
bubble
column
filler gauge
line
marker gauge
pie
radial gauge
scatter
3. Under Interesting Fields, select category_id. Then, under Reports, click Top values:
6. Go back to the Visualization tab, and under Format, investigate all the different options.
7. Under the Bar Chart drop-down, investigate all the different chart types as well.
Bonus Lab:
Using the LoanStats3a.csv file, create a report from the data that shows the top values across all the states.
2. Click the Statistics tab after you have the search you want:
4. Then you can choose the fields you have selected to Pivot, and click OK :
5. Then you can choose a field like annual_inc with a default of Sum to be part of your Pivot column values:
Hands on Lab
1. Create a report out of the LoanStats3a.csv source that looks at annual income < 70000 and addr_state of CA, FL, NY.
2. Create an instant pivot out of the search from #1 above.
Create a dashboard
Add a report to a dashboard
Hands on Lab covering: Create a dashboard, Add a report to a dashboard
Add a pivot report to a dashboard
Edit a dashboard
Hands on Lab covering: Add a pivot report to a dashboard, Edit a dashboard.
End of Module Hands on Quiz
Create a dashboard
You can create a dashboard from a search, or you can click on the Dashboards option on the toolbar.
Hands on Lab:
Let's use the flower shop transactions to create a dashboard and add a report to it
Before you learned about the fields in your data, you might have run this search to see how many times flowers were
purchased from the online shop:
sourcetype=access_* purchase flower*| top limit=20 category_id
Bonus Lab:
Take the report out of the LoanStats3a.csv source that looks at annual income < 70000 and addr_state of CA, FL, NY
from the last module and create a dashboard.
Edit a dashboard
From your dashboard, you can edit your dashboard from the menu
Hands on Lab:
1. Create an instant pivot, like the one from the previous module, out of the LoanStats3a.csv source that looks at
annual income < 70000 and addr_state of CA, FL, NY.
2. Then add that pivot report to the dashboard
3. Create another report that looks at ALL the annual incomes in the states of CA, FL, NY.
4. Add that report to the dashboard created in exercise #1
5. Edit the dashboard panels and add titles to your panels.
Bonus Lab:
1. Create another instant pivot or report and add to the existing dashboard
To successfully use Splunk, it is vital that you write effective searches. Using the index efficiently will make your initial discoveries
faster, and the reports you create will run faster for you and for others. In this chapter, we will cover the following topics:
Search terms are case insensitive: Searches for error, Error, ERROR, and ErRoR are all the same thing.
Search terms are additive: Given the search mary error, only events that contain both words will be found. There are
Boolean and grouping operators to change this behavior; we will discuss these later in this chapter under Boolean and grouping
operators.
Only the time frame specified is queried: This may seem obvious, but it's very different from a database, which would
always have a single index across all events in a table. Since each index is sliced into new buckets over time, only the buckets
that contain events for the time frame in question need to be queried.
Search terms are words, including parts of words: A search for foo will also match foobar.
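The concepts above can be seen with a few illustrative searches (keywords assumed for illustration, not taken from your sample data):

error          finds error, Error, ERROR, and ErRoR alike
mary error     finds only events containing both words
foo            also matches events containing foobar

Each line above is a complete search you could type into the search bar on its own.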
With just these concepts, you can write fairly effective searches. Let's dig a little deeper, though:
A word is anything surrounded by whitespace or punctuation: For instance, given the log line
2012-02-07T01:03:31.104-0600 INFO AuthClass Hello world. [user=Bobby, ip=1.2.3.3], the "words" indexed are
2012, 02, 07T01, 03, 31, 104, 0600, INFO, AuthClass, Hello, world, user, Bobby, ip, 1, 2, 3, and 3. This
may seem strange, and possibly a bit wasteful, but this is what Splunk's index is really, really good at: dealing with huge
numbers of words across a huge number of events.
Splunk is not grep with an interface: One of the most common questions is whether Splunk uses regular expressions for your
searches. Technically, the answer is no. Splunk does use regex internally to extract fields, including the auto generated fields,
but most of what you would do with regular expressions is available in other ways. Using the index as it is designed is the best
way to build fast searches. Regular expressions can then be used to further filter results or extract fields.
Numbers are not numbers until after they have been parsed at search time: This means that searching for foo>5 will not
use the index, as the value of foo is not known until it has been parsed out of the event at search time. There are different ways
to deal with this behavior, depending on the question you're trying to answer.
Field names are case sensitive: When searching for host=myhost, host must be lowercase. Likewise, any extracted or
configured fields have case sensitive field names, but the values are case insensitive.
Host=myhost will not work
host=myhost will work
host=MyHost will work
Fields do not have to be defined before indexing data: An indexed field is a field that is added to the metadata of an event at
index time. There are legitimate reasons to define indexed fields, but in the vast majority of cases it is unnecessary and is
actually wasteful.
AND is implied between terms. For instance, error mary (two words separated by a space) is the same as
error AND mary.
OR allows you to specify multiple values. For instance, error OR mary means find any event that contains
either word.
NOT applies to the next term or group. For example, error NOT mary would find events that contain error but
do not contain mary.
The quote marks ("") identify a phrase. For example, "Out of this world" will find this exact sequence of
words. Out of this world would find any event that contains all of these words, but not necessarily in that
order.
Parentheses ( ( ) ) are used for grouping terms. Parentheses can help avoid confusion in logic by making
the order of evaluation explicit.
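As a sketch of such grouping (keywords assumed for illustration), the following two searches are equivalent, because AND is implied between terms and OR is evaluated before AND:

error mary OR bob
error AND (mary OR bob)

Both find events that contain error together with either mary or bob; the parenthesized form simply makes the grouping visible.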
The equal sign (=) is reserved for specifying fields. Searching for an equal sign can be accomplished by
wrapping it in quotes. You can also escape characters to search for them. \= is the same as "=".
Brackets ( [ ] ) are used to perform a subsearch.
You can use these operators in fairly complicated ways if you want to be very specific, or even to find multiple sets of events in a
single query. The following are a few examples:
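For instance, combining operators with the status field from the web access examples earlier (illustrative searches, not from your sample data):

error NOT (403 OR 404)
(sourcetype=access_* status=404) OR (sourcetype=access_* status=503)

The first finds errors that are not 403 or 404 responses; the second finds two distinct sets of web access events in a single query.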
Clicking on any word or field value will give you the option to Add to search or Exclude from search (the
existing search) or (create a) New search:
Clicking on a word or a field value that is already in the query will give you the option to remove it (from the
existing query) or, as above, (create a) new (search):
Event segmentation
In previous versions of Splunk, event segmentation was configurable through a setting in the Options dialog. In version 6.2, that
dialog is no longer present. Although segmentation (discussed later in this chapter, under the Field widgets section) is still an
important concept, it is not accessible through the web interface in this version.
Field widgets
Clicking on values in the Select Fields dialog (the field picker), or in the field value widgets underneath an event, will again give us an
option to append (add to) or exclude (remove from) our search or, as before, to start a new search.
For instance, if source="C:\Test Data\TM1ProcessError_20140623213757_temp.log" appears under your event, clicking on that
value and selecting Add to search will append source="C:\\Test Data\\TM1ProcessError_20140623213757_temp.log" to your
search:
To use the field picker, you can click on the link All Fields (see the following image):
Expand the results window by clicking on > in the far-left column. Clicking on a result will append that item to the current search:
If a field value looks like key=value in the text of an event, you will want to use one of the field widgets instead of clicking on the
raw text of the event. Depending on your event segmentation setting, clicking on the word will either add the value or key=value. The
former will not take advantage of the field definition; instead, it will simply search for the word. The latter will work for events that
contain the exact quoted text, but not for other events that actually contain the same field value extracted in a different way.
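For instance, using the user=Bobby pair from the earlier log line example, the three behaviors look like this:

user=Bobby          uses the field extraction for the user field
"user=Bobby"        matches only events containing that exact quoted text
Bobby               simply searches for the word, ignoring the field definition

The field widget produces the first form, which is the one that takes advantage of the field definition.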
Time
Clicking on the time next to an event will open the _time dialog (shown in the following image) allowing you to change the search to
select Events Before or After a particular time period, and will also have the following choices:
In addition, you can select Nearby Events within plus, minus, or plus or minus, a number of seconds (the default), milliseconds,
minutes, hours, days, or weeks:
One search trick is to click on the time of an event, select At this time, and then use Zoom out (above the timeline) until the
appropriate time frame is reached.
Fields command
Description
Keeps (+) or removes (-) fields from search results based on the field list criteria. If + is specified, only the fields that match one of the
fields in the list are kept. If - is specified, only the fields that match one of the fields in the list are removed. If neither is specified,
defaults to +.
Important: The leading underscore is reserved for all internal Splunk Enterprise field names, such as _raw and _time. By default,
internal fields _raw and _time are included in output. The fields command does not remove internal fields unless explicitly
specified with:
... | fields - _*
Note: Be cautious removing the _time field. Statistical commands, such as timechart and chart, cannot display date or time
information without the _time field.
Syntax
fields [+|-] <wc-field-list>
Required arguments
<wc-field-list>
Syntax: <string>, <string>, ...
Description: Comma-delimited list of fields to keep (+) or remove (-). You can use wild card characters in the
field names.
Examples
Example 1:
Remove the "host" and "ip" fields.
... | fields - host, ip
Example 2:
Keep only the host and ip fields. Remove all of the internal fields. The internal fields begin with an underscore character, for
example _time.
... | fields host, ip | fields - _*
Example 3:
Keep only the fields 'source', 'sourcetype', 'host', and all fields beginning with 'error'.
... | fields source, sourcetype, host, error*
Table command
Description
The table command is similar to the fields command in that it lets you specify the fields you want to keep in your results. Use the
table command when you want to retain data in tabular format.
The table command can be used to build a scatter plot to show trends in the relationships between discrete values of your data.
Otherwise, you should not use it for charts (such as chart or timechart) because the UI requires the internal fields (which are the
fields beginning with an underscore, _*) to render the charts, and the table command strips these fields out of the results by default.
Instead, you should use the fields command because it always retains all the internal fields.
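A sketch of the distinction (clientip and status are field names from the web access examples earlier):

... | fields clientip, status
... | table clientip, status

The fields version keeps the internal fields, so charting commands can still render time information; the table version strips internal fields such as _time by default.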
Syntax
table <wc-field-list>
Arguments
<wc-field-list>
Syntax: <wc-field> <wc-field> ...
Description: A list of field names. You can use wild card characters in the field names.
Usage
The table command returns a table formed by only the fields specified in the arguments. Columns are displayed in the same order
that fields are specified. Column headers are the field names. Rows are the field values. Each row represents an event.
Command type: The table command is a non-streaming command. If you are looking for a streaming command similar to the table
command, use the fields command.
Field renaming: The table command doesn't let you rename fields, only specify the fields that you want to show in your tabulated
results. If you're going to rename a field, do it before piping the results to table.
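For example (field names assumed for illustration), renaming before tabulating:

... | rename clientip AS "Client IP" | table "Client IP", status

Piping rename before table lets the table display the friendlier column header.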
Rename command
Description
Use the rename command to rename a specified field or multiple fields. This command is useful for giving fields more meaningful
names, such as "Product ID" instead of "pid". If you want to rename multiple fields, you can use wildcards.
Use quotes to rename a field to a phrase that contains spaces, for example:
... | rename SESSIONID AS "The session ID"
If both the source and destination fields are wildcard expressions with the same number of wildcards, the renaming will carry over the
wildcarded portions to the destination expression. See Example 2, below.
Note: You cannot rename one field with multiple names. For example, if you had a field A, you could not do "A as B, A as C" in one
string. Instead, use a command such as stats to produce multiple copies of a field:
... | stats first(host) AS site, first(host) AS report
Note: You cannot use this command to merge multiple fields into one field because null, or non-present, fields are brought along with
the values. For example, if you had events with either product_id or pid fields, ... | rename pid AS product_id would not
merge the pid values into the product_id field. It overwrites product_id with Null values where pid does not exist for the event.
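If you do need to merge such fields, one common approach (a sketch using the pid and product_id fields from the note above) is eval with the coalesce function, which takes the first non-null value:

... | eval product_id=coalesce(product_id, pid) | fields - pid

Events that already have product_id keep it, while events with only pid get its value copied in.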
Syntax
rename <wc-field> AS <wc-field>...
Required arguments
wc-field
Syntax: <string>
Description: The name of a field and the name to replace it. You can use wild card characters in the field
names. Names with spaces must be enclosed in quotation marks.
Rex command
Description
Use this command to either extract fields using regular expression named groups, or replace or substitute characters in a field using
sed expressions.
The rex command matches the value of the specified field against the unanchored regular expression and extracts the named groups
into fields of the corresponding names. If a field is not specified, the regular expression is applied to the _raw field. Note: Running
rex against the _raw field might have a performance impact.
When mode=sed, the given sed expression used to replace or substitute characters is applied to the value of the chosen field. If a field
is not specified, the sed expression is applied to _raw. This sed-syntax is also used to mask sensitive data at index-time.
Use the rex command for search-time field extraction or string replacement and character substitution.
Syntax
rex [field=<field>] ( <regex-expression> [max_match=<int>] [offset_field=<string>] ) | (mode=sed <sed-expression>)
Required arguments
regex-expression
Syntax: "<string>"
Description: The PCRE regular expression that defines the information to match and extract from the
specified field. Quotation marks are required.
mode
Syntax: mode=sed
Description: Specify to indicate that you are using a sed (UNIX stream editor) expression.
sed-expression
Syntax: "<string>"
Description: When mode=sed, specify whether to replace strings (s) or substitute characters (y) in the
matching regular expression. No other sed commands are implemented. Quotation marks are required. Sed
mode supports the following flags: global (g) and Nth occurrence (N), where N is a number that is the
character location in the string.
Optional arguments
field
Syntax: field=<field>
Description: The field that you want to extract information from.
Default: _raw
max_match
Syntax: max_match=<int>
Description: Controls the number of times the regex is matched. If greater than 1, the resulting fields are
multivalued fields.
Default: 1, use 0 to mean unlimited.
offset_field
Syntax: offset_field=<string>
Description: If provided, a field is created with the name specified by <string>. This value of the field has
the endpoints of the match in terms of zero-offset characters into the matched field. For example, if the rex
expression is "(?<tenchars>.{10})", this matches the first ten characters of the field, and the offset_field
contents is "0-9".
Default: unset
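As a sketch combining these arguments (the pattern and field name are assumed for illustration), extracting every IPv4-looking token from the raw event into a multivalued ip field:

... | rex max_match=0 "(?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"

With max_match=0, rex keeps matching past the first occurrence, so events containing several addresses yield a multivalued field.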
Sed expression
When using the rex command in sed mode, you have two options: replace (s) or character substitution (y).
The syntax for using sed to replace (s) text in your data is: "s/<regex>/<replacement>/<flags>"
The syntax for using sed to substitute (y) characters is: "y/<string1>/<string2>/"
This substitutes the characters that match <string1> with the characters in <string2>.
Usage
Splunk Enterprise uses perl-compatible regular expressions (PCRE).
When you use regular expressions in searches, you need to be aware of how characters such as pipe ( | ) and backslash ( \ ) are
handled.
Examples
Example 1:
Extract "from" and "to" fields using regular expressions. If a raw event contains "From: Susan To: Bob", then from=Susan and
to=Bob.
... | rex field=_raw "From: (?<from>.*) To: (?<to>.*)"
Example 2:
Extract "user", "app", and "SavedSearchName" from a field called "savedsearch_id" in scheduler.log events. If
savedsearch_id=bob;search;my_saved_search, then user=bob, app=search, and SavedSearchName=my_saved_search.
... | rex field=savedsearch_id "(?<user>\w+);(?<app>\w+);(?<SavedSearchName>\w+)"
Example 3:
Use sed syntax to match the regex to a series of numbers and replace them with an anonymized string.
... | rex field=ccnumber mode=sed "s/(\d{4}-){3}/XXXX-XXXX-XXXX-/g"
Example 4:
Display IP address and ports of potential attackers.
sourcetype=linux_secure port "failed password" | rex "\s+(?<ports>port \d+)" | top src_ip ports showperc=0
This search uses rex to extract the port field and its values. Then it displays a table of the top source IP addresses (src_ip) and
ports returned by the search for potential attackers.
Multikv command
Description
Extracts field-values from table-formatted events, such as the results of top, netstat, ps, and so on. The multikv command creates a
new event for each table row and assigns field names from the title row of the table.
An example of the type of data multikv is designed to handle:
Name      Age  Occupation
Josh      42   SoftwareEngineer
Francine  35   CEO
Samantha  22   ProjectManager
multikv can transform this table from one event into three events with the relevant fields. It works most easily with fixed-alignment tables, though it can sometimes handle merely ordered fields.
The general strategy is to identify a header, offsets, and field counts, and then determine which components of subsequent lines should
be included into those field names. Multiple tables in a single event can be handled (if multitable=true), but may require ensuring that
the secondary tables have capitalized or ALLCAPS names in a header row.
Auto-detection of header rows favors rows that are text, and are ALLCAPS or Capitalized.
Syntax
multikv [conf=<stanza_name>] [<multikv-option>...]
Optional arguments
conf
Syntax: conf=<stanza_name>
Description: If you have a field extraction defined in multikv.conf, use this argument to reference the stanza
in your search.
<multikv-option>
Syntax: copyattrs=<bool> | fields <field-list> | filter <term-list> | forceheader=<int> | multitable=<bool> |
noheader=<bool> | rmorig=<bool>
Description: Options for extracting fields from tabular events.
Descriptions for multikv options
copyattrs
Syntax: copyattrs=<bool>
Description: When true, multikv copies all fields from the original event to the events generated from that
event. When false, no fields are copied from the original event. This means that the events will have no _time
field and the UI will not know how to display them.
Default: true
fields
Syntax: fields <field-list>
Description: Limit the fields set by the multikv extraction to this list. Ignores any fields in the table which
are not on this list.
filter
Syntax: filter <term-list>
Description: If specified, multikv skips over table rows that do not contain at least one of the strings in the
filter list. Quoted expressions are permitted, such as "multiple words" or "trailing_space ".
forceheader
142
Syntax: forceheader=<int>
Description: Forces the use of the given line number (1 based) as the table's header. Does not include empty
lines in the count.
Default: The multikv command attempts to determine the header line automatically.
multitable
Syntax: multitable=<bool>
Description: Controls whether or not there can be multiple tables in a single _raw in the original events.
Default: true
noheader
Syntax: noheader=<bool>
Description: Handle a table without header row identification. The size of the table will be inferred from the
first row, and fields will be named Column_1, Column_2, ... noheader=true implies multitable=false.
Default: false
rmorig
Syntax: rmorig=<bool>
Description: When true, the original events will not be included in the output results. When false, the
original events are retained in the output results, with each original emitted after the batch of generated
results from that original.
Default: true
Examples
Example 1: Extract the "COMMAND" field when it occurs in rows that contain "splunkd".
... | multikv fields COMMAND filter splunkd
143
Hands-on Lab
1. Use the source LoanStats3a.csv and only take a look at some fields out of the data
2. Use the source LoanStats3a.csv and the table command on the same fields in #1
3. Use the source LoanStats3a.csv and use the rename command to rename fields in #1
4. Use the source LoanStats3a.csv and use the rex command for:
a. source="LoanStats3a.csv" annual_inc=60000 | rex "Does not meet the credit policy.(?<all_util>.*)"
b. and then click on all_util field to demonstrate the rex results
144
145
146
Top command
Description
Displays the most common values of a field.
Finds the most frequent tuple of values of all fields in the field list, along with a count and percentage. If the optional by-clause is
included, the command finds the most frequent values for each distinct tuple of values of the group-by fields.
Syntax
top [<N>] [<top-options>...] <field-list> [<by-clause>]
Required arguments
<field-list>
Syntax: <field>, <field>, ...
Description: Comma-delimited list of field names.
Optional arguments
<N>
Syntax: <int>
Description: The number of results to return.
<top-options>
Syntax: countfield=<string> | limit=<int> | otherstr=<string> | percentfield=<string> | showcount=<bool>
| showperc=<bool> | useother=<bool>
Description: Options for the top command. See Top options.
<by-clause>
147
Syntax: BY <field-list>
Description: The name of one or more fields to group by.
Top options
countfield
Syntax: countfield=<string>
Description: The name of a new field that the value of count is written to.
Default: "count"
limit
Syntax: limit=<int>
Description: Specifies how many tuples to return, "0" returns all values.
Default: "10"
otherstr
Syntax: otherstr=<string>
Description: If useother is true, specify the value that is written into the row representing all other values.
Default: "OTHER"
percentfield
Syntax: percentfield=<string>
Description: Name of a new field to write the value of percentage.
Default: "percent"
showcount
Syntax: showcount=<bool>
Description: Specify whether to create a field called "count" (see "countfield" option) with the count of that
tuple.
Default: true
showperc
Syntax: showperc=<bool>
Description: Specify whether to create a field called "percent" (see "percentfield" option) with the relative
prevalence of that tuple.
Default: true
148
useother
Syntax: useother=<bool>
Description: Specify whether or not to add a row that represents all values not included due to the limit
cutoff.
Default: false
Examples
Example 1:
Return the 20 most common values of the "referer" field.
sourcetype=access_* | top limit=20 referer
Example 2:
Return top "action" values for each "referer_domain".
149
Because a limit is not specified, this returns all the combinations of values for "action" and "referer_domain" as well as the counts
and percentages
150
Example 3:
Return the top product purchased for each category. Do not show the percent field. Rename the count field to "total".
sourcetype=access_* status=200 action=purchase | top 1 productName by categoryId showperc=f
countfield=total
151
Rare command
Description
Displays the least common values of a field.
Finds the least frequent tuple of values of all fields in the field list. If the <by-clause> is specified, this command returns rare tuples of
values for each distinct tuple of values of the group-by fields.
This command operates identically to the top command, except that the rare command finds the least frequent instead of the most
frequent.
Syntax
rare [<top-options>...] <field-list> [<by-clause>]
Required arguments
<field-list>
Syntax: <string>,...
Description: Comma-delimited list of field names.
Optional arguments
<top-options>
Syntax: countfield=<string> | limit=<int> | percentfield=<string> | showcount=<bool> | showperc=<bool>
Description: Options that specify the type and number of values to display. These are the same <topoptions> used by the top command.
<by-clause>
Syntax: BY <field-list>
Description: The name of one or more fields to group by.
152
Top options
countfield
Syntax: countfield=<string>
Description: The name of a new field to write the value of count into.
Default: "count"
limit
Syntax: limit=<int>
Description: Specifies how many tuples to return. If you specify >code>limit=0</code>, all values up to
maxresultrows are returned. See Limits section. Specifying a value larger than maxresultrows produces an
error.
Default: 10
percentfield
Syntax: percentfield=<string>
Description: Name of a new field to write the value of percentage.
Default: "percent"
showcount
Syntax: showcount=<bool>
Description: Specify whether to create a field called "count" (see "countfield" option) with the count of that
tuple.
Default: true
showperc
Syntax: showperc=<bool>
Description: Specify whether to create a field called "percent" (see "percentfield" option) with the relative
prevalence of that tuple.
Default: true
153
Limits
There is a limit on the number of results which rare returns. By default this limit is 10, but other values can be selected with the limit
option up to a further constraint expressed in limits.conf, in the [rare] stanza, maxresultrows. This ceiling is 50,000 by default, and
effectively keeps a ceiling on the memory that rare will use.
Examples
Example 1:
Return the least common values of the "url" field.
... | rare url
Example 2:
Find the least common "user" value for a "host".
... | rare user by host
154
1. Run
155
Stats command
Description
Calculates aggregate statistics over the results set, such as average, count, and sum. This is similar to SQL aggregation. If stats is
used without a by clause only one row is returned, which is the aggregation over the entire incoming result set. If you use a by clause
one row is returned for each distinct value specified in the by clause.
Syntax
Simple: stats (stats-function(field) [AS field])... [BY field-list]
Complete: stats [partitions=<num>] [allnum=<bool>] [delim=<string>] ( <stats-agg-term>... | <sparkline-agg-term>... ) [<byclause>]
Required arguments
stats-agg-term
Syntax: <stats-function>(<evaled-field> | <wc-field>) [AS <wc-field>]
Description: A statistical aggregation function. The function can be applied to an eval expression, or to a
field or set of fields. Use the AS clause to place the result into a new field with a name that you specify. You
can use wild card characters in field names. For more information on eval expressions, see Types of eval
expressions in the Search Manual.
sparkline-agg-term
Syntax: <sparkline-agg> [AS <wc-field>]
Description: A sparkline aggregation function. Use the AS clause to place the result into a new field with a
name that you specify. You can use wild card characters in the field name.
Optional arguments
allnum
syntax: allnum=<bool>
Description: If true, computes numerical statistics on each field if and only if all of the values of that field
156
are numerical.
Default: false
delim
Syntax: delim=<string>
Description: Specifies how the values in the list() or values() aggregation are delimited.
Default: a single space
by-clause
Syntax: BY <field-list>
Description: The name of one or more fields to group by. You cannot use a wildcard character to specify
multiple fields with similar names. You must specify each field separately.
partitions
Syntax: partitions=<num>
Description: If specified, partitions the input data based on the split-by fields for multithreaded reduce.
Default: 1
Stats function options
stats-function
Syntax: avg() | c() | count() | dc() | distinct_count() | earliest() | estdc() | estdc_error() | exactperc<int>() |
first() | last() | latest() | list() | max() | median() | min() | mode() | p<in>() | perc<int>() | range() | stdev()
| stdevp() | sum() | sumsq() | upperperc<int>() | values() | var() | varp()
Description: Functions used with the stats command. Each time you invoke the stats command, you can use
more than one function. However, you can only use one by clause.
Usage
The stats command does not support wildcard characters in field values in BY clauses.
For example, you cannot specify | stats count BY source*.
157
Basic Examples
1. Return the average transfer rate for each host
sourcetype=access* | stats avg(kbps) by host
2. Search the access logs, and return the total number of hits from the top 100 values of "referer_domain"
Search the access logs, and return the total number of hits from the top 100 values of "referer_domain". The "top" command returns a
count and percent value for each "referer_domain".
sourcetype=access_combined | top limit=100 referer_domain | stats sum(count) AS total
3. Calculate the average time for each hour for similar fields using wildcard characters
Return the average, for each hour, of any unique field that ends with the string "lay". For example, delay, xdelay, relay, etc.
... | stats avg(*lay) BY date_hour
4. Remove duplicates in the result set and return the total count for the unique results
Remove duplicates of results with the same "host" value and return the total count of the remaining results.
... | stats dc(host)
158
Addcoltotals command
Description
The addcoltotals command appends a new result to the end of the search result set. The result contains the sum of each numeric
field or you can specify which fields to summarize. Results are displayed on the Statistics tab. If the labelfield argument is
specified, a column is added to the statistical results table with the name specified.
Syntax
addcoltotals [labelfield=<field>] [label=<string>] [<fieldlist>]
Optional arguments
<fieldlist>
Syntax: <field> ...
Description: A space delimited list of valid field names. The addcoltotals command calculates the sum only
for the fields in the list you specify. You can use the asterisk ( * ) as a wildcard in the field names.
Default: Calculates the sum for all of the fields.
labelfield
Syntax: labelfield=<fieldname>
Description: Specify a field name to add to the result set.
Default: none
label
Syntax: label=<string>
Description: Used with the labelfield argument to add a label in the summary event. If the labelfield
argument is absent, the label argument has no effect.
Default: Total
159
Examples
Example 1:
Compute the sums of all the fields, and put the sums in a summary event called "change_name".
... | addcoltotals labelfield=change_name label=ALL
Example 2:
Add a column total for two specific fields in a table.
sourcetype=access_* | table userId bytes avgTime duration | addcoltotals bytes duration
Example 3:
Filter fields for two name-patterns, and get totals for one of them.
... | fields user*, *size | addcoltotals *size
Example 4:
Augment a chart with a total of the values present.
index=_internal source=*metrics.log" group=pipeline |stats avg(cpu_seconds) by processor |addcoltotals
labelfield=processor
160
Hands on Lab
1.
Run a search query that uses the top, stats functions with Loan file to get the count:
3. Try running:
4.
161
162
163
Search
Dashboards
Dashboard visual editor
Pivot
Reports
Events visualizations
Events visualizations are essentially raw lists of events.
You can get events visualizations from any search that does not include a transform operation, such as a search that uses reporting
commands like stats, chart, timechart, top, or rare.
Tables
You can pick table visualizations from just about any search, but the most interesting tables are generated by searches that include
transform operations, such as a search that uses reporting commands like stats, chart, timechart, top, or rare.
Charts
Splunk provides a variety of chart visualizations, such as column, line, area, scatter, and pie charts. These visualizations require
transforming searches (searches that use reporting commands) whose results involve one or more series.
Maps
Splunk provides a map visualization that lets you plot geographic coordinates as interactive markers on a world map. Searches for
map visualizations should use the geostats search command to plot markers on a map. The geostats command is similar to the stats
command, but provides options for zoom levels and cells for mapping. Events generated from the geostats command include latitude
and longitude coordinates for markers.
165
Charts
Splunk provides a variety of chart visualizations, such as column, line, area, scatter, and pie charts. These visualizations require
transforming searches (searches that use reporting commands) whose results involve one or more series.
A series is a sequence of related data points that can be plotted on a chart. For example, each line plotted on a line chart represents an
individual series. You can design transforming searches that produce a single series, or you can set them up so the results provide data
for multiple series.
It may help to think of the tables that can be generated by transforming searches. Every column in the table after the first one
represents a different series. A "single series" search would produce a table with only two columns, while a "multiple series" search
would produce a table with three or more columns.
All of the chart visualizations can handle single-series searches, though you'll find that bar, column, line, and pie chart visualizations
are usually best for such searches. In fact, pie charts can only display data from single series searches.
On the other hand, if your search produces multiple series, you'll want to go with a bar, column, line, area, or scatter chart
visualization.
Column and bar charts
Use a column chart or bar chart to compare the frequency of values of fields in your data. In a column chart, the x-axis values are
typically field values (or time, especially if your search uses the timechart reporting command) and the y-axis can be any other field
value, count of values, or statistical calculation of a field value. Bar charts are exactly the same, except that the x-axis and y-axis
values are reversed.
166
The following bar chart presents the results of this search, which uses internal Splunk metrics. It finds the total sum of CPU_seconds
by processor in the last 15 minutes, and then arranges the processors with the top ten sums in descending order:
index=_internal "group=pipeline" | stats sum(cpu_seconds) as totalCPUSeconds by processor | sort 10
totalCPUSeconds desc
Note that in this example, we've also demonstrated how you can roll over a single bar or column to get detail information about it.
When you define the properties of your bar and column charts, you can:
set the chart titles, as well as the titles of the x-axis and y-axis.
set the minimum y-axis values for the y-axis (for example, if all the y-axis values of your search are above 100 it may improve
clarity to have the chart start at 100).
167
set the unit scale to Log (logarithmic) to improve clarity of charts where you have a mix of very small and very large y-axis
values.
determine whether charts are stacked, 100% stacked, and unstacked. Bar and column charts are always unstacked by default.
See the following subsection for details on stacking bar and column charts.
If you are formatting bar or column charts in dashboards with the Visualization Editor you can additionally:
set the major unit for the y-axis (for example, you can arrange to have tick marks appear in units of 10, or 20,
or 45...whatever works best).
determine the position of the chart legend and the manner in which the legend labels are truncated.
turn their drilldown functionality on or off.
168
The following chart illustrates the customer views of pages in the website of MyFlowerShop, a hypothetical web-based flower store,
broken out by product category over a 7 day period:
sourcetype=access_* method=GET | timechart count by categoryId | fields _time BOUQUETS FLOWERS GIFTS
SURPRISE TEDDY
Note the usage of the fields command; it ensures that the chart only displays counts of events with a product category ID; events
without one (categorized as null by Splunk) are excluded.
The third Stack mode option, Stacked 100%, enables you to compare data distributions within a column or bar by making it fit to
100% of the length or width of the chart and then presenting its segments in terms of their proportion of the total "100%" of the
column or bar. Stacked 100% can help you to better see data distributions between segments in a column or bar chart that contains a
mix of very small and very large stacks when Stack mode is just set to Stacked.
Line and area charts
Line and area charts are commonly used to show data trends over time, though the x-axis can be set to any field value. If your chart
includes more than one series, each series will be represented by a differently colored line or area.
This chart is based on a simple search that reports on internal Splunk metrics:
index=_internal | timechart count by sourcetype
170
The shaded areas in area charts can help to emphasize quantities. The following area chart is derived from this search, which also
makes use of internal Splunk metrics:
index=_internal source=*metrics.log group=search_concurrency "system total" NOT user=* | timechart
max(active_hist_searches) as "Historical Searches" max(active_realtime_searches) as "Real-time Searches"
171
When you define the properties of your line and area charts, you can:
set the chart titles, as well as the titles of the x-axis and y-axis.
determine what Splunk does with missing (null) y-axis values. You can have the system leave gaps for null datapoints, have
connect to zero datapoints, or just connect to the next positive datapoint. If you choose to leave gaps, Splunk will display
markers for datapoints that are disconnected because they are not adjacent to other positive datapoints.
set the minimum y-axis values (for example, if all the y-axis values of your search are above 100 it may improve clarity to
have the chart start at 100).
set the unit scale to Log (logarithmic) to improve clarity of charts where you have a mix of very small and very large y-axis
values.
determine whether charts are stacked, 100% stacked, and unstacked. Bar and column charts are always unstacked by default.
See the following subsection for details on stacking bar and column charts.
172
If you are formatting line or area charts in dashboards with the Visualization Editor you can additionally:
set the major unit for the y-axis (for example, you can arrange to have tick marks appear in units of 10, or 20,
or 45...whatever works best).
determine the position of the chart legend and the manner in which the legend labels are truncated.
turn their drilldown functionality on or off.
173
Pie chart
Use a pie chart to show the relationship of parts of your data to the entire set of data as a whole. The size of a slice in a pie graph is
determined by the size of a value of part of your data as a percentage of the total of all values.
The following pie chart presents the views by referrer domain for a hypothetical online store for the previous day. Note that you can
get metrics for individual pie chart wedges by mousing over them.
174
When you define the properties of pie charts you can set the chart title. If you are formatting pie charts in dashboards with the
Visualization Editor you can additionally:
175
Scatter chart
Use a scatter chart ( or "scatter plot") to show trends in the relationships between discrete values of your data. Generally, a scatter plot
shows discrete values that do not occur at regular intervals or belong to a series. This is different from a line graph, which usually
plots a regular series of points.
Here's an example of a search that can be used to generate a scatter chart. It looks at USGS earthquake data (in this case a CSV file
that presents all magnitude 2.5+ quakes recorded over a given 7-day period, worldwide), pulls out just the Californian quakes, plots
out the quakes by magnitude and quake depth, and then color-codes them by region. As you can see the majority of quakes recorded
during this period were fairly shallow--10 or fewer meters in depth, with the exception of one quake that was around 27 meters deep.
None of the quakes exceeded a magnitude of 4.0.
176
To generate the chart for this example, we've used the table command, followed by three fields. The first field is what appears in the
legend (Region). The second field is the x-axis value (Magnitude), which leaves the third field (Depth) to be the y-axis value. Note
that when you use table the latter two fields must be numeric in nature.
source=usgs Region=*California | table Region Magnitude Depth | sort Region
You can download a current CSV file from the USGS Earthquake Feeds and add it as an input to Splunk, but the field names and
format will be slightly different from the example shown here.
When you define the properties of your scatter charts, you can:
set the chart titles, as well as the titles of the x-axis and y-axis.
set the minimum y-axis values for the y-axis (for example, if all the y-axis values of your search are above 100
it may improve clarity to have the chart start at 100).
set the unit scale to Log (logarithmic) to improve clarity of charts where you have a mix of very small and very
large y-axis values.
If you are formatting bar or column charts in dashboards with the Visualization Editor you can additionally:
set the major unit for the y-axis (for example, you can arrange to have tick marks appear in units of 10, or 20,
or 45...whatever works best).
determine the position of the chart legend and the manner in which the legend labels are truncated.
turn their drilldown functionality on or off.
177
Then click the visualization tab to see the result of this having two series. Make sure to select Line Chart
178
Hands on Lab :
1. Upload a data file called: ImplementingSplunkDataGenerator.tgz located on the desktop
Run:
source="ImplementingSplunkDataGenerator.tgz:*" host="WIN-SQM8ERRKEIJ"| chart count over date_month
by date_wday
If you look back at the results from stats, the data is presented as one row per combination. Instead of a row per combination, chart
generates the intersection of the two fields. You can specify multiple functions, but you may only specify one field each for over and
by.
Switching the fields (by rearranging our search statement a bit) turns the data the other way.
By simply clicking on the Visualization tab (to the right of the Statistics tab), we can see these results in a chart:
179
This is an Area chart, with particular format options set. Within the chart area, you can click on Area to change the chart type (Line,
Area, Column, Bar, and so on) or Format to change the format options (Stack, Null Values, Multi-series Mode, and Drilldown).
Bonus Lab:
180
If your records have a unique Id field, then the following snippet removes null fields:
| stats values(*) as * by Id
The reason is that "stats values won't show fields that don't have at least one non-null value".
If your records don't have a unique Id field, then you should create one first using streamstats:
| streamstats count as Id | stats values(*) as * by Id
181
182
Then click the visualization tab to see the result of this having two series. Make sure to select Line Chart
183
Hands on Lab
Run
sourcetype=access_* | timechart count(eval(method="GET")) AS GET, count(eval(method="POST")) AS POST
184
Format charts
Let's go ahead and take a look at the (chart) Format options. These options are grouped as:
General: Under general, you have the option to set the Stack Model (which indicates how Splunk will display your chart
columns for different series (alongside each other or as a single column), determine how to handle Null Values (you can leave
gaps for null data points, connect to zero data points, or just connect to the next positive data point), set the Multi-series mode
(Yes or No), and turn Drilldown (active or inactive) on or off.
X-Axis: Is mostly visual, you can set a custom title, allow truncation of label captions, and set the rotation of the text for your
chart labels.
Y-Axis: Here you can set not just a custom title, but also the scale (linear or log), the interval, and the min and max values.
Chart Overlay: Here you can set the following options:
Overlay: Select a field to show as an overlay.
View as Axis: Select On to map the overlay to a second Y-axis.
Title: Specify a title for the overlay.
Scale: Select Inherit, Linear, or Log. Inherit uses the scale for the base chart. Log provides a logarithmic scale,
useful for minimizing the display of large peak values.
Interval: Enter the units between tick marks in the axis.
Min Value: The minimum value to display. Values less than the Min Value do not appear on the chart.
Max Value: The maximum value to display. Values greater than the Max Value do not appear on the chart.
Legend: Finally, under Legend, you can set Position (where to place the legend (or to not include the legend) in the
visualization.) and Truncation (set how to represent names that are too long to display). Keep in mind that, depending on your
search results and the visualization options that you select, you may or may not get a useable result. Some experimentation
with the various options is recommended.
185
chart:
used to create charts that can display any series of data that you want to plot. You can decide what
field is tracked on the x-axis of the chart.
timechart: used to create "trend over time" reports, which means that _time is always the x-axis.
top: generates charts that display the most common values of a field.
rare: creates charts that display the least common values of a field.
stats, eventstats, and streamstats: generate reports that display summary statistics.
associate, correlate, and diff: create reports that enable you to see associations, correlations, and differences
between fields in your data.
186
Note: As you'll see in the following examples, you always place your reporting commands after your search commands, linking them
with a pipe operator ("|").
chart, timechart, stats, eventstats,
and streamstats are all designed to work in conjunction with statistical functions. The list
of available statistical functions includes:
187
Hands on Lab
Please format your chart from the last lab exercise
188
189
190
Example 1: Use eval to define a field that is the sum of the areas of two circles, A and B.
... | eval sum_of_areas = pi() * pow(radius_a, 2) + pi() * pow(radius_b, 2)
The area of circle is r^2, where r is the radius. For circles A and B, the radii are radius_a and radius_b, respectively. This eval
expression uses the pi and pow functions to calculate the area of each circle and then adds them together, and saves the result in a field
named, sum_of_areas.
191
Example 2: Use eval to define a location field using the city and state fields. For example, if the city=Philadelphia and state=PA,
location="Philadelphia, PA".
... | eval location=city.", ".state
192
Convert values
The convert command converts field values into numerical values. Unless you use the AS clause, the original values
are replaced by the new values.
Example 1
This example uses sendmail email server logs and refers to the logs with sourcetype=sendmail. The sendmail logs
have two duration fields, delay and xdelay.
The delay is the total amount of time a message took to deliver or bounce. The delay is expressed as "D+HH:MM:SS", which
indicates the time it took in hours (HH), minutes (MM), and seconds (SS) to handle delivery or rejection of the message. If the
delay exceeds 24 hours, the time expression is prefixed with the number of days and a plus character (D+).
The xdelay is the total amount of time the message took to be transmitted during final delivery, and its time is expressed as
"HH:MM:SS".
Change the sendmail duration format of delay and xdelay to seconds.
sourcetype=sendmail | convert dur2sec(delay) dur2sec(xdelay)
This search pipes all the sendmail events into the convert command and uses the dur2sec() function to convert the duration times of
the fields, delay and xdelay, into seconds.
193
Here is how your search results look after you use the fields sidebar to add the fields to your events:
You can compare the converted field values to the original field values in the events list.
Example 2
This example uses syslog data.
Convert a UNIX epoch time to a more readable time formatted to show hours, minutes, and seconds.
sourcetype=syslog | convert timeformat="%H:%M:%S" ctime(_time) AS c_time | table _time, c_time
The ctime() function converts the _time value of syslog (sourcetype=syslog) events to the format specified by the timeformat
argument. The timeformat="%H:%M:%S" arguments tells the search to format the _time value as HH:MM:SS.
Here, the table command is used to show the original _time value and the converted time, which is renamed c_time:
194
The ctime() function changes the timestamp to a non-numerical value. This is useful for display in a report or for readability in your
events list.
Example 3
This example uses syslog data.
Convert a time in MM:SS.SSS (minutes, seconds, and subseconds) to a number in seconds.
sourcetype=syslog | convert mstime(_time) AS ms_time | table _time, ms_time
The mstime() function converts the _time value of syslog (sourcetype=syslog) events from a minutes and seconds to just seconds.
195
Here, the table command is used to show the original _time value and the converted time, which is renamed ms_time:
The mstime() function changes the timestamp to a numerical value. This is useful if you want to use it for more calculations.
More examples
Example 1: Convert values of the "duration" field into number value by removing string values in the field value. For example, if
"duration="212 sec"", the resulting value is "duration="212"".
... | convert rmunit(duration)
Example 2: Change the sendmail syslog duration format (D+HH:MM:SS) to seconds. For example, if "delay="00:10:15"", the
resulting value is "delay="615"".
196
Example 4: Convert every field value to a number value except for values in the field "foo" Use the "none" argument to specify fields
to ignore.
... | convert auto(*) none(foo)
197
Hands on Lab
1.
Run:
Take the Loan csv file and develop some eval functions
198
All functions that accept strings can accept literal strings or any field.
All functions that accept numbers can accept literal numbers or any numeric field.
case(X,"Y",...)
Description
This function takes pairs of arguments X and Y. The X arguments are Boolean expressions that are evaluated from first to last. When the first X expression that evaluates to TRUE is encountered, the corresponding Y argument is returned. The function defaults to NULL if none are true.
cidrmatch("X",Y)
Description
This function returns true when IP address Y belongs to a particular subnet X. The function uses two string arguments: the first is the CIDR subnet; the second is the IP address to match.
Example(s)
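A hypothetical Python sketch of the semantics of these two functions may help; the subnet and IP values below are invented for illustration:

```python
import ipaddress

def cidrmatch(subnet: str, ip: str) -> bool:
    # mirrors cidrmatch("X",Y): true when IP address Y belongs to CIDR subnet X
    return ipaddress.ip_address(ip) in ipaddress.ip_network(subnet)

def case(*pairs):
    # mirrors case(X,"Y",...): return the Y paired with the first true X, else NULL
    for cond, result in zip(pairs[::2], pairs[1::2]):
        if cond:
            return result
    return None

print(cidrmatch("10.9.165.0/25", "10.9.165.11"))  # True
print(cidrmatch("10.9.165.0/25", "10.9.166.11"))  # False
print(case(False, "a", True, "b", True, "c"))     # b
```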
if(X,Y,Z)
like(TEXT,
PATTERN)
or
... | where like(field, "foo%")
null()
nullif(X,Y)
searchmatch(X)
validate(X,Y,...)
tostring(X,Y)
Description
Examples
tostring(X,"commas")
formats X
with commas and, if the number
includes decimals, rounds to nearest
two decimal places.
tostring(X,"duration")
converts
seconds X to readable time format
HH:MM:SS.
Note: When used with the eval command, the values might
not sort as expected because the values are converted to ASCII.
Use the fieldformat command with the tostring function to
format the displayed values. The underlying values are not
changed with the fieldformat command.
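The behavior of these two tostring() formats can be sketched in Python; these helper names and sample values are our own, not Splunk's:

```python
def tostring_commas(x):
    # like tostring(X,"commas"): thousands separators, rounded to two decimals
    return f"{x:,.2f}" if x != int(x) else f"{int(x):,}"

def tostring_duration(seconds):
    # like tostring(X,"duration"): seconds -> readable HH:MM:SS
    h, rem = divmod(int(seconds), 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"

print(tostring_commas(1234567.891))  # 1,234,567.89
print(tostring_duration(615))        # 00:10:15
```

Note how both produce strings, which is why values formatted this way may not sort numerically.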
Cryptographic functions
Function
Description
Example(s)
md5(X)
This function computes and returns the MD5 hash of a string value X.
... | eval
n=md5(field)
sha1(X)
This function computes and returns the secure hash of a string value X based
on the FIPS compliant SHA-1 hash function.
... | eval
n=sha1(field)
sha256(X)
This function computes and returns the secure hash of a string value X based
on the FIPS compliant SHA-256 hash function.
... | eval
n=sha256(field)
sha512(X)
This function computes and returns the secure hash of a string value X based
on the FIPS compliant SHA-512 hash function.
... | eval
n=sha512(field)
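These functions compute standard hashes, so their output matches Python's hashlib for the same input string (the input "hello" is arbitrary):

```python
import hashlib

value = "hello".encode()
print(hashlib.md5(value).hexdigest())     # 32 hex characters
print(hashlib.sha1(value).hexdigest())    # 40 hex characters
print(hashlib.sha256(value).hexdigest())  # 64 hex characters
```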
Description
Example(s)
now()
relative_time(X,Y)
... | eval
n=relative_time(now(), "-1d@d")
strftime(X,Y)
strptime(X,Y)
This function takes a time represented by a string, X, and parses it into a timestamp using the format specified by Y.
If timeStr is in the form "11:59", this returns it as a timestamp:
... | eval n=strptime(timeStr, "%H:%M")
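Python's datetime.strptime uses the same format codes and illustrates the parsing step:

```python
from datetime import datetime

# parse the string "11:59" using the format "%H:%M", as in the SPL example
t = datetime.strptime("11:59", "%H:%M")
print(t.hour, t.minute)  # 11 59
```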
time()
Description
Example(s)
or
... | where isbool(field)
isint(X)
or
... | where isint(field)
isnotnull(X)
or
... | where isnotnull(field)
isnull(X)
or
... | where isnull(field)
isnum(X)
or
... | where isnum(field)
isstr(X)
or
... | where isstr(field)
typeof(X)
Mathematical functions
Function
abs(X)
Description
Examples
ceil(X),
ceiling(X)
exact(X)
formatted output.
exp(X)
floor(X)
ln(X)
log(X,Y)
log(X)
pi()
pow(X,Y)
round(X,Y)
sigfig(X)
This function rounds X to the appropriate number of significant figures. For example, 1.00*1111 = 1111, but
... | eval n=sigfig(1.00*1111)
returns n=1110.
sqrt(X)
This function takes one numeric argument X and
returns its square root.
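Several of these functions behave like their Python math counterparts, which can serve as a quick mental model (sample inputs are arbitrary):

```python
import math

print(math.ceil(1.9))   # ceil/ceiling -> 2
print(math.floor(1.9))  # floor -> 1
print(math.exp(1))      # exp of 1 is e (~2.718)
print(pow(2, 10))       # pow -> 1024
print(math.sqrt(9))     # sqrt -> 3.0
```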
Multivalue functions
Function
commands(X)
Description
This function takes a search string, or
field that contains a search string, X and
returns a multivalued field containing a
list of the commands used in X. (This is
generally not recommended for use
except for analysis of audit.log events.)
Example(s)
... | eval x=commands("search foo |
stats count | sort count")
mvappend(X,...)
... | eval
fullName=mvappend(initial_values,
"middle value", last_values)
mvcount(MVFIELD)
This function takes a field and returns the number of
values if it is a multivalue field, 1 if it is a
single value field, and NULL otherwise.
mvdedup(X)
mvfilter(X)
mvfind(MVFIELD,"REGEX")
mvindex(MVFIELD,STARTINDEX,
ENDINDEX)
mvindex(MVFIELD,STARTINDEX)
mvjoin(MVFIELD,STR)
mvrange(X,Y,Z)
mvsort(X)
mvzip(X,Y,"Z")
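A loose Python analogue, treating a multivalue field as a list, shows what several of these functions do; the sample values are made up:

```python
# multivalue fields behave much like Python lists
initial_values = ["one", "two"]
last_values = ["four"]

full = initial_values + ["middle value"] + last_values  # mvappend
print(full)
print(len(full))                                        # mvcount -> 4
print("_".join(full))                                   # mvjoin with "_"
print(sorted(set(["b", "a", "b"])))                     # mvdedup + mvsort
print(list(range(1, 10, 2)))                            # mvrange(1,10,2)
```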
Statistical functions
In addition to these functions, a comprehensive set of statistical functions is available to use with the stats, chart, and related
commands.
Function
Description
Example(s)
Text functions
Function
Description
Examples
len(X)
lower(X)
ltrim(X,Y)
ltrim(X)
rtrim(X)
spath(X,Y)
split(X,"Y")
substr(X,Y,Z)
trim(X,Y)
trim(X)
upper(X)
urldecode(X)
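Most of these text functions map directly onto familiar Python string operations, which can be used as a rough guide to their behavior (sample strings are arbitrary):

```python
s = "  Splunk  "
print(len(s))               # len -> 10 (includes the padding spaces)
print(s.strip())            # trim: removes leading and trailing spaces
print(s.strip().lower())    # lower
print(s.strip().upper())    # upper
print("a,b,c".split(","))   # split(X,",") -> ['a', 'b', 'c']
print("splendid"[0:3])      # substr-style slice; note SPL's substr is 1-based
```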
Trigonometry and Hyperbolic functions
Function
Description
Examples
acos(X)
acosh(X)
asin(X)
asinh(X)
atan(X)
This function computes the arc tangent of X, in the interval [-pi/2,+pi/2] radians.
atan2(Y,
X)
.. | eval n=atan2(0.50,
0.75)
To compute the value, the function takes into account the sign of both arguments
to determine the quadrant.
atanh(X)
cos(X)
cosh(X)
hypot(X,Y)
This function returns the square root of the sum of the squares of X and Y, as
described in the Pythagorean theorem.
sin(X)
... | eval n=sin(1)
sinh(X)
tan(X)
tanh(X)
Hands-on Lab
Please take a look at the Loan csv file. Use that file and some of the functions in the table in the manual.
Take a look at round() and some of the other functions that are very popular.
Group: Corresponds to the working group(s) of the user saving the search.
Search type: Indicates the type of search (alert, report, summary-index-populating)
Platform: Corresponds to the platform to which the search applies.
Category: Corresponds to the concern areas for the prevailing platforms.
Time interval: The interval over which the search runs (or on which the search runs, if it is a scheduled
search).
Description: A meaningful description of the context and intent of the search, limited to one or two words if
possible. Ensures the search name is unique.
Search type: Alert, Report, Summary
Platform: Windows, iSeries, Network
Category: Disk, Exchange, SQL, Event log, CPU, Jobs, Subsystems, Services, Security
Time interval: <arbitrary>
Description: <arbitrary>
Examples:
SEG_Alert_Windows_Eventlog_15m_Failures
SEG_Report_iSeries_Jobs_12hr_Failed_Batch
NOC_Summary_Network_Security_24hr_Top_src_ip
When you set up the lookup in props.conf, you can just use ipaddress where you'd otherwise have used ip:
[dns]
LOOKUP-ip = dnsLookup ipaddress OUTPUT host
The eval command is immensely versatile and useful. While some eval expressions are relatively simple, they can often be quite
complex. If you find that you need to use a particularly long and complex eval expression on a regular basis, you may find that
retyping the expression accurately in search after search is tedious business.
This is where calculated fields come to the rescue. Calculated fields enable you to define fields with eval expressions in props.conf .
Then, when you're writing out a search, you can cut out the eval expression entirely and reference the field like you would any other
extracted field. When you run the search, the fields will be extracted at search time and will be added to the events that include the
fields in the eval expressions.
For example, take this example search , which examines earthquake data and classifies quakes by their depth by creating a new
Description field:
source=eqs7day-M1.csv | eval Description=case(Depth<=70, "Shallow", Depth>70 AND Depth<=300, "Mid",
Depth>300 AND Depth<=700, "Deep") | table Datetime, Region, Depth, Description
Using calculated fields, you could define the eval expression for the Description field in props.conf and write the search as:
source=eqs7day-M1.csv | table Datetime, Region, Depth, Description
You can now search on Description as if it is any other extracted field. Splunk Enterprise will find the calculated field key in
props.conf and evaluate it for every event that contains a Depth field. You can also run searches like this:
source=eqs7day-M1.csv Description=Deep
Note: In the next section we show you how the Description calculated field would be set up in props.conf.
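The classification logic inside that case() expression can be sketched in Python, which makes the depth thresholds easy to check (the function name describe is our own):

```python
def describe(depth):
    # mirrors: case(Depth<=70, "Shallow", Depth>70 AND Depth<=300, "Mid",
    #                Depth>300 AND Depth<=700, "Deep")
    if depth <= 70:
        return "Shallow"
    elif depth <= 300:
        return "Mid"
    elif depth <= 700:
        return "Deep"
    return None  # case() defaults to NULL when no condition matches

print(describe(35), describe(150), describe(500))  # Shallow Mid Deep
```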
Hands on Lab:
To create a calculated field go to: Settings -> Fields -> Add new (under the Calculated Fields section)
Sourcetype: csv
Name the field a_test and give it an eval expression that doubles annual_inc (for example, annual_inc*2)
Save it
When you bring up the csv sourcetype in search, you will see that the field a_test doubled the amount of annual_inc
Now you can try other calculated fields, if you like
As Splunk Enterprise processes events, it extracts fields from them. This process is called field extraction.
Splunk Enterprise automatically extracts some fields
Splunk Enterprise extracts some fields from your events without assistance. It automatically extracts host, source, and sourcetype
values, timestamps, and several other default fields when it indexes incoming events.
It also extracts fields that appear in your event data as key=value pairs. This process of recognizing and extracting k/v pairs is called
field discovery. You can disable field discovery to improve search performance.
When fields appear in events without their keys, Splunk Enterprise uses pattern-matching rules called regular expressions to extract
those fields as complete k/v pairs. With a properly configured regular expression, Splunk Enterprise can extract user_id=johnz from
the previous sample event. Splunk Enterprise comes with several field extraction configurations that use regular expressions to
identify and extract fields from event data.
To get all of the fields in your data, create custom field extractions
To use the power of Splunk Enterprise search, create additional field extractions. Custom field extractions allow you to capture and
track information that is important to your needs, but which is not automatically discovered and extracted by Splunk Enterprise. Any
field extraction configuration you provide must include a regular expression that tells Splunk Enterprise how to find the field that you
want to extract.
All field extractions, including custom field extractions, are tied to a specific source, sourcetype, or host value. For example, if you
create an ip field extraction, you might tie the extraction configuration for ip to sourcetype=access_combined.
Custom field extractions should take place at search time, but in certain rare circumstances you can arrange for some custom field
extractions to take place at index time.
Before you create custom field extractions, get to know your data
Before you begin to create field extractions, ensure that you are familiar with the formats and patterns of the event data associated
with the source, sourcetype, or host that you are working with. One way is to investigate the predominant event patterns in your
data with the Patterns tab.
Here are two events from the same source type, an Apache web server access log.
131.253.24.135 - - [03/Jun/2014:20:49:53 -0700] "GET /wp-content/themes/aurora/style.css HTTP/1.1" 200 7464
"http://www.splunk.com/download" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0;
Trident/5.0)
10.1.10.14 - - [03/Jun/2014:20:49:33 -0700] "GET / HTTP/1.1" 200 75017 "-" "Mozilla/5.0 (compatible; Nmap
Scripting Engine; http://nmap.org/book/nse.html)"
While these events contain different strings and characters, they are formatted in a consistent manner. They both present values for
fields such as clientIP, status, bytes, method, and so on in a reliable order.
Reliable means that the method value is always followed by the URI value, the URI value is always followed by the status value, the
status value is always followed by the bytes value, and so on. When your events have consistent and reliable formats, you can
create a field extraction that accurately captures multiple field values from them.
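As an illustrative sketch (not Splunk's own extraction), a regular expression over the first sample event above might capture several of those fields at once; the group names here are our own labels:

```python
import re

# capture clientip, timestamp, method, uri, status, and bytes from an
# apache-style access-log event like the samples shown above
pattern = re.compile(
    r'(?P<clientip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) [^"]*" (?P<status>\d+) (?P<bytes>\d+)'
)

event = ('131.253.24.135 - - [03/Jun/2014:20:49:53 -0700] '
         '"GET /wp-content/themes/aurora/style.css HTTP/1.1" 200 7464')
m = pattern.search(event)
print(m.group("clientip"), m.group("method"), m.group("status"), m.group("bytes"))
```

Because the format is reliable, the same expression extracts the same fields from every event of this source type.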
For contrast, look at this set of Cisco ASA firewall log events:
1. Jul 15 20:10:27 10.11.36.31 %ASA-6-113003: AAA group policy for user AmorAubrey is being set to Acme_techoutbound
2. outside:87.194.216.51
3. Jul 15 20:13:52 10.11.36.28 %ASA-6-302014: Teardown TCP connection 517934 for Outside:128.241.220.82/1561 to Inside:10.123.124.28/8443 duration 0:05:02 bytes 297 Tunnel has been torn down (AMOSORTILEGIO)
4. Apr 19 11:24:32 PROD-MFS-002 %ASA-4-106103: access-list fmVPN-1300 denied udp for user 'sdewilde7' outside/12.130.60.4(137) -> inside1/10.157.200.154(137) hit-cnt 1 first hit [0x286364c7, 0x0] "
While these events contain field values that are always space-delimited, they do not share a reliable format like the preceding two
events. In order, these events represent:
1. A group policy change
2. An IGMP request
3. A TCP connection
4. A firewall access denial for a request from a specific IP
Because these events differ so widely, it is difficult to create a single field extraction that can apply to each of these event patterns and
extract relevant field values.
In situations like this, where a specific host, source type, or source contains multiple event patterns, you may want to define field
extractions that match each pattern, rather than designing a single extraction that can apply to all of the patterns. Inspect the events to
identify text that is common and reliable for each pattern.
Using required text in field extractions
In the last four events, the strings of numbers that follow %ASA-#- have specific meanings. You can find their definitions in the Cisco
documentation. When you have unique event identifiers like these in your data, specify them as required text in your field extraction.
Required text strings limit the events that can match the regular expression in your field extraction.
Specifying required text is optional, but it offers multiple benefits. Because required text reduces the set of events that the extraction scans, it
improves field extraction efficiency and decreases the number of false-positive field extractions.
The field extractor utility enables you to highlight text in a sample event and specify that it is required text.
Methods of custom field extraction in Splunk Enterprise
As a knowledge manager you oversee the set of custom field extractions created by users of your Splunk Enterprise implementation,
and you might define specialized groups of custom field extractions yourself. The ways that you can do this include:
The field extractor utility, which generates regular expressions for your field extractions.
Adding field extractions through pages in Settings. You must provide a regular expression.
Manual addition of field extraction configurations at the .conf file level. Provides the most flexibility for field
extraction.
The field extraction methods that are available to Splunk Enterprise users are described in the following sections. All of these methods
enable you to create search-time field extractions. To create an index-time field extraction, choose the third option: Configure field
extractions directly in configuration files.
Let the field extractor build extractions for you
The field extractor utility leads you step-by-step through the field extraction design process. It provides two methods of field
extraction: regular expressions and delimiter-based field extraction. The regular expression method is useful for extracting fields from
unstructured event data, where events may follow a variety of different event patterns. It is also helpful if you are unfamiliar with
regular expression syntax and usage, because it generates regular expressions and lets you validate them.
The delimiter-based field extraction method is suited to structured event data. Structured event data comes from sources like SQL
databases and CSV files, and produces events where all fields are separated by a common delimiter, such as commas, spaces, or pipe
characters. Regular expressions usually are not necessary for structured data events from a common source.
With the regular expression method of the field extractor you can:
Set up a field extraction by selecting a sample event and highlighting fields to extract from that event.
The field extractor can only build search time field extractions that are associated with specific sources or source types in your data
(no hosts).
Define field extractions with the Field Extractions and Field Transformations pages
You can use the Field Extractions and Field Transformations pages in Settings to define and maintain complex extracted fields in
Splunk Web.
This method of field extraction creation lets you create a wider range of field extractions than you can generate with the field extractor
utility. It requires that you have the following knowledge.
If you create a custom field extraction that extracts its fields from _raw and does not require a field transform, use the field extractor
utility. The field extractor can generate regular expressions, and it can give you feedback about the accuracy of your field extractions
as you define them.
Use the Field Extractions page to create basic field extractions, or use it in conjunction with the Field Transformations page to define
field extraction configurations that can do the following things.
Reuse the same regular expression across multiple sources, source types, or hosts.
Apply multiple regular expressions to the same source, source type, or host.
Use a regular expression to extract fields from the values of another field.
The Field Extractions and Field Transformations pages define only search time field extractions.
Hands on Lab
Please refer to Lab on desktop
Settings -> Tags -> List by tag name, then click Add new
Event types are a categorization system to help you make sense of your data. Event types let you sift through huge amounts of data,
find similar patterns, and create alerts and reports.
Workflow actions have a wide variety of applications. For example, you can define workflow actions that:
Are targeted to events that contain a specific field or set of fields, or which belong to a particular event type.
Appear either in field menus or event menus in search results. You can also set them up to only appear in the
menus of specific fields, or in all field menus in a qualifying event.
When selected, open either in the current window or in a new one.
GET workflow actions, which create typical HTML links to do things like perform Google searches on
specific values or run domain name queries against external WHOIS databases.
POST workflow actions, which generate an HTTP POST request to a specified URI. This action type
enables you to do things like create entries in external issue management systems using a set of relevant field
values.
Search workflow actions, which launch secondary searches that use specific field values from an event,
such as a search that looks for the occurrence of specific combinations of ipaddress and http_status field
values in your index over a specific time range.
GET link workflow actions drop one or more values into an HTML link. Clicking that link performs an HTTP GET request in a
browser, allowing you to pass information to an external web resource, such as a search engine or IP lookup service.
To define a GET workflow action:
1. Navigate to Settings > Fields > Workflow Actions.
2. Click New to open up a new workflow action form.
3. Define a Label for the action.
The Label field enables you to define the text that is displayed in either the field or event workflow menu.
Labels can be static or include the value of relevant fields.
4. Determine whether the workflow action applies to specific fields or event types in your data.
Use Apply only to the following fields to identify one or more fields. When you identify fields, the
workflow action only appears for events that have those fields, either in their event menu or field menus. If
you leave it blank or enter an asterisk, the action appears in menus for all fields.
Use Apply only to the following event types to identify one or more event types. If you identify an event
type, the workflow action only appears in the event menus for events that belong to the event type.
5. For Show action in determine whether you want the action to appear in the Event menu, the Fields menus, or Both.
6. Set Action type to link.
7. In URI provide a URI for the location of the external resource that you want to send your field values to.
Similar to the Label setting, when you declare the value of a field, you use the name of the field enclosed by
dollar signs.
Variables passed in GET actions via URIs are automatically URL encoded during transmission. This means
you can include values that have spaces between words or punctuation characters.
8. Under Open link in, determine whether the workflow action displays in the current window or if it opens the link in a new window.
9. Set the Link method to get.
10. Click Save to save your workflow action definition.
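The field substitution and automatic URL encoding described in step 7 can be sketched in Python; the URI template and field value are invented examples:

```python
from urllib.parse import quote_plus

# a GET workflow action URI with a $host$ placeholder (hypothetical example)
uri_template = "http://www.google.com/search?q=$host$"
field_value = "www.splunk.com download page"  # value taken from the event

# the placeholder value is URL encoded before transmission, so spaces are safe
final = uri_template.replace("$host$", quote_plus(field_value))
print(final)
```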
Hands-on Lab
7. Under URI provide the URI for a web resource that responds to POST requests.
8. Under Open link in, determine whether the workflow action displays in the current window or if it opens the link in a new window.
9. Set Link method to Post.
10. Under Post arguments define arguments that should be sent to the web resource at the identified URI.
These arguments are key and value combinations. On both the key and value sides of the argument, you can
use field names enclosed in dollar signs to identify the field value from your events that should be sent over to
the resource. You can define multiple key/value arguments in one POST workflow action.
Enter the key in the first field, and the value in the second field. Click Add another field to create an
additional POST argument.
11. Click Save to save your workflow action definition.
Splunk software automatically HTTP-form encodes variables that it passes in POST link actions via URIs. This means you can
include values that have spaces between words or punctuation characters.
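The key/value substitution and HTTP-form encoding described above can be sketched in Python; the argument names, placeholders, and event values here are invented:

```python
from urllib.parse import urlencode

# hypothetical POST arguments with $field$ placeholders, and a sample event
args = {"server": "$host$", "description": "issue with $src_ip$"}
event = {"host": "web01", "src_ip": "10.2.1.3"}

# fill each placeholder from the event, then HTTP-form encode the pairs
filled = {k: v.replace("$host$", event["host"]).replace("$src_ip$", event["src_ip"])
          for k, v in args.items()}
print(urlencode(filled))
```

Form encoding is what lets values with spaces or punctuation pass through safely.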
To set up workflow actions that launch dynamically populated secondary searches, you start by setting Action type to search on the
Workflow actions detail page. This reveals a set of Search configuration fields that you use to define the specifics of the secondary
search.
In Search string enter a search string that includes one or more placeholders for field values, bounded by dollar signs. For example, if
you're setting up a workflow action that searches on client IP values that turn up in events, you might simply enter
clientip=$clientip$ in that field.
Identify the app that the search runs in. If you want it to run in a view other than the current one, select that view. And as with all
workflow actions, you can determine whether it opens in the current window or a new one.
Be sure to set a time range for the search (or identify whether it should use the same time range as the search that created the field
listing) by entering relative time modifiers in the Earliest time and Latest time fields. If these fields are left blank the search runs
over all time by default.
Finally, as with other workflow action types, you can restrict the search workflow action to events containing specific sets of fields
and/or which belong to particular event types.
Hands-on Lab
Please refer to Lab on desktop
Describe alerts
Create alerts
View fired alerts
Hands on Lab covering: Describe alerts, Create alerts, View fired alerts
End of Module Hands on Quiz
Describe alerts
An alert is an action that a saved search triggers based on the results of the search. When creating an alert, you specify a condition that
triggers the alert. Typically the action is an email based on the results of the search. But you can also choose to run a script or to list
the alert as a triggered alert in Settings. When you create an alert you are creating a saved search with trigger conditions for the alert.
To avoid sending out alerts too frequently, specify a throttle condition for an alert.
The following list describes the types of alerts:
Per result alert. Based on a real-time search. The trigger condition is whenever the search returns a result.
Scheduled alert. Runs a search according to a schedule that you specify when creating the alert. You specify results of the
search that trigger the alert.
Rolling-window alert. Based on a real-time search. The trigger condition is a combination of specified results of the search
within a specified time window.
Create alerts
A scheduled alert runs periodically at a scheduled time, responding to a condition that triggers the alert.
This example uses a search to track when there are too many errors in a Splunk Enterprise instance during the last 24 hours. When the
number of errors exceeds 5, the alert sends an email with information about the conditions that triggered the alert. The alert sends an
email every day at 10:00AM when the number of errors exceeds the threshold.
1. From the Search Page, create the following search
index=_internal " error " NOT debug source=*splunkd.log* earliest=-24h latest=now
4. Click Next.
5. Click Send Email.
6. Set the following email settings, using tokens in the Subject and Message fields:
To: email recipient
Priority: Normal
Subject: Too many errors alert: $name$
Message: There were $job.resultCount$ errors reported on $trigger_date$.
Include: Link to Alert and Link to Results
Accept defaults for all other options.
7. Click Save.
After you create the alert you can view and edit the alert in the Alerts Page.
When the alert triggers, it sends the following email:
Hands-on Lab
Please refer to Lab on desktop
Describe macros
Manage macros
Create and use a basic macro
Hands on Lab covering: Describe macros, Manage macros, Create and use a basic macro.
Define arguments and variables for a macro
Add and use arguments with a macro
Hands on Lab covering: Define arguments and variables for a macro, Add and use arguments with a macro.
End of Module Hands on Quiz
Describe Macros
Search macros are chunks of a search that you can reuse in multiple places, including saved and ad hoc searches.
Search macros can be any part of a search, such as an eval statement or search term, and do not need to be a
complete command. You can also specify whether or not the macro field takes any arguments.
If Eval Generated Definition? is checked, then the 'Definition' is expected to be an eval expression that
returns a string that represents the expansion of this macro.
If a macro definition includes a leading pipe character ("|"), you may not use it as the first term in
searches from the UI. Example: "| metadata type=sources". The UI does not do the macro expansion and
cannot correctly identify the initial pipe to differentiate it from a regular search term. The UI constructs the
search as if the macro name were a search term, which after expansion would cause the metadata command
to be incorrectly formed and therefore invalid.
Arguments are a comma-delimited string of argument names. Argument names may only contain alphanumeric characters ('a-z',
'A-Z', '0-9'), underscores ('_'), and dashes ('-'). This list should not contain any repeated elements.
If a macro argument includes quotes, you need to escape the quotes when you call the macro in
your search. For example, if you wanted to pass a quoted string as your macro's argument, you would use:
`my-macro("He said \"hello!\"")`.
Validation Expression is a string that is an 'eval' expression that evaluates to a boolean or a string.
If the validation expression is a boolean expression, validation succeeds when it returns true. If it returns
false or is null, validation fails, and the Validation Error Message is returned.
If the validation expression is not a boolean expression, it is expected to return a string or NULL. If it returns null, validation is
considered a success. Otherwise, the string returned is rendered as the error string.
Apply macros to saved and ad hoc searches
To include a search macro in your saved or ad hoc searches, use the left quote (also known as a grave accent) character; on most
English-language keyboards, this character is located on the same key as the tilde (~). You can also reference a search macro within
other search macros using this same syntax.
Note: Do NOT use the straight quote character that appears on the same key as the double quote (").
Hands-on Lab
Describe Pivot
The Pivot tool lets you report on a specific data set without using the Splunk Enterprise Search Processing Language (SPL). First, identify
a dataset that you want to report on, and then use a drag-and-drop interface to design and generate pivots that present different aspects
of that data in the form of tables, charts, and other visualizations.
How does Pivot work? It uses data models to define the broad category of event data that you're working with, and then uses
hierarchically arranged collections of data model objects to further subdivide the original dataset and define the attributes that you
want Pivot to return results on. Data models and their objects are designed by the knowledge managers in your organization. They do
a lot of hard work for you to enable you to quickly focus on a specific subset of event data.
Data models drive the Pivot tool. They enable users of Pivot to create compelling reports and dashboards without designing the
searches that generate them. Data models can have other uses, especially for Splunk Enterprise app developers.
Splunk Enterprise knowledge managers design and maintain data models. These knowledge managers understand the format and
semantics of their indexed data and are familiar with the Splunk Enterprise search language. In building a typical data model,
knowledge managers use knowledge object types such as lookups, transactions, search-time field extractions, and calculated fields.
Data models are composed of one or more objects. Here are some basic facts about data model objects:
An object is a specification for a dataset. Each data model object corresponds in some manner to a set of data in an index.
You can apply data models to different indexes and get different datasets.
Objects break down into four types. These types are: Event objects, search objects, transaction objects, and child objects.
Objects are hierarchical. Objects in data models can be arranged hierarchically in parent/child relationships. The top-level
event, search, and transaction objects in data models are collectively referred to as "root objects."
Child objects have inheritance. Data model objects are defined by characteristics that mostly break down into constraints and
attributes. Child objects inherit constraints and attributes from their parent objects and have additional constraints and
attributes of their own.
Hands-on Lab