© 2017 SPLUNK INC.
Power of Splunk
Search Processing Language (SPL™)
07/13/2017 | Jacksonville
Tolga Tohumcu | Staff Sales Engineer
@TolgaTohumcu
During the course of this presentation, we may make forward-looking statements regarding future events or
the expected performance of the company. We caution you that such statements reflect our current
expectations and estimates based on factors currently known to us and that actual events or results could
differ materially. For important factors that may cause actual results to differ from those contained in our
forward-looking statements, please review our filings with the SEC.
The forward-looking statements made in this presentation are being made as of the time and date of its live
presentation. If reviewed after its live presentation, this presentation may not contain current or accurate
information. We do not assume any obligation to update any forward-looking statements we may make. In
addition, any information about our roadmap outlines our general product direction and is subject to change
at any time without notice. It is for informational purposes only and shall not be incorporated into any contract
or other commitment. Splunk undertakes no obligation either to develop the features or functionality
described or to include any such feature or functionality in a future release.
Splunk, Splunk>, Listen to Your Data, The Engine for Machine Data, Splunk Cloud, Splunk Light and SPL are trademarks and registered trademarks of Splunk Inc. in
the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2017 Splunk Inc. All rights reserved.
Forward-Looking Statements
Set Up Before You Can Play
Download the following from splunk.com and Splunkbase:
▶ Splunk Enterprise:
• https://www.splunk.com/download
▶ Power of SPL App:
• https://splunkbase.splunk.com/app/3353/
▶ Install Power of SPL App
▶ Now we’re ready to go!
Common problems at this point
▶ License expired (an older version was already installed)
• Close the browser, empty the cache, and reopen the browser. If that doesn’t work:
• Stop Splunk
• Uninstall all Splunk versions
• Windows: Control Panel -> Uninstall programs -> Splunk
• OS X: Finder -> Applications -> right-click Splunk -> Move to Trash
• Reinstall
• Start Splunk
▶ Can’t start Splunk
• Windows: search Control Panel -> Services -> Splunk -> Start
• Linux: cd <SPLUNK dir>/splunk/bin; ./splunk start
Agenda
1. Installation and Setup (~15min)
2. Power of SPL Walkthrough (~1h 30min)
• Overview & Anatomy of a Search
• SPL Commands & Examples for searching, charting, converging, mapping, transactions, anomalies, exploring data, custom
3. Custom Visualizations (~30min)
4. SPL and the Machine Learning Toolkit (~45min)
SPL Overview
▶ Over 140 search commands
▶ Syntax was originally based on the Unix pipeline and SQL, and is optimized for time-series data
▶ The scope of SPL includes data searching, filtering, modification, manipulation, enrichment, insertion and deletion
▶ Includes machine learning such as anomaly detection
[Diagram: Disk -> Intermediate results table -> Intermediate results table -> Final results table]
Why Create a New Query Language?
▶ Flexibility and effectiveness on small and big data
▶ Late-binding schema
▶ More/better methods of correlation
▶ Not just analyze, but visualize
SPL Basic Structure
search and filter | munge | report | cleanup
sourcetype=access*
| eval KB=bytes/1024
| stats sum(KB) dc(clientip)
| rename sum(KB) AS "Total KB" dc(clientip) AS "Unique Customers"
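The four stages (search and filter, munge, report, cleanup) can be pictured as functions that each take a results table and hand a new one to the next stage. The Python sketch below is only a mental model of the search on this slide, not how Splunk executes it; the sample events and the function names are made up for illustration.

```python
# Hypothetical sample events standing in for the results of sourcetype=access*.
events = [
    {"clientip": "10.0.0.1", "bytes": 2048},
    {"clientip": "10.0.0.2", "bytes": 1024},
    {"clientip": "10.0.0.1", "bytes": 512},
]


def eval_kb(rows):
    """Munge: | eval KB=bytes/1024 (adds a field to every row)."""
    return [dict(row, KB=row["bytes"] / 1024) for row in rows]


def stats_sum_dc(rows):
    """Report: | stats sum(KB) dc(clientip) (collapses all rows into one)."""
    return [{
        "sum(KB)": sum(row["KB"] for row in rows),
        "dc(clientip)": len({row["clientip"] for row in rows}),
    }]


def rename(rows, mapping):
    """Cleanup: | rename ... AS ... (relabels columns, leaves values alone)."""
    return [{mapping.get(key, key): value for key, value in row.items()}
            for row in rows]


def run_pipeline(rows):
    """Chain the stages the way the pipe character chains SPL commands."""
    rows = eval_kb(rows)
    rows = stats_sum_dc(rows)
    return rename(rows, {"sum(KB)": "Total KB",
                         "dc(clientip)": "Unique Customers"})


result = run_pipeline(events)
print(result)
```

Each function returns a fresh table, which mirrors the intermediate results tables in the SPL Overview diagram.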
Knowledge Objects
Data Interpretation – provide structure to raw data. (field extractions)
Data Classification – group similar events. (event types, transactions)
Data Enrichment – add value from external sources. (lookups)
Data Normalization – group related fields. (tags, aliases)
Data Models – represent one or more data sets. (pivot)
Types of Search Commands
Streaming
• Operate on each event individually
• Distributable Streaming
– Run on Indexers
– eval, fields, rename, regex
• Centralized (Stateful) Streaming
– Run on Search Head
– head, streamstats
Transforming
• Generate a reporting data structure
• Operate on the entire event set
• chart, timechart, stats, top
Generating
• Usually invoked at the beginning of a search
• Do not expect or require an input
• | search is implied
• dbinspect, datamodel, inputcsv
Search Processor Categories
Streaming (remote)
• Consider each event/result row individually
• Can be distributed to the indexers
• Ex: eval, where, search, lookup
Stateful Streaming
• Consider each event/result row individually but maintain extra state modified by processing each event
• Ordering is significant, and as such they force the search back to the search head
• Ex: head, streamstats
Events
• Need to see all events/result rows to perform their tasks
• Ex: tail, sort, eventstats
Transforming
• Need to see all events/result rows and transform them into some new result rows
• Stream Reporting
– Can take batches of input, and then at any time generate output based on the input seen so far
– Ex: stats, chart
• Reporting
– Must take all events at once
– Ex: cluster, geostats
Search Pipelines
Pipeline Setup
▶Search Pipelines are built according to the processors and
the order they occur in the search string
• At most, 5 pipelines are built, one for each processor category
• Less restrictive processors may be accumulated into a more restrictive pipeline
• Depending on the order of commands
▶Each processor is asked to first evaluate its arguments for
validity, then processes them for setup
Search Pipelines
Pipeline Processing
▶ Pipeline processing begins
• Processors for the current category and earlier pipelines are accumulated in the current
pipeline
• Splunk moves to the next most restricted pipeline as we find a command that is restricted
to that level
• We never move back to a less restrictive pipeline once moving to a more restrictive pipeline
▶ Streaming Pipeline
• Only pipeline that can be distributed to indexers
• After all streaming commands are accumulated, non-streaming commands are asked if
they can be split
• Ex: Prestats, Presort, etc…
• If available, commands are added to the Streaming Pipeline
Search Pipelines
Pipeline Rendering
▶Streaming Pipeline
• Rendered as remoteSearch string and sent to Indexers
▶Stateful and Events Pipelines
• Rendered into eventsSearch
▶Stream Report and Report Pipelines
• Rendered into reportSearch
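The accumulation rule (move to a more restrictive pipeline when a command requires it, never move back) and the rendering into three search strings can be sketched in a few lines of Python. This is a simplified model built only from the categories and example commands on these slides; it ignores the prestats/presort splitting step, and the function names are hypothetical.

```python
# Processor categories from least to most restrictive, as listed on the slides.
CATEGORIES = ["streaming", "stateful", "events", "stream_report", "report"]

# Example commands per category, taken from the Search Processor Categories slide.
COMMAND_CATEGORY = {
    "eval": "streaming", "where": "streaming",
    "search": "streaming", "lookup": "streaming",
    "head": "stateful", "streamstats": "stateful",
    "tail": "events", "sort": "events", "eventstats": "events",
    "stats": "stream_report", "chart": "stream_report",
    "cluster": "report", "geostats": "report",
}

# Which rendered search string each pipeline ends up in.
RENDER_TARGET = {
    "streaming": "remoteSearch",
    "stateful": "eventsSearch",
    "events": "eventsSearch",
    "stream_report": "reportSearch",
    "report": "reportSearch",
}


def assign_pipelines(commands):
    """Walk the search left to right; the pipeline level only ever increases."""
    level = 0
    assigned = []
    for command in commands:
        level = max(level, CATEGORIES.index(COMMAND_CATEGORY[command]))
        assigned.append((command, CATEGORIES[level]))
    return assigned


def render(commands):
    """Group commands by the search string their pipeline is rendered into."""
    rendered = {"remoteSearch": [], "eventsSearch": [], "reportSearch": []}
    for command, category in assign_pipelines(commands):
        rendered[RENDER_TARGET[category]].append(command)
    return rendered


print(render(["search", "eval", "head", "eval", "stats"]))
```

In this model the second `eval` lands in the stateful pipeline even though `eval` is a streaming command, because `head` already forced the search back to the search head. That is one way to see why streaming commands are best placed before non-streaming ones.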
Search Modes
● Historical
– Cursored Search
‣ Most historical searches run in this mode
‣ Returns events in strict time ordering
– Batch Mode Search
‣ Automatically enabled when possible
‣ Reads buckets sequentially = returns
data faster
‣ Events are not time ordered
‣ Uses more memory
‣ Job Inspector: isBatchModeSearch
● Realtime
– Realtime Search
‣ Impacts indexing performance
‣ Hooks into indexing pipeline daemon to
build a separate queue of events
– Indexed Realtime Search
‣ Continuous search
‣ Does not hook into indexing pipeline
‣ Default latency of 60sec with 1 sec
lookback
– Parameters can be adjusted
Search UI
● Search UI Modes
– Fast, Smart, Verbose Mode
– Determine which fields are extracted
– Turn timeliner on/off
● Data Preview
– When it’s time, the main search thread is stopped so Splunk can gather current results and generate a preview
– Lots of parameters available to tweak this
– If you don’t need a preview, disable it in the UI
● Timeliner
– Runs in the background
– May cause unnecessary events to be downloaded, especially when in verbose mode
– By default, 1,000 events are downloaded per column with 300 columns max
– Parameters available to adjust this or turn it off
Search Advice
● Filter early
– Specify an index
– Utilize indexed extractions where available
– Use the TERM directive if applicable
● Place streaming/remote commands before non-streaming commands
● Avoid using table, except at the very end
– This is a reporting command and will cause data to be pushed to Search Head
● Remove unnecessary data using | fields
Job Inspector
Types of Components
▶ Command
• Measures actual work being done
▶ Dispatch
• Measures Splunk framework
● Invocations
– # of times the component was run
● Input/Output Count
– Generally the event count
– Exception: dispatch.stream.remote
‣ Byte count of data received from the
peer
Job Inspector
Dispatch.evaluate
▶ Search is parsed here
▶ Sub-searches are run here
Job Inspector
Dispatch.stream.remote
▶ Time spent searching on peers
▶ Useful for comparing time spent and # of results returned across peers
Job Inspector
Command.search.index.usec
▶ Indicates how long it took to search an index in microseconds
• Helps identify slow searching of indexes
• Can indicate VERY large indexes (especially on rare term searches)
• Lots of indexed extractions? Data with extremely high cardinality?
• Can indicate slow disk or disk contention (especially on dense searches)
▶ Histogram of invocations grouped by microseconds
▶ Correlate this to dispatch.stream.remote
Job Inspector
Other Interesting Components…
▶ Command.search.rawdata
• Time peers spend decompressing bucket slices
• On a dense search, can indicate CPU contention
▶ Command.search.filter
• Post filtering after KV extraction (schema-on-the-fly)
▶ Command.search.createProviderQueue
• Time spent setting up connections to peers
SPL Examples
SPL Examples and Recipes
▶ Find the needle in the haystack
▶ Charting statistics and predicting values
▶ Enriching and converging data sources
▶ Map geographic data in real time (We’ll add some custom viz here!)
▶ Identifying anomalies
▶ Transactions
▶ Data exploration & finding relationships between fields
▶ Custom commands
Search and Filter
Examples
▶ Keyword search:
sourcetype=access* http
▶ Filter:
sourcetype=access* http host=webserver-02
▶ Combined:
sourcetype=access* http host=webserver-02 (503 OR 504)
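To make the implicit AND between search terms and the explicit OR group concrete, here is a rough Python model of the combined search. It is an approximation under stated assumptions: keywords are matched as whitespace-separated, case-insensitive tokens of `_raw`, which is simpler than Splunk's real tokenization, and the sample events are invented.

```python
from fnmatch import fnmatch

# Hypothetical events; only the fields used by the search are included.
events = [
    {"sourcetype": "access_combined", "host": "webserver-01",
     "_raw": "GET /cart http 200"},
    {"sourcetype": "access_combined", "host": "webserver-02",
     "_raw": "GET /cart http 503"},
    {"sourcetype": "access_combined", "host": "webserver-02",
     "_raw": "GET /home http 200"},
    {"sourcetype": "syslog", "host": "webserver-02",
     "_raw": "kernel: 503 http"},
]


def has_term(event, term):
    """Approximate a keyword search as a case-insensitive token match."""
    return term.lower() in event["_raw"].lower().split()


def combined(event):
    """sourcetype=access* http host=webserver-02 (503 OR 504)"""
    return (
        fnmatch(event["sourcetype"], "access*")       # wildcard field match
        and has_term(event, "http")                   # keyword
        and event["host"] == "webserver-02"           # field filter
        and (has_term(event, "503") or has_term(event, "504"))  # OR group
    )


matches = [event for event in events if combined(event)]
print(matches)
```

Terms listed side by side are ANDed together, so every added term narrows the result set; only the parenthesized group widens it.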
Eval – Modify or Create New Fields and Values
Examples
▶ Calculation:
sourcetype=access*
| eval KB=bytes/1024
▶ Evaluation:
sourcetype=access*
| eval http_response = if(status != 200, "Error", "OK")
▶ Concatenation:
sourcetype=access*
| eval connection = device." - ".clientip
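The three eval patterns (calculation, conditional evaluation, concatenation) map directly onto ordinary per-event expressions. The sketch below restates them in Python on a single made-up event; it is meant only to show that eval works one event at a time, and it does not reproduce eval's type coercion rules.

```python
def eval_fields(event):
    """Apply the three eval examples to one event and return a copy."""
    out = dict(event)
    # | eval KB=bytes/1024
    out["KB"] = event["bytes"] / 1024
    # | eval http_response = if(status != 200, "Error", "OK")
    out["http_response"] = "Error" if event["status"] != 200 else "OK"
    # | eval connection = device." - ".clientip
    out["connection"] = event["device"] + " - " + event["clientip"]
    return out


# Hypothetical event with the fields the slide's searches rely on.
sample = {"bytes": 3072, "status": 404, "device": "iPhone",
          "clientip": "10.0.0.7"}
enriched = eval_fields(sample)
print(enriched)
```

Because each output field depends only on the event it belongs to, eval is a streaming command and can run on the indexers.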
Eval – Just Getting Started!
Splunk Search Quick Reference Guide
Stats – Calculate Statistics Based on Field Values
Examples
▶ Calculate stats and rename
index=power_of_spl
| stats avg(bytes) AS "Avg Bytes"
▶ Multiple statistics
index=power_of_spl
| stats avg(bytes) AS bytes sparkline(avg(bytes)) AS Bytes_Trend min(bytes) max(bytes)
▶ By another field
index=power_of_spl
| stats avg(bytes) AS avg_bytes sparkline(avg(bytes)) AS Bytes_Trend min(bytes) max(bytes) by clientip
| sort - avg_bytes
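The "by another field" example boils down to a group-by followed by a descending sort. A minimal Python sketch of that logic, with invented sample data and the sparkline column left out, might look like this:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical events from index=power_of_spl.
events = [
    {"clientip": "10.0.0.1", "bytes": 100},
    {"clientip": "10.0.0.1", "bytes": 300},
    {"clientip": "10.0.0.2", "bytes": 1000},
    {"clientip": "10.0.0.2", "bytes": 500},
    {"clientip": "10.0.0.3", "bytes": 50},
]


def stats_by_clientip(rows):
    """| stats avg(bytes) AS avg_bytes min(bytes) max(bytes) by clientip
       | sort - avg_bytes"""
    groups = defaultdict(list)
    for row in rows:
        groups[row["clientip"]].append(row["bytes"])
    table = [
        {
            "clientip": ip,
            "avg_bytes": mean(values),
            "min(bytes)": min(values),
            "max(bytes)": max(values),
        }
        for ip, values in groups.items()
    ]
    # "sort - avg_bytes" means descending order on avg_bytes.
    return sorted(table, key=lambda row: row["avg_bytes"], reverse=True)


for row in stats_by_clientip(events):
    print(row)
```

The output has one row per distinct `clientip`, which is why stats is a transforming command: the original events are gone once it runs.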
Timechart – Visualize Statistics Over Time
Examples
▶ Visualize stats over time
index=power_of_spl
| timechart avg(bytes)
▶ Add a trendline
index=power_of_spl
| timechart avg(bytes) as bytes
| trendline sma5(bytes)
▶ Add a prediction overlay
index=power_of_spl
| timechart avg(bytes) as bytes
| predict future_timespan=5 bytes
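`sma5` is a simple moving average over the last five time buckets. The following Python sketch shows the arithmetic on made-up per-bucket averages; it assumes the trend value is left empty until five values are available, which is how this sketch chooses to handle the warm-up period.

```python
def sma(values, period=5):
    """Simple moving average: mean of the current value and the period-1 before it."""
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)  # not enough history yet
        else:
            window = values[i + 1 - period : i + 1]
            out.append(sum(window) / period)
    return out


# Hypothetical avg(bytes) per time bucket, oldest first.
bytes_per_bucket = [10, 20, 30, 40, 50, 60, 70]
trend = sma(bytes_per_bucket)
print(trend)
```

A larger period smooths more aggressively but reacts more slowly to real changes in the series.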
Streamstats – Cumulative/Running Totals Statistics
Examples
▶ Cumulative/Running Totals
index=power_of_spl
| reverse
| streamstats sum(bytes) AS sum_bytes
| timechart latest(sum_bytes) as "Total Bytes"
▶ Summary Statistics
index=power_of_spl
| eventstats avg(bytes) AS overall_avg_bytes
| stats avg(bytes) as clientip_avg_bytes by clientip overall_avg_bytes
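The difference between the two commands is easiest to see side by side: streamstats computes a value from the events seen so far, while eventstats computes one value from all events and attaches it to each of them. Here is a small Python sketch of both, using invented events listed newest first.

```python
from itertools import accumulate
from statistics import mean

# Hypothetical events, newest first.
events = [
    {"_time": 3, "clientip": "10.0.0.2", "bytes": 300},
    {"_time": 2, "clientip": "10.0.0.1", "bytes": 200},
    {"_time": 1, "clientip": "10.0.0.1", "bytes": 100},
]


def running_total(rows):
    """| reverse | streamstats sum(bytes) AS sum_bytes"""
    oldest_first = list(reversed(rows))          # | reverse
    totals = accumulate(row["bytes"] for row in oldest_first)
    return [dict(row, sum_bytes=total)
            for row, total in zip(oldest_first, totals)]


def with_overall_avg(rows):
    """| eventstats avg(bytes) AS overall_avg_bytes"""
    overall = mean(row["bytes"] for row in rows)
    return [dict(row, overall_avg_bytes=overall) for row in rows]


print(running_total(events))
print(with_overall_avg(events))
```

The `reverse` step matters in the first example: without it the running total would accumulate from the newest event backwards.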
Stats/Timechart – But Wait, There’s More!
Splunk Search Quick Reference Guide
Converging Data Sources
Index Untapped Data: Any Source, Type, Volume. Ask Any Question.
[Diagram: data sources (on-premises, private cloud, public cloud; storage, online shopping cart, telecoms, desktops, security, web services, networks, containers, web clickstreams, RFID, smartphones and devices, servers, messaging, GPS location, packaged applications, custom applications, online services, databases, call detail records, energy meters, firewall, intrusion prevention) feeding use cases (application delivery; security, compliance and fraud; IT operations; business analytics; industrial data and the Internet of Things)]
lookup and appendcols – Converging Data Sources
Examples
▶ Enrich data with lookups
index=power_of_spl status!=200
| lookup customer_info uid
| stats count by customer_value
▶ Search Inception!
index=power_of_spl
[ search index=power_of_spl
| stats sum(bytes) as total_bytes by clientip
| sort - total_bytes | head 1 | return clientip ]
| stats count by clientip status uri
| sort - count
▶ Append multiple searches
index=power_of_spl
| timechart span=15s avg(bytes) as avg_bytes
| appendcols [ search index=power_of_spl | stats stdev(bytes) as stdev_bytes ]
| eval 2stdv_upper = avg_bytes + stdev_bytes*2
| filldown 2stdv_upper
| eval 2stdv_lower = avg_bytes - stdev_bytes*2
| filldown 2stdv_lower
| eval 2stdv_lower = if('2stdv_lower' < 0, 0, '2stdv_lower')
| fields - stdev_bytes
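The lookup example is essentially a join against a reference table followed by a count. The Python sketch below models it with a dictionary keyed by `uid`; the lookup contents and events are invented, and it assumes events with no matching lookup row simply fall out of the `count by customer_value` result.

```python
from collections import Counter

# Hypothetical contents of the customer_info lookup, keyed by uid.
customer_info = {
    "u1": {"customer_value": "gold"},
    "u2": {"customer_value": "silver"},
}

# Hypothetical events from index=power_of_spl.
events = [
    {"uid": "u1", "status": 503},
    {"uid": "u1", "status": 200},
    {"uid": "u2", "status": 404},
    {"uid": "u1", "status": 500},
    {"uid": "u9", "status": 500},   # no row in the lookup
]


def errors_by_customer_value(rows, lookup):
    """status!=200 | lookup customer_info uid | stats count by customer_value"""
    counts = Counter()
    for row in rows:
        if row["status"] == 200:
            continue                      # filtered out by status!=200
        match = lookup.get(row["uid"])
        if match is None:
            continue                      # no customer_value, so not counted
        counts[match["customer_value"]] += 1
    return dict(counts)


print(errors_by_customer_value(events, customer_info))
```

Filtering on `status!=200` before the lookup keeps the enrichment step from doing work on events that would be discarded anyway.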
iplocation, geostats, geom, table – Geographic Data
Examples
▶ Assign Lat/Lon to IP addresses
… | iplocation clientip
▶ Visualize statistics geographically
… | geostats sum(price) by product
▶ Use custom choropleths
… | geom <featureCollection> <featureId>
▶ Track object movements
… | table _time latitude longitude vehicleId
Custom Visualizations
Native Visualizations In Splunk
▶ Native charts and maps
• Bar / Line / Area charts
• Bubble / Scatter plots
• Gauges
• Maps
• Single Value Displays
• Tables
▶ Generalized to fit use cases across many different areas
▶ Can be customized to some extent to cover specific use cases
Custom Visualizations FTW!
▶ Many use cases require a more specific visualization
▶ Specific custom appearance
▶ Represent data where native visualizations are not suitable
• You can Splunk everything!
• We won’t be able to predict every possible use case
• Still uses SPL to drive visualizations
Custom Visualizations
▶ Platform extensibility framework and API
▶ Targeted at internal and external developers with web development / JS skills and basic knowledge of the Splunk platform
▶ Developers can make use of any third-party libraries (d3.js, three.js, highcharts.js, etc.) that run in the browser*
* With minor adjustments, and if the third-party license permits such use
Custom Visualizations For Admins
In-product Installation
• Packaged as an app!
• Installed like any other app
• Users can search for visualizations on Splunkbase and directly in the product
Custom Visualizations How-to
▶ Choose from potentially dozens of installed visualizations!
▶ Appears as a first-class citizen alongside native visualizations
• Looks and works just like packaged native visualizations
▶ Customize functionality and appearance of the visualization without touching any code, straight from the UI
▶ SPL example provided as you hover over each visualization option
New Splunk Visualizations
Multiple use cases across IT, security, IoT, and business analytics
• Treemap
• Sankey Diagram
• Punchcard
• Calendar Heat Map
• Parallel Coordinates
• Bullet Graph
• Location Tracker
• Horseshoe Meter
• Machine Learning Charts
• Timeline
• Horizon Chart
New Partner/Community Visualizations
• Box Plot
• 3D Scatter Plot
• Wordcloud
• Donut Chart
• Heat Map
• Geo Heatmap
• Custom Cluster Map
• Clustered Single Value Map
• Missile Map
Custom Visualizations – Demo!
[Demo Screenshot #1]
[Demo Screenshot #2]
Anomaly Detection – Find anomalies in your data
Examples
▶ Find anomalies
| inputlookup car_data.csv | anomalydetection
▶ Summarize anomalies
| inputlookup car_data.csv | anomalydetection action=summary
▶ Use IQR and remove outliers
| inputlookup car_data.csv | anomalydetection method=iqr action=remove
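For readers unfamiliar with the IQR method, the idea is to treat values far outside the interquartile range as outliers and drop them. The Python sketch below uses the textbook 1.5 × IQR fences and the standard library's quartile calculation; both are assumptions for illustration, and Splunk's own threshold and quartile method may differ.

```python
from statistics import quantiles


def remove_outliers_iqr(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is an assumed default."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [value for value in values if lower <= value <= upper]


# Hypothetical numeric field with one obvious outlier.
readings = [10, 12, 11, 13, 12, 11, 100]
cleaned = remove_outliers_iqr(readings)
print(cleaned)
```

Because the fences are derived from quartiles rather than the mean, a single extreme value such as 100 does not distort the threshold that is used to reject it.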
Transaction – Group Related Events Spanning Time
Examples
▶ Group by session ID
sourcetype=access*
| transaction JSESSIONID
▶ Calculate session durations
sourcetype=access*
| transaction JSESSIONID
| stats min(duration) max(duration) avg(duration)
▶ Stats is better
sourcetype=access*
| stats min(_time) AS earliest max(_time) AS latest by JSESSIONID
| eval duration=latest-earliest
| stats min(duration) max(duration) avg(duration)
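The "stats is better" version works because a session's duration only needs the earliest and latest timestamp per session ID, not the grouped events themselves. A minimal Python sketch of that two-step aggregation, on invented events with epoch-second timestamps:

```python
# Hypothetical access events; _time is in seconds.
events = [
    {"JSESSIONID": "A", "_time": 100},
    {"JSESSIONID": "A", "_time": 160},
    {"JSESSIONID": "B", "_time": 200},
    {"JSESSIONID": "B", "_time": 230},
    {"JSESSIONID": "B", "_time": 290},
]


def session_durations(rows):
    """| stats min(_time) AS earliest max(_time) AS latest by JSESSIONID
       | eval duration=latest-earliest"""
    bounds = {}
    for row in rows:
        session, timestamp = row["JSESSIONID"], row["_time"]
        earliest, latest = bounds.get(session, (timestamp, timestamp))
        bounds[session] = (min(earliest, timestamp), max(latest, timestamp))
    return {session: latest - earliest
            for session, (earliest, latest) in bounds.items()}


def summarize(durations):
    """| stats min(duration) max(duration) avg(duration)"""
    values = list(durations.values())
    return {"min": min(values), "max": max(values),
            "avg": sum(values) / len(values)}


durations = session_durations(events)
print(durations, summarize(durations))
```

Only two numbers are kept per session, so memory use grows with the number of sessions rather than the number of events.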
Data Exploration
| analyzefields
| anomalies
| arules
| associate
| cluster
| contingency
| correlate
| fieldsummary
cluster – Exploring Your Data
Find common and/or rare events within your data.
Examples
▶ Find most/least common events
* | cluster showcount=t t=.1
| table _raw cluster_count
▶ Display Summary of Fields
sourcetype=access_combined
| fields - date* source* time*
| fieldsummary maxvals=5
▶ Show patterns of co-occurring fields
sourcetype=access_combined
| fields - date* source* time*
| correlate
▶ View field relationships
sourcetype=access_combined
| contingency uri status
▶ Find predictors of fields
sourcetype=access_combined
| analyzefields classfield=status
fieldsummary – Exploring Your Data
Gives you a quick breakdown of your numerical fields, such as count, min, max, stdev, etc. It also shows you example values in the event.
correlate – Exploring Your Data
Finds co-occurrence between fields. Basically a matrix showing, for example, that Field1 exists 80% of the time when Field2 exists.
contingency – Exploring Your Data
Looks for relationships between two fields.
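As a rough illustration of what a contingency table of `uri` against `status` contains, the Python sketch below counts how often each pair of values occurs together. The events are invented, and the sketch leaves out the TOTAL row and column that a full contingency table would normally carry.

```python
from collections import Counter

# Hypothetical web access events.
events = [
    {"uri": "/cart", "status": 200},
    {"uri": "/cart", "status": 503},
    {"uri": "/home", "status": 200},
    {"uri": "/cart", "status": 200},
]


def contingency(rows, row_field, col_field):
    """Count co-occurrences of two fields, laid out as a row/column matrix."""
    counts = Counter((row[row_field], row[col_field]) for row in rows)
    row_values = sorted({r for r, _ in counts})
    col_values = sorted({c for _, c in counts})
    return {
        r: {c: counts[(r, c)] for c in col_values}
        for r in row_values
    }


table = contingency(events, "uri", "status")
print(table)
```

A cell that is much larger or smaller than its neighbors hints at a relationship worth investigating, such as one URI producing most of the 503 responses.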
analyzefields – Exploring Your Data
Extremely useful not only for finding meaningful fields in your data, but also for determining which fields to use in linear or logistic regression algorithms in the machine learning app.
Machine Learning Toolkit and Showcase
Examples
▶ Predict Numeric Fields
▶ Predict Categorical Fields
▶ Detect Numerical Outliers
▶ Detect Categorical Outliers
▶ Forecast Time Series
▶ Cluster Events
Custom Commands
▶ What is a Custom Command?
• | haversine origin="47.62,-122.34" outputField=dist lat lon
▶ Why do we use Custom Commands?
• Run other/external algorithms on your Splunk data
• Save time munging data (see Timewrap!)
• Because you can!
▶ Create your own or download as Apps
• Haversine (Distance between two GPS coords)
• Timewrap (Enhanced Time overlay)
• Levenshtein (Fuzzy string compare)
• Base64 (Encode/Decode)
Custom Commands – Haversine
Examples
▶ Download and install the Haversine app
▶ Read the documentation, then use it in SPL!
sourcetype=access*
| iplocation clientip
| search City=A*
| haversine origin="47.62,-122.34" units=mi outputField=dist lat lon
| table clientip, City, dist, lat, lon
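The haversine formula itself is short: it gives the great-circle distance between two latitude/longitude pairs on a sphere. The Python sketch below shows the calculation in miles. It is a generic implementation of the formula, not the code of the Haversine app, and the Earth radius constant is an approximate mean value.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MI = 3958.8  # approximate mean Earth radius in miles


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MI * asin(sqrt(a))


# Origin from the slide (47.62, -122.34) to a hypothetical client location.
origin = (47.62, -122.34)
client = (45.52, -122.68)
print(round(haversine_miles(*origin, *client), 1))
```

In the search above, the `lat` and `lon` fields produced by `iplocation` play the role of the second point, and `origin` supplies the first.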
SPL & The Machine
Learning Toolkit
Machine Learning with the Splunk Platform
[Diagram: Real-time Data Science Pipeline. Stages: Collect Data; Clean, Transform; Search, Explore; Visualize, Share; Choose Algorithm; Build Model; Test, Improve Models; Operationalize; Monitor, Alert. Each stage is labeled with the component that supports it: Splunk, Ecosystem, or MLTK.]
Splunk’s App Ecosystem contains thousands of free add-ons for getting data in, applying structure, and visualizing your data, giving you faster time to value.
The Machine Learning Toolkit delivers new SPL commands, custom visualizations, assistants, and examples to explore a variety of ML concepts.
Splunk Enterprise is the mission-critical platform for indexing, searching, analyzing, alerting, and visualizing machine data.
Packaged: UBA, ITSI
ML SPL
[Diagram: Real-time Data Science Pipeline annotated with SPL commands. Stages: Universal Indexing; Clean, Munge; Search, Explore; Correlate; Visualize, Share; Choose Algorithm; Build Model; Test, Improve Models; Operationalize; Monitor, Alert. Each stage is labeled Splunk, Ecosystem, or MLTK.]
Command groups shown on the diagram:
• fit, sample, apply, listmodels, deletemodel, summary
• eval, rex, stats, eventstats, streamstats, table, …
• timechart, chart, stats, geostats, geom, sendalert, sendemail, table, …
• MLTK Library, predict (cmd), anomalydetection (cmd)
• analyzefields, anomalies, arules, associate, cluster, contingency, correlate, fieldsummary, …
MLTK Commands
The Machine Learning Toolkit contains several custom search commands that implement classic machine learning and statistical learning tasks:
• fit: Fit and apply a machine learning model to search results.
• apply: Apply a machine learning model that was learned using the fit command.
• summary: Return a summary of a machine learning model that was learned using the fit command.
• listmodels: Return a list of machine learning models that were learned using the fit command.
• deletemodel: Delete a machine learning model that was learned using the fit command.
• sample: Randomly sample or partition events.
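A minimal sketch of how these commands fit together, assuming the Machine Learning Toolkit and Python for Scientific Computing are installed and reusing the index=power_of_spl access-log data from earlier slides. The model name example_bytes_model, the choice of LinearRegression, and the use of status to predict bytes are illustrative assumptions rather than part of the workshop. Each search, separated by a blank line, is run on its own, in this order: sample and train, apply, inspect, list, delete.

```spl
index=power_of_spl sourcetype=access*
| sample ratio=0.5 seed=42
| fit LinearRegression bytes from status into example_bytes_model

index=power_of_spl sourcetype=access*
| apply example_bytes_model
| table _time, bytes, "predicted(bytes)"

| summary example_bytes_model

| listmodels

| deletemodel example_bytes_model
```

Saving the model with the into clause is what lets apply, summary, and deletemodel refer to it by name later.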
ML-SPL Demo
Set Up Before You Can Play
Download the following at splunkbase.com
▶ Machine Learning Toolkit:
• https://splunkbase.splunk.com/app/2890/
▶ Python for Scientific Computing:
• https://splunkbase.splunk.com/app/2881/
*Note: For the Python for Scientific Computing app, download the platform-specific version (Mac, Linux, or Windows).
| fit
| sample
| apply
| listmodels
| summary
| deletemodel
Best Practices
The Truth about SPL
COMMAND ORDER MATTERS
Command Order
Streaming > Transforming (place streaming commands before transforming commands)
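As an illustration of this ordering, the sketch below filters in the base search, trims fields and computes values with streaming commands (which can be distributed to the indexers), and only then hands the reduced result set to transforming commands. The fields follow the access-log examples used elsewhere in the deck; the status>=500 filter and the total_kb name are illustrative assumptions.

```spl
index=power_of_spl sourcetype=access* status>=500
| fields clientip, bytes
| eval KB=bytes/1024
| stats sum(KB) AS total_kb by clientip
| sort - total_kb
```

Moving the eval or fields step after stats would still parse, but more of the work would then happen on the search head with more data in flight.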
LoadJob
• Loads events or results of a previously completed search job.
• The job is identified either by its search job ID or by a scheduled search name.
One Search, Many Panels
| loadjob savedsearch="<owner>:<app>:<Name of Saved Search>"
One Search, Many Panels
| loadjob savedsearch="<owner>:<app>:<Name of Saved Search>"
| eval threat_ip=if(match_field="src_ip",src_ip,dest_ip)
| eval start_time=coalesce(relative_time(now(),"$field1.earliest$"),relative_time(now(),"-30d@m"))
| eval end_time=coalesce(relative_time(now(),"$field1.latest$"),now())
| where _time>=start_time AND _time<=end_time
| timechart span=1h dc(threat_ip) as threat_ip_count by feed_category
One Search, Many Panels
Macros
• Reusable chunks of SPL that you can use within other searches.
• Allow arguments to be passed (e.g., | `splunk_domain_split(url)`).
Use Lookups
Direct Exclusion
index=* tag=authentication action=failure
    user!=test user!=test1 user!=test2 user!=test3 user!=test4 user!=test5 user!=user
| stats count by user
| where count > 1000
| sort -count
Lookup Exclusion
index=* tag=authentication action=failure
    NOT [| inputlookup splunk_user_whitelist.csv | fields user]
| stats count by user
| where count > 1000
| sort -count
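The lookup-based version depends on splunk_user_whitelist.csv already existing with a user column. One way to seed it, sketched here with the account names from the direct-exclusion search (an assumption about what the file should contain), is to generate the rows and write them out with outputlookup:

```spl
| makeresults
| eval user=split("test,test1,test2,test3,test4,test5,user", ",")
| mvexpand user
| table user
| outputlookup splunk_user_whitelist.csv
```

After that, excluding another account means adding a row to the lookup instead of editing every search that filters on user.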
▶Additional information can be found in:
• Power of SPL App!
• Docs - Search Manual
• Docs - MLTK Search Commands
• MLTK Quick Reference Guide
• Blogs
• Answers
• Exploring Splunk
For More Information
Other Useful Apps to download!
• SPL Examples App
• Splunk 6.x Dashboard Examples
• Splunk 6.5 Overview App
▶ Using your own laptop gives you a "take it with you after" approach: you can bang on it without messing up production Splunk
• Personalized 50GB, 6-month Dev/Test license to use for the workshop (vs. the 30-day download trial)
• Encourages long-term playing with Splunk and continuously testing out new data sources
How about an extra 50GB?
Q & A
Thank You

The Power of SPL

  • 1. © 2017 SPLUNK INC.© 2017 SPLUNK INC. Power of Splunk Search Processing Language (SPL™) 07/13/2017 | Jacksonville Tolga Tohumcu | Staff Sales Engineer @TolgaTohumcu
  • 2. © 2017 SPLUNK INC. During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release. Splunk, Splunk>, Listen to Your Data, The Engine for Machine Data, Splunk Cloud, Splunk Light and SPL are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2017 Splunk Inc. All rights reserved. Forward-Looking Statements
  • 3. © 2017 SPLUNK INC. Set Up Before You Can Play Download the following at splunk.com and splunkbase.com ▶ Splunk Enterprise: • https://www.splunk.com/download ▶ Power of SPL App: • https://splunkbase.splunk.com/app/3353/
  • 4. © 2017 SPLUNK INC. Set Up Before You Can Play ▶ Install Power of SPL App ▶ Now we’re ready to go!
  • 5. © 2017 SPLUNK INC. ▶ License expired (already had older version installed) • Close browser, empty cache, open browser. If that doesn’t work: • Stop Splunk. • Uninstall all Splunk versions • Windows Control Panel->Uninstall programs->Splunk • OS X. Finder->Applications->Right click Splunk, Move to trash • Reinstall • Start Splunk ▶ Can’t start Splunk • Windows, Search Control panel ->Services->Splunk start • Linux; cd <SPLUNK dir>/splunk/bin;./splunk start Common problems at this point
  • 6. © 2017 SPLUNK INC. 1. Installation and Setup (~15min) 2. Power of SPL Walkthrough (~1h 30min) • Overview & Anatomy of a Search • SPL Commands & Examples for Searching, charting, converging, mapping, transactions, anomalies, exploring data, custom 3. Custom Visualizations (~30min) 4. SPL and the Machine Learning Toolkit (~45min) Agenda
  • 7. © 2017 SPLUNK INC. SPL Overview
  • 8. © 2017 SPLUNK INC. ▶ Over 140 search commands ▶ Syntax was originally based upon the Unix pipeline and SQL and is optimized for time-series data ▶ The scope of SPL includes data searching, filtering, modification, manipulation, enrichment, insertion and deletion ▶ Includes machine learning such as anomaly detection SPL Overview Disk Intermediate results table Intermediate results table Final results table
  • 9. © 2017 SPLUNK INC. ▶ Flexibility and effectiveness on small and big data ▶ Late-binding schema ▶ More/better methods of correlation ▶ Not just analyze, but visualize Why Create a New Query Language? Data BIG Data
  • 10. © 2017 SPLUNK INC. search and filter | munge | report | cleanup | rename sum(KB) AS "Total KB" dc(clientip) AS "Unique Customers" | eval KB=bytes/1024 sourcetype=access* | stats sum(KB) dc(clientip) SPL Basic Structure
  • 11. © 2017 SPLUNK INC. Knowledge Objects Data Interpretation – provide structure to raw data. (field extractions) Data Classification – group similar events. (event types, transactions) Data Enrichment – add value from external sources. (lookups) Data Normalization – group related fields. (tags, aliases) Data Models – represent one or more data sets. (pivot)
  • 12. © 2017 SPLUNK INC. Types of Search Commands Streaming • Operate on each event individually • Distributable Streaming • Run on Indexers • Eval, fields, rename, regex • Centralized (Stateful) Streaming • Run on Search Head • head, streamstats 12 Transforming • Generate a reporting data structure • Operate on the entire event set • chart,timechart, stats, top Generating • Usually invoked at the beginning of a search • Do not expect or require an input • | search is implied • dbinspect, datamodel, inputcsv
  • 13. © 2017 SPLUNK INC. Search Processor Categories Streaming (remote) • Consider each event/result row individually • Can be distributed to the indexers • Ex: eval, where, search, lookup Stateful Streaming • Consider each event/results row individually but maintain extra state modified by processing each event • Ordering is significant, and as such they force the search back to the search head • Ex: head, streamstats 13 Events • Need to see all events/result rows to perform their tasks • Ex: tail, sort, eventstats Transforming • Need to see all events/result rows and transform them into some new result rows • Stream Reporting • Can take batches of input, and then at any time generate output based on the input seen so far • Ex: stats, chart • Reporting • Must take all events at once • Ex: cluster, geostats
  • 14. © 2017 SPLUNK INC. Search Pipelines Pipeline Setup ▶Search Pipelines are built according to the processors and the order they occur in the search string • At most, 5 pipelines are built, one for each processor category • Less restrictive processors may be accumulated into a more restrictive pipeline • Depending on the order of commands ▶Each processor is asked to first evaluate its arguments for validity, then processes them for setup 14
  • 15. © 2017 SPLUNK INC. Search Pipelines Pipeline Processing ▶ Pipeline processing begins • Processors for the current category and earlier pipelines are accumulated in the current pipeline • Splunk moves to the next most restricted pipeline as we find a command that is restricted to that level • We never move back to a less restrictive pipeline once moving to a more restrictive pipeline ▶ Streaming Pipeline • Only pipeline that can be distributed to indexers • After all streaming commands are accumulated, non-streaming commands are asked if they can be split • Ex: Prestats, Presort, etc… • If available, commands are added to the Streaming Pipeline 15
  • 16. © 2017 SPLUNK INC. Search Pipelines Pipeline Rendering ▶Streaming Pipeline • Rendered as remoteSearch string and sent to Indexers ▶Stateful and Events Pipelines • Rendered into eventsSearch ▶Stream Report and Report Pipelines • Rendered into reportSearch 16
  • 17. © 2017 SPLUNK INC. Search Modes ● Historical – Cursored Search ‣ Most historical searches run in this mode ‣ Returns events in strict time ordering – Batch Mode Search ‣ Automatically enabled when possible ‣ Reads buckets sequentially = returns data faster ‣ Events are not time ordered ‣ Uses more memory ‣ Job Inspector: isBatchModeSearch 17 ● Realtime – Realtime Search ‣ Impacts indexing performance ‣ Hooks into indexing pipeline daemon to build a separate queue of events – Indexed Realtime Search ‣ Continuous search ‣ Does not hook into indexing pipeline ‣ Default latency of 60sec with 1 sec lookback – Parameters can be adjusted
  • 18. © 2017 SPLUNK INC. Search UI ● Search UI Modes – Fast, Smart, Verbose Mode – Determine which fields are extracted – Turn timeliner on/off ● Data Preview – When its time, main search thread is stopped so Splunk can gather current results and generate a preview – Lots of parameters available to tweak this – If you don’t need a preview, disable it in the UI ● Timeliner – Runs in the background – May cause unnecessary events to be downloaded, especially when in verbose mode – By default, 1,000 events are downloaded per column with 300 columns max – Parameters available to adjust this or turn it off 18
  • 19. © 2017 SPLUNK INC. Search Advice ● Filter early – Specify an index – Utilize indexed extractions where available – Use the TERM directive if applicable ● Place streaming/remote commands before non-streaming commands ● Avoid using table, except at the very end – This is a reporting command and will cause data to be pushed to Search Head ● Remove unnecessary data using | fields 19
  • 20. © 2017 SPLUNK INC. Job Inspector Types of Components ▶ Command • Measures actual work being done ▶ Dispatch • Measures Splunk framework 20 ● Invocations – # of times the component was run ● Input/Output Count – Generally the event count – Exception: dispatch.stream.remote ‣ Byte count of data received from the peer
  • 21. © 2017 SPLUNK INC. Job Inspector Dispatch.evaluate ▶ Search is parsed here ▶ Sub-searches are run here 21
  • 22. © 2017 SPLUNK INC. Job Inspector Dispatch.stream.remote ▶ Time spent searching on peers ▶ Useful for comparing time spent and # of results returned across peers 22
  • 23. © 2017 SPLUNK INC. Job Inspector Command.search.index.usec ▶ Indicates how long it took to search an index in microseconds • Helps identify slow searching of indexes • Can indicate VERY large indexes (especially on rare term searches) • Lots of indexed extractions? Data with extremely high cardinality? • Can indicate slow disk or disk contention (especially on dense searches) ▶ Histogram of invocations grouped by microseconds ▶ Correlate this to dispatch.stream.remote 23
  • 24. © 2017 SPLUNK INC. Job Inspector Other Interesting Components… ▶ Command.search.rawdata • Time peer spend uncompressing bucket slices • On a dense search, can indicate CPU contention ▶ Command.search.filter • Post filtering after KV extraction (schema-on-the-fly) ▶ Command.search.createProviderQueue • Time spent setting up connections to peers 24
  • 25. © 2017 SPLUNK INC. SPL Examples
  • 26. © 2017 SPLUNK INC. ▶ Find the needle in the haystack ▶ Charting statistics and predicting values ▶ Enriching and converging data sources ▶ Map geographic data in real time (We’ll add some custom viz here!) ▶ Identifying anomalies ▶ Transactions ▶ Data exploration & finding relationships between fields ▶ Custom commands SPL Examples and Recipes
  • 27. © 2017 SPLUNK INC. ▶ Find the needle in the haystack ▶ Charting statistics and predicting values ▶ Enriching and converging data sources ▶ Map geographic data in real time ▶ Identifying anomalies ▶ Transactions ▶ Data exploration & finding relationships between fields ▶ Custom commands SPL Examples and Recipes
  • 28. © 2017 SPLUNK INC. ▶ Keyword search: sourcetype=access* http ▶ Filter: sourcetype=access* http host=webserver-02 ▶ Combined: sourcetype=access* http host=webserver-02 (503 OR 504) Search and Filter Examples
  • 29. © 2017 SPLUNK INC. ▶ Keyword search: sourcetype=access* http ▶ Filter: sourcetype=access* http host=webserver-02 ▶ Combined: sourcetype=access* http host=webserver-02 (503 OR 504) Search and Filter Examples
  • 30. © 2017 SPLUNK INC. ▶ Keyword search: sourcetype=access* http ▶ Filter: sourcetype=access* http host=webserver-02 ▶ Combined: sourcetype=access* http host=webserver-02 (503 OR 504) Search and Filter Examples
  • 31. © 2017 SPLUNK INC. ▶ Calculation: sourcetype=access* |eval KB=bytes/1024 ▶ Evaluation: sourcetype=access* | eval http_response = if(status != 200, ”Error", ”OK”) ▶ Concatenation: sourcetype=access* | eval connection = device.” - ".clientip Eval – Modify or Create New Fields and Values Examples
  • 32. © 2017 SPLUNK INC. ▶ Calculation: sourcetype=access* |eval KB=bytes/1024 ▶ Evaluation: sourcetype=access* | eval http_response = if(status != 200, ”Error", ”OK”) ▶ Concatenation: sourcetype=access* | eval connection = device.” - ".clientip Eval – Modify or Create New Fields and Values Examples
  • 33. © 2017 SPLUNK INC. ▶ Calculation: sourcetype=access* |eval KB=bytes/1024 ▶ Evaluation: sourcetype=access* | eval http_response = if(status != 200, ”Error", ”OK”) ▶ Concatenation: sourcetype=access* | eval connection = device.” - ".clientip Eval – Modify or Create New Fields and Values Examples
  • 34. © 2017 SPLUNK INC. Eval – Just Getting Started! Splunk Search Quick Reference Guide
  • 35. © 2017 SPLUNK INC. ▶ Find the needle in the haystack ▶ Charting statistics and predicting values ▶ Enriching and converging data sources ▶ Map geographic data in real time ▶ Identifying anomalies ▶ Transactions ▶ Data exploration & finding relationships between fields ▶ Custom commands SPL Examples and Recipes
  • 36. © 2017 SPLUNK INC. ▶ Calculate stats and rename Index=power_of_spl | stats avg(bytes) AS “Avg Bytes” ▶ Multiple statistics index=power_of_spl | stats avg(bytes) AS bytes sparkline(avg(bytes)) AS Bytes_Trend min(bytes) max(bytes) ▶ By another field index=power_of_spl | stats avg(bytes) AS avg_bytes sparkline(avg(bytes)) AS Bytes_Trend min(bytes) max(bytes) by clientip | sort - avg_bytes Stats – Calculate Statistics Based on Field Values Examples
  • 37. © 2017 SPLUNK INC. ▶ Calculate stats and rename Index=power_of_spl | stats avg(bytes) AS “Avg Bytes” ▶ Multiple statistics index=power_of_spl | stats avg(bytes) AS bytes sparkline(avg(bytes)) AS Bytes_Trend min(bytes) max(bytes) ▶ By another field index=power_of_spl | stats avg(bytes) AS avg_bytes sparkline(avg(bytes)) AS Bytes_Trend min(bytes) max(bytes) by clientip | sort - avg_bytes Stats – Calculate Statistics Based on Field Values Examples
  • 38. © 2017 SPLUNK INC. ▶ Calculate stats and rename Index=power_of_spl | stats avg(bytes) AS “Avg Bytes” ▶ Multiple statistics index=power_of_spl | stats avg(bytes) AS bytes sparkline(avg(bytes)) AS Bytes_Trend min(bytes) max(bytes) ▶ By another field index=power_of_spl | stats avg(bytes) AS avg_bytes sparkline(avg(bytes)) AS Bytes_Trend min(bytes) max(bytes) by clientip | sort - avg_bytes Stats – Calculate Statistics Based on Field Values Examples
  • 39. © 2017 SPLUNK INC. ▶ Visualize stats over time index=power_of_spl | timechart avg(bytes) ▶ Add a trendline index=power_of_spl | timechart avg(bytes) as bytes | trendline sma5(bytes) ▶ Add a prediction overlay index=power_of_spl | timechart avg(bytes) as bytes | predict future_timespan=5 bytes Timechart – Visualize Statistics Over Time Examples
  • 40. © 2017 SPLUNK INC. ▶ Visualize stats over time index=power_of_spl | timechart avg(bytes) ▶ Add a trendline index=power_of_spl | timechart avg(bytes) as bytes | trendline sma5(bytes) ▶ Add a prediction overlay index=power_of_spl | timechart avg(bytes) as bytes | predict future_timespan=5 bytes Timechart – Visualize Statistics Over Time Examples
  • 41. © 2017 SPLUNK INC. ▶ Visualize stats over time index=power_of_spl | timechart avg(bytes) ▶ Add a trendline index=power_of_spl | timechart avg(bytes) as bytes | trendline sma5(bytes) ▶ Add a prediction overlay index=power_of_spl | timechart avg(bytes) as bytes | predict future_timespan=5 bytes Timechart – Visualize Statistics Over Time Examples
  • 42. © 2017 SPLUNK INC. ▶ Cumulative/Running Totals index=power_of_spl | reverse | streamstats sum(bytes) AS sum_bytes | timechart latest(sum_bytes) as "Total Bytes" ▶ Summary Statistics index=power_of_spl | eventstats avg(bytes) AS overall_avg_bytes | stats avg(bytes) as clientip_avg_bytes by clientip overall_avg_bytes Streamstats – Cumulative/Running Totals Statistics Examples
  • 43. © 2017 SPLUNK INC. ▶ Cumulative/Running Totals index=power_of_spl | reverse | streamstats sum(bytes) AS sum_bytes | timechart latest(sum_bytes) as "Total Bytes" ▶ Summary Statistics index=power_of_spl | eventstats avg(bytes) AS overall_avg_bytes | stats avg(bytes) as clientip_avg_bytes by clientip overall_avg_bytes Streamstats – Cumulative/Running Totals Statistics Examples
  • 44. © 2017 SPLUNK INC. Stats/Timechart – But Wait, There’s More! Splunk Search Quick Reference Guide
  • 45. © 2017 SPLUNK INC. ▶ Find the needle in the haystack ▶ Charting statistics and predicting values ▶ Enriching and converging data sources ▶ Map geographic data in real time ▶ Identifying anomalies ▶ Transactions ▶ Data exploration & finding relationships between fields ▶ Custom commands SPL Examples and Recipes
  • 46. © 2017 SPLUNK INC. Converging Data Sources Index Untapped Data: Any Source, Type, Volume Ask Any Question Application Delivery Security, Compliance and Fraud IT Operations Business Analytics Industrial Data and the Internet of Things On-Premises Private Cloud Public Cloud Storage Online Shopping Cart Telecoms Desktops Security Web Services Networks Containers Web Clickstreams RFID Smartphones and Devices Servers Messaging GPS Location Packaged Applications Custom Applications Online Services DatabasesCall Detail Records Energy Meters Firewall Intrusion Prevention
  • 47. © 2017 SPLUNK INC. ▶ Enrich data with lookups index=power_of_spl status!=200 | lookup customer_info uid | stats count by customer_value ▶ Search Inception! index=power_of_spl [ search index=power_of_spl | stats sum(bytes) as total_bytes by clientip | sort - total_bytes | head 1 | return clientip ] | stats count by clientip status uri | sort - count ▶ Append multiple searches index=power_of_spl | timechart span=15s avg(bytes) as avg_bytes | appendcols [ search index=power_of_spl | stats stdev(bytes) as stdev_bytes] | eval 2stdv_upper = avg_bytes + stdev_bytes*2 | filldown 2stdv_upper | eval 2stdv_lower = avg_bytes - stdev_bytes*2 | filldown 2stdv_lower | eval 2stdv_lower = if('2stdv_lower’ <0,0,'2stdv_lower') | fields - stdev_bytes lookup – Converging Data Sources Examples
  • 48. © 2017 SPLUNK INC. ▶ Enrich data with lookups index=power_of_spl status!=200 | lookup customer_info uid | stats count by customer_value ▶ Search Inception! index=power_of_spl [ search index=power_of_spl | stats sum(bytes) as total_bytes by clientip | sort - total_bytes | head 1 | return clientip ] | stats count by clientip status uri | sort - count ▶ Append multiple searches index=power_of_spl | timechart span=15s avg(bytes) as avg_bytes | appendcols [ search index=power_of_spl | stats stdev(bytes) as stdev_bytes] | eval 2stdv_upper = avg_bytes + stdev_bytes*2 | filldown 2stdv_upper | eval 2stdv_lower = avg_bytes - stdev_bytes*2 | filldown 2stdv_lower | eval 2stdv_lower = if('2stdv_lower’ <0,0,'2stdv_lower') | fields - stdev_bytes Converging Data Sources Examples
  • 49. © 2017 SPLUNK INC. ▶ Enrich data with lookups index=power_of_spl status!=200 | lookup customer_info uid | stats count by customer_value ▶ Search Inception! index=power_of_spl [ search index=power_of_spl | stats sum(bytes) as total_bytes by clientip | sort - total_bytes | head 1 | return clientip ] | stats count by clientip status uri | sort - count ▶ Append multiple searches index=power_of_spl | timechart span=15s avg(bytes) as avg_bytes | appendcols [ search index=power_of_spl | stats stdev(bytes) as stdev_bytes] | eval 2stdv_upper = avg_bytes + stdev_bytes*2 | filldown 2stdv_upper | eval 2stdv_lower = avg_bytes - stdev_bytes*2 | filldown 2stdv_lower | eval 2stdv_lower = if('2stdv_lower’ <0,0,'2stdv_lower') | fields - stdev_bytes appendcols – Converging Data Sources Examples
  • 50. SPL Examples and Recipes
      ▶ Find the needle in the haystack
      ▶ Charting statistics and predicting values
      ▶ Enriching and converging data sources
      ▶ Map geographic data in real time (Let’s add some viz!)
      ▶ Identifying anomalies
      ▶ Transactions
      ▶ Data exploration & finding relationships between fields
      ▶ Custom commands
  • 51. Geographic Data Examples (iplocation, geostats, geom, table)
      ▶ Assign Lat/Lon to IP addresses
          … | iplocation clientip
      ▶ Visualize statistics geographically
          … | geostats sum(price) by product
      ▶ Use custom choropleths
          … | geom <featureCollection> <featureId>
      ▶ Track object movements
          … | table _time latitude longitude vehicleId
  • 55. Custom Visualizations
  • 56. Native Visualizations In Splunk
      ▶ Native charts and maps
          • Bar / Line / Area charts
          • Bubble / Scatter plots
          • Gauges
          • Maps
          • Single Value Displays
          • Tables
      ▶ Generalized to fit use cases across many different areas
      ▶ Can be customized to some extent to cover specific use cases
  • 57. Custom Visualizations FTW!
      ▶ Many use cases require a more specific visualization
      ▶ Specific custom appearance
      ▶ Represent data where native visualizations are not suitable
          • You can Splunk everything!
          • We won’t be able to predict every possible use case
          • Still uses SPL to drive visualizations
  • 58. Custom Visualizations
      ▶ Platform extensibility framework and API
      ▶ Targeted at internal and external developers with web development / JS skills and basic knowledge of the Splunk platform
      ▶ Developers can make use of any third-party libraries (d3.js, three.js, highcharts.js, etc.) that run in the browser*
      * with minor adjustments, and if the third-party license permits such use
  • 59. Custom Visualizations For Admins: Installation, In-product
      • Packaged as an app!
      • Installed like any other app
      • Users can search for visualizations on Splunkbase and directly in the product
  • 60. Custom Visualizations How-to
      ▶ Choose from potentially dozens of installed visualizations!
      ▶ Appears as a first-class citizen alongside native visualizations
          • Looks and works just like packaged native visualizations
      ▶ Customize functionality and appearance of the visualization without touching any code, straight from the UI
      ▶ An SPL example is shown as you hover over each visualization option
  • 61. New Splunk Visualizations: Treemap, Sankey Diagram, Punchcard, Calendar Heat Map, Parallel Coordinates, Bullet Graph, Location Tracker, Horseshoe Meter, Machine Learning Charts, Timeline, Horizon Chart. Multiple use cases across IT, security, IoT, and business analytics.
  • 62. New Partner/Community Visualizations: Box Plot, 3D Scatter Plot, Wordcloud, Donut Chart, Heat Map
  • 63. New Partner/Community Visualizations: Geo Heatmap, Custom Cluster Map, Clustered Single Value Map, Missile Map
  • 64. Custom Visualizations – Demo!
  • 65. Demo Screenshot #1
  • 66. Demo Screenshot #2
  • 68. Anomaly Detection – Find anomalies in your data: Examples
      ▶ Find anomalies
          | inputlookup car_data.csv | anomalydetection
      ▶ Summarize anomalies
          | inputlookup car_data.csv | anomalydetection action=summary
      ▶ Use IQR and remove outliers
          | inputlookup car_data.csv | anomalydetection method=iqr action=remove
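For readers unfamiliar with the IQR method named in the last example, the following Python sketch shows the textbook version of the idea: drop values that fall outside the quartiles by more than a multiple of the interquartile range. It is an illustration only. The multiplier `k=1.5`, the function name, and the sample values are assumptions, and the slide does not say which thresholds `anomalydetection` uses internally.

```python
from statistics import quantiles


def remove_iqr_outliers(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR] (textbook Tukey fences)."""
    q1, _, q3 = quantiles(values, n=4)  # first, second, and third quartiles
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical speed readings with one obvious outlier.
speeds = [52, 55, 57, 58, 60, 61, 63, 180]
kept = remove_iqr_outliers(speeds)
print(kept)
```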
  • 70. Transaction – Group Related Events Spanning Time: Examples
      ▶ Group by session ID
          sourcetype=access* | transaction JSESSIONID
      ▶ Calculate session durations
          sourcetype=access* | transaction JSESSIONID | stats min(duration) max(duration) avg(duration)
      ▶ Stats is better
          sourcetype=access*
          | stats min(_time) AS earliest max(_time) AS latest by JSESSIONID
          | eval duration=latest-earliest
          | stats min(duration) max(duration) avg(duration)
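The "Stats is better" variant works because a session's duration only needs its earliest and latest timestamps, not the grouped events themselves. A small Python sketch of that two-step aggregation follows. The event tuples and session IDs are made up for illustration, and this is not how Splunk implements `stats`.

```python
from statistics import mean


def session_durations(events):
    """events: iterable of (epoch_seconds, session_id) pairs."""
    bounds = {}
    for ts, sid in events:
        # Track only the earliest and latest timestamp per session.
        lo, hi = bounds.get(sid, (ts, ts))
        bounds[sid] = (min(lo, ts), max(hi, ts))
    # duration = latest - earliest, as in the eval step.
    return {sid: hi - lo for sid, (lo, hi) in bounds.items()}


# Hypothetical events for three sessions.
events = [(100, "A"), (130, "A"), (105, "B"), (175, "B"), (200, "C")]
durations = session_durations(events)
summary = (min(durations.values()), max(durations.values()), mean(durations.values()))
print(durations, summary)
```

Only one pair of numbers is kept per session, which hints at why the stats form tends to scale better than holding whole transactions in memory.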
  • 74. Data Exploration: | analyzefields | anomalies | arules | associate | cluster | contingency | correlate | fieldsummary
  • 75. cluster – Exploring Your Data (find common and/or rare events within your data): Examples
      ▶ Find most/least common events
          * | cluster showcount=t t=.1 | table _raw cluster_count
      ▶ Display summary of fields
          sourcetype=access_combined | fields - date* source* time* | fieldsummary maxvals=5
      ▶ Show patterns of co-occurring fields
          sourcetype=access_combined | fields - date* source* time* | correlate
      ▶ View field relationships
          sourcetype=access_combined | contingency uri status
      ▶ Find predictors of fields
          sourcetype=access_combined | analyzefields classfield=status
  • 76. fieldsummary – Exploring Your Data
      Gives you a quick breakdown of your numerical fields (count, min, max, stdev, etc.). It also shows you example values from the events.
      Example: sourcetype=access_combined | fields - date* source* time* | fieldsummary maxvals=5
  • 77. correlate – Exploring Your Data
      Finds co-occurrence between fields. Basically a matrix showing statements such as ‘Field1 exists 80% of the time when Field2 exists’.
      Example: sourcetype=access_combined | fields - date* source* time* | correlate
  • 78. contingency – Exploring Your Data
      Looks for relationships between two fields.
      Example: sourcetype=access_combined | contingency uri status
  • 79. analyzefields – Exploring Your Data
      Extremely useful not only for finding meaningful fields in your data, but also for determining which fields to use in linear or logistic regression algorithms in the machine learning app.
      Example: sourcetype=access_combined | analyzefields classfield=status
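The correlate slide describes its output as a matrix of statements like "Field1 exists 80% of the time when Field2 exists". The Python sketch below computes exactly that kind of conditional presence ratio over a handful of made-up events. It illustrates the description on the slide, not Splunk's actual formula, and the function name and sample events are hypothetical.

```python
def cooccurrence(events, fields):
    """For each pair (a, b): share of events containing b that also contain a."""
    matrix = {}
    for a in fields:
        for b in fields:
            with_b = [e for e in events if b in e]
            both = [e for e in with_b if a in e]
            matrix[(a, b)] = len(both) / len(with_b) if with_b else 0.0
    return matrix


# Hypothetical web events; not every event carries every field.
events = [
    {"uri": "/cart", "status": 200, "referer": "x"},
    {"uri": "/cart", "status": 200},
    {"uri": "/home", "status": 404, "referer": "y"},
    {"uri": "/home"},
    {"status": 500, "referer": "z"},
]
matrix = cooccurrence(events, ["uri", "status", "referer"])
print(matrix[("referer", "status")])  # referer present in 3 of the 4 events with status
```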
  • 80. Machine Learning Toolkit and Showcase: Examples
      ▶ Predict Numeric Fields
      ▶ Predict Categorical Fields
      ▶ Detect Numerical Outliers
      ▶ Detect Categorical Outliers
      ▶ Forecast Time Series
      ▶ Cluster Events
  • 82. Custom Commands
      ▶ What is a Custom Command?
          • | haversine origin="47.62,-122.34" outputField=dist lat lon
      ▶ Why do we use Custom Commands?
          • Run other/external algorithms on your Splunk data
          • Save time munging data (see Timewrap!)
          • Because you can!
      ▶ Create your own or download as Apps
          • Haversine (distance between two GPS coordinates)
          • Timewrap (enhanced time overlay)
          • Levenshtein (fuzzy string comparison)
          • Base64 (encode/decode)
  • 83. Custom Commands – Haversine: Examples
      ▶ Download and install the Haversine app
      ▶ Read the documentation, then use it in SPL!
          sourcetype=access*
          | iplocation clientip
          | search City=A*
          | haversine origin="47.62,-122.34" units=mi outputField=dist lat lon
          | table clientip, City, dist, lat, lon
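To show what a command like haversine computes for each event, here is a minimal Python version of the standard great-circle formula, using the same origin as the slide. The Earth radius constant, the function name, and the second coordinate are assumptions of this sketch, and the Haversine app may differ in details such as radius or rounding.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


def haversine_miles(origin, point):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(radians, origin)
    lat2, lon2 = map(radians, point)
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


origin = (47.62, -122.34)   # same origin as the slide
point = (45.52, -122.68)    # hypothetical client location
dist = haversine_miles(origin, point)
print(round(dist, 1))
```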
  • 85. SPL & The Machine Learning Toolkit
  • 86. Machine Learning Toolkit and Showcase: Examples
      ▶ Predict Numeric Fields
      ▶ Predict Categorical Fields
      ▶ Detect Numerical Outliers
      ▶ Detect Categorical Outliers
      ▶ Forecast Time Series
      ▶ Cluster Events
  • 87. Machine Learning with the Splunk Platform
      [Diagram: Real-time Data Science Pipeline. Stages shown: Collect Data, Search, Explore, Clean, Transform, Visualize, Share, Choose Algorithm, Build Model, Test, Improve Models, Operationalize, Monitor, Alert. Each stage is labeled Splunk, MLTK, or Ecosystem.]
      • Splunk’s App Ecosystem contains thousands of free add-ons for getting data in, applying structure, and visualizing your data, giving you faster time to value.
      • The Machine Learning Toolkit delivers new SPL commands, custom visualizations, assistants, and examples to explore a variety of ML concepts.
      • Splunk Enterprise is the mission-critical platform for indexing, searching, analyzing, alerting, and visualizing machine data.
      • Packaged: UBA, ITSI
  • 88. ML SPL
      [Diagram: pipeline stages shown: Universal Indexing, Search, Explore, Clean, Munge, Correlate, Visualize, Share, Choose Algorithm, Build Model, Test, Improve Models, Operationalize, Monitor, Alert. Each stage is labeled Splunk, MLTK, or Ecosystem.]
      Commands listed on the slide:
          • fit, sample, apply, listmodels, deletemodel, summary
          • eval, rex, stats, eventstats, streamstats, table, …
          • timechart, chart, stats, geostats, geom, sendalert, sendemail, table, …
          • MLTK Library
          • predict (cmd), anomalydetection (cmd), analyzefields, anomalies, arules, associate, cluster, contingency, correlate, fieldsummary, …
  • 89. MLTK Commands
      The Machine Learning Toolkit contains several custom search commands that implement classic machine learning and statistical learning tasks:
          • fit: Fit and apply a machine learning model to search results.
          • apply: Apply a machine learning model that was learned using the fit command.
          • summary: Return a summary of a machine learning model that was learned using the fit command.
          • listmodels: Return a list of machine learning models that were learned using the fit command.
          • deletemodel: Delete a machine learning model that was learned using the fit command.
          • sample: Randomly sample or partition events.
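The sample, fit, and apply steps follow a common partition, train, and predict workflow. The Python sketch below walks through that workflow with a hand-written one-variable linear regression. It is a stand-in for the concept only: none of these functions are MLTK commands, and the data, the 15/5 split, and the function names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (the 'fit' step)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return {"slope": slope, "intercept": my - slope * mx}


def apply_line(model, xs):
    """Use a previously fitted model on new rows (the 'apply' step)."""
    return [model["slope"] * x + model["intercept"] for x in xs]


# Hypothetical data on an exact line, randomly partitioned (the 'sample' step).
rng = random.Random(0)
rows = [(x, 3 * x + 2) for x in range(20)]
rng.shuffle(rows)
train, test = rows[:15], rows[15:]

model = fit_line([x for x, _ in train], [y for _, y in train])
predicted = apply_line(model, [x for x, _ in test])
print(model, predicted)
```

Keeping the fitted parameters in a small dictionary mirrors the idea of a saved model that can later be listed, summarized, or deleted.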
  • 90. ML-SPL Demo
  • 91. Set Up Before You Can Play
      Download the following at splunkbase.com
      ▶ Machine Learning Toolkit:
          • https://splunkbase.splunk.com/app/2890/
      ▶ Python for Scientific Computing:
          • https://splunkbase.splunk.com/app/2881/
      * Note: for the Python for Scientific Computing app, you need to download the platform-specific version (Mac, Linux, or Windows).
  • 92. | fit
  • 93. | sample
  • 94. | apply
  • 95. | listmodels
  • 96. | summary
  • 97. | deletemodel
  • 98. Best Practices
  • 99. The Truth about SPL: COMMAND ORDER MATTERS
  • 100. Command Order: Streaming > Transforming
  • 101. One Search, Many Panels
      LoadJob
          • Loads events or results of a previously completed search job.
          • Identified either by the search job ID or a scheduled search name.
      | loadjob savedsearch="Name of Saved Search"
  • 102. One Search, Many Panels
      | loadjob savedsearch="Name of Saved Search"
      | eval threat_ip=if(match_field="src_ip",src_ip,dest_ip)
      | eval start_time=coalesce(relative_time(now(),"$field1.earliest$"),relative_time(now(),"-30d@m"))
      | eval end_time=coalesce(relative_time(now(),"$field1.latest$"),now())
      | where _time>=start_time AND _time<=end_time
      | timechart span=1h dc(threat_ip) as threat_ip_count by feed_category
  • 103. One Search, Many Panels
      Macros
          • Reusable chunks of SPL that you can use within other searches.
          • Allow arguments to be passed (e.g. | `splunk_domain_split(url)`).
  • 104. Use Lookups
      Direct Exclusion
          index=* tag=authentication action=failure user!=test user!=test1 user!=test2 user!=test3 user!=test4 user!=test5 user!=user
          | stats count by user
          | where count > 1000
          | sort -count
      Lookup Exclusion
          index=* tag=authentication action=failure NOT [| inputlookup splunk_user_whitelist.csv | fields user]
          | stats count by user
          | where count > 1000
          | sort -count
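The lookup version replaces a growing chain of `user!=` terms with one list that is maintained outside the search. The same idea in a short Python sketch: keep the excluded users in a set, filter against it, then count and threshold. The user names, the threshold, and the function name are hypothetical and chosen only to keep the example small.

```python
from collections import Counter


def noisy_users(failed_logins, whitelist, threshold):
    """Count failures per user, skipping whitelisted users, keep counts above threshold."""
    counts = Counter(user for user in failed_logins if user not in whitelist)
    # most_common() returns pairs sorted by descending count, like sort -count.
    return [(user, n) for user, n in counts.most_common() if n > threshold]


# Hypothetical data: the whitelist plays the role of the lookup file.
whitelist = {"test", "test1", "user"}
failed_logins = ["alice"] * 5 + ["test"] * 9 + ["bob"] * 2 + ["carol"] * 3
result = noisy_users(failed_logins, whitelist, threshold=2)
print(result)
```

Adding or removing an excluded user then means editing the whitelist, not every search that uses it.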
  • 105. For More Information
      ▶ Additional information can be found in:
          • Power of SPL App!
          • Docs - Search Manual
          • Docs - MLTK Search Commands
          • MLTK Quick Reference Guide
          • Blogs
          • Answers
          • Exploring Splunk
  • 106. Other Useful Apps to download!
      • SPL Examples App
      • Splunk 6.x Dashboard Examples
      • Splunk 6.5 Overview App
  • 107. How about an extra 50GB?
      ▶ The benefit of using your own laptop is a ‘take it with you after’ approach: you can bang on it without messing up production Splunk
          • Promote the personalized 50GB, 6-month Dev/Test license for use in the workshop (vs. the 30-day download trial)
          • Encourages long-term playing with Splunk and continuously testing out new data sources
  • 108. Q & A
  • 109. Thank You