A comprehensive guide to making machine data accessible across the organization using advanced dashboards
Copyright © 2018 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Sunith Shetty
Acquisition Editor: Tushar Gupta
Content Development Editor: Mayur Pawanikar
Technical Editor: Prasad Ramesh
Copy Editor: Vikrant Phadke
Project Coordinator: Nidhi Joshi
Proofreader: Safis Editing
Indexer: Mariammal Chettiyar
Graphics: Tania Dutta
Production Coordinator: Nilesh Mohite
First published: January 2013
Second edition: July 2015
Third edition: March 2018
Production reference: 1280318
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78883-628-9
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
James D. Miller is an IBM-certified expert, creative innovator, director, senior project leader, and application/system architect with 35+ years of extensive application, system design, and development experience. He has introduced customers to new and sometimes disruptive technologies and platforms, integrating with IBM Watson Analytics, Cognos BI, and TM1, and working on web architecture design, systems analysis, GUI design and testing, and database modeling. He has designed and developed OLAP, client/server, web, and mainframe applications.
Kyle Smith is a self-proclaimed geek from Pennsylvania and has been working with Splunk extensively since 2010. He has spoken many times at the Splunk User Conference and is an active contributor to the Splunk Answers community, the #splunk IRC channel, and the Splunk Slack channels. He has published several Splunk apps and add-ons to Splunkbase, the Splunk community's premier app and add-on publishing platform. He now works as a consultant/developer for Aplura, LLC, one of Splunk's longest-running partners. He has written Splunk Developer's Guide, also by Packt.
Yogesh Raheja is a certified DevOps and cloud expert with a decade of IT experience. He has expertise in technologies such as operating systems, source code management, build and release tools, continuous integration/deployment/delivery tools, containers, configuration management tools, monitoring and logging tools, and public and private clouds. He loves to share his technical expertise with audiences worldwide at various forums, conferences, webinars, blogs, and on LinkedIn (https://in.linkedin.com/in/yogesh-raheja-b7503714). He has written Automation with Puppet 5 and Automation with Ansible.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Implementing Splunk 7 Third Edition
Packt Upsell
Why subscribe?
PacktPub.com
Contributors
About the author
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Conventions used
Get in touch
Reviews
The Splunk Interface
Logging in to Splunk
The home app
The top bar
The Search & Reporting app
Data generator
The Summary view
Search
Actions
Timeline
The field picker
Fields
Search results
Options
Events viewer
Using the time picker
Using the field picker
The settings section
Splunk Cloud
Try before you buy
A quick cloud tour
The top bar in Splunk Cloud
Splunk reference app – PAS
Universal forwarder
eventgen
Next steps
Summary
Understanding Search
Using search terms effectively
Boolean and grouping operators
Clicking to modify your search
Event segmentation
Field widgets
Time
Using fields to search
Using the field picker
Using wildcards efficiently
Supplementing wildcards in fields
All about time
How Splunk parses time
How Splunk stores time
How Splunk displays time
How time zones are determined and why it matters
Different ways to search against time
Presets
Relative
Real-time
Windowed real-time versus all-time real-time searches
Date range
Date and time range
Advanced
Specifying time in-line in your search
_indextime versus _time
Making searches faster
Sharing results with others
The URL
Save As Report
Save As Dashboard Panel
Save As Alert
Save As Event Type
Searching job settings
Saving searches for reuse
Creating alerts from searches
Enable Actions
Action Options
Sharing
Event annotations
An illustration
Summary
Tables, Charts, and Fields
About the pipe symbol
Using top to show common field values
Controlling the output of top
Using stats to aggregate values
Using chart to turn data
Using timechart to show values over time
The timechart options
Working with fields
A regular expression primer
Commands that create fields
eval
rex
Extracting loglevel
Using the extract fields interface
Using rex to prototype a field
Using the admin interface to build a field
Indexed fields versus extracted fields
Indexed field case 1 - rare instances of a common term
Indexed field case 2 - splitting words
Indexed field case 3 - application from source
Indexed field case 4 - slow requests
Indexed field case 5 - unneeded work
Chart enhancements in version 7.0
charting.lineWidth
charting.data.fieldHideList
charting.legend.mode
charting.fieldDashStyles
charting.axisY.abbreviation
Summary
Data Models and Pivots
What is a data model?
What does a data model search?
Data model objects
Object constraining
Attributes
Acceleration in version 7.0
Creating a data model
Filling in the new data model dialog
Editing fields (attributes)
Lookup attributes
Children
What is a pivot?
The Pivot Editor
Working with pivot elements
Filtering pivots
Split (row or column)
Column values
Pivot table formatting
A quick example
Sparklines
Summary
Simple XML Dashboards
The purpose of dashboards
Using wizards to build dashboards
Adding another panel
A cool trick
Converting the panel to a report
More options
Back to the dashboard
Add input
Editing source
Edit UI
Editing XML directly
UI examples app
Building forms
Creating a form from a dashboard
Driving multiple panels from one form
Post-processing search results
Post-processing limitations
Features replaced
Autorun dashboard
Scheduling the generation of dashboards
Summary
Advanced Search Examples
Using subsearches to find loosely related events
Subsearch
Subsearch caveats
Nested subsearches
Using transaction
Using transaction to determine session length
Calculating the aggregate of transaction statistics
Combining subsearches with transaction
Determining concurrency
Using transaction with concurrency
Using concurrency to estimate server load
Calculating concurrency with a by clause
Calculating events per slice of time
Using timechart
Calculating average requests per minute
Calculating average events per minute, per hour
Rebuilding top
Acceleration
Big data – summary strategy
Report acceleration
Report acceleration availability
Version 7.0 advancements in metrics
Definition of a Splunk metric
Using Splunk metrics
Creating a metrics index
Creating a UDP or TCP data input
Summary
Extending Search
Using tags to simplify search
Using event types to categorize results
Using lookups to enrich data
Defining a lookup table file
Defining a lookup definition
Defining an automatic lookup
Troubleshooting lookups
Using macros to reuse logic
Creating a simple macro
Creating a macro with arguments
Creating workflow actions
Running a new search using values from an event
Linking to an external site
Building a workflow action to show field context
Building the context workflow action
Building the context macro
Using external commands
Extracting values from XML
xmlkv
XPath
Using Google to generate results
Summary
Working with Apps
Defining an app
Included apps
Installing apps
Installing apps from Splunkbase
Using Geo Location Lookup Script
Using Google Maps
Installing apps from a file
Building your first app
Editing navigation
Customizing the appearance of your app
Customizing the launcher icon
Using custom CSS
Using custom HTML
Custom HTML in a simple dashboard
Using server-side include in a complex dashboard
Object permissions
How permissions affect navigation
How permissions affect other objects
Correcting permission problems
App directory structure
Adding your app to Splunkbase
Preparing your app
Confirming sharing settings
Cleaning up our directories
Packaging your app
Uploading your app
Self-service app management
Summary
Building Advanced Dashboards
Reasons for working with advanced XML
Reasons for not working with advanced XML
Development process
Advanced XML structure
Converting simple XML to advanced XML
Module logic flow
Understanding layoutPanel
Panel placement
Reusing a query
Using intentions
stringreplace
addterm
Creating a custom drilldown
Building a drilldown to a custom query
Building a drilldown to another panel
Building a drilldown to multiple panels using HiddenPostProcess
Third-party add-ons
Google Maps
Sideview Utils
The Sideview search module
Linking views with Sideview
Sideview URLLoader
Sideview forms
Summary
Summary Indexes and CSV Files
Understanding summary indexes
Creating a summary index
When to use a summary index
When to not use a summary index
Populating summary indexes with saved searches
Using summary index events in a query
Using sistats, sitop, and sitimechart
How latency affects summary queries
How and when to backfill summary data
Using fill_summary_index.py to backfill
Using collect to produce custom summary indexes
Reducing summary index size
Using eval and rex to define grouping fields
Using a lookup with wildcards
Using event types to group results
Calculating top for a large time frame
Summary index searches
Using CSV files to store transient data
Pre-populating a dropdown
Creating a running calculation for a day
Summary
Configuring Splunk
Locating Splunk configuration files
The structure of a Splunk configuration file
The configuration merging logic
The merging order
The merging order outside of search
The merging order when searching
The configuration merging logic
Configuration merging – example 1
Configuration merging – example 2
Configuration merging – example 3
Configuration merging – example 4, search
Using btool
An overview of Splunk.conf files
props.conf
Common attributes
Search-time attributes
Index-time attributes
Parse-time attributes
Input-time attributes
Stanza types
Priorities inside a type
Attributes with class
inputs.conf
Common input attributes
Files as inputs
Using patterns to select rolled logs
Using blacklist and whitelist
Selecting files recursively
Following symbolic links
Setting the value of the host from the source
Ignoring old data at installation
When to use crcSalt
Destructively indexing files
Network inputs
Native Windows inputs
Scripts as inputs
transforms.conf
Creating indexed fields
Creating a loglevel field
Creating a session field from the source
Creating a tag field
Creating host categorization fields
Modifying metadata fields
Overriding the host
Overriding the source
Overriding sourcetype
Routing events to a different index
Lookup definitions
Wildcard lookups
CIDR wildcard lookups
Using time in lookups
Using REPORT
Creating multivalue fields
Creating dynamic fields
Chaining transforms
Dropping events
fields.conf
outputs.conf
indexes.conf
authorize.conf
savedsearches.conf
times.conf
commands.conf
web.conf
User interface resources
Views and navigation
Appserver resources
Metadata
Summary
Advanced Deployments
Planning your installation
Splunk instance types
Splunk forwarders
Splunk indexer
Splunk search
Common data sources
Monitoring logs on servers
Monitoring logs on a shared drive
Consuming logs in batch
Receiving syslog events
Receiving events directly on the Splunk indexer
Using a native syslog receiver
Receiving syslog with a Splunk forwarder
Consuming logs from a database
Using scripts to gather data
Sizing indexers
Planning redundancy
The replication factor
Configuring your replication factors
Syntax
Indexer load balancing
Understanding typical outages
Working with multiple indexes
Directory structure of an index
When to create more indexes
Testing data
Differing longevity
Differing permissions
Using more indexes to increase performance
The life cycle of a bucket
Sizing an index
Using volumes to manage multiple indexes
Deploying the Splunk binary
Deploying from a tar file
Deploying using msiexec
Adding a base configuration
Configuring Splunk to launch at boot
Using apps to organize configuration
Separate configurations by purpose
Configuration distribution
Using your own deployment system
Using the Splunk deployment server
Step 1 – deciding where your deployment server will run
Step 2 - defining your deploymentclient.conf configuration
Step 3 - defining our machine types and locations
Step 4 - normalizing our configurations into apps appropriately
Step 5 - mapping these apps to deployment clients in serverclass.conf
Step 6 - restarting the deployment server
Step 7 - installing deploymentclient.conf
Using LDAP for authentication
Using single sign-on
Load balancers and Splunk
web
splunktcp
deployment server
Multiple search heads
Summary
Extending Splunk
Writing a scripted input to gather data
Capturing script output with no date
Capturing script output as a single event
Making a long-running scripted input
Using Splunk from the command line
Querying Splunk via REST
Writing commands
When not to write a command
When to write a command
Configuring commands
Adding fields
Manipulating data
Transforming data
Generating data
Writing a scripted lookup to enrich data
Writing an event renderer
Using specific fields
A table of fields based on field value
Pretty printing XML
Writing a scripted alert action to process results
Hunk
Summary
Machine Learning Toolkit
What is machine learning?
Content recommendation engines
Natural language processing
Operational intelligence
Defining the toolkit
Time well spent
Obtaining the Kit
Prerequisites and requirements
Installation
The toolkit workbench
Assistants
Extended SPL (search processing language)
ML-SPL performance app
Building a model
Time series forecasting
Using Splunk
Launching the toolkit
Validation
Deployment
Saving a report
Exporting data
Summary
Splunk is a leading platform that fosters an efficient methodology and delivers ways to search, monitor, and analyze growing amounts of big data. This book will allow you to implement new services and utilize them to quickly and efficiently process machine-generated big data.
We'll introduce you to all the new features, improvements, and offerings of Splunk 7. We cover the new modules of Splunk, Splunk Cloud and the Machine Learning Toolkit, which ease data usage. Furthermore, you will learn how to use search terms effectively with Boolean and grouping operators. You will learn not only how to modify your search to make your searches fast, but also how to use wildcards efficiently. Later, you will learn how to use stats to aggregate values, chart to turn data, and timechart to show values over time; you'll also work with fields and chart enhancements and learn how to create a data model with faster data model acceleration. Once this is done, you will learn about XML dashboards, working with apps, building advanced dashboards, configuring and extending Splunk, advanced deployments, and more. Finally, we'll teach you how to use the Machine Learning Toolkit and share some best practices and tips to help you implement Splunk services effectively and efficiently.
By the end of this book, you will have learned about the Splunk software as a whole and will be able to implement Splunk services in your own tasks and projects.
This book is intended for data analysts, business analysts, and IT administrators who want to make the best use of big data, operational intelligence, log management, and monitoring within their organization. Some knowledge of Splunk services will help you get the most out of the book.
Chapter 1, The Splunk Interface, walks you through the most common elements in the Splunk interface.
Chapter 2, Understanding Search, dives into the nuts and bolts of how searching works so that you can make efficient searches to populate the cool reports.
Chapter 3, Tables, Charts, and Fields, starts using fields for more than searches; we'll build tables and graphs. Then we'll learn how to make our own fields.
Chapter 4, Data Models and Pivots, covers data models and pivots, the pivot editor, pivot elements and filters, and sparklines.
Chapter 5, Simple XML Dashboards, demonstrates simple XML dashboards; their purpose; using wizards to build, schedule the generation of, and edit XML directly; and building forms.
Chapter 6, Advanced Search Examples, dives into advanced search examples, which can be a lot of fun. We'll expose some really powerful features of the search language and go over a few tricks that I've learned over the years.
Chapter 7, Extending Search, uses more advanced features of Splunk to help extend the search language and enrich data at search time.
Chapter 8, Working with Apps, explores what makes up a Splunk app, as well as the latest self-service app management (originally introduced in version 6.6) updated in version 7.0.
Chapter 9, Building Advanced Dashboards, covers module nesting, layoutPanel, intentions, and an alternative to intentions with SideView Utils.
Chapter 10, Summary Indexes and CSV Files, explores the use of summary indexes and the commands surrounding them.
Chapter 11, Configuring Splunk, overviews how configurations work and gives a commentary on the most common aspects of Splunk configuration.
Chapter 12, Advanced Deployments, digs into distributed deployments and looks at how they are efficiently configured.
Chapter 13, Extending Splunk, shows a number of ways in which Splunk can be extended to input, manipulate, and output events.
Chapter 14, Machine Learning Toolkit, overviews the fundamentals of Splunk's Machine Learning Toolkit and shows how it can be used to create a machine learning model.
To start with the book, you will first need to download Splunk from https://www.splunk.com/en_us/download.html.
You can find the official installation manual at http://docs.splunk.com/Documentation/Splunk/latest/Installation/Systemrequirements.
You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
1. Log in or register at www.packtpub.com.
2. Select the SUPPORT tab.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Implementing-Splunk-7-Third-Edition. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "The events must have a _time field."
A block of code is set as follows:
sourcetype="impl_splunk_gen" ip="*"| rex "ip=(?P<subnet>\d+\.\d+\.\d+)\.\d+"| table ip subnet
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "There are several ways to define a field. Let's start by using the Extract Fields interface."
Feedback from our readers is always welcome.
General feedback: Email [email protected] and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packtpub.com.
This is the third edition of this book! Splunk has continued to grow in popularity since our original publication and each new release of the product proves to be enthusiastically consumed by the industry. The content of each chapter within this edition has been reviewed and updated for Splunk version 7.0, with new sections added to cover several new features now available in version 7.0. In addition, we have two new chapters, one covering Splunk's latest machine learning toolkit (MLT) and another discussing practical proven-practice recommendations. So, even if you have an earlier edition of this book (thank you!), it's worthwhile picking up this edition.
Let's begin!
This chapter will walk you through the most common elements in the Splunk interface, and will touch upon concepts that are covered in greater detail in later chapters. You may want to dive right into them, but an overview of the user interface elements might save you some frustration later. We will cover the following topics in this chapter:
Logging in and app selection
A detailed explanation of the search interface widgets
A quick overview of the admin interface
The Splunk GUI (Splunk is also accessible through its command-line interface (CLI) and REST API) is web-based, which means that no client needs to be installed. Newer browsers with fast JavaScript engines, such as Chrome, Firefox, and Safari, work better with the interface. As of Splunk Version 6.2.0 (and version 7.0 is no different), no browser extensions are required.
The default port (which can be changed) for a Splunk installation is still 8000. The address will look like http://mysplunkserver:8000 or http://mysplunkserver.mycompany.com:8000:
If you have installed Splunk on your local machine, the address can be some variant of http://localhost:8000, http://127.0.0.1:8000, http://machinename:8000, or http://machinename.local:8000.
Once you determine the address, the first page you will see is the login screen. The default username is admin with the password changeme. The first time you log in, you will be prompted to change the password for the admin user. It is a good idea to change this password to prevent unwanted changes to your deployment.
By default, accounts are configured and stored within Splunk. Authentication can be configured to use another system, for instance, Lightweight Directory Access Protocol (LDAP). By default, Splunk authenticates locally. If LDAP is set up, the order is as follows: LDAP / Local.
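Under the covers, LDAP authentication is configured in the authentication.conf configuration file (configuration files are covered in Chapter 11, Configuring Splunk). A minimal, hedged sketch of what an LDAP setup might look like; the host name, bind account, and base DNs below are placeholders, not values from this book:

# authentication.conf (illustrative only)
[authentication]
authType = LDAP
authSettings = corp_ldap

[corp_ldap]
host = ldap.mycompany.com
port = 389
bindDN = cn=splunk,ou=service,dc=mycompany,dc=com
userBaseDN = ou=people,dc=mycompany,dc=com
groupBaseDN = ou=groups,dc=mycompany,dc=com
userNameAttribute = uid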
After logging in, the default app is the Launcher app (some refer to it as Home). This app is a launching pad for apps and tutorials.
In earlier versions of Splunk, the Welcome tab provided two important shortcuts, Add data and Launch search app. In version 6.2.0, the Home app was divided into distinct areas or panes that provided easy access to Explore Splunk Enterprise (Add Data, Splunk Apps, Splunk Docs, and Splunk Answers) as well as Apps (the app management page), Search & Reporting (the link to the Search app), and an area where you can set your default dashboard (choose a home dashboard).
In version 7.0, the main page has not been changed very much, although you may notice some difference in the graphics. But the general layout remains the same, with the same panes and access to the same functionalities.
We'll cover apps and dashboards in later chapters of this book:
The Explore Splunk Enterprise pane shows the following links:
Product Tours (a change in 7.0): When you click here, you can select a specific tour for your review (Add Data Tour, Search Tour, and Dashboards Tour).
Add Data: This links to the Add Data page. This interface is a great start for getting local data flowing into Splunk (making it available to Splunk users). The Preview data interface takes an enormous amount of complexity out of configuring dates and line breaking. We won't go through those interfaces here, but we will go through the configuration files that these wizards produce in Chapter 11, Configuring Splunk.
Splunk Apps: This allows you to find and install more apps from the Splunk Apps Marketplace (https://splunkbase.splunk.com). This marketplace is a useful resource where Splunk users and employees post Splunk apps, mostly free but some premium ones as well. Note that you will need to have a splunk.com user ID.
Splunk Docs: This is one of your links to the wealth of Splunk documentation available. Specifically, https://answers.splunk.com lets you come on board with the Splunk community on Splunkbase (https://splunkbase.splunk.com/) and get the best out of your Splunk deployment. In addition, this is where you can access http://docs.splunk.com/Documentation/Splunk for the very latest updates to the documentation on (almost) any version of Splunk.
The Apps section shows the apps that have GUI elements on your instance of Splunk. App is an overloaded term in Splunk. An app doesn't necessarily have a GUI; it is simply a collection of configurations wrapped into a directory structure that means something to Splunk. We will discuss apps in a more detailed manner in Chapter 8, Working with Apps.
Search & Reporting is the link to the Splunk Search & Reporting app:
Beneath the Search & Reporting link, Splunk provides an outline that, when you hover over it, displays a Find More Apps balloon tip. Clicking on the link opens the (same) Browse more apps page as the Splunk Apps link mentioned earlier:
Choose a home dashboard provides an intuitive way to select an existing (simple XML) dashboard and set it as part of your Splunk Welcome or Home page. This sets you at a familiar starting point each time you enter Splunk. The following screenshot displays the Choose Default Dashboard dialog:
Once you select (from the drop-down list) an existing dashboard, it will be a part of your welcome screen every time you log in to Splunk—until you change it. There are no dashboards installed by default after installing Splunk, except the Search & Reporting app. Once you have created additional dashboards, they can be selected as the default.
The bar across the top of the window contains information about where you are as well as quick links to preferences, other apps, and administration.
The current app is specified in the upper-left corner. The following screenshot shows the upper-left Splunk bar when using the Search & Reporting app:
Clicking on the text takes you to the default page for that app. In most apps, the text next to the logo is simply changed, but the whole block can be customized with logos and alternate text by modifying the app's CSS. We will cover this in Chapter 8, Working with Apps:
The upper-right corner of the window, as seen in the previous screenshot, contains action links that are almost always available:
The name of the user who is currently logged in appears first. In this case, the user is Administrator. Previously, clicking on the username allowed you to select Edit Account (which would take you to the Your account page) or Logout (of Splunk). In version 7.0, it's a bit different. The first option is now listed as Account Settings, which opens a settings page similar to prior versions (below is the 7.0 page). Logout is the other option, and, like prior versions, it ends the session and forces the user to log in again.
The following screenshot shows what the account settings page looks like:
This form presents the global preferences that a user is allowed to change. Other settings that affect users are configured through permissions on objects and settings on roles. (Note that preferences can also be configured using the command-line interface or by modifying specific Splunk configuration files.) Preferences include the following:
Full name and Email address are stored for the administrator's convenience.
Set password allows you to change your password. This is relevant only if Splunk is configured to use internal authentication. For instance, if the system is configured to use Windows Active Directory via LDAP (a very common configuration), users must change their password in Windows.
Global/Time zone can be changed for the logged-in user.
Default application controls where you first land after login. Most users will want to change this to search.
Restart backgrounded jobs controls whether unfinished queries should run again if Splunk is restarted.
Search settings (Search assistant, Syntax highlighting, Auto-format, and Show line numbers): these properties are used for assistance with command syntax, including examples and autocomplete syntax, or to turn off search assistance. Syntax highlighting displays search string components in different colors.
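As mentioned above, these same preferences can also be set by editing configuration files rather than the GUI. A hedged sketch of a per-user preferences file (user-prefs.conf); the time zone and default app values shown are placeholders:

# $SPLUNK_HOME/etc/users/<user>/user-prefs/local/user-prefs.conf (illustrative only)
[general]
tz = America/New_York
default_namespace = search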
Messages allows you to view any system-level error messages you may have pending. When there is a new message for you to review, a notification displays as a count next to the Messages menu. You can click on the X to remove a message.
The Settings link presents the user with the configuration pages for all Splunk Knowledge objects, Distributed environment, System and Licensing, Data, and Users and Authentication settings. For any option that you are unable to see, you do not have the permissions to view or edit it:
The Activity menu lists shortcuts to Splunk Jobs, Triggered Alerts, and (in previous versions) System Activity views. You can click on Jobs (to open the search jobs manager window, where you can view and manage currently running searches) or Triggered Alerts (to view scheduled alerts that are triggered).
Help lists links to video tutorials, Splunk Answers, the Splunk Contact Support portal, and online Documentation:
Find can be used to search for objects within your Splunk Enterprise instance. These saved objects include Reports, Dashboards, Alerts, and so on. Errors can be searched with the Search & Reporting app by clicking on Open error in search.
The Search & Reporting app (or just the search app) is where most actions in Splunk start. This app is a dashboard where you will begin your searching.
If you want to follow the examples that appear in the next few chapters, install the ImplementingSplunkDataGenerator demo app by following these steps:
1. Download ImplementingSplunkDataGenerator.tar.gz from the code bundle available at http://www.packtpub.com/support.
2. Choose Manage apps... from the Apps menu.
3. Click on the button labeled Install app from file.
4. Click on Choose File, select the file, and then click on Upload.
This data generator app will produce about 16 megabytes of output per day. The app can be disabled so that it stops producing data by using Manage apps... under the App menu.
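The same can be done from Splunk's CLI if you prefer; a sketch, assuming a default installation path and an admin login (the password shown is a placeholder):

# Disable the demo app so that it stops producing data
$SPLUNK_HOME/bin/splunk disable app ImplementingSplunkDataGenerator -auth admin:yourpassword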
Within the Search & Reporting app, the user is presented with the Summary view, which contains information about the data that the user searches by default. This is an important distinction; in a mature Splunk installation, not all users will always search all data by default. But if this is your first trip into Search & Reporting, you'll see the following:
From the screen depicted in the previous screenshot, you can access the Splunk documentation related to What to Search and How to Search. Once you have at least some data indexed (a topic we'll discuss later), Splunk will provide some statistics on the available data under What to Search.
What to Search is shown in the following screenshot:
In previous versions of Splunk, panels such as the All indexed data panel provided statistics for a user's indexed data. Other panels gave a breakdown of data using three important pieces of metadata—Source, Sourcetype, and Hosts. In the current version, 7.0.0, you access this information by clicking on the button labeled Data Summary, which presents the following to the user:
This dialog splits the information into three tabs—Hosts, Sources and Sourcetypes:
A host is a captured hostname for an event. In the majority of cases, the host field is set to the name of the machine where the data originated. There are cases where this is not known, so the host can also be configured arbitrarily.
A source in Splunk is a unique path or name. In a large installation, there may be thousands of machines submitting data, but all data on the same path across these machines counts as one source. When the data source is not a file, the value of the source can be arbitrary, for instance, the name of a script or a network port.
A source type is an arbitrary categorization of events. There may be many sources across many hosts in the same source type. For instance, given the sources /var/log/access.2012-03-01.log and /var/log/access.2012-03-02.log on the hosts fred and wilma, you could reference all these logs with source type access or any other name that you like.
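To make this concrete, here is a small sketch of a search that uses all three pieces of metadata together. The sourcetype name access is purely illustrative and assumes it has been assigned to the logs above:

sourcetype=access (host=fred OR host=wilma)
| stats count by host, source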
Let's move on now and discuss each of the Splunk widgets (just below the app name). The first widget is the navigation bar:
As a general rule, within Splunk, items with downward triangles are menus. Items without a downward triangle are links.
We will cover customizing the navigation bar in Chapter 8, Working with Apps.
Next, we find the Search bar. This is where the magic starts. We'll go into great detail shortly:
Okay, we've finally made it to search. This is where the real power of Splunk lies.
For our first search, we will search for the word (not case-specific) error. Click in the search bar, type the word error, and then either press Enter or click on the magnifying glass to the right of the bar:
Upon initiating the search, we are taken to the search results page (which hasn't really changed in version 7.0):
Refer to the Using the time picker section for details on changing the time frame of your search.
Let's inspect the elements on this page. Below the Search bar, we have the event count, action icons, and menus:
Starting from the left, we have the following:
The number of events matched by the base search. Technically, this may not be the number of results pulled from disk, depending on your search. Also, if your query uses commands, this number may not match what is shown in the event listing.
Job: It opens the Search job inspector window, which provides very detailed information about the query that was run.
Pause: It causes the current search to stop locating events but keeps the job open. This is useful if you want to inspect the current results to determine whether you want to continue a long-running search.
Stop: This stops the execution of the current search but keeps the results generated so far. This is useful when you have found enough and want to inspect or share the results found so far.
Share: It shares the search job. This option extends the job's lifetime to seven days and sets the read permissions to everyone.
Print: This formats the page for printing and instructs the browser to print.
Export: It exports the results. Select this option to output to CSV, raw events, XML, or JavaScript Object Notation (JSON) and specify the number of results to export.
Smart mode: This controls the search experience. You can set it to speed up searches by cutting down on the event data it returns and additionally by reducing the number of fields that Splunk will extract by default from the data (Fast mode). You can otherwise set it to return as much event information as possible (Verbose mode). In Smart mode (the default setting), it toggles search behavior based on the type of search you're running.
Now we'll skip to the timeline below the action icons:
Along with providing a quick overview of the event distribution over a period of time, the timeline is also a very useful tool for selecting sections of time. Placing the pointer over the timeline displays a popup for the number of events in that slice of time. Clicking on the timeline selects the events for a particular slice of time.
Clicking and dragging selects a range of time:
Once you have selected a period of time, clicking on Zoom to selection changes the time frame and reruns the search for that specific slice of time. Repeating this process is an effective way to drill down to specific events.
Deselect shows all events for the time range selected in the time picker.
Zoom out changes the window of time to a larger period around the events in the current time frame.
To the left of the search results, we find the field picker. This is a great tool for discovering patterns and filtering search results:
The field list contains two lists.
Selected Fields, which have their values displayed under the search event in the search results
Interesting Fields, which are other fields that Splunk has picked out for you
Above the field list are two links, Hide Fields and All Fields:
Hide Fields: Hides the field list area from the view
All Fields: Takes you to the Selected Fields window:
We are almost through with all the widgets on the page. We still have a number of items to cover in the search results section, though, just to be thorough:
As you can see in the previous screenshot, at the top of this section, we have the number of events displayed. When viewing all results in their raw form, this number will match the number above the timeline. This value can be changed either by making a selection on the timeline or by using other search commands.
Next, we have the action icons (described earlier) that affect these particular results.
Under the action icons, we have four results tabs:
Events list, which will show the raw events. This is the default view when running a simple search, as we have done so far.
Patterns streamlines event pattern detection. A list of the most common patterns among the set of events is returned by your search. A number of events that share a similar structure are represented by these patterns.
Statistics populates when you run a search with transforming commands such as stats, top, chart, and so on. The previous keyword search for error does not display any results in this tab because it does not have any transforming commands (see the example after this list).
Visualization transforms searches and also populates the Visualization tab. The results area of the Visualization tab includes a chart and the statistics table used to generate the chart. Not all searches are eligible for visualization, a concept which will be covered later in this book.
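As an example of a search that would populate the Statistics and Visualization tabs, a transforming command can simply be appended to our keyword search; a minimal sketch using standard SPL commands:

error
| stats count by host

Swapping stats count by host for timechart count would produce a result suited to charting events over time in the Visualization tab.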
Under the previously described tabs is the timeline that we will cover in more detail later in this chapter.
Beneath the timeline (starting from the left) is a row of option links, including:
Show Fields: Shows the Selected Fields screen
List: Allows you to select an output option (Raw, List, or Table) for displaying the search result
Format: Provides the ability to set Result display options, such as Show row numbers, Wrap results, the Max lines (to display), and Drilldown as on or off
NN Per Page: This is where you can indicate the number of results to show per page (10, 20, or 50)
To the right are options that you can use to choose a page of results, and to change the number of events per page.
Finally, we make it to the actual events. Let's examine a single event:
Starting from the left, we have:
Event Details: Clicking here (indicated by the right-facing arrow) opens the selected event, provides specific information about the event by type, field, and value, and allows you the ability to perform specific actions on a particular event field. In addition, Splunk offers a button labeled Event Actions to access workflow actions, a few of which are always available.
Build Event Type: Event types are a way to name events that match a certain query. We will dive into event types in Chapter 7, Extending Search.
Extract Fields: This launches an interface for creating custom field extractions. We will cover field extraction in Chapter 3, Tables, Charts, and Fields.
Show Source: This pops up a window with a simulated view of the original source.
The event number: Raw search results are always returned in the order most recent first.
Next appear any workflow actions that have been configured. Workflow actions let you create new searches or links to other sites, using data from an event. We will discuss workflow actions in Chapter 7, Extending Search.
Next comes the parsed date from this event, displayed in the time zone selected by the user. This is an important and often confusing distinction. In most installations, everything is in one time zone: the servers, the user, and the events. When one of these three things is not in the same time zone as the others, things can get confusing. We will discuss time in great detail in Chapter 2, Understanding Search.
Next, we see the raw event itself. This is what Splunk saw as an event. With no help, Splunk can do a good job finding the date and breaking lines appropriately; but as we will see later, with a little help, event parsing can be more reliable and more efficient.
Below the event are the fields that were selected in the field picker. Clicking on the value adds the field value to the search.
Now that we've looked through all the widgets, let's use them to modify our search. First, we will change our time. The default setting of All time is fine when there are few events, but when Splunk has been gathering events over a period of time (perhaps for weeks or months), this is less than optimal. Let's change our search time to one hour:
The search will run again, and now we see results for the last hour only. Let's try a custom time. Date Range is an option:
If you know specifically when an event happened, you can drill down to whatever time range you want here. We will examine the other options in Chapter 2, Understanding Search.
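The same effect can be achieved by specifying the time window directly in the search string with time modifiers (covered in Chapter 2, Understanding Search); a small sketch:

error earliest=-1h latest=now
error earliest=-24h@h latest=@h

The @h suffix snaps the boundary to the start of the hour.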
The field picker is very useful for investigating and navigating data. Clicking on any field in the field picker pops open a panel with a wealth of information about that field in the results of your search:
Looking through the information, we observe the following:
Number (of) values, appears in X% of results tells you how many events contain a value for this field.
Selected indicates if the field is a selected field.
Top values and Top values by time (allows referring to the Top 10 Values returned in the search) present graphs about the data in this search. This is a great way to dive into reporting and graphing. We will use this as a launching point later.
Rare values displays the least common values of a field.
Events with this field will modify the query to show only those events that have this field defined.
The links are actually a quick representation of the top values overall. Clicking on a link adds that value to the query. Let's click on c:\\Test Data\\tm1server.log:
This will rerun the search, now looking for errors that affect only the source value c:\\Test Data\\tm1server.log.
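In other words, clicking the link simply appends the selected field value to the search string; the resulting query is equivalent to typing:

error source="c:\\Test Data\\tm1server.log"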
The Settings section, in a nutshell, is an interface for managing configuration files. The number of files and options in these configuration files is truly daunting, so the web interface concentrates on the most commonly used options across the different configuration types.
Splunk is controlled exclusively by plain text configuration files. Feel free to take a look at the configuration files that are being modified as you make changes in the admin interface. You will find them in the following locations:
$SPLUNK_HOME/etc/system/local/
$SPLUNK_HOME/etc/apps/
$SPLUNK_HOME/etc/users/<user>/<app>/local
You may notice configuration files with the same name at different locations. We will cover in detail the different configuration files, their purposes, and how these configurations merge together in Chapter 11, Configuring Splunk. Don't start modifying the configurations directly until you understand what they do and how they merge.
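If you are curious about what the merged result of these files looks like, the btool utility that ships with Splunk will show it without changing anything (btool is discussed in Chapter 11). A sketch, assuming a default installation:

# Show the effective, merged inputs configuration and the file each line came from
$SPLUNK_HOME/bin/splunk btool inputs list --debug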
Clicking on Settings on the top bar takes you to the Settings page:
The reader will note that the layout of the setting page has changed a bit in version 7.0, but it is generally the same as prior versions. We'll point out the differences here. First, there have been some name changes (Distributed Management Console is now Monitoring Console) and a few extra links added (under SYSTEM we see Instrumentation, and DATA has added Source Types).
The options are organized into logical groupings, as follows:
KNOWLEDGE: Each of the links under KNOWLEDGE allows you to control one of the many object types that are used at search time. The following screenshot shows an example of one object type, workflow actions: Searches, reports, and alerts:
System: The options under this section control system-wide settings:
System settings covers network settings, the default location to store indexes, outbound email server settings, and how much data Splunk logs about itself
Server controls contains a single page that lets you restart Splunk from the web interface
Licensing lets you add license files or configure Splunk as a slave to a Splunk license server
Instrumentation (new to version 7.0) lets you configure automated reporting settings, view collected data, export data to a file, or send data to Splunk
Data: This section is where you manage the data flow:
Data Inputs: Splunk can receive data by reading files (either in batch mode or in real time), listening to network ports, or running scripts.
Forwarding and receiving: Splunk instances don't typically stand alone. Most installations consist of at least one Splunk indexer and many Splunk forwarders. Using this interface, you can configure each side of this relationship and more complicated setups (we will discuss this in more detail in Chapter 12, Advanced Deployments).
Indexes: An index is essentially a data store. Under the covers, it is simply a set of directories, created and managed by Splunk. For small installations, a single index is usually acceptable. For larger installations, using multiple indexes allows flexibility in security, retention, and performance tuning, as well as better use of hardware. We will discuss this further in Chapter 11, Configuring Splunk.
Report acceleration summaries: Accesses automatically-created summaries to speed up completion times for certain kinds of reports.
Source Types: Allows access to the source types page. Source types are used to assign configurations like timestamp recognition, event breaking, and field extractions to data indexed by Splunk (see the props.conf sketch after this list).
Distributed environment: The three options here relate to distributed deployments (we will cover these options in detail in Chapter 12, Advanced Deployments):
Indexer clustering: Access to enabling and configuring Splunk Indexer clustering, which we will discuss later in this book.
Forwarder management: Access to the forwarder management UI, which distributes deployment apps to Splunk clients.
Distributed search: Any Splunk instance running searches can utilize itself and other Splunk instances to retrieve results. This interface allows you to configure access to other Splunk instances.
Users and authentication: This section provides authentication controls and an account link:
Access controls: This section is for controlling how Splunk authenticates users and what users are allowed to see and do. We will discuss this further in Chapter 11, Configuring Splunk.
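Because source types ultimately map to settings in props.conf (covered in Chapter 11, Configuring Splunk), here is a hedged sketch of what a simple source type definition might look like; the stanza name and format strings are placeholders, not values from this book:

# props.conf (illustrative only)
[my_custom_log]
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)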
In addition to the links, the Settings page also presents a panel on the left-hand side of the page. This panel includes two icons, Add Data and (previously) Distributed Management Console, now Monitoring Console:
Add Data links to the Add Data page. This page presents you with three options for getting data into your Splunk Enterprise instance: Upload, Monitor, and Forward.
Monitoring Console is where you can view detailed performance information about your Splunk Enterprise deployment.
An exciting new option for Splunk is Splunk Cloud. This option offers almost all of Splunk's features and functionalities along with the convenience of being on a real cloud platform:
Readily available online, Splunk lists the following statement as per http://docs.splunk.com/Documentation/SplunkCloud/6.6.3/User/WelcometoSplunkCloud:
In my experience, moving any software or service to the cloud typically will have some implications. With Splunk Cloud, you can expect the following differences (from Splunk Enterprise):
There is no CLI (Splunk's command-line interface) support. This means that some (administrative) tasks can be achieved through the web browser but most will require Splunk support.
Only apps that have been assessed (on security and stability) and accepted by Splunk support are allowed to be installed and run in Splunk Cloud.
If you selected a managed Splunk Cloud, Splunk support must install and configure all apps (self-service Splunk Cloud still allows you to install apps yourself).
Direct monitoring of TCP, UDP, file, and syslog inputs is not available. Unlike Splunk Enterprise, these data types cannot be sent straight to Splunk Cloud; an on-premises forwarder must be used (see the sketch after this list).
Scripted Alerts are supported only in approved apps.
License pooling is not available in Splunk Cloud. The license manager is not internet-accessible to the Splunk Cloud customers.
Again, for managed Splunk Cloud deployments, the HTTP event collector (HEC) must be set up for you by Splunk.
Access to the Splunk API is initially turned off (for Splunk Clusters) but can be turned on by Splunk support. To enable API access to Splunk Cloud sandbox(es) and trials, and single instance deployments, you must file a Support ticket (not recommended due to the short duration of trials).
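For the syslog, TCP, and UDP cases, the usual pattern is to let an on-premises forwarder receive the data and relay it to Splunk Cloud (the forwarder's outputs are typically configured by the credentials app that Splunk Cloud provides). A hedged sketch of the forwarder-side inputs.conf; the port and sourcetype are assumptions for illustration:

# inputs.conf on the on-premises forwarder (illustrative only)
[udp://514]
sourcetype = syslog
connection_host = ip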
It's worth giving Splunk Cloud a try (even if you are not seriously considering its usage in the near future). If you have a valid Splunk ID, you can test-drive Splunk Cloud for free (for 15 days):
To use your Splunk ID to test drive Splunk Cloud, all you need to do is register and agree to the conditions and terms. This is the Terms of Service acceptance page:
Once you check the box (and then click on the button labeled Ok), you will be sent an instructional email, and you are ready to go!
Let's start with accessing your instance. Once you've received the acknowledgement that your Splunk Cloud (trial) instance is ready, you can point your browser to the provided URL. You will notice that the web address for Splunk Cloud is prefixed with a unique identifier that qualifies your particular instance (this is actually the server name where your instance resides):
And the Log In page is a bit different in appearance (from Splunk Enterprise):
Once you are authenticated, we see the Splunk Cloud main page:
First things first. Looking across the task bar (at the top of the page), if you click on Support & Services and then About, you will notice that the Splunk Cloud version is 6.6.3.2, which is NOT the latest on-premise or locally installed version:
The top bar in Splunk Cloud is largely the same, but the links that appear on the right in a locally installed Splunk instance (Messages, Settings, Activity, and Find) are shifted to the left:
While on the right, there is My Splunk:
The My Splunk link sends you to the Instances