Puppet 5 Beginner's Guide - Third Edition - John Arundel - E-Book


Puppet 5 Beginner's Guide, Third Edition is a practical guide that gets you up and running with the very latest features of Puppet 5.

About This Book

  • Develop skills to run Puppet 5 on single or multiple servers without hiccups
  • Use Puppet to create and manage cloud resources such as Amazon EC2 instances
  • Take full advantage of powerful new features of Puppet including loops, data types, Hiera integration, and container management

Who This Book Is For

Puppet 5 Beginner's Guide, Third Edition is designed for those who are new to Puppet, including system administrators and developers who want to use configuration management to manage their servers. No prior programming or system administration experience is assumed.

What You Will Learn

  • Understand the latest Puppet 5 features
  • Install and set up Puppet and discover the latest and most advanced features
  • Configure, build, and run containers in production using Puppet's industry-leading Docker support
  • Deploy configuration files and templates at super-fast speeds and manage user accounts and access control
  • Automate your IT infrastructure
  • Use the latest features in Puppet 5 onward and its official modules
  • Manage clouds, containers, and orchestration
  • Get to know the best practices to make Puppet more reliable and increase its performance

In Detail

Puppet 5 Beginner's Guide, Third Edition gets you up and running with the very latest features of Puppet 5, including Docker containers, Hiera data, and Amazon AWS cloud orchestration. Go from beginner to confident Puppet user with a series of clear, practical examples to help you manage every aspect of your server setup.

Whether you're a developer, a system administrator, or you are simply curious about Puppet, you'll learn Puppet skills that you can put into practice right away. With practical steps giving you the key concepts you need, this book teaches you how to install packages and config files, create users, set up scheduled jobs, provision cloud instances, build containers, and so much more.

Every example in this book deals with something real and practical that you're likely to need in your work, and you'll see the complete Puppet code that makes it happen, along with step-by-step instructions for what to type and what output you'll see. All the examples are available in a GitHub repo for you to download and adapt for your own server setup.

Style and approach

This tutorial is packed with quick step-by-step instructions that are immediately applicable for beginners. This is an easy-to-read guide, to learn Puppet from scratch, that explains simply and clearly all you need to know to use this essential IT power tool, while applying these solutions to real-world scenarios.


Page count: 347


Table of Contents

Puppet 5 Beginner's Guide Third Edition
About the Author
About the Reviewer
eBooks, discount offers, and more
Why subscribe?
Customer Feedback
What this book covers
What you need for this book
Who this book is for
Reader feedback
Customer support
Downloading the example code
1. Getting started with Puppet
Why do we need Puppet anyway?
Keeping the configuration synchronized
Repeating changes across many servers
Self-updating documentation
Version control and history
Why not just write shell scripts?
Why not just use containers?
Why not just use serverless?
Configuration management tools
What is Puppet?
Resources and attributes
Puppet architectures
Getting ready for Puppet
Installing Git and downloading the repo
Installing VirtualBox and Vagrant
Running your Vagrant VM
Troubleshooting Vagrant
2. Creating your first manifests
Hello, Puppet – your first Puppet manifest
Understanding the code
Modifying existing files
Dry-running Puppet
How Puppet applies the manifest
Creating a file of your own
Managing packages
How Puppet applies the manifest
Querying resources with the puppet resource
Getting help on resources with puppet describe
The package-file-service pattern
Notifying a linked resource
Resource ordering with require
3. Managing your Puppet code with Git
What is version control?
Tracking changes
Sharing code
Creating a Git repo
Making your first commit
How often should I commit?
Distributing Puppet manifests
Creating a GitHub account and project
Pushing your repo to GitHub
Cloning the repo
Fetching and applying changes automatically
Writing a manifest to set up regular Puppet runs
Applying the run-puppet manifest
The run-puppet script
Testing automatic Puppet runs
Managing multiple nodes
4. Understanding Puppet resources
The path attribute
Managing whole files
Trees of files
Symbolic links
Uninstalling packages
Installing specific versions
Installing the latest version
Installing Ruby gems
Installing gems in Puppet's context
Using ensure_packages
The hasstatus attribute
The pattern attribute
The hasrestart and restart attributes
Creating users
The user resource
The group resource
Managing SSH keys
Removing users
Cron resources
Attributes of the cron resource
Randomizing cron jobs
Removing cron jobs
Exec resources
Automating manual interaction
Attributes of the exec resource
The user attribute
The onlyif and unless attributes
The refreshonly attribute
The logoutput attribute
The timeout attribute
How not to misuse exec resources
5. Variables, expressions, and facts
Introducing variables
Using Booleans
Interpolating variables in strings
Creating arrays
Declaring arrays of resources
Understanding hashes
Setting resource attributes from a hash
Introducing expressions
Meeting Puppet's comparison operators
Introducing regular expressions
Using conditional expressions
Making decisions with if statements
Choosing options with case statements
Finding out facts
Using the facts hash
Running the facter command
Accessing hashes of facts
Referencing facts in expressions
Using memory facts
Discovering networking facts
Providing external facts
Creating executable facts
Iterating over arrays
Using the each function
Iterating over hashes
6. Managing data with Hiera
Why Hiera?
Data needs to be maintained
Settings depend on nodes
Operating systems differ
The Hiera way
Setting up Hiera
Adding Hiera data to your Puppet repo
Troubleshooting Hiera
Querying Hiera
Typed lookups
Types of Hiera data
Single values
Boolean values
Interpolation in Hiera data
Using lookup()
Using alias()
Using literal()
The hierarchy
Dealing with multiple values
Merge behaviors
Data sources based on facts
What belongs in Hiera?
Creating resources with Hiera data
Building resources from Hiera arrays
Building resources from Hiera hashes
The advantages of managing resources with Hiera data
Managing secret data
Setting up GnuPG
Adding an encrypted Hiera source
Creating an encrypted secret
How Hiera decrypts secrets
Editing or adding encrypted secrets
Distributing the decryption key
7. Mastering modules
Using Puppet Forge modules
What is the Puppet Forge?
Finding the module you need
Using r10k
Understanding the Puppetfile
Managing dependencies with generate-puppetfile
Using modules in your manifests
Using puppetlabs/mysql
Using puppetlabs/apache
Using puppet/archive
Exploring the standard library
Safely installing packages with ensure_packages
Modifying files in place with file_line
Introducing some other useful functions
The pry debugger
Writing your own modules
Creating a repo for your module
Writing the module code
Creating and validating the module metadata
Tagging your module
Installing your module
Applying your module
More complex modules
Uploading modules to Puppet Forge
8. Classes, roles, and profiles
The class keyword
Declaring parameters to classes
Automatic parameter lookup from Hiera data
Parameter data types
Available data types
Content type parameters
Range parameters
Flexible data types
Defined resource types
Type aliases
Managing classes with Hiera
Using include with lookup()
Common and per-node classes
Roles and profiles
9. Managing files with templates
What are templates?
The dynamic data problem
Puppet template syntax
Using templates in your manifests
Referencing template files
Inline templates
Template tags
Computations in templates
Conditional statements in templates
Iteration in templates
Iterating over Facter data
Iterating over structured facts
Iterating over Hiera data
Working with templates
Passing parameters to templates
Validating template syntax
Rendering templates on the command line
Legacy ERB templates
10. Controlling containers
Understanding containers
The deployment problem
Options for deployment
Introducing the container
What Docker does for containers
Deployment with Docker
Building Docker containers
The layered filesystem
Managing containers with Puppet
Managing Docker with Puppet
Installing Docker
Running a Docker container
Stopping a container
Running multiple instances of a container
Managing Docker images
Building images from Dockerfiles
Managing Dockerfiles
Building dynamic containers
Configuring containers with templates
Self-configuring containers
Persistent storage for containers
Host-mounted volumes
Docker volumes
Networking and orchestration
Connecting containers
Container orchestration
What is orchestration?
What orchestration tools are available?
Running Puppet inside containers
Are containers mini VMs or single processes?
Configuring containers with Puppet
Containers need Puppet too
11. Orchestrating cloud resources
Introducing the cloud
Automating cloud provisioning
Using CloudFormation
Using Terraform
Using Puppet
Setting up an Amazon AWS account
Creating an AWS account
Creating an IAM policy
Creating an IAM user
Storing your AWS credentials
Getting ready to use puppetlabs/aws
Creating a key pair
Installing the puppetlabs/aws module
Installing the AWS SDK gem
Creating EC2 instances with Puppet
Choosing an Amazon Machine Image (AMI)
Creating the EC2 instance
Accessing your EC2 instance
VPCs, subnets, and security groups
The ec2_securitygroup resource
The ec2_instance resource
Managing custom VPCs and subnets
Creating an instance in a custom VPC
The ec2_vpc resource
The ec2_vpc_internet_gateway resource
The ec2_vpc_routetable resource
The ec2_vpc_subnet resource
Other AWS resource types
Provisioning AWS resources from Hiera data
Iterating over Hiera data to create resources
Cleaning up unused resources
12. Putting it all together
Getting the demo repo
Copying the repo
Understanding the demo repo
The control repo
Module management
Users and access control
SSH configuration
Sudoers configuration
Time zone and clock synchronization
Puppet configuration
The bootstrap process
Adapting the repo for your own use
Configuring users
Adding per-node data files and role classes
Modifying the bootstrap credentials
Bootstrapping a new node
Bootstrapping a Vagrant VM
Bootstrapping physical or cloud nodes
Using other distributions and providers
The beginning

Puppet 5 Beginner's Guide Third Edition


Copyright © 2017 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: April 2013

Second edition: May 2017

Third edition: October 2017

Production reference: 1031017

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78847-290-6




Author

John Arundel


Reviewer

Jo Rhett

Acquisition Editor

Ben Renow-Clarke

Project Editor

Alish Firasta

Content Development Editor

Monika Sangwan

Technical Editors

Bhagyashree Rai

Gaurav Gavas

Copy Editor

Gladson Monteiro


Mariammal Chettiyar


Kirk D'Penha

Production Coordinator

Arvindkumar Gupta

Cover Work

Arvindkumar Gupta

About the Author

John Arundel is a DevOps consultant, which means he helps people build world-class web operations teams and infrastructure and has fun doing it. He was formerly a senior operations engineer at global telco Verizon, designing resilient, high-performance infrastructures for major corporations such as Ford, McDonald's, and Bank of America. He is now an independent consultant, working closely with selected clients to deliver web-scale performance and enterprise-grade resilience on a startup budget.

He likes writing books, especially about Puppet (Puppet 2.7 Cookbook and Puppet 3 Cookbook are available from Packt as well). It seems that at least some people enjoy reading them, or maybe they just like the pictures. He also provides training and coaching on Puppet and DevOps, which, it turns out, is far harder than simply doing the work himself.

Off the clock, he is a medal-winning competitive rifle and pistol shooter and a decidedly uncompetitive piano player. He lives in a small cottage in Cornwall, England and believes, like Cicero, that if you have a garden and a library, then you have everything you need.

You may like to follow him on Twitter at @bitfield.


My grateful thanks are due to Jo Rhett, who made innumerable improvements and suggestions to this book, and whose Puppet expertise and clarity of writing I can only strive to emulate. Also to the original Puppet master, Luke Kanies, who created a configuration management tool that sucks less, and my many other friends at Puppet. Many of the key ideas in this book came from them and others including Przemyslaw 'SoboL' Sobieski, Peter Bleeck, and Igor Galić.

The techniques and examples in the book come largely from real production codebases, of my consulting clients and others, and were developed with the indispensable assistance of my friends and colleagues Jon Larkowski, Justin Domingus, Walter Smith, Ian Shaw, and Mike Thomas. Special thanks are also due to the Perseids Project at Tufts University, and most of all to the inestimable Bridget Almas, who patiently read and tested everything in the book several times and made many valuable suggestions, not to mention providing continuous moral support, love, and guidance throughout the writing process. This book is for her.

About the Reviewer

Jo Rhett is a DevOps architect with more than 25 years of experience conceptualizing and delivering large-scale Internet services. He creates automation and infrastructure to accelerate deployment and minimize outages.

Jo has been using, promoting, and enhancing configuration management systems for over 20 years. He builds improvements and plugins for Puppet, MCollective, Chef, Ansible, Docker, and many other DevOps tools.

Jo is the author of the following books:

  • Learning Puppet 4 by O'Reilly
  • Learning MCollective by O'Reilly
  • Instant Puppet 3 Starter by Packt Publishing

I'd like to thank the Puppet community for their never-ending inspiration and support.


eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.


Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

  • Fully searchable across every book published by Packt
  • Copy and paste, print, and bookmark content
  • On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1788628810.

If you'd like to join our team of regular reviewers, you can e-mail us at <[email protected]>. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!


There are many bad ways to write a technical book. One simply rehashes the official documentation. Another walks the reader through a large and complex example, which doesn't necessarily do anything useful, except show how clever the author is. Yet another exhaustively sets out every available feature of the technology, and every possible way you can use them, without much guidance as to which features you'll really use, or which are best avoided.

Like you, I read a lot of technical books as part of my job. I don't need a paraphrase of the documentation: I can read it online. I also don't want huge blocks of code for something that I don't need to do. And I certainly don't want an uncritical exposition of every single feature.

What I do want is for the author to give me a cogent and readable explanation of how the tool works, in enough detail that I can get started using it straight away, but not so much detail that I get bogged down. I want to learn about features in the order in which I'm likely to use them, and I want to be able to start building something that runs and delivers business value from the very first chapters.

That's what you can expect from this book. Whether you're a developer, a system administrator, or merely Puppet-curious, you're going to learn Puppet skills you can put into practice right away. Without going into lots of theory or background detail, I'll show you how to install packages and config files, create users, set up scheduled jobs, provision cloud instances, build containers, and so on. Every example deals with something real and practical that you're likely to need in your work, and you'll see the complete Puppet code to make it happen, along with step-by-step instructions for what to type and what output you'll see. All the examples are available in a GitHub repo for you to download and adapt.

After each exercise, I'll explain in detail what each line of code does and how it works, so that you can adapt it to your own purposes, and feel confident that you understand everything that's happened. By the end of the book, you will have all the skills you need to do real, useful, everyday work with Puppet, and there's a complete demo Puppet repository you can use to get your infrastructure up and running with minimum effort.

So let's get started.

What this book covers

Chapter 1, Getting started with Puppet, introduces Puppet and gets you up and running with the Vagrant virtual machine that accompanies this book.

Chapter 2, Creating your first manifests, shows you how Puppet works, and how to write code to manage packages, files, and services.

Chapter 3, Managing your Puppet code with Git, introduces the Git version control tool, shows you how to create a repository to store your code, and how to distribute it to your Puppet-managed nodes.

Chapter 4, Understanding Puppet resources, goes into more detail about the package, file, and service resources, as well as introducing resources to manage users, SSH keys, scheduled jobs, and commands.

Chapter 5, Variables, expressions, and facts, introduces Puppet's variables, data types, expressions, and conditional statements, shows you how to get data about the node using Facter, and how to create your own custom facts.

Chapter 6, Managing data with Hiera, explains Puppet's key-value database and how to use it to store and retrieve data, including secrets, and how to create Puppet resources from Hiera data.

Chapter 7, Mastering modules, teaches you how to install ready-to-use modules from the Puppet Forge using the r10k tool, introduces you to four key modules including the standard library, and shows you how to build your own modules.

Chapter 8, Classes, roles, and profiles, introduces you to classes and defined resource types, and shows you the best way to organize your Puppet code using roles and profiles.

Chapter 9, Managing files with templates, shows you how to build complex configuration files with dynamic data using Puppet's EPP template mechanism.

Chapter 10, Controlling containers, introduces Puppet's powerful new support for Docker containers, and shows you how to download, build, and run containers using Puppet resources.

Chapter 11, Orchestrating cloud resources, explains how you can use Puppet to provision cloud servers on Amazon AWS, and introduces a fully-automated cloud infrastructure based on Hiera data.

Chapter 12, Putting it all together, takes you through a complete example Puppet infrastructure that you can download and modify for your own projects, using ideas from all the previous chapters.

What you need for this book

You'll need a reasonably modern computer system and access to the Internet. You won't need to be a Unix expert or an experienced sysadmin; I'll assume you can install software, run commands, and edit files, but otherwise I'll explain everything you need as we go.

Who this book is for

This book is mainly for those who are new to Puppet, including system administrators and developers who want to use configuration management to manage their servers. No prior programming or system administration experience is assumed. However, if you have used Puppet before, you'll get a thorough grounding in all the latest features and modules, and I hope you'll still find plenty of new things to learn.


In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Puppet can manage files on a node using the file resource."

A block of code is set as follows:

file { '/tmp/hello.txt':
  ensure  => file,
  content => "hello, world\n",
}

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

file { '/tmp/hello.txt':
  ensure  => file,
  content => "hello, world\n",
}

Any command-line input or output is written as follows:

sudo puppet apply /vagrant/examples/file_hello.pp
Notice: Compiled catalog for ubuntu-xenial in environment production in 0.07 seconds

New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "In the AWS console, select VPC from the Services menu".


Warnings or important notes appear in a box like this.


Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

1. Log in or register to our website using your e-mail address and password.
2. Hover the mouse pointer on the SUPPORT tab at the top.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box.
5. Select the book for which you're looking to download the code files.
6. Choose from the drop-down menu where you purchased this book from.
7. Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR / 7-Zip for Windows
  • Zipeg / iZip / UnRarX for Mac
  • 7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at the following URLs:


You can use the code bundle on GitHub from the Packt Publishing repository as well:


We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!


Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.


Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.


If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.

Chapter 1. Getting started with Puppet


For a list of all the ways technology has failed to improve the quality of life, please press three.

  --Alice Kahn

In this chapter, you'll learn about some of the challenges of managing configuration on servers, some common solutions to these problems, and how automation tools such as Puppet can help. You'll also learn how to download the GitHub repository containing all of the source code and examples in this book, how to set up your own Vagrant virtual machine to run the code, and how to download and install Puppet.

Whether you're a system administrator, a developer who needs to wrangle servers from time to time, or just someone who's annoyed at how long it takes to deploy a new app, you'll have come across the kind of problems Puppet is designed to solve.

Why do we need Puppet anyway?

Managing applications and services in production is hard work, and there are a lot of steps involved. To start with, you need some servers to serve the services. Luckily, these are readily available from your local cloud provider, at low, low prices. So you've got a server, with a base operating system installed on it, and you can log into it. So now what? Before you can deploy, you need to do a number of things:

  • Add user accounts and passwords
  • Configure security settings and privileges
  • Install all the packages needed to run the app
  • Customize the configuration files for each of these packages
  • Create databases and database user accounts; load some initial data
  • Configure the services that should be running
  • Deploy the app code and static assets
  • Restart any affected services
  • Configure the machine for monitoring

That's a lot to do—and for the next server you build, you'll need to do the exact same things all over again. There's something not right about that. Shouldn't there be an easier solution to this problem?

Wouldn't it be nice if you could write an executable specification of how the server should be set up, and you could apply it to as many machines as you liked?
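As a rough sketch of what such an executable specification might look like in Puppet, here is a small manifest covering a few of the steps listed above: a user account, a package, a config file, and a service. The account name, the choice of nginx, and the file path and contents are illustrative assumptions, not taken from the book's example repo.

```puppet
# Hypothetical names throughout; adjust for your own app.

# Add a user account
user { 'appuser':
  ensure     => present,
  managehome => true,
  shell      => '/bin/bash',
}

# Install a package the app needs
package { 'nginx':
  ensure => installed,
}

# Customize a config file; restart the service when it changes
file { '/etc/nginx/conf.d/app.conf':
  ensure  => file,
  content => "server { listen 80; root /var/www/app; }\n",
  require => Package['nginx'],
  notify  => Service['nginx'],
}

# Make sure the service is running and starts at boot
service { 'nginx':
  ensure => running,
  enable => true,
}
```

The same file could be applied to any number of machines, for example with sudo puppet apply, and each one should end up in the same state.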

Keeping the configuration synchronized

Setting up servers manually is tedious. Even if you're the kind of person who enjoys tedium, though, there's another problem to consider. What happens the next time you set up a server, a few weeks or months later?

Your careful notes will no longer be up to date with reality. While you were on vacation, the developers installed a couple of new libraries that the app now depends on—I guess they forgot to tell you! They are under a lot of schedule pressure, of course. You could send out a sternly worded email demanding that people update the build document whenever they change something, and people might even comply with that. But even if they do update the documentation, no-one actually tests the new build process from scratch, so when you come to do it, you'll find it doesn't work anymore. Turns out that if you just upgrade the database in place, it's fine, but if you install the new version on a bare server, it's not.

Also, since the build document was updated, a new version of a critical library was released upstream. Because you always install the latest version as part of the build, your new server is now subtly different to the old one. This will lead to subtle problems which will take you three days, or three bottles of whiskey, to debug.
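One way a configuration management tool can guard against this kind of drift is to pin the version explicitly, so that every server built from the same code gets the same library. A minimal sketch, assuming a hypothetical package and version string:

```puppet
# Pin an exact version so that old and new servers match.
# The version string below is a made-up example; use whatever
# version your package repository actually provides.
package { 'openssl':
  ensure => '1.0.2g-1ubuntu4.8',
}

# By contrast, "ensure => latest" would install whatever is newest
# at build time, which is how servers quietly diverge.
```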

By the time you have four or five servers, they're all a little different. Which is the authoritative one? Or are they all slightly wrong? The longer they're around, the more they will drift apart. You wouldn't run four or five different versions of your app code at once, so what's up with that? Why is it acceptable for server configuration to be in a mess like this?

Wouldn't it be nice if the state of configuration on all your machines could be regularly checked and synchronized with a central, standard version?
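To give a feel for how regular checking might be arranged, here is a sketch of a Puppet cron resource that re-applies a manifest every 15 minutes. The manifest path and the schedule are assumptions for illustration; the puppet binary path is where the Puppet 5 packages usually install it, but check your own system.

```puppet
# Re-apply the manifest every 15 minutes so that any manual
# changes are brought back in line with the standard version.
# The manifest path and schedule are illustrative assumptions.
cron { 'run-puppet':
  command => '/opt/puppetlabs/bin/puppet apply /etc/puppetlabs/code/environments/production/manifests/site.pp',
  minute  => '*/15',
  hour    => '*',
}
```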

Repeating changes across many servers

Humans just aren't good at accurately repeating complex tasks over and over; that's why we invented robots. It's easy to make mistakes, miss things out, or be interrupted and lose track of what you've done.

Changes happen all the time, and it becomes increasingly difficult to keep things up to date and in sync as your infrastructure grows. Again, when you make a change to your app code, you don't go and make that change manually with a text editor on each server. You change it once and roll it out everywhere. Isn't your firewall setup just as much part of your code as your user model?

Wouldn't it be nice if you only had to make changes in one place, and they rolled out to your whole network automatically?

Self-updating documentation

In real life, we're too busy to stop every five minutes and document what we just did. As we've seen, that documentation is of limited use anyway, even if it's kept fanatically up to date.

The only reliable documentation, in fact, is the state of the servers themselves. You can look at a server to see how it's configured, but that only applies while you still have the machine. If something goes wrong and you can't access the machine, or the data on it, your only option is to reconstruct the lost configuration from scratch.

Wouldn't it be nice if you had a clear, human-readable build procedure which was independent of your servers, and was guaranteed to be up to date, because the servers are actually built from it?

Version control and history

When you're making manual, ad hoc changes to systems, you can't roll them back to a point in time. It's hard to undo a whole series of changes; you don't have a way of keeping track of what you did and how things changed.

This is bad enough when there's just one of you. When you're working in a team, it gets even worse, with everybody making independent changes and getting in each other's way.

When you have a problem, you need a way to know what changed and when, and who did it. And you also need to be able to set your configuration back to any previously stable state.

Wouldn't it be nice if you could go back in time?

Why not just write shell scripts?

Many people manage configuration with shell scripts, which is better than doing it manually, but not much. Some of the problems with shell scripts include the following:

  • Fragile and non-portable
  • Hard to maintain
  • Not easy to read as documentation
  • Very site-specific
  • Not a good programming language
  • Hard to apply changes to existing servers

Why not just use containers?

Containers! Is there any word more thrilling to the human soul? Many people feel as though containers are going to make configuration management problems just go away. This feeling rarely lasts beyond the first few hours of trying to containerize an app. Yes, containers make it easy to deploy and manage software, but where do containers come from? It turns out someone has to build and maintain them, and that means managing Dockerfiles, volumes, networks, clusters, image repositories, dependencies, and so on. In other words, configuration. There is an axiom of computer science which I just invented, called The Law of Conservation of Pain. If you save yourself pain in one place, it pops up again in another. Whatever cool new technology comes along, it won't solve all our problems; at best, it will replace them with refreshingly different problems.

Yes, containers are great, but the truth is, container-based systems require even more configuration management. You need to configure the nodes that run the containers, build and update the container images based on a central policy, create and maintain the container network and clusters, and so on.

Why not just use serverless?

If containers are powered by magic pixies, serverless architectures are pure fairy dust. The promise is that you just push your app to the cloud, and the cloud takes care of deploying, scaling, load balancing, monitoring, and so forth. Like most things, the reality doesn't quite live up to the marketing. Unfortunately, serverless isn't actually serverless: it just means your business is running on servers you don't have direct control over. You also have higher fixed costs, because you're paying someone else to run them for you. Serverless can be a good way to get started, but it's not a long-term solution, because ultimately, you need to own your own configuration.

Configuration management tools

Configuration management (CM) tools are the modern, sensible way to manage infrastructure as code. There are many such tools available, all of which operate more or less the same way: you specify your desired configuration state, using editable text files and a model of the system's resources, and the tool compares the current state of each node (the term we use for configuration-managed servers) with your desired state and makes any changes necessary to bring it in line.

As with most unimportant things, there is a great deal of discussion and argument on the Internet about which CM tool is the best. While there are significant differences in approaches and capabilities between different tools, don't let that obscure the fact that using a tool of any sort to manage configuration is much better than trying to do it by hand.

That said, while there are many CM tools available, Puppet is an excellent choice. No other tool is more powerful, more portable, or more widely adopted. In this book, I'm going to show you what makes Puppet so good and the things that only Puppet can do.

What is Puppet?

Puppet is two things: a language for expressing the desired state (how your nodes should be configured), and an engine that interprets code written in the Puppet language and applies it to the nodes to bring about the desired state.

What does this language look like? It's not exactly a series of instructions, like a shell script or a Ruby program. It's more like a set of declarations about the way things should be. Have a look at the following example:

package { 'curl':
  ensure => installed,
}

In English, this code says, "The curl package should be installed." When you apply this manifest (Puppet programs are called manifests), the tool will do the following:

1. Check the list of installed packages on the node to see if curl is already installed.
2. If it is, do nothing.
3. If not, install it.

Here's another example of Puppet code:

user { 'bridget':
  ensure => present,
}

This is Puppet language for the declaration, "The bridget user should be present." (The ensure attribute means "the desired state of the resource is...") Again, this results in Puppet checking for the existence of the bridget user on the node, and creating it if necessary. This is also a kind of documentation that expresses human-readable statements about the system in a formal way. The code expresses the author's desire that Bridget should always be present.

So you can see that the Puppet program—the Puppet manifest—for your configuration is a set of declarations about what things should exist, and how they should be configured.
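As a minimal sketch of what such a set of declarations might look like in a single manifest, the two examples above can sit side by side, along with a declaration that something should not exist. The telnet package name here is only an illustrative assumption, not something taken from the book's example code.

```puppet
# Each resource below is a declaration about desired state, not a command.

# The curl package should be installed.
package { 'curl':
  ensure => installed,
}

# The bridget user should be present.
user { 'bridget':
  ensure => present,
}

# Declarations can also state what should NOT exist.
# 'telnet' is a hypothetical example package name.
package { 'telnet':
  ensure => absent,
}
```

In each case, Puppet compares the declaration with the node's current state and makes a change only where the two differ.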

You don't give commands, like "Do this, then do that". Rather, you describe how things should be, and let Puppet take care of making it happen. These are two quite different kinds of programming. One kind (so-called procedural style) is the traditional model used by languages such as C, Python, shell, and so on. Puppet's is called declarative style because you declare what the end result should be, rather than specify the steps to get there.

This means that you can apply the same Puppet manifest repeatedly to a node and the end result will be the same, no matter how many times you apply the manifest. It's better to think of Puppet manifests as a kind of specification, or declaration, rather than as a program in the traditional sense.
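To make the idea of repeatable application concrete, here is a small sketch using a file resource. The path /tmp/hello.txt and its contents are assumptions chosen purely for illustration.

```puppet
# Desired state: a regular file with exactly this content.
# The path and content are hypothetical example values.
file { '/tmp/hello.txt':
  ensure  => file,
  content => "hello, world\n",
}
```

If you save this in a file and run puppet apply on it, the first run creates the file. On later runs, Puppet finds the file already in the desired state and changes nothing. If someone edits or deletes the file in the meantime, the next run puts it back the way the manifest says it should be.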

Resources and attributes

Puppet lets you describe configuration in terms of resources (types of things that can exist, such as users, files, or packages) and their attributes (appropriate properties for the type of resource, such as the home directory for a user, or the owner and permissions for a file). You don't have to get into the details of how resources are created and configured on different platforms. Puppet takes care of it.
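As a rough illustration of resources and their attributes, the following sketch declares a user with a home directory and a file with an owner and permissions. The specific paths, shell, and file content are assumed example values.

```puppet
# A user resource with attributes appropriate to users.
user { 'bridget':
  ensure     => present,
  home       => '/home/bridget',
  shell      => '/bin/bash',
  managehome => true,   # ask Puppet to create the home directory too
}

# A file resource with attributes appropriate to files.
# The path and content are hypothetical example values.
file { '/home/bridget/notes.txt':
  ensure  => file,
  owner   => 'bridget',
  mode    => '0644',
  content => "Managed by Puppet\n",
}
```

Notice that the manifest says nothing about which commands create a user or set file permissions on a given platform; it only names the resource and the attribute values it should have.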

The power of this approach is that a given manifest can be applied to different nodes, all running different operating systems, and the results will be the same everywhere.

Puppet architectures

It's worth noting that there are two different ways to use Puppet. The first way, known as agent/master architecture, uses a special node dedicated to running Puppet, which all other nodes contact to get their configuration.

The other way, known as stand-alone Puppet or masterless, does not need a special Puppet master node. Puppet runs on each individual node and does not need to contact a central location to get its configuration. Instead, you use Git, or any other way of copying files to the node, such as SFTP or rsync, to update the Puppet manifests on each node.
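One way a stand-alone setup might keep itself up to date is sketched below: a cron resource, itself managed by Puppet, that periodically pulls the latest manifests from Git and applies them. This is only an assumption about how you could arrange it, not a description of the book's own setup. The repository location, the Git path, the manifests directory, and the 15-minute schedule are all hypothetical values.

```puppet
# Hypothetical sketch: every 15 minutes, fetch the latest manifests
# from Git and apply them locally. All paths are assumed examples.
cron { 'run-puppet':
  ensure  => present,
  user    => 'root',
  command => 'cd /etc/puppetlabs/code/environments/production && /usr/bin/git pull && /opt/puppetlabs/bin/puppet apply manifests/',
  minute  => '*/15',
  hour    => '*',
}
```

With something like this in place, a change pushed to the Git repository reaches each node the next time its cron job runs, without any central Puppet master being involved.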

Both stand-alone and agent/master architectures are officially supported by Puppet. It's your choice which one you prefer to use. In this book, I will cover only the stand-alone architecture, which is simpler and easier for most organizations, but almost everything in the book will work just the same whether you use agent/master or stand-alone Puppet.


To set up Puppet with an agent/master architecture, consult the official Puppet documentation.


Summary

In this chapter, we looked at the various problems that configuration management tools can help solve, and how Puppet in particular models the aspects of system configuration. We checked out the Git repository of example code for this book, installed VirtualBox and Vagrant, started the Vagrant VM, and ran Puppet for the first time.

In the next chapter, we'll write our first Puppet manifests, get some insight into the structure of Puppet resources and how they're applied, and learn about the package, file, and service resources.

Chapter 2. Creating your first manifests


Beginnings are such delicate times.

  --Frank Herbert, 'Dune'