Introduction to Configuration Management with Puppet
1 What is configuration management?
Configuration management is the use of specialized software to specify and apply desired configuration state to systems.
1.1 Before configuration management
- Configuration changes made by hand
- hard to do across multiple systems, error-prone
- required careful documentation to be able to recreate changes
- Scripting configuration changes
- faster and less error-prone for multiple systems
- scripts had to be carefully coded to not reapply changes already made
- "make" as a configuration management tool
- Makefile dependency logic could prevent reapplying changes
- limited features for managing complex configuration
1.2 Why configuration management?
- increased system complexity
- more systems to manage
- manual effort doesn't scale
- scripting is tricky
- self-healing: systems can have the (limited) ability to repair themselves
1.3 What makes specialized configuration management software different?
- declarative rather than procedural language
- define desired system state
- "convergence": reapplying the specification should bring the system closer to its desired state
- methods for applying changes built in to the configuration management software
- one authoritative source for configuration information
- easy central management of lots of systems
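As a minimal sketch of the declarative style, the Puppet fragment below describes a desired end state rather than the steps needed to reach it. The package name "ntp" and the file path and contents are assumptions chosen only for illustration.

```puppet
# Desired state only: Puppet decides whether anything needs to change.
# "ntp" and "/etc/motd" are illustrative assumptions.
package { "ntp":
  ensure => installed,
}

file { "/etc/motd":
  ensure  => file,
  owner   => "root",
  group   => "root",
  mode    => "0644",
  content => "This system is managed by Puppet.\n",
}
```

Applying this specification a second time should change nothing, because the system already matches the described state.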
1.4 Popular configuration management systems
Roughly in order of development:
- CFEngine
- http://cfengine.com
- Puppet
- http://puppetlabs.com
- Chef
- https://www.chef.io/products/chef-infra
2 Puppet as a configuration management system
2.1 puppet apply
- simple and low overhead for small installations
- applies configuration specifications from a specified file
- does not require a puppet master to function
- Puppet configuration data must be present on every system using "puppet apply"
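A standalone manifest for "puppet apply" might look like the sketch below. The file name "example.pp" and the managed path are hypothetical; the manifest would be applied locally with "puppet apply example.pp".

```puppet
# example.pp (hypothetical file name), applied locally with:
#   puppet apply example.pp
file { "/tmp/puppet-apply-demo":
  ensure  => file,
  mode    => "0644",
  content => "created by puppet apply\n",
}
```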
2.2 puppet agent
- applies configuration specifications provided by puppet master
- sends "facts" to master (can also be used for monitoring and reporting)
- puppet agents get all configuration data from the Puppet master, and need only the /etc/puppet/puppet.conf file to tell them where the master is
2.3 puppet master
- server for configuration data used by puppet agents
- configuration specification comes from the puppet master and need not be present on agents
- stores "facts" about clients that are used to customize configuration
- certificates used for agent authentication
- the puppet master will not work with unauthenticated agents
- new agents submit a certificate signing request (CSR) to master
- administrator signs CSRs (or arranges for appropriate autosigning)
- master will talk to agents that present signed certificates
2.4 facter
- utility that obtains "facts" (various system details) from an agent host for reporting to the puppet master
- facts are available as variables for use in Puppet code
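A short sketch of facts used as variables follows. It assumes the flat fact variables "$hostname" and "$osfamily" as reported by facter in the Puppet versions this document covers; the package names are assumptions for illustration.

```puppet
# $hostname and $osfamily are facts reported by facter.
# The package names below are illustrative assumptions.
if $osfamily == "RedHat" {
  $web_package = "httpd"
} else {
  $web_package = "apache2"
}

# Log a message on the agent showing the values that were used.
notify { "facts-demo":
  message => "Host ${hostname} would use package ${web_package}",
}
```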
2.5 Puppet configuration files
2.5.1 /etc/puppet/puppet.conf
- main configuration file, used by both agents and masters
2.5.2 /etc/puppet/manifests/site.pp
- "node" is the Puppet term for an individual system managed by Puppet
- specifies configuration for nodes
- "node default" applies to any system that doesn't have its own node declaration
- custom configuration for specific individual nodes
    node "10-0-20-254" {
      include module_name
      include another_module
    }
- node names are the short hostnames of systems (shown by "hostname -s")
- each node declaration can contain individual Puppet resources and include modules
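Putting these points together, a site.pp might be sketched as follows. The node name "webserver1" and the module names "sudoers" and "webserver" are hypothetical.

```puppet
# /etc/puppet/manifests/site.pp (sketch with hypothetical names)

# Fallback for any node without its own declaration
node default {
  include sudoers
}

# A specific node: includes modules and declares one resource directly
node "webserver1" {
  include sudoers
  include webserver

  file { "/etc/issue":
    ensure  => file,
    content => "webserver1 is managed by Puppet\n",
  }
}
```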
2.5.3 /etc/puppet/code/modules
- modules are collections ("classes") of resources for specific services or configurations (such as how to configure a web server or install a sudoers file)
- node configurations in site.pp use "include" to bring in modules
- subdirectory structure for each module
- modulename
- top-level directory for each module
- modulename/manifests/init.pp
- Puppet resource class definition file "class modulename { … }", applied to a node with "include modulename" in the node configuration
- modulename/files
- subdirectory containing files needed by a module
- mapping source declarations to files:
| source => | "puppet:///modules | /modulename | /filepath" |
|---|---|---|---|
| puppetmaster location | /etc/puppet/code/modules | /modulename | /files/filepath |
- 'source => "puppet:///modules/example/software.conf"' therefore refers to "/etc/puppet/code/modules/example/files/software.conf".
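The layout above can be illustrated with a minimal init.pp for a module named "example". The target path "/etc/software.conf" is an assumption made only for this sketch.

```puppet
# /etc/puppet/code/modules/example/manifests/init.pp
class example {
  # "/etc/software.conf" is a hypothetical destination on the node
  file { "/etc/software.conf":
    ensure => file,
    owner  => "root",
    group  => "root",
    mode   => "0644",
    # served from /etc/puppet/code/modules/example/files/software.conf
    source => "puppet:///modules/example/software.conf",
  }
}
```

A node would pick this up with "include example" in its node declaration.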
2.6 Puppet language concepts
See http://docs.puppet.com/puppet/5.4/index.html for links to more detailed documentation on Puppet. This is just an overview of the most important language features.
2.6.1 resource types
The kinds of configuration resources whose state can be managed, with various resource-dependent parameters. An example declaration of a file resource:
    file { "/etc/sudoers":
      ensure  => present,
      mode    => "0400",
      owner   => "root",
      group   => "root",
      source  => "puppet:///modules/sudoers/sudoers",
      require => Package["sudo"],
    }
The string in quotes between the opening brace and the colon "/etc/sudoers" is the resource title, usually also the same as the path to the file being managed (unless the "path =>" attribute says something different). A title must be unique (there cannot be another file with "/etc/sudoers" as its title) and is used to refer to the resource in other places with 'File["/etc/sudoers"]'.
Resources also have a number of parameters for attributes of the resource, many of which are resource-specific. This file resource specifies the mode (permissions), user and group owners, and where to obtain the file contents ("source =>"), as well as a relationship to another resource ('Package["sudo"]').
Below I discuss only some commonly-used resources and parameters.
- file
- a file on the system with specified contents
- ensure (present, absent, directory, link to other file)
- mode
- owner
- group
- source (where to get the master copy of this file)
- user
- a user account, primarily intended for system functions (not so useful for managing interactive user accounts)
- ensure (present, absent)
- uid
- gid
- home
- shell
- comment
- group
- a system group
- ensure (present, absent)
- gid
- members
- package
- a software package
- ensure (installed, latest, specific version, absent)
- source
- service
- a system service (typically a persistent running daemon program)
- enable (true, false; whether to enable this service at boot time)
- ensure (running, stopped)
- cron
- a cron job (scheduled execution of a task)
- user (which user to run as)
- command (command to run)
- minute
- hour
- monthday
- month
- weekday
- exec
- run a command (usually based on some trigger condition)
- command (what command to run)
- refreshonly (run command only if this exec is notified by another resource)
For more details on available resource types and how to declare them, see the "Resource Type Reference" at https://docs.puppet.com/puppet/5.4/type.html (for Puppet 5.4, your version may vary).
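The sketch below declares one resource of several of the types listed above. Every name, ID, and path in it (the "backup" account, "rsync", "crond", the script path) is a hypothetical value chosen for illustration.

```puppet
# A system group and a matching service account
group { "backup":
  ensure => present,
  gid    => 900,
}

user { "backup":
  ensure  => present,
  uid     => 900,
  gid     => "backup",
  home    => "/var/lib/backup",
  shell   => "/bin/false",
  comment => "Backup service account",
}

# A software package and a running, boot-enabled service
package { "rsync":
  ensure => installed,
}

service { "crond":
  ensure => running,
  enable => true,
}

# A cron job run as the backup user at 02:30 every day
cron { "nightly-backup":
  user    => "backup",
  command => "/usr/local/bin/backup.sh",
  hour    => 2,
  minute  => 30,
}

# A command that runs only when another resource notifies it
exec { "rebuild-aliases":
  command     => "/usr/bin/newaliases",
  refreshonly => true,
}
```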
2.6.2 metaparameters
Attributes that can be applied to any resource. (The namevar is listed here for convenience; strictly speaking it is an ordinary attribute that every resource type has, not a metaparameter.)
- namevar
- attribute that has the unique name identifying a resource of a given type (such as the path of a file, name of a user account, name of a package, etc.)
- usually taken from the resource title (the quoted string followed by a colon, given before any other parameters), but can sometimes be specified as an explicit attribute
- require
- specify another resource that must be present before this one is created or updated
- before
- this resource must be present before the specified resource is created
- subscribe
- like "require" but also refresh this resource if the specified resource changes
- notify
- refresh a specified resource if this one changes
For require/before/subscribe/notify, the parameter specifies the resource using the capitalized resource type and the resource title in square brackets.
    require => [
      File["/etc/ldap.conf"],
      Package["openldap"],
      Service["slapd"],
    ],
    source => [
      "puppet:///modules/ldap/$hostname/ldap.conf",
      "puppet:///modules/ldap/ldap.conf",
    ],
This example also shows that many resource parameters accept lists of values, enclosed in brackets [] with elements separated by commas.
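The "before" and "notify" metaparameters express the same relationships from the other side. The following sketch assumes a hypothetical "ssh" module and typical package and service names.

```puppet
# The package must be in place before the configuration file is managed
package { "openssh-server":
  ensure => installed,
  before => File["/etc/ssh/sshd_config"],
}

# A change to the file refreshes (restarts) the service
file { "/etc/ssh/sshd_config":
  ensure => file,
  owner  => "root",
  group  => "root",
  mode   => "0600",
  source => "puppet:///modules/ssh/sshd_config",
  notify => Service["sshd"],
}

service { "sshd":
  ensure => running,
  enable => true,
}
```

Writing "require" on the file and "subscribe" on the service instead would produce the same ordering; which side carries the relationship is a matter of readability.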
2.6.3 variables
- "$var" is value of the variable named "var"
- $var = "value" assigns a value
- node facts are available in variables ($hostname == "ip-10-0-20-254")
- values can be tested in conditional expressions and interpolated into other Puppet expressions or even file contents
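- variables are scoped (top scope, node scope, and the local scope of each class) and cannot be reassigned within the scope where they were set, so use them carefully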
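A brief sketch of assignment, a conditional on a fact, and interpolation follows. The variable names, the role labels, and the file path are assumptions for illustration.

```puppet
# Plain assignment
$admin_email = "root@example.com"

# A selector picks a value based on a fact
$role = $hostname ? {
  "ip-10-0-20-254" => "gateway",
  default          => "generic",
}

# Variables interpolated into file contents
file { "/etc/role-info":
  ensure  => file,
  content => "Host ${hostname} (${role}), contact ${admin_email}\n",
}
```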
2.6.4 functions
- functions that can be evaluated when a manifest is compiled on the puppet master
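A few built-in functions are shown in the sketch below. The server names and the file path are hypothetical, and the version check is included only to demonstrate a function used in a condition.

```puppet
# split() turns a string into an array
$servers = split("ntp1.example.com,ntp2.example.com", ",")

# versioncmp() compares two version strings and returns -1, 0 or 1
if versioncmp($puppetversion, "4.0.0") < 0 {
  fail("This manifest assumes Puppet 4 or later")
}

# inline_template() renders an ERB template given as a string
file { "/etc/ntp-servers.txt":
  ensure  => file,
  content => inline_template("<% @servers.each do |s| %>server <%= s %>\n<% end %>"),
}

# notice() writes a message to the log where the manifest is compiled
notice("Compiled configuration for ${hostname}")
```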
3 Simple example Puppet module
The module class is defined in a file "code/modules/smartd/manifests/init.pp" relative to the top-level directory of your Git puppet repo, and would be pulled into /etc/puppet on your Puppet instances.
    class smartd {
      package { "smartmontools":
        ensure => installed;
      }

      file { "/etc/smartd.conf":
        source => [
          # from modules/smartd/files/$hostname/smartd.conf
          "puppet:///modules/smartd/$hostname/smartd.conf",
          # from modules/smartd/files/smartd.conf
          "puppet:///modules/smartd/smartd.conf",
        ],
        mode    => "444",
        owner   => "root",
        group   => "root",
        # package must be installed before configuration file
        require => Package["smartmontools"],
      }

      service { "smartd":
        # automatically start at boot time
        enable     => true,
        # restart service if it is not running
        ensure     => running,
        # "service smartd status" returns useful service status info
        hasstatus  => true,
        # "service smartd restart" can restart service
        hasrestart => true,
        # package and configuration must be present for service
        require    => [ Package["smartmontools"], File["/etc/smartd.conf"] ],
        # changes to configuration cause service restart
        subscribe  => File["/etc/smartd.conf"],
      }
    }
The list of "source" paths for /etc/smartd.conf means that Puppet first looks for a host-specific version under "code/modules/smartd/files/$hostname/smartd.conf", then falls back to a default version under "code/modules/smartd/files/smartd.conf". Here "$hostname" refers to the Puppet variable "hostname", which is reported by the "facter" utility. When this module is used with a particular node, that node's hostname is the value of "$hostname" that "puppet apply" or the Puppet master builds into the compiled configuration for the node.
The uses of "require =>" mean that the "smartmontools" package must be installed before the configuration file "/etc/smartd.conf" can be managed, and that the package "smartmontools" and the configuration file "/etc/smartd.conf" must exist before the "smartd" service can be managed. The "subscribe =>" also means that the service is restarted whenever the configuration file changes.
This would be applied to a node by adding "include smartd" to a node definition in manifests/site.pp:
    node "somenode" {
      include smartd
    }
"include" refers to an existing module definition in a subdirectory of "code/modules" in the Puppet configuration directory. Any node may include multiple modules. It's also safe to include a module more than once (such as when two different modules both include the same third module).
If "somenode" needs a special version of smartd.conf, that could be placed in "code/modules/smartd/files/somenode/smartd.conf"; otherwise the default configuration in "code/modules/smartd/files/smartd.conf" will be used.
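The earlier point that a module can safely be included more than once can be sketched as follows. The modules "base", "webserver", and "database" are hypothetical, and in practice each class would live in its own modulename/manifests/init.pp; they are shown together here only for brevity.

```puppet
# Shared module used by both of the modules below
class base {
  package { "sudo":
    ensure => installed,
  }
}

class webserver {
  include base
}

class database {
  include base
}

# "base" is pulled in twice but is only added to the configuration once
node "somenode" {
  include webserver
  include database
}
```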