Introduction to Puppet

Puppet is a configuration management system. It is used to bring hosts to the desired state and maintain this state.

I have been working with Puppet for over five years. This text is essentially a translated and re-arranged compilation of key points from the official documentation, which will allow beginners to quickly delve into the essence of Puppet.

Basic Information

Puppet uses a client-server architecture, although a serverless mode with limited functionality is also supported.

Puppet uses a pull model: by default, clients contact the server once every half hour, fetch their configuration, and apply it. If you have worked with Ansible, note that it uses a different, push model: the administrator initiates the run, and the clients never apply anything on their own.

Network communication uses two-way TLS encryption: the server and client have their own private keys and their corresponding certificates. Usually the server issues certificates for clients, but in principle an external CA can also be used.

Introduction to Manifests

In Puppet terminology, the machines that connect to the Puppet server are called nodes. The configuration for nodes is written in manifests in a special programming language, Puppet DSL.

Puppet DSL is a declarative language. It describes the desired state of the node in the form of a declaration of individual resources, for example:

  • The file exists and has some content.
  • The package is installed.
  • The service is running.

Resources can be interconnected:

  • There are dependencies, they affect the order in which resources are applied.
    For example, "first install the package, then fix the configuration file, then start the service."
  • There are notifications - if the resource has changed, it sends notifications to the resources subscribed to it.
    For example, if the configuration file changes, you can automatically restart the service.

In addition, the Puppet DSL has functions and variables, as well as conditional statements and selectors. Various templating mechanisms are also supported - EPP and ERB.

Puppet is written in Ruby, so many constructs and terms are taken from there. Ruby allows you to extend Puppet - add complex logic, new types of resources, functions.

When Puppet runs, the server compiles the manifests for each specific node into a catalog. A catalog is the list of resources and their relationships that remains after function values have been computed, variables resolved, and conditional statements expanded.

Syntax and code style

If the examples given here are not enough, the language sections of the official documentation explain the syntax in detail.

Here is an example of what the manifest looks like:

# Comments, as in many other places, start with a hash sign.
#
# A node's configuration begins with the keyword node,
# followed by a node selector: a hostname (with or without the domain),
# a regular expression for hostnames, or the keyword default.
#
# The node's configuration itself then follows in curly braces.
#
# The same node can match several selectors. The priority of selectors
# is described in the article on node definition syntax.
node 'hostname', 'f.q.d.n', /regexp/ {
  # The configuration is essentially a list of resources and their parameters.
  #
  # Every resource has a type and a title.
  #
  # Note: there cannot be two resources of the same type with the same title!
  #
  # A resource declaration starts with its type. The type is written in lowercase.
  # The different resource types are described below.
  #
  # After the type, inside curly braces, comes the resource title, then a colon,
  # then an optional list of resource parameters and their values.
  # Parameter values are assigned with the so-called hash rocket (=>).
  resource { 'title':
    param1 => value1,
    param2 => value2,
    param3 => value3,
  }
}

Indentation and line breaks are not a mandatory part of a manifest, but there is a recommended style guide. In brief:

  • Two-space indentation; tabs are not used.
  • Curly braces are set off by a space; there is no space before a colon.
  • A comma follows each parameter, including the last one. Each parameter goes on its own line. An exception is made for resources with no parameters or a single parameter: these can be written on one line without the comma (i.e. resource { 'title': } and resource { 'title': param => value }).
  • The arrows next to the parameters must be aligned.
  • Resource relationship arrows are written at the start of the line, before the resource that follows them.
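
As a rough illustration of the points above, here is how a small manifest might look when formatted this way. The node name, package, and file contents are made up for the example; only the layout matters.

```puppet
node 'web1.example.com' {
  # A single parameter: one line, no trailing comma.
  package { 'ntp': ensure => installed }

  # Several parameters: one per line, arrows aligned, trailing comma.
  # The relationship arrow sits at the start of the line, before the resource.
  -> file { '/etc/ntp.conf':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0644',
    content => "server pool.ntp.org\n",
  }

  ~> service { 'ntp':
    ensure => running,
    enable => true,
  }
}
```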

File layout on the Puppet server

For further explanation, I will introduce the concept of "root directory". The root directory is the directory where the Puppet configuration for a particular node is located.

The root directory differs depending on the version of Puppet and on whether environments are used. Environments are independent sets of configuration stored in separate directories. They are typically used together with git, in which case environments are created from git branches. Accordingly, each node belongs to a particular environment. This is configured on the node itself or in the ENC, which I will talk about in the next article.

  • In the third version ("old Puppet") the base directory was /etc/puppet. Using environments is optional; we, for example, do not use them with old Puppet. If environments are used, they are usually stored in /etc/puppet/environments, and the root directory is the environment directory. If environments are not used, the root directory is the base directory.
  • Starting with the fourth version ("new Puppet"), using environments became mandatory, and the base directory moved to /etc/puppetlabs/code. Accordingly, environments are stored in /etc/puppetlabs/code/environments, and the root directory is the environment directory.

The root directory must have a manifests subdirectory, which contains one or more manifests describing the nodes. There should also be a modules subdirectory, which contains the modules. I will explain what modules are a little later. In addition, old Puppet may also have a files subdirectory with various files that we copy to the nodes. In new Puppet, all files live in modules.

Manifest files have the extension .pp.

A couple of real-world examples

Description of the node and the resource on it

On the node server1.testdomain, the file /etc/issue must be created with the content Debian GNU/Linux n l. The file must be owned by user and group root, and its permissions must be 644.

Let's write the manifest:

node 'server1.testdomain' {   # configuration block for the node server1.testdomain
    file { '/etc/issue':   # declare the file /etc/issue
        ensure  => present,   # the file must exist
        content => 'Debian GNU/Linux n l',   # it must have this content
        owner   => root,   # owning user
        group   => root,   # owning group
        # File permissions. They are given as a string (in quotes) because
        # otherwise a number with a leading 0 would be read as octal,
        # and things would not go as intended.
        mode    => '0644',
    }
}

Resource relationships on a node

On the node server2.testdomain, nginx must be running with a configuration prepared in advance.

Let's break the task down:

  • The nginx package needs to be installed.
  • The configuration files need to be copied from the server.
  • The nginx service needs to be running.
  • If the configuration is updated, the service must be restarted.

Let's write the manifest:

node 'server2.testdomain' {   # configuration block for the node server2.testdomain
    package { 'nginx':   # declare the nginx package
        ensure => installed,   # it must be installed
    }
  # The straight arrow (->) says that the resource below must be
  # applied after the resource described above.
  # Such dependencies are transitive.
    -> file { '/etc/nginx':   # declare the file /etc/nginx
        ensure  => directory,   # it must be a directory
        source  => 'puppet:///modules/example/nginx-conf',   # its contents come from the Puppet server at this address
        recurse => true,   # copy files recursively
        purge   => true,   # delete extra files (those not present in the source)
        force   => true,   # delete extra directories
    }
  # The wavy arrow (~>) says that the resource below must
  # subscribe to changes of the resource described above.
  # The wavy arrow includes the straight one (->).
    ~> service { 'nginx':   # declare the nginx service
        ensure => running,   # it must be running
        enable => true,   # it must start automatically at boot
    }
  # When a resource of type service receives a notification,
  # the corresponding service is restarted.
}

For this to work, you need roughly the following file layout on the Puppet server:

/etc/puppetlabs/code/environments/production/ # (this is for new Puppet; for old Puppet the root directory is /etc/puppet)
├── manifests/
│   └── site.pp
└── modules/
    └── example/
        └── files/
            └── nginx-conf/
                ├── nginx.conf
                ├── mime.types
                └── conf.d/
                    └── some.conf

Resource types

For a complete list of supported resource types, see the documentation. Here I will describe five basic types, which in my experience are enough to solve most problems.

file

Manages files, directories, symlinks, their contents, access rights.

Options:

  • resource name - path to the file (optional)
  • path - path to the file (if it is not specified in the name)
  • ensure - file type:
    • absent - delete a file
    • present - there must be a file of any type (if there is no file, a regular file will be created)
    • file - regular file
    • directory - directory
    • link - symlink
  • content - file contents (only suitable for regular files, cannot be used with source or target)
  • source - the location from which to copy the file's contents (cannot be used together with content or target). It can be a URI with the puppet: scheme (the file is taken from the Puppet server), with the http: scheme (the file is downloaded over HTTP), or with the file: scheme, or an absolute path without a scheme (in both of these cases the file is taken from the node's local file system)
  • target - where the symlink should point (cannot be used with content or source)
  • owner - the user who should own the file
  • group - the group that the file should belong to
  • mode - file permissions (as a string)
  • recurse - enables recursive processing of directories
  • purge - enables deletion of files that are not described in Puppet
  • force - enables removal of directories that are not described in Puppet
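
A few hedged examples of how these parameters combine. All paths, names, and contents below are invented for illustration.

```puppet
# A regular file with inline content.
file { '/etc/motd':
  ensure  => file,
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
  content => "Managed by Puppet\n",
}

# A symlink: target says where the link points.
file { '/usr/local/bin/python':
  ensure => link,
  target => '/usr/bin/python3',
}

# The title is only an identifier here; the real path comes from path.
file { 'app-log-dir':
  ensure => directory,
  path   => '/var/log/myapp',
  mode   => '0750',
}

# Remove a file if it exists.
file { '/etc/cron.d/obsolete-job':
  ensure => absent,
}
```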

package

Installs and removes packages. It can handle notifications: it reinstalls the package if the reinstall_on_refresh parameter is set.

Options:

  • resource name - package name (optional)
  • name - package name (if not specified in the name)
  • provider - the package manager to use
  • ensure - the desired state of the package:
    • present, installed - any version is installed
    • latest - the latest version is installed
    • absent - removed (apt-get remove)
    • purged - removed along with configuration files (apt-get purge)
    • held - the package version is locked (apt-mark hold)
    • any other string - the specified version is installed
  • reinstall_on_refresh - if true, the package will be reinstalled upon receipt of the notification. Useful for source-based distributions where rebuilding packages may be necessary when changing build options. Default false.
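
A short sketch of the ensure values and the provider parameter in use. The package names are ordinary Debian or gem packages chosen for illustration, and the version string is made up.

```puppet
# Pin an exact version (the version string here is hypothetical).
package { 'nginx':
  ensure => '1.18.0-6',
}

# Always upgrade to the newest version available in the repositories.
package { 'ca-certificates':
  ensure => latest,
}

# Remove a package together with its configuration files.
package { 'exim4':
  ensure => purged,
}

# Install through a non-default package manager.
package { 'bundler':
  ensure   => installed,
  provider => 'gem',
}
```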

service

Manages services. Able to handle notifications - restarts the service.

Options:

  • resource name - service to be managed (optional)
  • name - the service to be managed (if not specified in the name)
  • ensure - the desired state of the service:
    • running - started
    • stopped - stopped
  • enable - controls whether the service starts automatically:
    • true - autostart enabled (systemctl enable)
    • mask - masked (systemctl mask)
    • false - autostart disabled (systemctl disable)
  • restart - command to restart the service
  • status - command to check the status of the service
  • hasrestart - whether the service's init script supports restarting. If false and the restart parameter is specified, the value of that parameter is used. If false and the restart parameter is not specified, the service is stopped and then started (though with systemd the command systemctl restart is used).
  • hasstatus - whether the service's init script supports the status command. If false, the value of the status parameter is used. Default: true.
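
For illustration, a few service declarations. The service myapp and its control script are hypothetical; ssh and apache2 are the usual Debian service names.

```puppet
# Keep the service running and start it at boot.
service { 'ssh':
  ensure => running,
  enable => true,
}

# Stop the service and mask it (systemd) so that it cannot be started.
service { 'apache2':
  ensure => stopped,
  enable => mask,
}

# A custom restart command: on a notification Puppet runs it
# instead of the default restart.
service { 'myapp':
  ensure  => running,
  enable  => true,
  restart => '/usr/local/bin/myapp-ctl reload',
}
```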

exec

Runs external commands. If none of the parameters creates, onlyif, unless, or refreshonly is specified, the command runs on every Puppet run. It can handle notifications: it runs the command.

Options:

  • resource name - command to be executed (optional)
  • command - the command to be executed (if it is not specified in the name)
  • path - paths in which to search for the executable file
  • onlyif - if the command given in this parameter exits with a zero return code, the main command is executed
  • unless - if the command given in this parameter exits with a non-zero return code, the main command is executed
  • creates - if the file given in this parameter does not exist, the main command is executed
  • refreshonly - if true, the command runs only when this exec receives a notification from other resources
  • cwd - the directory from which to run the command
  • user - the user to run the command as
  • provider - how to run the command:
    • posix - a child process is simply created; path must be specified
    • shell - the command is run in the /bin/sh shell; path may be omitted, and globbing, pipes, and other shell features can be used. Usually selected automatically if the command contains special characters (|, ;, &&, ||, etc.).
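
The guard parameters are easiest to understand side by side. In this sketch the URL, paths, and resource titles are placeholders, not anything from a real setup.

```puppet
# creates: the command runs only while /opt/app.tar.gz does not exist.
exec { 'download-app-archive':
  command => 'wget -O /opt/app.tar.gz https://example.com/app.tar.gz',
  path    => ['/usr/bin', '/bin'],
  creates => '/opt/app.tar.gz',
}

# unless: the command runs only if the check exits with a non-zero code.
exec { 'create-swapfile':
  command => 'fallocate -l 1G /swapfile',
  path    => ['/usr/bin', '/bin'],
  unless  => 'test -f /swapfile',
}

# refreshonly: the command runs only when another resource notifies this exec.
exec { 'systemd-daemon-reload':
  command     => 'systemctl daemon-reload',
  path        => ['/usr/bin', '/bin'],
  refreshonly => true,
}
```

Without one of these guards, each of the commands would be executed on every agent run.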

cron

Manages cronjobs.

Options:

  • resource name - just some identifier
  • ensure - Cronjob state:
    • present - create if does not exist
    • absent - delete if exists
  • command - what command to run
  • environment - the environment in which to run the command (a list of environment variables and their values, written as NAME=value)
  • user - the user to run the command as
  • minute, hour, weekday, month, monthday - when to run the job. If any of these attributes is not specified, its value in the crontab will be *.
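
A minimal example of a cron resource, assuming a hypothetical script /usr/local/bin/rotate-logs.sh. The job runs at 03:15 on Mondays, Wednesdays, and Fridays; month and monthday are not specified and therefore stay *.

```puppet
cron { 'rotate-app-logs':
  ensure      => present,
  command     => '/usr/local/bin/rotate-logs.sh',
  user        => 'root',
  environment => ['MAILTO=admin@example.com', 'PATH=/usr/bin:/bin'],
  hour        => 3,
  minute      => 15,
  weekday     => [1, 3, 5],
}
```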

In Puppet 6.0, cron was, in a sense, removed from the core in puppetserver, so there is no documentation for it on the main site. It is still bundled with puppet-agent, though, so there is no need to install it separately. Its documentation can be found in the documentation for the fifth version of Puppet, or on GitHub.

About resources in general

Resource uniqueness requirements

The most common error we encounter is Duplicate declaration. It occurs when the catalog contains two or more resources of the same type with the same title.

So I'll write again: in the manifests for one node there should not be resources of the same type with the same name (title)!

Sometimes you need to install packages that have the same name but come from different package managers. In this case, use the name parameter to avoid the error:

package { 'ruby-mysql':
  ensure   => installed,
  name     => 'mysql',
  provider => 'gem',
}
package { 'python-mysql':
  ensure   => installed,
  name     => 'mysql',
  provider => 'pip',
}

Other resource types have similar parameters that help avoid duplication: name for service, command for exec, and so on.

Metaparameters

Every resource type supports some special parameters (metaparameters), regardless of its nature.

The full list of metaparameters is in the Puppet documentation.

Short list:

  • require - which resources this resource depends on.
  • before - which resources depend on this resource.
  • subscribe - which resources this resource receives notifications from.
  • notify - which resources receive notifications from this resource.

All of the listed metaparameters accept either a single resource reference or an array of references in square brackets.
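
A sketch of these metaparameters in use, with one single reference and one array of references (the reference syntax itself, Type['title'], is explained in the next section). The module path in source is an assumption.

```puppet
package { 'openssh-server':
  ensure => installed,
  before => File['/etc/ssh/sshd_config'],   # package first, then the file
}

file { '/etc/ssh/sshd_config':
  ensure => file,
  source => 'puppet:///modules/example/sshd_config',
  notify => Service['ssh'],                 # a change refreshes the service
}

service { 'ssh':
  ensure  => running,
  enable  => true,
  require => [Package['openssh-server'], File['/etc/ssh/sshd_config']],
}
```

The require on the service is redundant here, because before and notify already imply the same ordering; it is included only to show the array form.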

Resource references

A resource reference is simply a mention of an already declared resource. References are mainly used to specify dependencies. Referencing a non-existent resource causes a compilation error.

The syntax is as follows: the resource type with a capital first letter (if the type name contains double colons, each part of the name between the colons is capitalized), then the resource title in square brackets (the case of the title does not change!). There must be no spaces: the square brackets are written immediately after the type name.

Example:

file { '/file1': ensure => present }
file { '/file2':
  ensure => directory,
  before => File['/file1'],
}
file { '/file3': ensure => absent }
File['/file1'] -> File['/file3']

Dependencies and notifications

See the official documentation on resource relationships and ordering for details.

As mentioned earlier, simple dependencies between resources are transitive. Be careful when declaring dependencies, by the way: it is possible to create a dependency cycle, which causes a compilation error.

Unlike dependencies, notifications are not transitive. The following rules apply to notifications:

  • If a resource receives a notification, it is refreshed. The refresh action depends on the resource type: exec runs the command, service restarts the service, package reinstalls the package. If no refresh action is defined for the resource, nothing happens.
  • In one Puppet run, a resource is refreshed no more than once. This is possible because notifications include dependencies, and the dependency graph contains no cycles.
  • If Puppet changes the state of a resource, the resource sends notifications to all resources subscribed to it.
  • If a resource is refreshed, it sends notifications to all resources subscribed to it.
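
The rules above can be seen in a small chain. This is only a sketch: the myapp binary, its --check-config flag, and the config contents are hypothetical.

```puppet
file { '/etc/myapp/config.yaml':
  ensure  => file,
  content => "workers: 4\n",
}
# If Puppet changes the file, the exec is notified and its command runs.
~> exec { 'validate-myapp-config':
  command     => '/usr/local/bin/myapp --check-config',
  refreshonly => true,
}
# The exec that has just run sends its own notification to the service,
# so the service is restarted.
~> service { 'myapp':
  ensure => running,
}
```

If the file is already in the desired state, no notification is sent, so the exec does not run and the service is not restarted.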

Handling unspecified parameters

As a rule, if a resource parameter has no default value and is not specified in the manifest, Puppet does not change the corresponding property of the resource on the node. For example, if the owner parameter is not specified for a resource of type file, Puppet will not change the owner of the corresponding file.

Introduction to classes, variables and defines

Suppose we have several nodes that share part of their configuration but also differ in places (otherwise we could describe everything in a single node {} block). You could, of course, just copy the shared parts, but this is generally a bad idea: the configuration grows, and any change to the shared part has to be made in many places. That makes mistakes easy, and the DRY (don't repeat yourself) principle was invented for a reason.

To solve this problem, Puppet has a construct called a class.

Classes

A class is a named block of Puppet code. Classes are needed to reuse code.

First, the class needs to be defined. The definition by itself does not add any resources anywhere. A class is defined in the manifests:

# A class definition starts with the keyword class and the class name.
# The class body follows in curly braces.
class example_class {
    ...
}

After that, the class can be used:

# first way to use a class: resource-style, with the type class
class { 'example_class': }
# second way: with the include function
include example_class
# the difference between the two is explained later

Taking the example from the previous task, let's move the installation and configuration of nginx into a class:

class nginx_example {
    package { 'nginx':
        ensure => installed,
    }
    -> file { '/etc/nginx':
        ensure  => directory,
        source  => 'puppet:///modules/example/nginx-conf',
        recurse => true,
        purge   => true,
        force   => true,
    }
    ~> service { 'nginx':
        ensure => running,
        enable => true,
    }
}

node 'server2.testdomain' {
    include nginx_example
}

Variables

The class from the previous example is not flexible at all, because it always brings the same nginx configuration. Let's turn the path to the configuration into a variable, so that the class can be used to install nginx with any configuration.

This can be done with variables.

Attention: variables in Puppet are immutable!

In addition, a variable can only be accessed after it has been declared, otherwise the value of the variable will be undef.

An example of working with variables:

# creating variables
$variable = 'value'
$var2 = 1
$var3 = true
$var4 = undef
# using variables
$var5 = $var6
file { '/tmp/text': content => $variable }
# variable interpolation: expanding variable values inside strings. Works only in double quotes!
$var6 = "Variable with name variable has value ${variable}"

Puppet has namespaces, and variables accordingly have a scope: a variable with the same name can be defined in different namespaces. When a variable is resolved, it is looked up in the current namespace, then in the enclosing one, and so on.

Namespace examples:

  • global - variables outside the class or node description get there;
  • the namespace of the node in the description of the node;
  • the class namespace in the class declaration.

To avoid ambiguity when referring to a variable, you can specify the namespace in the variable name:

# a variable without a namespace
$var
# a variable in the global namespace
$::var
# a variable in a class namespace
$classname::var
$::classname::var
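
To make the lookup rules more concrete, here is a small sketch with a made-up class scope_example and node name. The notice function simply writes its argument to the log on the server.

```puppet
$greeting = 'hello from the global namespace'

class scope_example {
  # A class-local variable that shadows the global one inside the class.
  $greeting = 'hello from the class'

  notice($greeting)      # found in the class namespace first
  notice($::greeting)    # explicitly the global variable
}

node 'server6.testdomain' {
  include scope_example

  # From outside the class, its variable needs the qualified name.
  notice($scope_example::greeting)
}
```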

Let's agree that the path to the nginx configuration is in the variable $nginx_conf_source. Then the class will look like this:

class nginx_example {
    package { 'nginx':
        ensure => installed,
    }
    -> file { '/etc/nginx':
        ensure  => directory,
        source  => $nginx_conf_source,   # here we use a variable instead of a fixed string
        recurse => true,
        purge   => true,
        force   => true,
    }
    ~> service { 'nginx':
        ensure => running,
        enable => true,
    }
}

node 'server2.testdomain' {
    $nginx_conf_source = 'puppet:///modules/example/nginx-conf'
    include nginx_example
}

However, the above example is bad because there is some “secret knowledge” that somewhere inside the class a variable with such and such a name is used. It is much more correct to make this knowledge general - classes can have parameters.

Class parameters are variables in the class namespace. They are defined in the class header and can be used like ordinary variables in the class body. Parameter values are specified when the class is used in a manifest.

The parameter can be given a default value. If a parameter does not have a default value and a value is not set on use, this will cause a compilation error.

Let's parameterize the class from the example above and add two parameters: the first, required, is the path to the configuration, and the second, optional, is the name of the package with nginx (Debian, for example, has packages nginx, nginx-light, nginx-full).

# parameters are declared in parentheses right after the class name
class nginx_example (
  $conf_source,
  $package_name = 'nginx-light', # a parameter with a default value
) {
  package { $package_name:
    ensure => installed,
  }
  -> file { '/etc/nginx':
    ensure  => directory,
    source  => $conf_source,
    recurse => true,
    purge   => true,
    force   => true,
  }
  ~> service { 'nginx':
    ensure => running,
    enable => true,
  }
}

node 'server2.testdomain' {
  # if we want to set class parameters, the include function will not do* - we need a resource-style declaration
  # *in fact it will do, but more on that in the next installment. Keyword: "Hiera".
  class { 'nginx_example':
    conf_source => 'puppet:///modules/example/nginx-conf',   # class parameters are set exactly like parameters of other resources
  }
}

Variables in Puppet are typed. There are many data types. Data types are commonly used to validate parameter values passed to classes and defines. If a passed parameter does not match the specified type, a compilation error occurs.

The type is written immediately before the parameter name:

class example (
  String $param1,
  Integer $param2,
  Array $param3,
  Hash $param4,
  Hash[String, String] $param5,
) {
  ...
}
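
A slightly fuller sketch of typed parameters combined with default values. The class example_typed and its parameters are invented for illustration; Enum, Optional, and ranged Integer are standard Puppet data types.

```puppet
class example_typed (
  Enum['present', 'absent'] $ensure       = 'present',
  Integer[1, 65535]         $port         = 8080,
  Boolean                   $manage_dir   = true,
  Optional[String]          $bind_address = undef,
  Array[String]             $allowed_ips  = [],
) {
  if $manage_dir {
    file { '/etc/example':
      ensure => directory,
    }
  }
}

node 'server7.testdomain' {
  class { 'example_typed':
    port        => 9090,
    allowed_ips => ['10.0.0.1', '10.0.0.2'],
    # port => '9090' would fail compilation: a String is not an Integer.
  }
}
```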

Classes: include classname vs class{'classname':}

Each class is a resource of type class. As with any other resource type, there cannot be two instances of the same class on the same node.

If you try to add a class to the same node twice using class { 'classname': } (whether with different or identical parameters), there will be a compilation error. On the other hand, when a class is used resource-style, all of its parameters can be set explicitly right in the manifest.

With include, however, the class can be added as many times as you like. The point is that include is an idempotent function that checks whether the class has already been added to the catalog. If the class is not in the catalog, include adds it; if it is already there, include does nothing. But with include you cannot set class parameters at declaration time: all required parameters must come from an external data source, Hiera or ENC. We will talk about them in the next article.
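
The difference can be shown with a tiny, made-up class motd_example:

```puppet
class motd_example {
  file { '/etc/motd':
    ensure  => file,
    content => "Managed by Puppet\n",
  }
}

node 'server4.testdomain' {
  include motd_example
  include motd_example          # fine: the second include does nothing

  # class { 'motd_example': }   # would fail with a Duplicate declaration error
}
```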

Defines

As mentioned in the previous section, the same class cannot be present on a node more than once. In some cases, however, you need to apply the same block of code with different parameters on the same node. In other words, you need a resource type of your own.

For example, to install a PHP module, at Avito we do the following:

  1. Install the package with the module.
  2. Create a configuration file for the module.
  3. Create a symlink to the config for php-fpm.
  4. Create a symlink to the config for php cli.

For such cases there is a construct called a define (also: defined type, defined resource type). A define is similar to a class, but there are differences: first, each define is a resource type, not a resource; second, each define has an implicit parameter $title, which receives the resource title when the define is declared. As with classes, a define must first be defined, and only then can it be used.

Simplified PHP module example:

define php74::module (
  $php_module_name = $title,
  $php_package_name = "php7.4-${title}",
  $version = 'installed',
  $priority = '20',
  $data = "extension=${title}.so\n",
  $php_module_path = '/etc/php/7.4/mods-available',
) {
  package { $php_package_name:
    ensure          => $version,
    # The triggers of Debian's php packages create symlinks and restart the
    # php-fpm service on their own. We don't need that, since we manage both
    # the symlinks and the service with Puppet.
    install_options => ['-o', 'DPkg::NoTriggers=true'],
  }
  -> file { "${php_module_path}/${php_module_name}.ini":
    ensure  => file,
    content => $data,
  }
  file { "/etc/php/7.4/cli/conf.d/${priority}-${php_module_name}.ini":
    ensure => link,
    target => "${php_module_path}/${php_module_name}.ini",
  }
  file { "/etc/php/7.4/fpm/conf.d/${priority}-${php_module_name}.ini":
    ensure => link,
    target => "${php_module_path}/${php_module_name}.ini",
  }
}

node 'server3.testdomain' {
  php74::module { 'sqlite3': }
  php74::module { 'amqp': php_package_name => 'php-amqp' }
  php74::module { 'msgpack': priority => '10' }
}

Defines are where the Duplicate declaration error is easiest to run into. It happens when a define contains a resource with a constant title and a node has two or more instances of that define.

Protecting yourself is easy: every resource inside the define must have a title that depends on $title. The alternative is to add resources idempotently. In the simplest case it is enough to move the resources shared by all instances of the define into a separate class and include that class in the define, since the include function is idempotent.
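
A minimal sketch of that approach. The names vhost_example::common and vhost_example::site, as well as the generated config line, are made up for the example.

```puppet
# Resources shared by every instance live in a class...
class vhost_example::common {
  file { '/etc/nginx/sites-enabled':
    ensure => directory,
  }
}

# ...and the define includes that class instead of declaring them itself.
define vhost_example::site (
  String $server_name = $title,
) {
  include vhost_example::common

  # The title of this resource depends on $title, so instances do not clash.
  file { "/etc/nginx/sites-enabled/${title}.conf":
    ensure  => file,
    content => "server { server_name ${server_name}; }\n",
    require => Class['vhost_example::common'],
  }
}

node 'server5.testdomain' {
  vhost_example::site { 'example.com': }
  vhost_example::site { 'example.org': }
}
```

Had the directory resource been declared directly inside the define, the second instance would have triggered a Duplicate declaration error.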

There are other ways to achieve idempotence when adding resources, namely the functions defined and ensure_resources, but I'll talk about them in the next installment.

Dependencies and notifications for classes and defines

Classes and defines add the following rules to handling dependencies and notifications:

  • a dependency on a class/define adds dependencies on all resources of the class/define;
  • a dependency declared for a class/define is added to all resources of the class/define;
  • notifying a class/define notifies all resources of the class/define;
  • subscribing a class/define subscribes all resources of the class/define.
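
For example, using the parameterized nginx_example class from above; the two extra files and the site.key source path are hypothetical.

```puppet
node 'server2.testdomain' {
  class { 'nginx_example':
    conf_source => 'puppet:///modules/example/nginx-conf',
  }

  # Applied only after every resource in nginx_example has been applied.
  file { '/var/www/html/index.html':
    ensure  => file,
    content => "It works\n",
    require => Class['nginx_example'],
  }

  # Notifying the class notifies all of its resources,
  # including Service['nginx'], which then restarts.
  file { '/etc/ssl/private/site.key':
    ensure => file,
    source => 'puppet:///modules/example/site.key',
    notify => Class['nginx_example'],
  }
}
```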

Conditional statements and selectors

See the official documentation on conditional statements and expressions for details.

if

Everything is simple here:

if EXPRESSION1 {
  ...
} elsif EXPRESSION2 {
  ...
} else {
  ...
}
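
A concrete example, assuming the built-in $facts hash (facts are not covered in this article). Variables assigned inside the branches remain visible after the if, because if does not create a new namespace.

```puppet
if $facts['os']['family'] == 'Debian' {
  $web_package = 'apache2'
} elsif $facts['os']['family'] == 'RedHat' {
  $web_package = 'httpd'
} else {
  # fail() aborts compilation with the given message.
  fail("Unsupported OS family: ${facts['os']['family']}")
}

package { $web_package:
  ensure => installed,
}
```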

unless

unless is an if in reverse: the block of code will be executed if the expression is false.

unless EXPRESSION {
  ...
}

case

Nothing complicated here either. The values can be ordinary values (strings, numbers, and so on), regular expressions, or data types.

case EXPRESSION {
  VALUE1: { ... }
  VALUE2, VALUE3: { ... }
  default: { ... }
}
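
A sketch of case with plain values, a regular expression, and a default branch, again assuming the built-in $facts hash:

```puppet
case $facts['os']['name'] {
  'Debian', 'Ubuntu': {
    $ssh_service = 'ssh'
  }
  /^(RedHat|CentOS|Rocky)$/: {
    $ssh_service = 'sshd'
  }
  default: {
    fail("Unsupported OS: ${facts['os']['name']}")
  }
}

service { $ssh_service:
  ensure => running,
  enable => true,
}
```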

Selectors

A selector is a language construct similar to case, only instead of executing a block of code, it returns a value.

$var = $othervar ? { 'val1' => 1, 'val2' => 2, default => 3 }

Modules

When the configuration is small, it fits easily into a single manifest. But the more configuration we describe, the more classes and nodes the manifest accumulates; it grows and becomes awkward to work with.

There is also the problem of code reuse: when all the code is in one manifest, it is hard to share it with others. To solve both problems, Puppet has modules.

Modules are sets of classes, defines, and other Puppet entities moved to a separate directory. In other words, a module is an independent piece of Puppet logic. For example, there may be a module for working with nginx, and it will contain everything and only what is needed for working with nginx, or there may be a module for working with PHP, and so on.

Modules are versioned, and dependencies between modules are also supported. There is an open repository of modules, Puppet Forge.

On the Puppet server, modules live in the modules subdirectory of the root directory. Inside each module there is a standard directory layout: manifests, files, templates, lib, and so on.

Structure of files in a module

The module root can contain the following directories with self-explanatory names:

  • manifests - contains manifests
  • files - contains files
  • templates - contains templates
  • lib - contains Ruby code

This is not a complete list of directories and files, but it's enough for this article.

Resource names and file names in a module

See the official documentation on module structure and naming for details.

Resources (classes, defines) in a module cannot be given arbitrary names. Moreover, there is a direct correspondence between the name of a resource and the name of the file in which Puppet looks for its definition. If you break the naming rules, Puppet simply will not find the definition, and a compilation error will result.

The rules are simple:

  • All resources in a module must be in the module's namespace. If the module is called foo, all resources in it must be named foo::<anything> or just foo.
  • The resource named after the module must be in the file init.pp.
  • For other resources, the file name is derived as follows:
    • the prefix with the module name is dropped
    • all double colons, if any, are replaced with slashes
    • the .pp extension is appended

I will demonstrate with an example. Suppose I am writing an nginx module. It contains the following resources:

  • class nginx described in the manifest init.pp;
  • class nginx::service described in the manifest service.pp;
  • define nginx::server described in the manifest server.pp;
  • define nginx::server::location described in the manifest server/location.pp.
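
To make the mapping concrete, here is a minimal sketch of what those four manifests might contain. The file paths follow the rules above (relative to the module root); the bodies, the parameters port and server, and the titles in the usage lines are hypothetical placeholders rather than a real nginx module.

```puppet
# manifests/init.pp
class nginx {
  contain nginx::service
}

# manifests/service.pp
class nginx::service {
  service { 'nginx':
    ensure => running,
  }
}

# manifests/server.pp
define nginx::server (
  Integer $port = 80,   # hypothetical parameter
) {
  notify { "nginx::server ${title} listens on port ${port}": }
}

# manifests/server/location.pp
define nginx::server::location (
  String $server,       # hypothetical parameter
) {
  notify { "nginx::server::location ${title} belongs to server ${server}": }
}

# Usage, for example in a node manifest:
include nginx
nginx::server { 'example.com': port => 8080 }
nginx::server::location { '/static': server => 'example.com' }
```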

Templates

You probably already know what templates are, so I will not describe them in detail here. If not, Wikipedia has an overview.

How to use templates: a template is rendered with the template function, which takes the path to the template. For resources of type file, it is used together with the content parameter. For example, like this:

file { '/tmp/example':
  content => template('modulename/templatename.erb'),
}

A path of the form <modulename>/<filename> refers to the file <rootdir>/modules/<modulename>/templates/<filename>.

There is also the inline_template function: it takes the template text itself as input rather than a file name.

Within templates, all Puppet variables in the current scope can be used.
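
A minimal sketch of inline_template reading a variable from the current scope. The variable $servers and its values are hypothetical; inside the ERB text the manifest variable is referenced as @servers.

```puppet
# Hypothetical values, for illustration only.
$servers = ['10.0.0.1', '10.0.0.2']

file { '/tmp/example':
  ensure  => file,
  # Inside the template, the manifest variable $servers is visible as @servers.
  content => inline_template('<%= @servers.join(", ") %>'),
}
```

The template string is single-quoted so that Puppet passes it to ERB untouched.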

Puppet supports two template formats: ERB and EPP.

Briefly about ERB

Control structures:

  • <%= EXPRESSION %> - insert the value of the expression
  • <% EXPRESSION %> - evaluate the expression without inserting its value. Conditional statements (if) and loops (each) usually go here.
  • <%# COMMENT %> - a comment; it produces no output

Expressions in ERB are written in Ruby (ERB stands for Embedded Ruby).

To access a variable from the manifest, prefix its name with @. To remove the line break that follows a control construct, use the closing tag -%>.

Template usage example

Let's say I'm writing a module to manage ZooKeeper. The class responsible for creating the config looks something like this:

class zookeeper::configure (
  Hash[String, Integer] $nodes,  # hostname => server id; the template iterates over (node, id) pairs
  Integer $port_client,
  Integer $port_quorum,
  Integer $port_leader,
  Hash[String, Any] $properties,
  String $datadir,
) {
  file { '/etc/zookeeper/conf/zoo.cfg':
    ensure  => present,
    content => template('zookeeper/zoo.cfg.erb'),
  }
}

And the corresponding template, zoo.cfg.erb, looks like this:

<% if @nodes.length > 0 -%>
<% @nodes.each do |node, id| -%>
server.<%= id %>=<%= node %>:<%= @port_leader %>:<%= @port_quorum %>;<%= @port_client %>
<% end -%>
<% end -%>

dataDir=<%= @datadir %>

<% @properties.each do |k, v| -%>
<%= k %>=<%= v %>
<% end -%>

Facts and built-in variables

Often a specific part of the configuration depends on what is currently happening on the node. For example, depending on the Debian release, you need to install one version of a package or another. You could track all this manually, rewriting manifests whenever a node changes, but that is not a serious approach; automation is much better.

To get information about nodes, Puppet has a mechanism called facts. Facts are pieces of information about a node, available in manifests and templates as regular variables in the global namespace: for example, the hostname, operating system version, processor architecture, list of users, list of network interfaces and their addresses, and much more.

An example of working with facts:

notify { "Running OS ${facts['os']['name']} version ${facts['os']['release']['full']}": }
# a notify resource simply writes a message to the log
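
Returning to the Debian example at the start of this section, a fact can drive the choice of package version. The sketch below is only an illustration: the package name example-package and the version strings are hypothetical, and it assumes the node runs Debian, where the os.release.major fact is a string such as '11' or '12'.

```puppet
# Hypothetical package name and versions, for illustration only.
$example_version = $facts['os']['release']['major'] ? {
  '11'    => '1.2.3-1',
  '12'    => '1.4.0-1',
  default => 'installed',
}

package { 'example-package':
  ensure => $example_version,
}
```

The manifest stays the same when a node is upgraded; only the fact value changes.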

Formally speaking, a fact has a name (a string) and a value (various types are available: strings, arrays, dictionaries). There is a set of built-in facts, and you can also write your own. Fact collectors are written either as Ruby functions or as executable files. Facts can also be supplied as text files with data, placed on the nodes.

During a run, the Puppet agent first copies all available fact collectors from the Puppet server to the node, then runs them and sends the collected facts to the server; after that, the server starts compiling the catalog.

Facts as executables

Such facts are placed in modules, in the facts.d directory. The files must, of course, be executable. When run, they must print information to standard output either in YAML format or in key=value format.

Do not forget that facts are distributed to all nodes managed by the Puppet server on which your module is deployed. Therefore, make sure the script checks that the system has all the programs and files your fact needs.

#!/bin/sh
# key=value format
echo "testfact=success"

#!/bin/sh
# YAML format (this JSON-style mapping is also valid YAML)
echo '{"testyamlfact":"success"}'

Facts in Ruby

Such facts are placed in modules in the directory lib/facter.

# everything starts with a call to Facter.add with the fact name and a block of code
Facter.add('ladvd') do
  # confine blocks describe when the fact is applicable: the code inside the block
  # must return a true value, otherwise the fact is not computed and not returned
  confine do
    Facter::Core::Execution.which('ladvdc') # check that this executable is in PATH
  end
  confine do
    File.socket?('/var/run/ladvd.sock') # check that this UNIX domain socket exists
  end
  # the setcode block is where the fact value is actually computed
  setcode do
    hash = {}
    if (out = Facter::Core::Execution.execute('ladvdc -b'))
      out.split.each do |l|
        line = l.split('=')
        next if line.length != 2
        name, value = line
        # strip one single quote from each end of the value, if present
        hash[name.strip.downcase.tr(' ', '_')] = value.strip.chomp("'").reverse.chomp("'").reverse
      end
    end
    hash  # the value of the last expression in the setcode block is the value of the fact
  end
end

Text facts

Such facts are placed on the nodes, in the directory /etc/facter/facts.d for old versions of Puppet or /etc/puppetlabs/facter/facts.d for new ones. For example, a plain-text file with key=value pairs and a YAML file:

examplefact=examplevalue

---
examplefact2: examplevalue2
anotherfact: anothervalue

Accessing facts

Facts can be accessed in two ways:

  • through the $facts dictionary: $facts['fqdn'];
  • using the fact name as a variable name: $fqdn.

It is best to use the $facts dictionary, and better still to specify the global namespace explicitly ($::facts).
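
One reason for this advice is that a local variable can shadow a fact variable of the same name, while $facts is a reserved name. A minimal sketch, assuming the built-in kernel fact and a hypothetical class named example:

```puppet
class example {
  # A local variable may reuse the name of a fact and shadow it in this scope.
  $kernel = 'not a fact'

  notice($kernel)            # the local variable
  notice($facts['kernel'])   # the fact
  notice($::kernel)          # the top-scope variable, i.e. the fact again
}

include example
```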

Here is the relevant section of the documentation.

Built-in variables

In addition to facts, there are also some variables available in the global namespace.

  • trusted facts - variables taken from the client certificate (since the certificate is usually issued on the Puppet server, the agent cannot simply change its own certificate, which is why these variables are "trusted"): certificate name, host name and domain name, extensions from the certificate.
  • server facts - variables with information about the server: version, name, server IP address, environment.
  • agent facts - variables added directly by puppet-agent rather than by facter: certificate name, agent version, puppet version.
  • master variables - Puppet master variables (sic!). They are about the same as the server facts, plus the values of the configuration settings are available.
  • compiler variables - compiler variables that differ in each scope: the name of the current module and the name of the module from which the current object was called. They can be used, for example, to check that your private classes are not being used directly from other modules.
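
A short sketch of how some of these variables look in a manifest. The class example::internal and the module name example are hypothetical; the check is meant to live inside that module (modules/example/manifests/internal.pp), where $module_name is set by the compiler.

```puppet
# Trusted facts: taken from the agent's certificate.
notice("Certificate name: ${trusted['certname']}")

# Server facts: information about the server compiling the catalog.
notice("Server version: ${server_facts['serverversion']}")

# Hypothetical private class in a hypothetical module named example.
class example::internal {
  # $module_name is the module containing this class; $caller_module_name is
  # the module from which the class was declared.
  if $caller_module_name != $module_name {
    fail("Class ${name} is private to module ${module_name}")
  }
}
```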

Addendum 1: how do you run and debug all this?

The article contains many examples of Puppet code but says nothing about how to run it. Let me fix that.

For Puppet to work, an agent is enough, but in most cases a server is needed as well.

Agent

In recent versions, the puppet-agent packages from the official Puppetlabs repository contain all the dependencies (ruby and the corresponding gems), so installation poses no difficulties (I am talking about Debian-based distributions - we do not use RPM-based ones).

In the simplest case, to apply a Puppet configuration it is enough to run the agent in serverless mode: provided that the Puppet code has been copied to the node, run puppet apply <path to manifest>:

atikhonov@atikhonov ~/puppet-test $ cat helloworld.pp 
node default {
    notify { 'Hello world!': }
}
atikhonov@atikhonov ~/puppet-test $ puppet apply helloworld.pp 
Notice: Compiled catalog for atikhonov.localdomain in environment production in 0.01 seconds
Notice: Hello world!
Notice: /Stage[main]/Main/Node[default]/Notify[Hello world!]/message: defined 'message' as 'Hello world!'
Notice: Applied catalog in 0.01 seconds

It is better, of course, to set up a server and run the agents on the nodes in daemon mode; then every half an hour they will apply the configuration downloaded from the server.

You can imitate the push model: go to the node you are interested in and run sudo puppet agent -t. The -t (--test) switch actually enables several options that can also be enabled individually. Among them:

  • do not run in daemon mode (by default, the agent starts in daemon mode);
  • exit after applying the catalog (by default, the agent will continue to work and apply the configuration once every half an hour);
  • write a detailed log of work;
  • show changes to files.

The agent also has a no-change mode of operation. You can use it when you are not sure that the configuration you wrote is correct and want to check what exactly the agent would change. This mode is enabled with the --noop command-line parameter: sudo puppet agent -t --noop.

In addition, you can turn on the debug log, in which puppet writes about every action it performs: the resource it is currently processing, the parameters of that resource, the programs it launches. Unsurprisingly, this is the --debug option.

Server

I will not cover the full setup of puppetserver and deploying code to it in this article. I will only say that out of the box you get a fully functional server that needs no additional configuration to handle a small number of nodes (say, up to a hundred). A larger number of nodes will require tuning: by default, puppetserver launches no more than four workers, so for better performance you need to increase their number, and do not forget to raise the memory limits as well, otherwise the server will spend most of its time in garbage collection.

Code deployment: if you need something quick and simple, take a look at r10k (https://github.com/puppetlabs/r10k); for small installations it should be enough.

Addendum 2: coding guidelines

  1. Move all logic into classes and defines.
  2. Keep classes and defines in modules, not in node manifests.
  3. Use facts.
  4. Don't write if statements keyed on hostnames.
  5. Feel free to add parameters to classes and defines - it is better than implicit logic hidden in the class/define body.

I will explain why I recommend this in the next article.
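
As a small illustration of points 3 to 5, here is a minimal sketch. The class example::app and its workers parameter are hypothetical; the default value is taken from the built-in processors fact instead of being chosen by an if on the hostname.

```puppet
# Hypothetical class; in a real setup it would live in
# modules/example/manifests/app.pp rather than in a node manifest.
class example::app (
  # An explicit parameter whose default comes from a fact,
  # instead of an if on the hostname buried in the class body.
  Integer $workers = $facts['processors']['count'],
) {
  notify { "example::app would be configured with ${workers} workers": }
}

node default {
  include example::app
}
```

A node that needs a different value can pass workers explicitly when declaring the class.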

Conclusion

This concludes the introduction. In the next article I will talk about Hiera, ENC and PuppetDB.


In fact, there is much more material. I can write articles on the following topics:

  • Advanced Puppet constructs - some next-level shit: loops, mappings and other lambda expressions, resource collectors, exported resources and host-to-host communication via Puppet, tags, providers, abstract data types.
  • "I'm an admin at my mother's", or how we at Avito made several Puppet servers of different versions get along, plus, more generally, the part about administering the Puppet server.
  • How we write Puppet code: tooling, documentation, testing, CI/CD.


Source: habr.com