Speed up Ansible

It's no secret that with its default settings Ansible can be rather slow. In this article I point out several reasons for this and offer a useful minimum of settings that may well speed up your project.

Here and below we discuss Ansible 2.9.x, installed into a freshly created virtualenv in whatever way you prefer.

After installation, create an "ansible.cfg" file next to your playbook. This location lets the settings travel with the project, and they are picked up automatically as long as you run Ansible from that directory.

Pipelining

You may or may not have heard that you should use pipelining: instead of copying modules to the file system of the target host, Ansible sends a Base64-wrapped zip archive straight to the stdin of the Python interpreter. Either way, this setting remains underrated. Unfortunately, one popular Linux distribution used to ship a sudo configuration that required a tty (terminal) for this command, so Ansible leaves this very useful setting turned off by default.

pipelining = True
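To make the idea concrete, here is a minimal local sketch of "send the code to the interpreter's stdin instead of writing a file". It is not Ansible's implementation: the real payload is a zip archive sent over ssh, while this sketch assumes a one-line stand-in module (MODULE_SOURCE, a hypothetical name) and a local interpreter.

```python
import base64
import subprocess
import sys

# Hypothetical stand-in for a module payload (real Ansible ships a zip archive).
MODULE_SOURCE = 'print("hello from the piped module")\n'


def run_piped(module_source: str) -> str:
    """Execute module_source by feeding a Base64 bootstrap to python's stdin."""
    payload = base64.b64encode(module_source.encode("utf-8")).decode("ascii")
    # The bootstrap decodes and executes the payload; nothing is written to disk.
    bootstrap = (
        "import base64\n"
        f"exec(base64.b64decode('{payload}').decode('utf-8'))\n"
    )
    result = subprocess.run(
        [sys.executable, "-"],  # "-" makes Python read the program from stdin
        input=bootstrap,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_piped(MODULE_SOURCE), end="")
```

Over ssh the same trick saves the separate "copy the file, then run it, then delete it" round trips, which is where the speedup comes from.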

Fact gathering

Did you know that with the default settings Ansible gathers facts from every participating host at the start of every play? If you didn't, now you do. To prevent this, enable either explicit fact gathering (explicit) or smart mode. In smart mode, facts are gathered only from hosts that have not appeared in previous plays.
UPD. When copying the line below, pick one of the two values.

gathering = smart|explicit
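The difference between the modes can be modelled in a few lines. This is only a sketch of the policy as described above, not Ansible's code; the function name, the host names, and the already_gathered set are assumptions made for illustration.

```python
from typing import Iterable, List, Optional, Set


def hosts_to_gather(
    mode: str,
    play_hosts: Iterable[str],
    already_gathered: Set[str],
    gather_facts: Optional[bool] = None,
) -> List[str]:
    """Return the hosts a play would gather facts from under the given mode.

    gather_facts mirrors the play keyword; None means "not set in the play".
    """
    hosts = list(play_hosts)
    if gather_facts is False:
        return []  # the play opted out explicitly
    if mode == "implicit":
        return hosts  # default behaviour: every play, every host
    if mode == "explicit":
        return hosts if gather_facts is True else []  # only on request
    if mode == "smart":
        # skip hosts whose facts were gathered by an earlier play
        return [h for h in hosts if h not in already_gathered]
    raise ValueError(f"unknown gathering mode: {mode!r}")


if __name__ == "__main__":
    seen = {"web1"}  # hypothetical host covered by a previous play
    for mode in ("implicit", "smart", "explicit"):
        print(mode, hosts_to_gather(mode, ["web1", "web2"], seen))
```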

Reusing ssh connections

If you've ever run Ansible in verbose mode (the "v" option repeated one to nine times), you may have noticed that ssh connections are constantly being established and torn down. There are a couple of subtleties here, too.

You can avoid re-establishing the ssh connection at two levels at once: in the ssh client itself, and when transferring files from the control machine to a managed host.
To reuse an open ssh connection, just pass the right options to the ssh client. It will then behave as follows: when the first ssh connection is established, it additionally creates a so-called control socket; on later connections it checks whether that socket exists and, if so, reuses the existing ssh connection. For this to make any sense, we also set how long an idle connection is kept open. More details can be found in the ssh documentation; in the context of Ansible we simply forward the necessary options to the ssh client.

ssh_args = "-o ControlMaster=auto -o ControlPersist=15m"

To reuse an already open ssh connection when transferring files to a managed host, it is enough to set one more little-known option, ssh_transfer_method. The documentation on it is extremely terse and misleading, yet the option works perfectly well! Reading the source code shows what actually happens: the dd command is launched on the managed host and works directly with the target file.

transfer_method = piped
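The mechanism is easy to try locally. The sketch below streams bytes into "dd of=..." through stdin, which is the same shape of command that would otherwise run on the managed host over the existing ssh connection. The function name, the remote_prefix parameter and the block size are assumptions for illustration; it needs a system with dd available.

```python
import os
import subprocess
import tempfile
from typing import Sequence


def piped_put(data: bytes, dest: str, remote_prefix: Sequence[str] = ()) -> None:
    """Write data to dest by piping it into dd's stdin.

    remote_prefix is hypothetical: e.g. ("ssh", "somehost") would run dd
    remotely; a real remote shell would also need dest to be quoted.
    """
    cmd = [*remote_prefix, "dd", f"of={dest}", "bs=65536"]
    # dd reports transfer statistics on stderr; silence them here.
    subprocess.run(cmd, input=data, stderr=subprocess.DEVNULL, check=True)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "copied.txt")
    piped_put(b"hello piped world\n", target)
    with open(target, "rb") as fh:
        print(fh.read())
```

No separate scp or sftp session is started: the file contents travel over the same channel as the command itself.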

By the way, this setting still exists in the "devel" branch as well and has not gone anywhere.

Don't be afraid of the knife, be afraid of the fork

Another useful setting is forks. It determines the number of worker processes that simultaneously connect to hosts and run tasks. Because of Python's peculiarities, the workers are processes rather than threads, and since Ansible still supports Python 2.7, there is no asyncio for you and no asynchronous machinery here! By default Ansible runs five workers, but if you ask nicely, it will launch more:

forks = 20

A word of warning: you may run into limits on the memory available on the control machine. In other words, you can of course set forks=100500, but who said it would work?
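For a rough feel of what forks buys you, here is a back-of-the-envelope estimate. It assumes a deliberately simplified model in which hosts are processed in full waves of "forks" hosts and every host takes the same time per task; real scheduling, network latency and memory pressure will change the numbers.

```python
import math


def estimated_task_seconds(hosts: int, forks: int, seconds_per_host: float) -> float:
    """Estimate wall-clock time for one task under the simplified wave model."""
    if hosts < 0 or forks < 1:
        raise ValueError("hosts must be >= 0 and forks must be >= 1")
    waves = math.ceil(hosts / forks)  # how many batches of workers are needed
    return waves * seconds_per_host


if __name__ == "__main__":
    # 100 hosts at 2 seconds per host, default forks vs. forks = 20
    print(estimated_task_seconds(100, 5, 2.0))   # 20 waves
    print(estimated_task_seconds(100, 20, 2.0))  # 5 waves
```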

Putting it all together

As a result, the necessary settings in ansible.cfg (INI format) may look like this:

[defaults]
gathering = smart|explicit
forks = 20
[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=15m
transfer_method = piped

And if you would rather keep everything in a sane, ordinary YAML inventory, it might look something like this:

---
all:
  vars:
    ansible_ssh_pipelining: true
    ansible_ssh_transfer_method: piped
    ansible_ssh_args: -o ControlMaster=auto -o ControlPersist=15m

Unfortunately, this will not work for "gathering = smart/explicit" and "forks = 20": they have no inventory-variable equivalents. Either set them in ansible.cfg or pass them through the ANSIBLE_GATHERING and ANSIBLE_FORKS environment variables.
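If you launch Ansible from a wrapper script, the environment-variable route might look like the sketch below. The helper names and the playbook path passed to run_playbook are hypothetical; only the two variable names come from the text above.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional


def ansible_env(
    gathering: str = "smart",
    forks: int = 20,
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with the two overrides applied."""
    env = dict(os.environ if base is None else base)
    env["ANSIBLE_GATHERING"] = gathering
    env["ANSIBLE_FORKS"] = str(forks)  # environment values must be strings
    return env


def run_playbook(playbook: str) -> int:
    """Run ansible-playbook with the overrides; requires Ansible on PATH."""
    cmd = ["ansible-playbook", playbook]
    return subprocess.run(cmd, env=ansible_env()).returncode
```

Building the environment in a separate function keeps the overrides easy to inspect without actually starting a playbook run.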

About Mitogen
So where is the part about Mitogen? A fair question, dear reader. Nowhere in this article. But if you are really ready to read its code and figure out why your playbook crashes with Mitogen yet works fine with vanilla Ansible, or why the same playbook worked properly until now but started doing strange things after an update, then Mitogen could well be your tool. Use it, dig into it, write articles about it, and I will read them with interest.

Why don't I use Mitogen myself? Because it works only as long as the tasks are really simple and everything goes well. Step a little to the left or right, though, and that's it: a handful of cryptic exceptions fly at you, and the only thing missing to complete the picture is the stock phrase "thanks everyone, you're all dismissed". In short, I just don't want to waste time figuring out the cause of yet another mysterious failure.

Some of these settings were discovered while reading the source code of the connection plugin with the telling name "ssh.py". I am sharing the results in the hope that they will inspire someone else to look at the sources, read them, check the implementation and compare it with the documentation. Sooner or later all of this will pay off. Good luck!

Which of the following Ansible settings do you use to speed up your projects?

  • 69.6% (32 votes): pipelining=true
  • 34.8% (16 votes): gathering = smart/explicit
  • 52.2% (24 votes): ssh_args = "-o ControlMaster=auto -o ControlPersist=…"
  • 17.4% (8 votes): transfer_method = piped
  • 63.0% (29 votes): forks=XXX
  • 6.5% (3 votes): None of this, just Mitogen
  • 8.7% (4 votes): Mitogen, plus the settings marked above

46 users voted. 21 users abstained.

Want more about Ansible?

  • 78.3% (54 votes): yes, of course
  • 21.7% (15 votes): yeah, just want more hardcore stuff!
  • 0.0% (0 votes): no, and don't need it for free
  • 0.0% (0 votes): no, it's complicatedaaaaaa!!!

69 users voted. 7 users abstained.

Source: habr.com
