Infrastructure as code: first acquaintance

Our company is in the process of onboarding an SRE team. I came into this whole story from the development side. Along the way I had some thoughts and insights that I want to share with other developers. In this reflective article I talk about what is happening, how it is happening, and how we can all live with it.

This continues a series of articles based on talks given at our internal event, DevForum:

1. Schrödinger's cat without a box: the problem of consensus in distributed systems.
2. Infrastructure as code. (you are here)
3. Generation of TypeScript contracts for C# models. (In progress...)
4. Introduction to the Raft consensus algorithm. (In progress...)
...

We decided to create an SRE team that would put the ideas of Google SRE into practice. We recruited programmers from among our own developers and sent them off to train for several months.

The team faced the following training tasks:

  • Describe our infrastructure, most of which is in Microsoft Azure, in the form of code (Terraform and everything around it).
  • Teach developers how to work with the infrastructure.
  • Prepare developers for on-call duty.

Introducing the concept of Infrastructure as code

In the usual model of the world (classical administration), infrastructure knowledge resides in one of two places:

  1. In the heads of experts.
  2. On certain machines, only some of which the experts even know about. There is no guarantee that an outsider (in case our entire team suddenly dies) will be able to figure out what works and how. A machine can hold a lot of information: access settings, cron jobs, a mounted disk, and an endless list of other things. It is hard to understand what is really going on.

In both cases we end up trapped in a dependency:

  • either on a person, who is mortal, prone to illness, falling in love, mood swings, and plain old resignations;
  • or on a physical machine, which also crashes, gets stolen, and brings surprises and inconveniences of its own.

It goes without saying that, ideally, everything should be translated into human-readable, maintainable, well-written code.

Thus, infrastructure as code (Infrastructure as Code, IaC) is a description of the entire existing infrastructure in the form of code, together with the tools for working with that code and for building real infrastructure from it.
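
To make the definition above a little more concrete, here is a minimal sketch in Python of the idea behind it: the description of the infrastructure is plain data, and a tool compares it with what actually exists to decide what to do. The resource names, fields, and the `plan` function are all hypothetical and invented for illustration; real tools such as Terraform are far more involved.

```python
def plan(desired, actual):
    """Compare described infrastructure with what exists.

    Both arguments map a resource name to a dict of settings.
    Returns a sorted list of (action, name) pairs. Nothing real is changed.
    """
    steps = []
    for name, settings in desired.items():
        if name not in actual:
            steps.append(("create", name))
        elif actual[name] != settings:
            steps.append(("update", name))
    for name in actual:
        if name not in desired:
            steps.append(("delete", name))
    return sorted(steps)


# The "code": what we want to exist (hypothetical resources).
desired = {
    "web-vm": {"kind": "vm", "size": "small"},
    "logs": {"kind": "storage", "tier": "standard"},
}

# What exists right now, including a machine nobody remembers creating.
actual = {
    "web-vm": {"kind": "vm", "size": "large"},
    "old-db": {"kind": "database", "engine": "postgres"},
}

for action, name in plan(desired, actual):
    print(action, name)
```

The point of the sketch is the single source of truth: the forgotten `old-db` shows up as a difference instead of living only in someone's memory.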

Why translate everything into code

People are not machines. They cannot remember everything. Human and machine reactions are different. Anything automated is potentially faster than anything a human can do. The most important thing is a single source of truth.

Where do new SREs come from?

So, we decided to bring in new SREs, but where do we get them from? The book with the correct answers (the Google SRE Book) tells us: from developers. After all, they work with code, and that is how you reach the ideal state.

We spent a long time looking for them on the job market outside our company. But we have to admit that we did not find a single person who met our requirements. We had to search among our own people.

Problems with Infrastructure as code

Now let's look at examples of how infrastructure can be expressed as code. The code is well written, of high quality, with comments and indentation.

Sample code from Terraform.

[Image: sample Terraform code]

Sample code from Ansible.

[Image: sample Ansible code]

Gentlemen, if only it were that easy! You and I live in the real world, and it is always ready to surprise us with problems. There is no getting away from them here either.

1. The first problem is that in most cases IaC is some kind of DSL.

A DSL, in turn, is a description of structure. More precisely, of what you should end up with: JSON, YAML, or a variant from some large company that came up with its own DSL (Terraform uses HCL).

The trouble is that it can easily lack such familiar things as:

  • variables;
  • conditions;
  • comments, which in some formats (JSON, for example) are not provided by default;
  • functions;
  • not to mention high-level things such as classes, inheritance, and all that.
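
One common way around the gaps listed above is to generate the structural format from a general-purpose language. The sketch below does this in Python for plain JSON. It is only an illustration of the idea, not a description of our setup; the schema (`storage_accounts`, `replication`) and all the names are made up.

```python
import json

# Variables: one place to change values that would otherwise be copy-pasted.
ENVIRONMENTS = ["dev", "prod"]
REGION = "westeurope"


def storage_account(env):
    """Function: builds one config entry (hypothetical schema)."""
    return {
        "name": f"logs-{env}",
        "region": REGION,
        # Condition: only production gets geo-redundant replication.
        "replication": "geo" if env == "prod" else "local",
    }


def render(environments):
    """Return the JSON text that a JSON-only tool would consume."""
    config = {"storage_accounts": [storage_account(e) for e in environments]}
    return json.dumps(config, indent=2, sort_keys=True)


print(render(ENVIRONMENTS))
```

The comments, the variable, the condition, and the function all live in the generator; the JSON that comes out has none of them. The price is one more language in the pipeline, which leads straight into the next problem.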

2. The second problem with such code is that the environment is usually heterogeneous. Normally you sit and work with C#, that is, with one language, one stack, one ecosystem. Here you have a huge variety of technologies.

It is quite realistic for a bash script together with Python to launch some process that is fed JSON. You parse it, then some other generator produces another 30 files. The input variables for all of this come from Azure Key Vault, pulled in by a drone.io plugin written in Go, and those variables pass through YAML that was itself generated by the jsonnet template engine. It is quite difficult to have strictly well-described code when the environment is this diverse.

Traditional development for a single task comes with a single language. Here we work with a large number of languages.

3. The third problem is tooling. We are used to great editors (Microsoft Visual Studio, JetBrains Rider) that do everything for us. Even if we do something stupid, they tell us we are wrong. This seems normal and natural.

Somewhere nearby, though, there is VS Code, with plugins that get installed somehow and may or may not be supported. New versions come out, and the plugins do not support them. A trivial jump to a function's implementation (even if it exists) becomes a complex, non-trivial problem. A simple rename of a variable turns into a find-and-replace across a project of a dozen files, and you are lucky if it replaces only what it should. There is syntax highlighting in places, there is auto-completion, and somewhere there is formatting (although for Terraform on Windows it never worked for me).

At the time of writing, the vscode-terraform plugin has not yet been released with support for Terraform 0.12, although that version has been out for about 3 months.

It's time to forget about...

  1. Debugging.
  2. Refactoring tools.
  3. Autocompletion.
  4. Detection of compilation errors.

Funny as it sounds, this also increases development time and the number of errors that inevitably occur.

Worst of all, we are forced to think not about how to design, arrange files into folders, decompose, and make the code maintainable and readable, but about how to write this one command correctly, because I somehow got it wrong.

As a beginner, you are trying to learn Terraform, and the IDE does not help you at all. When there is documentation, you go and look it up. If you were picking up a new programming language, the IDE would at least tell you that one type exists and another does not, even if only at the level of int or string. That is often helpful.

But what about the tests?

You ask: “What about the tests, gentlemen programmers?” Serious guys test everything on prod, and that is tough. Here is an example of a unit test for a Terraform module from the Microsoft site.

[Image: unit test for a Terraform module, from the Microsoft site]

They have good documentation. I have always liked Microsoft for its approach to documentation and training. But you don't have to be Uncle Bob to see that this code is not perfect. Notice how the validation drifts off to the right.

The problem with such a unit test is that all you and I can check is the correctness of the JSON in the output. I pass in 5 parameters and get back a 2000-line sheet of JSON. I can analyze what is going on there, validate the test result...

Parsing JSON in Go is hard. And you have to write in Go, because Terraform is written in Go and it is good practice to test in the language you write in. The code organization itself is very weak. At the same time, this is the best library there is for testing.
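
To show what “checking the JSON in the output” amounts to, here is a rough sketch. It is written in Python rather than Go purely to keep it short, and it is not the Microsoft example. The sample document is hand-written and heavily trimmed; its field names (`resource_changes`, `address`, `change`, `actions`, `after`) follow my understanding of the JSON form of a Terraform plan and should be treated as an assumption. The `find_violations` helper and the location rule are invented for illustration.

```python
import json

# Hand-written, trimmed sample shaped like a JSON plan document (assumption).
PLAN_JSON = """
{
  "resource_changes": [
    {"address": "azurerm_resource_group.main",
     "type": "azurerm_resource_group",
     "change": {"actions": ["create"],
                "after": {"name": "rg-demo", "location": "westeurope"}}},
    {"address": "azurerm_storage_account.logs",
     "type": "azurerm_storage_account",
     "change": {"actions": ["create"],
                "after": {"name": "logsdemo", "location": "northeurope"}}}
  ]
}
"""


def find_violations(plan, allowed_location):
    """Return addresses of planned resources outside the allowed location."""
    violations = []
    for rc in plan.get("resource_changes", []):
        change = rc["change"]
        if change["actions"] == ["delete"]:
            continue  # a resource being removed has nothing to validate
        after = change.get("after") or {}
        if after.get("location") != allowed_location:
            violations.append(rc["address"])
    return violations


plan = json.loads(PLAN_JSON)
print(find_violations(plan, "westeurope"))
```

Even in this toy form the test only looks at a description of what would be created. It says nothing about whether the resulting infrastructure actually works.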

Microsoft itself writes its modules and tests them this way. Of course, it is all open source. Everything I am complaining about, you can go and fix. I could sit down and fix everything in a week: the open source plugins for VS Code, Terraform itself, a plugin for Rider. Maybe write a couple of analyzers, add linters, contribute a testing library. I could do all of it. But that is not what I should be doing.

Best practices for Infrastructure as code

Let's go further. If IaC has no tests and poor IDE and tooling support, then there should at least be best practices. I simply went to Google Analytics and compared two search queries: Terraform best practices and C# best practices.

[Image: comparison of the search queries “Terraform best practices” and “C# best practices”]

What do we see? The merciless statistics are not in our favor. The same goes for the amount of material. In C# development we are swimming in material: we have ultra-best practices, there are books written by experts, and books written about those books by other experts who critique them. There is a sea of official documentation, articles, training courses, and now open source development as well.

As for IaC, you try to piece information together bit by bit from talks at HighLoad or HashiConf, from the official documentation, and from numerous GitHub issues. How should modules be split up, and what do you do with them? It seems to be a real problem... There does appear to be a community, gentlemen, where any question will get you 10 comments on GitHub. But that is not certain.

Unfortunately, experts are only just beginning to emerge, and so far there are too few of them. The community itself is still in its infancy.

Where is this going and what to do

You could drop everything and go back to C#, to the world of Rider. But no. Why take this on at all if not to find a solution? Below are my subjective conclusions. You can argue with me in the comments; it will be interesting.

Personally, I bet on a few things:

  1. This area is developing very quickly. Here is a graph of search queries for DevOps.

    [Image: graph of search queries for DevOps]

    It may be a hype topic, but the very fact that the field is growing gives some hope.

    If something grows this fast, smart people will definitely appear who tell you what to do and what not to do. Growing popularity means that maybe someone will finally find the time to add a jsonnet plugin for VS Code that lets you jump to a function's implementation instead of hunting for it with Ctrl+Shift+F. As everything develops, more material appears. The release of Google's book on SRE is an excellent example.

  2. Conventional development has well-established techniques and practices that we can successfully apply here. Yes, there are nuances with testing, the heterogeneous environment, and insufficient tooling, but a huge body of practice has accumulated that can be useful.

    A trivial example: collaboration through pair programming. It helps a lot in figuring things out. When a neighbor next to you is also trying to understand something, together you understand it better.

    Knowing how refactoring is done helps you carry it out even in this situation. You do not change everything at once: first you change the naming, then the file layout, then you extract some part, and then you notice there are not enough comments here.

Conclusion

Although my reasoning may seem pessimistic, I look to the future with hope and sincerely believe that we (and you) will succeed.

The second part of the article comes next. In it I will talk about how we tried to apply agile development practices to improve our learning process and our work with infrastructure.

Source: habr.com
