DevOps and Chaos: software delivery in a decentralized world

Anton Weiss, founder and director of Otomato Software and one of the initiators and instructors of Israel's first DevOps certification, spoke at last year's DevOpsDays Moscow about chaos theory and the main principles of chaos engineering, and explained how the ideal DevOps organization of the future works.

We have prepared a text version of the talk.



Good morning!

DevOpsDays is in Moscow for the second year in a row, this is my second time on this stage, and many of you are in this hall for the second time. What does that mean? It means that the DevOps movement in Russia is growing and multiplying, and most importantly, it means that it is time to talk about what DevOps is in 2018.

Hands up if you think DevOps is a profession in 2018. There are some. Are there any DevOps engineers in the room, people whose job description says "DevOps engineer"? Are there DevOps managers in the room? None. DevOps architects? None either. Not many. What, really no one calls themselves a DevOps engineer?

So most of you think this is an antipattern? That such a profession should not exist? We can think whatever we want, but while we are thinking, the industry marches solemnly forward to the sound of the DevOps trumpet.

Has anyone heard of a new topic called DevDevOps? It is a new technique for effective collaboration between developers and DevOps. Actually, it is not so new: judging by Twitter, people started talking about it four years ago. Interest in it keeps growing, which means there is a problem. And the problem must be solved.

We are creative people, and we do not settle down easily. We say: DevOps is not a comprehensive enough word; it still lacks all sorts of interesting elements. So we go to our secret laboratories and start producing curious mutations: DevTestOps, GitOps, DevSecOps, BizDevOps, ProdOps.

Ironclad logic, right? Our delivery system does not work, our systems are unstable, our users are dissatisfied, we do not ship software on time, and we do not fit within the budget. How are we going to solve all this? We will come up with a new word! It will end with "Ops", and the problem is solved.

So I call this approach "Ops, and the problem is solved."

All of that fades into the background if we remind ourselves why we came up with this in the first place. We came up with DevOps to make software delivery, and our own work in that process, as smooth, painless, efficient, and most importantly, enjoyable as possible.

DevOps grew out of pain. And we are tired of suffering. To make all this happen, we rely on evergreen practices: effective collaboration, flow practices, and most importantly, systems thinking, because no DevOps works without it.

What is a system?

And since we are talking about systems thinking, let's remind ourselves what a system is.

If you are a revolutionary hacker, then for you the system is plainly evil. It is a cloud that hangs over you and makes you do what you do not want to do.

From the point of view of systems thinking, a system is a whole that consists of parts. In this sense, each of us is a system. The organizations we work for are systems. And what we build is called exactly that: a system.

All of this is part of one large socio-technical system. Only if we understand how this socio-technical system works as a whole can we truly optimize anything in it.

From the point of view of systems thinking, the system has various interesting properties. First, it is composed of parts, which means that its behavior depends on the behavior of the parts. Moreover, all its parts are also interdependent. It turns out that the more parts a system has, the more difficult it is to understand or predict its behavior.
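One rough way to put a number on this is to count how many pairs of parts could interact. The sketch below assumes that every part may talk to every other part, which is an upper bound rather than a description of any real system; the function name is made up for illustration.

```python
from math import comb


def potential_interactions(parts: int) -> int:
    """Number of distinct pairs among `parts` components: n * (n - 1) / 2."""
    return comb(parts, 2)


# The number of parts grows linearly, the number of possible pairs quadratically.
for n in (2, 5, 10, 50, 100):
    print(f"{n:>3} parts -> {potential_interactions(n):>4} potential pairwise interactions")
```

Going from 10 parts to 100 multiplies the number of parts by ten, but the number of potential pairwise interactions by more than a hundred.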

In terms of behavior, there is another interesting fact. A system can do something that none of its individual parts can do.

As Dr. Russell Ackoff (one of the founders of systems thinking) said, this is easy enough to prove with a thought experiment. For example, who in the room can write code? A lot of hands, and that is normal, because it is one of the main requirements of our profession. You know how to write code, but can your hands write code separately from you? Some people will say: "My hands do not write code, my brain writes code." And can the brain write code separately from you? Most likely not.

The brain is an amazing machine, and we do not understand even 10% of how it works, but it cannot function separately from the system that is our body. This is easy to prove: open your skull, take out your brain, put it in front of a computer, and let it try to write something simple. "Hello, world" in Python, for example.

If a system can do something that none of its parts can do individually, then its behavior is not determined by the behavior of its parts alone. What defines it, then? It is determined by the interaction between these parts. Accordingly, the more parts there are, the more complex the interactions, and the more difficult it is to understand and predict the behavior of the system. This makes such a system chaotic, because even the most insignificant, invisible change in any part of the system can lead to completely unpredictable results.

This sensitivity to initial conditions was first discovered and investigated by the American meteorologist Edward Lorenz. It later became known as the "butterfly effect" and led to the development of a school of scientific thought called "chaos theory". This theory became one of the major paradigm shifts in 20th-century science.
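Sensitivity to initial conditions is easy to observe numerically. The sketch below does not use Lorenz's weather model; it swaps in the logistic map, a much simpler textbook chaotic system, and the starting value, the perturbation of 1e-10, and the step count are arbitrary choices for illustration.

```python
def logistic_trajectory(x0: float, steps: int, r: float = 4.0) -> list[float]:
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two trajectories whose starting points differ by one part in ten billion.
a = logistic_trajectory(0.2, 100)
b = logistic_trajectory(0.2 + 1e-10, 100)
gaps = [abs(x - y) for x, y in zip(a, b)]

for step in range(0, 101, 20):
    print(f"step {step:>3}: gap = {gaps[step]:.3e}")
```

The gap starts out far too small to notice and, within a few dozen iterations, becomes as large as the values themselves, after which the two trajectories have nothing in common.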

Chaos theory

People who study chaos call themselves chaosologists.

Actually, the reason for this talk was that, working with complex distributed systems and large international organizations, at some point I realized that this is who I feel I am. I am a chaosologist. Which is, in general, a clever way of saying: "I don't understand what is going on here and I don't know what to do about it."

I think many of you often feel that way too, so you are chaosologists as well. I invite you to join the guild of chaosologists. The systems that we, dear fellow chaosologists, will study are called "complex adaptive systems".

What is adaptability? Adaptability means that the individual and collective behavior of parts in such an adaptive system changes and self-organizes in response to events or chains of micro-events in the system. That is, the system adapts to changes through self-organization. And this ability to self-organize is based on the voluntary, completely decentralized cooperation of free autonomous agents.

Another interesting property of such systems is that they are freely scalable, which should undoubtedly interest us as chaosologist-engineers. So, if the behavior of a complex system is determined by the interaction of its parts, what should we be interested in? The interaction.

There are two more interesting findings.

First, we understand that a complex system cannot be simplified by simplifying its parts. Second, the only way to simplify a complex system is by simplifying the interactions between its parts.

How do we interact? We are all part of a large information system called human society. We interact through a common language, if we have one, if we can find one.

But language itself is a complex adaptive system. Accordingly, in order to interact more efficiently and simply, we need to create protocols of some kind: sequences of symbols and actions that make the exchange of information between us simpler, more predictable, and more understandable.

I want to say that the trends toward complexity, adaptability, decentralization, and randomness can be traced everywhere: both in the systems we build and in the systems of which we are a part.

And so as not to make unfounded claims, let's look at how the systems we create are changing.

You were waiting for this word, I understand. We are at a DevOps conference; today this word will be said about a hundred thousand times, and then we will dream about it at night.

Microservices are the first software architecture that emerged as a response to DevOps practices. It is designed to make our systems more flexible and more scalable, and to enable continuous delivery. How does it do that? By reducing the size of services, narrowing the scope of the problems these services handle, and shortening delivery times. That is, we shrink and simplify the parts of the system and increase their number; accordingly, the complexity of the interactions between these parts inevitably grows, and new problems arise that we have to solve.

Microservices are not the end of it; microservices are, in general, already yesterday's news, because Serverless is coming. All the servers have burned down: no servers, no operating systems, just pure executable code. Configuration is separate, state is separate, everything is driven by events. Beauty, purity, silence; no events, nothing happens, complete order.

Where is the complexity? It lies, of course, in the interactions. How much can one function do on its own? How does it interact with other functions? Message queues, databases, load balancers. How do we replay an event when a failure has occurred? Lots of questions and few answers.

Microservices and Serverless are what we computer hipsters call Cloud Native. It is all about the cloud. But the cloud is also inherently limited in scalability. We are used to thinking of it as a distributed system. But where do the cloud providers' servers actually live? In data centers. So what we have here is a rather centralized, very limited distributed model.

Today we understand that the Internet of Things is no longer just big words: even by modest predictions, in the next five to ten years we will have billions of devices connected to the Internet. A huge amount of useful and useless data will be poured into the cloud and poured back out of it.

The cloud will not cope, so we are talking more and more about what is called "edge computing". I also like the wonderful term "fog computing". It is shrouded in mysticism, romance, and mystery.

Fog computing. The idea is that clouds are centralized clumps of water, steam, ice, and stones, while fog is water droplets scattered around us in the atmosphere.

In the fog paradigm, most of the work is done by these droplets completely autonomously or in collaboration with other droplets. They turn to the cloud only when things get really, really hard.

That is, again, decentralization and autonomy. And, of course, many of you already understand where all this is leading, because you cannot talk about decentralization without mentioning the blockchain.

There are those who believe: those who have invested in cryptocurrency. There are those who believe but are afraid, like me, for example. And there are those who do not believe. You can take different views here. There is a technology, a new and poorly understood thing, and there are problems. Like any new technology, it raises more questions than it answers.

The hype around blockchain is understandable. Gold rush aside, the technology itself holds wonderful promises for a brighter future: more freedom, more autonomy, distributed global trust. What is there not to want?

Accordingly, more and more engineers around the world are starting to develop decentralized applications. And this is a force that cannot be brushed aside by simply saying, "Ahh, the blockchain is just a poorly implemented distributed database." Or, as skeptics like to say, "There are no real applications for blockchain." If you think about it, 150 years ago they said the same thing about electricity. And in some ways they were even right, because what electricity makes possible today was completely unrealistic in the 19th century.

By the way, who knows what logo this is on the screen? This is Hyperledger, a project developed under the auspices of The Linux Foundation that includes a set of blockchain technologies. This is the real strength of our open source community.

Chaos engineering

So, the systems we develop are becoming more and more complex, more and more chaotic, more and more adaptive. Netflix are pioneers of microservice systems. They were among the first to realize this, and they developed a set of tools they called the Simian Army, the most famous of which is Chaos Monkey. They also defined what became known as the "principles of chaos engineering".

By the way, while working on this talk, we even translated that text into Russian, so follow the link, read, comment, scold.

Briefly, the principles of chaos engineering say the following. Complex distributed systems are inherently unpredictable and inherently buggy. Failures are inevitable, which means that we need to accept these failures and work with these systems in a completely different way.

We ourselves must try to inject these failures into our production systems in order to test them for this very adaptability, this very ability to self-organize and survive.

And that changes everything. Not only how we launch systems into production, but also how we develop them and how we test them. There is no stabilization process, no code freeze; on the contrary, there is a constant process of destabilization. We try to kill the system and see that it keeps surviving.
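As a minimal sketch of that loop, assuming a toy in-memory service rather than a real production system: state a steady-state hypothesis (a request still succeeds), inject a failure (kill a random replica), check the hypothesis, and let the system heal. The `Service` class and `run_experiment` function are hypothetical and are not part of Chaos Monkey or any other tool.

```python
import random


class Service:
    """Toy stand-in for a replicated service; not a real orchestrator."""

    def __init__(self, replicas: int) -> None:
        self.healthy = [True] * replicas

    def kill_random_replica(self, rng: random.Random) -> int:
        """Failure injection: take down one randomly chosen healthy replica."""
        candidates = [i for i, ok in enumerate(self.healthy) if ok]
        victim = rng.choice(candidates)
        self.healthy[victim] = False
        return victim

    def heal(self) -> None:
        """Stand-in for self-organization: restart every failed replica."""
        self.healthy = [True] * len(self.healthy)

    def handle_request(self) -> bool:
        """A request succeeds if at least one replica is healthy."""
        return any(self.healthy)


def run_experiment(service: Service, rounds: int, seed: int = 0) -> bool:
    """Return True if the steady-state hypothesis held in every round."""
    rng = random.Random(seed)
    for _ in range(rounds):
        service.kill_random_replica(rng)
        if not service.handle_request():
            return False  # hypothesis violated: users would see an outage
        service.heal()
    return True


print("3 replicas survive:", run_experiment(Service(3), rounds=100))
print("1 replica survives:", run_experiment(Service(1), rounds=100))
```

The single-replica service fails the experiment on the first round, which is exactly the kind of weakness one would rather discover deliberately than during an incident.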

Distributed System Integration Protocols

Accordingly, this requires our systems to change as well. To become more resilient, they need new protocols for the interaction between their parts, so that these parts can negotiate and arrive at some kind of self-organization. And all sorts of new tools and protocols are appearing, which I call "protocols for the interaction of distributed systems".

What am I talking about? First, the OpenTracing project: an attempt to create a common distributed tracing protocol, which is an absolutely indispensable tool for debugging complex distributed systems.
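The core idea behind distributed tracing can be sketched in a few lines: every unit of work records a span, and all spans caused by the same request share one trace ID and point to their parent. The code below is a standard-library illustration only and does not use the OpenTracing API; `Span`, `start_span`, and the three "services" are hypothetical names.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str             # shared by every span of one request
    span_id: str              # unique per unit of work
    parent_id: Optional[str]  # None for the root span
    operation: str


COLLECTED: list = []  # stand-in for a tracing backend


def start_span(operation: str, parent: Optional[Span] = None) -> Span:
    """Create a span that inherits the trace ID from its parent, if any."""
    span = Span(
        trace_id=parent.trace_id if parent else uuid.uuid4().hex,
        span_id=uuid.uuid4().hex[:16],
        parent_id=parent.span_id if parent else None,
        operation=operation,
    )
    COLLECTED.append(span)
    return span


def charge_card(parent: Span) -> None:
    start_span("charge_card", parent)    # imagine a call to a payment service


def reserve_stock(parent: Span) -> None:
    start_span("reserve_stock", parent)  # imagine a call to a warehouse service


def checkout() -> Span:
    root = start_span("checkout")
    charge_card(root)
    reserve_stock(root)
    return root


root = checkout()
for span in COLLECTED:
    print(span.trace_id[:8], span.operation, "parent:", span.parent_id)
```

In a real deployment the trace ID would travel between processes in request headers; here it is simply passed as a function argument.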

Next, Open Policy Agent. We say that we cannot predict what will happen to the system, so we need to increase its observability. OpenTracing belongs to a family of tools that give us visibility into our systems. But we need observability in order to determine whether the system behaves the way we expect or not. How do we define expected behavior? By defining a policy for it, a set of rules. The Open Policy Agent project defines such sets of rules for everything from access to resource allocation.
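To make "expected behavior as a set of rules" concrete, here is a minimal sketch in plain Python. It is not how Open Policy Agent is used in practice (its policies are written in its own language, Rego); the rule names, request fields, and the 512 MiB limit are all invented for illustration.

```python
from typing import Any, Callable, Dict, List

Request = Dict[str, Any]
Rule = Callable[[Request], bool]

# Hypothetical rules: one about access, one about resource allocation.
RULES: Dict[str, Rule] = {
    "only admins may delete": lambda r: r["action"] != "delete" or r["role"] == "admin",
    "memory request is at most 512 MiB": lambda r: r.get("memory_mib", 0) <= 512,
}


def violations(request: Request) -> List[str]:
    """Return the names of all rules that the request breaks."""
    return [name for name, rule in RULES.items() if not rule(request)]


ok = {"action": "read", "role": "dev", "memory_mib": 256}
bad = {"action": "delete", "role": "dev", "memory_mib": 1024}

print("ok  ->", violations(ok))
print("bad ->", violations(bad))
```

Keeping the rules as data, separate from the code that enforces them, is the part of the idea this sketch tries to preserve.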

As we said, our systems are increasingly event-driven. Serverless is a great example of event-driven systems. In order for us to be able to transfer events between systems and keep track of them, we need a common language, a common protocol for how we talk about events and how we pass them to each other. This is done by a project called CloudEvents.
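A minimal sketch of what such a common envelope looks like, assuming the JSON format of CloudEvents 1.0, where `specversion`, `id`, `source`, and `type` are the required attributes. It uses only the standard library rather than an official SDK; the event type, the source path, and the helper names are made up.

```python
import json
import uuid
from datetime import datetime, timezone

REQUIRED = ("specversion", "id", "source", "type")


def make_event(event_type: str, source: str, data: dict) -> dict:
    """Wrap `data` in a CloudEvents-style envelope."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        "time": datetime.now(timezone.utc).isoformat(),  # optional attribute
        "datacontenttype": "application/json",           # optional attribute
        "data": data,
    }


def is_well_formed(event: dict) -> bool:
    """Check only that every required attribute is a non-empty string."""
    return all(isinstance(event.get(key), str) and event[key] for key in REQUIRED)


event = make_event("com.example.build.finished", "/ci/pipeline/42", {"status": "ok"})
print(json.dumps(event, indent=2))
print("well-formed:", is_well_formed(event))
```

Because producer and consumer agree on the envelope, a consumer can route on `type` and `source` without knowing anything about the payload.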

The continuous stream of change that washes over our systems, constantly destabilizing them, is a continuous stream of software artifacts. In order to sustain this constant stream of changes, we need a common protocol by which we can talk about what a software artifact is, how it is verified, and which checks it has passed. This is done by a project called Grafeas, a common protocol for software artifact metadata.

And, finally, if we want our systems to be completely independent, adaptive, and self-organizing, we must give them the right to identify themselves. The project called SPIFFE does exactly that. It is also a project under the auspices of the Cloud Native Computing Foundation.
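In SPIFFE, a workload's identity is a URI of the form `spiffe://<trust domain>/<path>`. The sketch below only splits such an ID into its two parts with the standard library; it is not a full validator of the SPIFFE specification, and the example trust domain and path are invented.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id: str) -> tuple:
    """Split a SPIFFE ID into (trust_domain, path); raise ValueError if malformed."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError("scheme must be 'spiffe'")
    if not parts.netloc:
        raise ValueError("trust domain is missing")
    if parts.query or parts.fragment:
        raise ValueError("query and fragment are not allowed")
    return parts.netloc, parts.path


trust_domain, path = parse_spiffe_id("spiffe://example.org/ns/prod/sa/billing")
print("trust domain:", trust_domain)
print("workload path:", path)
```

The trust domain says who vouches for the identity, and the path says which workload within that domain is meant.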

All these projects are young, and all of them are open source. They need our love, our testing, our adoption. They show us where technology is heading.

But DevOps has never been primarily about technology; it has always been first and foremost about collaboration between people. And, accordingly, if we want the systems we develop to change, then we ourselves must change. In fact, we are already changing; we do not have much choice.

There is a wonderful book by the British writer Rachel Botsman, in which she writes about the evolution of trust throughout human history. She says that initially, in primitive societies, trust was local, that is, we trusted only those whom we knew personally.

Then there was a very long period, a dark time, when trust was centralized, when we began to trust people we do not know on the grounds that we belong to the same public or state institution.

And this is what we see in our modern world: trust is becoming more and more distributed and decentralized, and it is based on the freedom of information flows, on the availability of information.

If you think about it, this very accessibility, which makes such trust possible, is what we are implementing. This means that both the way we collaborate and the way we work must change, because old-style centralized, hierarchical IT organizations are ceasing to work. They are starting to die.

DevOps organization basics

The ideal DevOps organization of the future is a decentralized, adaptive system made up of autonomous teams, each of which is made up of autonomous individuals. These teams are scattered around the world, they effectively cooperate with each other through asynchronous communication, using highly transparent information exchange protocols. Very beautiful, right? A very beautiful future.

Of course, none of this is possible without cultural change. We must have transformational leadership, personal responsibility, intrinsic motivation.

This is the foundation of DevOps organizations: information transparency, asynchronous communications, transformational leadership, decentralization.

Burnout

The systems we are a part of and the ones we build are increasingly chaotic, and it is hard for us humans to come to terms with this thought; it is hard to give up the illusion of control. We keep trying to control them, and this often leads to burnout. I say this from my own experience: I have burned out too, and I too have been knocked out of action by unforeseen failures in production.

Burnout occurs when we try to control what is inherently uncontrollable. When we burn out, everything loses its meaning, because we lose the desire to do something new, we get into a defensive position and begin to protect what we have.

Engineering, as I often like to remind myself, is first and foremost a creative profession. If we lose the desire to create something, we turn to ashes. People burn out, entire organizations burn out.

In my opinion, only the acceptance of the creative power of chaos, only the building of cooperation according to its principles, is what will help us not to lose the good that is in our profession.

What I wish you: love your work, love what we do. This world feeds on information, and we have the honor of feeding it. So let's study chaos, let's be chaosologists, let's bring value and create something new. As for problems, as we have already found out, they are inevitable, and when they appear, we will simply say "Ops!", and the problem is solved.

What is there besides Chaos Monkey?

In fact, all these tools are very young. Netflix built tools for themselves. Build your own tools. Read the principles of chaos engineering and follow those principles rather than trying to find other tools that someone else has already built.

Try to understand how your systems break, then start breaking them and see how they hold up. That comes first. You can also look for tools; there are all sorts of projects.

I did not quite follow the part where you said that a system cannot be simplified by simplifying its components, and then immediately moved on to microservices, which do exactly that: simplify the components themselves and complicate the interactions. These seem to be two parts that contradict each other.

That's right, microservices are a very controversial topic in general. In fact, simplifying the parts increases flexibility. What do microservices provide? They give us flexibility and speed, but they certainly do not give us simplicity. They increase complexity.

That is, in the DevOps philosophy, microservices are not such a good thing?

Every good thing has a downside. There is a benefit: it increases flexibility, gives us the ability to make changes faster, but increases the complexity and, accordingly, the fragility of the entire system.

Still, where should the emphasis be: on simplifying the interactions or on simplifying the parts?

The emphasis, of course, is on simplifying the interactions. If we look at it from the point of view of how you and I work together, then first of all we need to pay attention to simplifying our interactions, not to simplifying the work of each of us separately. Because simplifying the work turns us into robots. At McDonald's it works fine when everything is prescribed for you: here you place the burger, here you pour the sauce on it. That does not work at all in our creative work.

Is it true that everything you described lives in a world without competition, where chaos is kind, there are no contradictions within it, and no one wants to eat or kill anyone? How should competition and DevOps coexist?

Well, it depends on what kind of competition we are talking about. About competition in the workplace or competition between companies?

About the competition of services that exist, because services are not several companies. We are creating a new type of information environment, and any environment cannot live without competition. There is competition everywhere.

Take Netflix, which we use as a role model. Why did they come up with this? Because they needed to be competitive. This flexibility and speed of movement is precisely a competitive requirement, and it introduces chaos into our systems. That is, chaos is not something we create consciously because we want it; it is what happens because the world requires it. We just have to adapt. Chaos is simply the result of competition.

Does this mean that chaos is the absence of goals, as it were? Or those goals that we do not want to see? We are in a house and do not understand the goals of others. Competition, in fact, is due to the fact that we have clear goals, and we know where we will end up at each next moment in time. This, from my point of view, is the essence of DevOps.

I see the question the same way. I think we all have the same goal: to survive and to do so with the greatest pleasure. And the competitive goal of any organization is the same. Survival often takes place in a competitive struggle, and there is nothing you can do about it.

This year's DevOpsDays Moscow conference will be held on December 7 at Technopolis. We are accepting talk proposals until November 11. Write to us if you would like to speak.

Registration for participants is open, a ticket costs 7000 rubles. Join now!

Source: habr.com
