Managing Chaos: Putting things in order with the help of a technological map

Hi all! We are automation engineers at Positive Technologies, and we support the development of the company's products: we maintain the entire build pipeline, from a developer's commit of a line of code to the publication of finished products and licenses on the update servers. Informally, we are called DevOps engineers. In this article we want to talk about the technological stages of the software production process, how we see them and how we classify them.

From this article you will learn how hard it is to coordinate multi-product development, what a technological map is and how it helps to streamline and replicate solutions, what the main stages and steps of the development process are, and how areas of responsibility are divided between DevOps and the product teams in our company.

About Chaos and DevOps

Briefly, the concept of DevOps covers development tools and services, as well as the methodologies and best practices for using them. Let us single out the global goal of implementing DevOps ideas in our company: a consistent reduction in the cost of producing and maintaining products, measured quantitatively (man-hours or machine-hours, CPU, RAM, disk and so on). The simplest and most obvious way to reduce the overall cost of development across the whole company is to minimize the cost of performing typical serial tasks at all stages of production. But what are these stages, how do we separate them from the overall process, and what steps do they consist of?

When a company develops one product, everything is more or less clear: there is usually a common roadmap and development scheme. But what do you do when the product line expands and there are more products? At first glance they have similar processes and build pipelines, and the game of “find X differences” in logs and scripts begins. But what if there are already 5+ projects in active development, and you have to support several versions developed over several years? Do we want to reuse as many solutions as possible across product pipelines, or are we ready to pay for a unique development for each one?

How to find a balance between uniqueness and serial solutions?

These questions came up more and more often from 2015 onwards. The number of products grew, and we tried to keep the growth of our automation department (DevOps), which supported the build pipelines of these products, to a minimum. At the same time, we wanted to replicate as many solutions as possible between products. After all, why do the same thing in ten products in ten different ways?

Development Director: “Guys, can we somehow evaluate what DevOps does for products?”

We: “We don't know, we have never asked ourselves that question. What indicators should we look at?”

Development Director: “Who knows! Think…”

As in that famous film: “I'm in a hotel!..” “Uh... Can you show me the way?” On reflection, we concluded that we first had to define the final states of the products; that became our first goal.

So how do you analyze a dozen products with fairly large teams of 10 to 200 people and define measurable metrics for replicating solutions?

1:0 in favor of Chaos, or DevOps pinned to the mat

We started by trying to apply IDEF0 diagrams and various business process diagrams from the BPwin family. The confusion set in after the fifth box of yet another stage of yet another project, and for each project those boxes can stretch out like the tail of a long python, 50+ steps long. It made us sad and ready to howl at the moon: overall, the approach did not fit.

Typical production tasks

Modeling production processes is complex and painstaking work: you need to collect, process and analyze a lot of data from various departments and production chains. You can read more about this in the article “Modeling of production processes in an IT company”.

When we started modeling our production process, we had a specific goal: to show every employee involved in developing our company's products, and every project manager,

  • how products and their components, starting from the commit of a line of code, reach the customer in the form of installers and updates,
  • what resources are provided for each stage of production of products,
  • what services are involved at each stage,
  • how the areas of responsibility for each stage are delimited,
  • what contracts exist at the input and output of each stage.

Our work in the company is divided into several functional areas. The infrastructure area optimizes the use of all the department's hardware resources and automates the deployment of virtual machines and the environments on them. The monitoring area provides 24/7 control of service health; we also offer monitoring as a service to developers. The workflow area gives teams tools to manage development and testing processes, analyze the state of the code and get analytics on projects. Finally, the webdev area handles the publication of releases on the GUS and FLUS update servers and the licensing of products through the LicenseLab service. To support the production pipeline, we set up and maintain many auxiliary services for developers (you can hear about some of them in talks from past meetups: Op!DevOps! 2016 and Op!DevOps! 2017). We also develop internal automation tools, including open source solutions.

Over the past five years we have accumulated a lot of repetitive, routine operations, and developers from other departments mostly come to us with so-called typical tasks: tasks whose solution is fully or partially automated, causes no difficulty for those performing it and does not require significant effort. Together with the leads of these areas, we analyzed such tasks and were able to identify separate categories of work, or production stages. The stages were divided into indivisible steps, and several stages together form a production process chain.

The simplest example of a technological chain is the build, deployment and testing stages of each of our products within the company. The build stage, in turn, consists of many separate typical steps: downloading sources from GitLab, preparing dependencies and third-party libraries, unit testing and static code analysis, running the build script on GitLab CI, publishing artifacts to the repository in Artifactory and generating release notes with our internal ChangelogBuilder tool.

You can read about typical DevOps tasks in our other articles on Habr: “Personal experience: what our Continuous Integration system looks like” and “Automation of development processes: how we implemented DevOps ideas at Positive Technologies”.

Many typical production chains together form the production process. The standard approach to describing processes is to use functional IDEF0 models.

An example of modeling a production CI process

We paid special attention to developing standard projects for the continuous integration system. This allowed us to unify projects around the so-called release build scheme with promotions.

Here's how it works. All projects look alike: they include configurations of builds that go to the snapshot repository in Artifactory, after which the builds are deployed and tested on test benches and then promoted to the release repository. The Artifactory service is the single distribution point for all build artifacts between teams and other services.

If we greatly simplify and generalize our release scheme, then it includes the following steps:

  • cross-platform product build,
  • deployment to test benches,
  • running functional and other tests,
  • promoting tested builds to release repositories in Artifactory,
  • publication of release builds on update servers,
  • delivery of builds and updates to production,
  • launching the installation and updating of the product.

As an example, consider the technological model of this typical release scheme (hereinafter simply the Model) in the form of a functional IDEF0 model. It reflects the main stages of our CI process. IDEF0 models use the so-called ICOM notation (Input-Control-Output-Mechanism) to describe what resources are used at each stage, what rules and requirements govern the work, what the output is, and what mechanisms, services or people carry out a particular stage.
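
To make the ICOM notation more concrete, each stage can be treated as a record with four lists. The sketch below is only an illustration: the class name `IcomStage`, the helper `check_chain` and the sample values for the build and deploy stages are assumptions loosely based on the steps listed in this article, not the contents of our actual model.

```python
from dataclasses import dataclass, field


@dataclass
class IcomStage:
    """One IDEF0 box described in ICOM terms (hypothetical structure)."""
    name: str
    inputs: list[str] = field(default_factory=list)      # what the stage consumes
    controls: list[str] = field(default_factory=list)    # rules and requirements
    outputs: list[str] = field(default_factory=list)     # what the stage produces
    mechanisms: list[str] = field(default_factory=list)  # services or people doing the work


def check_chain(stages: list[IcomStage]) -> list[str]:
    """Return inputs that no earlier stage in the chain produces."""
    produced: set[str] = set()
    external: list[str] = []
    for stage in stages:
        for item in stage.inputs:
            if item not in produced:
                external.append(f"{stage.name}: {item}")
        produced.update(stage.outputs)
    return external


# Illustrative values only.
build = IcomStage(
    name="Build",
    inputs=["source code"],
    controls=["build script"],
    outputs=["snapshot artifact"],
    mechanisms=["GitLab CI", "Artifactory"],
)
deploy = IcomStage(
    name="Deploy",
    inputs=["snapshot artifact", "test bench"],
    controls=["deployment requirements"],
    outputs=["deployed build"],
    mechanisms=["automation engineers"],
)

print(check_chain([build, deploy]))
```

A check like this lists the inputs that must come from outside the chain, which is the same "contract at the input of a stage" question that the diagrams are meant to answer.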

As a rule, processes are easier to decompose and detail in functional models. But as the number of elements grows, the models become harder and harder to understand. And real development also has auxiliary stages: monitoring, product certification, automation of work processes and others. It was because of this scaling problem that we abandoned this form of description.

Birth of hope

In one book we came across old Soviet maps describing technological processes (which, by the way, are still used today at many state-owned enterprises and universities). Wait a minute, we have a technological process too! There are stages, results, metrics, requirements, indicators and so on. Why not try applying technological maps to our product pipelines as well? We had the feeling: “This is it! We have found the right thread, and it's time to pull on it!”

In a simple table, we decided to record products in columns, and the technological stages and steps of the product pipeline in rows. Stages are something big, such as the product build stage. Steps are smaller and more detailed, such as downloading the source code to the build server or compiling the code.

At the intersections of the rows and columns of the map, we record the status of a specific stage for a specific product. The following set of statuses was defined:

  1. No data, or not applicable. Either the demand for the stage in the product still has to be analyzed, or the analysis has already been done but the stage is currently not needed or not economically justified.
  2. Postponed, or not relevant at the moment. The stage is needed in the pipeline, but there are no resources to implement it this year.
  3. Scheduled. The stage is planned for implementation this year.
  4. Implemented. The stage in the pipeline is implemented in the required volume.

We filled in the table project by project. First we classified the stages and steps of one project and recorded their statuses. Then we took the next project, recorded its statuses and added the stages and steps that had been missing in the previous projects. As a result, we got the stages and steps of our entire production pipeline and their statuses in each specific project. The result resembles a competency matrix for the product pipeline. We called this matrix a technological map.
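
The project-by-project filling described above can be sketched in a few lines of Python. This is a minimal illustration, not our internal tooling: the class `TechMap`, its method names and the products `ProductA` and `ProductB` are hypothetical, and only the four statuses come from the list above.

```python
from enum import Enum


class Status(Enum):
    NO_DATA = "no data"
    POSTPONED = "postponed"
    SCHEDULED = "scheduled"
    IMPLEMENTED = "implemented"


class TechMap:
    """Rows are stage or step codes, columns are products."""

    def __init__(self) -> None:
        self.rows: list[str] = []      # codes in order of first appearance
        self.products: list[str] = []
        self.cells: dict[tuple[str, str], Status] = {}

    def add_project(self, product: str, statuses: dict[str, Status]) -> None:
        """Record one project's statuses, adding rows not seen in earlier projects."""
        if product not in self.products:
            self.products.append(product)
        for code, status in statuses.items():
            if code not in self.rows:
                self.rows.append(code)
            self.cells[(code, product)] = status

    def status(self, code: str, product: str) -> Status:
        """Cells that were never filled in default to NO_DATA."""
        return self.cells.get((code, product), Status.NO_DATA)


tech_map = TechMap()
tech_map.add_project("ProductA", {"Build": Status.IMPLEMENTED, "UnitTest": Status.SCHEDULED})
tech_map.add_project("ProductB", {"Build": Status.IMPLEMENTED, "LoadTest": Status.POSTPONED})

print(tech_map.rows)
print(tech_map.status("LoadTest", "ProductA"))
```

The second project adds the `LoadTest` row, and the first project automatically gets "no data" for it, which is exactly the cell that then has to be analyzed.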

With the help of the technological map we agree with the teams, on a measurable basis, the work plans for the year and the targets we want to reach together: which stages we add to the project this year and which ones we leave for later. Also, in the course of work we may come up with an improvement in a stage that we have implemented for only one product. Then we extend the map, add this improvement as a stage or a new step, analyze each product and determine whether replicating the improvement is worthwhile.

One might object: “This is all well and good, but over time the number of steps and stages will become prohibitively large. What then?”

We have introduced standard and fairly complete descriptions of the requirements for each stage and step, so that everyone in the company understands them in the same way. Over time, as improvements are introduced, a step may be absorbed into another stage or step, and the two are then "collapsed" into one. In that case all requirements and technological nuances are folded into the requirements of the generalizing stage or step.

How do we evaluate the effect of replicating solutions? We use an extremely simple approach: we attribute the initial capital cost of implementing a new stage to the annual general product costs, and when the stage is replicated we divide that cost among all the products that use it.
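
As a worked illustration of this arithmetic, here is a minimal sketch that splits a one-off implementation cost evenly across the products that use the stage. The function name `cost_share` and the figure of 120 man-hours are hypothetical; the even split is our reading of "divide by all", and a real estimate may also need to account for per-product adaptation work.

```python
def cost_share(initial_cost: float, products_using_stage: int) -> float:
    """Split the one-off cost of implementing a stage evenly across products."""
    if products_using_stage < 1:
        raise ValueError("at least one product must use the stage")
    return initial_cost / products_using_stage


# Hypothetical figure: 120 man-hours to implement a new stage.
for count in (1, 2, 4):
    print(count, cost_share(120, count))
```

With these numbers, the share attributed to each product falls from 120 man-hours for a single product to 30 when the stage is replicated across four.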

Parts of the development process are already shown on the map as stages and steps. We can reduce the cost of a product by introducing automation for typical stages. After that we look at the changes in qualitative characteristics, quantitative metrics and the gain obtained by the teams (in man-hours or machine-hours saved).

Technological map of the production process

If we take all our stages and steps, encode them with tags and unroll them into a single chain, it turns out very long and hard to follow (that same “python's tail” we mentioned at the beginning of the article):

[Production] — [InfMonitoring] — [SourceCodeControl] — [Prepare] — [PrepareLinuxDocker] — [PrepareWinDocker] — [Build] — [PullSourceCode] — [PrepareDep] — [UnitTest] — [CodeCoverage] — [StaticAnalyze] — [BuildScenario] — [PushToSnapshot] — [ChangelogBuilder] — [Deploy] — [PrepareTestStand] — [PullTestCode] — [PrepareTestEnv] — [PullArtifact] — [DeployArtifact] — [Test] — [BVTTest] — [SmokeTest] — [FuncTest] — [LoadTest] — [IntegrityTest] — [DeliveryTest] — [MonitoringStands] — [TestManagement] — [Promote] — [QualityTag] — [MoveToRelease] — [License] — [Publish] — [PublishGUSFLUS] — [ControlVisibility] — [Install] — [LicenseActivation] — [RequestUpdates] — [PullUpdates] — [InitUpdates] — [PrepareEnv] — [InstallUpdates] — [Telemetry] — [Workflow] — [Communication] — [Certification] — [CISelfSufficiency]

These are the stages of building products [Build], deploying them to test servers [Deploy], testing [Test], promoting builds to release repositories based on the results of testing [Promote], generating and publishing licenses [License], publishing [Publish] on the GUS update server and delivery to FLUS update servers, installation and updating of product components on the customer's infrastructure using Product Configuration Management [Install], as well as collection of telemetry [Telemetry] from installed products.

In addition to them, separate stages can be distinguished: infrastructure state monitoring [InfMonitoring], source code versioning [SourceCodeControl], build environment preparation [Prepare], project management [Workflow], providing teams with communication tools [Communication], product certification [Certification] and ensuring self-sufficiency of CI processes [CISelfSufficiency] (for example, independence of builds from the Internet). We will not even consider dozens of other steps in our processes here, because they are very specific.
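
The flat chain becomes easier to read once the steps are grouped under their stages. The sketch below does this for a short excerpt of the chain, assuming that every step tag belongs to the nearest stage tag before it; that grouping rule, the function name `group_chain` and the choice of excerpt are assumptions made for illustration.

```python
import re

# Stage tags taken from the descriptions above; every other tag is treated as a step.
STAGES = {"Prepare", "Build", "Deploy"}


def group_chain(chain: str, stages: set[str]) -> dict[str, list[str]]:
    """Group step tags under the nearest preceding stage tag."""
    grouped: dict[str, list[str]] = {}
    current = None
    for tag in re.findall(r"\[(\w+)\]", chain):
        if tag in stages:
            current = tag
            grouped[current] = []
        elif current is not None:
            grouped[current].append(tag)
        # Tags that appear before the first stage are skipped.
    return grouped


excerpt = (
    "[Prepare] - [PrepareLinuxDocker] - [PrepareWinDocker] - "
    "[Build] - [PullSourceCode] - [PrepareDep] - [UnitTest]"
)

for stage, steps in group_chain(excerpt, STAGES).items():
    print(stage, steps)
```

The regular expression only looks at the bracketed tags, so the separator between them does not matter.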

The entire production process is much easier to understand and review when it is presented in the form of a technological map: a table in which the individual production stages and decomposed steps of the Model are written in rows, and the columns describe what is done at each stage or step. The main emphasis is on the resources that support each stage and on the delimitation of areas of responsibility.

For us, the map is a kind of classifier. It reflects the large technological parts of product production. Thanks to it, it has become easier for our automation team to interact with developers and jointly plan the implementation of automation stages, as well as to understand what effort and resources (human and hardware) this will require.

Inside our company, the map is generated automatically from a jinja template as a regular HTML file and then uploaded to the GitLab Pages server. A screenshot with an example of a fully generated map can be viewed here.
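
Our real map is produced from a jinja template. To keep the example free of third-party dependencies, the sketch below builds a similar HTML table with Python's standard library only (`html.escape` and string formatting), so it shows the idea rather than our template. The function name `render_map`, the product names and the statuses are hypothetical.

```python
from html import escape


def render_map(products: list[str], rows: dict[str, dict[str, str]]) -> str:
    """Render a stage-by-product status table as an HTML string."""
    head = "".join(f"<th>{escape(p)}</th>" for p in products)
    lines = ["<table>", f"<tr><th>Stage</th>{head}</tr>"]
    for code, statuses in rows.items():
        # Products without a recorded status are shown as "no data".
        cells = "".join(
            f"<td>{escape(statuses.get(p, 'no data'))}</td>" for p in products
        )
        lines.append(f"<tr><td>{escape(code)}</td>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


html_page = render_map(
    ["ProductA", "ProductB"],
    {
        "Build": {"ProductA": "implemented", "ProductB": "scheduled"},
        "LoadTest": {"ProductA": "postponed"},
    },
)
print(html_page)
```

The resulting string can be written to a file and published like any other static page.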

In short, the technological map is a generalized picture of the production process, which reflects clearly classified blocks with typical functionality.

Structure of our technological map

The map consists of several parts:

  1. Title area - here is a general description of the map, basic concepts are introduced, the main resources and results of the production process are defined.
  2. Dashboard - here you can control the display of data for individual products, a summary of the implemented stages and steps in general for all products is provided.
  3. Technological map - a tabular description of the technological process. On the map:
    • all stages, steps and their codes are given;
    • short and complete descriptions of the stages are given;
    • the input resources and services used at each stage are indicated;
    • the results of each stage and a separate step are indicated;
    • the area of responsibility for each stage and step is indicated;
    • the technical resources, such as HDD (SSD), RAM and vCPU, and the man-hours needed to support the work at this stage are defined, both the current actual values (fact) and the planned future values (plan);
    • for each product, it is indicated which technological stages or steps for it have been implemented, planned for implementation, irrelevant or not implemented.
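
One row of such a table can be pictured as a record with the fields listed above. The following sketch is a hypothetical data structure, not our actual schema: the class names, field names, units and all sample values are assumptions chosen to mirror the list, and the helper shows how fact and plan totals could be compared.

```python
from dataclasses import dataclass, field


@dataclass
class Resources:
    """Resources needed to support a stage (units are assumptions)."""
    vcpu: int = 0
    ram_gb: int = 0
    disk_gb: int = 0
    man_hours: float = 0.0


@dataclass
class MapRow:
    code: str                # stage or step code, e.g. "Build"
    description: str
    inputs: list[str]        # input resources and services
    results: list[str]
    responsible: str         # area of responsibility
    fact: Resources          # current actual resources
    plan: Resources          # planned future resources
    product_status: dict[str, str] = field(default_factory=dict)


def total_man_hours(rows: list[MapRow], kind: str) -> float:
    """Sum man-hours over rows; kind is 'fact' or 'plan'."""
    if kind not in ("fact", "plan"):
        raise ValueError("kind must be 'fact' or 'plan'")
    return sum(getattr(row, kind).man_hours for row in rows)


# Illustrative values only.
rows = [
    MapRow("Build", "Build the product", ["source code"], ["snapshot artifact"],
           "DevOps", Resources(man_hours=10.0), Resources(man_hours=8.0),
           {"ProductA": "implemented"}),
    MapRow("Deploy", "Deploy to test benches", ["snapshot artifact"], ["deployed build"],
           "DevOps", Resources(man_hours=5.0), Resources(man_hours=5.0),
           {"ProductA": "scheduled"}),
]

print(total_man_hours(rows, "fact"), total_man_hours(rows, "plan"))
```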

Decision making based on the technological map

After examining the map, it is possible to take some actions - depending on the role of the employee in the company (development manager, product manager, developer or tester):

  • understand what stages are missing in a real product or project, and assess the need for their implementation;
  • delimit the areas of responsibility between several departments if they work on different stages;
  • agree on contracts at the inputs and outputs of the stages;
  • integrate your stage of work into the overall development process;
  • more accurately assess the need for resources that provide each of the stages.
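
The first of these actions, finding stages that a product lacks, is a simple query over the map. Below is a minimal sketch, assuming the map is held as a nested dictionary of stage code to product to status; the function names, product names and statuses are hypothetical.

```python
def missing_stages(tech_map: dict[str, dict[str, str]], product: str) -> list[str]:
    """Stages on the map that are not implemented for the given product."""
    return [
        code
        for code, statuses in tech_map.items()
        if statuses.get(product, "no data") != "implemented"
    ]


def replication_candidates(tech_map: dict[str, dict[str, str]],
                           products: list[str]) -> list[str]:
    """Stages implemented in at least one product but not in all of them."""
    candidates = []
    for code, statuses in tech_map.items():
        done = [p for p in products if statuses.get(p) == "implemented"]
        if 0 < len(done) < len(products):
            candidates.append(code)
    return candidates


# Illustrative values only.
tech_map = {
    "Build": {"ProductA": "implemented", "ProductB": "implemented"},
    "LoadTest": {"ProductA": "implemented", "ProductB": "postponed"},
    "Telemetry": {"ProductA": "scheduled"},
}

print(missing_stages(tech_map, "ProductB"))
print(replication_candidates(tech_map, ["ProductA", "ProductB"]))
```

The second function picks out stages that already work somewhere and could be replicated, which is where the cost of a stage can be shared between products.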

Summarizing all of the above

The technological map is versatile, extensible and easy to maintain. It is much easier to develop and maintain a description of processes in this form than in a strict academic IDEF0 model. In addition, a tabular description is simpler, more familiar and better structured than a functional model.

For the technical implementation of the steps, we have a special internal tool, CrossBuilder, which acts as a layer between CI systems, services and infrastructure. Developers do not need to reinvent the wheel: in our CI system it is enough to run one of the scripts (so-called tasks) of the CrossBuilder tool, which will execute it correctly, taking the specifics of our infrastructure into account.

Results

The article turned out to be quite long, but this is inevitable when describing the modeling of complex processes. To conclude, we would like to briefly restate our main ideas:

  • The goal of implementing DevOps ideas in our company is to consistently reduce the cost of production and maintenance of the company's products in quantitative terms (man-hours or machine hours, vCPU, RAM, Disk).
  • The way to reduce the overall cost of development is to minimize the cost of performing typical serial tasks: stages and steps of the technological process.
  • A typical task is a task whose solution is fully or partially automated, does not cause difficulties for performers and does not require significant labor costs.
  • The production process consists of stages, and the stages are divided into indivisible steps, which are typical tasks of different scale and scope.
  • From disparate typical tasks, we have come to complex technological chains and multi-level models of the production process, which can be described by a functional IDEF0 model or a simpler technological map.
  • The technological map is a tabular representation of the stages and steps of the production process. Most importantly: the map allows you to see the whole process in its entirety, in large pieces with the possibility of detailing them.
  • Based on the technological map, it is possible to assess the need to introduce stages in a particular product, delineate areas of responsibility, agree on contracts at the inputs and outputs of stages, and more accurately assess the need for resources.

In the following articles, we will describe in more detail what technical tools are used to implement certain technological stages on our map.

Source: habr.com