To be honest, Ivan often laughed at the futile efforts of his colleagues from the monitoring department. They worked hard to implement the metrics that the company's management had ordered, and they were so busy with this that they had no time for anything else.
And management never had enough - it kept ordering more and more new metrics, and very quickly stopped using the ones that had already been built.
Lately, everyone had been talking about LeadTime - the time it takes to deliver a business feature. The metric showed a crazy number - 200 days to deliver one task. How everyone groaned, gasped and threw up their hands!
After some time, the noise gradually died down and an order was received from the management to create another metric.
It was completely clear to Ivan that the new metric would just as quietly die in a dark corner.
Indeed, Ivan thought, a bare number tells no one anything. 200 days or 2 days - it makes no difference, because the number alone cannot reveal the cause or show whether the result is good or bad.
This is a typical metrics trap: it seems that a new metric will reveal the essence of things and explain some deep secret. Everyone hopes so, but for some reason nothing happens. That is because the secret should not be sought in the metrics at all!
For Ivan, this was old news. He understood that a metric is only useful when there is an object it helps to influence.
For an online store, the object of influence will be its customers who bring in money, and for DevOps, it will be teams that create and roll out distributions using a pipeline.
One day, sitting in a comfortable chair in the lobby, Ivan decided to think carefully about what DevOps metrics should look like, given that teams are the object of influence.
Purpose of DevOps Metrics
It is clear that everyone wants to reduce the delivery time. 200 days is, of course, no good.
The company has hundreds of teams, and thousands of distributions go through the DevOps pipeline every day. The actual delivery time will be a spread of values, not a single number. Each team will have its own time and its own peculiarities. How can you find anything in this mess?
The answer suggested itself - you need to find the problematic teams and figure out what is happening with them and why it takes so long, and learn from the "good" teams how to do everything quickly. This requires measuring the time teams spend on each of the DevOps stands.
“The purpose of the system will be to select teams by the time they take to pass the stands, i.e. in the end we should get a list of teams with their times, not a single number.
If we find out how much time was spent on the stands in total and how much was lost to downtime between stands, then we can find the teams, call them, understand the reasons in more detail and eliminate them,” Ivan thought.
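Ivan's idea of "a list of teams, not a number" can be sketched in a few lines of Python. This is only an illustration: the `TeamTiming` record, the team names, the hours and the 24-hour limit are all made-up assumptions, not data or rules from the company.

```python
from dataclasses import dataclass

@dataclass
class TeamTiming:
    team: str            # hypothetical team code
    stand_hours: float   # total time spent on the stands
    idle_hours: float    # total downtime between stands

def slowest_teams(timings, limit_hours):
    """Return teams whose total time exceeds limit_hours, slowest first."""
    flagged = [t for t in timings if t.stand_hours + t.idle_hours > limit_hours]
    return sorted(flagged, key=lambda t: t.stand_hours + t.idle_hours, reverse=True)

# Made-up sample data for illustration only.
sample = [
    TeamTiming("team-a", stand_hours=5.0, idle_hours=2.0),
    TeamTiming("team-b", stand_hours=12.0, idle_hours=40.0),
    TeamTiming("team-c", stand_hours=8.0, idle_hours=30.0),
]

# The output is a list of teams to call, not an average.
teams_to_call = [t.team for t in slowest_teams(sample, limit_hours=24)]
print(teams_to_call)
```

Keeping stand time and downtime as separate fields lets the same list show whether a team is slow because of the work itself or because of waiting between stands.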
How to Calculate Delivery Time for DevOps
To calculate it, Ivan had to dig into the DevOps process and its essence.
The company used a limited number of systems, and information could only be obtained from them and nowhere else.
All tasks in the company were registered in Jira. When a task was taken into work, a branch was created for it, and after implementation a commit was pushed to BitBucket and a Pull Request was opened. When the PR (Pull Request) was accepted, a distribution was automatically created and stored in the Nexus repository.
Next, the distribution was rolled out to several stands using Jenkins, to check the rollout itself and to run automated and manual tests.
Ivan wrote down which information could be taken from which system to calculate the time on the stands:
- From Nexus - the time the distribution was created and the name of the folder, which contained the team code.
- From Jenkins - the start time, duration and result of each job, the name of the stand (in the job parameters), the stages (job steps) and the link to the distribution in Nexus.
Ivan decided not to include Jira and BitBucket in the pipeline, because they relate more to the development stage than to rolling the finished distribution across the stands.
Based on the available information, Ivan drew a diagram of the pipeline.
Knowing when distributions are created and how much time each of them spends on each stand, you can easily calculate the total time it takes to go through the entire DevOps pipeline (the full cycle).
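As a rough sketch of that calculation, assume the Nexus creation time and the Jenkins job records for one distribution have already been exported into plain Python values. The record layout, the stand names and the timestamps below are hypothetical, and so are the two definitions used: the stand cycle runs from the first job start to the last job end on a stand, and the full cycle runs from the creation of the distribution to the end of its last job.

```python
from datetime import datetime, timedelta

# Hypothetical data for one distribution.
created_in_nexus = datetime(2020, 3, 2, 9, 0)  # creation time from Nexus

jobs = [  # (stand name, job start, job duration) as reported by Jenkins
    ("dev",  datetime(2020, 3, 2, 10, 0), timedelta(minutes=30)),
    ("dev",  datetime(2020, 3, 2, 12, 0), timedelta(minutes=30)),
    ("test", datetime(2020, 3, 3, 9, 0),  timedelta(hours=2)),
]

def stand_cycles(jobs):
    """Time on each stand: from the first job start to the last job end."""
    bounds = {}
    for stand, start, duration in jobs:
        end = start + duration
        first, last = bounds.get(stand, (start, end))
        bounds[stand] = (min(first, start), max(last, end))
    return {stand: last - first for stand, (first, last) in bounds.items()}

def full_cycle(created, jobs):
    """From creation of the distribution to the end of its last job."""
    last_end = max(start + duration for _, start, duration in jobs)
    return last_end - created

cycles = stand_cycles(jobs)
total = full_cycle(created_in_nexus, jobs)
print(cycles["dev"], cycles["test"], total)
```

Computed per distribution and grouped by the team code from the Nexus folder name, these values would give the per-team times Ivan was after.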
Here are the DevOps metrics that Ivan ended up with:
- Number of distributions created
- The share of distributions that "arrived" at a stand and "passed" it
- Time spent on a stand (stand cycle)
- Full cycle (total time for all stands)
- Job duration
- Downtime between stands
- Downtime between job launches on the same stand
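The two downtime metrics can be illustrated with a short Python sketch. It assumes that the jobs of one distribution run one after another, so a gap is simply the time between the end of one job and the start of the next; the record layout and the sample values are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical Jenkins job records for one distribution:
# (stand name, job start, job duration)
jobs = [
    ("dev",  datetime(2020, 3, 2, 10, 0), timedelta(minutes=30)),
    ("dev",  datetime(2020, 3, 2, 12, 0), timedelta(minutes=30)),
    ("test", datetime(2020, 3, 3, 9, 0),  timedelta(hours=2)),
]

def downtimes(jobs):
    """Sum idle gaps, split into gaps within a stand and gaps between stands."""
    ordered = sorted(jobs, key=lambda job: job[1])  # order by start time
    within = timedelta(0)
    between = timedelta(0)
    for (prev_stand, prev_start, prev_dur), (stand, start, _) in zip(ordered, ordered[1:]):
        # Overlapping jobs would give a negative gap; count them as zero.
        gap = max(start - (prev_start + prev_dur), timedelta(0))
        if stand == prev_stand:
            within += gap
        else:
            between += gap
    return within, between

within_stand, between_stands = downtimes(jobs)
print(within_stand, between_stands)
```

In this made-up example most of the lost time sits between stands rather than inside them, which is exactly the kind of difference a single delivery-time number hides.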
On the one hand, the metrics characterized the DevOps pipeline very well in terms of time; on the other hand, they were very simple to calculate.
Satisfied with a job well done, Ivan made a presentation and went to present it to the management.
He came back gloomy and dejected.
“This is a fiasco, bro,” the ironic colleague smiled…
Source: habr.com