Slurm DevOps. 3rd day. ELK, ChatOps, SRE. And the developer's secret prayer

The third and final day of the first, but not the last, DevOps Slurm has arrived.

We did not expect to be able to repeat DevOps Slurm. But to our surprise, all the speakers agreed to come to Slurm in February, and the feedback showed exactly how to refine the program. We now understand how to make the intensive more coherent and detailed, and some topics more practical. So in February we are going to hold a DevOps Slurm in Moscow. Details will follow closer to December. The announcement will definitely appear on Habr.

On September 6, the third day of Slurm, four speakers took the stage.

Vladimir Guryanov, engineer and team lead at Southbridge, whose talk on the second day of DevOps Slurm was very popular with the participants. Vladimir is an active supporter of the DevOps approach and tries to apply it everywhere in his work.

Pavel Selivanov, a recognized Slurm star and the mastermind of the first Kubernetes Slurm. Students wrote of him that "it would be great if he led the entire program." Pavel is a Certified Kubernetes Administrator with extensive hands-on experience implementing Kubernetes: more than 25 projects, both in a team and individually.

Eduard Medvedev, CTO at Tungsten Labs, developed and implemented ChatOps for data center automation. After his talk at Slurm, many participants started thinking about introducing ChatOps in their companies. He now works as a successful security consultant.

Ivan Kruglov, Principal Developer at Booking.com, the real guest star of the conference. Some participants signed up for Slurm DevOps specifically for his talk. At Booking.com he has worked on infrastructure projects such as distributed message delivery and processing, big data, the web stack, and search. His current tasks include building an internal cloud and a service mesh.

We recorded extensive interviews with Eduard Medvedev and Ivan Kruglov; we will publish them on Habr when they are ready.

The audience's thoughtful faces showed slight fatigue. The previous two days of the intensive had pushed everyone to the limit, and heads were asking for rest and a weekend. But the topics and speakers of the third day dispelled the fatigue and drowsiness, especially Site Reliability Engineering and Ivan Kruglov.

At the end of the second day of Slurm, it was decided to move the topic of infrastructure monitoring with Prometheus to the next day. The intensive turned out to be too intense, and not all participants could keep up the pace.

So the third day began with a talk by Vladimir Guryanov. He briefly explained why monitoring is needed in the first place, described and classified the types of monitoring, and raised the issue of monitoring notifications.

The topics “How to build a healthy monitoring system” and “Human-readable notifications” got a lively response from the audience. Vladimir concluded with the topic of health checks: what to pay attention to and how to build automation on top of monitoring data.

To stir up the sleepy participants and get their learning abilities working at full capacity, Pavel Selivanov followed Vladimir Guryanov and captured the audience's attention with the topic “Logging an application with ELK”. He showed the Slurm participants our logging best practices and gave an overview of the ELK stack.

After the first coffee break, full of conversation and cookies, the Slurm participants returned to their seats in the lecture hall.

The talks by Guryanov and Selivanov and the purine alkaloid caffeine did their insidious work. Caffeine reached the brain's adenosine receptors and took the place of adenosine, the purine nucleoside responsible for inhibition, which simply deprived the Slurm participants of any chance to be “lazy” or “take a nap”. Not everyone understood what had happened. But everyone perked up.

Thus, the audience was one hundred percent ready for further learning and active absorption of knowledge, and for Eduard Medvedev's talk.
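The examples from the talk itself are not reproduced here. As a minimal sketch of one widely used practice for ELK-style pipelines, the Python snippet below writes every log record as a single JSON line, so that a log shipper can pick out fields without regex-based parsing. The logger name `demo_app` and the fields `request_id` and `user_id` are hypothetical and chosen only for illustration; the code uses the standard library alone.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed through `extra=` become attributes on the record.
        # These two field names are hypothetical examples.
        for key in ("request_id", "user_id"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload, ensure_ascii=False)


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("demo_app")  # hypothetical application name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info("order created", extra={"request_id": "req-123"})
line = buffer.getvalue().strip()
print(line)
```

In a real service the stream would typically be stdout or a file that a shipper tails; the in-memory buffer is used here only to keep the sketch self-contained.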

Eduard spoke about infrastructure automation with ChatOps and about integrating messengers with pipelines.

The finale of the third day, and of Slurm DevOps as a whole, was a talk by Ivan Kruglov, Principal Developer at Booking.com. Ivan immediately captured the audience's attention by confessing that his presentation had more than 140 slides, gently hinting that the Slurm participants should not make plans for either Friday itself or the weekend.

In an intense, long, and deep talk, Ivan Kruglov covered DevOps and SRE: what they are to each other and how they relate. He explained the "scary terms from the world of SRE": SLA, SLO, Error Budget, and some others.
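To make the relationship between an SLO and an error budget concrete, here is a small arithmetic sketch in Python. It is not taken from Ivan's slides; the function names and the sample figures (a 99.9% SLO, a 30-day window, one million requests) are assumptions chosen for illustration. The idea is that the error budget is simply the share of time or requests the SLO allows to fail: a 99.9% availability target over 30 days leaves roughly 43.2 minutes of permitted downtime.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    return (1.0 - slo) * window_days * 24 * 60


def budget_remaining(slo: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of a request-based error budget still unspent.

    A negative result means the budget has been overspent.
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = (1.0 - slo) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Sample figures are illustrative assumptions, not data from the talk.
    print(f"{error_budget_minutes(0.999):.1f} minutes of downtime allowed in 30 days")
    print(f"{budget_remaining(0.999, 1_000_000, 250):.0%} of the request budget left")
```

A team might use a value like `budget_remaining` as a signal: while budget is left, releases proceed as usual; once it is spent, effort shifts toward reliability work.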

Then came practice and more practice: monitoring SLIs and SLOs, applying the error budget, and managing interrupts and operational load (API gateway, service mesh, circuit breakers). And much, much more.
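Of the techniques listed above, the circuit breaker is the easiest to show in a few lines. The sketch below is a generic, minimal version of the pattern, not the implementation discussed at Slurm; the class name, the thresholds, and the injectable `clock` parameter are all assumptions made for illustration. After a configured number of consecutive failures the breaker opens and rejects calls immediately; once the timeout passes it lets one trial call through (half-open) and closes again if that call succeeds.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        max_failures: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self._clock = clock  # injectable so the timeout can be tested
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.state == "open":
            raise CircuitOpenError("circuit is open; call rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Trip when the threshold is reached; a failed half-open
            # trial call re-opens the circuit straight away.
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        # Any success resets the breaker to the closed state.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping a downstream request as `breaker.call(fetch, url)`, where `fetch` is whatever client function the service already uses, keeps a failing dependency from being hammered with requests that are likely to fail anyway.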

Developer's secret prayer.

Since SRE is an extremely broad topic and its nuances could be discussed for at least a few days, it was decided that at the next DevOps Slurm in February we will devote even more time to SRE and its practical application, as the most relevant and in-demand area.

Sabbath, [Sep 6, 2019, 18:25:30]:
Great talk!!
Now I think Booking is every bit as cool as Google :)

aaa, [Sep 6, 2019, 18:27:07]:
they still need to improve their UI/UX

mr. Dmitry, [Sep 6, 2019, 18:28:47]:
Yeah, every talk I've heard from Booking's specialists is cool, clear, and sensible. But their GUI makes the service extremely hard to use

After the talks came numerous questions, both offline and in the Slurm working chat:

Vladimir Guryanov, [Sep 6, 2019, 23:24:54]:
Someone asked about monitoring, how many items we have.
I haven't forgotten, here's the answer.
Active: 297,432

Maksim Aleksandrov, [Sep 7, 2019, 0:11:58]:
Thanks. How many checks per second (nvps) is that? And why Prometheus after all?

Vladimir Guryanov, [Sep 7, 2019, 0:24:15]:
2.21K
Why Prometheus? Well, if only because of service discovery and its convenient, flexible configuration.
Zabbix does poorly in environments where instances are short-lived and new ones are created often.
Zabbix is not great at monitoring Docker and k8s either.
But for us, Prometheus doesn't yet have enough advantages to justify investing the time and effort in migrating from Zabbix.

Slurm participants shared their impressions:

Alexander B, [Sep 6, 2019, 21:11:03]:
Thanks for the event. There were some "rough edges", but for a first run it was quite respectable.
The pace of some of the practical sessions was stressful; this is an intensive in every sense of the word ) To fit everything in and avoid cutting material from the talks and practical sessions on the second and third days for lack of time, consider making Slurm four days long.


Roman D, [Sep 6, 2019, 20:49:05]:
Thanks, it was interesting in places. A suggestion for the future: a couple of days before the event, sit down a couple of people off the street and have them go through the practical sessions following your instructions, then fix the errors and inaccuracies.

Nikita Suvorov, [Sep 6, 2019, 20:49:30 (06.09.2019, 20:50:07)]:
Speaking of suggestions, I have one too: speakers should practice in front of a mirror; the "uh", "um", "er" between words grate on the ear


Max Grechnev, [Sep 6, 2019, 19:42:57]:
Thanks! The course turned out great! The finale was absolutely fire)

Smith Wesson, [Sep 6, 2019, 19:58:11]:
Thanks for the course! You're the best!

Igor Averin, [Sep 6, 2019, 19:58:12]:
Agreed! It was really great! Thanks to the organizers!

After the conference, we asked participants to leave feedback via a Google Docs form. The results pleased and inspired us.

Thanks to everyone who was with us, offline in the Selectel conference hall and online. And many thanks to the readers of Habr. "Slurm inspires!" (c)

Source: habr.com
