Slurm: the caterpillar has turned into a butterfly

  1. Slurm really does let you get into Kubernetes or deepen what you already know.
  2. The participants are satisfied. Only a few learned nothing new or failed to solve their problems. The unconditional first-day refund ("if you feel Slurm is not for you, we will return the full ticket price") was used by just one person, who explained that he had overestimated his own abilities.
  3. The next Slurm will take place in early September in St. Petersburg. Selectel, our regular sponsor, is providing not only the cloud for the training stands but also its own conference hall.
  4. We are repeating the basic Slurm (September 9-11) and presenting a new program: DevOps Slurm (September 4-6).

What is Slurm and how has it changed?

A year ago we came up with the idea of running courses on Kubernetes. Slurm-1 took place in August '18: rough, with "continuous presentation" (slides being finished on stage) and plenty of everyday problems. Trials bring people together: the participants of the first Slurm, like the Fellowship of the Ring, still keep in touch with one another.

This is what Slurm-1 looked like

The idea of holding MegaSlurm was born at the first Slurm. We asked people which topics interested them, and in October we ran an advanced course "at the participants' request". It turned out to be an interesting event, but a one-off. By May '19 we had prepared a real advanced course, with its own logic and internal storyline.

Over the year, Slurm has changed organizationally:
- Docker and Ansible were removed from the main program and became separate online courses.
- We set up technical support that helps students troubleshoot their training clusters.
- The speakers now receive methodological support.

The team that created Slurm-4

Feedback from participants

Another record was set: 170 participants at the basic Slurm and 75 at MegaSlurm.

Slurm-4
101 of the 170 participants filled out the feedback form.

Has Kubernetes become clearer?
41 - I still don't understand K8s, but I can see where to dig.
36 - I didn't know K8s before, but now I've figured it out.
23 - I knew K8s before, but now I know it better.
1 - I learned nothing new.
0 - I understood nothing about k8s.

How did you find the intensity of Slurm?

16 people think Slurm is too easy and slow, and 14 think it is too hard and fast. For the rest, it is just right.

Did you solve the problem you came to Slurm with?

90 - Yes.
11 - No.

MegaSlurm

40 people filled out the feedback form. 2 said it was too easy and slow. One person did not solve the problem he came to Mega with. For the rest, everything was fine.

A review of Slurm on https://serveradmin.ru

Speakers' feedback

While the February Slurm in St. Petersburg drew mostly beginners, at the Moscow Slurm a large share of the audience had already tried Kubernetes. There were many advanced questions that made us think.

In St. Petersburg people asked when we would publish our Kubespray fork; in Moscow they were already asking why we recommend our fork instead of the original Kubespray. That is the critical thinking of mid-level and senior engineers.

The practice was hard and people made a lot of mistakes, and that is great: mistakes should be made while learning, not in production.

We kept running into limits on issuing certificates, limits on downloads from GitHub, and so on. That's life: we were deploying about 200 clusters in parallel in the Selectel cloud, and nobody sizes their resources and limits for that.

Slurm announcement at Selectel

→ Registration for Slurm-5
Price: 25 ₽

Program:

Topic No. 1: Introduction to Kubernetes, main components
- Introduction to k8s technology. Description, use cases, concepts
- Pod, ReplicaSet, Deployment, Service, Ingress, PV, PVC, ConfigMap, Secret

Topic No. 2: Cluster design, main components, fault tolerance, k8s networking
- Cluster design, main components, fault tolerance
- k8s networking

Topic No. 3: Kubespray, tuning and setting up a Kubernetes cluster
- Kubespray, configuring and tuning a Kubernetes cluster

Topic No. 4: Advanced Kubernetes abstractions
- DaemonSet, StatefulSet, RBAC, Job, CronJob, Pod Scheduling, InitContainer

Topic No. 5: Publishing services and applications
- Overview of service publishing methods: NodePort vs LoadBalancer vs Ingress
- Ingress controller (Nginx): balancing incoming traffic
- cert-manager: obtaining SSL/TLS certificates automatically

Topic No. 6: Introduction to Helm

Topic No. 7: Installing cert-manager

Topic No. 8: Ceph: a "do as I do" installation

Topic No. 9: Logging and monitoring
- Cluster monitoring, Prometheus
- Cluster logging, Fluentd/Elastic/Kibana

Topic No. 10: Cluster upgrade

Topic No. 11: Practical work: dockerizing an application and launching it in the cluster

The Docker and Ansible courses on stepik.org are included in the price.

→ Registration for Slurm DevOps
Price: 45 ₽

Program:

Topic No. 1: Introduction to Git
- Basic commands: git init, commit, add, diff, log, status, pull, push
- Setting up the local environment: practical recommendations
- Git flow, branches and tags, merge strategies
- Working with multiple remote repositories

Topic No. 2: Teamwork with Git
- GitHub flow
- Fork, remote, pull request
- Conflicts, releases, and once more on Gitflow and other flows as they apply to teams

Topic No. 3: CI/CD. Introduction to automation
- Automation in git (bots, introduction to CI, hooks)
- Tools (bash, make, gradle)
- Factory assembly lines and their application in IT

Topic No. 4: CI/CD: working with GitLab
- Build, test, deploy
- Stages, variables, execution control (only, when, include)

Topic No. 5: Working with the application from the developer's point of view
- Writing a microservice in Python (including tests)
- Using docker-compose in development

Topic No. 6: Infrastructure as code
- IaC: the infrastructure-as-code approach
- IaC with Terraform as an example
- IaC with Ansible as an example
- Idempotency, declarativeness
- Practice writing Ansible playbooks
- Storing configuration, collaboration, automating how changes are applied

Topic No. 7: Infrastructure testing
- Testing and continuous integration with Molecule and GitLab CI

Topic No. 8: Automating server provisioning
- Building images
- PXE and DHCP

Topic No. 9: Infrastructure automation
- An example of an infrastructure service for authorization on servers
- ChatOps (integrating instant messengers with pipelines)

Topic No. 10: Security automation
- Signing CI/CD artifacts
- Vulnerability scanning

Topic No. 11: Monitoring
- Defining SLA, SLO, Error Budget and other scary terms from the SRE world
- SRE: practice monitoring SLIs and SLOs
- SRE: practice using the error budget
- SRE: managing interrupts and operational load (API gateway, service mesh, circuit breakers)
- Monitoring pipelines and development metrics

Source: www.habr.com
