Slurm: Kubernetes intensive. Program and bonuses

On May 27-29 we are holding the fourth Slurm: an intensive course on Kubernetes.

Bonus: online courses on Docker, Ansible, Ceph
We cut from Slurm the topics that matter for working with Kubernetes but are not directly related to k8s. How, why, and what came of it: below the cut.
All Slurm 4 participants will have access to these courses.

Full refund on the first day
At the St. Petersburg Slurm, two participants left extremely negative reviews. How I regretted that we could not go back in time and part ways with them without mutual grievances.
If you find that you really do not like Slurm, write to any of the organizers on the first day. We will disable your access and refund the full participation fee.

Technical consultations
In case anyone knows Dmitry Simonov (he founded a club for technical directors): we invited him to Slurm (to study, not to speak). He promised to advise anyone who asks. This will probably not interest administrators and developers, but it will be very interesting to IT managers.

What is Slurm

Slurm-4: basic course (May 27-29)
Intended for those who are seeing Kubernetes for the first time or who want to systematize what they already know.
Each participant will create their own cluster in the Selectel cloud and deploy an application to it.

Price: 25 thousand

Program

Topic 1: Introduction to Kubernetes, main components
• Introduction to k8s technology. Description, application, concepts
• Pod, ReplicaSet, Deployment, Service, Ingress, PV, PVC, ConfigMap, Secret
• Practice

Topic 2: Cluster design, main components, fault tolerance, k8s networking
• Cluster design, main components, fault tolerance
• k8s networking

Topic 3: Kubespray, tuning and setting up a Kubernetes cluster
• Kubespray, configuring and tuning a Kubernetes cluster
• Practice

Topic 4: Ceph, cluster setup and the specifics of running it in production
• Ceph, cluster setup and the specifics of running it in production
• Practice: setting up Ceph

Topic 5: Advanced Kubernetes abstractions
• DaemonSet, StatefulSet, RBAC, Job, CronJob, Pod Scheduling, InitContainer

Topic 6: Introduction to Helm
• Introduction to Helm
• Practice

Topic 7: Publishing services and applications
• Overview of ways to publish a service: NodePort vs LoadBalancer vs Ingress
• Ingress controller (Nginx): balancing incoming traffic
• cert-manager: obtaining SSL/TLS certificates automatically
• Practice

Topic 8: Logging and monitoring
• Cluster monitoring, Prometheus
• Cluster logging, Fluentd/Elastic/Kibana
• Practice

Topic 9: CI/CD, building a deployment to the cluster from scratch

Topic 10: Practical work, dockerizing an application and launching it in the cluster

Slurm website

MegaSlurm: advanced course (May 31 - June 2)
Intended for Kubernetes engineers and architects, as well as for graduates of the basic course.
We configure the cluster so that an update of the cluster components and a deployment to the cluster can be launched at the same time.

Price: 60 thousand (45 thousand for Slurm-4 participants)

Program

Topic 1: Creating a failover cluster, from the inside
• Working with Kubespray
• Installing additional components
• Cluster testing and troubleshooting
• Practice

Topic 2: Authorization in the cluster via an external provider
• LDAP (Nginx + Python)
• OIDC (Dex + Gangway)
• Practice

Topic 3: Network policies
• Introduction to CNI
• Network security policy
• Practice

Topic 4: Secure and highly available applications in the cluster
• PodSecurityPolicy
• PodDisruptionBudget

Topic 5: Kubernetes. A look under the hood
• Controller structure
• Operators and CRDs
• Practice

Topic 6: Stateful applications in the cluster
• Launching a database cluster, using PostgreSQL as an example
• Launching a RabbitMQ cluster
• Practice

Topic 7: Storing secrets
• Managing secrets in Kubernetes
• Vault

Topic 8: Horizontal Pod Autoscaler
• Theory
• Practice

Topic 9: Backup and disaster recovery
• Cluster backup and recovery using Heptio Velero (formerly Ark), etc.
• Practice

Topic 10: Application deployment
• Lint
• Templating and deployment tools
• Deployment strategies

Topic 11: Practical work
• Building CI/CD for application deployment
• Cluster update

MegaSlurm website

Docker, Ansible and Ceph

Background

The first Slurm was an experiment. The speakers were finishing their presentations right on stage, and the audience included administrators of such a level that they should have been invited as speakers themselves.

The real basic course took place at the second Slurm: 80% of the participants were seeing Kubernetes for the first time, and a third had never worked with Docker.
It was clear how hard it was for people to hear a lecture on Docker in the morning and then use it under real working conditions in the evening.
Ceph caused a lot of difficulty. Moreover, there were 20 people in the audience who definitely needed Ceph explained, and another 60 who did not need Ceph at all.

For the third Slurm, we moved Docker and Ansible into separate webinars, which freed up more time for Kubernetes. The decision turned out to be sound in principle but underdeveloped in execution: the lecture did not interest the experienced participants, and the discussion did not interest the beginners.

For the fourth Slurm, we prepared online courses on Docker, Ansible and Ceph. The idea is simple: those who need a course will work through it carefully, and those who do not can safely ignore it. Judging by the test group, the Docker course takes 6-8 hours. The Ansible and Ceph courses have not been timed yet.

Disclaimer:

  • The courses are experimental. Some of the decisions will probably turn out to be unsuccessful.
  • The platform (Stepik.org) is rough, and we have not worked with it before. There will probably be bumps and glitches.
  • The courses have been tested only on Southbridge employees. Some things will certainly have to be finished along the way.

Just a few days ago, people in the chat of the first Slurm were recalling how cool and fun it was, despite all the organizational horrors. The first participants get the most vivid impressions. Let's see what happens to the first students of the online courses. 🙂

Source: www.habr.com
