DNS problems in Kubernetes. A public postmortem

Translator's note: This is a translation of a public postmortem from the engineering blog of Preply. It describes a conntrack problem in a Kubernetes cluster that led to partial downtime of some production services.

This article may be useful to anyone who wants to learn more about postmortems or to prevent similar DNS problems in the future.

It's not DNS
There's no way it's DNS
It was DNS

A little about postmortems and processes at Preply

A postmortem describes a malfunction or an incident in production. It includes a timeline of events, the impact on users, the root cause, the actions taken, and the lessons learned.

Seeking SRE

At our weekly pizza meetings, the engineering team shares all kinds of information. One of the most important parts of these meetings is the postmortems, which usually come with a slide presentation and a deeper analysis of the incident. Although we do not applaud after postmortems, we try to build a blameless culture. We believe that writing and presenting postmortems can help us (and others) prevent similar incidents in the future, which is why we share them.

People involved in an incident should feel that they can describe it in detail without fear of punishment or retribution. No blame! Writing a postmortem is not a punishment but a learning opportunity for the whole company.

Keep CALMS & DevOps: S is for Sharing

DNS problems in Kubernetes. Postmortem

Date: 28.02.2020

Authors: Amet U., Andrey S., Igor K., Alexey P.

Status: Finished

Summary: Partial DNS unavailability (26 min) for some services in the Kubernetes cluster

Impact: 15000 events lost for services A, B and C

Root cause: Kube-proxy could not correctly remove an old entry from the conntrack table, so some services kept trying to connect to pods that no longer existed.

E0228 20:13:53.795782       1 proxier.go:610] Failed to delete kube-system/kube-dns:dns endpoint connections, error: error deleting conntrack entries for UDP peer {100.64.0.10, 100.110.33.231}, error: conntrack command returned: ...
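
To make the log line more concrete, here is a minimal sketch of how leftover entries of this kind could be spotted. It assumes that 100.64.0.10 in the log is the kube-dns service address and 100.110.33.231 is the removed pod, which the message suggests but the postmortem does not state. The sample text imitates the output of `conntrack -L`; the client address 100.96.5.20, the live pod address 100.96.2.4, and the function name `stale_dns_flows` are invented for illustration.

```python
import re

# Hypothetical sample in the style of `conntrack -L` output.
# Only 100.64.0.10 and 100.110.33.231 come from the log line above.
SAMPLE = """\
udp      17 25 src=100.96.5.20 dst=100.64.0.10 sport=41000 dport=53 [UNREPLIED] src=100.110.33.231 dst=100.96.5.20 sport=53 dport=41000 mark=0 use=1
udp      17 110 src=100.96.5.20 dst=100.64.0.10 sport=41001 dport=53 src=100.96.2.4 dst=100.96.5.20 sport=53 dport=41001 [ASSURED] mark=0 use=1
"""

PAIR = re.compile(r"\b(src|dst|sport|dport)=(\S+)")


def stale_dns_flows(conntrack_text, service_ip, live_backends):
    """Return (client, backend) pairs whose NAT target is not a live backend."""
    stale = []
    for line in conntrack_text.splitlines():
        if not line.startswith("udp"):
            continue
        fields = PAIR.findall(line)
        if len(fields) < 8:
            continue  # not a complete original + reply tuple
        original = dict(fields[:4])  # what the client sent
        reply = dict(fields[4:8])    # where the reply is expected from
        to_dns_service = original["dst"] == service_ip and original["dport"] == "53"
        if to_dns_service and reply["src"] not in live_backends:
            stale.append((original["src"], reply["src"]))
    return stale


if __name__ == "__main__":
    print(stale_dns_flows(SAMPLE, "100.64.0.10", {"100.96.2.4"}))
```

The first sample flow is reported because its reply side still points at a pod that is not in the live set, which is the situation the root cause describes.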

Trigger: Because of low load in the Kubernetes cluster, CoreDNS-autoscaler reduced the number of pods in the deployment from three to two.
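
Why would low load remove a DNS pod? The postmortem does not say how CoreDNS-autoscaler was configured. The sketch below assumes it scales the way the linear mode of cluster-proportional-autoscaler does, where the replica count follows the number of nodes and cores in the cluster. All parameter values and cluster sizes are hypothetical.

```python
import math


def linear_replicas(nodes, cores, nodes_per_replica, cores_per_replica, min_replicas=1):
    """Replica count in a linear, cluster-proportional scheme (assumed, not confirmed)."""
    by_nodes = math.ceil(nodes / nodes_per_replica)
    by_cores = math.ceil(cores / cores_per_replica)
    return max(by_nodes, by_cores, min_replicas)


if __name__ == "__main__":
    # Hypothetical settings: one replica per 16 nodes or per 256 cores.
    print(linear_replicas(nodes=40, cores=160, nodes_per_replica=16, cores_per_replica=256))
    print(linear_replicas(nodes=30, cores=120, nodes_per_replica=16, cores_per_replica=256))
```

Under these assumed numbers, a cluster that shrinks from 40 to 30 nodes drops from three DNS replicas to two, which matches the shape of the trigger described above.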

Resolution: The next deployment of the application triggered the creation of new nodes, and CoreDNS-autoscaler added more pods to serve the cluster, which caused the conntrack table to be rewritten.

Detection: Prometheus monitoring detected a large number of 5xx errors for services A, B and C and triggered a call to the on-duty engineers.

5xx errors in Kibana

Action items

Action                          | Type     | Owner   | Task
Disable autoscaler for CoreDNS  | prevent  | Amet U. | DEVOPS-695
Set up a caching DNS server     | mitigate | Max V.  | DEVOPS-665
Set up conntrack monitoring     | prevent  | Amet U. | DEVOPS-674
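
The postmortem does not say what the conntrack monitoring measures. One common signal is how full the table is, which on Linux can be read from `/proc/sys/net/netfilter/nf_conntrack_count` and `nf_conntrack_max`. The following is a minimal sketch under that assumption; the function names are hypothetical, and a fill-level check alone would not have caught the leftover entry behind this incident.

```python
from pathlib import Path

# Standard location of the conntrack counters on Linux hosts
# with the nf_conntrack module loaded.
PROC_DIR = Path("/proc/sys/net/netfilter")


def conntrack_usage(count, maximum):
    """Fraction of the conntrack table in use, between 0.0 and 1.0."""
    return count / maximum if maximum else 0.0


def read_conntrack_usage(proc_dir=PROC_DIR):
    """Read the current and maximum entry counts and return the fill ratio."""
    count = int((proc_dir / "nf_conntrack_count").read_text())
    maximum = int((proc_dir / "nf_conntrack_max").read_text())
    return conntrack_usage(count, maximum)


if __name__ == "__main__":
    usage = read_conntrack_usage()
    print(f"conntrack table is {usage:.1%} full")
```

A monitoring agent could export this ratio and alert well before the table fills up, at which point the kernel starts dropping new connections.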

Lessons learned

What went well:

  • Monitoring worked well. The response was fast and organized
  • We did not hit any limits on the nodes

What went wrong:

  • The real root cause is still unknown; it looks like a specific bug in conntrack
  • All actions fix only the consequences, not the root cause (the bug)
  • We knew that sooner or later we could have problems with DNS, but we did not prioritize the tasks

Where we got lucky:

  • The next deployment triggered CoreDNS-autoscaler, which rewrote the conntrack table
  • This bug affected only some of the services

Timeline (EET)

Time  | Action
22:13 | CoreDNS-autoscaler reduced the number of pods from three to two
22:18 | On-duty engineers started receiving calls from the monitoring system
22:21 | On-duty engineers started investigating the cause of the errors
22:39 | On-duty engineers started rolling back one of the most recently deployed services to its previous version
22:40 | 5xx errors stopped appearing, and the situation stabilized

  • Time to detection: 4 min
  • Time to action: 21 min
  • Time to fix: 1 min

Additional information

To reduce CPU usage, the Linux kernel uses something called conntrack. In short, it is a mechanism that keeps a list of NAT records in a special table. When the next packet arrives from the same pod for the same destination as before, the final IP address is not recalculated but is taken from the conntrack table.
How conntrack works
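
The caching behavior described above, and the way it can go wrong, can be illustrated with a toy model. This is a deliberately simplified sketch, not how the kernel implements connection tracking: the class name `ToyConntrack`, the pod names, the client address, and the round-robin choice of backend are all assumptions made for the example.

```python
class ToyConntrack:
    """Toy model of NAT with a conntrack-style cache keyed by flow tuple."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.table = {}  # flow tuple -> backend chosen for that flow
        self._next = 0   # round-robin counter for new flows

    def route(self, src_ip, src_port, service_ip, dst_port=53):
        flow = (src_ip, src_port, service_ip, dst_port)
        if flow not in self.table:
            # New flow: pick a backend once and remember the decision.
            self.table[flow] = self.backends[self._next % len(self.backends)]
            self._next += 1
        # Known flow: reuse the cached decision without recalculating.
        return self.table[flow]

    def remove_backend(self, backend, flush=True):
        self.backends.remove(backend)
        if flush:
            # Expected cleanup: drop cached flows that point at the removed backend.
            self.table = {f: b for f, b in self.table.items() if b != backend}


if __name__ == "__main__":
    nat = ToyConntrack(["pod-a", "pod-b", "pod-c"])
    print(nat.route("10.0.0.5", 41000, "100.64.0.10"))
    nat.remove_backend("pod-a", flush=False)  # cleanup fails, as in the incident
    print(nat.route("10.0.0.5", 41000, "100.64.0.10"))
```

With `flush=False` the second lookup still returns the removed pod, which mirrors what happened when the old entry could not be deleted. With the default `flush=True` the same flow is assigned to one of the remaining backends.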

Conclusions

This was an example of one of our postmortems with some useful links. In this article in particular, we share information that may be useful to other companies. That is why we are not afraid of making mistakes, and that is why we make some of our postmortems public. Here are some more public postmortems.

Source: www.habr.com
