DNS problems in Kubernetes. Public postmortem

Translator's note: This is a translation of a public postmortem from the Preply engineering blog. It describes a conntrack problem in a Kubernetes cluster that caused partial downtime for some production services.

This article may be useful to anyone who wants to learn a little about postmortems or to prevent some potential DNS problems in the future.

It's not DNS
It can't be DNS
It was DNS

A little about postmortems and processes at Preply

A postmortem describes a production outage or other production incident. It includes a timeline of events, the impact on users, the root cause, the actions taken, and the lessons learned.

Seeking SRE

At our weekly meetings with pizza, the technical team shares all kinds of information. One of the most important parts of these meetings is the postmortems, which are usually accompanied by a presentation with slides and a deeper analysis of the incident. Although we don't applaud after postmortems, we try to cultivate a "blameless" culture. We believe that writing and presenting postmortems can help us (and others) prevent similar incidents in the future, which is why we share them.

People involved in an incident must feel that they can give a detailed account without fear of punishment or retribution. No finger-pointing! Writing a postmortem is not a punishment but a learning opportunity for the whole company.

Keep CALMS & DevOps: S is for Sharing

DNS problems in Kubernetes. Postmortem

Date: 28.02.2020

Authors: Amet U., Andrey S., Igor K., Alexey P.

Status: Finished

Summary: Partial DNS unavailability (26 min) for some services in the Kubernetes cluster

Impact: 15000 events lost for services A, B and C

Root cause: kube-proxy was unable to correctly delete an old entry from the conntrack table, so some services kept trying to connect to pods that no longer existed.

E0228 20:13:53.795782       1 proxier.go:610] Failed to delete kube-system/kube-dns:dns endpoint connections, error: error deleting conntrack entries for UDP peer {100.64.0.10, 100.110.33.231}, error: conntrack command returned: ...
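The error line above carries the affected service, the protocol, and the address pair whose conntrack entries could not be deleted. As a minimal sketch of how those fields could be pulled out for alerting, the snippet below parses that exact line. The regular expression and the group names `origin_ip` and `dest_ip` are hypothetical, written only for this message. Reading the first address as the service IP and the second as the pod IP is an assumption, not something the postmortem states.

```python
import re

# The kube-proxy error quoted in the postmortem (the trailing "..." is elided in the source).
LOG_LINE = (
    "E0228 20:13:53.795782       1 proxier.go:610] Failed to delete "
    "kube-system/kube-dns:dns endpoint connections, error: error deleting "
    "conntrack entries for UDP peer {100.64.0.10, 100.110.33.231}, "
    "error: conntrack command returned: ..."
)

# Hypothetical pattern for this one message; not an official kube-proxy log format.
PATTERN = re.compile(
    r"Failed to delete (?P<service>\S+) endpoint connections, "
    r".*?for (?P<proto>\w+) peer \{(?P<origin_ip>[\d.]+), (?P<dest_ip>[\d.]+)\}"
)


def parse_conntrack_failure(line):
    """Return the fields of a conntrack cleanup failure, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


info = parse_conntrack_failure(LOG_LINE)
print(info)
```

A log shipper or alert rule could count such matches per service; any non-zero count for `kube-system/kube-dns:dns` would have pointed straight at DNS.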

Trigger: Because of low load inside the Kubernetes cluster, CoreDNS-autoscaler reduced the number of pods in the deployment from three to two.
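To make the trigger concrete, here is a rough sketch of how a proportional DNS autoscaler can arrive at "three, then two". It assumes the autoscaler runs in the linear mode of cluster-proportional-autoscaler, where the replica count follows the number of cores and nodes. The postmortem gives neither the mode nor the parameters, so every number below is hypothetical.

```python
import math


def linear_replicas(cores, nodes, cores_per_replica, nodes_per_replica,
                    min_replicas=1, max_replicas=None):
    """Replica count in linear mode: the larger of the per-core and per-node estimates."""
    by_cores = math.ceil(cores / cores_per_replica)
    by_nodes = math.ceil(nodes / nodes_per_replica)
    replicas = max(by_cores, by_nodes, min_replicas)
    if max_replicas is not None:
        replicas = min(replicas, max_replicas)
    return replicas


# Hypothetical cluster sizes and parameters, chosen only to reproduce 3 -> 2.
before = linear_replicas(cores=40, nodes=10, cores_per_replica=16, nodes_per_replica=4)
after = linear_replicas(cores=32, nodes=8, cores_per_replica=16, nodes_per_replica=4)
print(before, after)  # 3 2
```

Under these assumptions, losing two nodes during a quiet period is enough to remove one CoreDNS pod, which is the event that left a stale conntrack entry behind.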

Resolution: The next application deploy initiated the creation of new nodes, and CoreDNS-autoscaler added more pods to serve the cluster, which caused the conntrack table to be rewritten.

Detection: Prometheus monitoring detected a high number of 5xx errors for services A, B and C and initiated a call to the on-call engineers.

Figure: 5xx errors in Kibana

Action items

Action                         | Type     | Owner   | Task
Disable autoscaler for CoreDNS | prevent  | Amet U. | DEVOPS-695
Set up a caching DNS server    | mitigate | Max V.  | DEVOPS-665
Set up conntrack monitoring    | prevent  | Amet U. | DEVOPS-674
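The postmortem does not say what the conntrack monitoring should measure. One common signal is how full the table is, which the Linux kernel exposes under `/proc/sys/net/netfilter`. The sketch below reads those two counters and applies a hypothetical 80% threshold. Table fullness alone would not have caught a stale entry like the one in this incident, so treat it as one signal among several.

```python
from pathlib import Path

# Kernel counters for the connection tracking table (present when nf_conntrack is loaded).
PROC_DIR = Path("/proc/sys/net/netfilter")


def read_conntrack_usage(proc_dir=PROC_DIR):
    """Return (current entries, maximum entries) from the kernel counters."""
    count = int((proc_dir / "nf_conntrack_count").read_text().strip())
    limit = int((proc_dir / "nf_conntrack_max").read_text().strip())
    return count, limit


def usage_ratio(count, limit):
    """Fraction of the table in use; 0.0 if the limit is not set."""
    return count / limit if limit else 0.0


def should_alert(count, limit, threshold=0.8):
    """True when usage reaches the threshold (0.8 is a hypothetical default)."""
    return usage_ratio(count, limit) >= threshold


if __name__ == "__main__":
    try:
        count, limit = read_conntrack_usage()
    except (OSError, ValueError):
        print("conntrack counters are not available on this host")
    else:
        print(f"conntrack usage: {count}/{limit} ({usage_ratio(count, limit):.1%})")
        print("ALERT" if should_alert(count, limit) else "ok")
```

In a Prometheus setup, the same two counters are usually scraped by an exporter rather than read by a script; the script form is used here only to show the arithmetic.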

Lessons learned

What went well:

  • Monitoring worked well. The response was fast and organized
  • We did not hit any limits on the nodes

What went wrong:

  • The real root cause is still unknown; it looks like a specific bug in conntrack
  • All action items fix only the consequences, not the root cause (the bug)
  • We knew that sooner or later we might have problems with DNS, but we did not prioritize the tasks

Where we got lucky:

  • The next deploy triggered CoreDNS-autoscaler, which overwrote the conntrack table
  • This bug affected only some of the services

Timeline (EET)

Time  | Action
22:13 | CoreDNS-autoscaler reduced the number of pods from three to two
22:18 | The on-call engineers started receiving calls from the monitoring system
22:21 | The on-call engineers started looking for the cause of the errors
22:39 | The on-call engineers started rolling back one of the latest services to its previous version
22:40 | 5xx errors stopped appearing; the situation stabilized

  • Time to detect: 4 min
  • Time to act: 21 min
  • Time to fix: 1 min
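As a quick consistency check, the three durations can be added up and compared with the 26 minutes of unavailability given in the summary. The snippet uses only the figures reported in this postmortem.

```python
# Durations reported in this postmortem, in minutes.
phases_min = {"detect": 4, "act": 21, "fix": 1}

total_min = sum(phases_min.values())
print(f"total outage: {total_min} min")  # total outage: 26 min

# Share of the outage spent in each phase.
for name, minutes in phases_min.items():
    share = 100 * minutes / total_min
    print(f"time to {name}: {minutes} min ({share:.0f}% of the outage)")
```

Most of the outage falls into the "time to act" phase, that is, the period between the first alert and the first corrective action.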

Supporting information

To reduce CPU usage, the Linux kernel uses a mechanism called conntrack. In short, it keeps a list of NAT records in a special table. When the next packet arrives from the same pod and is headed to the same pod as before, the final IP address is not recalculated but taken from the conntrack table.
Figure: How conntrack works
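The behaviour described above, and the failure mode from the root cause, can be illustrated with a toy model. This is not how the kernel implements conntrack: the class, the flow key, and the pod names are all hypothetical, and a real entry is keyed on the full connection tuple. The point is only that a cached NAT decision keeps being reused until its entry is removed.

```python
import random


class ToyConntrackNAT:
    """Toy model: a service IP is translated to one backend pod, cached per flow."""

    def __init__(self, service_ip, backends):
        self.service_ip = service_ip
        self.backends = list(backends)
        self.table = {}  # flow key -> chosen backend (stands in for the conntrack table)

    def route(self, src_ip, src_port):
        key = (src_ip, src_port, self.service_ip)
        if key not in self.table:
            # First packet of a flow: pick a backend and remember the decision.
            self.table[key] = random.choice(self.backends)
        # Later packets reuse the cached decision without recalculating it.
        return self.table[key]

    def remove_backend(self, backend, flush=True):
        self.backends.remove(backend)
        if flush:
            # Normal cleanup: drop cached entries that point at the removed pod.
            self.table = {k: v for k, v in self.table.items() if v != backend}


flow = ("10.0.0.5", 40000)  # hypothetical client address and port
pods = ["coredns-1", "coredns-2", "coredns-3"]

# Failed cleanup: the pod is gone, but its cached entry survives.
nat = ToyConntrackNAT("100.64.0.10", pods)
first = nat.route(*flow)
nat.remove_backend(first, flush=False)
stale = nat.route(*flow)
print("still routed to removed pod:", stale == first)  # True

# Successful cleanup: the flow is re-balanced to a pod that still exists.
nat_ok = ToyConntrackNAT("100.64.0.10", pods)
first_ok = nat_ok.route(*flow)
nat_ok.remove_backend(first_ok, flush=True)
fresh = nat_ok.route(*flow)
print("routed to a live pod:", fresh in nat_ok.backends)  # True
```

In the first scenario, DNS queries from that flow keep going to a pod that no longer exists, which matches the symptom described in the root cause.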

Summary

This was an example of one of our postmortems, with some useful links. In this article specifically, we share information that may be useful to other companies. That is why we are not afraid of making mistakes, and why we are making one of our postmortems public. Here are some more interesting public postmortems:

Source: www.habr.com
