Translator's note: this is a translation of a public postmortem from the Preply engineering blog.
This article may be useful to those who want to learn a little about postmortems or to prevent some potential DNS problems in the future.
It's not DNS
It can't be DNS
It was DNS
A little about postmortems and processes at Preply
A postmortem describes a malfunction or an incident in production. It includes a timeline of events, the impact on users, the root cause, the actions taken, and the lessons learned.
At weekly meetings with pizza, the technical team shares all kinds of knowledge. One of the most important parts of these meetings is the postmortems, which are usually accompanied by a slide presentation and an in-depth analysis of the incident. Although we do not applaud after postmortems, we try to cultivate a "no blame" (blameless) culture.
People involved in an incident should feel that they can speak about it in detail without fear of punishment or retaliation. No blame! Writing a postmortem is not a punishment but a learning opportunity for the whole company.
DNS problems in Kubernetes. Postmortem
Date: 28.02.2020
Authors: Amet U., Andrey S., Igor K., Alexey P.
Status: Finished
Summary: Partial DNS unavailability (26 min) for some services in the Kubernetes cluster
Impact: 15,000 events lost for services A, B, and C
Root cause: kube-proxy could not correctly remove an old entry from the conntrack table, so some services kept trying to connect to pods that no longer existed.
E0228 20:13:53.795782 1 proxier.go:610] Failed to delete kube-system/kube-dns:dns endpoint connections, error: error deleting conntrack entries for UDP peer {100.64.0.10, 100.110.33.231}, error: conntrack command returned: ...
Trigger: Because of low load inside the Kubernetes cluster, CoreDNS-autoscaler reduced the number of pods in the deployment from three to two.
Resolution: The next deployment of the application initiated the creation of new nodes, and CoreDNS-autoscaler added more pods to serve the cluster, which caused the conntrack table to be rewritten.
Detection: Prometheus monitoring detected a large number of 5xx errors for services A, B, and C and initiated a call to the on-duty engineers.
Figure: 5xx errors in Kibana
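The trigger above can be illustrated with a small sketch. It assumes that CoreDNS-autoscaler behaves like the linear mode of cluster-proportional-autoscaler, where the replica count is the larger of cores/coresPerReplica and nodes/nodesPerReplica, rounded up and clamped to a minimum and maximum. The cluster sizes and parameter values are hypothetical and are not taken from the incident.

```python
import math


def linear_replicas(cores, nodes, cores_per_replica, nodes_per_replica,
                    min_replicas=1, max_replicas=None):
    """Replica count in the style of a linear cluster-proportional autoscaler."""
    wanted = max(math.ceil(cores / cores_per_replica),
                 math.ceil(nodes / nodes_per_replica))
    wanted = max(wanted, min_replicas)
    if max_replicas is not None:
        wanted = min(wanted, max_replicas)
    return wanted


# Hypothetical parameters and cluster sizes, chosen only for illustration.
PARAMS = dict(cores_per_replica=256, nodes_per_replica=16, min_replicas=2)

busy = linear_replicas(cores=160, nodes=40, **PARAMS)   # cluster under load
quiet = linear_replicas(cores=100, nodes=25, **PARAMS)  # cluster after scale-in

print(f"busy cluster: {busy} replicas, quiet cluster: {quiet} replicas")
```

With these made-up numbers, removing nodes during a quiet period lowers the desired replica count from three to two, which is the kind of scale-down that preceded the incident.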
Actions
| Action | Type | Owner | Task |
|---|---|---|---|
| Disable the autoscaler for CoreDNS | prevent | Amet U. | DEVOPS-695 |
| Set up a caching DNS server | mitigate | Max V. | DEVOPS-665 |
| Set up conntrack monitoring | prevent | Amet U. | DEVOPS-674 |
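The conntrack monitoring item does not say what exactly is measured. Below is a minimal sketch, assuming the goal is to watch how full the table is. On Linux hosts with the nf_conntrack module loaded, the kernel exposes the current entry count and the limit under /proc/sys/net/netfilter. The function names and the 80% threshold are hypothetical.

```python
from pathlib import Path

# Standard location of the conntrack counters on Linux (nf_conntrack loaded).
PROC_DIR = Path("/proc/sys/net/netfilter")


def conntrack_usage(proc_dir=PROC_DIR):
    """Return (count, maximum, ratio) read from the kernel's conntrack counters."""
    proc_dir = Path(proc_dir)
    count = int((proc_dir / "nf_conntrack_count").read_text().strip())
    maximum = int((proc_dir / "nf_conntrack_max").read_text().strip())
    ratio = count / maximum if maximum else 0.0
    return count, maximum, ratio


def is_near_limit(ratio, threshold=0.8):
    """Hypothetical alert rule: fire when the table is at least 80% full."""
    return ratio >= threshold


if __name__ == "__main__":
    try:
        count, maximum, ratio = conntrack_usage()
    except OSError:
        print("conntrack counters are not available on this host")
    else:
        state = "ALERT" if is_near_limit(ratio) else "ok"
        print(f"conntrack: {count}/{maximum} ({ratio:.1%}) {state}")
```

A fill-level check like this would not have caught the stale entry described in the root cause. It only covers table exhaustion, so error counters from kube-proxy would need a separate alert.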
Lessons Learned
What went well:
- Monitoring worked well. The response was fast and organized
- We did not hit any limits on the nodes
What went wrong:
- The real root cause is still unknown; it looks like a specific bug in conntrack
- All actions fix only the consequences, not the root cause (the bug)
- We knew that sooner or later we might have problems with DNS, but we did not prioritize the tasks
Where we got lucky:
- The next deployment triggered CoreDNS-autoscaler, which overwrote the conntrack table
- This bug affected only some services
Timeline (EET)
| Time | Action |
|---|---|
| 22:13 | CoreDNS-autoscaler reduced the number of pods from three to two |
| 22:18 | On-duty engineers began receiving calls from the monitoring system |
| 22:21 | On-duty engineers began looking for the cause of the errors |
| 22:39 | On-duty engineers began rolling back one of the most recent services to its previous version |
| 22:40 | 5xx errors stopped appearing, the situation stabilized |
- Time to detection: 4 min
- Time before action: 21 min
- Time to fix: 1 min
Additional information
- CoreDNS logs:
I0228 20:13:53.507780 1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"kube-system", Name:"coredns", UID:"2493eb55-3dc0-11ea-b3a2-02bb48f8c230", APIVersion:"apps/v1", ResourceVersion:"132690686", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled down replica set coredns-6cbb6646c9 to 2
- Links to Kibana (removed), Grafana (removed)
- When Linux conntrack is no longer your friend
- kube-proxy Subtleties: Debugging an Intermittent Connection Reset
- Racy conntrack and DNS lookup timeouts
To reduce CPU usage, the Linux kernel uses a mechanism called conntrack. In short, it keeps a list of NAT records in a special table. When the next packet arrives from the same pod for the same destination pod as before, the final IP address is not recalculated but is taken from the conntrack table.
Figure: How conntrack works
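The caching behaviour described above, and the way a stale record can outlive its pod, can be shown with a toy model. This is a deliberately simplified sketch and not the kernel's real logic: the class, the round-robin backend choice, and all IP addresses are hypothetical.

```python
class ToyConntrack:
    """A toy model of a NAT connection-tracking table."""

    def __init__(self, backends):
        self.backends = list(backends)  # live pod IPs behind the service IP
        self.table = {}                 # (client_ip, client_port, service_ip) -> pod IP
        self._next = 0

    def route(self, client_ip, client_port, service_ip):
        key = (client_ip, client_port, service_ip)
        if key not in self.table:
            # First packet of a flow: pick a backend and remember the choice.
            self.table[key] = self.backends[self._next % len(self.backends)]
            self._next += 1
        # Later packets reuse the cached translation without re-evaluating it.
        return self.table[key]

    def remove_backend(self, pod_ip, flush=True):
        """Remove a pod; flush=False models a failed cleanup of its entries."""
        self.backends.remove(pod_ip)
        if flush:
            self.table = {k: v for k, v in self.table.items() if v != pod_ip}


# Hypothetical addresses: three DNS pods behind one service IP.
ct = ToyConntrack(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
flow = ("10.0.1.5", 40000, "10.96.0.10")

first = ct.route(*flow)                # the flow is pinned to one pod
ct.remove_backend(first, flush=False)  # the pod goes away, the entry stays
stale = ct.route(*flow)                # still points at the removed pod

print(f"flow still routed to {stale}; live backends: {ct.backends}")
```

With `flush=True` the same flow is assigned to a live pod on its next packet, which is the effect that rewriting the conntrack table had during the incident.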
Conclusions
This was an example of one of our postmortems, with some useful links. In this article in particular, we share information that may be useful to other companies. That is why we are not afraid of making mistakes, and that is why we are making one of our postmortems public. Here are some more interesting public postmortems:
- GitLab: Postmortem of database outage of January 31
- Dropbox: Outage post-mortem
- Spotify: Spotify's Love/Hate Relationship with DNS
- Many others from this gist and the Kubernetes Failure Stories repository
- Also an example of a public postmortem from the SRE Book
Source: www.habr.com