Cian's Top Fails

Hello everyone!

My name is Nikita, and I am the team lead of Cian's engineering team. One of my responsibilities at the company is to reduce the number of infrastructure-related incidents in production to zero.
What is discussed below caused us a lot of pain, and the goal of this article is to keep others from repeating our mistakes, or at least to minimize their impact.

Preamble

A long time ago, when Cian consisted of monoliths and there was no hint of microservices yet, we measured the availability of the resource by checking 3-5 pages.

If they respond, everything is fine; if they do not respond for a long time, an alert fires. How long they had to be down for it to count as an incident was decided by people in meetings. The engineering team always took part in investigating an incident. When the investigation was finished, they wrote a postmortem: a kind of report sent by email in the format "what happened, how long it lasted, what we did at the time, what we will do in the future".

The main pages of the site, or how we understand that we have hit bottom

 
To get some sense of an error's priority, we identified the site pages that are most critical to business functionality. On those pages we count the number of successful and unsuccessful requests and timeouts. This is how we measure uptime.

Let's say we found that a number of sections of the site are super-important because they are responsible for the main service: searching for ads and submitting them. If the share of failed requests exceeds 1%, this is a critical incident. If the error rate exceeds 0.1% for 15 minutes during prime time, this is also considered a critical incident. These criteria cover most incidents; the rest are beyond the scope of this article.
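
The two criteria above can be written down as a small check. This is only a minimal sketch, not our production code: the `Window` type, its field names, and the idea of feeding it pre-aggregated counts are assumptions made for illustration; only the 1%, 0.1%, and 15-minute numbers come from the text.

```python
from dataclasses import dataclass

CRITICAL_ERROR_RATE = 0.01       # more than 1% failed requests
PRIME_TIME_ERROR_RATE = 0.001    # more than 0.1% during prime time
PRIME_TIME_WINDOW_MINUTES = 15   # ...sustained for 15 minutes


@dataclass
class Window:
    """Aggregated counters for one observation window (hypothetical shape)."""
    ok: int            # successful requests
    failed: int        # unsuccessful requests plus timeouts
    minutes: int       # length of the window
    prime_time: bool   # whether the window falls into prime time

    @property
    def error_rate(self) -> float:
        total = self.ok + self.failed
        return self.failed / total if total else 0.0


def is_critical(window: Window) -> bool:
    """Apply the two criteria from the text to one window."""
    if window.error_rate > CRITICAL_ERROR_RATE:
        return True
    return (
        window.prime_time
        and window.minutes >= PRIME_TIME_WINDOW_MINUTES
        and window.error_rate > PRIME_TIME_ERROR_RATE
    )
```

For example, `is_critical(Window(ok=9990, failed=20, minutes=15, prime_time=True))` is true because of the prime-time rule, while the same counters outside prime time are not critical.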

Cian's top incidents

So, we have definitely learned to detect the fact that an incident has occurred.

Now every incident is described in detail and recorded in a Jira epic. By the way, we created a separate project for this, called FAIL, in which only epics can be created.

If you collect all the fails of the past few years, the leaders are:

  • incidents related to mssql;
  • incidents caused by external factors;
  • admin errors.

Let's look in more detail at the administrators' mistakes, as well as at some other interesting failures.

Fifth place - "Putting things in order in DNS"

It was a stormy Tuesday. We decided to put the DNS cluster in order.

I wanted to move the internal DNS servers from bind to powerdns and to allocate completely separate servers for this, with nothing on them except DNS.

We placed one DNS server in each location of our DCs, and the moment came to move the zones from bind to powerdns and switch the infrastructure over to the new servers.

In the middle of the move, of all the servers listed in the local caching binds on every machine, only one remained, and it was in the data center in St. Petersburg. This DC had originally been declared non-critical for us, but it suddenly became a single point of failure.
It was exactly during this migration window that the link between Moscow and St. Petersburg went down. We were effectively left without DNS for five minutes and came back up when the hosting provider fixed the problem.

Conclusions:

We used to neglect external factors when preparing for maintenance work; now they are on the list of things we prepare for. We also now aim to have every component reserved at n-2, so that during the work we can lower this level to n-1.

  • When drawing up a plan of work, mark the points where the service could fail, and think through in advance a scenario in which everything goes "from bad to worse".
  • Distribute internal DNS servers across different geolocations/data centers/racks/switches/power inputs.
  • On every server, install a local caching DNS server that forwards requests to the main DNS servers and, if they are unavailable, answers from the cache.
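
The last point, "forward to the main servers, answer from the cache if they are unavailable", can be illustrated with a toy model. This is a sketch of the idea only, not a real DNS server: the `CachingResolver` class, its parameters, and the convention that an upstream is a callable raising `OSError` on failure are all assumptions for the example.

```python
import time


class CachingResolver:
    """Toy model of a local caching resolver that can serve stale answers."""

    def __init__(self, upstreams, ttl=60.0, clock=time.monotonic):
        # upstreams: callables taking a name and returning an address;
        # each one is expected to raise OSError when it is unreachable.
        self.upstreams = upstreams
        self.ttl = ttl
        self.clock = clock
        self.cache = {}  # name -> (address, time the answer was stored)

    def resolve(self, name):
        cached = self.cache.get(name)
        # A fresh cache entry is answered without touching the upstreams.
        if cached and self.clock() - cached[1] < self.ttl:
            return cached[0]
        # Otherwise try every upstream in turn.
        for upstream in self.upstreams:
            try:
                address = upstream(name)
            except OSError:
                continue
            self.cache[name] = (address, self.clock())
            return address
        # All upstreams are down: fall back to the stale answer, if any.
        if cached:
            return cached[0]
        raise LookupError(f"no upstream reachable and no cached answer for {name}")
```

With several upstreams in different locations, the loss of one link only costs a retry; with all of them gone, names that were resolved earlier keep working until the cache is emptied.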

Fourth place - "Putting things in order in Nginx"

One fine day our team decided that "enough is enough", and the refactoring of the nginx configs began. The main goal was to bring the configs to an intuitive structure. Previously everything had "grown historically" and followed no logic. Now each server_name has been moved to a file of the same name, and all the configs are distributed across folders. By the way, the config contains 253949 lines or 7836520 characters and takes up almost 7 megabytes. The top-level structure:

Nginx structure

├── access
│   ├── allow.list
...
│   └── whitelist.conf
├── geobase
│   ├── exclude.conf
...
│   └── geo_ip_to_region_id.conf
├── geodb
│   ├── GeoIP.dat
│   ├── GeoIP2-Country.mmdb
│   └── GeoLiteCity.dat
├── inc
│   ├── error.inc
...
│   └── proxy.inc
├── lists.d
│   ├── bot.conf
...
│   ├── dynamic
│   └── geo.conf
├── lua
│   ├── cookie.lua
│   ├── log
│   │   └── log.lua
│   ├── logics
│   │   ├── include.lua
│   │   ├── ...
│   │   └── utils.lua
│   └── prom
│       ├── stats.lua
│       └── stats_prometheus.lua
├── map.d
│   ├── access.conf
│   ├── .. 
│   └── zones.conf
├── nginx.conf
├── robots.txt
├── server.d
│   ├── cian.ru
│   │   ├── cian.ru.conf
│   │   ├── ...
│   │   └── my.cian.ru.conf
├── service.d
│   ├── ...
│   └── status.conf
└── upstream.d
    ├── cian-mcs.conf
    ├── ...
    └── wafserver.conf

Things became much better, but in the process of renaming and redistributing the configs, some of them ended up with the wrong extension and were not picked up by the include *.conf directive. As a result, some hosts became unavailable and returned a 301 to the main page. Because the response code was not 5xx/4xx, this was not noticed immediately, only in the morning. After that we started writing tests to check infrastructure components.
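
A cheap guard against this particular mistake is to look for files that the include pattern would silently skip. The sketch below assumes that every file under the listed directories is supposed to be pulled in through `*.conf`; the function name and the choice of directories are illustrative, not part of our tooling.

```python
import fnmatch
import os


def find_unincluded(root, include_dirs, pattern="*.conf"):
    """Return files under include_dirs whose names do not match the pattern."""
    missed = []
    for directory in include_dirs:
        for dirpath, _dirnames, filenames in os.walk(os.path.join(root, directory)):
            for filename in filenames:
                if not fnmatch.fnmatch(filename, pattern):
                    full_path = os.path.join(dirpath, filename)
                    # Report paths relative to the config root for readability.
                    missed.append(os.path.relpath(full_path, root))
    return sorted(missed)
```

Run before a reload, for example as `find_unincluded("/etc/nginx", ["server.d", "upstream.d"])`, a non-empty result means that some config is about to be ignored.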

Conclusions:

  • Structure your configs properly (not only nginx) and think the structure through at an early stage of the project. This makes them easier for the team to understand, which in turn reduces TTM.
  • Write tests for some infrastructure components. For example: check that all key server_names return the correct status + response body. It is enough to have just a few scripts at hand that check the basic functions of a component, so that you do not have to frantically remember at 3 a.m. what else needs to be checked.
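
A test of the "status + response body" kind might look like the sketch below. The URLs, the expected markers, and the `check_hosts` helper are hypothetical; redirects are deliberately not followed so that a 301 to the main page shows up as a failure instead of a successful 200.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so that a 301 is visible to the caller."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch(url, timeout=5.0):
    """Return (status, body) for url without following redirects."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 3xx/4xx/5xx responses arrive here as exceptions.
        return err.code, err.read().decode("utf-8", "replace")


def check_hosts(expectations, fetcher=fetch):
    """expectations maps url -> (expected status, substring expected in body)."""
    failures = []
    for url, (status, marker) in expectations.items():
        got_status, body = fetcher(url)
        if got_status != status:
            failures.append(f"{url}: expected {status}, got {got_status}")
        elif marker not in body:
            failures.append(f"{url}: body lacks {marker!r}")
    return failures
```

An empty list means every host answered as expected; anything else is worth an alert even though no 5xx was ever returned.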

Third place - "Cassandra suddenly ran out of space"

The data grew steadily, and everything was fine until the moment when repairs of large keyspaces started failing in the Cassandra cluster because compaction could not run on them.

One stormy day the cluster almost turned into a pumpkin, namely:

  • about 20% of the total space was left in the cluster;
  • it was impossible to add nodes properly, because cleanup does not go through after a node is added when the partitions are short of space;
  • performance was gradually dropping because compaction was not working;
  • the cluster was in emergency mode.

The way out: we added 5 more nodes without cleanup, and then began systematically taking the nodes that had run out of space out of the cluster and bringing them back in as empty nodes. This took far more time than we would have liked. There was a risk of partial or complete unavailability of the cluster.

Conclusions:

  • On all cassandra servers, no more than 60% of the space on each partition should be occupied.
  • They should be loaded to no more than 50% cpu.
  • Do not forget about capacity planning; it has to be thought through for each component, based on its specifics.
  • The more nodes in the cluster, the better. Servers that hold a small amount of data are moved faster, and such a cluster is easier to revive.
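
The 60% rule is easy to turn into a periodic check. The sketch below uses the standard `shutil.disk_usage` call; the mount points passed in and the `over_limit` helper itself are assumptions for illustration, and in practice the result would feed whatever alerting you already have.

```python
import shutil

MAX_DISK_FRACTION = 0.60  # no more than 60% of each partition in use


def over_limit(paths, limit=MAX_DISK_FRACTION, disk_usage=shutil.disk_usage):
    """Return {path: used fraction} for partitions filled beyond the limit."""
    offenders = {}
    for path in paths:
        usage = disk_usage(path)  # named tuple with total, used, free (bytes)
        fraction = usage.used / usage.total
        if fraction > limit:
            offenders[path] = round(fraction, 3)
    return offenders
```

Calling `over_limit(["/var/lib/cassandra"])` on each node (the path is a guess at a typical data directory) returns an empty dict while there is still enough headroom for compaction and cleanup.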

Second place - "Data disappeared from the consul key-value storage"

For service discovery we, like many others, use consul. But we also used its key-value storage for blue-green deployment of the monolith. It stores information about active and inactive upstreams, which swap places during a deploy. For this purpose a deploy service was written that interacted with the KV. At some point the data in the KV disappeared. It was restored from memory, but with a number of errors. As a result, during a deploy the load was distributed unevenly across the upstreams, and we got a lot of 502 errors because the backends were overloaded on CPU. In the end we moved from consul KV to postgres, where the data is no longer so easy to delete.

Conclusions:

  • Services without any authorization should not hold data that is critical to the operation of the site. For example, if you have no authorization in ES, it is better to deny access at the network level from everywhere it is not needed, leave only the necessary sources, and also set action.destructive_requires_name: true.
  • Practice your backup and recovery mechanism in advance. For example, prepare a script ahead of time (for example, in python) that can both back up and restore.
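
Such a script could look roughly like this for consul KV. It is a sketch under assumptions, not the script we used: it talks to Consul's HTTP API (`GET /v1/kv/?recurse=true` for listing, `PUT /v1/kv/<key>` for writing), assumes a local agent at `127.0.0.1:8500` without ACL tokens, and stores values base64-encoded exactly as the API returns them.

```python
import base64
import json
import urllib.request
from urllib.parse import quote

CONSUL_URL = "http://127.0.0.1:8500"  # assumption: local agent, no ACL token


def decode_entries(entries):
    """Turn Consul's recursive listing into {key: base64-encoded value}."""
    # Consul returns null for empty values; store them as empty strings.
    return {entry["Key"]: entry["Value"] or "" for entry in entries}


def backup(path, base_url=CONSUL_URL):
    """Dump the whole KV tree into a JSON file."""
    with urllib.request.urlopen(f"{base_url}/v1/kv/?recurse=true") as resp:
        entries = json.load(resp)
    with open(path, "w") as fh:
        json.dump(decode_entries(entries), fh, indent=2, sort_keys=True)


def restore(path, base_url=CONSUL_URL):
    """Write every key from the JSON file back into the KV store."""
    with open(path) as fh:
        snapshot = json.load(fh)
    for key, b64_value in snapshot.items():
        request = urllib.request.Request(
            f"{base_url}/v1/kv/{quote(key)}",
            data=base64.b64decode(b64_value),
            method="PUT",
        )
        urllib.request.urlopen(request).close()
```

The important part is not the script itself but running `restore` against a test instance from time to time, so that the first real restore is not also the first attempt.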

First place - "Captain Non-Obvious"

At some point we noticed uneven load distribution across nginx upstreams in cases where there were 10+ servers in the backend. Because round-robin sends requests from the 1st upstream to the last in order, and starts over after every nginx reload, the first upstreams always received more requests than the rest. This became more noticeable as the volume of traffic grew. Simply upgrading nginx to enable random did not work: we would have had to redo a bunch of lua code that did not run on version 1.15 (at the time). We had to patch our nginx 1.14.2 and add random support to it. This solved the problem. This bug wins the "Captain Non-Obvious" nomination.
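
The effect is easy to reproduce with a toy simulation. This is a deliberately simplified model, not nginx's actual balancing algorithm: it assumes plain round-robin that restarts from the first backend after every reload, and compares it with picking a backend at random for each request. The function and parameter names are made up for the example.

```python
import random


def simulate(backends, reloads, requests_per_reload, method="round_robin", seed=0):
    """Count requests per backend across several reloads (simplified model)."""
    rng = random.Random(seed)
    hits = [0] * backends
    for _reload in range(reloads):
        position = 0  # round-robin starts from the first backend after a reload
        for _request in range(requests_per_reload):
            if method == "random":
                hits[rng.randrange(backends)] += 1
            else:
                hits[position] += 1
                position = (position + 1) % backends
    return hits
```

With 10 backends, 100 reloads, and only 3 requests between reloads, `simulate(10, 100, 3)` puts all 300 requests on the first three backends. Real traffic is far denser than that, but the leftover requests of every cycle still land on the head of the list, and with frequent reloads the skew adds up.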

Conclusions:

Investigating this bug was very interesting and exciting :)

  • Set up your monitoring so that it helps you find such fluctuations quickly. For example, you can use ELK to monitor rps on each backend of each upstream and to monitor their response time from nginx's point of view. In our case this is what helped us identify the problem.
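
Whatever the storage, the check itself is simple: count requests per backend and flag the ones that stray too far from the average. The sketch below assumes you already have a list of backend addresses, one per request (for example, the value of nginx's `$upstream_addr` extracted from the access log, with a single address per request); the helper name and the 50% tolerance are arbitrary choices for the example.

```python
from collections import Counter
from statistics import mean


def skewed_backends(upstream_addrs, tolerance=0.5):
    """Return {backend: request count} for backends far from the average."""
    counts = Counter(upstream_addrs)
    if not counts:
        return {}
    average = mean(counts.values())
    return {
        addr: count
        for addr, count in counts.items()
        if abs(count - average) > tolerance * average
    }
```

Note that a backend that received no requests at all never appears in the input, so a complete check should also compare the set of seen addresses with the list of configured ones.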

In the end, most of these failures could have been avoided with a more scrupulous approach to what you are doing. You should always keep Murphy's law in mind: anything that can go wrong will go wrong, and build your components with that in mind.

source: www.habr.com
