How we at CIAN tamed terabytes of logs

Hi everyone, my name is Alexander. I work at CIAN as an engineer and handle system administration and automation of infrastructure processes. In the comments on one of our previous articles, we were asked where we get 4 TB of logs per day and what we do with them. Yes, we have a lot of logs, and a separate infrastructure cluster was built to process them, which lets us resolve problems quickly. In this article I will describe how we adapted it over the course of a year to cope with a constantly growing flow of data.

Where we started

Over the past few years the load on cian.ru grew very quickly, and by the third quarter of 2018 traffic had reached 11.2 million unique users per month. At that time we were losing up to 40% of logs at critical moments, which meant we could not deal with incidents quickly and spent a lot of time and effort resolving them. Often we could not find the cause of a problem at all, and it would recur some time later. It was hell, and something had to be done about it.

At the time we stored logs in a cluster of 10 data nodes running ElasticSearch 5.5.2 with standard index settings. It had been introduced more than a year earlier as a popular and affordable solution: back then the log flow was not that large, and there was no point in coming up with non-standard configurations.

Incoming logs were processed by Logstash listening on different ports on five ElasticSearch coordinators. Every index, regardless of size, consisted of five shards. Hourly and daily rotation was configured, and as a result about 100 new shards appeared in the cluster every hour. While there were not many logs, the cluster coped well and nobody paid attention to its settings.

Challenges of rapid growth

The volume of generated logs grew very quickly because two processes overlapped. On the one hand, the number of users of the service kept growing. On the other hand, we began actively moving to a microservice architecture, sawing up our old monoliths written in C# and Python. Dozens of new microservices that replaced parts of the monolith generated far more logs for the infrastructure cluster.

It was this scaling that brought us to the point where the cluster became practically unmanageable. When logs started arriving at a rate of 20 thousand messages per second, the frequent and pointless rotation pushed the number of shards up to 6 thousand, with more than 600 shards per node.
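The per-node figure follows directly from the totals in this section. A minimal Python sketch of that arithmetic, assuming the shards were spread evenly over the 10 data nodes mentioned earlier (the function name is purely illustrative):

```python
def shards_per_node(total_shards: int, data_nodes: int) -> float:
    """Average number of shards each data node has to hold."""
    if data_nodes <= 0:
        raise ValueError("data_nodes must be positive")
    return total_shards / data_nodes


# Figures from the text: about 6,000 shards on a 10-node cluster.
print(shards_per_node(6000, 10))  # 600.0
```

The same division shows why a single node failure hurts so much: every shard held by that node has to be relocated at once.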

This led to problems with RAM allocation, and when a node crashed, all of its shards started relocating at the same time, multiplying traffic and loading the other nodes, which made writing data to the cluster nearly impossible. During those periods we were left without logs. And if a server had a problem, we lost 1/10 of the cluster outright. The large number of small indices added to the difficulty.

Without logs we could not understand the causes of an incident and could sooner or later step on the same rake again. In our team's philosophy that was unacceptable, because all of our working practices are designed to do exactly the opposite: never repeat the same problems. For that we needed the full volume of logs delivered almost in real time, since the on-call engineering team monitored alerts not only from metrics but also from logs. To give a sense of the scale of the problem, the total volume of logs at that time was about 2 TB per day.

We set ourselves the goal of eliminating log loss completely and reducing the delivery time to the ELK cluster to a maximum of 15 minutes during force majeure (we later relied on this figure as an internal KPI).

New rotation mechanism and hot-warm nodes

We began the cluster transformation by upgrading ElasticSearch from 5.5.2 to 6.4.3. Our version 5 cluster had gone down once again, so we decided to shut it off and upgrade it completely, since there were no logs anyway. As a result, the transition took only a couple of hours.

The largest change at this stage was the introduction of Apache Kafka as an intermediate buffer, deployed on the three nodes with a coordinator. The message broker saved us from losing logs during problems with ElasticSearch. At the same time we added 2 nodes to the cluster and switched to a hot-warm architecture with three "hot" nodes placed in different racks in the data center. Using an index mask, we routed to them the logs that must not be lost under any circumstances: nginx logs and application error logs. Minor logs (debug, warning, and so on) went to the remaining nodes, and after 24 hours the "important" logs were moved there from the "hot" nodes.
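One common way to pin indices to "hot" nodes by mask is an index template with an allocation filter. The Python sketch below only builds such a template body. It assumes each node is tagged with a custom attribute called box_type (a widespread convention, not something the article confirms), and the index name masks are hypothetical:

```python
import json

# Assumption: nodes are started with a custom attribute such as
# "node.attr.box_type: hot" or "node.attr.box_type: warm".
HOT_PATTERNS = ["nginx-*", "*-error-*"]  # hypothetical index name masks


def hot_template(patterns):
    """Build a template body that places new matching indices on hot nodes."""
    return {
        "index_patterns": list(patterns),
        "settings": {
            # Allocation filtering: only nodes with box_type=hot may hold the shards.
            "index.routing.allocation.require.box_type": "hot",
        },
    }


print(json.dumps(hot_template(HOT_PATTERNS), indent=2))
```

Indices that match no "hot" mask would need a similar template requiring box_type=warm, so that minor logs never land on the hot tier.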

To avoid increasing the number of small indices, we switched from time-based rotation to the rollover mechanism. There was a lot of information on forums saying that rotation by index size is very unreliable, so we decided to rotate by the number of documents in the index. We analyzed each index and recorded the document count at which rotation should trigger. This way we arrived at the optimal shard size: no more than 50 GB.
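The document-count threshold for an index can be estimated from the target shard size. This is a rough Python sketch, assuming you have measured an average on-disk document size for the index; the 1 KiB figure and the function name are illustrative, not values from the article:

```python
def max_docs_for_rollover(target_shard_gb, avg_doc_bytes, primary_shards):
    """Estimate the max_docs rollover threshold that keeps each primary
    shard at or below target_shard_gb."""
    if avg_doc_bytes <= 0 or primary_shards <= 0:
        raise ValueError("avg_doc_bytes and primary_shards must be positive")
    target_bytes = target_shard_gb * 1024 ** 3
    return int(target_bytes * primary_shards // avg_doc_bytes)


# 50 GB per shard (from the text), five primary shards,
# and a hypothetical average document size of 1 KiB on disk.
print(max_docs_for_rollover(50, 1024, 5))  # 262144000
```

Because document sizes differ widely between log types, each index needs its own measured average, which matches the per-index analysis described above.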

Cluster optimization

However, we had not completely got rid of the problems. Unfortunately, small indices still appeared: they never reached the specified volume, were not rotated, and were then deleted by the global cleanup of indices older than three days, since we had removed rotation by date. This led to data loss, because the index disappeared from the cluster completely, and an attempt to write to a non-existent index broke the logic of the curator we used for management. The write alias was turned into an index and broke the rollover logic, causing some indices to grow uncontrollably up to 600 GB.

For example, for this rotation config:

curator-elk-rollover.yaml

---
actions:
  1:
    action: rollover
    options:
      name: "nginx_write"
      conditions:
        max_docs: 100000000
  2:
    action: rollover
    options:
      name: "python_error_write"
      conditions:
        max_docs: 10000000

If the rollover alias did not exist, an error occurred:

ERROR     alias "nginx_write" not found.
ERROR     Failed to complete action: rollover.  <type 'exceptions.ValueError'>: Unable to perform index rollover with alias "nginx_write".
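A pre-flight check before running the rollover can catch both failure modes: the alias is missing, or its name has been taken over by a regular index. The Python sketch below works on a dictionary shaped like the response of GET /_alias; the helper name and the sample index names are hypothetical:

```python
def broken_write_aliases(expected_aliases, alias_response):
    """Return {alias: reason} for write aliases a rollover would fail on.

    alias_response has the shape returned by GET /_alias:
    {index_name: {"aliases": {alias_name: {...}}}}
    """
    live_aliases = set()
    for info in alias_response.values():
        live_aliases.update(info.get("aliases", {}))

    problems = {}
    for alias in expected_aliases:
        if alias in alias_response:
            # The name is now a concrete index, so rollover cannot use it.
            problems[alias] = "exists as a concrete index, not an alias"
        elif alias not in live_aliases:
            problems[alias] = "alias not found"
    return problems


sample = {
    "nginx_write": {"aliases": {}},  # alias name taken over by an index
    "python_error-000007": {"aliases": {"python_error_write": {}}},
}
print(broken_write_aliases(["nginx_write", "python_error_write"], sample))
```

Running such a check ahead of the rollover action would turn a silent data problem into an explicit alert.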

We left the solution to this problem for the next iteration and took up another issue: we switched to pull logic for Logstash, which processes incoming logs (stripping unnecessary information and enriching them). We put it in docker, launched via docker-compose, and placed logstash-exporter alongside it, which exposes metrics to Prometheus for operational monitoring of the log stream. This gave us the ability to smoothly change the number of logstash instances responsible for processing each type of log.

While we were improving the cluster, cian.ru traffic grew to 12.8 million unique users per month. As a result, our changes lagged slightly behind the changes in production, and we found that the "warm" nodes could not cope with the load and slowed down the entire log delivery. We received "hot" data without failures, but we had to intervene in the delivery of the rest and perform a manual rollover to distribute indices evenly.

At the same time, scaling and reconfiguring the logstash instances in the cluster was complicated by the fact that it was a local docker-compose and every action was performed by hand (to add new endpoints, you had to go through all the servers manually and run docker-compose up -d on each one).

Log redistribution

In September of this year we were still sawing up the monolith, the load on the cluster kept increasing, and the log flow was approaching 30 thousand messages per second.

We began the next iteration with a hardware upgrade. We went from five coordinators to three, replaced the data nodes, and came out ahead on both cost and storage capacity. For the nodes we use two configurations:

  • "Hot" nodes: E3-1270 v6 / 960 GB SSD / 32 GB x 3 x 2 (3 for Hot1 and 3 for Hot2).
  • "Warm" nodes: E3-1230 v6 / 4 TB SSD / 32 GB x 4.

In this iteration we moved the index with microservice access logs, which takes up as much space as the front-line nginx logs, to a second group of three "hot" nodes. We now keep data on the "hot" nodes for 20 hours and then move it to the "warm" nodes with the rest of the logs.
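The 20-hour move to the "warm" tier boils down to an age check followed by a settings change on each index that is due. This is a minimal Python sketch of the selection step, assuming nodes are tagged with a custom box_type attribute and using hypothetical index names and timestamps:

```python
from datetime import datetime, timedelta, timezone

HOT_RETENTION = timedelta(hours=20)  # retention on "hot" nodes, from the text

# Assumed node attribute name (box_type); the real cluster may use another one.
WARM_SETTINGS = {"index.routing.allocation.require.box_type": "warm"}


def indices_due_for_warm(creation_times, now):
    """Return names of indices that have spent at least HOT_RETENTION
    on hot nodes, oldest first."""
    due = [
        (created, name)
        for name, created in creation_times.items()
        if now - created >= HOT_RETENTION
    ]
    return [name for _, name in sorted(due)]


now = datetime(2019, 9, 2, 12, 0, tzinfo=timezone.utc)
created = {
    "nginx-000041": now - timedelta(hours=26),  # hypothetical index names
    "nginx-000042": now - timedelta(hours=3),
}
for name in indices_due_for_warm(created, now):
    # In a real job this body would be sent to PUT /<index>/_settings.
    print(name, WARM_SETTINGS)
```

Changing the allocation requirement is enough for the cluster to relocate the shards on its own, so the job itself stays small.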

We solved the problem of disappearing small indices by reconfiguring their rotation. Indices are now rotated every 23 hours in any case, even if they contain little data. This slightly increased the number of shards (to about 800), but from the standpoint of cluster performance that is tolerable.
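The combined rule can be expressed as a single predicate: roll over when either the document limit or the age limit is reached. The Python sketch below mirrors how rollover conditions combine (any one satisfied condition is enough); the function name and sample values are illustrative:

```python
def should_rollover(doc_count, age_hours, max_docs, max_age_hours=23):
    """True when at least one rollover condition is satisfied."""
    return doc_count >= max_docs or age_hours >= max_age_hours


# A quiet index that never reaches max_docs still rotates after 23 hours.
print(should_rollover(doc_count=2_000_000, age_hours=23.5, max_docs=10_000_000))  # True
print(should_rollover(doc_count=2_000_000, age_hours=5, max_docs=10_000_000))     # False
```

With the age condition in place, a small index is always rolled over before the three-day cleanup can delete the one the write alias points to.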

As a result, the cluster ended up with six "hot" and only four "warm" nodes. This causes a slight delay on queries over long time intervals, but increasing the number of nodes in the future will solve this problem.

This iteration also fixed the lack of semi-automatic scaling. To do this, we deployed an infrastructure Nomad cluster, similar to the one we already run in production. For now the number of Logstash instances does not change automatically depending on load, but we will get to that.

Future plans

The resulting configuration scales well, and we now store 13.3 TB of data: all logs for 4 days, which is what we need for urgent analysis of alerts. We convert some of the logs into metrics, which we store in Graphite. To make the engineers' work easier, we have metrics for the infrastructure cluster and scripts for semi-automatic repair of common problems. After increasing the number of data nodes, which is planned for next year, we will extend data retention from 4 to 7 days. That will be enough for operational work, since we always try to investigate incidents as quickly as possible, and for long-term investigations there is telemetry data.
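A back-of-the-envelope projection shows what the move from 4 to 7 days implies for storage. This Python sketch assumes the daily log volume stays at its current level, which the traffic growth described in this article makes an optimistic assumption:

```python
def projected_storage_tb(current_tb, current_days, target_days):
    """Scale stored volume linearly from current_days to target_days of retention."""
    if current_days <= 0:
        raise ValueError("current_days must be positive")
    daily_tb = current_tb / current_days
    return daily_tb * target_days


# 13.3 TB for 4 days (from the text) extended to 7 days of retention.
print(round(projected_storage_tb(13.3, 4, 7), 1))  # 23.3
```

That is roughly 10 TB of additional capacity, consistent with the plan to add data nodes first.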

In October 2019, cian.ru traffic had already grown to 15.3 million unique users per month. This became a serious test of the architecture of our log delivery solution.

We are now preparing to upgrade ElasticSearch to version 7. For this, however, we will have to update the mappings of many indices in ElasticSearch, since they migrated from version 5.5 and were declared deprecated in version 6 (in version 7 they simply do not exist). This means that during the upgrade there will certainly be some kind of force majeure that leaves us without logs while the issue is being resolved. In version 7, we are most looking forward to Kibana with its improved interface and new filters.

We achieved our main goal: we stopped losing logs and reduced the downtime of the infrastructure cluster from 2-3 crashes per week to a couple of hours of maintenance work per month. All this work is almost invisible in production. However, we can now determine exactly what is happening with our service, do it quickly and calmly, and not worry that logs will be lost. All in all, we are satisfied, happy, and preparing for new feats, which we will talk about later.

Source: www.habr.com
