Failover Cluster PostgreSQL + Patroni. Implementation Experience

In this article I will describe how we approached PostgreSQL fault tolerance, why it mattered to us, and what we ended up with.

We run a heavily loaded service: 2.5 million users worldwide and 50K+ active users every day. The servers are hosted in Amazon in a single region, Ireland: 100+ different servers are always running, and almost 50 of them hold databases.

The entire backend is a large monolithic stateful Java application that keeps a persistent websocket connection with each client. When several users work on the same board at the same time, they all see changes in real time, because we write every change to the database. We send about 10K requests per second to our databases. At peak load we write 80-100K requests per second to Redis.

Why we moved from Redis to PostgreSQL

Initially our service worked with Redis, a key-value store that keeps all data in the server's RAM.

Pros of Redis:

  1. High response speed, because everything is stored in memory;
  2. Backup and replication are easy.

Cons of Redis for us:

  1. There are no real transactions. We tried to emulate them at the application level. Unfortunately, this did not always work well and required writing very complex code.
  2. The amount of data is limited by the amount of memory. As the data grows, memory grows with it, and eventually we hit the limits of the chosen instance type, which in AWS means stopping our service to change the instance type.
  3. Latency must be kept low at all times, because we have a very large number of requests. The optimal latency for us is 17-20 ms. At 30-40 ms we get long responses to requests from our application and the service degrades. Unfortunately, this happened to us in September 2018, when one of the Redis instances for some reason showed latency twice as high as usual. To fix it, we stopped the service in the middle of the day for unscheduled maintenance and replaced the problematic Redis instance.
  4. It is easy to end up with inconsistent data even from minor bugs in the code, and then spend a lot of time writing code to repair that data.

We weighed the cons and realized that we needed to move to something more convenient, with proper transactions and less dependence on latency. We did some research, analyzed many options, and chose PostgreSQL.

We have been moving to the new database for 1.5 years and have migrated only a small part of the data, so for now we work with Redis and PostgreSQL simultaneously. The stages of moving and switching data between the databases are described in more detail in my colleague's article.

When we started the migration, our application worked with the databases directly and accessed the Redis master and the PostgreSQL master. At that point the database scheme was simple: the PostgreSQL cluster consisted of a master and a replica with asynchronous replication.

Using PgBouncer

While we were migrating, the product kept growing: the number of users and the number of servers working with PostgreSQL increased, and we started running out of connections. PostgreSQL creates a separate process for each connection, and each process consumes resources. The number of connections can be raised only up to a point; beyond it, database performance is likely to suffer. The ideal option in this situation is to put a connection manager in front of the database.

We had two candidates for the connection manager: Pgpool and PgBouncer. The first does not support the transaction mode of working with the database, so we chose PgBouncer.

We set up the following scheme: our application accesses a single PgBouncer, behind it are the PostgreSQL masters, and behind each master is one replica with asynchronous replication.

At the same time, we could not keep the entire volume of data in a single PostgreSQL, and the speed of working with the database was important to us, so we started sharding PostgreSQL at the application level. The scheme described above suits this well: when a new PostgreSQL shard is added, it is enough to update the PgBouncer configuration, and the application can immediately work with the new shard.
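The article does not say how the application decides which shard holds a board, so the following is only a minimal sketch of one possible approach: an explicit directory that remembers the shard of every board, so that adding a shard never moves existing data. The class name, the shard names and the "least populated shard" rule are all assumptions made for illustration.

```python
from collections import Counter


class ShardDirectory:
    """Maps each board to the shard that stores it (hypothetical design)."""

    def __init__(self, shards):
        self.shards = list(shards)
        self.board_to_shard = {}

    def add_shard(self, name):
        # Existing boards stay where they are; only new boards can land here.
        self.shards.append(name)

    def assign(self, board_id):
        # Reuse the recorded shard, otherwise pick the least populated one.
        if board_id not in self.board_to_shard:
            load = Counter(self.board_to_shard.values())
            self.board_to_shard[board_id] = min(self.shards, key=lambda s: load[s])
        return self.board_to_shard[board_id]
```

With a directory like this, the shard name returned by `assign` would simply be the database name the application asks PgBouncer for, which is why a new shard only needs a new entry in the PgBouncer configuration.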

PgBouncer failover

This scheme worked until the only PgBouncer instance died. We are in AWS, where all instances run on hardware that dies from time to time. In such cases the instance simply moves to new hardware and runs again. That is what happened to PgBouncer, but it became unavailable. The result was 25 minutes of downtime for our service. For such situations AWS recommends building redundancy on the user side, which we had not done at the time.

After that we thought seriously about the fault tolerance of both PgBouncer and the PostgreSQL clusters, because the same thing could happen to any instance in our AWS account.

We built the PgBouncer fault tolerance scheme as follows: all application servers access a Network Load Balancer, behind which there are two PgBouncers. Each PgBouncer points at the same PostgreSQL master for each shard. If the AWS incident repeats, all traffic is redirected through the other PgBouncer. Failover of the Network Load Balancer itself is provided by AWS.

This scheme also makes it easy to add new PgBouncer servers.

Creating a PostgreSQL failover cluster

To solve this problem we considered several options: self-written failover, repmgr, AWS RDS, and Patroni.

Self-written scripts

They can monitor the master and, if it fails, promote the replica to master and update the PgBouncer configuration.

The advantage of this approach is maximum simplicity: you write the scripts yourself and understand exactly how they work.

Cons:

  • The master may not have died; there may have been a network failure instead. The failover script, unaware of this, promotes the replica to master while the old master keeps working. As a result we get two servers in the master role and do not know which of them has the latest data. This situation is called split-brain;
  • We are left without a replica. In our configuration of a master and one replica, after the switchover the replica becomes the master and there are no replicas left, so a new replica has to be added manually;
  • The failover itself needs additional monitoring, and we have 12 PostgreSQL shards, which means monitoring 12 clusters. When the number of shards grows, you also have to remember to update the failover.

Self-written failover looks very complicated and requires non-trivial support. With a single PostgreSQL cluster it would be the simplest option, but it does not scale, so it does not suit us.
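To make the split-brain risk concrete, here is a deliberately naive sketch of the decision a self-written failover script might make. It is not code we ran; the function name and the threshold of three failed checks are assumptions for illustration.

```python
def should_promote(check_results, threshold=3):
    """Return True once the master failed `threshold` health checks in a row.

    check_results is a list of booleans, oldest first (True = master answered).
    """
    recent = check_results[-threshold:]
    # The monitor only knows that its own checks failed. A dead master and a
    # network partition between monitor and master look exactly the same here,
    # which is how a second master can appear while the first is still alive.
    return len(recent) == threshold and not any(recent)
```

Nothing in this decision involves the old master or a third party confirming the outcome, which is the gap that a distributed configuration store closes in tools such as Patroni.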

Repmgr

Replication Manager for PostgreSQL clusters can manage the operation of a PostgreSQL cluster. However, it does not offer automatic failover out of the box, so you would need to write your own wrapper on top of the ready-made solution. The result could turn out even more complicated than self-written scripts, so we did not even try Repmgr.

AWS RDS

It supports everything we need, knows how to make backups, and maintains a connection pool. It has automatic switchover: when the master dies, the replica becomes the new master and AWS points the DNS record at the new master, while the replicas can be located in different AZs.

The downsides include the lack of fine tuning. As an example of fine tuning: our instances have restrictions on TCP connections which, unfortunately, cannot be set in RDS:

net.ipv4.tcp_keepalive_time=10
net.ipv4.tcp_keepalive_intvl=1
net.ipv4.tcp_keepalive_probes=5
net.ipv4.tcp_retries2=3
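As a rough illustration of what these values buy, the sketch below computes how long an idle connection to a dead peer survives under the keepalive settings: the idle time before the first probe plus one interval per unanswered probe. The helper name is ours; tcp_retries2 is parsed but not used, because it governs retransmission of unacknowledged data rather than keepalive probes.

```python
def dead_peer_detection_seconds(settings: str) -> int:
    """Approximate time to drop an idle connection whose peer has vanished."""
    values = {}
    for line in settings.strip().splitlines():
        key, _, value = line.partition("=")
        values[key.strip()] = int(value)
    # Idle time before the first probe, then one interval per failed probe.
    return (values["net.ipv4.tcp_keepalive_time"]
            + values["net.ipv4.tcp_keepalive_intvl"]
            * values["net.ipv4.tcp_keepalive_probes"])


OUR_SETTINGS = """
net.ipv4.tcp_keepalive_time=10
net.ipv4.tcp_keepalive_intvl=1
net.ipv4.tcp_keepalive_probes=5
net.ipv4.tcp_retries2=3
"""

print(dead_peer_detection_seconds(OUR_SETTINGS))  # 10 + 1 * 5 = 15
```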

In addition, AWS RDS costs almost twice as much as a regular instance, which was the main reason for abandoning this solution.

Patroni

It is a Python template for managing PostgreSQL with good documentation, automatic failover, and source code on GitHub.

Pros of Patroni:

  • Every configuration parameter is documented, and it is clear how it works;
  • Automatic failover works out of the box;
  • It is written in Python, and since we write a lot of Python ourselves, it will be easier for us to deal with problems and perhaps even to contribute to the project;
  • It fully manages PostgreSQL: it lets you change the configuration on all nodes of the cluster at once, and if the cluster needs a restart to apply the new configuration, that can also be done through Patroni.

Cons:

  • The documentation does not make it clear how to work with PgBouncer properly. It is hard to call this a drawback, though, because Patroni's job is to manage PostgreSQL, and how connections reach Patroni is our problem;
  • There are few examples of rolling Patroni out on large volumes, while there are many examples of deployments from scratch.

As a result, we chose Patroni to build the failover cluster.

Patroni implementation process

Before Patroni we had 12 PostgreSQL shards, each configured as one master and one replica with asynchronous replication. The application servers accessed the databases through the Network Load Balancer, behind which stood two instances with PgBouncer, and behind them were all the PostgreSQL servers.

To implement Patroni we needed to choose a distributed configuration store. Patroni works with distributed configuration storage systems such as etcd, Zookeeper, and Consul. We already have a full-fledged Consul cluster in production, which works together with Vault and which we do not use for anything else. That was a great reason to start using Consul for its intended purpose.

How Patroni works with Consul

We have a Consul cluster of three nodes and a Patroni cluster consisting of a leader and a replica (in Patroni the master is called the cluster leader and the slaves are called replicas). Each instance of the Patroni cluster constantly sends information about the cluster state to Consul. Therefore Consul always knows the current configuration of the Patroni cluster and who the leader is at the moment.
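To show what "asking Consul who the leader is" can look like, here is a small sketch that decodes the body returned by Consul's KV HTTP API, which delivers values base64-encoded inside a JSON list. The key path `service/shard_01/leader` and the member name in the sample are assumptions for illustration; the actual path depends on the namespace and scope configured in Patroni.

```python
import base64
import json


def leader_from_kv_response(body: str) -> str:
    """Extract the leader name from a Consul KV GET response body."""
    entries = json.loads(body)
    # Consul returns a list of entries; the value itself is base64-encoded.
    return base64.b64decode(entries[0]["Value"]).decode("utf-8")


# Sample body shaped like a reply to GET /v1/kv/service/shard_01/leader
# (hypothetical key and member name).
sample_body = json.dumps([{
    "Key": "service/shard_01/leader",
    "Value": base64.b64encode(b"pg-shard01-a").decode("ascii"),
}])

print(leader_from_kv_response(sample_body))
```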


To connect Patroni to Consul it is enough to study the official documentation, which says that you need to specify the host in http or https format, depending on how we work with Consul, and optionally the connection scheme:

host: the host:port for the Consul endpoint, in format: http(s)://host:port
scheme: (optional) http or https, defaults to http

It looks simple, but this is where the pitfalls begin. We work with Consul over a secure https connection, so our connection config would look like this:

consul:
  host: https://server.production.consul:8080 
  verify: true
  cacert: {{ consul_cacert }}
  cert: {{ consul_cert }}
  key: {{ consul_key }}

But that does not work. On startup Patroni cannot connect to Consul, because it tries to go over http anyway.

The Patroni source code helped us sort out the problem. It is a good thing it is written in Python. It turns out that the host parameter is not parsed in any way, and the protocol must be specified in scheme. This is what a working configuration block for Consul looks like for us:

consul:
  host: server.production.consul:8080
  scheme: https
  verify: true
  cacert: {{ consul_cacert }}
  cert: {{ consul_cert }}
  key: {{ consul_key }}
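The pitfall comes down to where the protocol lives. Below is a small sketch, using only Python's standard urllib.parse, that splits a full URL into the separate host and scheme values used in the working block above. The function name is hypothetical and is not part of Patroni.

```python
from urllib.parse import urlsplit


def consul_settings_from_url(url: str) -> dict:
    """Split a full Consul URL into separate `host` and `scheme` values."""
    parts = urlsplit(url)
    # netloc keeps "host:port" together; the protocol goes into its own key.
    return {"host": parts.netloc, "scheme": parts.scheme}


print(consul_settings_from_url("https://server.production.consul:8080"))
```

A helper like this could sit in the tooling that renders the Patroni configuration, so that a URL copied from elsewhere never ends up in the host field unchanged.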

consul-template

So, we have chosen the configuration store. Now we need to understand how PgBouncer will switch its configuration when the leader changes in the Patroni cluster. The documentation has no answer to this question, because working with PgBouncer is, in principle, not described there.

While looking for a solution we found an article (unfortunately, I do not remember its title) which said that consul-template helped a lot in pairing PgBouncer with Patroni. This prompted us to investigate how consul-template works.

It turned out that consul-template constantly monitors the configuration of the PostgreSQL cluster in Consul. When the leader changes, it updates the PgBouncer configuration and sends a command to reload it.
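We do not reproduce our consul-template file here, but the rendering step it performs can be sketched in Python: given the current leader of every shard, produce the `[databases]` section of pgbouncer.ini. The shard names and addresses are made up for illustration; after writing the file, PgBouncer is told to re-read it (it accepts RELOAD on its admin console or a SIGHUP).

```python
def render_databases(leaders: dict) -> str:
    """Render a pgbouncer.ini [databases] section from shard -> (host, port)."""
    lines = ["[databases]"]
    for db_name in sorted(leaders):
        host, port = leaders[db_name]
        # Each shard is exposed to the application under its own database name.
        lines.append(f"{db_name} = host={host} port={port}")
    return "\n".join(lines) + "\n"


current_leaders = {
    "shard_02": ("10.0.0.2", 5432),  # hypothetical addresses
    "shard_01": ("10.0.0.1", 5432),
}
print(render_databases(current_leaders))
```

Sorting the shard names keeps the output stable, so the file only changes when a leader actually changes.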


The big plus of the template is that it is stored as code, so when adding a new shard it is enough to make a new commit and the template is updated automatically, in keeping with the Infrastructure as code principle.

New architecture with Patroni

As a result, we ended up with the following scheme:

All application servers access the balancer → behind it there are two PgBouncer instances → on each instance consul-template is running; it monitors the state of each Patroni cluster and keeps the PgBouncer config up to date, so that requests are sent to the current leader of each cluster.

Manual testing

Before rolling this scheme out, we ran it in a small test environment and checked how automatic switchover works. We opened a board, moved a sticker, and at that moment "killed" the leader of the cluster. In AWS this is as simple as shutting down the instance through the console.


The sticker snapped back within 10-20 seconds and then started moving normally again. This means that the Patroni cluster worked correctly: it changed the leader, sent the information to Consul, and consul-template immediately picked it up, replaced the PgBouncer configuration, and sent the reload command.

How to survive under high load and keep downtime minimal?

Everything works perfectly! But new questions arise: how will it work under high load? How do we roll everything out to production quickly and safely?

The test environment where we run load tests helps us answer the first question. It is completely identical to production in architecture and contains generated test data that is roughly equal in volume to production. We decide to simply "kill" one of the PostgreSQL masters during a test and see what happens. But before that it is important to check the automated rollout, because in this environment we have several PostgreSQL shards, so we get an excellent test of the configuration scripts before production.

Both tasks look ambitious, but we have PostgreSQL 9.6. Can we upgrade to 11.2 right away?

We decide to do it in two steps: first upgrade to 11.2, then launch Patroni.

PostgreSQL upgrade

To upgrade the PostgreSQL version quickly, use pg_upgrade with the -k (--link) option, which creates hard links on disk so that the data does not have to be copied. On databases of 300-400 GB the upgrade takes 1 second.

We have a lot of shards, so the upgrade has to be automated. To do this we wrote an Ansible playbook that handles the whole upgrade process for us:

/usr/lib/postgresql/11/bin/pg_upgrade \
  --link \
  --old-datadir='' --new-datadir='' \
  --old-bindir='' --new-bindir='' \
  --old-options=' -c config_file=' \
  --new-options=' -c config_file='

It is important to note that before starting the upgrade you must run it with the --check parameter to make sure that the upgrade is possible. Our script also swaps in the configs for the duration of the upgrade. The script completed in 30 seconds, which is an excellent result.
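The "check first, then upgrade" order can be sketched as follows. This is not our playbook: the function names are hypothetical and the paths are placeholders, while the pg_upgrade flags themselves (--link, --check, --old-datadir, --new-datadir, --old-bindir, --new-bindir) are the real ones.

```python
def pg_upgrade_command(old_data, new_data, old_bin, new_bin, check_only):
    """Build the pg_upgrade argument list; check_only adds --check."""
    cmd = [
        f"{new_bin}/pg_upgrade",
        "--link",
        f"--old-datadir={old_data}",
        f"--new-datadir={new_data}",
        f"--old-bindir={old_bin}",
        f"--new-bindir={new_bin}",
    ]
    if check_only:
        # --check only verifies that the clusters are compatible.
        cmd.append("--check")
    return cmd


def upgrade_plan(old_data, new_data, old_bin, new_bin):
    """Return the dry run followed by the real run, in that order."""
    args = (old_data, new_data, old_bin, new_bin)
    return [pg_upgrade_command(*args, check_only=True),
            pg_upgrade_command(*args, check_only=False)]


# Placeholder paths, purely for illustration.
plan = upgrade_plan("/path/to/old/data", "/path/to/new/data",
                    "/path/to/old/bin", "/path/to/new/bin")
for command in plan:
    print(" ".join(command))
```

An automation tool would run the second command only if the first one exits successfully.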

Launching Patroni

To solve the second problem, it is enough to look at the Patroni configuration. The official repository has a sample configuration with initdb, which is responsible for initializing a new database on the first start of Patroni. Since we already have a ready-made database, we simply removed this section from the configuration.

When we started installing Patroni on an already existing, running PostgreSQL cluster, we ran into a new problem: both servers started as the leader. Patroni knows nothing about the earlier state of the cluster and tries to start both servers as two separate clusters with the same name. To solve this problem, you need to delete the data directory on the slave:

rm -rf /var/lib/postgresql/

This must be done only on the slave!
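Because running that command on the wrong server destroys the only good copy of the data, automation around it benefits from a guard. The sketch below is an assumption about how such a guard could look, not part of our playbooks: it only builds the command, and refuses unless the caller states that the node is a replica.

```python
def replica_wipe_command(role: str, data_dir: str) -> list:
    """Return the rm command for a replica's data directory, or refuse."""
    if role != "replica":
        raise ValueError(f"refusing to wipe data on a node with role {role!r}")
    if data_dir.strip() in ("", "/"):
        raise ValueError("refusing to wipe an empty or root path")
    # The command is only built here; running it is left to the caller.
    return ["rm", "-rf", data_dir]
```

How the role is determined (for example by asking PostgreSQL whether it is in recovery) is deliberately left outside the sketch.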

When a clean replica is connected, Patroni takes a basebackup of the leader and restores it to the replica, and then catches up with the current state from the WAL logs.

Another difficulty we ran into is that all PostgreSQL clusters get the same name by default. When the clusters know nothing about each other, that is fine. But if you want to use Patroni, every cluster must have a unique name. The solution is to change the cluster name in the PostgreSQL configuration.
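A check like the following could run before the rollout to catch clusters that still share a name. It is a minimal sketch; the function name and the sample names are assumptions for illustration.

```python
from collections import Counter


def duplicate_cluster_names(names):
    """Return the cluster names that are used by more than one cluster."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Two shards still carry the same (hypothetical) default name.
print(duplicate_cluster_names(["main", "main", "shard_03"]))
```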

Load test

We launched a test that simulates user activity on boards. When the load reached our average daily value, we repeated exactly the same test: we switched off the instance with the PostgreSQL leader. Automatic failover worked as we expected: Patroni changed the leader, consul-template updated the PgBouncer configuration and sent the reload command. Our Grafana graphs showed delays of 20-30 seconds and a small number of errors from the servers related to the database connection. This is a normal situation; such values are acceptable for our failover and are definitely better than service downtime.
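We read the delay off Grafana, but the same number can be derived from any series of timestamped probe results. The sketch below assumes a hypothetical list of (timestamp in seconds, success flag) pairs and returns the longest stretch between the first failed probe and the next successful one.

```python
def longest_outage_seconds(samples):
    """Longest gap between a first failure and the next success."""
    longest = 0.0
    outage_start = None
    for timestamp, ok in samples:
        if not ok and outage_start is None:
            outage_start = timestamp            # outage begins
        elif ok and outage_start is not None:
            longest = max(longest, timestamp - outage_start)
            outage_start = None                 # outage ends
    return longest


# Hypothetical probe results taken every 5 seconds during a failover.
probes = [(0, True), (5, False), (10, False), (25, False), (30, True), (35, True)]
print(longest_outage_seconds(probes))
```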

Bringing Patroni to production

As a result, we came up with the following plan:

  • Deploy consul-template to the PgBouncer servers and launch it;
  • Upgrade PostgreSQL to version 11.2;
  • Change the cluster names;
  • Launch the Patroni cluster.

Our scheme allows us to do the first item at almost any time: we can take each PgBouncer out of service in turn, then deploy and launch consul-template on it. So we did.

For a fast rollout we used Ansible, since we had already tested all the playbooks in the test environment, and the full script took from 1.5 to 2 minutes per shard. We could have rolled everything out shard by shard without stopping our service, but each PostgreSQL would have had to be switched off for several minutes. During that time users whose data lives on that shard would not be able to work properly, and that is unacceptable for us.

The way out was the planned maintenance, which takes place every 3 months. This is a window for scheduled work, when we shut down our service completely and upgrade our database instances. The next window was a week away, so we decided to simply wait and prepare further. While waiting, we took extra precautions: for each PostgreSQL shard we brought up a spare replica to keep the latest data in case of failure, and we added a new instance for each shard, which was to become the new replica in the Patroni cluster, so that we would not have to run the command that deletes data. All of this helped minimize the risk of error.

We restarted our service, everything worked as it should, and users kept working, but on the graphs we noticed abnormally high load on the Consul servers.

Why did we not see this in the test environment? This problem illustrates very well why it is necessary to follow the Infrastructure as code principle and to keep the whole infrastructure in sync, from test environments to production. Otherwise it is very easy to get the problem we got. What happened? Consul first appeared in production and only later in the test environments; as a result, the Consul version in the test environments was higher than in production. One of the releases fixed a CPU leak that occurred when working with consul-template. So we simply updated Consul, and that solved the problem.

Restart of the Patroni cluster

However, we got a new problem that we had not even suspected. When updating Consul, we simply remove a Consul node from the cluster with the consul leave command → Patroni connects to another Consul server → everything works. But when we reached the last instance of the Consul cluster and sent it the consul leave command, all Patroni clusters simply restarted, and in the logs we saw the following error:

ERROR: get_cluster
Traceback (most recent call last):
...
RetryFailedError: 'Exceeded retry deadline'
ERROR: Error communicating with DCS
LOG: database system is shut down

The Patroni cluster could not retrieve information about its cluster and restarted.

To find a solution, we contacted the Patroni authors through an issue on GitHub. They suggested improvements to our configuration files:

consul:
 consul.checks: []
bootstrap:
 dcs:
   retry_timeout: 8

We were able to reproduce the problem in the test environment and tried these options there, but unfortunately they did not work.

The problem remains unsolved. We plan to try the following solutions:

  • Use a Consul agent on each instance of the Patroni cluster;
  • Fix the problem in the code.

We understand where the error occurs: the problem is probably the use of a default timeout that is not overridden through the configuration file. When the last Consul server is removed from the cluster, the whole Consul cluster hangs for more than a second; because of this, Patroni cannot get the cluster state and restarts the whole cluster.
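The mechanism we suspect can be illustrated with a generic retry loop that gives up once a deadline has passed. This is not Patroni's implementation; the exception name only mirrors the log above, and all other names and values are assumptions. The point is that if the store stays unreachable longer than the deadline, the caller gets an error no matter how healthy PostgreSQL itself is.

```python
import time


class RetryFailedError(Exception):
    """Raised when the deadline passes without a successful call."""


def call_with_deadline(fetch, deadline_seconds, delay_seconds=0.1,
                       clock=time.monotonic, sleep=time.sleep):
    """Call fetch() until it succeeds or the deadline would be exceeded."""
    stop_at = clock() + deadline_seconds
    while True:
        try:
            return fetch()
        except ConnectionError:
            # Give up if waiting once more would take us past the deadline.
            if clock() + delay_seconds >= stop_at:
                raise RetryFailedError("Exceeded retry deadline")
            sleep(delay_seconds)
```

With a deadline of about one second, a store that hangs for slightly longer than that is enough to trigger the error, which matches what we observed when the last Consul server left the cluster.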

Fortunately, we have not run into any more errors.

Results of using Patroni

After the successful launch of Patroni we added one more replica to each cluster. Now each cluster has a semblance of a quorum: one leader and two replicas, as a safety net against split-brain during switchover.

Patroni has been running in production for more than three months. During this time it has already managed to help us out. Recently the leader of one of the clusters died in AWS; automatic failover kicked in and users kept working. Patroni did its main job.

A short summary of using Patroni:

  • Ease of configuration changes. It is enough to change the configuration on one instance and it is pulled up to the whole cluster. If a restart is needed to apply the new configuration, Patroni will let you know. Patroni can restart the whole cluster with a single command, which is also very convenient.
  • Automatic failover works and has already helped us out.
  • PostgreSQL upgrades without application downtime. You first upgrade the replicas to the new version, then switch the leader in the Patroni cluster and upgrade the old leader. Along the way you get the necessary test of automatic failover.

Source: www.habr.com
