Failover Cluster PostgreSQL + Patroni. Implementation Experience

In this article I will describe how we approached PostgreSQL fault tolerance, why it became important to us, and what came of it in the end.

We run a heavily loaded service: 2.5 million users worldwide and 50K+ active users every day. The servers are hosted on Amazon in a single region, Ireland: 100+ different servers are running at all times, and almost 50 of them host databases.

The entire backend is a large monolithic Java application that keeps a persistent websocket connection with the client. When several users work on the same board at the same time, they all see changes in real time, because we write every change to the database. We handle about 10K requests per second to our databases. At peak load we write 80-100K requests per second to Redis.

Why we switched from Redis to PostgreSQL

Initially, our service worked with Redis, a key-value store that keeps all data in the server's RAM.

Pros of Redis:

  1. High response speed, because everything is stored in memory;
  2. Simple backup and replication.

Cons of Redis for us:

  1. There are no real transactions. We tried to emulate them at the application level. Unfortunately, this did not always work well and required writing very complex code.
  2. The amount of data is limited by the amount of memory. As the data grows, memory use grows with it, and eventually we hit the limits of the chosen instance type. On AWS, changing the instance type requires stopping our service.
  3. Latency must be kept low at all times, because we have a very large number of requests. The optimal latency for us is 17-20 ms. At 30-40 ms we get long responses to our application's requests and the service degrades. Unfortunately, this happened to us in September 2018, when one of the Redis instances for some reason showed latency twice as high as usual. To resolve the problem, we stopped the service in the middle of the working day for unscheduled maintenance and replaced the problematic Redis instance.
  4. It is easy to end up with inconsistent data even from minor bugs in the code, and then spend a lot of time writing code to repair that data.

We weighed these drawbacks and realized that we needed to move to something more convenient, with normal transactions and less dependence on latency. We did our research, analyzed many options, and chose PostgreSQL.

We have been migrating to the new database for 1.5 years now and have moved only a small part of the data, so at the moment we work with Redis and PostgreSQL simultaneously. More details about the migration stages and about switching data between the databases are given in an article by my colleague.

When we started the migration, our application worked with the databases directly and accessed the Redis master and the PostgreSQL master. The PostgreSQL cluster consisted of a master and a replica with asynchronous replication. This is what the database setup looked like:

Implementing PgBouncer

While we were migrating, the product kept growing: the number of users and the number of servers working with PostgreSQL increased, and we began to run out of connections. PostgreSQL creates a separate process for each connection, and each process consumes resources. You can raise the connection limit only up to a point; beyond that, the database is likely to perform poorly. The ideal option in this situation is a connection manager placed in front of the database.

We had two candidates for the connection manager: Pgpool and PgBouncer. The first does not support the transaction mode of working with the database, so we chose PgBouncer.

We set up the following scheme: our application accesses a single PgBouncer, behind which sit the PostgreSQL masters, and behind each master there is one replica with asynchronous replication.

At that time we could not keep the entire volume of data in a single PostgreSQL, and the speed of working with the database mattered to us, so we started sharding PostgreSQL at the application level. The scheme described above suits this well: when a new PostgreSQL shard is added, it is enough to update the PgBouncer configuration, and the application can work with the new shard right away.
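
To make the idea concrete, here is a minimal sketch of application-level routing through PgBouncer. It is an illustration under stated assumptions, not the production code: the board-to-shard directory, the shard names, the host name `pgbouncer.internal`, and port 6432 are all hypothetical.

```python
from typing import Dict

# Hypothetical directory: which logical database (shard) holds each board.
BOARD_SHARD: Dict[str, str] = {
    "board-1": "shard_01",
    "board-2": "shard_02",
}


def dsn_for_board(board_id: str,
                  directory: Dict[str, str] = BOARD_SHARD,
                  pgbouncer_host: str = "pgbouncer.internal",
                  port: int = 6432) -> str:
    """Build a connection string that always goes through PgBouncer.

    The application only picks the logical database name; PgBouncer
    decides which PostgreSQL master that name currently points to.
    """
    shard = directory[board_id]
    return f"host={pgbouncer_host} port={port} dbname={shard}"


print(dsn_for_board("board-1"))
```

With this split of responsibilities, adding a shard means adding one entry to the PgBouncer configuration and assigning new boards to it in the directory; existing connection strings stay unchanged.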

PgBouncer Fault Tolerance

This scheme worked until the only PgBouncer instance died. We are on AWS, where all instances run on hardware that fails from time to time. In such cases the instance simply moves to new hardware and starts working again. That is what happened with PgBouncer, but it became unavailable. As a result of this failure, our service was unavailable for 25 minutes. For such situations AWS recommends building redundancy on the user side, which we had not implemented at that time.

After that, we thought seriously about the fault tolerance of PgBouncer and of the PostgreSQL clusters, because the same thing could happen again to any instance in our AWS account.

We built the PgBouncer fault tolerance scheme as follows: all application servers access a Network Load Balancer, behind which there are two PgBouncers. Each PgBouncer points to the same PostgreSQL master of each shard. If an AWS instance fails again, all traffic is redirected through the other PgBouncer. The fault tolerance of the Network Load Balancer itself is provided by AWS.

This scheme also makes it easy to add new PgBouncer servers.

Creating a PostgreSQL Failover Cluster

While solving this problem, we considered several options: a self-written failover, repmgr, AWS RDS, and Patroni.

Self-written scripts

Scripts can monitor the master and, if it fails, promote the replica to master and update the PgBouncer configuration.

The advantage of this approach is maximum simplicity, because you write the scripts yourself and understand exactly how they work.

Cons:

  • The master may not have died; there may have been a network failure instead. The failover, unaware of this, will promote the replica to master while the old master keeps working. As a result, we get two servers in the master role and do not know which of them has the current data. This situation is called split-brain;
  • We are left without a replica. Our configuration has a master and one replica; after a switchover the replica is promoted to master and we have no replicas left, so a new replica has to be added manually;
  • The failover itself needs additional monitoring, and we have 12 PostgreSQL shards, which means monitoring 12 clusters. When the number of shards grows, we also have to remember to update the failover.

A self-written failover looks very complex and requires non-trivial support. With a single PostgreSQL cluster it would be the simplest option, but it does not scale, so it does not suit us.

Repmgr

Replication Manager for PostgreSQL clusters, which can manage the operation of a PostgreSQL cluster. At the time, it had no automatic failover out of the box, so to use it you would have to write your own "wrapper" on top of the ready-made solution. Everything could turn out even more complicated than with self-written scripts, so we did not even try Repmgr.

AWS RDS

It supports everything we need, can make backups, and maintains a connection pool. It has automatic switchover: when the master dies, a replica becomes the new master and AWS changes the DNS record to point to the new master, while the replicas can be located in different AZs.

The drawbacks include the lack of fine-grained settings. As an example of fine tuning: our instances have restrictions on TCP connections, which, unfortunately, cannot be set in RDS:

net.ipv4.tcp_keepalive_time=10
net.ipv4.tcp_keepalive_intvl=1
net.ipv4.tcp_keepalive_probes=5
net.ipv4.tcp_retries2=3
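
To show what the keepalive values above buy, here is a small worked calculation. It relies only on the standard meaning of the three keepalive settings (idle time, probe interval, probe count) and compares them with the stock Linux defaults; `tcp_retries2` governs retransmission of unacknowledged data and is not part of this formula.

```python
def keepalive_detection_seconds(time_s: int, intvl_s: int, probes: int) -> int:
    """Worst-case seconds before TCP keepalive drops an idle dead connection."""
    # The kernel waits `time_s` seconds of idleness, then sends up to
    # `probes` probes spaced `intvl_s` seconds apart before giving up.
    return time_s + intvl_s * probes


tuned = keepalive_detection_seconds(time_s=10, intvl_s=1, probes=5)
linux_default = keepalive_detection_seconds(time_s=7200, intvl_s=75, probes=9)
print(tuned, linux_default)  # 15 7875
```

With the tuned values an idle connection to a dead peer is detected in about 15 seconds instead of more than two hours.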

In addition, AWS RDS costs almost twice as much as a regular instance, which was the main reason for rejecting this solution.

Patroni

This is a Python template for managing PostgreSQL, with good documentation, automatic failover, and source code on GitHub.

Pros of Patroni:

  • Every configuration parameter is documented, and it is clear how it works;
  • Automatic failover works out of the box;
  • It is written in Python, and since we write a lot of Python ourselves, it will be easier for us to deal with problems and perhaps even contribute to the project;
  • It manages PostgreSQL completely: it lets you change the configuration on all nodes of the cluster at once, and if applying the new configuration requires a cluster restart, that can also be done through Patroni.

Cons:

  • The documentation does not make it clear how to work with PgBouncer correctly. It is hard to call this a drawback, though, because Patroni's job is to manage PostgreSQL, and how connections reach a Patroni-managed cluster is our problem;
  • There are few examples of Patroni deployments on large volumes, while there are many examples of deployments from scratch.

As a result, we chose Patroni to build the failover cluster.

Patroni Implementation Process

Before Patroni, we had 12 PostgreSQL shards, each configured as one master and one replica with asynchronous replication. The application servers accessed the databases through a Network Load Balancer, behind which there were two instances with PgBouncer, and behind them all the PostgreSQL servers.

To implement Patroni, we had to choose a distributed store for the cluster configuration. Patroni works with distributed configuration stores such as etcd, Zookeeper, and Consul. We happen to have a full-fledged Consul cluster in production that works together with Vault, and we do not use it for anything else. That was a good reason to start using Consul for its intended purpose.

How Patroni works with Consul

We have a Consul cluster consisting of three nodes and a Patroni cluster consisting of a leader and a replica (in Patroni, the master is called the cluster leader and the slaves are called replicas). Each instance of the Patroni cluster constantly sends information about the state of the cluster to Consul. Therefore, from Consul you can always find out the current configuration of the Patroni cluster and who the leader is at the moment.
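
As an illustration of "finding out who the leader is", here is a minimal sketch that decodes a response from Consul's KV HTTP API, which returns a JSON list whose `Value` fields are base64-encoded. The key path `service/shard_01/leader`, the cluster name, and the member name are assumptions for the example (the path follows what is believed to be Patroni's default `service` namespace); check your own Patroni settings before relying on it.

```python
import base64
import json


def leader_from_kv_response(body: str) -> str:
    """Extract the stored value (the leader's name) from a Consul KV GET body."""
    entries = json.loads(body)
    # Consul returns a JSON list; each entry carries a base64-encoded Value.
    return base64.b64decode(entries[0]["Value"]).decode("utf-8")


# Sample payload shaped like a reply to
# GET /v1/kv/service/shard_01/leader (all values are made up).
sample = json.dumps([{
    "Key": "service/shard_01/leader",
    "Value": base64.b64encode(b"pg-shard01-a").decode("ascii"),
}])
print(leader_from_kv_response(sample))  # pg-shard01-a
```

The sketch deliberately parses a canned payload instead of calling a live Consul, so it runs anywhere.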


To connect Patroni to Consul, it is enough to read the official documentation, which says that you need to specify the host in http or https format, depending on how you work with Consul, and, optionally, the connection scheme:

host: the host:port for the Consul endpoint, in format: http(s)://host:port
scheme: (optional) http or https, defaults to http

It looks simple, but this is where the pitfalls begin. We work with Consul over a secure https connection, so our connection config would look like this:

consul:
  host: https://server.production.consul:8080 
  verify: true
  cacert: {{ consul_cacert }}
  cert: {{ consul_cert }}
  key: {{ consul_key }}

But it does not work that way. On startup, Patroni cannot connect to Consul, because it still tries to go over http.

The Patroni source code helped us sort out the problem. It is a good thing that it is written in Python. It turns out that the host parameter is not parsed in any way, and the protocol must be specified in scheme. This is what a working configuration block for Consul looks like for us:

consul:
  host: server.production.consul:8080
  scheme: https
  verify: true
  cacert: {{ consul_cacert }}
  cert: {{ consul_cert }}
  key: {{ consul_key }}
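
One way to avoid this mistake in deployment scripts is to split a full endpoint URL into the two fields before templating the config. The sketch below is a hypothetical helper built on Python's standard `urllib.parse`; it is not part of Patroni.

```python
from typing import Dict
from urllib.parse import urlsplit


def consul_settings(endpoint: str) -> Dict[str, str]:
    """Split 'https://host:port' into separate host and scheme fields."""
    parts = urlsplit(endpoint)
    if not parts.netloc:
        # A bare 'host:port' has no '//' part, so there is nothing to split.
        raise ValueError(f"expected a URL like https://host:port, got {endpoint!r}")
    return {"host": parts.netloc, "scheme": parts.scheme}


print(consul_settings("https://server.production.consul:8080"))
```

The resulting dictionary maps directly onto the `host` and `scheme` keys of the working configuration block.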

Consul-template

So, we have chosen the configuration store. Now we need to understand how PgBouncer will switch its configuration when the leader changes in the Patroni cluster. The documentation has no answer to this question, because working with PgBouncer is, in principle, not described there.

While searching for a solution, we found an article (unfortunately, I do not remember the title) which said that Consul-template was very helpful in pairing PgBouncer with Patroni. This prompted us to study how Consul-template works.

It turned out that Consul-template constantly watches the configuration of the PostgreSQL cluster in Consul. When the leader changes, it updates the PgBouncer configuration and sends a command to reload it.
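
The watch-render-reload loop can be sketched in a few lines. This is only an illustration of the idea in Python, not Consul-template itself and not its template language; the shard names and IP addresses are made up.

```python
from typing import Dict


def render_databases_section(leaders: Dict[str, str], port: int = 5432) -> str:
    """Render a PgBouncer [databases] section pointing at the current leaders."""
    lines = ["[databases]"]
    for dbname in sorted(leaders):
        lines.append(f"{dbname} = host={leaders[dbname]} port={port}")
    return "\n".join(lines) + "\n"


def needs_reload(current_config: str, leaders: Dict[str, str]) -> bool:
    """True when the freshly rendered section differs from the current one."""
    return render_databases_section(leaders) != current_config


before = render_databases_section({"shard_01": "10.0.0.11", "shard_02": "10.0.0.21"})
# Hypothetical failover: leadership of shard_01 moves to another host.
after_failover = {"shard_01": "10.0.0.12", "shard_02": "10.0.0.21"}
print(needs_reload(before, after_failover))  # True
```

When the rendered text changes, the new file is written and PgBouncer is told to re-read it, for example with its `RELOAD` admin command or a SIGHUP.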


A big advantage of the template is that it is stored as code, so when a new shard is added, it is enough to make a new commit and the template is updated automatically, in line with the Infrastructure as code principle.

New architecture with Patroni

As a result, we arrived at the following scheme:

All application servers access the balancer → behind it there are two PgBouncer instances → on each instance runs Consul-template, which monitors the state of each Patroni cluster and keeps the PgBouncer config up to date, so that requests are sent to the current leader of each cluster.

Manual testing

Before rolling this scheme out to production, we launched it on a small test environment and checked how automatic switchover works. We opened a board, moved a sticker, and at that moment "killed" the cluster leader. On AWS, all you need to do is shut down the instance through the console.


The sticker jumped back within 10-20 seconds and then started moving normally again. This means that the Patroni cluster worked correctly: it changed the leader and sent the information to Consul, and Consul-template immediately picked this information up, replaced the PgBouncer configuration, and sent the reload command.

How to survive under high load and keep downtime minimal?

Everything works perfectly! But new questions arise: how will it work under high load? How do we roll everything out to production quickly and safely?

The test environment where we run load tests helps us answer the first question. It is completely identical to production in architecture and has generated test data that is roughly equal in volume to production. We decided to simply "kill" one of the PostgreSQL masters during a test and see what happens. But before that, it is important to check the automated rollout, because this environment has several PostgreSQL shards, so we get excellent testing of the configuration scripts before production.

Both tasks look ambitious, but we are on PostgreSQL 9.6. Could we upgrade to 11.2 right away?

We decided to do it in 2 stages: first upgrade to version 11.2, then launch Patroni.

PostgreSQL upgrade

To upgrade the PostgreSQL version quickly, use the -k option, which creates hard links on disk so that there is no need to copy your data. On databases of 300-400 GB the upgrade takes 1 second.

We have many shards, so the upgrade has to be automated. To do this, we wrote an Ansible playbook that performs the entire upgrade process for us:

/usr/lib/postgresql/11/bin/pg_upgrade \
  --link \
  --old-datadir='' --new-datadir='' \
  --old-bindir='' --new-bindir='' \
  --old-options=' -c config_file=' \
  --new-options=' -c config_file='

It is important to note that before starting the upgrade, you must run it with the --check parameter to make sure the upgrade is possible. Our script also swaps the configs for the duration of the upgrade. The script completed in 30 seconds, which is an excellent result.
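
The "check first, then upgrade" order can be sketched as a small helper that builds both command lines. This is an illustration, not the Ansible playbook mentioned above; the directory paths follow the Debian layout and are assumptions, and only the `--link`, `--check`, and directory flags of `pg_upgrade` are used.

```python
from typing import List


def pg_upgrade_cmd(old_datadir: str, new_datadir: str,
                   old_bindir: str, new_bindir: str,
                   check_only: bool) -> List[str]:
    """Build a pg_upgrade argument list in hard-link mode."""
    cmd = [
        f"{new_bindir}/pg_upgrade",
        "--link",  # hard links instead of copying data files
        f"--old-datadir={old_datadir}",
        f"--new-datadir={new_datadir}",
        f"--old-bindir={old_bindir}",
        f"--new-bindir={new_bindir}",
    ]
    if check_only:
        cmd.append("--check")  # dry run: report problems, change nothing
    return cmd


def upgrade_steps(**paths: str) -> List[List[str]]:
    """Return the dry run first, then the real upgrade."""
    return [pg_upgrade_cmd(check_only=True, **paths),
            pg_upgrade_cmd(check_only=False, **paths)]


# Example paths are assumptions, not values taken from the article.
steps = upgrade_steps(
    old_datadir="/var/lib/postgresql/9.6/main",
    new_datadir="/var/lib/postgresql/11/main",
    old_bindir="/usr/lib/postgresql/9.6/bin",
    new_bindir="/usr/lib/postgresql/11/bin",
)
for step in steps:
    print(" ".join(step))
```

Each list can be passed to `subprocess.run(step, check=True)`, so that a failed dry run stops the process before anything is changed.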

Launching Patroni

To solve the second problem, it is enough to look at the Patroni configuration. The official repository has an example configuration with an initdb section, which is responsible for initializing a new database when Patroni is started for the first time. Since we already have a ready-made database, we simply removed this section from the configuration.

When we started installing Patroni on an existing PostgreSQL cluster and launching it, we ran into a new problem: both servers started as leaders. Patroni knows nothing about the earlier state of the cluster and tries to run both servers as two separate clusters with the same name. To solve this problem, you need to delete the data directory on the slave:

rm -rf /var/lib/postgresql/

This must be done only on the slave!

When a clean replica is connected, Patroni takes a basebackup of the leader and restores it on the replica, and then catches up to the current state using the WAL logs.

Another difficulty we ran into is that all PostgreSQL clusters are named main by default. When the clusters know nothing about each other, this is fine. But when you want to use Patroni, every cluster must have a unique name. The solution is to change the cluster name in the PostgreSQL configuration.

Load testing

We launched a test that simulates how users work on boards. Once the load reached our average daily value, we repeated exactly the same test, but this time shut down one instance with a PostgreSQL leader. The automatic failover worked as we expected: Patroni changed the leader, and Consul-template updated the PgBouncer configuration and sent the reload command. Our Grafana graphs showed delays of 20-30 seconds and a small number of errors from the servers related to database connections. This is a normal situation: such values are acceptable for our failover and are definitely better than service downtime.

Rolling Patroni out to production

As a result, we came up with the following plan:

  • Deploy Consul-template to the PgBouncer servers and launch it;
  • Upgrade PostgreSQL to version 11.2;
  • Change the cluster names;
  • Launch the Patroni cluster.

Our scheme lets us do the first step at almost any time: we can take each PgBouncer out of service in turn, then deploy and launch consul-template on it. That is what we did.

For a fast rollout we used Ansible, since we had already tested the whole playbook on the test environment, and the full script took from 1.5 to 2 minutes per shard. We could have rolled everything out shard by shard without stopping our service, but we would have had to turn off each PostgreSQL for several minutes. In that case, users whose data is on that shard would not be able to work fully during that time, and this is unacceptable for us.
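
A quick back-of-the-envelope calculation shows what a sequential rollout would cost in total. It assumes the 12 shards mentioned earlier and the 1.5 to 2 minutes per shard quoted above, with shards processed strictly one after another.

```python
from typing import Tuple


def rollout_minutes(shards: int, per_shard_min: float,
                    per_shard_max: float) -> Tuple[float, float]:
    """Total duration range when shards are processed one after another."""
    return shards * per_shard_min, shards * per_shard_max


low, high = rollout_minutes(shards=12, per_shard_min=1.5, per_shard_max=2.0)
print(f"{low:.0f}-{high:.0f} minutes in total")  # 18-24 minutes in total
```

Under these assumptions, each group of users would see only a couple of minutes of unavailability, but the rollout as a whole would stretch over roughly 18 to 24 minutes.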

The way out was our scheduled maintenance, which we carry out every 3 months. This is a window for planned work, when we shut down our service completely and upgrade the database instances. There was one week left until the next window, so we decided to wait and prepare further. While waiting, we hedged our bets additionally: for each PostgreSQL shard we brought up a spare replica in case of failure, to preserve the latest data, and added a new instance for each shard that was to become the new replica in the Patroni cluster, so that we would not have to run the data deletion command. All of this helped minimize the risk of error.

We restarted our service, everything worked as it should, and users kept working, but on the graphs we noticed an abnormally high load on the Consul servers.

Why did we not see this on the test environment? This problem illustrates very well why it is necessary to follow the Infrastructure as code principle and keep the whole infrastructure in order, from test environments to production. Otherwise, it is very easy to run into the kind of problem we got. What happened? Consul first appeared in production and only later on the test environments; as a result, the Consul version on the test environments was newer than in production. One of the releases in between had fixed a CPU leak when working with consul-template. So we simply upgraded Consul, which solved the problem.

Restarting the Patroni cluster

However, we got a new problem that we had not even suspected. When upgrading Consul, we simply remove a Consul node from the cluster with the consul leave command → Patroni connects to another Consul server → everything works. But when we reached the last instance of the Consul cluster and sent it the consul leave command, all Patroni clusters simply restarted, and in the logs we saw the following error:

ERROR: get_cluster
Traceback (most recent call last):
...
RetryFailedError: 'Exceeded retry deadline'
ERROR: Error communicating with DCS
LOG: database system is shut down

The Patroni cluster could not retrieve information about its cluster and restarted.

To find a solution, we contacted the Patroni authors through an issue on GitHub. They suggested improvements to our configuration files:

consul:
 consul.checks: []
bootstrap:
 dcs:
   retry_timeout: 8

We managed to reproduce the issue on the test environment and tested these settings there, but unfortunately they did not work.

The problem is still unresolved. We plan to try the following solutions:

  • Run a Consul agent on each instance of the Patroni cluster;
  • Fix the issue in the code.

We understand where the error occurs: the problem is probably the use of a default timeout that is not overridden through the configuration file. When the last Consul server is removed from the cluster, the whole Consul cluster hangs for longer than a second; because of this, Patroni cannot get the state of the cluster and restarts the entire cluster completely.
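
The failure mode can be illustrated with a generic retry loop that gives up after a deadline. This is a sketch of the general technique only, not Patroni's actual retry code; the class and function names are hypothetical, and the exception merely mirrors the "Exceeded retry deadline" message from the log.

```python
import time


class RetryDeadlineExceeded(Exception):
    """Stand-in for the 'Exceeded retry deadline' error seen in the log."""


def call_with_deadline(func, deadline_s: float, delay_s: float = 0.01):
    """Retry func() on ConnectionError until deadline_s seconds have passed."""
    stop_at = time.monotonic() + deadline_s
    while True:
        try:
            return func()
        except ConnectionError as exc:
            # Give up once another attempt would start after the deadline.
            if time.monotonic() + delay_s >= stop_at:
                raise RetryDeadlineExceeded("Exceeded retry deadline") from exc
            time.sleep(delay_s)


def store_is_down():
    """Simulates a configuration store that never answers."""
    raise ConnectionError("DCS unreachable")


try:
    call_with_deadline(store_is_down, deadline_s=0.05)
except RetryDeadlineExceeded as err:
    print(err)  # Exceeded retry deadline
```

If the store stays unavailable for longer than the deadline, the caller receives the error no matter how many retries fit inside the window, which is why the length of the timeout matters so much here.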

Fortunately, we did not run into any other errors.

Results of using Patroni

After the successful launch of Patroni, we added an extra replica to each cluster. Now each cluster has a semblance of a quorum: one leader and two replicas, as a safeguard against split-brain during a switchover.

Patroni has been running in production for more than three months. During this time it has already managed to help us out. Recently, the leader of one of the clusters died on AWS; automatic failover kicked in and users kept working. Patroni fulfilled its main task.

A short summary of using Patroni:

  • Easy configuration changes. It is enough to change the configuration on one instance and it is applied to the whole cluster. If a restart is required to apply the new configuration, Patroni will let you know. Patroni can restart the entire cluster with a single command, which is also very convenient.
  • Automatic failover works and has already helped us out.
  • PostgreSQL upgrades without application downtime. You first upgrade the replicas to the new version, then change the leader in the Patroni cluster and upgrade the old leader. Along the way, the automatic failover gets the testing it needs.

Source: www.habr.com
