My first experience recovering a Postgres database after a failure (invalid page in block 4123007 of relation base/16490)

I'd like to share my first successful experience of restoring a Postgres database to full working order. I first encountered Postgres six months ago; before that I had no database administration experience at all.


I work as a kind of semi-DevOps engineer at a large IT company. Our company develops software for high-load services, and I'm responsible for reliability, maintenance, and deployment. I was given a routine task: update an application on one server. The application is written in Django, and migrations (changes to the database structure) run during the update. Before this procedure we take a full database dump with the standard pg_dump program, just in case.

An unexpected error occurred while taking the dump (Postgres version 9.5):

pg_dump: Dumping the contents of table "ws_log_smevlog" failed: PQgetResult() failed.
pg_dump: Error message from server: ERROR: invalid page in block 4123007 of relation base/16490/21396989
pg_dump: The command was: COPY public.ws_log_smevlog [...]
pg_dump: [parallel archiver] a worker process died unexpectedly

The "invalid page in block" error points to problems at the file system level, which is very bad. Various forums suggested running VACUUM FULL with the zero_damaged_pages option to fix the problem. Well, let's try...

Preparing for recovery

WARNING! Be sure to back up your Postgres installation before any attempt to restore your database. If you are on a virtual machine, stop the database and take a snapshot. If you can't take a snapshot, stop the database and copy the contents of the Postgres directory (including the WAL files) to a safe place. The main thing is not to make things worse.

Since my database was mostly working, I limited myself to a regular database dump but excluded the table with the damaged data (the -T, --exclude-table=TABLE option of pg_dump).

The server was physical, so taking a snapshot was impossible. The backup is in place, so let's move on.

Checking the file system

Before trying to restore the database, we need to make sure the file system itself is intact. If there are errors, they have to be fixed first; otherwise we can only make things worse.

In my case the file system holding the database was mounted at "/srv" and its type was ext4.

Stop the database: systemctl stop postgresql@9.5-main.service and check with the lsof command that nobody is using the file system and that it can be unmounted:
lsof +D /srv

I also had to stop the redis database, since it was using "/srv" as well. Then I unmounted /srv (umount).

The file system check was done with the e2fsck utility and the -f flag (Force checking even if filesystem is marked clean).


Next, with the dumpe2fs utility (sudo dumpe2fs /dev/mapper/gu2--sys-srv | grep checked) you can confirm that the check was actually performed.


e2fsck reports that no problems were found at the ext4 file system level, which means we can continue trying to restore the database, or rather, go back to VACUUM FULL (of course, the file system has to be mounted again and the database started).
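The check sequence from this section can be collected into one place. What follows is only a dry-run sketch: it prints the commands instead of executing them. The service name, mount point and device path are the ones mentioned in the text (the device path with the LVM double hyphen); the `run` helper, the `DRY_RUN` switch and the assumption that /srv has an fstab entry (so that a bare `mount /srv` works) are mine. Stopping redis, which was also needed in my case, is left out.

```shell
#!/bin/sh
# Dry-run sketch of the file system check sequence described above.
# With DRY_RUN=1 every step is only printed, nothing is executed.
DRY_RUN=1
SERVICE="postgresql@9.5-main.service"
MOUNT_POINT="/srv"
DEVICE="/dev/mapper/gu2--sys-srv"

# Hypothetical helper: print the command, or execute it when DRY_RUN is not 1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

fs_check_plan() {
  run systemctl stop "$SERVICE"    # stop the database
  run lsof +D "$MOUNT_POINT"       # look for anything still using the mount
  run umount "$MOUNT_POINT"        # unmount before checking
  run e2fsck -f "$DEVICE"          # force the check even if marked clean
  run mount "$MOUNT_POINT"         # assumes an fstab entry for /srv
  run systemctl start "$SERVICE"   # bring the database back
}

fs_check_plan
```

On a real system I would still step through these by hand and read the output of lsof and e2fsck before moving to the next command.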

If you have a physical server, be sure to check the state of the disks (with smartctl -a /dev/XXX) or of the RAID controller to make sure the problem is not at the hardware level. In my case the RAID turned out to be hardware-based, so I asked the local administrator to check the RAID status (the server was several hundred kilometers away from me). He said there were no errors, which means we can definitely begin the recovery.

Umzamo 1: zero_damaged_pages

Connect to the database via psql with an account that has superuser rights. We need a superuser because only a superuser can change the zero_damaged_pages option. In my case that is postgres:

psql -h 127.0.0.1 -U postgres -s [database_name]

The zero_damaged_pages option is needed to ignore read errors (from the postgrespro website):

Detection of a damaged page header normally causes PostgreSQL to report an error and abort the current transaction. Setting zero_damaged_pages to on causes the system to instead report a warning, zero out the damaged page, and continue processing. This behavior destroys data, namely all the rows on the damaged page.

We enable the option and try to run a full vacuum of the table:

VACUUM FULL VERBOSE
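Spelled out, the two steps for one table could look roughly like the sketch below. It assumes the option is switched on with a plain SET for the current session and that the vacuum targets ws_log_smevlog; the helper name `vacuum_sql` and the database name in the commented psql call are placeholders of mine. The block only prints the SQL text; nothing is sent to a server.

```shell
#!/bin/sh
# Sketch: build the SQL for "enable zero_damaged_pages, then VACUUM FULL".
# vacuum_sql is a hypothetical helper; it only prints text.
vacuum_sql() {
  printf 'SET zero_damaged_pages = on;\n'
  printf 'VACUUM FULL VERBOSE %s;\n' "$1"
}

vacuum_sql ws_log_smevlog

# To actually run it in a single session (superuser required), one could pipe
# the text into psql, for example:
#   vacuum_sql ws_log_smevlog | psql -h 127.0.0.1 -U postgres my_database
```

Piping both statements into one psql invocation matters here, because a SET only lasts for the session that issued it.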

Unfortunately, it failed.

We ran into a similar error:

INFO: vacuuming "public.ws_log_smevlog"
WARNING: invalid page in block 4123007 of relation base/16490/21396989; zeroing out page
ERROR: unexpected chunk number 573 (expected 565) for toast value 21648541 in pg_toast_106070

pg_toast is the mechanism Postgres uses to store "long data" that does not fit into a single page (8 kB by default).

Umzamo 2: reindex

The first tip from Google didn't help. After a few minutes of searching I found a second tip: run reindex on the damaged table. I had seen this advice in many places, but it didn't inspire confidence. Let's run the reindex:

reindex table ws_log_smevlog


reindex completed without problems.

However, it didn't help: VACUUM FULL crashed with the same error. Being used to failures by now, I kept looking for advice on the internet and came across a rather interesting article.

Attempt 3: SELECT, LIMIT, OFFSET

The article mentioned above suggested going through the table row by row and deleting the problematic data. First, all the rows had to be read:

for ((i=0; i<"Number_of_rows_in_nodes"; i++ )); do psql -U "Username" "Database Name" -c "SELECT * FROM nodes LIMIT 1 offset $i" >/dev/null || echo $i; done

In my case the table contained 1,628,991 rows! I should have taken care of partitioning the data, but that is a topic for a separate discussion. It was Saturday; I ran this command in tmux and went to bed:

for ((i=0; i<1628991; i++ )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog LIMIT 1 offset $i" >/dev/null || echo $i; done

In the morning I decided to check how things were going. To my surprise, I found that after 20 hours only 2% of the data had been scanned! I didn't want to wait 50 days. Another complete failure.

But I didn't give up. I wondered why the scan was taking so long. From the documentation (again on postgrespro) I learned:

OFFSET says to skip that many rows before beginning to return rows.
If both OFFSET and LIMIT appear, then OFFSET rows are skipped before starting to count the LIMIT rows that are returned.

When using LIMIT, it is important to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise you will get an unpredictable subset of the query's rows.

Obviously, the command above was flawed. First, there was no ORDER BY, so the result could be wrong. Second, Postgres had to scan and skip OFFSET rows on every query, so performance kept dropping as OFFSET grew.
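To get a feeling for how badly this scales, here is a back-of-the-envelope sketch. It rests on a simplifying assumption of mine (not a measurement): the query with OFFSET k makes Postgres walk over k rows before returning one, so a full pass over N rows visits about 0 + 1 + ... + (N-1) = N(N-1)/2 rows. The function name is made up for the illustration.

```shell
#!/bin/sh
# Rough cost model (an assumption, not a benchmark): total rows skipped when
# reading a table of N rows one row at a time with LIMIT 1 OFFSET k.
rows_skipped_with_offset() {
  echo $(( $1 * ($1 - 1) / 2 ))
}

# Row count of ws_log_smevlog from the text.
rows_skipped_with_offset 1628991
```

For 1,628,991 rows that is on the order of 10^12 skipped rows, compared with about 1.6 million rows for a single sequential pass, which fits the observed crawl.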

Attempt 4: take a dump in text form

Then a seemingly brilliant idea came to me: take a dump in text form and analyze the last row that was written.

But first, let's look at the structure of the ws_log_smevlog table:


In our case there is an "id" column that holds a unique identifier (counter) of the row. The plan was as follows:

  1. Start taking a dump in text form (as SQL commands).
  2. At some point the dump will be interrupted by the error, but the text file will still be saved on disk.
  3. Look at the end of the text file and so find the identifier (id) of the last row that was dumped successfully.
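Step 3 of the plan could be sketched like this. The sketch assumes that the interrupted plain-text dump ends with tab-separated COPY data rows and that id is the first column; both are assumptions about the dump layout, and `last_dumped_id` and the sample file are made up for the illustration.

```shell
#!/bin/sh
# Sketch: print the first tab-separated field of the last data row in a
# plain-text dump, ignoring empty lines and the COPY terminator "\.".
last_dumped_id() {
  grep -v -e '^$' -e '^\\\.$' "$1" | tail -n 1 | cut -f 1
}

# Tiny self-made sample standing in for the tail of a real dump file.
sample=$(mktemp)
printf '186524\tfirst row\n186525\tsecond row\n' > "$sample"
last_dumped_id "$sample"
rm -f "$sample"
```

On a real interrupted dump the last line may be cut off in the middle, so the value is worth double-checking by eye.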

I started a dump in text form:

pg_dump -U my_user -d my_database -F p -t ws_log_smevlog -f ./my_dump.dump

The dump, as expected, failed with the same error:

pg_dump: Error message from server: ERROR: invalid page in block 4123007 of relation base/16490/21396989

Then, using tail, I looked at the end of the dump (tail -5 ./my_dump.dump) and found that the dump had been interrupted at the row with id 186,525. "So the problem is in the row with id 186,526; it's broken and has to be deleted!" I thought. But after querying the database:
"select * from ws_log_smevlog where id=186529" it turned out that everything was fine with this row... Rows with ids 186,530 - 186,540 also worked without problems. Another "brilliant idea" had failed. Later I understood why: when data in a table is deleted or changed, it is not physically removed but marked as "dead tuples"; then autovacuum comes along, marks these rows as deleted, and allows the space to be reused. To put it plainly: if the data in a table changes and autovacuum is enabled, the rows are not stored sequentially.

Attempt 5: SELECT, FROM, WHERE id=

Failures make us stronger. You should never give up; you have to go all the way and believe in yourself and your abilities. So I decided to try another option: simply go through all the records in the table one by one. Knowing the structure of my table (see above), we have an id field that is unique (the primary key). The table has 1,628,991 rows and the ids are sequential, which means we can simply iterate over them one at a time:

for ((i=1; i<1628991; i=$((i+1)) )); do psql -U my_user -d my_database  -c "SELECT * FROM ws_log_smevlog where id=$i" >/dev/null || echo $i; done

For those who don't follow, the command works like this: it reads the table row by row and sends stdout to /dev/null, but if the SELECT fails, the error text is printed (stderr goes to the console) and so is the id of the failing row (thanks to ||, which fires when the select had problems, that is, when the command's return code is not 0).
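The same pattern can be tried without a database. In the sketch below `check_id` is a hypothetical stand-in for the psql call: it pretends that ids 3 and 7 are unreadable and returns a non-zero code for them, so only the `>/dev/null || echo` mechanics are being shown.

```shell
#!/bin/sh
# Hypothetical stand-in for:
#   psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id=$1"
# It "fails" (error on stderr, non-zero return code) for ids 3 and 7.
check_id() {
  case "$1" in
    3|7) echo "ERROR: simulated read failure for id $1" >&2; return 1 ;;
    *)   echo "row $1" ;;
  esac
}

# Same shape as the loop above: normal output is discarded, and the id is
# printed only when the command's return code is not 0.
scan_ids() {
  i=$1
  while [ "$i" -lt "$2" ]; do
    check_id "$i" >/dev/null || echo "$i"
    i=$((i + 1))
  done
}

scan_ids 1 10
```

The error text still reaches the console because only stdout is redirected; the bare numbers that follow each error are the ids to look at.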

I was lucky: there was an index on the id field.


This means that finding the row with a given id shouldn't take long. In theory it should work. So let's run the command in tmux and go to bed.

By morning I found that about 90,000 records had been checked, which is just over 5%. An excellent result compared with the previous method (2%)! But I didn't want to wait 20 days...

Attempt 6: SELECT, FROM, WHERE id >= and id <

The customer had an excellent server dedicated to the database: a dual-processor Intel Xeon E5-2697 v2 machine, with 48 threads at our disposal! The load on the server was moderate, so we could easily take about 20 threads. There was also plenty of RAM: a full 384 gigabytes!

So the command had to be parallelized:

for ((i=1; i<1628991; i=$((i+1)) )); do psql -U my_user -d my_database  -c "SELECT * FROM ws_log_smevlog where id=$i" >/dev/null || echo $i; done

I could have written a nice, elegant script here, but I chose the fastest way to parallelize: manually split the range 0-1628991 into intervals of 100,000 records and run 16 commands of the following form separately:

for ((i=N; i<M; i=$((i+1)) )); do psql -U my_user -d my_database  -c "SELECT * FROM ws_log_smevlog where id=$i" >/dev/null || echo $i; done

But that's not all. Connecting to the database also takes time and system resources. Making 1,628,991 connections was not very smart, you'll agree. So let's fetch 1000 rows per connection instead of just one. The command ended up looking like this:

for ((i=N; i<M; i=$((i+1000)) )); do psql -U my_user -d my_database  -c "SELECT * FROM ws_log_smevlog where id>=$i and id<$((i+1000))" >/dev/null || echo $i; done

Open 16 windows in a tmux session and run the commands:

1) for ((i=0; i<100000; i=$((i+1000)) )); do psql -U my_user -d my_database  -c "SELECT * FROM ws_log_smevlog where id>=$i and id<$((i+1000))" >/dev/null || echo $i; done
2) for ((i=100000; i<200000; i=$((i+1000)) )); do psql -U my_user -d my_database  -c "SELECT * FROM ws_log_smevlog where id>=$i and id<$((i+1000))" >/dev/null || echo $i; done
…
15) for ((i=1400000; i<1500000; i=$((i+1000)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id>=$i and id<$((i+1000))" >/dev/null || echo $i; done
16) for ((i=1500000; i<1628991; i=$((i+1000)) )); do psql -U my_user -d my_database  -c "SELECT * FROM ws_log_smevlog where id>=$i and id<$((i+1000))" >/dev/null || echo $i; done
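For completeness, the "nice script" I skipped might have looked something like the sketch below. This is not what I ran: `make_ranges` and `scan_range` are names invented here, the user and database names are the placeholders used throughout, and the line that would actually start the workers is left commented out so that pasting the block does not touch a database.

```shell
#!/bin/sh
# Print "start end" pairs that cover [0, total) in steps of chunk.
make_ranges() {
  start=0
  while [ "$start" -lt "$1" ]; do
    end=$((start + $2))
    [ "$end" -gt "$1" ] && end=$1
    echo "$start $end"
    start=$end
  done
}

# One worker: the same check as above, 1000 ids per psql call.
scan_range() {
  i=$1
  while [ "$i" -lt "$2" ]; do
    psql -U my_user -d my_database \
      -c "SELECT * FROM ws_log_smevlog where id>=$i and id<$((i + 1000))" \
      >/dev/null || echo "$i"
    i=$((i + 1000))
  done
}

# Show the ranges that would be scanned.
make_ranges 1628991 100000

# To start one background worker per range and wait for all of them:
#   make_ranges 1628991 100000 | { while read -r s e; do scan_range "$s" "$e" & done; wait; }
```

With a step of 100,000 this yields 17 ranges rather than my 16 hand-made ones, because the tail 1600000-1628991 becomes a range of its own instead of being folded into the last window.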

A day later I got the first results! Namely (the XXX and ZZZ values were not preserved):

ERROR:  missing chunk number 0 for toast value 37837571 in pg_toast_106070
829000
ERROR:  missing chunk number 0 for toast value XXX in pg_toast_106070
829000
ERROR:  missing chunk number 0 for toast value ZZZ in pg_toast_106070
146000

This means that three rows contain errors. The ids of the first and second problematic records were between 829,000 and 830,000, and the id of the third was between 146,000 and 147,000. Next, we just had to find the exact id values of the problematic records. To do this, we go through the ranges containing problematic records with a step of 1 and identify the ids:

for ((i=829000; i<830000; i=$((i+1)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id=$i" >/dev/null || echo $i; done
829417
ERROR:  unexpected chunk number 2 (expected 0) for toast value 37837843 in pg_toast_106070
829449
for ((i=146000; i<147000; i=$((i+1)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id=$i" >/dev/null || echo $i; done
ERROR:  unexpected chunk number ZZZ (expected 0) for toast value XXX in pg_toast_106070
146911
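The two passes (blocks of 1000 ids first, then single ids inside each failing block) can be written as one loop. The sketch below is an illustration only: `range_is_readable` is a hypothetical stand-in for the psql call and simply "fails" for any range that contains one of the three ids found above, so the narrowing logic can be run without a database.

```shell
#!/bin/sh
# The three ids that turned out to be broken, used to simulate failures.
BROKEN_IDS="146911 829417 829449"

# Hypothetical stand-in for a psql SELECT over id >= $1 and id < $2:
# returns non-zero when the range contains a broken id.
range_is_readable() {
  for id in $BROKEN_IDS; do
    if [ "$id" -ge "$1" ] && [ "$id" -lt "$2" ]; then
      return 1
    fi
  done
  return 0
}

# Pass 1: blocks of 1000 ids. Pass 2: single ids inside each failing block.
find_broken_ids() {
  block=$1
  while [ "$block" -lt "$2" ]; do
    if ! range_is_readable "$block" $((block + 1000)); then
      i=$block
      while [ "$i" -lt $((block + 1000)) ]; do
        range_is_readable "$i" $((i + 1)) || echo "$i"
        i=$((i + 1))
      done
    fi
    block=$((block + 1000))
  done
}

find_broken_ids 0 1628991
```

With a real database the stand-in would be replaced by the psql command from the loops above, and the fine pass costs at most 1000 extra queries per damaged block.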

Happy ending

We have found the problematic rows. Let's connect to the database via psql and try to delete them:

my_database=# delete from ws_log_smevlog where id=829417;
DELETE 1
my_database=# delete from ws_log_smevlog where id=829449;
DELETE 1
my_database=# delete from ws_log_smevlog where id=146911;
DELETE 1

To my surprise, the records were deleted without any problems, even without the zero_damaged_pages option.

Then I connected to the database, ran VACUUM FULL (I don't think this was necessary), and finally took a backup with pg_dump. The dump completed without errors! The problem was solved in this rather crude way. My joy knew no bounds: after so many failures I had finally found a solution!

Acknowledgements and conclusion

This was my first experience of recovering a real Postgres database. I will remember it for a long time.

And finally, I'd like to thank PostgresPro for translating the documentation into Russian and for the completely free online courses, which helped a lot while I was analyzing the problem.

Source: www.habr.com
