I would like to share my first successful experience of bringing a Postgres database back to full working order. I first met Postgres six months ago; before that I had no database administration experience at all.

I work as a semi-DevOps engineer at a large IT company. Our company develops software for high-load services, and I am responsible for reliability, maintenance, and deployment. I was given a routine task: update an application on one server. The application is written in Django, and migrations (changes to the database schema) are run during the update. Before this procedure we take a full database dump with the standard pg_dump program, just in case.
An unexpected error occurred while taking the dump (Postgres version 9.5):
pg_dump: Dumping the contents of table "ws_log_smevlog" failed: PQgetResult() failed.
pg_dump: Error message from server: ERROR: invalid page in block 4123007 of relation base/16490/21396989
pg_dump: The command was: COPY public.ws_log_smevlog [...]
pg_dump: [parallel archiver] a worker process died unexpectedly
The "invalid page in block" error points to problems at the file system level, which is very bad. Various forums suggested running VACUUM FULL with the zero_damaged_pages option to solve the problem. Well, let's try it...
Preparing for recovery
WARNING! Be sure to back up your Postgres installation before attempting to restore your database. If you are using a virtual machine, stop the database and take a snapshot. If you cannot take a snapshot, stop the database and copy the contents of the Postgres directory (including the WAL files) to a safe place. The most important thing is not to make things worse.
Since my database was generally working, I limited myself to a regular database dump, but excluded the table with the damaged data (the -T, --exclude-table=TABLE option in pg_dump).
The server was physical, so taking a snapshot was not possible. The backup is in place, so let's move on.
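As a small illustration of that backup step, here is a sketch that only prints the pg_dump command line so it can be reviewed before running. The user, database, and output file names are placeholders, and the helper name build_dump_cmd is my own invention; the custom format (-F c) is just one reasonable choice.

```bash
#!/usr/bin/env bash
# Sketch: build a pg_dump command line that skips one damaged table.
# Nothing is executed here; the command is only printed for review.

build_dump_cmd() {
  local user="$1" db="$2" bad_table="$3" outfile="$4"
  printf 'pg_dump -U %s -d %s --exclude-table=%s -F c -f %s\n' \
    "$user" "$db" "$bad_table" "$outfile"
}

# Placeholder names; replace them with your own before running the result.
build_dump_cmd my_user my_database ws_log_smevlog ./backup_without_smevlog.dump
```

The printed line can be pasted into a shell once it looks right. The damaged table is deliberately left out, so this backup alone is not a complete copy of the database.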
Checking the file system
Before attempting to restore the database, we need to make sure the file system itself is intact. If there are errors, they must be fixed first, because otherwise we can only make things worse.
In my case, the file system holding the database was mounted at "/srv" and its type was ext4.
Stop the database (systemctl stop postgresql@9.5-main.service) and check that the file system is not in use by anyone and can be unmounted, using the lsof command:
lsof +D /srv
I also had to stop the redis database, since it was using "/srv" as well. Then I unmounted /srv (umount).
The file system check was done with the e2fsck utility and the -f switch (Force checking even if filesystem is marked clean):

Next, with the dumpe2fs utility (sudo dumpe2fs /dev/mapper/gu2--sys-srv | grep checked) you can verify that the check was actually performed:

e2fsck reports that no problems were found at the ext4 file system level, which means we can continue trying to restore the database, or rather, go back to VACUUM FULL (of course, the file system has to be mounted again and the database started).
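The whole offline check can be sketched as one small script. This is only an outline under my own assumptions: the run and check_fs helper names are made up, the mount point is expected to have an /etc/fstab entry, and by default the script only prints the steps (DRY_RUN=1) instead of executing them.

```bash
#!/usr/bin/env bash
# Sketch of the offline ext4 check. DRY_RUN=1 (the default) only prints
# each step; set DRY_RUN=0 to really execute them (requires root).

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 = mount point, $2 = block device, $3 = systemd unit of the database
check_fs() {
  local mnt="$1" dev="$2" unit="$3"
  run systemctl stop "$unit" || return 1   # stop Postgres first
  run lsof +D "$mnt"                       # show anything still using the mount point
  run umount "$mnt" || return 1            # never run e2fsck on a mounted file system
  run e2fsck -f "$dev"                     # force a check even if marked clean
  run dumpe2fs -h "$dev"                   # superblock info includes "Last checked"
  run mount "$mnt" || return 1             # relies on an /etc/fstab entry for $mnt
  run systemctl start "$unit"
}
```

Calling check_fs /srv /dev/XXX postgresql@9.5-main.service prints the seven steps in order; /dev/XXX stands for the real device. If lsof still lists open files, stop those services before going any further.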
If you have a physical server, be sure to check the state of the disks (with smartctl -a /dev/XXX) or of the RAID controller, to make sure the problem is not at the hardware level. In my case the RAID turned out to be hardware-based, so I asked the local administrator to check its status (the server was several hundred kilometers away from me). He said there were no errors, which means we can definitely start the recovery.
Attempt 1: zero_damaged_pages
Connect to the database with psql using an account that has superuser privileges. We need a superuser because only a superuser can change the zero_damaged_pages option. In my case, that is postgres:
psql -h 127.0.0.1 -U postgres -s [database_name]
The zero_damaged_pages option is needed to ignore read errors (from the postgrespro website):
Detection of a damaged page header normally causes PostgreSQL to report an error, aborting the current transaction. Setting zero_damaged_pages to on causes the system to instead report a warning, zero out the damaged page in memory, and continue processing. This behavior will destroy data, namely all the rows on the damaged page.
We enable the option and try to run a full vacuum of the table:
VACUUM FULL VERBOSE ws_log_smevlog;
Unfortunately, failure.
We ran into a similar error:
INFO: vacuuming "public.ws_log_smevlog"
WARNING: invalid page in block 4123007 of relation base/16400/21396989; zeroing out page
ERROR: unexpected chunk number 573 (expected 565) for toast value 21648541 in pg_toast_106070
pg_toast is the mechanism Postgres uses to store "long data" that does not fit into a single page (8 kB by default).
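The error names a TOAST relation rather than the table itself. As a hedged aside, the sketch below prints a catalog query that maps such a name back to the owning table through pg_class.reltoastrelid. The helper name toast_owner_sql is hypothetical, and nothing is run against a database here.

```bash
#!/usr/bin/env bash
# Sketch: print a catalog query that maps a TOAST relation name back to the
# table that owns it. The query is only printed, not executed.

toast_owner_sql() {
  local toast_rel="$1"   # e.g. pg_toast_106070, taken from the error message
  cat <<SQL
SELECT c.oid::regclass AS owning_table
FROM pg_class c
WHERE c.reltoastrelid = 'pg_toast.${toast_rel}'::regclass;
SQL
}

toast_owner_sql pg_toast_106070
```

The output can be piped into psql, for example toast_owner_sql pg_toast_106070 | psql -U postgres -d my_database, where my_database is a placeholder.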
Attempt 2: reindex
The first tip from Google did not help. After a few minutes of searching I found a second one: run reindex on the damaged table. I had seen this advice in many places, but it did not inspire confidence. Let's reindex:
reindex table ws_log_smevlog 
The reindex completed without problems.
However, it did not help: VACUUM FULL crashed with the same error. Being used to failures by now, I kept searching the internet for advice and came across a rather interesting article.
Attempt 3: SELECT, LIMIT, OFFSET
That article suggested going through the table row by row and deleting the problematic data. First, all the rows had to be read one at a time:
for ((i=0; i<"Number_of_rows_in_nodes"; i++ )); do psql -U "Username" "Database Name" -c "SELECT * FROM nodes LIMIT 1 offset $i" >/dev/null || echo $i; done
In my case the table contained 1,628,991 rows! That is something that should have been dealt with properly, but it is a topic for a separate discussion. It was Saturday, so I started this command in tmux and went to bed:
for ((i=0; i<1628991; i++ )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog LIMIT 1 offset $i" >/dev/null || echo $i; done
In the morning I decided to check how things were going. To my surprise, after 20 hours only 2% of the data had been scanned! I did not want to wait 50 days. Another complete failure.
But I did not give up. I wondered why the scan was taking so long. From the documentation (again on postgrespro) I learned:
OFFSET says to skip that many rows before beginning to return rows.
If both OFFSET and LIMIT appear, then OFFSET rows are skipped before starting to count the LIMIT rows that are returned.
When using LIMIT, it is important to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise you will get an unpredictable subset of the query's rows.
Clearly, the command above was flawed. First, there was no ORDER BY, so the result could be wrong. Second, Postgres first had to scan and skip OFFSET rows, so as OFFSET grew, performance would drop even further.
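To put a rough number on that second point, here is a back-of-the-envelope sketch. It assumes the simplest case, where a query with OFFSET k has to read and discard k rows before returning one; the function name rows_skipped is mine and does not come from the article or from Postgres.

```bash
#!/usr/bin/env bash
# Rows skipped in total when probing n rows one at a time with OFFSET 0..n-1:
# 0 + 1 + ... + (n-1) = n*(n-1)/2

rows_skipped() {
  local n="$1"
  echo $(( n * (n - 1) / 2 ))
}

rows_skipped 1628991   # the row count quoted in the article
```

For 1,628,991 rows this prints 1326805024545, roughly 1.3 trillion skipped rows in total, compared with about 1.6 million rows for a single pass over the table. Real plans may differ, but the quadratic growth matches the way the scan kept slowing down.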
Attempt 4: take a dump in text form
Then a seemingly brilliant idea came to me: take the dump in text form and look at the last row that was written.
But first, let's look at the structure of the table ws_log_smevlog:

In our case there is an "id" column that holds a unique identifier (a counter) for each row. The plan was this:
- We start taking the dump in text form (as SQL commands)
- At some point the dump is interrupted by the error, but the text file is still saved on disk
- We look at the end of the text file and so find the identifier (id) of the last row that was dumped successfully
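The last step of the plan could be sketched as follows. This assumes that the interrupted plain-text dump ends with COPY-style, tab-separated data rows and that id is the first column; the helper name last_dumped_id is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: read the id of the last row written to an interrupted plain-text dump.
# Assumes tab-separated COPY rows with the id in the first column.

last_dumped_id() {
  local dumpfile="$1"
  # Take the last non-empty line and keep only its first tab-separated field.
  grep -v '^$' "$dumpfile" | tail -n 1 | cut -f 1
}
```

Usage would be last_dumped_id ./my_dump.dump. It is only meaningful for a dump that stopped in the middle of the table data; a dump that finished normally ends with SQL statements and comments, not with a data row.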
I started the dump in text form:
pg_dump -U my_user -d my_database -F p -t ws_log_smevlog -f ./my_dump.dump
The dump, as expected, failed with the same error:
pg_dump: Error message from server: ERROR: invalid page in block 4123007 of relation base/16490/21396989
Then, using tail, I looked at the end of the dump (tail -5 ./my_dump.dump) and found that it had been interrupted at the row with id 186 525. "So the problem is in the row with id 186 526, it is broken and has to be deleted!" I thought. But after querying the database:
«select * from ws_log_smevlog where id=186529»
it turned out that everything was fine with this row... Rows with ids 186,530 to 186,540 also worked without problems. Another "brilliant idea" had failed. Later I understood why this happened: when data in a table is deleted or changed, it is not physically removed but marked as "dead tuples"; then autovacuum comes along, marks these rows as deleted, and allows the space to be reused. To be clear: if the data in a table changes and autovacuum is enabled, the rows are not stored sequentially.
Attempt 5: SELECT, FROM, WHERE id=
Failures make us stronger. You should never give up; you have to keep going and believe in yourself and your abilities. So I decided to try another option: simply go through all the records in the database one at a time. Knowing the structure of my table (see above), we have an id field that is unique (the primary key). We have 1,628,991 rows in the table and the ids are in order, which means we can simply go through them one by one:
for ((i=1; i<1628991; i=$((i+1)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id=$i" >/dev/null || echo $i; done
For those who do not follow, the command works like this: it goes through the table row by row and sends stdout to /dev/null. If the SELECT fails, the error text is printed (stderr goes to the console), and then the id of the failing row is printed thanks to ||, which runs echo only when the select had problems (the command's return code is not 0).
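The mechanism can be tried without a database. In this small demonstration a made-up function, fake_select, stands in for the psql call and "fails" for ids 3 and 7; everything else follows the same pattern as the command above.

```bash
#!/usr/bin/env bash
# Demonstration of the "command >/dev/null || echo id" pattern without a database.
# fake_select is a hypothetical stand-in for psql: it fails for ids 3 and 7.

fake_select() {
  case "$1" in
    3|7) echo "ERROR: simulated read failure for id $1" >&2; return 1 ;;
    *)   echo "row $1" ;;
  esac
}

scan_ids() {
  local first="$1" last="$2" i
  i="$first"
  while [ "$i" -le "$last" ]; do
    # The row itself (stdout) is discarded; on a non-zero exit status the id is printed.
    fake_select "$i" >/dev/null || echo "$i"
    i=$(( i + 1 ))
  done
}

scan_ids 1 10
```

On stdout this prints 3 and 7, each preceded on the console by the simulated error message from stderr, which is the same shape as the output of the real scan.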
I was lucky: there were indexes on the id field:

This means that finding the row with a given id should not take long. In theory, it should work. So let's run the command in tmux and go to bed.
By morning I saw that about 90,000 records had been checked, which is a little over 5%. An excellent result compared with the previous approach (2%)! But I did not want to wait 20 days...
Attempt 6: SELECT, FROM, WHERE id >= and id <
The customer had an excellent server dedicated to the database: a dual-processor Intel Xeon E5-2697 v2. We had 48 threads available! The load on the server was moderate, so we could easily take about 20 threads. There was also plenty of RAM: 384 gigabytes!
So the command needed to be parallelized:
for ((i=1; i<1628991; i=$((i+1)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id=$i" >/dev/null || echo $i; done
I could have written a nice, elegant script here, but I chose the quickest way to parallelize: manually split the range 0-1628991 into intervals of 100,000 records and run 16 commands of the following form separately:
for ((i=N; i<M; i=$((i+1)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id=$i" >/dev/null || echo $i; done
But that is not all. Connecting to the database also takes time and system resources. Making 1,628,991 connections was not very smart, you will agree. So let's fetch 1000 rows per connection instead of just one. The command ended up looking like this:
for ((i=N; i<M; i=$((i+1000)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id>=$i and id<$((i+1000))" >/dev/null || echo $i; done
Open 16 windows in a tmux session and run the commands:
1) for ((i=0; i<100000; i=$((i+1000)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id>=$i and id<$((i+1000))" >/dev/null || echo $i; done
2) for ((i=100000; i<200000; i=$((i+1000)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id>=$i and id<$((i+1000))" >/dev/null || echo $i; done
…
15) for ((i=1400000; i<1500000; i=$((i+1000)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id>=$i and id<$((i+1000))" >/dev/null || echo $i; done
16) for ((i=1500000; i<1628991; i=$((i+1000)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id>=$i and id<$((i+1000))" >/dev/null || echo $i; done
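For comparison, the "nice script" I skipped could look roughly like the sketch below. It is not what I actually ran: the helper names make_ranges, scan_chunk, and scan_all are invented for illustration, my_user and my_database are the same placeholders as above, and background jobs replace the 16 tmux windows.

```bash
#!/usr/bin/env bash
# Sketch: split an id interval into fixed-size chunks and scan them in parallel.

# Print "lo hi" pairs (hi exclusive) covering [first, last) in steps of size.
make_ranges() {
  local first="$1" last="$2" size="$3" lo hi
  lo="$first"
  while [ "$lo" -lt "$last" ]; do
    hi=$(( lo + size ))
    if [ "$hi" -gt "$last" ]; then hi="$last"; fi
    echo "$lo $hi"
    lo="$hi"
  done
}

# One worker: probe its chunk 1000 ids per connection and print the start of
# every batch that fails. The final batch may run past hi, which is harmless.
scan_chunk() {
  local lo="$1" hi="$2" i
  i="$lo"
  while [ "$i" -lt "$hi" ]; do
    psql -U my_user -d my_database \
      -c "SELECT * FROM ws_log_smevlog WHERE id >= $i AND id < $((i + 1000))" \
      >/dev/null || echo "$i"
    i=$(( i + 1000 ))
  done
}

# Start one background job per chunk, then wait for all of them to finish.
scan_all() {
  make_ranges "$1" "$2" "$3" | {
    while read -r lo hi; do
      scan_chunk "$lo" "$hi" &
    done
    wait
  }
}
```

With scan_all 0 1628991 100000 this starts 17 workers of equal size (the last one shorter) rather than my 16 hand-made ranges. The wait has to sit inside the braces, because the loop runs in the subshell created by the pipe.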
A day later I got the first results! Namely (the values XXX and ZZZ were not preserved):
ERROR: missing chunk number 0 for toast value 37837571 in pg_toast_106070
829000
ERROR: missing chunk number 0 for toast value XXX in pg_toast_106070
829000
ERROR: missing chunk number 0 for toast value ZZZ in pg_toast_106070
146000
This means we have three rows with errors. The ids of the first and second problem records were between 829,000 and 830,000, and the id of the third was between 146,000 and 147,000. Next, we just needed to find the exact id values of the problem records. To do that, we go through the ranges containing the problem records with a step of 1 and identify the ids:
for ((i=829000; i<830000; i=$((i+1)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id=$i" >/dev/null || echo $i; done
829417
ERROR: unexpected chunk number 2 (expected 0) for toast value 37837843 in pg_toast_106070
829449
for ((i=146000; i<147000; i=$((i+1)) )); do psql -U my_user -d my_database -c "SELECT * FROM ws_log_smevlog where id=$i" >/dev/null || echo $i; done
829417
ERROR: unexpected chunk number ZZZ (expected 0) for toast value XXX in pg_toast_106070
146911
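The coarse scan followed by the fine scan can be combined into one routine. The sketch below is a variant of what I did by hand, not the exact commands: find_bad_ids and psql_probe are hypothetical names, and the probe can be swapped through the PROBE variable so the logic can be tried without a database.

```bash
#!/usr/bin/env bash
# Sketch: coarse-then-fine search for unreadable rows.
# A probe is called as "probe LO HI" and must return non-zero if any row
# with LO <= id < HI cannot be read.

# Default probe: wraps psql with the placeholder names used in the article.
# Both stdout and stderr are discarded, so the ERROR text is not shown.
psql_probe() {
  psql -U my_user -d my_database \
    -c "SELECT * FROM ws_log_smevlog WHERE id >= $1 AND id < $2" \
    >/dev/null 2>&1
}

find_bad_ids() {
  local lo="$1" hi="$2" step="$3" probe="${PROBE:-psql_probe}" i j end
  i="$lo"
  while [ "$i" -lt "$hi" ]; do
    end=$(( i + step ))
    if [ "$end" -gt "$hi" ]; then end="$hi"; fi
    if ! "$probe" "$i" "$end"; then
      # The batch failed: re-check it one id at a time.
      j="$i"
      while [ "$j" -lt "$end" ]; do
        "$probe" "$j" $(( j + 1 )) || echo "$j"
        j=$(( j + 1 ))
      done
    fi
    i="$end"
  done
}
```

For example, find_bad_ids 829000 830000 1000 would print the exact ids of the unreadable rows in that range, one per line.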
Happy ending
We found the problematic rows. Let's connect to the database with psql and try to delete them:
my_database=# delete from ws_log_smevlog where id=829417;
DELETE 1
my_database=# delete from ws_log_smevlog where id=829449;
DELETE 1
my_database=# delete from ws_log_smevlog where id=146911;
DELETE 1
To my surprise, the records were deleted without any problems, even without the zero_damaged_pages option.
Then I connected to the database, ran VACUUM FULL (I think this was not really necessary), and finally took a backup successfully with pg_dump. The dump completed without errors! The problem was solved in this amazingly dumb way. My joy knew no bounds: after so many failures I had finally found a solution!
Acknowledgments and conclusion
This was my first experience of restoring a real Postgres database. I will remember it for a long time.
And finally, I would like to thank PostgresPro for translating the documentation into Russian, which was very helpful while analyzing the problem.
Source: www.habr.com
