Elasticsearch pawg 200 TB +

Elasticsearch pawg 200 TB +

Ntau tus neeg tawm tsam nrog Elasticsearch. Tab sis ua li cas thaum koj xav siv nws los khaws cov cav "hauv cov ntim tshwj xeeb"? Thiab nws puas tseem mob siab rau kev ua tsis tiav ntawm ib qho ntawm ntau lub chaw khaws ntaub ntawv? Yuav ua li cas hom ntawm architecture koj yuav tsum ua, thiab dab tsi pitfalls koj yuav dawm?

Peb ntawm Odnoklassniki tau txiav txim siab siv elasticsearch los daws qhov teeb meem ntawm kev tswj xyuas lub cav, thiab tam sim no peb qhia peb cov kev paub dhau los nrog Habr: ob qho tib si hais txog kev tsim vaj tsev thiab txog qhov pitfalls.

Kuv yog Pyotr Zaitsev, kuv ua haujlwm ua tus thawj tswj hwm ntawm Odnoklassniki. Ua ntej ntawd, kuv kuj yog tus thawj tswj hwm, ua haujlwm nrog Manticore Search, Sphinx search, Elasticsearch. Tej zaum, yog tias lwm qhov ... tshawb nrhiav tshwm, Kuv yuav zaum ua haujlwm nrog nws thiab. Kuv kuj koom nrog ntau qhov kev qhib qhov project raws li kev yeem.

Thaum kuv tuaj rau Odnoklassniki, kuv tau hais tsis zoo ntawm kev xam phaj tias kuv tuaj yeem ua haujlwm nrog Elasticsearch. Tom qab kuv tau txais lub hang ntawm nws thiab ua tiav qee yam haujlwm yooj yim, kuv tau txais txoj haujlwm loj ntawm kev hloov kho lub cav tswj qhov uas muaj nyob rau lub sijhawm ntawd.

uas yuav tsum tau

Cov kev cai system tau tsim raws li hauv qab no:

  • Greylog yuav tsum tau siv ua lub hauv ntej. Vim tias lub tuam txhab twb muaj kev paub siv cov khoom no, cov programmers thiab cov neeg sim paub nws, nws tau paub thiab yooj yim rau lawv.
  • Cov ntaub ntawv ntim: nruab nrab ntawm 50-80 txhiab cov lus ib ob, tab sis yog tias muaj qee yam tawg, ces kev khiav tsheb tsis txwv los ntawm ib yam dab tsi, nws tuaj yeem yog 2-3 lab kab ib ob.
  • Tom qab sib tham nrog cov neeg siv khoom txog qhov yuav tsum tau ua kom ceev cov lus nug tshawb fawb, peb pom tau hais tias tus qauv ntawm kev siv cov txheej txheem zoo li no yog qhov no: tib neeg tab tom nrhiav cav ntawm lawv daim ntawv thov rau ob hnub dhau los thiab tsis xav tos ntau tshaj ib. thib ob rau qhov tshwm sim ntawm cov lus nug formulated.
  • Cov thawj coj tau hais tias lub kaw lus yuav yooj yim scalable yog tias tsim nyog, tsis tas yuav tsum tau delve tob rau hauv nws ua haujlwm li cas.
  • Yog li ntawd tsuas yog txoj haujlwm tu vaj tse uas cov tshuab no xav tau ib ntus yog hloov qee qhov kho vajtse.
  • Tsis tas li ntawd, Odnoklassniki muaj cov kev coj ua zoo tshaj plaws: txhua qhov kev pabcuam uas peb tso tawm yuav tsum muaj sia nyob ntawm qhov chaw tsis ua haujlwm (dhau, tsis tau npaj thiab txhua lub sijhawm).

Qhov kawg yuav tsum tau ua nyob rau hauv qhov kev siv ntawm qhov project no raug nqi ntau tshaj peb, uas kuv yuav tham txog nyob rau hauv kom meej ntxiv.

Ib puag ncig

Peb ua haujlwm nyob rau hauv plaub lub chaw zov me nyuam, thaum Elasticsearch cov ntaub ntawv nodes tsuas yog nyob rau hauv peb (rau ib tug xov tooj ntawm cov uas tsis yog-technology yog vim li cas).

Cov ntaub ntawv plaub lub chaw muaj kwv yees li 18 txhiab qhov sib txawv ntawm qhov chaw - kho vajtse, ntim, tshuab virtual.

Ib qho tseem ceeb feature: cov pawg pib hauv cov thawv podman tsis nyob rau hauv lub cev tshuab, tab sis nyob rau tus kheej huab khoom ib-huab. Cov thawv ntim tau lav 2 cores, zoo ib yam li 2.0Ghz v4, nrog rau kev rov ua dua cov cores ntxiv yog tias lawv tsis ua haujlwm.

Hauv lwm lo lus:

Elasticsearch pawg 200 TB +

Topology

Kuv pib pom daim ntawv dav dav ntawm kev daws raws li hauv qab no:

  • 3-4 VIPs nyob tom qab A-cov ntaub ntawv ntawm Greylog sau, qhov no yog qhov chaw nyob uas cov cav raug xa mus.
  • Txhua VIP yog LVS balancer.
  • Tom qab nws, cov cav mus rau lub roj teeb Greylog, qee cov ntaub ntawv yog nyob rau hauv GELF hom, qee qhov hauv syslog hom.
  • Tom qab ntawd tag nrho cov no tau sau rau hauv cov khoom loj mus rau lub roj teeb ntawm Elasticsearch tus tswj hwm.
  • Thiab lawv, nyob rau hauv lem, xa sau thiab nyeem ntawv thov rau cov ntaub ntawv muaj feem xyuam.

Elasticsearch pawg 200 TB +

Terminology

Tej zaum tsis yog txhua leej txhua tus nkag siab txog cov ntsiab lus nthuav dav, yog li kuv xav nyob ntawm nws me ntsis.

Elasticsearch muaj ntau hom nodes - tus tswv, tus tswj xyuas, cov ntaub ntawv node. Muaj ob hom kev sib txawv ntawm cov cav sib txawv thiab kev sib txuas lus ntawm cov pawg sib txawv, tab sis peb tsuas yog siv cov npe.

Master
Nws pings tag nrho cov nodes tam sim no nyob rau hauv pawg, tuav ib tug tshiab-to-hnub daim ntawv qhia pawg thiab faib nws ntawm nodes, txheej txheem kev tshwm sim logic, thiab ua ntau yam ntawm pawg dav kev tu vaj tse.

Coordinator
Ua ib txoj haujlwm nkaus xwb: lees txais nyeem lossis sau cov lus thov los ntawm cov neeg siv khoom thiab xa cov tsheb mus los no. Nyob rau hauv rooj plaub uas muaj ib daim ntawv thov, feem ntau yuav, nws yuav nug tus tswv qhov twg shard ntawm qhov cuam tshuam qhov ntsuas nws yuav tsum tau muab tso rau hauv, thiab yuav redirect qhov kev thov ntxiv.

Cov ntaub ntawv node
Khaws cov ntaub ntawv, ua cov lus nug tshawb nrhiav los ntawm sab nraud thiab ua haujlwm ntawm shards nyob rau ntawm nws.

Nkaujnooghawj
Nov yog qee yam zoo li kev sib xyaw ntawm Kibana nrog Logstash hauv ELK pawg. Graylog ua ke ob qho tib si UI thiab lub cav ua cov kav dej. Hauv qab lub hood, Graylog khiav Kafka thiab Zookeeper, uas muab kev sib txuas rau Graylog ua pawg. Graylog tuaj yeem cache cav (Kafka) yog tias Elasticsearch tsis muaj thiab rov ua tsis tiav nyeem thiab sau cov lus thov, pab pawg thiab kos npe raws li cov cai teev tseg. Zoo li Logstash, Graylog muaj kev ua haujlwm los hloov cov kab ua ntej sau rau Elasticsearch.

Tsis tas li ntawd, Graylog muaj ib qho kev pabcuam nrhiav pom uas tso cai rau, raws li ib qho muaj Elasticsearch node, kom tau txais tag nrho daim ntawv qhia pawg thiab lim nws los ntawm cov cim tshwj xeeb, uas ua rau nws muaj peev xwm ncaj qha thov rau cov ntim tshwj xeeb.

Visually nws zoo li ib yam dab tsi zoo li no:

Elasticsearch pawg 200 TB +

Qhov no yog ib qho screenshot ntawm ib qho piv txwv tshwj xeeb. Ntawm no peb tsim cov histogram raws li cov lus nug tshawb fawb thiab tso saib cov kab uas cuam tshuam.

Indexes

Rov qab mus rau qhov system architecture, kuv xav nyob hauv cov ntsiab lus ntxiv ntawm yuav ua li cas peb tsim cov qauv ntsuas kom nws txhua tus ua haujlwm raug.

Hauv daim duab saum toj no, qhov no yog qib qis tshaj: Elasticsearch cov ntaub ntawv nodes.

Ib qho Performance index yog ib qhov chaw loj virtual ua los ntawm Elasticsearch shards. Nyob rau hauv nws tus kheej, txhua qhov shards tsis muaj dab tsi ntau tshaj li qhov ntsuas Lucene. Thiab txhua qhov ntsuas Lucene, nyob rau hauv lem, muaj ib lossis ntau ntu.

Elasticsearch pawg 200 TB +

Thaum tsim qauv, peb xav tias txhawm rau ua kom tau raws li qhov yuav tsum tau muaj rau kev nyeem ntawv ceev ntawm cov ntaub ntawv loj, peb yuav tsum "kho" cov ntaub ntawv no sib npaug ntawm cov ntaub ntawv nodes.

Qhov no ua rau lub fact tias tus naj npawb ntawm shards ib index (nrog replicas) yuav tsum nruj me ntsis sib npaug zos rau tus naj npawb ntawm cov ntaub ntawv nodes. Ua ntej, txhawm rau kom ntseeg tau tias qhov sib npaug sib npaug ntawm ob (uas yog, peb tuaj yeem poob ib nrab ntawm pawg). Thiab, qhov thib ob, txhawm rau ua cov ntawv nyeem thiab sau cov lus thov tsawg kawg ib nrab ntawm pawg.

Peb thawj zaug txiav txim lub sijhawm cia li 30 hnub.

Kev faib tawm ntawm shards tuaj yeem sawv cev graphically raws li hauv qab no:

Elasticsearch pawg 200 TB +

Tag nrho cov duab plaub grey tsaus yog qhov ntsuas. Sab laug liab square nyob rau hauv nws yog thawj shard, thawj nyob rau hauv lub Performance index. Thiab lub xiav square yog ib tug replica shard. Lawv nyob hauv cov chaw sib txawv cov ntaub ntawv.

Thaum peb ntxiv lwm shard, nws mus rau qhov chaw thib peb cov ntaub ntawv. Thiab, thaum kawg, peb tau txais cov qauv no, uas ua rau nws tuaj yeem poob DC yam tsis tau poob cov ntaub ntawv sib xws:

Elasticsearch pawg 200 TB +

Kev sib hloov ntawm indexes, i.e. tsim qhov ntsuas tshiab thiab tshem tawm qhov qub tshaj plaws, peb ua kom nws sib npaug li 48 teev (raws li tus qauv ntawm kev siv index: 48 teev dhau los yog tshawb nrhiav feem ntau).

Qhov kev hloov pauv qhov ntsuas no yog vim li cas hauv qab no:

Thaum qhov kev thov tshawb fawb tuaj txog ntawm cov ntaub ntawv tshwj xeeb, tom qab ntawd, los ntawm qhov kev ua tau zoo ntawm qhov pom, nws muaj txiaj ntsig ntau dua thaum ib tus shard raug nug, yog tias nws qhov loj me piv rau qhov loj ntawm lub hauv siab. Qhov no tso cai rau koj khaws qhov "kub" ntawm qhov ntsuas hauv ib heap thiab nkag mus sai sai. Thaum muaj ntau qhov "kub qhov chaw", qhov nrawm ntawm kev tshawb nrhiav qhov ntsuas degrades.

Thaum ib lub pob pib ua cov lus nug tshawb fawb ntawm ib qho shard, nws faib cov xov tooj sib npaug rau tus naj npawb ntawm hyperthreading cores ntawm lub tshuab lub cev. Yog tias cov lus nug tshawb fawb cuam tshuam rau ntau tus shards, ces tus xov tooj ntawm threads loj hlob proportionally. Qhov no muaj kev cuam tshuam tsis zoo rau kev tshawb nrhiav nrawm thiab cuam tshuam tsis zoo rau kev txheeb xyuas cov ntaub ntawv tshiab.

Txhawm rau muab qhov tsim nyog tshawb nrhiav latency, peb txiav txim siab siv SSD. Yuav kom ua tiav cov lus thov sai, cov tshuab uas tuav cov thawv no yuav tsum muaj tsawg kawg yog 56 cores. Daim duab ntawm 56 tau raug xaiv raws li tus nqi txaus uas txiav txim siab cov xov tooj uas Elasticsearch yuav tsim thaum lub sijhawm ua haujlwm. Hauv Elasitcsearch, ntau cov xov pas dej tsis sib haum ncaj qha nyob ntawm tus naj npawb ntawm cov cores muaj, uas cuam tshuam ncaj qha rau tus lej ntawm cov nodes hauv pawg raws li lub hauv paus ntsiab lus "tsawg dua cores - ntau cov nodes".

Yog li ntawd, peb pom tias qhov nruab nrab ntawm ib qho shard hnyav txog 20 gigabytes, thiab muaj 1 ​​shards ib qhov Performance index. Yog li, yog tias peb tig lawv ib zaug txhua 360 teev, ces peb muaj 48 ntawm lawv. Txhua qhov ntsuas muaj cov ntaub ntawv rau 15 hnub.

Cov ntaub ntawv sau thiab nyeem ntawv circuits

Cia peb xav seb yuav sau cov ntaub ntawv li cas hauv qhov system no.

Cia peb hais qee qhov kev thov tuaj txog ntawm Graylog mus rau tus neeg saib xyuas. Piv txwv li, peb xav index 2-3 txhiab kab.

Tus neeg saib xyuas, tau txais kev thov los ntawm Graylog, nug tus tswv: "Nyob rau hauv daim ntawv thov indexing, peb tshwj xeeb teev ib qho kev ntsuas, tab sis nyob rau hauv uas shard sau nws tsis tau teev."

Tus tswv teb: "Sau cov ntaub ntawv no rau shard tus lej 71," tom qab ntawd nws raug xa ncaj qha mus rau cov ntaub ntawv cuam tshuam, qhov chaw thawj-shard tus lej 71 nyob.

Tom qab ntawd cov ntaub ntawv sib pauv hloov mus rau ib qho replica-shard, uas nyob rau hauv lwm qhov chaw cov ntaub ntawv.

Elasticsearch pawg 200 TB +

Kev tshawb nrhiav tuaj txog ntawm Graylog mus rau tus neeg saib xyuas. Tus neeg saib xyuas redirects nws raws li qhov Performance index, thaum Elasticsearch faib kev thov ntawm thawj-shard thiab replica-shard siv lub round-robin txoj cai.

Elasticsearch pawg 200 TB +

Lub 180 nodes teb tsis sib npaug, thiab thaum lawv teb, tus neeg koom tes tau sau cov ntaub ntawv uas twb tau "spitted" los ntawm cov ntaub ntawv ceev dua. Tom qab ntawd, thaum twg tag nrho cov ntaub ntawv tau tuaj txog, lossis qhov kev thov tau mus txog lub sijhawm, nws muab txhua yam ncaj qha rau tus neeg siv khoom.

Qhov no tag nrho cov txheej txheem ntawm nruab nrab cov txheej txheem tshawb nrhiav cov lus nug kawg rau 48 teev dhau los hauv 300-400ms, tsis suav nrog cov lus nug nrog cov ntawv ua ntej.

Paj nrog Elasticsearch: Java teeb

Elasticsearch pawg 200 TB +

Txhawm rau ua kom txhua yam ua haujlwm raws li qhov peb xav tau, peb tau siv sijhawm ntev heev kom debugging ntau yam hauv pawg.

Thawj feem ntawm cov teeb meem nrhiav pom muaj feem xyuam rau txoj kev Java yog pre-configured los ntawm lub neej ntawd hauv Elasticsearch.

Teeb meem ib
Peb tau pom ntau cov lus ceeb toom hais tias nyob rau theem Lucene, thaum cov haujlwm tom qab tab tom khiav, Lucene ntu kev sib koom ua ke ua tsis tiav nrog qhov yuam kev. Nyob rau tib lub sijhawm, nws tau pom tseeb hauv cov cav tias qhov no yog qhov yuam kev OutOfMemoryError. Peb pom los ntawm telemetry tias lub duav yog dawb, thiab nws tsis meej vim li cas qhov haujlwm no ua tsis tiav.

Nws muab tawm tias Lucene index sib koom ua ke tshwm sim sab nraum lub duav. Thiab cov thawv ntim tau nruj nruj raws li cov peev txheej siv. Tsuas yog heap tuaj yeem haum rau cov peev txheej no (tus nqi heap.size yog kwv yees li sib npaug ntawm RAM), thiab qee qhov haujlwm tawm ntawm heap poob nrog lub cim xeeb faib yuam kev yog vim qee qhov lawv tsis haum rau ~ 500MB uas tseem nyob ua ntej qhov txwv.

Kev kho yog qhov tsis tseem ceeb: tus nqi ntawm RAM muaj rau lub thawv tau nce, tom qab ntawd peb tsis nco qab tias peb txawm muaj teeb meem zoo li no.

Teeb meem ob
4-5 hnub tom qab tso tawm ntawm pawg, peb pom tias cov ntaub ntawv nodes pib ib ntus tawm ntawm pawg thiab nkag mus tom qab 10-20 vib nas this.

Thaum peb pib txheeb xyuas nws, nws tau muab tawm tias qhov kev nco tsis zoo no hauv Elasticsearch tsis tau tswj hwm txhua txoj hauv kev. Thaum peb muab ntau lub cim xeeb rau lub thawv, peb tuaj yeem sau cov pas dej ncaj qha nrog ntau cov ntaub ntawv, thiab nws tau raug tshem tawm tsuas yog tom qab qhov tseeb GC tau pib los ntawm Elasticsearch.

Qee qhov xwm txheej, qhov haujlwm no tau siv sijhawm ntev heev, thiab lub sijhawm no pawg tswj hwm tau txheeb xyuas qhov node raws li twb tau tawm lawm. Qhov teeb meem no tau piav qhia zoo no.

Txoj kev daws teeb meem yog raws li hauv qab no: peb txwv Java lub peev xwm los siv ntau lub cim xeeb sab nraum lub heap rau cov haujlwm no. Peb txwv nws rau 16 gigabytes (-XX:MaxDirectMemorySize = 16g), xyuas kom meej tias GC raug hu ua ntau zaus thiab ua tiav sai dua, yog li tsis muaj kev cuam tshuam rau pawg.

Teeb meem peb
Yog tias koj xav tias cov teeb meem nrog "nodes tawm hauv pawg ntawm lub sijhawm tsis xav txog tshaj plaws" dhau lawm, koj ua yuam kev.

Thaum peb teeb tsa kev ua haujlwm nrog indexes, peb xaiv mmapfs rau txo lub sijhawm tshawb nrhiav ntawm tshiab shards nrog zoo segmentation. Qhov no yog qhov tsis txaus ntseeg, vim tias thaum siv mmapfs cov ntaub ntawv tau teeb tsa rau hauv RAM, thiab tom qab ntawd peb ua haujlwm nrog cov ntaub ntawv mapped. Vim li no, nws hloov tawm tias thaum GC sim nres cov xov hauv daim ntawv thov, peb mus rau qhov chaw nyab xeeb ntev heev, thiab ntawm txoj kev mus rau nws, daim ntawv thov nres teb rau tus tswv qhov kev thov txog seb nws puas muaj sia nyob. . Raws li txoj cai, tus tswv ntseeg tias qhov node tsis nyob hauv pawg lawm. Tom qab ntawd, tom qab 5-10 vib nas this, cov khib nyiab khib nyiab ua haujlwm, cov node los rau lub neej, nkag mus rau hauv pawg dua thiab pib pib shards. Nws txhua tus xav tias zoo li "cov khoom peb tsim nyog" thiab tsis haum rau ib yam dab tsi loj.

Txhawm rau tshem tawm ntawm tus cwj pwm no, peb thawj zaug hloov mus rau tus qauv niofs, thiab tom qab ntawd, thaum peb tsiv los ntawm lub thib tsib versions ntawm Elastic mus rau thib rau, peb sim hybridfs, qhov twg qhov teeb meem no tsis yog rov tsim dua. Koj tuaj yeem nyeem ntxiv txog hom kev khaws cia no.

Teeb meem plaub
Tom qab ntawd muaj lwm qhov teeb meem nthuav heev uas peb tau kho rau lub sijhawm sau tseg. Peb ntes tau 2-3 lub hlis vim nws cov qauv yog qhov tsis nkag siab kiag li.

Qee lub sij hawm peb cov neeg koom tes tau mus rau Full GC, feem ntau qee zaum tom qab noj su, thiab tsis tau rov qab los ntawm qhov ntawd. Nyob rau tib lub sijhawm, thaum nkag GC ncua sijhawm, nws zoo li qhov no: txhua yam mus zoo, zoo, zoo, thiab tom qab ntawd mam li nco dheev txhua yam mus tsis zoo.

Thaum xub thawj peb xav tias peb muaj ib tus neeg siv siab phem uas tau pib ua qee yam kev thov uas ua rau tus neeg sib koom tes tawm ntawm kev ua haujlwm. Peb tau sau npe thov rau lub sijhawm ntev heev, sim xyuas seb qhov twg tau tshwm sim.

Raws li qhov tshwm sim, nws tau pom tias lub sijhawm no thaum tus neeg siv pib qhov kev thov loj, thiab nws tau mus rau tus kws saib xyuas Elasticsearch tshwj xeeb, qee qhov nodes teb ntev dua li lwm tus.

Thiab thaum tus neeg saib xyuas tab tom tos cov lus teb los ntawm tag nrho cov nodes, nws sau cov txiaj ntsig xa los ntawm cov nodes uas twb tau teb lawm. Rau GC, qhov no txhais tau tias peb cov qauv siv heap hloov sai heev. Thiab GC uas peb siv tsis tuaj yeem tiv nrog txoj haujlwm no.

Qhov kev kho tsuas yog peb pom los hloov tus cwj pwm ntawm pawg hauv qhov xwm txheej no yog kev tsiv teb tsaws chaw rau JDK13 thiab siv Shenandoah cov khoom khib nyiab. Qhov no daws qhov teeb meem, peb cov neeg koom tes tau nres.

Qhov no yog qhov teeb meem nrog Java xaus thiab teeb meem bandwidth pib.

"Berry" nrog Elasticsearch: throughput

Elasticsearch pawg 200 TB +

Cov teeb meem nrog kev xa tawm txhais tau tias peb pawg ua haujlwm ruaj khov, tab sis ntawm qhov siab tshaj ntawm cov ntaub ntawv txheeb xyuas thiab thaum lub sijhawm ua haujlwm, kev ua haujlwm tsis txaus.

Thawj cov tsos mob tau ntsib: thaum qee qhov "explosions" nyob rau hauv ntau lawm, thaum muaj coob tus cav yuav dheev generated, indexing yuam kev es_rejected_execution pib flash nquag hauv Graylog.

Qhov no yog vim lub fact tias thread_pool.write.queue ntawm ib cov ntaub ntawv node, mus txog rau thaum lub sij hawm Elasticsearch muaj peev xwm ua cov indexing thov thiab upload cov ntaub ntawv mus rau lub shard ntawm disk, yog muaj peev xwm cache tsuas yog 200 thov los ntawm lub neej ntawd. Thiab hauv Cov ntaub ntawv Elasticsearch Tsawg heev yog hais txog qhov parameter no. Tsuas yog qhov siab tshaj plaws ntawm cov xov thiab qhov loj me yog qhia.

Tau kawg, peb tau mus twist tus nqi no thiab pom cov hauv qab no: tshwj xeeb, hauv peb qhov teeb tsa, txog li 300 qhov kev thov raug kaw zoo heev, thiab tus nqi siab dua yog qhov tsis txaus ntseeg nrog qhov tseeb tias peb rov ya mus rau Full GC.

Tsis tas li ntawd, txij li cov no yog pawg ntawm cov lus uas tuaj txog hauv ib qho kev thov, nws yog ib qho tsim nyog yuav tsum tau tweak Graylog kom nws tsis sau ntau zaus thiab hauv cov khoom me me, tab sis hauv cov khoom loj lossis ib zaug txhua 3 vib nas this yog tias cov batch tseem tsis tiav. Nyob rau hauv cov ntaub ntawv no, nws hloov tawm hais tias cov ntaub ntawv uas peb sau nyob rau hauv Elasticsearch yuav tsis muaj nyob rau hauv ob lub vib nas this, tab sis nyob rau hauv tsib (uas suits peb zoo heev), tab sis tus naj npawb ntawm retrays uas yuav tsum tau ua nyob rau hauv thiaj li yuav thawb los ntawm ib tug loj. pawg ntawm cov ntaub ntawv raug txo.

Qhov no yog qhov tseem ceeb tshwj xeeb tshaj yog nyob rau lub sijhawm ntawd thaum qee yam tau tsoo qhov chaw thiab furiously ceeb toom txog nws, yog li kom tsis txhob tau txais ib qho spammed Elastic kiag li, thiab tom qab qee lub sij hawm - Greylog nodes uas ua haujlwm tsis tau vim yog cov khoom thaiv.

Tsis tas li ntawd, thaum peb muaj cov kev tawg zoo li no hauv kev tsim khoom, peb tau txais kev tsis txaus siab los ntawm cov programmers thiab cov neeg sim: thaum lub sijhawm lawv xav tau cov cav no tiag tiag, lawv tau muab rau lawv qeeb heev.

Lawv pib xam nws tawm. Ntawm ib sab, nws tau pom tseeb tias ob qho tib si kev tshawb nrhiav thiab kev ntsuas cov lus nug tau ua tiav, qhov tseem ceeb, ntawm tib lub tshuab lub cev, thiab ib txoj hauv kev los yog lwm qhov yuav muaj qee qhov kev poob qis.

Tab sis qhov no tuaj yeem raug cuam tshuam ib nrab vim qhov tseeb tias nyob rau hauv thib rau versions ntawm Elasticsearch, ib qho algorithm tau tshwm sim uas tso cai rau koj faib cov lus nug ntawm cov ntaub ntawv cuam tshuam tsis raws li txoj cai random round-robin (lub thawv uas ua indexing thiab tuav cov thawj. -shard tuaj yeem ua haujlwm heev, yuav tsis muaj txoj hauv kev los teb sai), tab sis xa mus rau qhov kev thov no mus rau lub thawv ntim tsawg dua nrog lub replica-shard, uas yuav teb sai dua. Hauv lwm lo lus, peb tuaj txog ntawm use_adaptive_replica_selection: tseeb.

Daim duab nyeem pib zoo li no:

Elasticsearch pawg 200 TB +

Kev hloov pauv mus rau qhov algorithm no ua rau nws muaj peev xwm txhim kho cov lus nug hauv lub sijhawm ntawd thaum peb muaj cov ntaub ntawv loj los sau.

Thaum kawg, qhov teeb meem tseem ceeb yog qhov kev tshem tawm tsis mob ntawm cov ntaub ntawv chaw.

Qhov peb xav tau los ntawm pawg tam sim ntawd tom qab poob kev sib txuas nrog ib qho DC:

  • Yog tias peb muaj tus tswv tam sim no hauv qhov chaw ua tsis tiav, ces nws yuav raug xaiv dua thiab hloov mus ua lub luag haujlwm rau lwm qhov ntawm lwm qhov DC.
  • Tus tswv yuav sai sai tshem tag nrho cov nodes nkag tsis tau los ntawm pawg.
  • Raws li cov uas tseem tshuav, nws yuav nkag siab: nyob rau hauv cov ntaub ntawv poob peb muaj xws li thiab cov thawj shards, nws yuav sai sai txhawb complementary replica shards nyob rau hauv cov ntaub ntawv seem, thiab peb yuav txuas ntxiv indexing cov ntaub ntawv.
  • Raws li qhov tshwm sim ntawm qhov no, pawg kev sau ntawv thiab kev nyeem ntawv yuav maj mam txo qis, tab sis feem ntau txhua yam yuav ua haujlwm, txawm tias maj mam, tab sis ruaj khov.

Raws li nws tau tshwm sim, peb xav tau qee yam zoo li no:

Elasticsearch pawg 200 TB +

Thiab peb tau txais cov hauv qab no:

Elasticsearch pawg 200 TB +

Qhov no tshwm sim li cas?

Thaum lub chaw khaws ntaub ntawv poob, peb tus tswv tau los ua lub fwj.

ΠŸΠΎΡ‡Π΅ΠΌΡƒ?

Qhov tseeb yog tias tus tswv muaj TaskBatcher, uas yog lub luag haujlwm rau kev faib cov haujlwm thiab cov xwm txheej hauv pawg. Txhua qhov kev tawm ntawm lub qhov, ib qho kev nce qib ntawm shard los ntawm replica mus rau thawj, ib txoj hauj lwm los tsim ib tug shard qhov chaw - tag nrho cov no mus ua ntej mus rau TaskBatcher, qhov twg nws yog ua raws li nyob rau hauv ib tug xov tooj.

Thaum lub sijhawm tshem tawm ntawm ib lub chaw khaws ntaub ntawv, nws tau pom tias tag nrho cov ntaub ntawv ntawm cov ntaub ntawv tseem muaj sia nyob tau suav tias yog lawv lub luag haujlwm los qhia rau tus tswv "peb tau ploj xws li shards thiab xws li thiab cov ntaub ntawv nodes."

Nyob rau tib lub sijhawm, cov ntaub ntawv tseem muaj sia nyob tau xa tag nrho cov ntaub ntawv no mus rau tus tswv tam sim no thiab sim tos kom paub tseeb tias nws lees txais nws. Lawv tsis tau tos txog qhov no, txij li tus tswv tau txais kev pabcuam sai dua li nws tuaj yeem teb tau. Cov nodes tau ncua sij hawm rov qab thov, thiab tus tswv ntawm lub sij hawm no tsis txawm sim teb lawv, tab sis tau absorbed kiag li nyob rau hauv txoj hauj lwm ntawm sorting thov los ntawm qhov tseem ceeb.

Hauv daim ntawv davhlau ya nyob twg, nws tau muab tawm tias cov ntaub ntawv nodes spammed tus tswv mus rau qhov chaw uas nws mus rau tag nrho GC. Tom qab ntawd, peb tus tswv lub luag haujlwm tau tsiv mus rau qee qhov tom ntej no, tib yam nkaus li tshwm sim rau nws, thiab vim li ntawd cov pawg tau tawg tag.

Peb tau ntsuas, thiab ua ntej version 6.4.0, qhov no tau kho, nws txaus rau peb ib txhij tso tawm tsuas yog 10 cov ntaub ntawv tawm ntawm 360 txhawm rau txhawm rau kaw tag nrho pawg.

Nws ntsia ib yam li no:

Elasticsearch pawg 200 TB +

Tom qab version 6.4.0, qhov twg cov kab phem no tau kho, cov ntaub ntawv nodes nres tua tus tswv. Tab sis qhov ntawd tsis ua rau nws "smarter." Namely: thaum peb tso tawm 2, 3 lossis 10 (txhua tus lej uas tsis yog ib qho) cov ntaub ntawv nodes, tus tswv tau txais qee cov lus thawj uas hais tias ntawm node A tau tawm mus, thiab sim qhia node B, node C txog qhov no, ntawm D.

Thiab tam sim no, qhov no tsuas yog daws tau los ntawm kev teeb tsa lub sijhawm rau kev sim qhia ib tus neeg txog ib yam dab tsi, sib npaug li 20-30 vib nas this, thiab yog li tswj qhov ceev ntawm cov ntaub ntawv chaw tsiv tawm ntawm pawg.

Hauv cov ntsiab lus, qhov no haum rau cov kev xav tau uas tau nthuav tawm thawj zaug rau cov khoom kawg uas yog ib feem ntawm qhov project, tab sis los ntawm qhov pom ntawm "kev tshawb fawb ntshiab" qhov no yog kab laum. Uas, los ntawm txoj kev, tau ua tiav kho los ntawm cov neeg tsim khoom hauv version 7.2.

Tsis tas li ntawd, thaum qee cov ntaub ntawv node tawm mus, nws tau pom tias kev tshaj tawm cov ntaub ntawv hais txog nws txoj kev tawm yog qhov tseem ceeb tshaj li qhia tag nrho pawg tias muaj xws li thiab cov thawj-shards ntawm nws (kom txhawb nqa ib qho replica-shard hauv lwm cov ntaub ntawv. nyob rau hauv lub hauv paus, thiab hauv cov ntaub ntawv tuaj yeem sau rau ntawm lawv).

Yog li ntawd, thaum txhua yam twb tuag lawm, cov ntaub ntawv tso tawm nodes tsis tam sim ntawd cim raws li stale. Raws li, peb raug yuam kom tos kom txog thaum tag nrho cov pings tau teem sijhawm rau cov ntaub ntawv tso tawm, thiab tsuas yog tom qab ntawd peb pawg pib qhia peb tias nyob ntawd, muaj, thiab qhov ntawd peb yuav tsum tau sau cov ntaub ntawv txuas ntxiv. Koj tuaj yeem nyeem ntxiv txog qhov no no.

Raws li qhov tshwm sim, kev ua haujlwm ntawm kev tshem tawm cov chaw khaws ntaub ntawv niaj hnub no yuav siv sijhawm li 5 feeb thaum lub sijhawm maj nrawm. Rau xws li ib tug loj thiab clumsy colossus, qhov no yog ib tug zoo nkauj tshwm sim.

Yog li ntawd, peb tuaj rau qhov kev txiav txim siab hauv qab no:

  • Peb muaj 360 cov ntaub ntawv nodes nrog 700 gigabyte disks.
  • 60 tus neeg koom tes rau kev khiav tsheb khiav los ntawm cov tib cov ntaub ntawv nodes.
  • 40 tus tswv uas peb tau tso tseg raws li ib yam khoom qub txeeg qub teg txij li versions ua ntej 6.4.0 - txhawm rau kom muaj sia nyob ntawm kev tshem tawm ntawm cov ntaub ntawv chaw, peb tau npaj siab kom poob ntau lub tshuab txhawm rau txhawm rau lav kom muaj pawg thawj coj txawm nyob hauv qhov xwm txheej phem tshaj plaws
  • Txhua qhov kev sim ua ke cov luag haujlwm ntawm ib lub thawv tau ntsib nrog qhov tseeb tias tsis ntev los yog tom qab lub node yuav tawg nyob rau hauv load.
  • Tag nrho cov pawg siv heap.size ntawm 31 gigabytes: txhua qhov kev sim txo qhov loj me uas ua rau muaj kev tua qee qhov ntawm cov lus nug hnyav nrog cov ntawv ua ntej lossis tau txais lub Circuit Court breaker hauv Elasticsearch nws tus kheej.
  • Tsis tas li ntawd, txhawm rau kom ntseeg tau tias kev tshawb nrhiav kev ua tau zoo, peb tau sim khaws cov khoom hauv pawg kom tsawg li sai tau, txhawm rau ua cov txheej txheem tsawg li sai tau hauv lub fwj uas peb tau txais hauv tus tswv.

Thaum kawg txog kev saib xyuas

Txhawm rau kom ntseeg tau tias txhua qhov no ua haujlwm raws li qhov xav tau, peb saib xyuas cov hauv qab no:

  • Txhua cov ntaub ntawv node qhia rau peb huab tias nws muaj, thiab muaj xws li thiab xws li shards rau nws. Thaum peb tua ib yam dab tsi, pawg qhia tom qab 2-3 vib nas this nyob rau hauv nruab nrab A peb tua cov nodes 2, 3, thiab 4 - qhov no txhais tau hais tias nyob rau hauv lwm cov ntaub ntawv chaw peb nyob rau hauv tsis muaj xwm txheej yuav tua tau cov nodes uas tsuas muaj ib tug shard. sab laug.
  • Paub qhov xwm txheej ntawm tus tswv tus cwj pwm, peb ua tib zoo saib xyuas cov haujlwm tseem ceeb ntawm cov haujlwm tseem ceeb. Vim tias txawm tias ib qho haujlwm daig, yog tias nws tsis ua haujlwm raws sijhawm, kev xav hauv qee qhov xwm txheej kub ntxhov tuaj yeem dhau los ua qhov laj thawj vim li cas, piv txwv li, kev nce qib ntawm kev hloov pauv hauv thawj zaug tsis ua haujlwm, uas yog vim li cas indexing yuav tsum tsis ua haujlwm.
  • Peb kuj saib zoo heev ntawm cov khoom khib nyiab qeeb, vim tias peb twb muaj teeb meem loj nrog qhov no thaum lub sij hawm optimization.
  • Tsis lees paub los ntawm xov kom nkag siab ua ntej qhov twg lub bottleneck yog.
  • Zoo, cov qauv ntsuas xws li heap, RAM thiab I / O.

Thaum saib xyuas lub tsev, koj yuav tsum coj mus rau hauv tus account cov yam ntxwv ntawm Thread Pool hauv Elasticsearch. Cov ntaub ntawv Elasticsearch piav qhia txog kev xaiv kev teeb tsa thiab qhov tseem ceeb rau kev tshawb nrhiav thiab kev ntsuas, tab sis nyob twj ywm ntawm thread_pool.management.Cov txheej txheem no, tshwj xeeb, cov lus nug xws li _cat/shards thiab lwm yam zoo sib xws, uas yooj yim siv thaum sau ntawv saib xyuas. Qhov loj dua cov pawg, qhov kev thov ntau dua yog ua tiav ib chav tsev ntawm lub sijhawm, thiab cov lus hais saum toj no thread_pool.management tsis yog tsuas yog nthuav tawm hauv cov ntaub ntawv raug cai, tab sis kuj raug txwv los ntawm lub neej ntawd rau 5 threads, uas tau muab pov tseg sai heev, tom qab. uas saib xyuas tsis ua haujlwm raug.

Qhov kuv xav hais hauv qhov xaus: peb tau ua! Peb muaj peev xwm muab peb cov programmers thiab cov tsim tawm ib lub cuab yeej uas, yuav luag txhua qhov xwm txheej, tuaj yeem muab cov ntaub ntawv ceev thiab ntseeg siab txog qhov tshwm sim hauv kev tsim khoom.

Yog, nws tau dhau los ua qhov nyuaj heev, tab sis, txawm li cas los xij, peb tau tswj hwm kom haum peb qhov kev xav tau rau hauv cov khoom uas twb muaj lawm, uas peb tsis tas yuav kho thiab rov sau dua rau peb tus kheej.

Elasticsearch pawg 200 TB +

Tau qhov twg los: www.hab.com

Ntxiv ib saib