Sometimes more is less: when reducing load increases latency

As in most of my posts, this one is about a problem with a distributed service. Let's call the service Alvin. This time I did not find the problem myself; the guys on the client side reported it to me.

One day I woke up to a disgruntled email about long delays with Alvin, which we were planning to launch in the near future. Specifically, the client was seeing 99th percentile latency in the region of 50 ms, well above our latency budget. This was surprising, because I had tested the service extensively, especially for latency, which is a common complaint.

Before I handed Alvin over for testing, I had run many experiments at 40k queries per second (QPS), all of them showing latency below 10 ms. I was ready to reply that I disagreed with their results. But on a closer look at the email I noticed something new: I had not tested the exact conditions they described, because their QPS was much lower than mine. I had tested at 40k QPS, and they were at only 1k. I ran another experiment, this time at the lower QPS, just to appease them.

Since I am blogging about this, you have probably guessed that their numbers were right. I ran my virtual client over and over, always with the same result: a low request rate not only increased latency, it also increased the number of requests with latency above 10 ms. In other words, if at 40k QPS about 50 requests per second exceeded 50 ms, then at 1k QPS about 100 requests per second exceeded 50 ms. A paradox!
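To make the paradox concrete, the absolute counts above can be converted into shares of all requests. This is only a sketch of the arithmetic using the approximate figures quoted in the text; the function name `slow_fraction` is made up for illustration.

```python
def slow_fraction(slow_per_second: float, qps: float) -> float:
    """Return the share of requests that exceed the latency threshold."""
    return slow_per_second / qps

# Figures quoted in the text: requests per second that exceed 50 ms.
high_load = slow_fraction(50, 40_000)   # 40k QPS, about 50 slow requests per second
low_load = slow_fraction(100, 1_000)    # 1k QPS, about 100 slow requests per second

print(f"40k QPS: {high_load:.3%} of requests above 50 ms")
print(f"1k QPS:  {low_load:.3%} of requests above 50 ms")
```

At the lower load the slow share is 80 times larger, which is why the 99th percentile looked fine at 40k QPS and broke the budget at 1k QPS.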

Narrowing the search

When you face a latency problem in a distributed system with many components, the first step is to draw up a short list of suspects. Let's dig a little deeper into Alvin's architecture:

[Diagram: Alvin's architecture]

A good starting point is the list of I/O transitions performed (network calls, disk lookups, and so on). Let's try to work out where the delay is. Besides the obvious I/O with the client, Alvin takes one extra step: it accesses a data store. However, this store runs in the same cluster as Alvin, so the latency there should be lower than the latency to the client. So, the list of suspects:

  1. Network call from the client to Alvin.
  2. Network call from Alvin to the data store.
  3. Disk lookup in the data store.
  4. Network call from the data store to Alvin.
  5. Network call from Alvin to the client.

Let's try to cross out some of the items.

The data store has nothing to do with it

The first thing I did was convert Alvin into a ping-pong server that does not process requests. When it receives a request, it returns an empty response. If the latency drops, then the culprit is a bug in Alvin or in the data store implementation, which would be nothing unheard of. In the first experiment we got the following graph:

[Graph: latency with the ping-pong server]

As you can see, there is no improvement with the ping-pong server. This means the data store is not adding the latency, and the list of suspects shrinks to two items:

  1. Network call from the client to Alvin.
  2. Network call from Alvin to the client.

Great! The list is shrinking fast. I thought I had almost found the cause.
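The ping-pong idea can be sketched with plain TCP sockets. This is not Alvin's code, which is built on gRPC; it is a minimal illustration that assumes a local server on 127.0.0.1 and answers with a fixed 4-byte reply, because raw TCP cannot carry a truly empty message. The names `serve_once` and `ping_once` are invented for the example.

```python
import socket
import threading

def serve_once(listener: socket.socket) -> None:
    """Accept one connection and answer every request with a fixed reply."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:           # client closed the connection
                break
            conn.sendall(b"pong")  # no processing, no data-store access

def ping_once(port: int, payload: bytes = b"ping") -> bytes:
    """Send one request to the local ping-pong server and return the reply."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
        return client.recv(4096)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(listener,), daemon=True)
worker.start()
reply = ping_once(port)
worker.join(timeout=5)
listener.close()
print(reply)
```

Because the handler does no work at all, any latency that remains must come from the transport between client and server, not from request processing or storage.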

gRPC

Now it is time to introduce a new player: gRPC. This is an open-source library from Google for inter-process RPC communication... Although gRPC is well optimized and widely used, this was my first time using it in a system of this scale, and I expected my implementation to be suboptimal, to say the least.

Having gRPC in the stack raised a new question: maybe my implementation, or gRPC itself, was causing the latency problem? That adds new suspects to the list:

  1. The client calls the gRPC library.
  2. The gRPC library on the client makes a network call to the gRPC library on the server.
  3. The gRPC library calls Alvin (a no-op in the case of the ping-pong server).

To give you an idea of what the code looks like, my client and Alvin implementations are not much different from the gRPC client-server async examples.

Note: the list above is slightly simplified, because gRPC lets you use your own (template?) threading model, in which the execution stack interleaves gRPC and the user implementation. For the sake of simplicity, we will stick to this model.

Profiling will fix everything

Having crossed out the data store, I thought I was almost done: "Now it's easy! Let's run a profiler and find out where the delay occurs." I am a big fan of precise profiling, because CPUs are very fast and are usually not the bottleneck. Most delays occur when the processor has to stop processing in order to do something else. Precise CPU profiling shows exactly that: it accurately records every context switch and makes it clear where the delays occur.

I took four profiles: one at high QPS (low latency) and one with the ping-pong server at low QPS (high latency), each on both the client side and the server side. Just in case, I also took a sampling CPU profile. When comparing profiles, I usually look for an anomalous call stack. For example, the bad, high-latency side often shows many more context switches (10 times as many or more). But in my case the number of context switches was almost the same. To my horror, there was nothing significant there.

More debugging

I was desperate. I did not know what other tools I could use, and my next plan was essentially to repeat the experiments with different variations rather than to diagnose the problem properly.

What if

From the very beginning I had been bothered by the specific figure of 50 ms. That is a very long time. I decided to cut chunks out of the code until I could tell exactly which part was causing the problem. Then came an experiment that worked.

As usual, in hindsight everything looks obvious. I put the client on the same machine as Alvin and sent the requests to localhost. And the increase in latency disappeared!

Something was wrong with the network.

Learning network engineering skills

I must admit that my knowledge of network technologies is terrible, especially considering that I work with them every day. But the network was now the prime suspect, and I had to learn how to debug it.

Fortunately, the Internet loves people who want to learn. The combination of ping and tracert looked like a good enough start for debugging network transport problems.

First, I ran PsPing against Alvin's TCP port. I used the default settings, nothing special. Out of more than a thousand pings, none exceeded 10 ms except the first one, which served as a warm-up. This contradicts the observed 50 ms latency at the 99th percentile: if the network were responsible, about one ping in every 100 should have taken around 50 ms.
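A quick back-of-the-envelope check shows how strongly the ping result argues against the network. The sketch below assumes that pings are independent and that exactly 1% of them would be slow; both are simplifying assumptions, not measurements.

```python
# Assumption: each ping is an independent draw, and 1% of them (the 99th
# percentile tail) would take about 50 ms if the network were the culprit.
slow_share = 0.01
pings = 1000

expected_slow = slow_share * pings        # slow pings we would expect to see
p_none_slow = (1 - slow_share) ** pings   # chance of seeing none at all

print(f"expected slow pings: {expected_slow:.0f}")
print(f"probability of none: {p_none_slow:.1e}")
```

Under these assumptions roughly ten slow pings were expected, and a clean run of a thousand would happen by chance less than once in ten thousand attempts.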

Then I tried tracert: maybe there was a problem at one of the nodes on the route between Alvin and the client. But the tracer came back empty-handed as well.

So the delay was not caused by my code, by the gRPC implementation, or by the network. I was starting to worry that I would never understand this.

Now, which OS are we on

gRPC is widely used on Linux but is exotic on Windows. I decided to try an experiment, and it worked: I created a Linux virtual machine, compiled Alvin for Linux, and deployed it.

And here is what happened: the Linux ping-pong server did not show the delays of the equivalent Windows host, even though the data source was no different. It turned out that the problem was in the gRPC implementation for Windows.

Nagle's algorithm

All this time I had thought I was missing a gRPC flag. Now I understood what was really missing: a Windows flag in gRPC. I found an internal RPC library that I was confident worked well, and looked at all the Winsock flags it set. Then I added all of these flags to gRPC and deployed Alvin on Windows, as a patched Windows ping-pong server!

Almost done: I started removing the added flags one at a time until the regression returned, so that I could pinpoint the cause. It was the infamous TCP_NODELAY, the switch for Nagle's algorithm.
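The one-at-a-time elimination can be sketched as a small loop. Everything here is illustrative: the source names only TCP_NODELAY, so the other two socket options are stand-ins, and `fake_latency_test` is a hypothetical replacement for a real low-QPS latency run.

```python
from typing import Callable, Iterable, List

def find_required_flags(
    flags: Iterable[str],
    is_fast: Callable[[List[str]], bool],
) -> List[str]:
    """Drop flags one at a time; keep those whose removal brings the regression back."""
    active = list(flags)
    required = []
    for flag in list(active):
        trial = [f for f in active if f != flag]
        if is_fast(trial):
            active = trial          # flag was not needed, leave it out
        else:
            required.append(flag)   # regression returned, so this flag matters
    return required

# Hypothetical stand-in for a real latency test: only TCP_NODELAY matters here.
def fake_latency_test(enabled: List[str]) -> bool:
    return "TCP_NODELAY" in enabled

# Illustrative candidate list; the actual set of flags is not given in the text.
candidates = ["SO_KEEPALIVE", "TCP_NODELAY", "SO_REUSEADDR"]
culprits = find_required_flags(candidates, fake_latency_test)
print(culprits)
```

With a real test in place of the stand-in, each iteration costs one full experiment, so this linear scan is only practical for a short list of flags.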

Nagle's algorithm tries to reduce the number of packets sent over the network by delaying the transmission of messages until the packet size exceeds a certain number of bytes. This may be fine for the average user, but it is destructive for real-time servers, because the OS delays some messages, which causes lag at low QPS. In gRPC this flag was set in the Linux implementation for TCP sockets, but not in the Windows one. I fixed this.
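For readers who have not set this option before, here is a minimal sketch of disabling Nagle's algorithm on a TCP socket with Python's standard `socket` module. It is not the gRPC patch itself; the helper name `make_low_latency_socket` is made up for the example.

```python
import socket

def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 asks the OS to send small writes immediately
    # instead of holding them back to coalesce into larger packets.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

sock = make_low_latency_socket()
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
print("TCP_NODELAY enabled:", nodelay != 0)
```

The option is set per socket, so a library that creates its own sockets, as gRPC does, has to set it internally on every platform it supports.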

Conclusion

The higher latency at low QPS was caused by an OS optimization. In retrospect, profiling did not reveal the latency because it occurred in kernel mode rather than in user mode. I do not know whether Nagle's algorithm can be observed through ETW captures, but it would be interesting to find out.

As for the localhost experiment, it probably never touched the real networking code, so Nagle's algorithm did not run, and the latency problem went away when the client reached Alvin through localhost.

The next time you see latency increase as the number of requests per second decreases, Nagle's algorithm should be on your list of suspects!

Source: www.hab.com