BPF for the little ones, part one: extended BPF

In the beginning there was a technology, and it was called BPF. We looked at it in the previous, Old Testament article of this series. In 2013, through the efforts of Alexei Starovoitov and Daniel Borkmann, an improved version of it, optimized for modern 64-bit machines, was developed and included in the Linux kernel. This new technology was briefly called Internal BPF, then renamed Extended BPF, and now, several years later, everyone simply calls it BPF.

Roughly speaking, BPF lets you run arbitrary user-supplied code in Linux kernel space, and the new architecture turned out to be so successful that we will need a dozen more articles to describe all of its applications. (The only thing the developers did not do well, as you can see in the picture below, was to create a decent logo.)

This article describes the structure of the BPF virtual machine, the kernel interfaces for working with BPF, the development tools, and a brief, very brief overview of the existing capabilities, i.e. everything we will need later for a deeper study of the practical applications of BPF.
[Image: the BPF logo]

Contents of the article

Introduction to BPF architecture. First, we will take a bird's-eye view of the BPF architecture and outline its main components.

Registers and instruction set of the BPF virtual machine. With an idea of the architecture as a whole, we will describe the structure of the BPF virtual machine.

Lifecycle of BPF objects, the bpffs file system. In this section we will take a closer look at the lifecycle of BPF objects: programs and maps.

Managing objects with the bpf system call. With some understanding of the system already in place, we will finally look at how to create and manage objects from user space using a dedicated system call, bpf(2).

Writing BPF programs with libbpf. Of course, you can write programs using the system call directly. But it is hard. For more realistic scenarios, the kernel developers created the libbpf library. We will build a basic BPF application skeleton that we will use in later examples.

Kernel Helpers. Here we will learn how BPF programs can call kernel helper functions, a tool that, along with maps, fundamentally expands the capabilities of the new BPF compared to the classic one.

Accessing maps from BPF programs. By this point we will know enough to understand exactly how to create programs that use maps. We will even take a quick peek into the great and mighty verifier.

Development tools. A reference section on how to build the required utilities and kernel for experiments.

Conclusion. At the end of the article, those who have read this far will find motivating words and a short description of what will happen in the following articles. We will also list a number of links for self-study for those who have no desire or ability to wait for the continuation.

Introduction to BPF Architecture

Before we start looking at the BPF architecture, we will refer one last time (oh) to classic BPF, which was developed as a response to the advent of RISC machines and solved the problem of efficient packet filtering. The architecture turned out to be so successful that, having been born in the dashing nineties in Berkeley UNIX, it was ported to most existing operating systems, survived into the crazy twenties and is still finding new applications.

The new BPF was developed as a response to the ubiquity of 64-bit machines, cloud services and the growing need for tools for building SDN (Software-defined networking). Developed by kernel network engineers as an improved replacement for classic BPF, the new BPF found applications in the difficult task of tracing Linux systems literally six months later, and now, six years after its appearance, we will need an entire follow-up article just to list the different types of programs.

At its core, BPF is a sandboxed virtual machine that lets you run "arbitrary" code in kernel space without compromising security. BPF programs are created in user space, loaded into the kernel, and attached to some event source. An event can be, for example, the delivery of a packet to a network interface, the launch of some kernel function, and so on. In the case of a packet, the BPF program has access to the data and metadata of the packet (for reading and, possibly, writing, depending on the program type); in the case of a kernel function, it has access to the arguments of the function, including pointers to kernel memory, and so on.

Let's take a closer look at this process. To begin with, let's talk about the first difference from classic BPF, programs for which were written in assembler. In the new version, the architecture was extended so that programs can be written in high-level languages, primarily, of course, in C. For this, an llvm backend was developed that generates bytecode for the BPF architecture.

The BPF architecture was designed, in part, to run efficiently on modern machines. To make this work in practice, BPF bytecode, once loaded into the kernel, is translated into native code by a component called the JIT compiler (Just In Time). Next, if you remember, in classic BPF the program was loaded into the kernel and attached to the event source atomically, in the context of a single system call. In the new architecture this happens in two stages: first, the code is loaded into the kernel using the bpf(2) system call, and then, later, through other mechanisms that vary depending on the program type, the program is attached to the event source.

Here the reader may have a question: is that even possible? How is the execution safety of such code guaranteed? Execution safety is guaranteed by a stage of loading BPF programs called the verifier:

[Figure: the verifier stage in the loading of a BPF program]

The verifier is a static analyzer that ensures a program does not disrupt the normal operation of the kernel. This, by the way, does not mean that the program cannot interfere with the operation of the system: BPF programs, depending on their type, can read and rewrite sections of kernel memory, change the return values of functions, and trim, extend, rewrite and even forward network packets. The verifier guarantees that running a BPF program will not crash the kernel, and that a program which, according to the rules, has write access to, say, the data of an outgoing packet will not be able to overwrite kernel memory outside the packet. We will look at the verifier in a bit more detail in the corresponding section, after we have become acquainted with all the other components of BPF.

So what have we learned so far? The user writes a program in C and loads it into the kernel using the bpf(2) system call, where it is checked by the verifier and translated into native code. Then the same or another user attaches the program to an event source and it starts executing. Separating loading and attaching is necessary for several reasons. First, running the verifier is relatively expensive, and by loading the same program several times we waste computer time. Second, exactly how a program is attached depends on its type, and one "universal" interface developed a year ago may not suit new types of programs. (Although now that the architecture is becoming more mature, there is an idea to unify this interface at the libbpf level.)

The attentive reader may notice that we are not finished with the pictures yet. Indeed, all of the above does not explain why BPF fundamentally changes the picture compared to classic BPF. Two innovations that significantly expand the scope of applicability are the ability to use shared memory and kernel helper functions. In BPF, shared memory is implemented with so-called maps: shared data structures with a specific API. They probably got this name because the first type of map to appear was a hash table. Then came arrays, local (per-CPU) hash tables and local arrays, search trees, maps containing pointers to BPF programs, and much more. What interests us now is that BPF programs gained the ability to keep state between calls and to share it with other programs and with user space.

Maps are accessed from user processes using the bpf(2) system call, and from BPF programs running in the kernel using helper functions. Moreover, helpers exist not only for working with maps, but also for accessing other kernel capabilities. For example, BPF programs can use helper functions to forward packets to other interfaces, generate perf events, access kernel structures, and so on.

In summary, BPF provides the ability to load arbitrary, i.e. verifier-checked, user code into kernel space. This code can keep state between calls and exchange data with user space, and it also has access to the kernel subsystems allowed for its program type.

This is already similar to the capabilities provided by kernel modules, compared to which BPF has some advantages (of course, you can only compare similar applications, for example system tracing; you cannot write an arbitrary driver with BPF). We can note a lower entry threshold (some utilities that use BPF do not require the user to have kernel programming skills, or programming skills at all), runtime safety (raise your hand in the comments if you never broke the system while writing or testing modules), and atomicity: there is downtime when modules are reloaded, while the BPF subsystem ensures that no events are missed (to be fair, this is not true for all types of BPF programs).

The presence of such capabilities makes BPF a universal tool for extending the kernel, which is confirmed in practice: more and more new program types are added to BPF, more and more large companies use BPF on production servers 24×7, and more and more startups build their business on BPF-based solutions. BPF is used everywhere: in protection against DDoS attacks, in building SDN (for example, implementing networking for kubernetes), as the main system tracing tool and statistics collector, in intrusion detection systems and sandbox systems, and so on.

Let's finish the overview part of the article here and look at the virtual machine and the BPF ecosystem in more detail.

Digression: utilities

To be able to run the examples in the following sections, you may need a number of utilities, at least llvm/clang with bpf support and bpftool. In the section Development tools you can read the instructions for building the utilities, as well as your own kernel. That section is placed further down so as not to disturb the harmony of our presentation.

BPF Virtual Machine Registers and Instruction Set

The architecture and instruction set of BPF were designed with the fact in mind that programs will be written in C and, after being loaded into the kernel, translated into native code. Therefore, the number of registers and the set of instructions were chosen with an eye to the intersection, in the mathematical sense, of the capabilities of modern machines. In addition, various restrictions were imposed on programs: for example, until recently it was not possible to write loops and subroutines, and the number of instructions was limited to 4096 (privileged programs can now load up to a million instructions).

BPF has eleven user-accessible 64-bit registers r0-r10 and a program counter. Register r10 contains the frame pointer and is read-only. At runtime programs have access to a 512-byte stack and an unlimited amount of shared memory in the form of maps.

BPF programs are allowed to call a program-type-specific set of kernel helpers and, more recently, regular functions. Each called function can take up to five arguments, passed in registers r1-r5, and the return value is passed in r0. It is guaranteed that after returning from a function the contents of registers r6-r9 are unchanged.

For efficient translation into native code, registers r0-r10 are mapped one-to-one to real registers on all supported architectures, taking into account the ABI of the target architecture. For example, on x86_64 the registers r1-r5, used to pass function parameters, are mapped to rdi, rsi, rdx, rcx, r8, which are used to pass function parameters on x86_64. For example, the code on the left is translated into the code on the right like this:

1:  (b7) r1 = 1                    mov    $0x1,%rdi
2:  (b7) r2 = 2                    mov    $0x2,%rsi
3:  (b7) r3 = 3                    mov    $0x3,%rdx
4:  (b7) r4 = 4                    mov    $0x4,%rcx
5:  (b7) r5 = 5                    mov    $0x5,%r8
6:  (85) call pc+1                 callq  0x0000000000001ee8

Register r0 is also used to return the result of program execution, and in register r1 the program is passed a pointer to its context. Depending on the program type this can be, for example, a struct xdp_md (for XDP), a struct __sk_buff (for various network programs), a struct pt_regs (for various types of tracing programs), and so on.

So, we have a set of registers, kernel helpers, a stack, a context pointer and shared memory in the form of maps. Not that all of this was absolutely necessary for the trip, but...

Let's continue the description and talk about the instruction set for working with these objects. All (well, almost all) BPF instructions have a fixed 64-bit size. If you look at one instruction on a 64-bit Big Endian machine you will see

[Figure: layout of a 64-bit BPF instruction with the fields Code, Dst, Src, Off and Imm]

Here Code is the encoding of the instruction, Dst/Src are the encodings of the destination and source registers, respectively, Off is a 16-bit signed offset, and Imm is a 32-bit signed integer used in some instructions (similar to the cBPF constant K). The Code encoding has one of two forms:

[Figure: the two possible layouts of the Code field]

Instruction classes 0, 1, 2, 3 define the commands for working with memory. They are called BPF_LD, BPF_LDX, BPF_ST, BPF_STX, respectively. Classes 4, 7 (BPF_ALU, BPF_ALU64) make up the set of ALU instructions. Classes 5, 6 (BPF_JMP, BPF_JMP32) contain the jump instructions.

Our plan for studying BPF instructions is as follows: instead of meticulously listing all the instructions and their parameters, we will look at a couple of examples in this section, and from them it will become clear how the instructions actually work and how to disassemble any BPF binary by hand. To consolidate the material, later in the article we will also meet individual instructions in the sections on the verifier, the JIT compiler, the translation of classic BPF, as well as when studying maps, calling functions, and so on.

When we talk about individual instructions, we will refer to the kernel files bpf.h and bpf_common.h, which define the numeric codes of BPF instructions. When studying the architecture on your own and/or parsing binaries, you can find the semantics in the following sources, sorted in order of complexity: Unofficial eBPF spec, BPF and XDP Reference Guide, Instruction Set, Documentation/networking/filter.txt and, of course, the Linux source code: the verifier, the JIT, the BPF interpreter.

Example: disassembling BPF in your head

Let's look at an example in which we compile the program readelf-example.c and look at the resulting binary. We will reveal the original contents of readelf-example.c below, after we have restored its logic from the binary codes:

$ clang -target bpf -c readelf-example.c -o readelf-example.o -O2
$ llvm-readelf -x .text readelf-example.o
Hex dump of section '.text':
0x00000000 b7000000 01000000 15010100 00000000 ................
0x00000010 b7000000 02000000 95000000 00000000 ................

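The first column in the readelf output is the offset, so our program consists of four instructions. Note that the dump is little-endian: the second byte of each instruction is printed with the source register nibble first and the destination register nibble second, and Off and Imm are shown as raw bytes:

Code Src Dst Off  Imm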
b7   0   0   0000 01000000
15   0   1   0100 00000000
b7   0   0   0000 02000000
95   0   0   0000 00000000

The instruction codes are b7, 15, b7 and 95. Recall that the three least significant bits are the instruction class. In our case the fourth bit of every instruction is zero, so the instruction classes are 7, 5, 7, 5, respectively. Class 7 is BPF_ALU64, and 5 is BPF_JMP. For both classes the instruction format is the same (see above), and we can rewrite our program like this (at the same time rewriting the remaining columns in human-readable form):

Op S  Class   Dst Src Off  Imm
b  0  ALU64   0   0   0    1
1  0  JMP     1   0   1    0
b  0  ALU64   0   0   0    2
9  0  JMP     0   0   0    0

Operation b of class ALU64 is BPF_MOV. It assigns a value to the destination register. If the s (source) bit is set, the value is taken from the source register, and if, as in our case, it is not set, the value is taken from the Imm field. So in the first and third instructions we perform the operation r0 = Imm. Next, operation 1 of class JMP is BPF_JEQ (jump if equal). In our case, since the S bit is zero, it compares the value of the destination register with the Imm field. If the values match, control transfers to PC + Off, where PC, as usual, holds the address of the next instruction. Finally, operation 9 of class JMP is BPF_EXIT. This instruction terminates the program, returning r0 to the kernel. Let's add a new column to our table:

Op    S  Class   Dst Src Off  Imm    Disassm
MOV   0  ALU64   0   0   0    1      r0 = 1
JEQ   0  JMP     1   0   1    0      if (r1 == 0) goto pc+1
MOV   0  ALU64   0   0   0    2      r0 = 2
EXIT  0  JMP     0   0   0    0      exit

We can rewrite this in a more convenient form:

     r0 = 1
     if (r1 == 0) goto END
     r0 = 2
END:
     exit

If we recall that in register r1 the program is passed a pointer to the context from the kernel, and that the value in register r0 is returned to the kernel, we can see that if the pointer to the context is zero we return 1, and otherwise 2. Let's check that we are right by looking at the source:

$ cat readelf-example.c
int foo(void *ctx)
{
        return ctx ? 2 : 1;
}

Yes, it is a meaningless program, but it translates into just four simple instructions.

Exception example: a 16-byte instruction

We mentioned earlier that some instructions take up more than 64 bits. This applies, for example, to the lddw instruction (Code = 0x18 = BPF_LD | BPF_DW | BPF_IMM), which loads a double word from the Imm fields into a register. The point is that Imm is 32 bits wide, while a double word is 64 bits, so loading a 64-bit immediate value into a register with a single 64-bit instruction will not work. To do this, two adjacent instructions are used, and the second half of the 64-bit value is stored in the Imm field of the second one. Example:

$ cat x64.c
long foo(void *ctx)
{
        return 0x11223344aabbccdd;
}
$ clang -target bpf -c x64.c -o x64.o -O2
$ llvm-readelf -x .text x64.o
Hex dump of section '.text':
0x00000000 18000000 ddccbbaa 00000000 44332211 ............D3".
0x00000010 95000000 00000000                   ........

There are only two instructions in the binary program:

Binary                                 Disassm
18000000 ddccbbaa 00000000 44332211    r0 = Imm[0]|Imm[1]
95000000 00000000                      exit

We will meet the lddw instruction again when we talk about relocations and working with maps.

Example: disassembling BPF using standard tools

So, we have learned to read BPF binary codes and are ready to parse any instruction if necessary. However, it is worth saying that in practice it is more convenient and faster to disassemble programs using standard tools, for example:

$ llvm-objdump -d x64.o

Disassembly of section .text:

0000000000000000 <foo>:
 0: 18 00 00 00 dd cc bb aa 00 00 00 00 44 33 22 11 r0 = 1234605617868164317 ll
 2: 95 00 00 00 00 00 00 00 exit

Lifecycle of BPF objects, the bpffs file system

(I first learned some of the details described in this subsection from a post by Alexei Starovoitov on the BPF Blog.)

BPF objects, that is programs and maps, are created from user space using the BPF_PROG_LOAD and BPF_MAP_CREATE commands of the bpf(2) system call; we will talk about exactly how this happens in the next section. This creates kernel data structures, and for each of them refcount (the reference count) is set to one, and a file descriptor pointing to the object is returned to the user. After the descriptor is closed, the refcount of the object is decreased by one, and when it reaches zero the object is destroyed.

If a program uses maps, then the refcount of these maps is increased by one after the program is loaded, i.e. their file descriptors can be closed by the user process and refcount still will not drop to zero:

[Figure: reference counts of a BPF program and the maps it uses]

After successfully loading a program, we usually attach it to some kind of event generator. For example, we can put it on a network interface to process incoming packets, or attach it to some tracepoint in the kernel. At this point the reference count is also increased by one, and we can close the file descriptor in the loader program.

What happens if we now terminate the loader? It depends on the type of event generator (hook). All network hooks continue to exist after the loader exits; these are the so-called global hooks. Tracing programs, on the other hand, are released after the process that created them terminates (and are therefore called local, from "local to the process"). Technically, local hooks always have a corresponding file descriptor in user space and are therefore closed when the process is closed, while global hooks do not. In the figure below, using red crosses, I try to show how the termination of the loader program affects the lifetime of objects in the case of local and global hooks.

[Figure: effect of loader termination on object lifetime for local and global hooks]

Why is there a distinction between local and global hooks? Running some types of network programs makes sense without user space. For example, imagine DDoS protection: the loader writes the rules and attaches the BPF program to the network interface, after which the loader can go and kill itself. On the other hand, imagine a debugging trace program that you wrote on your knee in ten minutes: when it finishes, you would like no garbage to be left in the system, and local hooks ensure exactly that.

Then again, imagine that you want to attach to a tracepoint in the kernel and collect statistics over many years. In this case you would want to terminate the user-space part and come back to the statistics from time to time. The bpf file system provides this opportunity. It is an in-memory-only pseudo-file system that allows creating files that reference BPF objects and thereby increase the refcount of those objects. After that, the loader can exit, and the objects it created will stay alive.

Creating files in bpffs that reference BPF objects is called "pinning" (as in the phrase "a process can pin a BPF program or map"). Creating file objects for BPF objects makes sense not only for extending the life of local objects, but also for the usability of global objects: going back to the example of the global DDoS protection program, we want to be able to come and look at the statistics from time to time.

The BPF file system is usually mounted at /sys/fs/bpf, but it can also be mounted locally, for example like this:

$ mkdir bpf-mountpoint
$ sudo mount -t bpf none bpf-mountpoint

Names in the file system are created using the BPF_OBJ_PIN command of the BPF system call. To illustrate, let's take a program, compile it, load it, and pin it to bpffs. Our program does nothing useful; we only show the code so that you can reproduce the example:

$ cat test.c
__attribute__((section("xdp"), used))
int test(void *ctx)
{
        return 0;
}

char _license[] __attribute__((section("license"), used)) = "GPL";

Let's compile this program and create a local copy of the bpffs file system:

$ clang -target bpf -c test.c -o test.o
$ mkdir bpf-mountpoint
$ sudo mount -t bpf none bpf-mountpoint

Now let's load our program using the bpftool utility and look at the accompanying bpf(2) system calls (some irrelevant lines have been removed from the strace output):

$ sudo strace -e bpf bpftool prog load ./test.o bpf-mountpoint/test
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, prog_name="test", ...}, 120) = 3
bpf(BPF_OBJ_PIN, {pathname="bpf-mountpoint/test", bpf_fd=3}, 120) = 0

Here we loaded the program using BPF_PROG_LOAD, received file descriptor 3 from the kernel and, using the BPF_OBJ_PIN command, pinned this file descriptor as the file "bpf-mountpoint/test". After that the loader program bpftool finished, but our program remained in the kernel, even though we did not attach it to any network interface:

$ sudo bpftool prog | tail -3
783: xdp  name test  tag 5c8ba0cf164cb46c  gpl
        loaded_at 2020-05-05T13:27:08+0000  uid 0
        xlated 24B  jited 41B  memlock 4096B

We can delete the file object with the usual unlink(2), after which the corresponding program will be removed:

$ sudo rm ./bpf-mountpoint/test
$ sudo bpftool prog show id 783
Error: get by id (783): No such file or directory

Deleting objects

Speaking of deleting objects, it is necessary to clarify that after we have detached a program from its hook (event generator), no new event will trigger it; however, all current instances of the program will run to completion in the normal order.

Some types of BPF programs allow you to replace the program on the fly, i.e. they provide atomicity of the sequence replace = detach old program, attach new program. In this case all active instances of the old version of the program finish their work, new event handlers are created from the new program, and "atomicity" here means that not a single event is missed.

Attaching programs to event sources

In this article we will not describe attaching programs to event sources separately, since it makes sense to study this in the context of a specific program type. See the example below, in which we show how XDP programs are attached.

Manipulating Objects Using the bpf System Call

BPF programs

All BPF objects are created and managed from user space using the bpf system call, which has the following prototype:

#include <linux/bpf.h>

int bpf(int cmd, union bpf_attr *attr, unsigned int size);

Here the command cmd is one of the values of type enum bpf_cmd, attr is a pointer to the parameters of the specific command, and size is the size of the object behind the pointer, i.e. usually sizeof(*attr). In kernel 5.8 the bpf system call supports 34 different commands, and the definition of union bpf_attr takes up 200 lines. But we should not be intimidated by this, since we will get to know the commands and parameters over the course of several articles.

Let's start with the BPF_PROG_LOAD command, which creates BPF programs: it takes a set of BPF instructions and loads it into the kernel. At load time the verifier is run, then the JIT compiler, and after successful completion a file descriptor for the program is returned to the user. We saw what happens to it next in the previous section about the lifecycle of BPF objects.

We will now write a custom program that loads a simple BPF program, but first we need to decide what kind of program we want to load: we have to choose a type and, within the framework of that type, write a program that passes the verifier. However, so as not to complicate the process, here is a ready-made solution: we will take a program of type BPF_PROG_TYPE_XDP that returns the value XDP_PASS (pass all packets). In BPF assembler it looks very simple:

r0 = 2
exit

Now that we have decided what we are going to load, we can show how we are going to do it:

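#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

/* The kernel ABI carries user-space pointers as 64-bit integers. */
static inline __u64 ptr_to_u64(const void *ptr)
{
    return (__u64) (unsigned long) ptr;
}

int main(void)
{
    /* Our program in BPF machine code: r0 = XDP_PASS; exit */
    struct bpf_insn insns[] = {
        {
            .code = BPF_ALU64 | BPF_MOV | BPF_K,
            .dst_reg = BPF_REG_0,
            .imm = XDP_PASS
        },
        {
            .code = BPF_JMP | BPF_EXIT
        },
    };

    union bpf_attr attr = {
        .prog_type = BPF_PROG_TYPE_XDP,
        .insns     = ptr_to_u64(insns),
        .insn_cnt  = sizeof(insns)/sizeof(insns[0]),
        .license   = ptr_to_u64("GPL"),
    };

    strncpy(attr.prog_name, "woo", sizeof(attr.prog_name));

    /* On success the call returns a file descriptor for the program. */
    if (syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr)) < 0) {
        perror("BPF_PROG_LOAD");
        return 1;
    }

    /* Keep the descriptor open so that the program stays loaded. */
    for ( ;; )
        pause();
}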

The interesting part of the program begins with the definition of the array insns, our BPF program in machine code. Here each instruction of the BPF program is packed into a struct bpf_insn. The first element of insns corresponds to the instruction r0 = 2, the second to exit.
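
To make the encoding concrete, here is a small stand-alone sketch that builds the same two instructions without any kernel headers. The struct insn type, the enum names and the helper functions are local stand-ins invented for this illustration; only the numeric opcode values are taken from the kernel headers, and the register byte is laid out as on a little-endian machine.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the 8-byte instruction layout used by struct bpf_insn.
 * The two 4-bit register fields are kept in one byte to avoid bitfields. */
struct insn {
    uint8_t code;   /* opcode */
    uint8_t regs;   /* dst register in the low nibble, src in the high one */
    int16_t off;    /* signed jump or memory offset */
    int32_t imm;    /* signed immediate operand */
};

/* Opcode components, with the numeric values used by the kernel headers. */
enum { CLS_JMP = 0x05, CLS_ALU64 = 0x07, SRC_K = 0x00, OP_MOV = 0xb0, OP_EXIT = 0x90 };

static struct insn mov64_imm(unsigned dst, int32_t imm)
{
    struct insn i = { .code = CLS_ALU64 | OP_MOV | SRC_K,
                      .regs = (uint8_t)(dst & 0x0f), .imm = imm };
    return i;
}

static struct insn exit_insn(void)
{
    struct insn i = { .code = CLS_JMP | OP_EXIT };
    return i;
}

/* Builds "r0 = 2; exit" and returns the opcode byte of instruction n. */
static unsigned xdp_pass_opcode(unsigned n)
{
    const struct insn prog[] = { mov64_imm(0, 2), exit_insn() };

    assert(sizeof(prog) == 16);   /* two instructions, 8 bytes each */
    assert(n < 2);
    return prog[n].code;
}
```

The two opcode bytes come out as 0xb7 and 0x95, the same values that bpftool prints in parentheses when it dumps the loaded program later in this section.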

A digression. The kernel defines more convenient macros for writing machine code, and using the kernel header file tools/include/linux/filter.h we could write

struct bpf_insn insns[] = {
    BPF_MOV64_IMM(BPF_REG_0, XDP_PASS),
    BPF_EXIT_INSN()
};

But since writing BPF programs in native code is only needed for writing kernel tests and articles about BPF, the absence of these macros does not really complicate a developer's life.

After defining the BPF program, we move on to loading it into the kernel. Our minimalist set of parameters attr includes the program type, the array and number of instructions, the required license, and the name "woo", which we use to find our program on the system after loading. The program, as promised, is loaded into the system using the bpf system call.

At the end of the program we end up in an infinite loop that simulates a payload. Without it, the program would be destroyed by the kernel when the file descriptor returned to us by the bpf system call is closed, and we would not see it in the system.

Well, we are ready for testing. Let's build the program and run it under strace to check that everything works as it should:

$ clang -g -O2 simple-prog.c -o simple-prog

$ sudo strace ./simple-prog
execve("./simple-prog", ["./simple-prog"], 0x7ffc7b553480 /* 13 vars */) = 0
...
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, insn_cnt=2, insns=0x7ffe03c4ed50, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="woo", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = 3
pause(

Everything is fine: bpf(2) returned descriptor 3 to us and we went into an infinite loop with pause(). Let's try to find our program in the system. To do this we go to another terminal and use the bpftool utility:

# bpftool prog | grep -A3 woo
390: xdp  name woo  tag 3b185187f1855c4c  gpl
        loaded_at 2020-08-31T24:66:44+0000  uid 0
        xlated 16B  jited 40B  memlock 4096B
        pids simple-prog(10381)

We see that there is a loaded program woo on the system with global ID 390, and that the currently running process simple-prog holds an open file descriptor pointing to the program (and if simple-prog exits, woo will disappear). As expected, the program woo takes 16 bytes (two instructions) of binary code in the BPF architecture, but in its native form (x86_64) it is already 40 bytes. Let's look at our program in its original form:

# bpftool prog dump xlated id 390
   0: (b7) r0 = 2
   1: (95) exit

No surprises. Now let's look at the code generated by the JIT compiler:

# bpftool prog dump jited id 390
bpf_prog_3b185187f1855c4c_woo:
   0:   nopl   0x0(%rax,%rax,1)
   5:   push   %rbp
   6:   mov    %rsp,%rbp
   9:   sub    $0x0,%rsp
  10:   push   %rbx
  11:   push   %r13
  13:   push   %r14
  15:   push   %r15
  17:   pushq  $0x0
  19:   mov    $0x2,%eax
  1e:   pop    %rbx
  1f:   pop    %r15
  21:   pop    %r14
  23:   pop    %r13
  25:   pop    %rbx
  26:   leaveq
  27:   retq

Not very efficient for exit(2), but in fairness, our program is too simple, and for non-trivial programs the prologue and epilogue added by the JIT compiler are, of course, needed.

Maps

BPF programs can use structured memory areas that are accessible both to other BPF programs and to programs in user space. These objects are called maps, and in this section we show how to manipulate them using the bpf system call.

Let's say right away that the capabilities of maps are not limited to access to shared memory. There are special-purpose maps containing, for example, pointers to BPF programs or pointers to network interfaces, maps for working with perf events, and so on. We will not talk about them here, so as not to confuse the reader. Apart from this, we ignore synchronization issues, since this is not important for our examples. A complete list of available map types can be found in <linux/bpf.h>, and in this section we take as an example the historically first type, the hash table BPF_MAP_TYPE_HASH.

If you create a hash table in, say, C++, you would write unordered_map<int,long> woo, which in plain language means "I need a table woo of unlimited size, whose keys are of type int and whose values are of type long". To create a BPF hash table we need to do much the same thing, except that we have to specify the maximum size of the table, and instead of specifying the types of keys and values we need to specify their sizes in bytes. Maps are created with the BPF_MAP_CREATE command of the bpf system call. Let's look at a more or less minimal program that creates a map. After the previous program that loads BPF programs, this one should look simple to you:

$ cat simple-map.c
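#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

int main(void)
{
    /* A hash table with 4-byte keys and values and room for four elements */
    union bpf_attr attr = {
        .map_type = BPF_MAP_TYPE_HASH,
        .key_size = sizeof(int),
        .value_size = sizeof(int),
        .max_entries = 4,
    };
    strncpy(attr.map_name, "woo", sizeof(attr.map_name));

    /* On success the call returns a file descriptor for the map. */
    if (syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr)) < 0) {
        perror("BPF_MAP_CREATE");
        return 1;
    }

    /* Keep the descriptor open so that the map stays alive. */
    for ( ;; )
        pause();
}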

Here we define a set of parameters attr, in which we say "I need a hash table with keys and values of size sizeof(int), into which I can put at most four elements". When creating BPF maps you can specify other parameters; for example, just as in the program example, we set the name of the object to "woo".

Let's compile and run the program:

$ clang -g -O2 simple-map.c -o simple-map
$ sudo strace ./simple-map
execve("./simple-map", ["./simple-map"], 0x7ffd40a27070 /* 14 vars */) = 0
...
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=4, map_name="woo", ...}, 72) = 3
pause(

Here the bpf(2) system call returned descriptor number 3 to us, and then the program, as expected, waits for further instructions in the pause(2) system call.

Now let's send our program to the background or open another terminal and look at our object using the bpftool utility (we can tell our map apart from the others by its name):

$ sudo bpftool map
...
114: hash  name woo  flags 0x0
        key 4B  value 4B  max_entries 4  memlock 4096B
...

The number 114 is the global ID of our object. Any sufficiently privileged program on the system can use this ID to open an existing map using the BPF_MAP_GET_FD_BY_ID command of the bpf system call.

Now we can play with our hash table. Let's look at its contents:

$ sudo bpftool map dump id 114
Found 0 elements

Empty. Let's put the value hash[1] = 1 into it:

$ sudo bpftool map update id 114 key 1 0 0 0 value 1 0 0 0

Let's look at the table again:

$ sudo bpftool map dump id 114
key: 01 00 00 00  value: 01 00 00 00
Found 1 element

Hooray! We managed to add one element. Note that we have to work at the byte level to do this, since bpftool does not know what type the values in the hash table have. (This knowledge can be passed to it using BTF, but more on that another time.)

How exactly does bpftool read and add elements? Let's take a look under the hood:

$ sudo strace -e bpf bpftool map dump id 114
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=114, next_id=0, open_flags=0}, 120) = 3
bpf(BPF_MAP_GET_NEXT_KEY, {map_fd=3, key=NULL, next_key=0x55856ab65280}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x55856ab65280, value=0x55856ab652a0}, 120) = 0
key: 01 00 00 00  value: 01 00 00 00
bpf(BPF_MAP_GET_NEXT_KEY, {map_fd=3, key=0x55856ab65280, next_key=0x55856ab65280}, 120) = -1 ENOENT

First we opened the map by its global ID using the BPF_MAP_GET_FD_BY_ID command, and bpf(2) returned descriptor 3 to us. Then, using the BPF_MAP_GET_NEXT_KEY command, we found the first key in the table by passing NULL as the pointer to the "previous" key. Once we have a key, we can do BPF_MAP_LOOKUP_ELEM, which returns the value through the pointer value. The next step is to try to find the next element by passing a pointer to the current key, but our table contains only one element, and the BPF_MAP_GET_NEXT_KEY command returns ENOENT.
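
The same sequence can be sketched in C with raw system calls. This is only an illustration of the trace above, not bpftool's actual code: the function names are invented here, error handling is minimal, and the walk assumes a map with 4-byte keys and values, which is why the sizes are checked with BPF_OBJ_GET_INFO_BY_FD before any element is read.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

static long sys_bpf(int cmd, union bpf_attr *attr)
{
    return syscall(__NR_bpf, cmd, attr, sizeof(*attr));
}

static uint64_t ptr_to_u64(const void *ptr)
{
    return (uint64_t)(uintptr_t)ptr;
}

/* Walks a map with 4-byte keys and values the way the trace above does.
 * Returns the number of elements printed, or -1 on error. */
static int dump_u32_map(int map_fd)
{
    uint32_t key = 0, next = 0, value = 0;
    const uint32_t *prev = NULL;          /* NULL asks for the first key */
    union bpf_attr attr;
    int n = 0;

    assert(map_fd >= 0);
    for (;;) {
        memset(&attr, 0, sizeof(attr));
        attr.map_fd   = (uint32_t)map_fd;
        attr.key      = ptr_to_u64(prev);
        attr.next_key = ptr_to_u64(&next);
        if (sys_bpf(BPF_MAP_GET_NEXT_KEY, &attr) < 0)
            return errno == ENOENT ? n : -1;   /* ENOENT: no keys left */

        key = next;
        memset(&attr, 0, sizeof(attr));
        attr.map_fd = (uint32_t)map_fd;
        attr.key    = ptr_to_u64(&key);
        attr.value  = ptr_to_u64(&value);
        if (sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr) < 0)
            return -1;

        printf("key: %u  value: %u\n", key, value);
        prev = &key;
        n++;
    }
}

/* Opens the map with the given global ID, checks its geometry, dumps it. */
static int dump_map_by_id(uint32_t id)
{
    struct bpf_map_info info;
    union bpf_attr attr;
    int fd, n;

    memset(&attr, 0, sizeof(attr));
    attr.map_id = id;
    fd = (int)sys_bpf(BPF_MAP_GET_FD_BY_ID, &attr);
    if (fd < 0)
        return -1;

    memset(&info, 0, sizeof(info));
    memset(&attr, 0, sizeof(attr));
    attr.info.bpf_fd   = (uint32_t)fd;
    attr.info.info_len = sizeof(info);
    attr.info.info     = ptr_to_u64(&info);
    if (sys_bpf(BPF_OBJ_GET_INFO_BY_FD, &attr) < 0 ||
        info.key_size != 4 || info.value_size != 4) {
        close(fd);
        return -1;                 /* this sketch handles 4-byte maps only */
    }

    n = dump_u32_map(fd);
    close(fd);
    return n;
}
```

Run as root against the map created earlier, dump_map_by_id(114) would be expected to print the same pairs as bpftool map dump; without privileges, or for an ID that does not exist, it simply returns -1.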

Okay, let's change the value for key 1; say our business logic requires hash[1] = 2:

$ sudo strace -e bpf bpftool map update id 114 key 1 0 0 0 value 2 0 0 0
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=114, next_id=0, open_flags=0}, 120) = 3
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x55dcd72be260, value=0x55dcd72be280, flags=BPF_ANY}, 120) = 0

As expected, it is very simple: the BPF_MAP_GET_FD_BY_ID command opens our map by ID, and the BPF_MAP_UPDATE_ELEM command overwrites the element.

So, after creating a hash table from one program, we can read and write its contents from another. Note that if we were able to do this from the command line, then any other program on the system can do it too. In addition to the commands described above, the following are available for working with maps from user space:

  • BPF_MAP_LOOKUP_ELEM: find a value by key
  • BPF_MAP_UPDATE_ELEM: update/create a value
  • BPF_MAP_DELETE_ELEM: delete a key
  • BPF_MAP_GET_NEXT_KEY: find the next (or first) key
  • BPF_MAP_GET_NEXT_ID: lets you iterate over all existing maps, which is how bpftool map works
  • BPF_MAP_GET_FD_BY_ID: open an existing map by its global ID
  • BPF_MAP_LOOKUP_AND_DELETE_ELEM: atomically read the value of an element and remove the element from the map
  • BPF_MAP_FREEZE: make the map immutable from user space (this operation cannot be undone)
  • BPF_MAP_LOOKUP_BATCH, BPF_MAP_LOOKUP_AND_DELETE_BATCH, BPF_MAP_UPDATE_BATCH, BPF_MAP_DELETE_BATCH: bulk operations. For example, BPF_MAP_LOOKUP_AND_DELETE_BATCH is the only reliable way to read and reset all the values in a map

Not all of these commands work for all map types, but in general working with other map types from user space looks exactly the same as working with hash tables.

For the sake of order, let's finish our hash table experiments. Remember that we created a table that can hold up to four keys? Let's add a few more elements:

$ sudo bpftool map update id 114 key 2 0 0 0 value 1 0 0 0
$ sudo bpftool map update id 114 key 3 0 0 0 value 1 0 0 0
$ sudo bpftool map update id 114 key 4 0 0 0 value 1 0 0 0

So far so good:

$ sudo bpftool map dump id 114
key: 01 00 00 00  value: 01 00 00 00
key: 02 00 00 00  value: 01 00 00 00
key: 04 00 00 00  value: 01 00 00 00
key: 03 00 00 00  value: 01 00 00 00
Found 4 elements

Let's try to add one more:

$ sudo bpftool map update id 114 key 5 0 0 0 value 1 0 0 0
Error: update failed: Argument list too long

As expected, we did not succeed. Let's look at the error in more detail:

$ sudo strace -e bpf bpftool map update id 114 key 5 0 0 0 value 1 0 0 0
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=114, next_id=0, open_flags=0}, 120) = 3
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=3, info_len=80, info=0x7ffe6c626da0}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x56049ded5260, value=0x56049ded5280, flags=BPF_ANY}, 120) = -1 E2BIG (Argument list too long)
Error: update failed: Argument list too long
+++ exited with 255 +++

Everything is fine: as expected, the BPF_MAP_UPDATE_ELEM command tries to create a new, fifth, key, but fails with E2BIG.
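
The message "Argument list too long" is just the generic libc text for E2BIG, which says little about maps. A tool of your own could translate the most common BPF_MAP_UPDATE_ELEM failures into something more helpful. The sketch below is one hypothetical way to do it: update_error_hint is an invented name, and the meanings follow the bpf(2) manual page, including the BPF_NOEXIST and BPF_EXIST flags that do not appear in the traces above (bpftool passed BPF_ANY).

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Returns a hint for an errno value left behind by BPF_MAP_UPDATE_ELEM. */
static const char *update_error_hint(int err)
{
    assert(err > 0);
    switch (err) {
    case E2BIG:  return "the map already holds max_entries elements";
    case EEXIST: return "BPF_NOEXIST was set, but the key already exists";
    case ENOENT: return "BPF_EXIST was set, but the key does not exist";
    default:     return strerror(err);   /* fall back to the generic libc text */
    }
}
```

Called as update_error_hint(errno) right after a failed update, this would have explained the failure of the fifth insert in terms of max_entries.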

So, we can create and load BPF programs, as well as create and manage maps from user space. Now it is logical to look at how we can use maps from BPF programs themselves. We could talk about this in the language of hard-to-read programs in machine macro codes, but in fact the time has come to show how BPF programs are actually written and maintained: using libbpf.

(For readers who are unhappy about the lack of a low-level example: we will analyze in detail programs that use maps and helper functions created using libbpf and tell you what happens at the instruction level. For readers who are very unhappy, we added an example at the appropriate place in the article.)

Writing BPF Programs Using libbpf

Writing BPF programs in machine code can be interesting only the first time, and then satiety sets in. At this point you should turn your attention to llvm, which has a backend for generating code for the BPF architecture, and to the libbpf library, which lets you write the user side of BPF applications and load the code of BPF programs generated with llvm/clang.

In fact, as we will see in this and subsequent articles, libbpf does quite a lot of work, and without it (or similar tools such as iproute2, libbcc, libbpf-go, etc.) it is impossible to live. One of the killer features of the libbpf project is BPF CO-RE (Compile Once, Run Everywhere), a project that lets you write BPF programs that are portable from one kernel to another, with the ability to run on different APIs (for example, when a kernel structure changes from version to version). To be able to work with CO-RE, your kernel must be compiled with BTF support (we describe how to do this in the section Development Tools). You can check whether your kernel is built with BTF very simply, by the presence of the following file:

$ ls -lh /sys/kernel/btf/vmlinux
-r--r--r-- 1 root root 2.6M Jul 29 15:30 /sys/kernel/btf/vmlinux

This file stores information about all the data types used in the kernel and is used in all our examples that use libbpf. We will talk about CO-RE in detail in the next article, but in this one, just build yourself a kernel with CONFIG_DEBUG_INFO_BTF.

The libbpf library lives right in the tools/lib/bpf directory of the kernel, and its development is carried out on the kernel's BPF mailing list. However, a separate repository is maintained for the needs of applications living outside the kernel, https://github.com/libbpf/libbpf, into which the kernel library is mirrored for read access more or less as is.

In this section we will look at how you can create a project that uses libbpf, write a few (more or less meaningless) test programs, and analyze in detail how it all works. This will let us explain more easily in the following sections exactly how BPF programs interact with maps, kernel helpers, BTF, and so on.

Projects using libbpf usually add the GitHub repository as a git submodule, and we will do the same:

$ mkdir /tmp/libbpf-example
$ cd /tmp/libbpf-example/
$ git init-db
Initialized empty Git repository in /tmp/libbpf-example/.git/
$ git submodule add https://github.com/libbpf/libbpf.git
Cloning into '/tmp/libbpf-example/libbpf'...
remote: Enumerating objects: 200, done.
remote: Counting objects: 100% (200/200), done.
remote: Compressing objects: 100% (103/103), done.
remote: Total 3354 (delta 101), reused 118 (delta 79), pack-reused 3154
Receiving objects: 100% (3354/3354), 2.05 MiB | 10.22 MiB/s, done.
Resolving deltas: 100% (2176/2176), done.

Building libbpf is very simple:

$ cd libbpf/src
$ mkdir build
$ OBJDIR=build DESTDIR=root make -s install
$ find root
root
root/usr
root/usr/include
root/usr/include/bpf
root/usr/include/bpf/bpf_tracing.h
root/usr/include/bpf/xsk.h
root/usr/include/bpf/libbpf_common.h
root/usr/include/bpf/bpf_endian.h
root/usr/include/bpf/bpf_helpers.h
root/usr/include/bpf/btf.h
root/usr/include/bpf/bpf_helper_defs.h
root/usr/include/bpf/bpf.h
root/usr/include/bpf/libbpf_util.h
root/usr/include/bpf/libbpf.h
root/usr/include/bpf/bpf_core_read.h
root/usr/lib64
root/usr/lib64/libbpf.so.0.1.0
root/usr/lib64/libbpf.so.0
root/usr/lib64/libbpf.a
root/usr/lib64/libbpf.so
root/usr/lib64/pkgconfig
root/usr/lib64/pkgconfig/libbpf.pc

Our plan for the rest of this section is as follows: we will write a BPF program of type BPF_PROG_TYPE_XDP, the same as in the previous example, but in C, compile it using clang, and write a helper program that loads it into the kernel. In the following sections we will extend the capabilities of both the BPF program and the helper program.

Example: Creating a Complete Application Using libbpf

To begin with, we take the file /sys/kernel/btf/vmlinux mentioned above and create its equivalent in the form of a header file:

$ bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

This file stores all the data structures available in our kernel; for example, this is how the IPv4 header is defined in the kernel:

$ grep -A 12 'struct iphdr {' vmlinux.h
struct iphdr {
    __u8 ihl: 4;
    __u8 version: 4;
    __u8 tos;
    __be16 tot_len;
    __be16 id;
    __be16 frag_off;
    __u8 ttl;
    __u8 protocol;
    __sum16 check;
    __be32 saddr;
    __be32 daddr;
};

Now let's write our BPF program in C:

$ cat xdp-simple.bpf.c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("xdp/simple")
int simple(void *ctx)
{
        return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";

Although our program turned out to be very simple, we still need to pay attention to a number of details. First, the first header file we include is vmlinux.h, which we just generated with bpftool btf dump: now we do not need to install the kernel-headers package to find out what kernel structures look like. The next header file comes from the libbpf library. For now we only need it to define the SEC macro, which places a symbol into the appropriate section of the ELF object file. Our program lives in the section xdp/simple, where the part before the slash specifies the BPF program type; this is a convention used in libbpf, which, based on the section name, substitutes the right type when calling bpf(2). The BPF program itself is written in C, is very simple, and consists of the single line return XDP_PASS. Finally, a separate section "license" contains the name of the license.
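
What "placing a symbol into a section" means can be tried out in an ordinary user-space program. The sketch below assumes GCC or clang targeting ELF, with a linker that defines __start_ and __stop_ symbols for sections whose names are valid C identifiers (GNU ld and lld do). DEMO_SEC is a made-up stand-in that follows the same attribute-based idea as SEC; it is not copied from libbpf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the SEC macro; DEMO_SEC is a made-up name. */
#define DEMO_SEC(name) __attribute__((section(name), used))

char LICENSE[] DEMO_SEC("license") = "GPL";

/* Provided by the linker for sections whose names are valid C identifiers. */
extern char __start_license[];
extern char __stop_license[];

/* Returns 1 if the object at p lies inside the "license" section. */
static int in_license_section(const void *p)
{
    uintptr_t start = (uintptr_t)__start_license;
    uintptr_t stop  = (uintptr_t)__stop_license;

    assert(start <= stop);
    return (uintptr_t)p >= start && (uintptr_t)p < stop;
}
```

A loader such as libbpf does the reverse: it opens the ELF file, finds sections by name, and decides from the name what each symbol is for.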

We can compile our program using llvm/clang, version >= 10.0.0 or, better, newer (see the section Development Tools):

$ clang --version
clang version 11.0.0 (https://github.com/llvm/llvm-project.git afc287e0abec710398465ee1f86237513f2b5091)
...

$ clang -O2 -g -c -target bpf -I libbpf/src/root/usr/include xdp-simple.bpf.c -o xdp-simple.bpf.o

Among the interesting features: we specify the target architecture -target bpf and the path to the libbpf headers that we just installed. Also, do not forget about -O2; without this option you may be in for surprises later. Let's look at our code: did we manage to write the program we wanted?

$ llvm-objdump --section=xdp/simple --no-show-raw-insn -D xdp-simple.bpf.o

xdp-simple.bpf.o:       file format elf64-bpf

Disassembly of section xdp/simple:

0000000000000000 <simple>:
       0:       r0 = 2
       1:       exit

Yes, it worked! Now we have a binary file with the program, and we want to create an application that will load it into the kernel. For this purpose the libbpf library offers us two options: use the lower-level API or the higher-level API. We will go the second way, since we want to learn how to write, load and attach BPF programs with minimal effort for their subsequent study.

First, we need to generate the "skeleton" of our program from its binary using the same bpftool utility, the Swiss Army knife of the BPF world (which can be taken literally, since Daniel Borkmann, one of the creators and maintainers of BPF, is Swiss):

$ bpftool gen skeleton xdp-simple.bpf.o > xdp-simple.skel.h

The file xdp-simple.skel.h contains the binary code of our program and functions for managing it: loading, attaching, and deleting our object. In our simple case this looks like overkill, but it also works when the object file contains many BPF programs and maps, and to load such a giant ELF we only need to generate the skeleton and call one or two functions from the custom application that we are writing. Let's move on to that now.

Strictly speaking, our loader program is trivial:

#include <err.h>
#include <unistd.h>
#include "xdp-simple.skel.h"

int main(int argc, char **argv)
{
    struct xdp_simple_bpf *obj;

    obj = xdp_simple_bpf__open_and_load();
    if (!obj)
        err(1, "failed to open and/or load BPF object");

    pause();

    xdp_simple_bpf__destroy(obj);
}

Here struct xdp_simple_bpf is defined in the file xdp-simple.skel.h and describes our object file:

struct xdp_simple_bpf {
    struct bpf_object_skeleton *skeleton;
    struct bpf_object *obj;
    struct {
        struct bpf_program *simple;
    } progs;
    struct {
        struct bpf_link *simple;
    } links;
};

We can see traces of the low-level API here: the structures struct bpf_program *simple and struct bpf_link *simple. The first structure describes our program, written in the section xdp/simple, and the second describes how the program is attached to the event source.

The function xdp_simple_bpf__open_and_load opens the ELF object, parses it, creates all the structures and substructures (besides the program, the ELF also contains other sections: data, read-only data, debugging information, license, etc.), and then loads it into the kernel using the bpf system call, which we can verify by compiling and running the program:

$ clang -O2 -I ./libbpf/src/root/usr/include/ xdp-simple.c -o xdp-simple ./libbpf/src/root/usr/lib64/libbpf.a -lelf -lz

$ sudo strace -e bpf ./xdp-simple
...
bpf(BPF_BTF_LOAD, 0x7ffdb8fd9670, 120)  = 3
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, insn_cnt=2, insns=0xdfd580, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 8, 0), prog_flags=0, prog_name="simple", prog_ifindex=0, expected_attach_type=0x25 /* BPF_??? */, ...}, 120) = 4

Let's now look at our program using bpftool. Let's find its ID:

# bpftool p | grep -A4 simple
463: xdp  name simple  tag 3b185187f1855c4c  gpl
        loaded_at 2020-08-01T01:59:49+0000  uid 0
        xlated 16B  jited 40B  memlock 4096B
        btf_id 185
        pids xdp-simple(16498)

and dump it (we use a shortened form of the command bpftool prog dump xlated):

# bpftool p d x id 463
int simple(void *ctx):
; return XDP_PASS;
   0: (b7) r0 = 2
   1: (95) exit

Something new! The program printed chunks of our C source file. This was done by the libbpf library, which found the debug section in the binary, compiled it into a BTF object, loaded it into the kernel using BPF_BTF_LOAD, and then specified the resulting file descriptor when loading the program with the BPF_PROG_LOAD command.

Kernel Helpers

BPF programs can run "external" functions, kernel helpers. These helper functions allow BPF programs to access kernel structures, manage maps, and also communicate with the "real world": create perf events, control hardware (for example, redirect packets), and so on.

Example: bpf_get_smp_processor_id

Within the framework of the "learning by example" paradigm, let's consider one of the helper functions, bpf_get_smp_processor_id(), defined in the file kernel/bpf/helpers.c. It returns the number of the processor on which the BPF program that called it is running. But we are not so much interested in its semantics as in the fact that its implementation takes one line:

BPF_CALL_0(bpf_get_smp_processor_id)
{
    return smp_processor_id();
}

The definitions of BPF helper functions are similar to the definitions of Linux system calls. Here, for example, a function is defined that takes no arguments. (A function that takes, say, three arguments is defined using the macro BPF_CALL_3. The maximum number of arguments is five.) However, this is only the first part of the definition. The second part is to define a structure of type struct bpf_func_proto, which contains a description of the helper function that the verifier understands:

const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
    .func     = bpf_get_smp_processor_id,
    .gpl_only = false,
    .ret_type = RET_INTEGER,
};

Registering Helper Functions

For BPF programs of a particular type to be able to use this function, they must register it; for example, for the type BPF_PROG_TYPE_XDP the kernel defines a function xdp_func_proto, which determines from the helper function ID whether XDP supports this function or not. Our function is supported:

static const struct bpf_func_proto *
xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
{
    switch (func_id) {
    ...
    case BPF_FUNC_get_smp_processor_id:
        return &bpf_get_smp_processor_id_proto;
    ...
    }
}

New BPF program types are "defined" in the file include/linux/bpf_types.h using the macro BPF_PROG_TYPE. Defined is in quotes because it is a logical definition, and in C-language terms the definition of a whole set of concrete structures occurs in other places. In particular, in the file kernel/bpf/verifier.c all the definitions from the file bpf_types.h are used to create an array of structures bpf_verifier_ops[]:

static const struct bpf_verifier_ops *const bpf_verifier_ops[] = {
#define BPF_PROG_TYPE(_id, _name, prog_ctx_type, kern_ctx_type) \
    [_id] = & _name ## _verifier_ops,
#include <linux/bpf_types.h>
#undef BPF_PROG_TYPE
};

That is, for each type of BPF program, a pointer to a data structure of type struct bpf_verifier_ops is defined, which is initialized with the value _name ## _verifier_ops, i.e., xdp_verifier_ops for xdp. The structure xdp_verifier_ops is defined in the file net/core/filter.c as follows:

const struct bpf_verifier_ops xdp_verifier_ops = {
    .get_func_proto     = xdp_func_proto,
    .is_valid_access    = xdp_is_valid_access,
    .convert_ctx_access = xdp_convert_ctx_access,
    .gen_prologue       = bpf_noop_prologue,
};

Here we see our familiar function xdp_func_proto, which the verifier will run every time it encounters a call to some function inside a BPF program, see verifier.c.

Let's look at how a hypothetical BPF program uses the function bpf_get_smp_processor_id. To do this, we rewrite the program from our previous section as follows:
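
The whole registration scheme (one list of program types, a table of per-type operations built from it, and a per-type function that says which helpers are allowed) fits into a small stand-alone model. Everything in the sketch below is a toy: the type names, the second program type "demo" and the lookup function are invented for illustration, and where the kernel re-includes bpf_types.h the sketch uses an ordinary list macro.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; all names here are illustrative. */
struct func_proto { const char *name; };
struct verifier_ops {
    const struct func_proto *(*get_func_proto)(int func_id);
};

enum prog_type { TYPE_UNSPEC, TYPE_DEMO, TYPE_XDP, TYPE_MAX };
enum { FUNC_get_smp_processor_id = 8 };

static const struct func_proto smp_id_proto = { "get_smp_processor_id" };

static const struct func_proto *xdp_func_proto(int func_id)
{
    switch (func_id) {
    case FUNC_get_smp_processor_id:
        return &smp_id_proto;
    default:
        return NULL;            /* helper not available to this type */
    }
}

static const struct func_proto *demo_func_proto(int func_id)
{
    (void)func_id;
    return NULL;                /* the toy "demo" type allows no helpers */
}

static const struct verifier_ops xdp_verifier_ops  = { xdp_func_proto };
static const struct verifier_ops demo_verifier_ops = { demo_func_proto };

/* One list of types, expanded with whatever PROG_TYPE means at that point. */
#define PROG_TYPES \
    PROG_TYPE(TYPE_DEMO, demo) \
    PROG_TYPE(TYPE_XDP, xdp)

static const struct verifier_ops *const ops_table[TYPE_MAX] = {
#define PROG_TYPE(id, name) [id] = &name##_verifier_ops,
    PROG_TYPES
#undef PROG_TYPE
};

/* What a verifier-like caller would do when it meets "call func_id". */
static const struct func_proto *lookup_helper(enum prog_type t, int func_id)
{
    const struct verifier_ops *ops;

    assert(t < TYPE_MAX);
    ops = ops_table[t];
    return ops ? ops->get_func_proto(func_id) : NULL;
}
```

In this model lookup_helper(TYPE_XDP, 8) finds the prototype, while any other ID, or any other program type, yields NULL, which is the point at which a real verifier would reject the program.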

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("xdp/simple")
int simple(void *ctx)
{
    if (bpf_get_smp_processor_id() != 0)
        return XDP_DROP;
    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";

The symbol bpf_get_smp_processor_id is defined in <bpf/bpf_helper_defs.h> of the libbpf library as

static __u32 (*bpf_get_smp_processor_id)(void) = (void *) 8;

that is, bpf_get_smp_processor_id is a function pointer whose value is 8, where 8 is the value of BPF_FUNC_get_smp_processor_id of type enum bpf_func_id, which is defined for us in the file vmlinux.h (the file bpf_helper_defs.h in the kernel is generated by a script, so the "magic" numbers are OK). This function takes no arguments and returns a value of type __u32. When we call it in our program, clang generates a BPF_CALL instruction of "the right kind". Let's compile the program and look at the section xdp/simple:

$ clang -O2 -g -c -target bpf -I libbpf/src/root/usr/include xdp-simple.bpf.c -o xdp-simple.bpf.o
$ llvm-objdump -D --section=xdp/simple xdp-simple.bpf.o

xdp-simple.bpf.o:       file format elf64-bpf

Disassembly of section xdp/simple:

0000000000000000 <simple>:
       0:       85 00 00 00 08 00 00 00 call 8
       1:       bf 01 00 00 00 00 00 00 r1 = r0
       2:       67 01 00 00 20 00 00 00 r1 <<= 32
       3:       77 01 00 00 20 00 00 00 r1 >>= 32
       4:       b7 00 00 00 02 00 00 00 r0 = 2
       5:       15 01 01 00 00 00 00 00 if r1 == 0 goto +1 <LBB0_2>
       6:       b7 00 00 00 01 00 00 00 r0 = 1

0000000000000038 <LBB0_2>:
       7:       95 00 00 00 00 00 00 00 exit

Hauv thawj kab peb pom cov lus qhia call, parameter IMM uas yog sib npaug rau 8, thiab SRC_REG - xoom. Raws li ABI daim ntawv cog lus siv los ntawm tus neeg txheeb xyuas, qhov no yog hu rau tus neeg pab ua haujlwm thib yim. Thaum nws yog launched, lub logic yog yooj yim. Rov qab tus nqi los ntawm kev sau npe r0 luam rau r1 thiab ntawm kab 2,3 nws hloov dua siab tshiab rau hom u32 - lub sab sauv 32 khoom raug tshem tawm. Ntawm kab 4,5,6,7 peb rov 2 (XDP_PASS) los yog 1 (XDP_DROP) nyob ntawm seb tus pab cuam ua haujlwm los ntawm kab 0 xa rov qab tus nqi xoom lossis tsis yog xoom.
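
The raw bytes in the llvm-objdump listing above can be decoded by hand, and a few lines of C make the field layout explicit. The sketch below is a stand-alone illustration: struct decoded and decode are names invented here, and the byte order assumed is the little-endian one that the listing was produced in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct decoded {
    uint8_t code, dst, src;   /* opcode, destination and source registers */
    int16_t off;              /* signed offset */
    int32_t imm;              /* signed immediate */
};

/* Two raw instructions copied from the llvm-objdump listing above. */
static const uint8_t insn_call[8] = { 0x85, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00 };
static const uint8_t insn_jeq[8]  = { 0x15, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00 };

/* Splits one 8-byte little-endian BPF instruction into its fields. */
static struct decoded decode(const uint8_t *b)
{
    struct decoded d;

    assert(b != NULL);
    d.code = b[0];
    d.dst  = b[1] & 0x0f;
    d.src  = b[1] >> 4;
    d.off  = (int16_t)(uint16_t)(b[2] | (b[3] << 8));
    d.imm  = (int32_t)((uint32_t)b[4] | (uint32_t)b[5] << 8 |
                       (uint32_t)b[6] << 16 | (uint32_t)b[7] << 24);
    return d;
}
```

For insn_call this gives opcode 0x85 with src 0 and imm 8, i.e. the call to helper number eight; for insn_jeq it gives opcode 0x15 with dst 1 and off 1, i.e. "if r1 == 0 goto +1".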

Let's check ourselves: load the program and look at the output of bpftool prog dump xlated:

$ bpftool gen skeleton xdp-simple.bpf.o > xdp-simple.skel.h
$ clang -O2 -g -I ./libbpf/src/root/usr/include/ -o xdp-simple xdp-simple.c ./libbpf/src/root/usr/lib64/libbpf.a -lelf -lz
$ sudo ./xdp-simple &
[2] 10914

$ sudo bpftool p | grep simple
523: xdp  name simple  tag 44c38a10c657e1b0  gpl
        pids xdp-simple(10915)

$ sudo bpftool p d x id 523
int simple(void *ctx):
; if (bpf_get_smp_processor_id() != 0)
   0: (85) call bpf_get_smp_processor_id#114128
   1: (bf) r1 = r0
   2: (67) r1 <<= 32
   3: (77) r1 >>= 32
   4: (b7) r0 = 2
; }
   5: (15) if r1 == 0x0 goto pc+1
   6: (b7) r0 = 1
   7: (95) exit

OK, the verifier found the correct kernel helper.

Example: passing arguments and finally running the program!

All helper functions have the same run-time prototype

u64 fn(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)

Parameters to helper functions are passed in registers r1-r5, and the value is returned in register r0. There are no functions that take more than five arguments, and support for them is not expected to be added in the future.
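To make the convention concrete, here is a user-space toy model rather than kernel code. The names toy_call, toy_helpers and toy_add are invented for this illustration: toy_call dispatches on the immediate of a call instruction, passes r1-r5 to the selected helper and stores the result in r0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;

/* Every helper has the same shape: five u64 arguments, one u64 result. */
typedef u64 (*helper_fn)(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);

/* Invented helper: uses only its first two arguments. */
static u64 toy_add(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
{
    (void)r3; (void)r4; (void)r5;
    return r1 + r2;
}

/* Invented helper table, indexed by the imm field of the call instruction. */
static const helper_fn toy_helpers[] = { NULL, toy_add };

/* Model of "call imm": arguments come from r1-r5, the result lands in r0. */
static void toy_call(u64 regs[11], int32_t imm)
{
    size_t n = sizeof(toy_helpers) / sizeof(toy_helpers[0]);

    assert(imm > 0 && (size_t)imm < n && toy_helpers[imm]);
    regs[0] = toy_helpers[imm](regs[1], regs[2], regs[3], regs[4], regs[5]);
}
```

With regs[1] = 40 and regs[2] = 2, toy_call(regs, 1) leaves 42 in regs[0]. The model leaves r1-r5 untouched for simplicity; in real BPF they are treated as clobbered by the call, while r6-r9 are preserved.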

Let's look at a new kernel helper and at how BPF passes parameters. Let's rewrite xdp-simple.bpf.c as follows (the remaining lines are unchanged):

SEC("xdp/simple")
int simple(void *ctx)
{
    bpf_printk("running on CPU%u\n", bpf_get_smp_processor_id());
    return XDP_PASS;
}

Our program prints the number of the CPU on which it is running. Let's compile it and look at the code:

$ llvm-objdump -D --section=xdp/simple --no-show-raw-insn xdp-simple.bpf.o

0000000000000000 <simple>:
       0:       r1 = 10
       1:       *(u16 *)(r10 - 8) = r1
       2:       r1 = 8441246879787806319 ll
       4:       *(u64 *)(r10 - 16) = r1
       5:       r1 = 2334956330918245746 ll
       7:       *(u64 *)(r10 - 24) = r1
       8:       call 8
       9:       r1 = r10
      10:       r1 += -24
      11:       r2 = 18
      12:       r3 = r0
      13:       call 6
      14:       r0 = 2
      15:       exit

On lines 0-7 we write the string running on CPU%u\n onto the stack, and then on line 8 we call the familiar bpf_get_smp_processor_id. On lines 9-12 we prepare the arguments of bpf_printk: registers r1, r2 and r3. Why are there three of them and not two? Because bpf_printk is a macro wrapper around the real helper bpf_trace_printk, which must also be given the size of the format string.
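The large decimal constants on lines 2 and 5 are simply the format string cut into 8-byte little-endian chunks. A small sketch, assuming a hypothetical helper pack_le that is not part of any BPF tooling, reproduces them:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read n (at most 8) bytes starting at p as a little-endian integer. */
static uint64_t pack_le(const char *p, size_t n)
{
    uint64_t v = 0;

    assert(p && n <= 8);
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)(unsigned char)p[i] << (8 * i);
    return v;
}
```

For fmt[] = "running on CPU%u\n", which is 18 bytes including the terminating NUL (hence r2 = 18), pack_le(fmt, 8) is 2334956330918245746, pack_le(fmt + 8, 8) is 8441246879787806319 and pack_le(fmt + 16, 2) is 10. These are the three values stored at r10 - 24, r10 - 16 and r10 - 8.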

Let's add a couple of lines to xdp-simple.c so that our program attaches to the lo interface and really starts running!

$ cat xdp-simple.c
#include <linux/if_link.h>
#include <err.h>
#include <unistd.h>
#include "xdp-simple.skel.h"

int main(int argc, char **argv)
{
    __u32 flags = XDP_FLAGS_SKB_MODE;
    struct xdp_simple_bpf *obj;

    obj = xdp_simple_bpf__open_and_load();
    if (!obj)
        err(1, "failed to open and/or load BPF object\n");

    bpf_set_link_xdp_fd(1, -1, flags);
    bpf_set_link_xdp_fd(1, bpf_program__fd(obj->progs.simple), flags);

cleanup:
    xdp_simple_bpf__destroy(obj);
}

Here we use the function bpf_set_link_xdp_fd, which attaches XDP-type BPF programs to network interfaces. We hardcoded the interface number of lo, which is always 1. We call the function twice so that the first call detaches an old program if one was attached. Note that we no longer need a pause call or an infinite loop: our loader program will exit, but the BPF program will not be destroyed, because it is attached to an event source. After a successful load and attach, the program will run for every network packet arriving on lo.

Let's load the program and look at the lo interface:

$ sudo ./xdp-simple
$ sudo bpftool p | grep simple
669: xdp  name simple  tag 4fca62e77ccb43d6  gpl
$ ip l show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    prog/xdp id 669

The program we loaded has ID 669, and we see the same ID on the lo interface. Let's send a couple of packets to 127.0.0.1 (request + reply):

$ ping -c1 localhost

and now let's look at the contents of the virtual debug file /sys/kernel/debug/tracing/trace_pipe, into which bpf_printk writes its messages:

# cat /sys/kernel/debug/tracing/trace_pipe
ping-13937 [000] d.s1 442015.377014: bpf_trace_printk: running on CPU0
ping-13937 [000] d.s1 442015.377027: bpf_trace_printk: running on CPU0

Two packets were seen on lo and processed on CPU0 - our first full-fledged, pointless BPF program works!

It is worth noting that it is no accident that bpf_printk writes to a debug file: this is not the best helper to use in production, but our goal was to show something simple.

Accessing maps from BPF programs

Example: using a map from a BPF program

In the previous sections we learned how to create and use maps from user space, and now let's look at the kernel side. Let's start, as usual, with an example. Let's rewrite our program xdp-simple.bpf.c as follows:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __uint(max_entries, 8);
    __type(key, u32);
    __type(value, u64);
} woo SEC(".maps");

SEC("xdp/simple")
int simple(void *ctx)
{
    u32 key = bpf_get_smp_processor_id();
    u32 *val;

    val = bpf_map_lookup_elem(&woo, &key);
    if (!val)
        return XDP_ABORTED;

    *val += 1;

    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";

At the beginning of the program we added the definition of the map woo: this is an 8-element array that stores values of type u64 (in C we would define such an array as u64 woo[8]). In the program "xdp/simple" we read the current processor number into the variable key and then, using the helper function bpf_map_lookup_elem, obtain a pointer to the corresponding entry in the array, which we increment by one. In plain language: we collect statistics on which CPU processed incoming packets. Let's try running the program:

$ clang -O2 -g -c -target bpf -I libbpf/src/root/usr/include xdp-simple.bpf.c -o xdp-simple.bpf.o
$ bpftool gen skeleton xdp-simple.bpf.o > xdp-simple.skel.h
$ clang -O2 -g -I ./libbpf/src/root/usr/include/ -o xdp-simple xdp-simple.c ./libbpf/src/root/usr/lib64/libbpf.a -lelf -lz
$ sudo ./xdp-simple

Let's check that it is attached to lo and send some packets:

$ ip l show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    prog/xdp id 108

$ for s in `seq 234`; do sudo ping -f -c 100 127.0.0.1 >/dev/null 2>&1; done

Now let's look at the contents of the array:

$ sudo bpftool map dump name woo
[
    { "key": 0, "value": 0 },
    { "key": 1, "value": 400 },
    { "key": 2, "value": 0 },
    { "key": 3, "value": 0 },
    { "key": 4, "value": 0 },
    { "key": 5, "value": 0 },
    { "key": 6, "value": 0 },
    { "key": 7, "value": 46400 }
]

Almost all of the processing was done on CPU7. That is not important for us; the main thing is that the program works and that we understand how to access maps from BPF programs - with the help of the bpf_map_* helpers.

The mystical pointer

So, we can access a map from a BPF program using calls like

val = bpf_map_lookup_elem(&woo, &key);

where the helper function looks like

void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

but we are passing the pointer &woo to an unnamed structure struct { ... }...

If we look at the assembly of the program, we see that the value &woo is not actually defined (line 4):

$ llvm-objdump -D --section xdp/simple xdp-simple.bpf.o

xdp-simple.bpf.o:       file format elf64-bpf

Disassembly of section xdp/simple:

0000000000000000 <simple>:
       0:       85 00 00 00 08 00 00 00 call 8
       1:       63 0a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r0
       2:       bf a2 00 00 00 00 00 00 r2 = r10
       3:       07 02 00 00 fc ff ff ff r2 += -4
       4:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
       6:       85 00 00 00 01 00 00 00 call 1
...

and is recorded in the relocations:

$ llvm-readelf -r xdp-simple.bpf.o | head -4

Relocation section '.relxdp/simple' at offset 0xe18 contains 1 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name
0000000000000020  0000002700000001 R_BPF_64_64            0000000000000000 woo
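The Offset and Info columns of this entry can be decoded with the standard macros from <elf.h>. The sketch below uses a hypothetical struct relo_info and a hypothetical function decode_relo, invented here for illustration; dividing the offset by 8 assumes that every BPF instruction slot is 8 bytes wide.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>

/* Hypothetical summary of one relocation entry printed by llvm-readelf -r. */
struct relo_info {
    uint64_t insn_idx;  /* index of the instruction slot to patch */
    uint32_t sym_idx;   /* index into the ELF symbol table */
    uint32_t type;      /* relocation type */
};

static struct relo_info decode_relo(uint64_t offset, uint64_t info)
{
    struct relo_info r;

    assert(offset % 8 == 0);                     /* slots are 8 bytes wide */
    r.insn_idx = offset / 8;
    r.sym_idx  = (uint32_t)ELF64_R_SYM(info);    /* upper 32 bits of Info */
    r.type     = (uint32_t)ELF64_R_TYPE(info);   /* lower 32 bits of Info */
    return r;
}
```

For Offset 0x20 and Info 0x0000002700000001 this gives instruction index 4, symbol index 0x27 and type 1, which llvm-readelf prints as R_BPF_64_64. In other words, the entry asks the loader to patch the r1 = 0 ll instruction on line 4 with the address of woo.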

But if we look at the already loaded program, we see a pointer to the correct map (line 4):

$ sudo bpftool prog dump x name simple
int simple(void *ctx):
   0: (85) call bpf_get_smp_processor_id#114128
   1: (63) *(u32 *)(r10 -4) = r0
   2: (bf) r2 = r10
   3: (07) r2 += -4
   4: (18) r1 = map[id:64]
...

Thus we can conclude that, at the moment our loader program was launched, the reference to &woo was replaced with something by the libbpf library. First, let's look at the strace output:

$ sudo strace -e bpf ./xdp-simple
...
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=8, map_name="woo", ...}, 120) = 4
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, prog_name="simple", ...}, 120) = 5

We see that libbpf created the map woo and then loaded our program simple. Let's look more closely at how we load the program:

  • we call xdp_simple_bpf__open_and_load from the file xdp-simple.skel.h
  • which calls xdp_simple_bpf__load from the file xdp-simple.skel.h
  • which calls bpf_object__load_skeleton from the file libbpf/src/libbpf.c
  • which calls bpf_object__load_xattr from libbpf/src/libbpf.c

The last function, among other things, calls bpf_object__create_maps, which creates or opens existing maps and turns them into file descriptors. (This is where we see BPF_MAP_CREATE in the strace output.) Next, the function bpf_object__relocate is called, and it is the one that interests us, since we remember seeing woo in the relocation table. Exploring it, we eventually find ourselves in the function bpf_program__relocate, which deals with map relocations:

case RELO_LD64:
    insn[0].src_reg = BPF_PSEUDO_MAP_FD;
    insn[0].imm = obj->maps[relo->map_idx].fd;
    break;

So we take our instruction

18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll

and replace the source register in it with BPF_PSEUDO_MAP_FD and the first IMM with the file descriptor of our map. If the descriptor were equal to, say, 0xdeadbeef, we would end up with the instruction

18 11 00 00 ef be ad de 00 00 00 00 00 00 00 00 r1 = 0 ll

This is how information about maps is passed to a particular loaded BPF program. The map itself can either be created using BPF_MAP_CREATE or opened by ID using BPF_MAP_GET_FD_BY_ID.
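The same patching step can be reproduced on the raw bytes. This is only a sketch: patch_map_fd is a hypothetical function, PSEUDO_MAP_FD is a local copy of the value of BPF_PSEUDO_MAP_FD from linux/bpf.h, and a little-endian instruction encoding is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define PSEUDO_MAP_FD 1  /* local copy of BPF_PSEUDO_MAP_FD from linux/bpf.h */

/* Patch a raw 16-byte "r = imm64" instruction so that it refers to a map fd. */
static void patch_map_fd(uint8_t insn[16], uint32_t fd)
{
    assert(insn[0] == 0x18);  /* BPF_LD | BPF_DW | BPF_IMM */

    /* src_reg lives in the high nibble of byte 1 */
    insn[1] = (uint8_t)((insn[1] & 0x0f) | (PSEUDO_MAP_FD << 4));

    /* imm of the first slot: bytes 4-7, little-endian */
    for (int i = 0; i < 4; i++)
        insn[4 + i] = (uint8_t)(fd >> (8 * i));
}
```

Starting from the bytes 18 01 00 ... 00 and patching in 0xdeadbeef, byte 1 becomes 0x11 and bytes 4-7 become ef be ad de, while the second 8-byte slot stays zero.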

To sum up, when libbpf is used the algorithm is as follows:

  • during compilation, entries are created in the relocation table for references to maps
  • libbpf opens the ELF object file, finds all the maps in use and creates file descriptors for them
  • the file descriptors are loaded into the kernel as part of the LD64 instruction

As you can guess, there is more to come, and we will have to look into the kernel. Fortunately, we have a clue: we wrote the value BPF_PSEUDO_MAP_FD into the source register, so we can grep for it. This leads us to the holy of holies, kernel/bpf/verifier.c, where a function with a telling name replaces the file descriptor with the address of a structure of type struct bpf_map:

static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env) {
    ...

    f = fdget(insn[0].imm);
    map = __bpf_map_get(f);
    if (insn->src_reg == BPF_PSEUDO_MAP_FD) {
        addr = (unsigned long)map;
    }
    insn[0].imm = (u32)addr;
    insn[1].imm = addr >> 32;

(The full code is in kernel/bpf/verifier.c.) So we can extend our algorithm:

  • while loading the program, the verifier checks that the map is used correctly and writes in the address of the corresponding struct bpf_map structure
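The last two assignments in the kernel excerpt split a 64-bit address across the imm fields of the two instruction slots. A minimal sketch of that split and of the reverse operation, using a hypothetical struct ld64_imm and hypothetical functions split_addr and join_addr:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of imm fields of the two-slot LD64 instruction. */
struct ld64_imm {
    uint32_t lo;  /* goes into insn[0].imm */
    uint32_t hi;  /* goes into insn[1].imm */
};

static_assert(sizeof(uint64_t) == 2 * sizeof(uint32_t),
              "an address must fit into two 32-bit immediates");

/* Same arithmetic as in the excerpt: (u32)addr and addr >> 32. */
static struct ld64_imm split_addr(uint64_t addr)
{
    struct ld64_imm v = { (uint32_t)addr, (uint32_t)(addr >> 32) };
    return v;
}

/* What a consumer of the instruction has to do to get the address back. */
static uint64_t join_addr(struct ld64_imm v)
{
    return ((uint64_t)v.hi << 32) | v.lo;
}
```

For an illustrative address such as 0xffff888012345678, the low half 0x12345678 lands in the first slot and the high half 0xffff8880 in the second, and join_addr restores the original value.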

When an ELF binary is loaded using libbpf, a lot more happens, but we will discuss that in other articles.

Loading programs and maps without libbpf

As promised, here is an example for readers who want to know how to create and load a program that uses maps without the help of libbpf. This can be useful when you work in an environment for which you cannot build dependencies, or when you are saving every bit, or when you are writing a program like ply, which generates BPF binary code on the fly.

To make the logic easier to follow, we will rewrite our example xdp-simple for this purpose. The complete and slightly extended code of the program discussed in this example can be found in this gist.

The logic of our application is as follows:

  • create a map of type BPF_MAP_TYPE_ARRAY using the command BPF_MAP_CREATE,
  • create a program that uses this map,
  • attach the program to the lo interface,

which translates into human language as

int main(void)
{
    int map_fd, prog_fd;

    map_fd = map_create();
    if (map_fd < 0)
        err(1, "bpf: BPF_MAP_CREATE");

    prog_fd = prog_load(map_fd);
    if (prog_fd < 0)
        err(1, "bpf: BPF_PROG_LOAD");

    xdp_attach(1, prog_fd);
}

Here map_create creates a map in the same way as we did in the first example about the bpf system call: "Kernel, please make me a new map in the form of an array of 8 elements of type __u64 and give me back a file descriptor":

static int map_create(void)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.map_type = BPF_MAP_TYPE_ARRAY;
    attr.key_size = sizeof(__u32);
    attr.value_size = sizeof(__u64);
    attr.max_entries = 8;
    strncpy(attr.map_name, "woo", sizeof(attr.map_name));
    return syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
}

The program is also easy to load:

static int prog_load(int map_fd)
{
    union bpf_attr attr;
    struct bpf_insn insns[] = {
        ...
    };

    memset(&attr, 0, sizeof(attr));
    attr.prog_type = BPF_PROG_TYPE_XDP;
    attr.insns     = ptr_to_u64(insns);
    attr.insn_cnt  = sizeof(insns)/sizeof(insns[0]);
    attr.license   = ptr_to_u64("GPL");
    strncpy(attr.prog_name, "woo", sizeof(attr.prog_name));
    return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
}

The tricky part of prog_load is the definition of our BPF program as an array of structures struct bpf_insn insns[]. But since we are using a program that we already have in C, we can cheat a little:

$ llvm-objdump -D --section xdp/simple xdp-simple.bpf.o

0000000000000000 <simple>:
       0:       85 00 00 00 08 00 00 00 call 8
       1:       63 0a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r0
       2:       bf a2 00 00 00 00 00 00 r2 = r10
       3:       07 02 00 00 fc ff ff ff r2 += -4
       4:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
       6:       85 00 00 00 01 00 00 00 call 1
       7:       b7 01 00 00 00 00 00 00 r1 = 0
       8:       15 00 04 00 00 00 00 00 if r0 == 0 goto +4 <LBB0_2>
       9:       61 01 00 00 00 00 00 00 r1 = *(u32 *)(r0 + 0)
      10:       07 01 00 00 01 00 00 00 r1 += 1
      11:       63 10 00 00 00 00 00 00 *(u32 *)(r0 + 0) = r1
      12:       b7 01 00 00 02 00 00 00 r1 = 2

0000000000000068 <LBB0_2>:
      13:       bf 10 00 00 00 00 00 00 r0 = r1
      14:       95 00 00 00 00 00 00 00 exit

In total, we need to write 14 instructions as structures of type struct bpf_insn; the 16-byte load on line 4 occupies two slots, so the array has 15 elements (tip: take the dump from above, re-read the section on instructions, open linux/bpf.h and linux/bpf_common.h and try to define struct bpf_insn insns[] yourself):

struct bpf_insn insns[] = {
    /* 85 00 00 00 08 00 00 00 call 8 */
    {
        .code = BPF_JMP | BPF_CALL,
        .imm = 8,
    },

    /* 63 0a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r0 */
    {
        .code = BPF_MEM | BPF_STX,
        .off = -4,
        .src_reg = BPF_REG_0,
        .dst_reg = BPF_REG_10,
    },

    /* bf a2 00 00 00 00 00 00 r2 = r10 */
    {
        .code = BPF_ALU64 | BPF_MOV | BPF_X,
        .src_reg = BPF_REG_10,
        .dst_reg = BPF_REG_2,
    },

    /* 07 02 00 00 fc ff ff ff r2 += -4 */
    {
        .code = BPF_ALU64 | BPF_ADD | BPF_K,
        .dst_reg = BPF_REG_2,
        .imm = -4,
    },

    /* 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll */
    {
        .code = BPF_LD | BPF_DW | BPF_IMM,
        .src_reg = BPF_PSEUDO_MAP_FD,
        .dst_reg = BPF_REG_1,
        .imm = map_fd,
    },
    { }, /* placeholder */

    /* 85 00 00 00 01 00 00 00 call 1 */
    {
        .code = BPF_JMP | BPF_CALL,
        .imm = 1,
    },

    /* b7 01 00 00 00 00 00 00 r1 = 0 */
    {
        .code = BPF_ALU64 | BPF_MOV | BPF_K,
        .dst_reg = BPF_REG_1,
        .imm = 0,
    },

    /* 15 00 04 00 00 00 00 00 if r0 == 0 goto +4 <LBB0_2> */
    {
        .code = BPF_JMP | BPF_JEQ | BPF_K,
        .off = 4,
        .dst_reg = BPF_REG_0,
        .imm = 0,
    },

    /* 61 01 00 00 00 00 00 00 r1 = *(u32 *)(r0 + 0) */
    {
        .code = BPF_MEM | BPF_LDX,
        .off = 0,
        .src_reg = BPF_REG_0,
        .dst_reg = BPF_REG_1,
    },

    /* 07 01 00 00 01 00 00 00 r1 += 1 */
    {
        .code = BPF_ALU64 | BPF_ADD | BPF_K,
        .dst_reg = BPF_REG_1,
        .imm = 1,
    },

    /* 63 10 00 00 00 00 00 00 *(u32 *)(r0 + 0) = r1 */
    {
        .code = BPF_MEM | BPF_STX,
        .src_reg = BPF_REG_1,
        .dst_reg = BPF_REG_0,
    },

    /* b7 01 00 00 02 00 00 00 r1 = 2 */
    {
        .code = BPF_ALU64 | BPF_MOV | BPF_K,
        .dst_reg = BPF_REG_1,
        .imm = 2,
    },

    /* <LBB0_2>: bf 10 00 00 00 00 00 00 r0 = r1 */
    {
        .code = BPF_ALU64 | BPF_MOV | BPF_X,
        .src_reg = BPF_REG_1,
        .dst_reg = BPF_REG_0,
    },

    /* 95 00 00 00 00 00 00 00 exit */
    {
        .code = BPF_JMP | BPF_EXIT
    },
};
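A quick way to cross-check such a hand-written array is to confirm that each .code expression produces the first byte shown in the corresponding dump comment. The sketch below uses local copies of the opcode bits from linux/bpf_common.h and linux/bpf.h, prefixed with X_ so that it builds without kernel headers; the prefix and the function opcodes_match_dump are inventions of this example.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of opcode bits from linux/bpf_common.h and linux/bpf.h. */
enum {
    X_BPF_LD = 0x00, X_BPF_LDX = 0x01, X_BPF_STX = 0x03,
    X_BPF_JMP = 0x05, X_BPF_ALU64 = 0x07,      /* instruction classes */
    X_BPF_W = 0x00, X_BPF_DW = 0x18,           /* operand sizes */
    X_BPF_IMM = 0x00, X_BPF_MEM = 0x60,        /* addressing modes */
    X_BPF_K = 0x00, X_BPF_X = 0x08,            /* immediate or register source */
    X_BPF_ADD = 0x00, X_BPF_MOV = 0xb0,        /* ALU operations */
    X_BPF_JEQ = 0x10, X_BPF_CALL = 0x80, X_BPF_EXIT = 0x90, /* jumps */
};

static_assert((X_BPF_JMP | X_BPF_CALL) == 0x85, "call opcode");

/* Return 1 if every opcode used in insns[] equals the first dump byte. */
static int opcodes_match_dump(void)
{
    return (X_BPF_JMP | X_BPF_CALL) == 0x85 &&
           (X_BPF_STX | X_BPF_MEM | X_BPF_W) == 0x63 &&
           (X_BPF_ALU64 | X_BPF_MOV | X_BPF_X) == 0xbf &&
           (X_BPF_ALU64 | X_BPF_ADD | X_BPF_K) == 0x07 &&
           (X_BPF_LD | X_BPF_DW | X_BPF_IMM) == 0x18 &&
           (X_BPF_ALU64 | X_BPF_MOV | X_BPF_K) == 0xb7 &&
           (X_BPF_JMP | X_BPF_JEQ | X_BPF_K) == 0x15 &&
           (X_BPF_LDX | X_BPF_MEM | X_BPF_W) == 0x61 &&
           (X_BPF_JMP | X_BPF_EXIT) == 0x95;
}
```

Since BPF_W is zero, writing BPF_MEM | BPF_STX without an explicit size, as the array does, already selects a 32-bit store, which matches the *(u32 *) accesses in the dump.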

An exercise for those who did not write this themselves: find map_fd.

One part of our program remains undisclosed: xdp_attach. Unfortunately, programs of the XDP type cannot be attached using the bpf system call. The people who created BPF and XDP came from the Linux networking community, which means that they used the interface most familiar to them (but not to normal people) for interacting with the kernel: netlink sockets, see also RFC 3549. The simplest way to implement xdp_attach is to copy the code from libbpf, namely from the file netlink.c, which is what we did, shortening it a little:

Welcome to the world of netlink sockets

Open a netlink socket of type NETLINK_ROUTE:

int netlink_open(__u32 *nl_pid)
{
    struct sockaddr_nl sa;
    socklen_t addrlen;
    int one = 1, ret;
    int sock;

    memset(&sa, 0, sizeof(sa));
    sa.nl_family = AF_NETLINK;

    sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (sock < 0)
        err(1, "socket");

    if (setsockopt(sock, SOL_NETLINK, NETLINK_EXT_ACK, &one, sizeof(one)) < 0)
        warnx("netlink error reporting not supported");

    if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0)
        err(1, "bind");

    addrlen = sizeof(sa);
    if (getsockname(sock, (struct sockaddr *)&sa, &addrlen) < 0)
        err(1, "getsockname");

    *nl_pid = sa.nl_pid;
    return sock;
}

We read from such a socket:

static int bpf_netlink_recv(int sock, __u32 nl_pid, int seq)
{
    bool multipart = true;
    struct nlmsgerr *errm;
    struct nlmsghdr *nh;
    char buf[4096];
    int len, ret;

    while (multipart) {
        multipart = false;
        len = recv(sock, buf, sizeof(buf), 0);
        if (len < 0)
            err(1, "recv");

        if (len == 0)
            break;

        for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
                nh = NLMSG_NEXT(nh, len)) {
            if (nh->nlmsg_pid != nl_pid)
                errx(1, "wrong pid");
            if (nh->nlmsg_seq != seq)
                errx(1, "INVSEQ");
            if (nh->nlmsg_flags & NLM_F_MULTI)
                multipart = true;
            switch (nh->nlmsg_type) {
                case NLMSG_ERROR:
                    errm = (struct nlmsgerr *)NLMSG_DATA(nh);
                    if (!errm->error)
                        continue;
                    ret = errm->error;
                    // libbpf_nla_dump_errormsg(nh); too much code to copy...
                    goto done;
                case NLMSG_DONE:
                    return 0;
                default:
                    break;
            }
        }
    }
    ret = 0;
done:
    return ret;
}

Finally, here is our function, which opens a socket and sends it a special message containing the file descriptor:

static int xdp_attach(int ifindex, int prog_fd)
{
    int sock, seq = 0, ret;
    struct nlattr *nla, *nla_xdp;
    struct {
        struct nlmsghdr  nh;
        struct ifinfomsg ifinfo;
        char             attrbuf[64];
    } req;
    __u32 nl_pid = 0;

    sock = netlink_open(&nl_pid);
    if (sock < 0)
        return sock;

    memset(&req, 0, sizeof(req));
    req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
    req.nh.nlmsg_type = RTM_SETLINK;
    req.nh.nlmsg_pid = 0;
    req.nh.nlmsg_seq = ++seq;
    req.ifinfo.ifi_family = AF_UNSPEC;
    req.ifinfo.ifi_index = ifindex;

    /* started nested attribute for XDP */
    nla = (struct nlattr *)(((char *)&req)
            + NLMSG_ALIGN(req.nh.nlmsg_len));
    nla->nla_type = NLA_F_NESTED | IFLA_XDP;
    nla->nla_len = NLA_HDRLEN;

    /* add XDP fd */
    nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
    nla_xdp->nla_type = IFLA_XDP_FD;
    nla_xdp->nla_len = NLA_HDRLEN + sizeof(int);
    memcpy((char *)nla_xdp + NLA_HDRLEN, &prog_fd, sizeof(prog_fd));
    nla->nla_len += nla_xdp->nla_len;

    /* if user passed in any flags, add those too */
    __u32 flags = XDP_FLAGS_SKB_MODE;
    nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
    nla_xdp->nla_type = IFLA_XDP_FLAGS;
    nla_xdp->nla_len = NLA_HDRLEN + sizeof(flags);
    memcpy((char *)nla_xdp + NLA_HDRLEN, &flags, sizeof(flags));
    nla->nla_len += nla_xdp->nla_len;

    req.nh.nlmsg_len += NLA_ALIGN(nla->nla_len);

    if (send(sock, &req, req.nh.nlmsg_len, 0) < 0)
        err(1, "send");
    ret = bpf_netlink_recv(sock, nl_pid, seq);

cleanup:
    close(sock);
    return ret;
}
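The length bookkeeping in xdp_attach can be checked with plain arithmetic. In this sketch the X_ constants are local copies of sizes from linux/netlink.h and linux/rtnetlink.h, and align4 and xdp_request_len are hypothetical helpers that mirror NLMSG_ALIGN/NLA_ALIGN and the nlmsg_len computation; a 4-byte int is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of sizes from linux/netlink.h and linux/rtnetlink.h. */
enum {
    X_NLMSG_HDRLEN  = 16,  /* sizeof(struct nlmsghdr) */
    X_IFINFOMSG_LEN = 16,  /* sizeof(struct ifinfomsg) */
    X_NLA_HDRLEN    = 4,   /* sizeof(struct nlattr) */
};

/* Netlink pads messages and attributes to 4-byte boundaries. */
static size_t align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Final nlmsg_len of the request built by xdp_attach. */
static size_t xdp_request_len(void)
{
    size_t msg, fd_attr, flags_attr, nested;

    assert(sizeof(int) == 4);
    msg        = X_NLMSG_HDRLEN + X_IFINFOMSG_LEN;       /* NLMSG_LENGTH(...) */
    fd_attr    = X_NLA_HDRLEN + sizeof(int);             /* IFLA_XDP_FD */
    flags_attr = X_NLA_HDRLEN + sizeof(unsigned int);    /* IFLA_XDP_FLAGS */
    nested     = X_NLA_HDRLEN + fd_attr + flags_attr;    /* IFLA_XDP */

    return align4(msg) + align4(nested);
}
```

The header part takes 32 bytes and the nested IFLA_XDP attribute 20 bytes, so the request is 52 bytes long and the 64-byte attrbuf leaves room to spare.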

So, everything is ready for testing:

$ cc nolibbpf.c -o nolibbpf
$ sudo strace -e bpf ./nolibbpf
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, map_name="woo", ...}, 72) = 3
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, insn_cnt=15, prog_name="woo", ...}, 72) = 4
+++ exited with 0 +++

Let's check whether our program has attached to lo:

$ ip l show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    prog/xdp id 160

Let's send some pings and look at the map:

$ for s in `seq 234`; do sudo ping -f -c 100 127.0.0.1 >/dev/null 2>&1; done
$ sudo bpftool m dump name woo
key: 00 00 00 00  value: 90 01 00 00 00 00 00 00
key: 01 00 00 00  value: 00 00 00 00 00 00 00 00
key: 02 00 00 00  value: 00 00 00 00 00 00 00 00
key: 03 00 00 00  value: 00 00 00 00 00 00 00 00
key: 04 00 00 00  value: 00 00 00 00 00 00 00 00
key: 05 00 00 00  value: 00 00 00 00 00 00 00 00
key: 06 00 00 00  value: 40 b5 00 00 00 00 00 00
key: 07 00 00 00  value: 00 00 00 00 00 00 00 00
Found 8 elements

Hooray, everything works. Note, by the way, that our map is once again displayed as raw bytes. This is because, unlike libbpf, we did not load type information (BTF). But we will talk more about this next time.
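Without BTF the values have to be decoded by hand. A small sketch, assuming a little-endian machine that produced the dump and a hypothetical helper hexdump_to_u64 (not a bpftool function), turns one value column back into a number:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse up to 8 space-separated hex bytes, as printed by bpftool,
 * into a little-endian integer. */
static uint64_t hexdump_to_u64(const char *s)
{
    uint64_t v = 0;
    char *end;

    assert(s);
    for (int i = 0; i < 8; i++) {
        unsigned long byte = strtoul(s, &end, 16);

        if (end == s)       /* no more bytes to read */
            break;
        assert(byte <= 0xff);
        v |= (uint64_t)byte << (8 * i);
        s = end;
    }
    return v;
}
```

The two non-zero rows, 90 01 00 00 00 00 00 00 and 40 b5 00 00 00 00 00 00, decode to 400 and 46400. Their sum, 46800, equals 234 runs of 100 pings with each ping counted twice, which is consistent with the request + reply pairs seen earlier on lo.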

Development tools

In this section we will look at the minimal BPF developer toolkit.

Generally speaking, you do not need anything special to develop BPF programs: BPF runs on any decent distribution kernel, and programs are built with clang, which can be installed from a package. However, because BPF is under active development, the kernel and the tools change constantly, so if you do not want to write BPF programs using old-fashioned methods from 2019, you will have to build

  • llvm/clang
  • pahole
  • your own kernel
  • bpftool

(For reference, this section and all the examples in the article were run on Debian 10.)

llvm/clang

BPF is friends with LLVM and, although programs for BPF can recently also be compiled with gcc, all current development is done for LLVM. So, first of all, we will build the current version of clang from git:

$ sudo apt install ninja-build
$ git clone --depth 1 https://github.com/llvm/llvm-project.git
$ mkdir -p llvm-project/llvm/build/install
$ cd llvm-project/llvm/build
$ cmake .. -G "Ninja" -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
                      -DLLVM_ENABLE_PROJECTS="clang" \
                      -DBUILD_SHARED_LIBS=OFF \
                      -DCMAKE_BUILD_TYPE=Release \
                      -DLLVM_BUILD_RUNTIME=OFF
$ time ninja
... a long time later
$

Now we can check that everything was built correctly:

$ ./bin/llc --version
LLVM (http://llvm.org/):
  LLVM version 11.0.0git
  Optimized build.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: znver1

  Registered Targets:
    bpf    - BPF (host endian)
    bpfeb  - BPF (big endian)
    bpfel  - BPF (little endian)
    x86    - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64

(The build instructions for clang were taken from bpf_devel_QA.)

We will not install the programs we have just built; instead we will simply add them to PATH, for example:

export PATH="`pwd`/bin:$PATH"

(This can be added to .bashrc or to a separate file. Personally, I put things like this into ~/bin/activate-llvm.sh and, when needed, run . activate-llvm.sh.)

Pahole and BTF

The pahole utility is used when building the kernel to create debugging information in the BTF format. In this article we will not go into the details of BTF technology, beyond the fact that it is convenient and we want to use it. So if you are going to build your own kernel, build pahole first (without pahole you will not be able to build the kernel with the option CONFIG_DEBUG_INFO_BTF):

$ git clone https://git.kernel.org/pub/scm/devel/pahole/pahole.git
$ cd pahole/
$ sudo apt install cmake
$ mkdir build
$ cd build/
$ cmake -D__LIB=lib ..
$ make
$ sudo make install
$ which pahole
/usr/local/bin/pahole

Kernels for experimenting with BPF

When exploring the possibilities of BPF, I want to build my own kernel. Generally speaking, this is not necessary, since you can compile and load BPF programs on a distribution kernel. However, having your own kernel lets you use the latest BPF features, which will reach your distribution in months at best or, as with some debugging tools, will not be packaged at all in the foreseeable future. Besides, building your own kernel makes you feel important and lets you experiment with the code.

To build a kernel you need, first, the kernel itself and, second, a kernel configuration file. To experiment with BPF we can use the usual vanilla kernel or one of the development kernels. Historically, BPF development takes place within the Linux networking community, and therefore all changes sooner or later go through David Miller, the Linux networking maintainer. Depending on their nature - fixes or new features - networking changes fall into one of two kernels, net or net-next. Changes for BPF are distributed in the same way between bpf and bpf-next, which are then pulled into net and net-next, respectively. For more details, see bpf_devel_QA and netdev-FAQ. So choose a kernel according to your taste and the stability needs of the system you are testing on (the *-next kernels are the most unstable of those listed).

Managing kernel configuration files is beyond the scope of this article: it is assumed that you either already know how to do this or are ready to learn on your own. However, the following instructions should be more or less enough to give you a working BPF-enabled system.

Download one of the kernels mentioned above:

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
$ cd bpf-next

Build a minimal working kernel config:

$ cp /boot/config-`uname -r` .config
$ make localmodconfig

Enable the BPF options of your choice in the .config file (most likely CONFIG_BPF itself will already be enabled, since systemd uses it). Here is the list of options from the kernel used for this article:

CONFIG_CGROUP_BPF=y
CONFIG_BPF=y
CONFIG_BPF_LSM=y
CONFIG_BPF_SYSCALL=y
CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
CONFIG_IPV6_SEG6_BPF=y
# CONFIG_NETFILTER_XT_MATCH_BPF is not set
# CONFIG_BPFILTER is not set
CONFIG_NET_CLS_BPF=y
CONFIG_NET_ACT_BPF=y
CONFIG_BPF_JIT=y
CONFIG_BPF_STREAM_PARSER=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_BPF_KPROBE_OVERRIDE=y
CONFIG_DEBUG_INFO_BTF=y

After that we can easily build and install the modules and the kernel (by the way, you can build the kernel with the freshly built clang by adding CC=clang):

$ make -s -j $(getconf _NPROCESSORS_ONLN)
$ sudo make modules_install
$ sudo make install

and reboot into the new kernel (for this I use kexec from the kexec-tools package):

v=5.8.0-rc6+ # if you are rebuilding the current kernel, you can use v=`uname -r`
sudo kexec -l -t bzImage /boot/vmlinuz-$v --initrd=/boot/initrd.img-$v --reuse-cmdline &&
sudo kexec -e

bpftool ua

Cov khoom siv feem ntau siv hauv kab lus yuav yog cov khoom siv hluav taws xob bpftool, muab los ua ib feem ntawm Linux ntsiav. Nws yog sau thiab khaws cia los ntawm BPF cov neeg tsim khoom rau BPF cov neeg tsim khoom thiab tuaj yeem siv los tswj txhua yam ntawm BPF cov khoom - thauj cov kev pab cuam, tsim thiab kho cov duab qhia chaw, tshawb txog lub neej ntawm BPF ecosystem, thiab lwm yam. Cov ntaub ntawv nyob rau hauv daim ntawv ntawm qhov chaws code rau txiv neej nplooj ntawv tuaj yeem pom hauv lub hauv paus los yog, twb compiled, nyob rau hauv net.

At the time of writing, bpftool comes prepackaged only for RHEL, Fedora and Ubuntu (see, for example, this thread, which tells the unfinished story of packaging bpftool in Debian). But if you have already built your kernel, building bpftool is as easy as pie:

$ cd ${linux}/tools/bpf/bpftool
# ... set up the paths to the latest clang, as described above
$ make -s

Auto-detecting system features:
...                        libbfd: [ on  ]
...        disassembler-four-args: [ on  ]
...                          zlib: [ on  ]
...                        libcap: [ on  ]
...               clang-bpf-co-re: [ on  ]

Auto-detecting system features:
...                        libelf: [ on  ]
...                          zlib: [ on  ]
...                           bpf: [ on  ]

$

(Here ${linux} is your kernel source directory.) After running these commands, bpftool will be built in the ${linux}/tools/bpf/bpftool directory, and it can be added to the PATH (first of all for the root user) or simply copied to /usr/local/sbin.
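
If you prefer the PATH route, a small helper keeps the change idempotent, so sourcing your shell profile several times does not keep growing the variable. This is only a sketch: the function name path_prepend is hypothetical, and where you call it from (for example root's shell profile) is up to you.

```shell
# Hypothetical helper: prepend DIR to PATH unless it is already listed.
# Usage: path_prepend DIR
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;     # put DIR first so it takes precedence
    esac
}
```

With ${linux} set as above, calling path_prepend "${linux}/tools/bpf/bpftool" followed by export PATH makes the freshly built binary the one the shell finds first.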

It is best to build bpftool with the latest clang, built as described above, and then check that it was built correctly, for example with the command

$ sudo bpftool feature probe kernel
Scanning system configuration...
bpf() syscall for unprivileged users is enabled
JIT compiler is enabled
JIT compiler hardening is disabled
JIT compiler kallsyms exports are enabled for root
...

which shows which BPF features are enabled in your kernel.

By the way, the previous command can also be run as

# bpftool f p k

This is done by analogy with the utilities from the iproute2 package, where we can, for example, say ip a s eth0 instead of ip addr show dev eth0.
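
The idea behind such abbreviations can be illustrated with a few lines of shell. This is a sketch of the general technique, not bpftool's or iproute2's actual code: the function name resolve_prefix is hypothetical, the candidate lists in the examples are shortened, and the "first match wins" rule is an assumption made here for simplicity.

```shell
# Hypothetical helper: resolve an abbreviated subcommand.
# Prints the first candidate that starts with PREFIX; fails if none does.
# Usage: resolve_prefix PREFIX CANDIDATE...
resolve_prefix() {
    prefix=$1
    shift
    for candidate in "$@"; do
        case $candidate in
            "$prefix"*)
                echo "$candidate"
                return 0
                ;;
        esac
    done
    return 1
}
```

With this helper, resolve_prefix f prog map feature prints feature, and resolve_prefix p probe prints probe. Under the first-match assumption, the order of candidates decides ambiguous cases: resolve_prefix p prog perf prints prog.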

Conclusion

BPF lets you "shoe a flea": measure what the kernel does efficiently and change its functionality on the fly. The system turned out to be very successful, in the best traditions of UNIX: a simple mechanism for (re)programming the kernel has allowed a huge number of people and organizations to experiment. And although the experiments, like the development of the BPF infrastructure itself, are far from finished, the system already has a stable ABI that lets you build reliable and, most importantly, efficient business logic.

I would like to note that, in my opinion, the technology has become so popular because, on the one hand, you can play with it (the architecture of the machine can be understood more or less in one evening), and on the other hand, it solves problems that could not be solved (elegantly) before it appeared. Together, these two qualities make people experiment and dream, which leads to more and more innovative solutions.

This article, although not particularly short, is only an introduction to the world of BPF and does not describe the "advanced" features and important parts of the architecture. The plan going forward is roughly this: the next article will be an overview of BPF program types (the 5.8 kernel supports 30 program types), then we will finally look at how to write real BPF applications, using kernel tracing programs as an example, then it will be time for a more in-depth look at the BPF architecture, followed by examples of BPF networking and security applications.

Previous articles in this series

  1. BPF for the little ones, part zero: classic BPF

Links

  1. BPF and XDP Reference Guide - documentation on BPF from cilium, or more precisely from Daniel Borkmann, one of the creators and maintainers of BPF. This is one of the first serious descriptions, and it differs from the others in that Daniel knows exactly what he is writing about and there are no errors in it. In particular, this document describes how to work with BPF programs of the XDP and TC types using the well-known ip utility from the iproute2 package.

  2. Documentation/networking/filter.txt - the original file with documentation for classic and, later, extended BPF. A good read if you want to delve into the assembly language and the technical details of the architecture.

  3. Blog about BPF from facebook. It is updated rarely but aptly, since the people writing there are Alexei Starovoitov (the author of eBPF) and Andrii Nakryiko (the maintainer of libbpf).

  4. Secrets of bpftool. An entertaining twitter thread from Quentin Monnet with examples and secrets of using bpftool.

  5. Dive into BPF: a list of reading material. A giant (and still maintained) list of links to BPF documentation from Quentin Monnet.

Source: www.hab.com
