BPF for the little ones, part zero: classic BPF

Berkeley Packet Filters (BPF) is a Linux kernel technology that has been on the front pages of English-language tech publications for several years now. Conferences are full of talks about using and developing BPF. David Miller, the maintainer of the Linux networking subsystem, called his talk at Linux Plumbers 2018 "This talk is not about XDP" (XDP is one of the use cases for BPF). Brendan Gregg gives talks titled Linux BPF Superpowers. Toke Høiland-Jørgensen jokes that the kernel is now a microkernel. Thomas Graf promotes the idea that BPF is JavaScript for the kernel.

There is still no systematic description of BPF on Habré, so in this series of articles I will try to cover the history of the technology, describe its architecture and development tools, and outline where and how BPF is applied in practice. This article, number zero in the series, covers the history and architecture of classic BPF and also reveals the secrets of how it works inside tcpdump, seccomp, strace, and more.

The development of BPF is controlled by the Linux networking community, and the main existing applications of BPF are related to networking. Therefore, with the permission of @eucariot, I named the series "BPF for the little ones" in honor of the great series "Networks for the little ones".

A short course in the history of BPF (c)

Modern BPF technology is an improved and extended version of an older technology of the same name, now called classic BPF to avoid confusion. Classic BPF gave rise to the well-known utility tcpdump, the seccomp mechanism, and the lesser-known xt_bpf module for iptables and the cls_bpf classifier. In modern Linux, classic BPF programs are automatically translated into the new form. From the user's point of view, however, the API has stayed in place, and new uses for classic BPF, as we will see in this article, are still being found. For this reason, and also because following the history of classic BPF in Linux makes it clearer how and why it evolved into its modern form, I decided to start with an article about classic BPF.

At the end of the 1980s, engineers from the famous Lawrence Berkeley Laboratory became interested in the question of how to filter network packets properly on hardware that was modern at the time. The basic idea of filtering, originally implemented in the CSPF (CMU/Stanford Packet Filter) technology, was to discard unnecessary packets as early as possible, i.e. in kernel space, because this avoids copying unnecessary data to user space. To provide runtime safety for running user code in kernel space, a sandboxed virtual machine was used.

However, the virtual machines of the existing filters were designed for stack-based machines and did not run as efficiently on the newer RISC machines. As a result, through the efforts of the engineers at Berkeley Labs, a new technology, BPF (Berkeley Packet Filters), was developed. Its virtual machine architecture was modeled on the Motorola 6502 processor, the workhorse of such well-known products as the Apple II and the NES. The new virtual machine improved filter performance by about an order of magnitude compared with existing solutions.

The BPF machine architecture

We will get to know the architecture hands-on, by working through examples. To begin with, though, note that the machine has two 32-bit registers accessible to the user, an accumulator A and an index register X, 64 bytes of memory (16 words) available for writing and subsequent reading, and a small set of instructions for working with these objects. Jump instructions for implementing conditional expressions are also available, but to guarantee that a program terminates in bounded time, jumps can only go forward; in particular, loops are forbidden.

The general scheme for starting the machine is as follows. The user creates a program for the BPF architecture and, using some kernel mechanism (such as a system call), loads the program and attaches it to some event generator in the kernel (for example, an event is the arrival of the next packet on a network card). When an event occurs, the kernel runs the program (for example, in an interpreter), and the machine's memory corresponds to some region of kernel memory (for example, the data of the incoming packet).

This is enough for us to start looking at examples: we will get to know the system and the instruction format as needed. If you want to study the instruction set of the virtual machine right away and learn about all its capabilities, you can read the original paper The BSD Packet Filter and/or the first half of the file Documentation/networking/filter.txt from the kernel documentation. In addition, you can study the presentation libpcap: An Architecture and Optimization Methodology for Packet Capture, in which McCanne, one of the authors of BPF, talks about the history of the creation of libpcap.

We now move on to all the significant examples of using classic BPF on Linux: tcpdump (libpcap), seccomp, xt_bpf, cls_bpf.

tcpdump

BPF was developed in parallel with a frontend for packet filtering, the well-known utility tcpdump. Since this is the oldest and most famous example of using classic BPF, available on many operating systems, we will begin our study of the technology with it.

(I ran all the examples in this article on Linux 5.6.0-rc6. The output of some commands has been edited for readability.)

Example: watching IPv6 packets

Let us imagine that we want to look at all IPv6 packets on the interface eth0. To do this we can run tcpdump with the simple filter ip6:

$ sudo tcpdump -i eth0 ip6

Here tcpdump compiles the filter ip6 into bytecode for the BPF architecture and sends it to the kernel (see the details in the section Tcpdump: loading). The loaded filter is run for every packet passing through the interface eth0. If the filter returns a non-zero value n, then up to n bytes of the packet are copied to user space and we see the packet in the tcpdump output.


It turns out that we can easily find out exactly which bytecode tcpdump sent to the kernel with the help of tcpdump itself, if we run it with the -d option:

$ sudo tcpdump -i eth0 -d ip6
(000) ldh      [12]
(001) jeq      #0x86dd          jt 2    jf 3
(002) ret      #262144
(003) ret      #0

On line zero we execute the instruction ldh [12], which stands for "load into register A the half-word (16 bits) located at address 12". The only question is what kind of memory we are addressing. The answer is that address x holds the (x+1)-th byte of the network packet being analyzed. We read packets from the Ethernet interface eth0, which means that a packet looks like this (for simplicity, we assume there are no VLAN tags in the packet):

       6              6          2
|Destination MAC|Source MAC|Ether Type|...|

So after executing ldh [12], register A contains the Ether Type field, the type of the packet carried in this Ethernet frame. On line 1 we compare the contents of register A (the packet type) with 0x86dd, which is the type we are interested in, IPv6. Besides the comparison instruction, line 1 has two more columns, jt 2 and jf 3: the labels to jump to if the comparison succeeds (A == 0x86dd) and if it fails. So in the successful case (IPv6) we go to line 2, and in the unsuccessful case to line 3. On line 3 the program terminates with code 0 (do not copy the packet); on line 2 it terminates with code 262144 (copy at most 256 kilobytes of the packet).
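Read as ordinary C, the four instructions amount to one bounds check and one comparison. The sketch below is only an illustration: the function name match_ip6 is hypothetical, and returning 0 for a frame that is too short models the fact that a load past the end of the packet terminates a classic BPF program with 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C equivalent of the four-instruction ip6 filter. */
static uint32_t match_ip6(const uint8_t *pkt, size_t len)
{
    assert(pkt != NULL);

    /* (000) ldh [12]: a load past the end of the packet rejects it */
    if (len < 14)
        return 0;
    uint32_t a = (uint32_t)pkt[12] << 8 | pkt[13];  /* big-endian half-word */

    /* (001) jeq #0x86dd: (002) accept up to 262144 bytes, (003) drop */
    return a == 0x86dd ? 262144 : 0;
}
```

The return value plays the same role as in the filter: 0 means "do not copy", anything else is the maximum number of bytes to copy.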

A more complicated example: looking at TCP packets by destination port

Let us see what a filter looks like that copies all TCP packets with destination port 666. We will consider the IPv4 case, since the IPv6 case is simpler. After studying this example, you can explore the IPv6 filter on your own as an exercise (ip6 and tcp dst port 666), as well as the filter for the general case (tcp dst port 666). So, the filter we are interested in looks like this:

$ sudo tcpdump -i eth0 -d ip and tcp dst port 666
(000) ldh      [12]
(001) jeq      #0x800           jt 2    jf 10
(002) ldb      [23]
(003) jeq      #0x6             jt 4    jf 10
(004) ldh      [20]
(005) jset     #0x1fff          jt 10   jf 6
(006) ldxb     4*([14]&0xf)
(007) ldh      [x + 16]
(008) jeq      #0x29a           jt 9    jf 10
(009) ret      #262144
(010) ret      #0

We already know what lines 0 and 1 do. By line 2 we have already checked that this is an IPv4 packet (Ether Type = 0x800), and we load the 24th byte of the packet into register A. Our packet looks like

       14            8      1     1
|ethernet header|ip fields|ttl|protocol|...|

which means that we load into register A the Protocol field of the IP header. This is logical, since we want to copy only TCP packets. We compare Protocol with 0x6 (IPPROTO_TCP) on line 3.

On lines 4 and 5 we load the half-word located at address 20 and test it with the jset instruction. This half-word holds the three IPv4 flag bits followed by the 13-bit fragment offset. The mask 0x1fff given to jset clears the three high bits (the flags, one of which is reserved and must be zero) and keeps the fragment offset. If any bit of the offset is set, the packet is a later fragment of a fragmented IP packet: it does not begin with a TCP header, so there is no port to inspect, and we jump to line 10 and drop it. Only unfragmented packets and first fragments go on to line 6.

Line 6 is the most interesting one in this listing. The expression ldxb 4*([14]&0xf) means that we load into register X the four low bits of the fifteenth byte of the packet, multiplied by 4. The four low bits of the fifteenth byte are the Internet Header Length field of the IPv4 header, which stores the header length in 32-bit words, so it has to be multiplied by 4 to get bytes. Interestingly, the expression 4*([14]&0xf) is the notation for a special addressing mode that can be used only in this form and only for register X, i.e. we cannot write ldb 4*([14]&0xf) or ldxb 5*([14]&0xf) (we can only specify a different offset, for example ldxb 4*([16]&0xf)). Clearly, this addressing mode was added to BPF precisely in order to get the IPv4 header length into X (the index register).

So on line 7 we try to load the half-word at (X+16). Remembering that 14 bytes are taken up by the Ethernet header and that X contains the length of the IPv4 header, we see that A is loaded with the TCP destination port:

       14           X           2             2
|ethernet header|ip header|source port|destination port|

Finally, on line 8 we compare the destination port with the desired value, and on line 9 or 10 we return the result: whether to copy the packet or not.
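The same walk-through can be written down as plain C, which makes the offsets easier to follow. This is a sketch rather than anything tcpdump generates: the names load_half and match_tcp_dport_666 are hypothetical, and the up-front length check is a simplification that gives the same verdicts as the per-instruction bounds checks of the virtual machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 16-bit value, as ldh does. */
static uint32_t load_half(const uint8_t *p)
{
    return (uint32_t)p[0] << 8 | p[1];
}

/* Hypothetical C equivalent of "ip and tcp dst port 666". */
static uint32_t match_tcp_dport_666(const uint8_t *pkt, size_t len)
{
    assert(pkt != NULL);

    if (len < 24)                        /* offsets 12..23 must exist */
        return 0;
    if (load_half(pkt + 12) != 0x0800)   /* (000)-(001) IPv4 only */
        return 0;
    if (pkt[23] != 6)                    /* (002)-(003) Protocol == TCP */
        return 0;
    if (load_half(pkt + 20) & 0x1fff)    /* (004)-(005) non-zero fragment offset */
        return 0;

    size_t x = 4u * (pkt[14] & 0x0f);    /* (006) IPv4 header length in bytes */
    if (x + 18 > len)                    /* (007) needs bytes x+16 and x+17 */
        return 0;

    /* (008)-(010) compare the destination port and return the verdict */
    return load_half(pkt + x + 16) == 666 ? 262144 : 0;
}
```

For a packet with a 20-byte IPv4 header, x is 20 and the destination port is read from bytes 36 and 37 of the frame.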

Tcpdump: loading

In the previous examples we deliberately did not dwell on exactly how we load BPF bytecode into the kernel for packet filtering. Generally speaking, tcpdump has been ported to many systems, and to work with filters it uses the libpcap library. Briefly, to place a filter on an interface using libpcap, you need to create a pcap_t handle from the interface name (pcap_create), activate the interface (pcap_activate), compile the filter (pcap_compile), and attach the filter (pcap_setfilter).

To see how the function pcap_setfilter is implemented on Linux, we use strace (some lines have been removed):

$ sudo strace -f -e trace=%network tcpdump -p -i eth0 ip
socket(AF_PACKET, SOCK_RAW, 768)        = 3
bind(3, {sa_family=AF_PACKET, sll_protocol=htons(ETH_P_ALL), sll_ifindex=if_nametoindex("eth0"), sll_hatype=ARPHRD_NETROM, sll_pkttype=PACKET_HOST, sll_halen=0}, 20) = 0
setsockopt(3, SOL_SOCKET, SO_ATTACH_FILTER, {len=4, filter=0xb00bb00bb00b}, 16) = 0
...

On the first two lines of the output we create a raw socket for reading all Ethernet frames and bind it to the interface eth0. From our first example we know that the filter ip will consist of four BPF instructions, and on the third line we see how, using the SO_ATTACH_FILTER option of the setsockopt system call, we load and attach a filter of length 4. This is our filter.

It is worth noting that in classic BPF, loading and attaching a filter always happen as a single atomic operation, whereas in the new version of BPF, loading a program and binding it to an event generator are separated in time.

The hidden truth

A slightly more complete version of the output looks like this:

$ sudo strace -f -e trace=%network tcpdump -p -i eth0 ip
socket(AF_PACKET, SOCK_RAW, 768)        = 3
bind(3, {sa_family=AF_PACKET, sll_protocol=htons(ETH_P_ALL), sll_ifindex=if_nametoindex("eth0"), sll_hatype=ARPHRD_NETROM, sll_pkttype=PACKET_HOST, sll_halen=0}, 20) = 0
setsockopt(3, SOL_SOCKET, SO_ATTACH_FILTER, {len=1, filter=0xbeefbeefbeef}, 16) = 0
recvfrom(3, 0x7ffcad394257, 1, MSG_TRUNC, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
setsockopt(3, SOL_SOCKET, SO_ATTACH_FILTER, {len=4, filter=0xb00bb00bb00b}, 16) = 0
...

As mentioned above, we load and attach our filter to the socket on line 5, but what happens on lines 3 and 4? It turns out that this is libpcap looking after us: so that the output of our filter does not include packets that do not satisfy it, the library attaches a dummy filter ret #0 (drop all packets), switches the socket to non-blocking mode, and tries to drain all the packets that may have been left over from previous filters.

In total, to filter packets on Linux using classic BPF, you need a filter in the form of a structure like struct sock_fprog and an open socket, after which the filter can be attached to the socket using the setsockopt system call.

Interestingly, a filter can be attached to any socket, not just a raw one. Here is an example of a program that cuts off all but the first two bytes of every incoming UDP datagram. (I added comments to the code so as not to clutter the article.)

For more details on using setsockopt to attach filters, see socket(7). As for writing your own filters of the form struct sock_fprog without the help of tcpdump, we will talk about that in the section Programming BPF with our own hands.

Classic BPF and the 21st century

BPF was included in Linux in 1997 and remained a workhorse for libpcap for a long time without any particular changes (Linux-specific changes did occur, of course, but they did not change the overall picture). The first serious signs that BPF would evolve came in 2011, when Eric Dumazet proposed a patch that added a Just In Time compiler to the kernel, a translator that converts BPF bytecode into native x86_64 code.

The JIT compiler was the first in a chain of changes: in 2012 the ability to write filters for seccomp using BPF appeared, in January 2013 the xt_bpf module was added, which lets you write rules for iptables with the help of BPF, and in October 2013 the cls_bpf module was added as well, which lets you write traffic classifiers using BPF.

We will look at all these examples in more detail soon, but first it will be useful to learn how to write and compile arbitrary programs for BPF, since the capabilities provided by the libpcap library are limited (a simple example: a filter generated by libpcap can return only two values, 0 or 0x40000) or, as in the case of seccomp, not applicable at all.

Programming BPF with our own hands

Let us get acquainted with the binary format of BPF instructions; it is very simple:

   16    8    8     32
| code | jt | jf |  k  |

Each instruction occupies 64 bits: the first 16 bits are the instruction code, then come two eight-bit jump offsets, jt and jf, and then 32 bits for the argument K, whose meaning varies from instruction to instruction. For example, the ret instruction, which terminates the program, has code 6, and the return value is taken from the constant K. In C, a single BPF instruction is represented as the structure

struct sock_filter {
        __u16   code;
        __u8    jt;
        __u8    jf;
        __u32   k;
};

and the whole program has the form of the structure

struct sock_fprog {
        unsigned short len;
        struct sock_filter *filter;
};

So we can already write programs (for example, we know the instruction codes from [1]). This is what the ip6 filter from our first example looks like:

struct sock_filter code[] = {
        { 0x28, 0, 0, 0x0000000c },
        { 0x15, 0, 1, 0x000086dd },
        { 0x06, 0, 0, 0x00040000 },
        { 0x06, 0, 0, 0x00000000 },
};
struct sock_fprog prog = {
        .len = ARRAY_SIZE(code),
        .filter = code,
};

We can legitimately use the program prog in the call

setsockopt(sk, SOL_SOCKET, SO_ATTACH_FILTER, &prog, sizeof(prog))
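To make the encoding less abstract, here is a toy user-space interpreter for just the opcodes that appear in this article. It is a sketch, not the kernel's implementation: struct insn is a hypothetical mirror of struct sock_filter so that the example builds without kernel headers, the function name is made up, and unknown opcodes are simply rejected. Note that in the binary encoding jt and jf are offsets relative to the next instruction, whereas tcpdump -d prints absolute line numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct sock_filter. */
struct insn {
    uint16_t code;
    uint8_t  jt;
    uint8_t  jf;
    uint32_t k;
};

/* The ip6 filter in its raw form, as in the text. */
static const struct insn ip6_prog[] = {
    { 0x28, 0, 0, 0x0000000c },  /* ldh [12] */
    { 0x15, 0, 1, 0x000086dd },  /* jeq #0x86dd */
    { 0x06, 0, 0, 0x00040000 },  /* ret #262144 */
    { 0x06, 0, 0, 0x00000000 },  /* ret #0 */
};

/* Run prog over pkt; any out-of-range load ends the program with 0. */
static uint32_t toy_bpf_run(const struct insn *prog, size_t n,
                            const uint8_t *pkt, size_t len)
{
    uint32_t a = 0, x = 0;
    size_t off;

    assert(prog != NULL && pkt != NULL);
    for (size_t pc = 0; pc < n; pc++) {
        const struct insn *i = &prog[pc];

        switch (i->code) {
        case 0x28:  /* ldh [k] */
        case 0x48:  /* ldh [x + k] */
            off = (size_t)i->k + (i->code == 0x48 ? x : 0);
            if (off + 2 > len)
                return 0;
            a = (uint32_t)pkt[off] << 8 | pkt[off + 1];
            break;
        case 0x30:  /* ldb [k] */
            if (i->k >= len)
                return 0;
            a = pkt[i->k];
            break;
        case 0xb1:  /* ldxb 4*([k]&0xf) */
            if (i->k >= len)
                return 0;
            x = 4u * (pkt[i->k] & 0x0f);
            break;
        case 0x15:  /* jeq #k: offsets are relative to the next instruction */
            pc += (a == i->k) ? i->jt : i->jf;
            break;
        case 0x45:  /* jset #k */
            pc += (a & i->k) ? i->jt : i->jf;
            break;
        case 0x06:  /* ret #k */
            return i->k;
        default:    /* opcode not covered by this sketch */
            return 0;
        }
    }
    return 0;       /* fell off the end; the kernel rejects such programs at load time */
}
```

Calling toy_bpf_run(ip6_prog, 4, frame, frame_len) on an Ethernet frame gives the same verdict the kernel filter would: 0x40000 for IPv6 and 0 otherwise. Because jumps only add non-negative offsets to pc, the loop always terminates.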

Writing programs in the form of machine code is not very convenient, but sometimes it is necessary (for example, for debugging, creating unit tests, writing articles on Habré, and so on). For convenience, helper macros are defined in the file <linux/filter.h>; the same example as above can be rewritten as

struct sock_filter code[] = {
        BPF_STMT(BPF_LD|BPF_H|BPF_ABS, 12),
        BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, ETH_P_IPV6, 0, 1),
        BPF_STMT(BPF_RET|BPF_K, 0x00040000),
        BPF_STMT(BPF_RET|BPF_K, 0),
};
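A quick way to convince yourself that the two spellings are the same program is to build both arrays and compare them field by field. The sketch below does that; the array names and the programs_equal helper are hypothetical, while the macros and constants come from <linux/filter.h> and <linux/if_ether.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/if_ether.h>

/* The ip6 filter written with raw numbers... */
static const struct sock_filter ip6_raw[] = {
    { 0x28, 0, 0, 0x0000000c },
    { 0x15, 0, 1, 0x000086dd },
    { 0x06, 0, 0, 0x00040000 },
    { 0x06, 0, 0, 0x00000000 },
};

/* ...and with the helper macros. */
static const struct sock_filter ip6_macro[] = {
    BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12),
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ETH_P_IPV6, 0, 1),
    BPF_STMT(BPF_RET | BPF_K, 0x00040000),
    BPF_STMT(BPF_RET | BPF_K, 0),
};

/* Return 1 if the first n instructions of a and b are identical. */
static int programs_equal(const struct sock_filter *a,
                          const struct sock_filter *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        if (a[i].code != b[i].code || a[i].jt != b[i].jt ||
            a[i].jf != b[i].jf || a[i].k != b[i].k)
            return 0;
    return 1;
}
```

The opcode is just a bitwise OR of an instruction class, a size, and an addressing mode: BPF_LD | BPF_H | BPF_ABS is 0x00 | 0x08 | 0x20 = 0x28, and BPF_JMP | BPF_JEQ | BPF_K is 0x05 | 0x10 | 0x00 = 0x15.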

However, this option is not very convenient either. The Linux kernel programmers reasoned the same way, and so in the tools/bpf directory of the kernel sources you can find an assembler and a debugger for working with classic BPF.

The assembly language is very similar to the debug output of tcpdump, but in addition we can use symbolic labels. For example, here is a program that drops all packets except TCP/IPv4:

$ cat /tmp/tcp-over-ipv4.bpf
ldh [12]
jne #0x800, drop
ldb [23]
jneq #6, drop
ret #-1
drop: ret #0

By default, the assembler generates code in the format <number of instructions>,<code1> <jt1> <jf1> <k1>,..., and for our TCP example it will be

$ tools/bpf/bpf_asm /tmp/tcp-over-ipv4.bpf
6,40 0 0 12,21 0 3 2048,48 0 0 23,21 0 1 6,6 0 0 4294967295,6 0 0 0,

For the convenience of C programmers, a different output format can be used:

$ tools/bpf/bpf_asm -c /tmp/tcp-over-ipv4.bpf
{ 0x28,  0,  0, 0x0000000c },
{ 0x15,  0,  3, 0x00000800 },
{ 0x30,  0,  0, 0x00000017 },
{ 0x15,  0,  1, 0x00000006 },
{ 0x06,  0,  0, 0xffffffff },
{ 0x06,  0,  0, 0000000000 },

This text can be copied into the definition of an array of type struct sock_filter, as we did at the beginning of this section.
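The two output formats carry the same information, so going from the C form back to the default one is a matter of printing every field in decimal. The following sketch shows this; format_bpf_asm and tcp_over_ipv4 are hypothetical names, and the function is not part of bpf_asm.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <linux/filter.h>

/* The TCP/IPv4 program in the form printed by bpf_asm -c. */
static const struct sock_filter tcp_over_ipv4[] = {
    { 0x28, 0, 0, 0x0000000c },
    { 0x15, 0, 3, 0x00000800 },
    { 0x30, 0, 0, 0x00000017 },
    { 0x15, 0, 1, 0x00000006 },
    { 0x06, 0, 0, 0xffffffff },
    { 0x06, 0, 0, 0x00000000 },
};

/* Write prog as "<n>,<code> <jt> <jf> <k>,..."; return the length or -1 if out is too small. */
static int format_bpf_asm(const struct sock_filter *prog, size_t n,
                          char *out, size_t cap)
{
    assert(prog != NULL && out != NULL);

    int r = snprintf(out, cap, "%zu,", n);
    if (r < 0 || (size_t)r >= cap)
        return -1;
    size_t used = (size_t)r;

    for (size_t i = 0; i < n; i++) {
        r = snprintf(out + used, cap - used, "%u %u %u %u,",
                     (unsigned)prog[i].code, (unsigned)prog[i].jt,
                     (unsigned)prog[i].jf, (unsigned)prog[i].k);
        if (r < 0 || (size_t)r >= cap - used)
            return -1;
        used += (size_t)r;
    }
    return (int)used;
}
```

Formatting tcp_over_ipv4 this way reproduces the default bpf_asm output shown earlier, including 4294967295 for the constant in ret #-1.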

Linux and netsniff-ng extensions

In addition to standard BPF, Linux and tools/bpf/bpf_asm also support a non-standard set of extensions. Basically, the extension instructions are used to access fields of the struct sk_buff structure, which describes a network packet in the kernel. However, there are also other kinds of helper instructions; for example, ldw cpu loads into register A the result of running the kernel function raw_smp_processor_id(). (In the new version of BPF, these non-standard extensions were expanded to provide programs with a set of kernel helpers for accessing memory and structures and for generating events.) Here is an interesting example of a filter in which we copy only the packet headers into user space, using the poff (payload offset) extension:

ld poff
ret a

BPF extensions cannot be used in tcpdump, but this is a good reason to get acquainted with the netsniff-ng toolkit. Among other things, it contains the advanced program netsniff-ng, which, in addition to filtering with BPF, includes an efficient traffic generator, and bpfc, a BPF assembler more advanced than tools/bpf/bpf_asm. The package comes with fairly detailed documentation; see also the links at the end of the article.

seccomp

So, we already know how to write BPF programs of arbitrary complexity and are ready to look at new examples. The first of them is the seccomp technology, which uses BPF filters to control the set of system calls, and their arguments, available to a given process and its descendants.

The first version of seccomp was added to the kernel in 2005 and was not very popular, since it provided only a single option: to restrict the set of system calls available to a process to read, write, exit, and sigreturn. A process that violated the rules was killed with SIGKILL. In 2012, however, seccomp gained the ability to use BPF filters, which lets you define arbitrary sets of allowed system calls and even perform checks on their arguments. (Interestingly, Chrome was one of the first users of this functionality, and people from Chrome are currently developing the KRSI mechanism, which is based on the new version of BPF and allows customization of Linux Security Modules.) Links to additional documentation can be found at the end of the article.

Note that there have already been articles on the hub about using seccomp; perhaps someone will want to read them before (or instead of) reading the following subsections. The article Containers and security: seccomp gives examples of using seccomp, both the 2007 version and the version using BPF (the filters are generated with libseccomp), talks about the connection between seccomp and Docker, and also provides many useful links. The article Isolating daemons with systemd or "you don't need Docker for this!" covers, in particular, how to add blacklists or whitelists of system calls for daemons run by systemd.

Next we will see how to write and load filters for seccomp in bare C and with the libseccomp library, and what the pros and cons of each option are. Finally, we will see how seccomp is used by the strace program.

Writing and loading filters for seccomp

We already know how to write BPF programs, so let us first look at the seccomp programming interface. You can set a filter at the process level, and all child processes will inherit the restrictions. This is done using the seccomp(2) system call:

seccomp(SECCOMP_SET_MODE_FILTER, flags, &filter)

where &filter is a pointer to the structure struct sock_fprog already familiar to us, i.e. a BPF program.

How do programs for seccomp differ from programs for sockets? In the context that is passed to them. In the case of sockets we were given a memory area containing the packet, and in the case of seccomp we are given a structure like

struct seccomp_data {
    int   nr;
    __u32 arch;
    __u64 instruction_pointer;
    __u64 args[6];
};

Here nr is the number of the system call being invoked, arch is the current architecture (more on this below), args holds up to six system call arguments, and instruction_pointer is a pointer to the user-space instruction that made the system call. So, for example, to load the system call number into register A we have to write

ldw [0]

There are other peculiarities of seccomp programs. For example, the context can be accessed only with 32-bit alignment, and you cannot load a half-word or a byte: on an attempt to load the filter ldh [0], the seccomp system call returns EINVAL. Loaded filters are checked by the kernel function seccomp_check_filter(). (The funny thing is that the original commit that added the seccomp functionality forgot to permit the mod instruction (remainder of division) in this function, and it is now unavailable to seccomp BPF programs, since adding it would break the ABI.)
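Rather than hard-coding offsets such as 0 or 4, a C program can take them from the structure definition itself. The sketch below assumes the definition of struct seccomp_data from <linux/seccomp.h>; the enum names and the helpers arg_lo_offset and load_word are hypothetical, and the little-endian assumption in arg_lo_offset is mine.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Byte offsets of the seccomp_data fields, usable as the operand of "ld [k]". */
enum {
    OFF_NR   = offsetof(struct seccomp_data, nr),
    OFF_ARCH = offsetof(struct seccomp_data, arch),
    OFF_IP   = offsetof(struct seccomp_data, instruction_pointer),
    OFF_ARGS = offsetof(struct seccomp_data, args),
};

/* Offset of the low 32 bits of argument i (0..5); assumes a little-endian machine. */
static unsigned int arg_lo_offset(unsigned int i)
{
    assert(i < 6);
    return OFF_ARGS + 8 * i;
}

/* Build "ld [offset]": a 32-bit load at an aligned offset inside seccomp_data. */
static struct sock_filter load_word(unsigned int offset)
{
    assert(offset % 4 == 0 && offset < sizeof(struct seccomp_data));
    struct sock_filter insn = BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offset);
    return insn;
}
```

Since every argument is 64 bits wide and loads are 32 bits wide, checking a full argument takes two loads, one for each half.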

In principle, we already know everything needed to write and read seccomp programs. Usually the program logic is organized as a whitelist or a blacklist of system calls; for example, the program

ld [0]
jeq #304, bad
jeq #176, bad
jeq #239, bad
jeq #279, bad
good: ret #0x7fff0000 /* SECCOMP_RET_ALLOW */
bad: ret #0

checks a blacklist of four system calls numbered 304, 176, 239, and 279. What are these system calls? We cannot say for sure, since we do not know for which architecture the program was written. Therefore, the authors of seccomp suggest starting every program with an architecture check (the current architecture is given in the context as the arch field of struct seccomp_data). With the architecture check, the beginning of the example would look like:

ld [4]
jne #0xc000003e, bad_arch ; SCMP_ARCH_X86_64

and after that our system call numbers would acquire definite meanings.
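Put together, the architecture check and the blacklist can be written as a C array with the macros from <linux/filter.h>. The sketch below also includes a tiny evaluator so that the jump offsets can be checked without installing the filter. The evaluator is purely illustrative and is not how the kernel runs filters; the names DENY, blacklist, and toy_seccomp_eval are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* jeq #nr: jump `skip` instructions ahead on a match, fall through otherwise. */
#define DENY(nr, skip) BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (nr), (skip), 0)

/* The blacklist from the text, prefixed with the architecture check. */
static const struct sock_filter blacklist[] = {
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_X86_64, 0, 6), /* else bad */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    DENY(304, 4),
    DENY(176, 3),
    DENY(239, 2),
    DENY(279, 1),
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),  /* good */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),   /* bad: ret #0 */
};

/* Toy evaluator for the three opcodes used above. */
static uint32_t toy_seccomp_eval(const struct sock_filter *prog, size_t n,
                                 const struct seccomp_data *sd)
{
    uint32_t a = 0;

    assert(prog != NULL && sd != NULL);
    for (size_t pc = 0; pc < n; pc++) {
        const struct sock_filter *i = &prog[pc];

        switch (i->code) {
        case BPF_LD | BPF_W | BPF_ABS:
            if (i->k % 4 || i->k + 4 > sizeof(*sd))
                return SECCOMP_RET_KILL;
            memcpy(&a, (const char *)sd + i->k, sizeof(a));
            break;
        case BPF_JMP | BPF_JEQ | BPF_K:
            pc += (a == i->k) ? i->jt : i->jf;
            break;
        case BPF_RET | BPF_K:
            return i->k;
        default:
            return SECCOMP_RET_KILL;
        }
    }
    return SECCOMP_RET_KILL;
}
```

Feeding the evaluator a struct seccomp_data with arch set to AUDIT_ARCH_X86_64 and nr set to one of the four numbers yields SECCOMP_RET_KILL, any other number yields SECCOMP_RET_ALLOW, and any other architecture is rejected before the call number is even looked at.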

Writing and loading filters for seccomp using libseccomp

Writing filters in native code or in BPF assembly gives you full control over the result, but at the same time it is sometimes preferable to have portable and/or readable code. The libseccomp library helps with this; it provides a standard interface for writing blacklist or whitelist filters.

Let us, for example, write a program that runs a binary of the user's choice, having first installed the blacklist of system calls from the above-mentioned article (the program has been simplified for readability; the full version can be found here):

#include <seccomp.h>
#include <unistd.h>
#include <err.h>

static int sys_numbers[] = {
        __NR_mount,
        __NR_umount2,
        // ... 40 more system calls ...
        __NR_vmsplice,
        __NR_perf_event_open,
};

int main(int argc, char **argv)
{
        scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);

        for (size_t i = 0; i < sizeof(sys_numbers)/sizeof(sys_numbers[0]); i++)
                seccomp_rule_add(ctx, SCMP_ACT_TRAP, sys_numbers[i], 0);

        seccomp_load(ctx);

        execvp(argv[1], &argv[1]);
        err(1, "execvp: %s", argv[1]);
}

First we define an array sys_numbers of 40+ system call numbers to block. Then we initialize the context ctx and tell the library that we want to allow (SCMP_ACT_ALLOW) all system calls by default (this makes it easier to build blacklists). Then, one by one, we add all the system calls from the blacklist. As the reaction to a system call from the list we request SCMP_ACT_TRAP; in this case seccomp sends the process a SIGSYS signal with a description of exactly which system call violated the rules. Finally, we load the program into the kernel using seccomp_load, which compiles the program and attaches it to the process using the seccomp(2) system call.

For a successful build, the program must be linked with the libseccomp library, for example:

cc -std=c17 -Wall -Wextra -c -o seccomp_lib.o seccomp_lib.c
cc -o seccomp_lib seccomp_lib.o -lseccomp

Example of a successful run:

$ ./seccomp_lib echo ok
ok

Example of a blocked system call:

$ sudo ./seccomp_lib mount -t bpf bpf /tmp
Bad system call

We use strace for the details:

$ sudo strace -e seccomp ./seccomp_lib mount -t bpf bpf /tmp
seccomp(SECCOMP_SET_MODE_FILTER, 0, {len=50, filter=0x55d8e78428e0}) = 0
--- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_call_addr=0xboobdeadbeef, si_syscall=__NR_mount, si_arch=AUDIT_ARCH_X86_64} ---
+++ killed by SIGSYS (core dumped) +++
Bad system call

From this we can see that the program was terminated because it used the forbidden system call mount(2).

So, we have written a filter using the libseccomp library, fitting non-trivial code into four lines. In the example above, if the list of system calls is long, the execution time can grow noticeably, since the check is just a sequence of comparisons. As an optimization, libseccomp recently merged a patch that adds support for the filter attribute SCMP_FLTATR_CTL_OPTIMIZE. Setting this attribute to 2 converts the filter into a binary search program.

If you want to see how binary search filters work, take a look at a simple script that generates such programs in BPF assembler from a list of system call numbers, for example:

$ echo 1 3 6 8 13 | ./generate_bin_search_bpf.py
ld [0]
jeq #6, bad
jgt #6, check8
jeq #1, bad
jeq #3, bad
ret #0x7fff0000
check8:
jeq #8, bad
jeq #13, bad
ret #0x7fff0000
bad: ret #0

You will not be able to write anything significantly faster, since BPF programs cannot perform indirect jumps (we cannot write, for example, jmp A or jmp [label+X]), and therefore all jumps are static.

seccomp and strace

Everyone knows the strace utility as an indispensable tool for studying the behavior of processes on Linux. However, many have also heard about performance problems when using it. The thing is that strace is implemented using ptrace(2), and with this mechanism we cannot specify on exactly which set of system calls the process should be stopped. That is, for example, the commands

$ time strace du /usr/share/ >/dev/null 2>&1

real    0m3.081s
user    0m0.531s
sys     0m2.073s

and

$ time strace -e open du /usr/share/ >/dev/null 2>&1

real    0m2.404s
user    0m0.193s
sys     0m1.800s

run in roughly the same time, even though in the second case we want to trace only one system call.

The new --seccomp-bpf option, added in strace version 5.3, speeds the process up many times over, and the run time while tracing a single system call is already comparable to the time of a normal run:

$ time strace --seccomp-bpf -e open du /usr/share/ >/dev/null 2>&1

real    0m0.148s
user    0m0.017s
sys     0m0.131s

$ time du /usr/share/ >/dev/null 2>&1

real    0m0.140s
user    0m0.024s
sys     0m0.116s

(Here, of course, there is a slight deception in that we are not tracing the main system call of this command. If we traced, for example, newfsstat, then strace would slow down just as much as without --seccomp-bpf.)

How does this option work? Without it, strace attaches to the process and starts it using PTRACE_SYSCALL. When the traced process issues (any) system call, control is transferred to strace, which looks at the arguments of the system call and resumes it using PTRACE_SYSCALL. After some time the process completes the system call, and on exit from it control is again transferred to strace, which looks at the return value and resumes the process using PTRACE_SYSCALL, and so on.


With seccomp, however, this process can be optimized in exactly the way we would like. Namely, if we want to look only at system call X, we can write a BPF filter that returns SECCOMP_RET_TRACE for X and SECCOMP_RET_ALLOW for the calls that do not interest us:

ld [0]
jneq #X, ignore
trace: ret #0x7ff00000
ignore: ret #0x7fff0000

In this case strace initially starts the process with PTRACE_CONT, and our filter is run for every system call. If the system call is not X, the process simply continues running; if it is X, seccomp transfers control to strace, which looks at the arguments and resumes the process with PTRACE_SYSCALL (since seccomp has no way to run a program on exit from a system call). When the system call returns, strace resumes the process with PTRACE_CONT and waits for new messages from seccomp.
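The difference between the two modes can be sketched as a toy model that just counts how many times the tracer is woken up. This is a deliberate simplification under my own assumptions (two stops per traced call, no signals or process exit events), and the syscall numbers used below are placeholders.

```python
SECCOMP_RET_TRACE = 0x7FF00000
SECCOMP_RET_ALLOW = 0x7FFF0000


def filter_action(nr, traced_nr):
    """Mirror of the four-instruction filter: trace X, allow everything else."""
    return SECCOMP_RET_TRACE if nr == traced_nr else SECCOMP_RET_ALLOW


def tracer_stops(calls, traced_nr, use_seccomp):
    """Count tracer wake-ups for a sequence of syscall numbers (toy model)."""
    stops = 0
    for nr in calls:
        if not use_seccomp:
            # PTRACE_SYSCALL: one stop on entry and one on exit, for every call.
            stops += 2
        elif filter_action(nr, traced_nr) == SECCOMP_RET_TRACE:
            # One seccomp stop, then one exit stop requested via PTRACE_SYSCALL.
            stops += 2
    return stops


if __name__ == "__main__":
    # Placeholder workload: 1000 uninteresting calls and 3 traced ones.
    calls = [262] * 1000 + [2] * 3
    print(tracer_stops(calls, traced_nr=2, use_seccomp=False))
    print(tracer_stops(calls, traced_nr=2, use_seccomp=True))
```

In this model the plain ptrace mode pays for every system call, while the seccomp mode pays only for the calls that match the filter, which is consistent with the timings shown earlier.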


There are two restrictions when using the --seccomp-bpf option. First, it is not possible to attach to an already running process (the -p option of strace), since seccomp does not support this. Second, there is no way to avoid following child processes, since seccomp filters are inherited by all child processes and this cannot be disabled.

A bit more detail on exactly how strace works with seccomp can be found in a recent talk. For us, the most interesting fact is that classic BPF, as represented by seccomp, is still in use today.

xt_bpf

Let us now go back to the world of networking.

Background: a long time ago, in 2007, the xt_u32 module for netfilter was added to the kernel. It was written by analogy with the even more ancient traffic classifier cls_u32 and let you write arbitrary binary rules for iptables using the following simple operations: load 32 bits from a packet and perform a set of arithmetic operations on them. For example,

sudo iptables -A INPUT -m u32 --u32 "6&0xFF=1" -j LOG --log-prefix "seen-by-xt_u32"

loads 32 bits of the IP header, starting at offset 6, and applies the mask 0xFF to them (takes the low byte). This is the protocol field of the IP header, and we compare it with 1 (ICMP). You can combine many checks in one rule, and you can also apply the @ operator, which moves X bytes to the right. For example, the rule

iptables -m u32 --u32 "6&0xFF=0x6 && 0>>22&0x3C@4=0x29"

checks whether the TCP Sequence Number is equal to 0x29. I will not go into further detail, since it is already clear that writing such rules by hand is not very convenient. The article BPF - the forgotten bytecode has several links with examples of usage and rule generation for xt_u32. See also the links at the end of this article.

Since 2013, instead of the xt_u32 module you can use the BPF-based module xt_bpf. Anyone who has read this far should already be clear about the principle of its operation: run BPF bytecode as iptables rules. You can create a new rule, for example, like this:
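To see what the two u32 expressions above actually compute, here is a small Python sketch that evaluates them by hand against raw packet bytes. It is not an xt_u32 parser; the function names are made up for illustration, and the packet is assumed to start at the IPv4 header.

```python
import struct


def load32(packet, offset):
    """Load a big-endian 32-bit word from the packet, as u32 does."""
    return struct.unpack_from("!I", packet, offset)[0]


def is_icmp(packet):
    """Hand-written equivalent of "6&0xFF=1"."""
    return (load32(packet, 6) & 0xFF) == 1


def tcp_seq_equals(packet, value):
    """Hand-written equivalent of "6&0xFF=0x6 && 0>>22&0x3C@4=value"."""
    if (load32(packet, 6) & 0xFF) != 6:  # protocol must be TCP
        return False
    # 0>>22&0x3C extracts the IHL field already multiplied by 4,
    # i.e. the IP header length in bytes.
    ip_header_len = (load32(packet, 0) >> 22) & 0x3C
    # "@4" moves the base by that length and loads 32 bits at offset 4,
    # which is where the TCP sequence number lives.
    return load32(packet, ip_header_len + 4) == value


if __name__ == "__main__":
    # Minimal 20-byte IPv4 header (IHL=5, protocol=6) plus start of a TCP header.
    ip = bytes([0x45, 0, 0, 40, 0, 0, 0, 0, 64, 6, 0, 0,
                10, 0, 0, 1, 10, 0, 0, 2])
    tcp = struct.pack("!HHI", 1234, 80, 0x29) + bytes(12)
    print(is_icmp(ip + tcp), tcp_seq_equals(ip + tcp, 0x29))
```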

iptables -A INPUT -m bpf --bytecode <bytecode> -j LOG

here <bytecode> is the code in the default output format of the bpf_asm assembler, for example,

$ cat /tmp/test.bpf
ldb [9]
jneq #17, ignore
ret #1
ignore: ret #0

$ bpf_asm /tmp/test.bpf
4,48 0 0 9,21 0 1 17,6 0 0 1,6 0 0 0,

# iptables -A INPUT -m bpf --bytecode "$(bpf_asm /tmp/test.bpf)" -j LOG

In this example we match all UDP packets. The context for a BPF program in the xt_bpf module, of course, points to the packet data; in the case of iptables, to the start of the IPv4 header. The return value of the BPF program is boolean, where false means that the packet did not match.

It is clear that the xt_bpf module supports more complex filters than the example above. Let's look at real examples from Cloudflare. Until recently they used the xt_bpf module to protect against DDoS attacks. In the article Introducing the BPF Tools they explain how (and why) they generate BPF filters and publish links to a set of utilities for creating such filters. For example, using the bpfgen utility you can create a BPF program that matches a DNS query for the name habr.com:
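The bytecode string printed by bpf_asm is easy to read back. The following Python sketch parses it into (code, jt, jf, k) tuples and interprets just the three opcodes that occur in this example; it is an illustration under that assumption, not a full classic BPF interpreter, and the helper names are my own.

```python
LDB_ABS = 0x30  # ldb [k]: load one byte at absolute offset k into A
JEQ_K = 0x15    # jeq #k: skip jt instructions if A == k, else skip jf
RET_K = 0x06    # ret #k: return the constant k


def parse_bytecode(text):
    """Parse 'N,code jt jf k,...' into a list of (code, jt, jf, k) tuples."""
    fields = [f for f in text.strip().split(",") if f.strip()]
    count = int(fields[0])
    insns = [tuple(int(x) for x in f.split()) for f in fields[1:]]
    if len(insns) != count:
        raise ValueError("instruction count mismatch")
    return insns


def run(insns, packet):
    """Interpret the program against packet bytes and return its result."""
    a = 0
    pc = 0
    while pc < len(insns):
        code, jt, jf, k = insns[pc]
        pc += 1
        if code == LDB_ABS:
            if k >= len(packet):
                return 0  # an out-of-bounds load ends the filter with 0
            a = packet[k]
        elif code == JEQ_K:
            pc += jt if a == k else jf
        elif code == RET_K:
            return k
        else:
            raise NotImplementedError(f"opcode {code:#x} is not handled here")
    raise ValueError("program ended without ret")


if __name__ == "__main__":
    prog = parse_bytecode("4,48 0 0 9,21 0 1 17,6 0 0 1,6 0 0 0,")
    udp_header = bytes([0x45, 0, 0, 28, 0, 0, 0, 0, 64, 17]) + bytes(10)
    print(bool(run(prog, udp_header)))
```

Note how `jneq #17, ignore` is encoded as a jeq instruction with the true and false offsets swapped (`21 0 1 17`).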

$ ./bpfgen --assembly dns -- habr.com
ldx 4*([0]&0xf)
ld #20
add x
tax

lb_0:
    ld [x + 0]
    jneq #0x04686162, lb_1
    ld [x + 4]
    jneq #0x7203636f, lb_1
    ldh [x + 8]
    jneq #0x6d00, lb_1
    ret #65535

lb_1:
    ret #0

In the program we first load into register X the address of the start of the string \x04habr\x03com\x00 inside the UDP datagram and then check the request: 0x04686162 <-> "\x04hab", and so on.

A little later, Cloudflare published the code of a p0f -> BPF compiler. In the article Introducing the p0f BPF compiler they explain what p0f is and how to convert p0f signatures to BPF:
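The constants in the listing can be reproduced by hand. The sketch below encodes a domain name in DNS wire format (length-prefixed labels) and splits it into 4-, 2- and 1-byte big-endian chunks, the sizes that ld, ldh and ldb can load. The chunking strategy is my assumption about how such a matcher can be laid out; it is not taken from the bpfgen source.

```python
def encode_qname(name):
    """Encode a domain name as length-prefixed labels with a zero terminator."""
    out = b""
    for label in name.strip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def chunks(data):
    """Split data greedily into (size, value) pairs of 4, 2 or 1 bytes."""
    result = []
    pos = 0
    while pos < len(data):
        for size in (4, 2, 1):
            if len(data) - pos >= size:
                value = int.from_bytes(data[pos:pos + size], "big")
                result.append((size, value))
                pos += size
                break
    return result


if __name__ == "__main__":
    for size, value in chunks(encode_qname("habr.com")):
        print(size, hex(value))
```

For habr.com this gives two 32-bit words and one 16-bit halfword, matching the two ld instructions and the single ldh in the generated program.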

$ ./bpfgen p0f -- 4:64:0:0:*,0::ack+:0
39,0 0 0 0,48 0 0 8,37 35 0 64,37 0 34 29,48 0 0 0,
84 0 0 15,21 0 31 5,48 0 0 9,21 0 29 6,40 0 0 6,
...

Cloudflare no longer uses xt_bpf, since they have moved to XDP, one of the ways to use the new version of BPF; see L4Drop: XDP DDoS Mitigations.

cls_bpf

The last example of using classic BPF in the kernel is the cls_bpf classifier for the traffic control subsystem in Linux, added at the end of 2013 and conceptually replacing the ancient cls_u32.

However, we will not describe how cls_bpf works now, since from the point of view of knowledge about classic BPF this would give us nothing: we are already familiar with all the functionality. In addition, in subsequent articles covering Extended BPF, we will meet this classifier more than once.

Another reason not to talk about using classic BPF with cls_bpf is that, compared to Extended BPF, the scope of applicability in this case is radically narrower: classic programs cannot change the contents of packets and cannot save state between calls.

So it is time to say goodbye to classic BPF and look to the future.

Farewell to classic BPF

We have seen how BPF technology, developed in the early nineties, successfully lived for a quarter of a century and kept finding new applications until the very end. However, similar to the transition from stack machines to RISC, which served as the impetus for the development of classic BPF, in the 2000s there was a transition from 32-bit to 64-bit machines, and classic BPF began to become obsolete. In addition, the capabilities of classic BPF are very limited. Besides the outdated architecture, we cannot save state between calls to BPF programs, there is no direct interaction with the user, and there is no interaction with the kernel apart from reading a limited number of sk_buff structure fields and calling the simplest helper functions. You also cannot change the contents of packets or redirect them.

In fact, all that currently remains of classic BPF in Linux is the API interface, and inside the kernel all classic programs, whether socket filters or seccomp filters, are automatically translated into the new format, Extended BPF. (We will talk about exactly how this happens in the next article.)

The transition to a new architecture began in 2013, when Alexey Starovoitov proposed a scheme for updating BPF. In 2014 the corresponding patches began to appear in the kernel. As far as I understand, the initial plan was only to optimize the architecture and the JIT compiler to run more efficiently on 64-bit machines, but instead these optimizations marked the beginning of a new chapter in Linux development.

Further articles in this series will cover the architecture and applications of the new technology, initially known as internal BPF, then extended BPF, and now simply BPF.

References

  1. Steven McCanne and Van Jacobson, "The BSD Packet Filter: A New Architecture for User-level Packet Capture", https://www.tcpdump.org/papers/bpf-usenix93.pdf
  2. Steven McCanne, "libpcap: An Architecture and Optimization Methodology for Packet Capture", https://sharkfestus.wireshark.org/sharkfest.11/presentations/McCanne-Sharkfest'11_Keynote_Address.pdf
  3. tcpdump, libpcap: https://www.tcpdump.org/
  4. IPtable U32 Match Tutorial.
  5. BPF - the forgotten bytecode: https://blog.cloudflare.com/bpf-the-forgotten-bytecode/
  6. Introducing the BPF Tools: https://blog.cloudflare.com/introducing-the-bpf-tools/
  7. bpf_cls: http://man7.org/linux/man-pages/man8/tc-bpf.8.html
  8. A seccomp overview: https://lwn.net/Articles/656307/
  9. https://github.com/torvalds/linux/blob/master/Documentation/userspace-api/seccomp_filter.rst
  10. habr: Containers and security: seccomp
  11. habr: Isolating daemons with systemd or "you don't need Docker for this!"
  12. Paul Chaignon, "strace --seccomp-bpf: a look under the hood", https://fosdem.org/2020/schedule/event/debugging_strace_bpf/
  13. netsniff-ng: http://netsniff-ng.org/

Source: www.hab.com
