BPF for the little ones, part zero: classic BPF

Berkeley Packet Filters (BPF) is a Linux kernel technology that has been on the front pages of English-language tech publications for several years now. Conferences are packed with talks about using and developing BPF. David Miller, the maintainer of the Linux networking subsystem, called his talk at Linux Plumbers 2018 "This talk is not about XDP" (XDP is one of the use cases for BPF). Brendan Gregg gives talks titled Linux BPF Superpowers. Toke Høiland-Jørgensen jokes that the kernel is now a microkernel. Thomas Graf promotes the idea that BPF is javascript for the kernel.

There is still no systematic description of BPF on Habr, so in a series of articles I will try to tell the history of the technology, describe its architecture and development tools, and outline where and how BPF is applied in practice. This article, number zero in the series, covers the history and architecture of classic BPF and also reveals how it works inside tcpdump, seccomp, strace, and more.

The development of BPF is driven by the Linux networking community, and the main existing applications of BPF are network related. That is why, with the permission of @eucariot, I named the series "BPF for the little ones", in honor of the great series "Networks for the little ones".

A short course in the history of BPF(c)

Modern BPF is an improved and extended version of an older technology with the same name, which is now called classic BPF to avoid confusion. The well-known tcpdump utility, the seccomp mechanism, and the lesser-known xt_bpf module for iptables and the cls_bpf classifier were all built on classic BPF. In modern Linux, classic BPF programs are automatically translated into the new form, but from the user's point of view the API has stayed in place, and new uses for classic BPF are still being found, as we will see in this article. For this reason, and because following the history of classic BPF in Linux makes it clearer how and why it evolved into its modern form, I decided to start with an article about classic BPF.

In the late 1980s, engineers from the famous Lawrence Berkeley Laboratory became interested in how to filter network packets efficiently on the hardware of that time. The basic idea of filtering, first implemented in CSPF (CMU/Stanford Packet Filter), was to discard unneeded packets as early as possible, i.e. in kernel space, since this avoids copying useless data into user space. To make it safe to run user code in kernel space at runtime, a sandboxed virtual machine was used.

However, the virtual machines of the existing filters were designed for stack-based machines and ran less efficiently on the newer RISC machines. As a result, the engineers from Berkeley Labs developed a new technology, BPF (Berkeley Packet Filters), whose virtual machine architecture was modeled on the MOS Technology 6502 processor, the workhorse of such well-known products as the Apple II and the NES. The new virtual machine improved filter performance by tens of times compared to the existing solutions.

BPF machine architecture

We will get to know the architecture in a hands-on way, by working through examples. To begin with, though, let us say that the machine has two 32-bit registers accessible to the user, an accumulator A and an index register X, 64 bytes of memory (16 words) that can be written and later read back, and a small instruction set for working with these objects. Jump instructions for implementing conditional logic are also available, but to guarantee that a program terminates in bounded time, jumps can only go forward, which in particular means that loops are forbidden.

The general scheme for starting the machine is as follows. The user creates a program for the BPF architecture and, using some kernel mechanism (such as a system call), loads the program and attaches it to some event generator in the kernel (for example, an event can be the arrival of the next packet at the network card). When an event occurs, the kernel runs the program (for example, in an interpreter), and the machine's memory corresponds to some region of kernel memory (for example, the data of the incoming packet).

This is enough for us to start looking at examples: we will learn the instruction set and the instruction format as we go. If you want to study the instruction set of the virtual machine and all its capabilities right away, you can read the original paper The BSD Packet Filter and/or the first half of the file Documentation/networking/filter.txt from the kernel documentation. You can also study the presentation libpcap: An Architecture and Optimization Methodology for Packet Capture, in which McCanne, one of the authors of BPF, talks about the history of libpcap.

We now move on to all the significant examples of classic BPF usage on Linux: tcpdump (libpcap), seccomp, xt_bpf, cls_bpf.

tcpdump

BPF was developed in parallel with a frontend for packet filtering, the well-known tcpdump utility. Since it is the oldest and best-known example of classic BPF usage and is available on many operating systems, we start our study of the technology with it.

(I ran all the examples in this article on Linux 5.6.0-rc6. The output of some commands has been edited for readability.)

Example: watching IPv6 packets

Imagine that we want to look at all IPv6 packets on the interface eth0. To do this, we can run tcpdump with the simplest filter, ip6:

$ sudo tcpdump -i eth0 ip6

In doing so, tcpdump compiles the filter ip6 into bytecode for the BPF architecture and sends it to the kernel (see the details in the section Tcpdump: loading). The loaded filter is run for every packet passing through the interface eth0. If the filter returns a non-zero value n, then up to n bytes of the packet are copied to user space and we see the packet in the tcpdump output.

It turns out that we can easily find out which bytecode tcpdump sent to the kernel with the help of tcpdump itself, if we run it with the -d option:

$ sudo tcpdump -i eth0 -d ip6
(000) ldh      [12]
(001) jeq      #0x86dd          jt 2    jf 3
(002) ret      #262144
(003) ret      #0

On line zero we run the instruction ldh [12], which stands for "load into register A the half-word (16 bits) located at address 12". The only question is what kind of memory we are addressing. The answer is that address x is where the (x+1)th byte of the network packet being analyzed begins. We read packets from the Ethernet interface eth0, which means the packet looks like this (for simplicity, we assume there are no VLAN tags in the packet):

       6              6          2
|Destination MAC|Source MAC|Ether Type|...|

So after executing ldh [12], register A holds the Ether Type field, the type of the packet carried in this Ethernet frame. On line 1 we compare the contents of register A (the packet type) with 0x86dd, which is the type we are interested in, IPv6. Besides the compare instruction, line 1 has two more columns, jt 2 and jf 3: the labels to jump to if the comparison succeeds (A == 0x86dd) or fails. So in the successful case (IPv6) we go to line 2, and in the unsuccessful case to line 3. On line 3 the program ends with code 0 (do not copy the packet), and on line 2 it ends with code 262144 (copy at most 256 kilobytes of the packet).
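To make the walkthrough concrete, here is a minimal sketch of an interpreter that can execute just this four-instruction program. It is not the kernel's implementation: struct insn, run_filter, and the OP_* names are hypothetical and exist only for this illustration, and only the three opcodes used by the ip6 filter are handled. Note that tcpdump -d prints absolute jump targets (jt 2, jf 3), while the encoded instruction stores offsets relative to the next instruction (0 and 1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as a classic BPF instruction; the name is hypothetical. */
struct insn {
    uint16_t code;
    uint8_t  jt, jf;
    uint32_t k;
};

/* The three opcodes the ip6 filter needs (numeric values as in the bytecode). */
enum { OP_LDH_ABS = 0x28, OP_JEQ_K = 0x15, OP_RET_K = 0x06 };

/* Hypothetical helper: run a program over a packet and return its verdict. */
static uint32_t run_filter(const struct insn *prog, size_t len,
                           const uint8_t *pkt, size_t pkt_len)
{
    uint32_t a = 0;                     /* accumulator A */

    assert(prog != NULL && pkt != NULL);
    for (size_t pc = 0; pc < len; pc++) {
        const struct insn *i = &prog[pc];

        switch (i->code) {
        case OP_LDH_ABS:                /* A = big-endian half-word at pkt[k] */
            if ((size_t)i->k + 2 > pkt_len)
                return 0;               /* a load past the end drops the packet */
            a = (uint32_t)pkt[i->k] << 8 | pkt[i->k + 1];
            break;
        case OP_JEQ_K:                  /* offsets count from the next instruction */
            pc += (a == i->k) ? i->jt : i->jf;
            break;
        case OP_RET_K:                  /* number of bytes to copy, 0 means drop */
            return i->k;
        default:
            return 0;                   /* opcode outside this sketch: drop */
        }
    }
    return 0;
}

/* The ip6 filter from the listing above, with relative jump offsets. */
static const struct insn ip6_filter[] = {
    { OP_LDH_ABS, 0, 0, 12 },
    { OP_JEQ_K,   0, 1, 0x86dd },
    { OP_RET_K,   0, 0, 262144 },
    { OP_RET_K,   0, 0, 0 },
};
```

Calling run_filter(ip6_filter, 4, frame, frame_len) on a frame whose bytes 12 and 13 are 0x86 0xdd yields 262144, and any other Ether Type yields 0.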

A more complicated example: watching TCP packets by destination port

Let us see what a filter looks like that copies all TCP packets with destination port 666. We consider the IPv4 case, since the IPv6 case is simpler. After studying this example, you can explore the IPv6 filter (ip6 and tcp dst port 666) and the filter for the general case (tcp dst port 666) on your own as an exercise. So, the filter we are interested in looks like this:

$ sudo tcpdump -i eth0 -d ip and tcp dst port 666
(000) ldh      [12]
(001) jeq      #0x800           jt 2    jf 10
(002) ldb      [23]
(003) jeq      #0x6             jt 4    jf 10
(004) ldh      [20]
(005) jset     #0x1fff          jt 10   jf 6
(006) ldxb     4*([14]&0xf)
(007) ldh      [x + 16]
(008) jeq      #0x29a           jt 9    jf 10
(009) ret      #262144
(010) ret      #0

We already know what lines 0 and 1 do. By line 2 we have already checked that this is an IPv4 packet (Ether Type = 0x800), and we load the 24th byte of the packet into register A. Our packet looks like this:

       14            8      1     1
|ethernet header|ip fields|ttl|protocol|...|

which means that we load the Protocol field of the IP header into register A. This makes sense, because we want to copy only TCP packets. On line 3 we compare Protocol with 0x6 (IPPROTO_TCP).

On lines 4 and 5 we load the half-word at address 20 and use the jset instruction to test it against the mask 0x1fff, in which the three most significant bits are cleared. Those three bits of the half-word are the IP flags: one is reserved and must be zero, and the other two say whether the packet may be fragmented and whether more fragments follow. The remaining 13 bits, the ones jset actually tests, hold the fragment offset. A non-zero offset means that the packet is a later fragment of a fragmented IP packet and therefore carries no TCP header, so we cannot check its port and jump to line 10 to drop it.

Line 6 is the most interesting one in this listing. The expression ldxb 4*([14]&0xf) means that we load into register X the four least significant bits of the fifteenth byte of the packet, multiplied by 4. The four least significant bits of the fifteenth byte are the Internet Header Length field of the IPv4 header, which stores the header length in 32-bit words, so it has to be multiplied by 4. Interestingly, the expression 4*([14]&0xf) is the notation for a special addressing mode that can be used only in this form and only for register X, i.e. we can say neither ldb 4*([14]&0xf) nor ldxb 5*([14]&0xf) (we can only specify a different offset, for example ldxb 4*([16]&0xf)). Clearly this addressing mode was added to BPF precisely so that X (the index register) can receive the length of the IPv4 header.

So on line 7 we load the half-word at (X+16). Recalling that 14 bytes are occupied by the Ethernet header and that X contains the length of the IPv4 header, we see that A is loaded with the TCP destination port:

       14           X           2             2
|ethernet header|ip header|source port|destination port|

Finally, on line 8 we compare the destination port with the desired value, and on line 9 or 10 we return the result: whether to copy the packet or not.
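The same chain of checks can be written out in plain C, which may help when reading the listing. This is only a sketch under assumptions: the function names are hypothetical, the frame is assumed to be untagged Ethernet as in the text, and the explicit length checks stand in for the bounds checks that the BPF machine performs on every load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 16-bit value, which is what ldh does. */
static uint16_t load_half(const uint8_t *p)
{
    return (uint16_t)(p[0] << 8 | p[1]);
}

/* Hypothetical helper that mirrors the eleven filter lines one by one. */
static uint32_t tcp_dst_port_filter(const uint8_t *pkt, size_t len, uint16_t port)
{
    assert(pkt != NULL);
    if (len < 14 + 20)                       /* Ethernet + minimal IPv4 header */
        return 0;
    if (load_half(pkt + 12) != 0x0800)       /* lines 0-1: Ether Type is IPv4 */
        return 0;
    if (pkt[23] != 6)                        /* lines 2-3: Protocol is TCP */
        return 0;
    if (load_half(pkt + 20) & 0x1fff)        /* lines 4-5: jset #0x1fff matched */
        return 0;

    size_t x = 4 * (size_t)(pkt[14] & 0xf);  /* line 6: X = 4*([14]&0xf) */
    if (x + 16 + 2 > len)                    /* line 7 would read past the end */
        return 0;
    if (load_half(pkt + x + 16) != port)     /* lines 7-8: destination port */
        return 0;
    return 262144;                           /* line 9: copy the packet */
}
```

For a frame with a 20-byte IPv4 header (byte 14 equal to 0x45), X becomes 20 and the destination port is read from bytes 36 and 37.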

Tcpdump: loading

In the previous examples we deliberately did not dwell on how exactly we load BPF bytecode into the kernel for packet filtering. Generally speaking, tcpdump has been ported to many systems, and to work with filters it uses the libpcap library. In short, to put a filter on an interface using libpcap, you need to create a pcap_t handle for the interface (pcap_create), activate it (pcap_activate), compile the filter (pcap_compile), and attach the filter (pcap_setfilter).

To see how the pcap_setfilter function is implemented on Linux, we use strace (some lines have been removed):

$ sudo strace -f -e trace=%network tcpdump -p -i eth0 ip
socket(AF_PACKET, SOCK_RAW, 768)        = 3
bind(3, {sa_family=AF_PACKET, sll_protocol=htons(ETH_P_ALL), sll_ifindex=if_nametoindex("eth0"), sll_hatype=ARPHRD_NETROM, sll_pkttype=PACKET_HOST, sll_halen=0}, 20) = 0
setsockopt(3, SOL_SOCKET, SO_ATTACH_FILTER, {len=4, filter=0xb00bb00bb00b}, 16) = 0
...

In the first two lines of the output we create a raw socket for reading all Ethernet frames and bind it to the interface eth0. By analogy with our first example, we know that the filter ip consists of four BPF instructions, and on the third line we see that, using the SO_ATTACH_FILTER option of the setsockopt system call, we load and attach a filter of length 4. This is our filter.

It is worth noting that in classic BPF, loading and attaching a filter always happen as one atomic operation, whereas in the new version of BPF, loading a program and binding it to an event generator are separated in time.

The hidden truth

A slightly more complete version of the output looks like this:

$ sudo strace -f -e trace=%network tcpdump -p -i eth0 ip
socket(AF_PACKET, SOCK_RAW, 768)        = 3
bind(3, {sa_family=AF_PACKET, sll_protocol=htons(ETH_P_ALL), sll_ifindex=if_nametoindex("eth0"), sll_hatype=ARPHRD_NETROM, sll_pkttype=PACKET_HOST, sll_halen=0}, 20) = 0
setsockopt(3, SOL_SOCKET, SO_ATTACH_FILTER, {len=1, filter=0xbeefbeefbeef}, 16) = 0
recvfrom(3, 0x7ffcad394257, 1, MSG_TRUNC, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
setsockopt(3, SOL_SOCKET, SO_ATTACH_FILTER, {len=4, filter=0xb00bb00bb00b}, 16) = 0
...

As mentioned above, we load and attach our filter to the socket on line 5, but what happens on lines 3 and 4? It turns out that libpcap is taking care of us: so that the output of our filter does not include packets that do not satisfy it, the library attaches a dummy filter ret #0 (drop all packets), switches the socket to non-blocking mode, and tries to drain all the packets that could have been left over from previous filters.

All in all, to filter packets on Linux using classic BPF, you need a filter in the form of a struct sock_fprog and an open socket, after which the filter can be attached to the socket using the setsockopt system call.

Interestingly, a filter can be attached to any socket, not only a raw one. Here is an example of a program that cuts off everything but the first two bytes of all incoming UDP datagrams. (I put the comments in the code so as not to clutter the article.)

For more details on using setsockopt to attach filters, see socket(7). We will talk about writing your own struct sock_fprog filters without the help of tcpdump in the section Programming BPF with our own hands.

Classic BPF and the 21st century

BPF was added to Linux in 1997 and for a long time remained a workhorse for libpcap without any particular changes (there were Linux-specific changes, of course, but they did not change the overall picture). The first serious signs that BPF would evolve came in 2011, when Eric Dumazet proposed a patch adding a Just In Time compiler to the kernel: a translator that converts BPF bytecode into native x86_64 code.

The JIT compiler was the first in a chain of changes: in 2012 it became possible to write filters for seccomp using BPF, in January 2013 the xt_bpf module was added, which lets you write rules for iptables using BPF, and in October 2013 the cls_bpf module was added as well, which lets you write traffic classifiers using BPF.

We will look at all these examples in more detail soon, but first it will be useful to learn how to write and compile arbitrary programs for BPF, because the capabilities provided by libpcap are limited (a simple example: a filter generated by libpcap can return only two values, 0 or 0x40000) or, as in the case of seccomp, not applicable at all.

Programming BPF with our own hands

Let us get acquainted with the binary format of BPF instructions. It is very simple:

   16    8    8     32
| code | jt | jf |  k  |

Each instruction occupies 64 bits: the first 16 bits are the instruction code, then come two eight-bit jump offsets, jt and jf, and then 32 bits for the argument K, whose meaning varies from instruction to instruction. For example, the ret instruction, which terminates the program, has code 6, and its return value is taken from the constant K. In C, a single BPF instruction is represented as the structure

struct sock_filter {
        __u16   code;
        __u8    jt;
        __u8    jf;
        __u32   k;
};

and a whole program is represented as the structure

struct sock_fprog {
        unsigned short len;
        struct sock_filter *filter;
};

So we can already write programs (for example, we know the instruction codes from [1]). This is what the ip6 filter from our first example looks like:

struct sock_filter code[] = {
        { 0x28, 0, 0, 0x0000000c },
        { 0x15, 0, 1, 0x000086dd },
        { 0x06, 0, 0, 0x00040000 },
        { 0x06, 0, 0, 0x00000000 },
};
struct sock_fprog prog = {
        .len = ARRAY_SIZE(code),
        .filter = code,
};

We can legitimately use the program prog in the call

setsockopt(sk, SOL_SOCKET, SO_ATTACH_FILTER, &prog, sizeof(prog))

Writing programs as machine code is not very convenient, but sometimes it is necessary (for example, for debugging, creating unit tests, writing articles on Habr, etc.). For convenience, helper macros are defined in the file <linux/filter.h>, so the same example as above can be rewritten as

struct sock_filter code[] = {
        BPF_STMT(BPF_LD|BPF_H|BPF_ABS, 12),
        BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, ETH_P_IPV6, 0, 1),
        BPF_STMT(BPF_RET|BPF_K, 0x00040000),
        BPF_STMT(BPF_RET|BPF_K, 0),
};
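It can help to see how the 16-bit code is assembled from its fields. The sketch below redefines the relevant field values under hypothetical local names (CLS_*, SZ_H, MODE_ABS, JMP_JEQ, SRC_K) so that it compiles without kernel headers. The helper functions stmt and jump are illustrations of the idea, not the kernel macros themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Field values of the 16-bit code, redefined locally under hypothetical
 * names so that the sketch does not depend on kernel headers. */
enum {
    CLS_LD   = 0x00,    /* instruction class: load into A */
    CLS_JMP  = 0x05,    /* instruction class: jump */
    CLS_RET  = 0x06,    /* instruction class: return */
    SZ_H     = 0x08,    /* operand size: half-word */
    MODE_ABS = 0x20,    /* addressing mode: absolute packet offset */
    JMP_JEQ  = 0x10,    /* jump operation: jump if equal */
    SRC_K    = 0x00,    /* operand source: the constant k */
};

struct insn {
    uint16_t code;
    uint8_t  jt, jf;
    uint32_t k;
};

static_assert(sizeof(struct insn) == 8, "one instruction is 64 bits");

/* Build a non-jump instruction; jt and jf are unused and stay zero. */
static struct insn stmt(uint16_t code, uint32_t k)
{
    struct insn i = { code, 0, 0, k };
    return i;
}

/* Build a conditional jump with relative true/false offsets. */
static struct insn jump(uint16_t code, uint32_t k, uint8_t jt, uint8_t jf)
{
    struct insn i = { code, jt, jf, k };
    return i;
}
```

With these values, stmt(CLS_LD | SZ_H | MODE_ABS, 12) produces code 0x28 and jump(CLS_JMP | JMP_JEQ | SRC_K, 0x86dd, 0, 1) produces code 0x15, the same numbers that appear in the hand-written array earlier in this section.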

However, this option is not very convenient either. The Linux kernel programmers thought so too, which is why the tools/bpf directory of the kernel contains an assembler and a debugger for working with classic BPF.

The assembly language is very similar to the debug output of tcpdump, but in addition we can specify symbolic labels. For example, here is a program that drops all packets except TCP/IPv4:

$ cat /tmp/tcp-over-ipv4.bpf
ldh [12]
jne #0x800, drop
ldb [23]
jneq #6, drop
ret #-1
drop: ret #0

By default, the assembler generates code in the format <number of instructions>,<code1> <jt1> <jf1> <k1>,..., which for our TCP example gives

$ tools/bpf/bpf_asm /tmp/tcp-over-ipv4.bpf
6,40 0 0 12,21 0 3 2048,48 0 0 23,21 0 1 6,6 0 0 4294967295,6 0 0 0,
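The default format is easy to reproduce, which makes it clearer what each number in the line means. The following sketch prints a program in that shape: the instruction count first, then decimal code, jt, jf and k for each instruction, each group followed by a comma. The names struct insn and format_prog are hypothetical and are not part of bpf_asm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct insn {
    uint16_t code;
    uint8_t  jt, jf;
    uint32_t k;
};

/* Hypothetical helper: write "<len>,<code> <jt> <jf> <k>,..." into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
static int format_prog(char *buf, size_t size,
                       const struct insn *prog, size_t len)
{
    assert(buf != NULL && prog != NULL);

    int n = snprintf(buf, size, "%zu,", len);
    if (n < 0 || (size_t)n >= size)
        return -1;
    size_t off = (size_t)n;

    for (size_t i = 0; i < len; i++) {
        n = snprintf(buf + off, size - off, "%u %u %u %lu,",
                     (unsigned)prog[i].code, (unsigned)prog[i].jt,
                     (unsigned)prog[i].jf, (unsigned long)prog[i].k);
        if (n < 0 || (size_t)n >= size - off)
            return -1;          /* output would have been truncated */
        off += (size_t)n;
    }
    return (int)off;
}

/* The TCP/IPv4 program from the text, as an array of instructions. */
static const struct insn tcp_over_ipv4[] = {
    { 0x28, 0, 0, 0x0000000c },     /* ldh [12] */
    { 0x15, 0, 3, 0x00000800 },     /* jne #0x800, drop */
    { 0x30, 0, 0, 0x00000017 },     /* ldb [23] */
    { 0x15, 0, 1, 0x00000006 },     /* jneq #6, drop */
    { 0x06, 0, 0, 0xffffffff },     /* ret #-1 */
    { 0x06, 0, 0, 0x00000000 },     /* drop: ret #0 */
};
```

Note how jne is encoded as a jeq (code 21, or 0x15) whose true branch falls through (jt 0) and whose false branch carries the distance to the drop label.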

For the convenience of C programmers, a different output format can be used:

$ tools/bpf/bpf_asm -c /tmp/tcp-over-ipv4.bpf
{ 0x28,  0,  0, 0x0000000c },
{ 0x15,  0,  3, 0x00000800 },
{ 0x30,  0,  0, 0x00000017 },
{ 0x15,  0,  1, 0x00000006 },
{ 0x06,  0,  0, 0xffffffff },
{ 0x06,  0,  0, 0000000000 },

This text can be copied into the definition of an array of struct sock_filter, as we did at the beginning of this section.

Linux and netsniff-ng extensions

Besides standard BPF, Linux and tools/bpf/bpf_asm also support a non-standard set of instructions. Mostly these instructions are used to access the fields of struct sk_buff, the structure that describes a network packet in the kernel. However, there are other kinds of helper instructions as well, for example ldw cpu loads into register A the result of running the kernel function raw_smp_processor_id(). (In the new version of BPF, these non-standard extensions were expanded to provide programs with a set of kernel helpers for accessing memory and structures and for generating events.) Here is an interesting example of a filter in which we copy only the packet headers to user space, using the poff extension (payload offset):

ld poff
ret a

BPF extensions cannot be used in tcpdump, but this is a good reason to get to know the netsniff-ng utility package. Among other things, it contains the advanced netsniff-ng program, which in addition to filtering with BPF also includes an efficient traffic generator, and bpfc, a BPF assembler that is more advanced than tools/bpf/bpf_asm. The package comes with fairly detailed documentation; see also the links at the end of the article.

seccomp

So, we already know how to write BPF programs of arbitrary complexity and are ready to look at new examples. The first of them is the seccomp technology, which uses BPF filters to control the set of system calls, and the set of system call arguments, available to a given process and its descendants.

The first version of seccomp was added to the kernel in 2005 and was not very popular, because it offered only one option: limiting the set of system calls available to a process to read, write, exit and sigreturn, with any process that broke the rules being killed with SIGKILL. In 2012, however, seccomp gained the ability to use BPF filters, which let you define the set of allowed system calls and even perform checks on their arguments. (Interestingly, Chrome was one of the first users of this functionality, and the Chrome people are currently developing the KRSI mechanism, which is based on the new version of BPF and allows customization of Linux Security Modules.) Links to additional documentation can be found at the end of the article.

Note that there have already been articles on the hub about using seccomp, and some readers may want to read them before (or instead of) the following subsections. The article Containers and security: seccomp gives examples of using seccomp, both the 2007 version and the version that uses BPF (the filters are generated with libseccomp), discusses how seccomp relates to Docker, and provides many useful links. The article Isolating daemons with systemd or "you don't need Docker for this!" covers, in particular, how to add blacklists or whitelists of system calls for daemons running under systemd.

Next we will see how to write and load filters for seccomp in bare C and with the libseccomp library, what the pros and cons of each option are, and finally how seccomp is used by the strace program.

Writing and loading filters for seccomp

We already know how to write BPF programs, so let us first look at the seccomp programming interface. A filter can be installed at the process level, and all child processes inherit the restrictions. This is done with the seccomp(2) system call:

seccomp(SECCOMP_SET_MODE_FILTER, flags, &filter)

where &filter is a pointer to the already familiar struct sock_fprog, i.e. a BPF program.

How do programs for seccomp differ from programs for sockets? In the context they are given. In the case of sockets, we were given a memory area containing the packet, and in the case of seccomp we are given a structure like this:

struct seccomp_data {
    int   nr;
    __u32 arch;
    __u64 instruction_pointer;
    __u64 args[6];
};

Here nr is the number of the system call being made, arch is the current architecture (more on this below), args holds up to six system call arguments, and instruction_pointer is a pointer to the user-space instruction that made the system call. So, for example, to load the system call number into register A we have to say

ldw [0]

Programs for seccomp have other peculiarities as well. For example, the context can only be accessed at 32-bit alignment, and it is not possible to load a half-word or a byte: if you try to load the filter ldh [0], the seccomp system call returns EINVAL. Loaded filters are checked by the kernel function seccomp_check_filter(). (A funny detail: the original commit that added the seccomp functionality forgot to allow the mod instruction (remainder of division) in this function, and it is now unavailable to seccomp BPF programs, because adding it would break the ABI.)

Basically, we already know everything needed to write and read seccomp programs. The program logic is usually arranged as a whitelist or a blacklist of system calls. For example, the program

ld [0]
jeq #304, bad
jeq #176, bad
jeq #239, bad
jeq #279, bad
good: ret #0x7fff0000 /* SECCOMP_RET_ALLOW */
bad: ret #0

checks a blacklist of four system calls numbered 304, 176, 239, 279. What are these system calls? We cannot say for sure, because we do not know which architecture the program was written for. This is why the authors of seccomp suggest starting every program with an architecture check (the current architecture is given in the context as the arch field of struct seccomp_data). With the architecture check in place, the beginning of the example would look like this:

ld [4]
jne #0xc000003e, bad_arch ; SCMP_ARCH_X86_64

and after that our system call numbers would have definite meanings.
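The addresses 0 and 4 used in ld [0] and ld [4] are simply the byte offsets of the fields nr and arch in the context structure. The sketch below shows this with a local mirror of the structure built from fixed-width types. The mirror's name and the helper arg_word_offset are hypothetical, and the split of a 64-bit argument into low and high words assumes a little-endian machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct seccomp_data using fixed-width types; the name is
 * hypothetical, the field order follows the structure shown in the text. */
struct seccomp_data_mirror {
    int32_t  nr;
    uint32_t arch;
    uint64_t instruction_pointer;
    uint64_t args[6];
};

/* Offsets that a filter passes to ld [...]; all are multiples of four. */
enum {
    OFF_NR   = offsetof(struct seccomp_data_mirror, nr),
    OFF_ARCH = offsetof(struct seccomp_data_mirror, arch),
    OFF_IP   = offsetof(struct seccomp_data_mirror, instruction_pointer),
    OFF_ARGS = offsetof(struct seccomp_data_mirror, args),
};

/* Offset of one 32-bit half of argument n. Assumption: little-endian
 * layout, so the low word comes first and the high word 4 bytes later. */
static size_t arg_word_offset(unsigned n, int high)
{
    assert(n < 6);
    return (size_t)OFF_ARGS + 8u * n + (high ? 4u : 0u);
}
```

Because every load is 32 bits wide, a filter that wants to check a full 64-bit argument has to load and compare its two halves separately.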

Writing and loading filters for seccomp using libseccomp

Writing filters in machine code or in BPF assembly gives you full control over the result, but sometimes it is preferable to have portable and/or readable code. The libseccomp library helps with this by providing a standard interface for writing blacklist or whitelist filters.

As an example, let us write a program that runs a binary chosen by the user after installing a blacklist of system calls taken from the article mentioned above (the program has been simplified for readability; the full version can be found here):

#include <seccomp.h>
#include <unistd.h>
#include <err.h>

static int sys_numbers[] = {
        __NR_mount,
        __NR_umount2,
        // ... 40 more system calls ...
        __NR_vmsplice,
        __NR_perf_event_open,
};

int main(int argc, char **argv)
{
        scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);

        for (size_t i = 0; i < sizeof(sys_numbers)/sizeof(sys_numbers[0]); i++)
                seccomp_rule_add(ctx, SCMP_ACT_TRAP, sys_numbers[i], 0);

        seccomp_load(ctx);

        execvp(argv[1], &argv[1]);
        err(1, "execvp: %s", argv[1]);
}

First we define an array sys_numbers of 40+ system call numbers to block. Then we initialize the context ctx and tell the library that we want to allow (SCMP_ACT_ALLOW) all system calls by default (it is easier to build blacklists this way). Then, one by one, we add all the system calls from the blacklist. As the reaction to a system call from the list we request SCMP_ACT_TRAP, in which case seccomp sends the process a SIGSYS signal describing which system call broke the rules. Finally, we load the program into the kernel using seccomp_load, which compiles the program and attaches it to the process using the seccomp(2) system call.

To build successfully, the program must be linked with the libseccomp library, for example:

cc -std=c17 -Wall -Wextra -c -o seccomp_lib.o seccomp_lib.c
cc -o seccomp_lib seccomp_lib.o -lseccomp

An example of a successful run:

$ ./seccomp_lib echo ok
ok

An example of a blocked system call:

$ sudo ./seccomp_lib mount -t bpf bpf /tmp
Bad system call

We use strace for the details:

$ sudo strace -e seccomp ./seccomp_lib mount -t bpf bpf /tmp
seccomp(SECCOMP_SET_MODE_FILTER, 0, {len=50, filter=0x55d8e78428e0}) = 0
--- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_call_addr=0xboobdeadbeef, si_syscall=__NR_mount, si_arch=AUDIT_ARCH_X86_64} ---
+++ killed by SIGSYS (core dumped) +++
Bad system call

from which we can see that the program was terminated because it used the forbidden system call mount(2).

So, we wrote a filter using the libseccomp library, fitting non-trivial code into four lines. In the example above, if the number of system calls is large, execution can slow down noticeably, because the check is just a list of comparisons. To optimize this, libseccomp recently merged a patch that adds support for the filter attribute SCMP_FLTATR_CTL_OPTIMIZE. Setting this attribute to 2 turns the filter into a binary search program.

If you want to see how binary search filters work, take a look at a simple script that generates such programs in BPF assembly from a list of system call numbers, for example:

$ echo 1 3 6 8 13 | ./generate_bin_search_bpf.py
ld [0]
jeq #6, bad
jgt #6, check8
jeq #1, bad
jeq #3, bad
ret #0x7fff0000
check8:
jeq #8, bad
jeq #13, bad
ret #0x7fff0000
bad: ret #0

Nothing significantly faster can be written, because BPF programs cannot perform indirect jumps (we cannot do, for example, jmp A or jmp [label+X]), and therefore all jumps are static.
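The decision such a filter computes is an ordinary binary search over the sorted blacklist. The sketch below shows that decision in plain C and counts the number of pivots visited. It is an illustration only: verdict and the RET_* names are hypothetical, and a real BPF program cannot loop, so a generator has to unroll this search into a static tree of forward jumps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RET_ALLOW 0x7fff0000u   /* value of SECCOMP_RET_ALLOW */
#define RET_KILL  0x00000000u   /* what "ret #0" means to seccomp */

/* Hypothetical helper: binary search over a sorted blacklist.
 * *steps receives the number of pivots that were examined. */
static uint32_t verdict(const int *sorted, size_t n, int nr, unsigned *steps)
{
    size_t lo = 0, hi = n;

    assert(sorted != NULL && steps != NULL);
    *steps = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        ++*steps;
        if (sorted[mid] == nr)
            return RET_KILL;        /* blacklisted system call */
        if (sorted[mid] < nr)
            lo = mid + 1;           /* continue in the upper half */
        else
            hi = mid;               /* continue in the lower half */
    }
    return RET_ALLOW;               /* not on the list */
}
```

For the five numbers 1 3 6 8 13, any lookup examines at most three pivots, whereas a plain list of comparisons may need all five.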

seccomp and strace

Everyone knows that the strace utility is an indispensable tool for studying the behavior of processes on Linux. However, many have also heard about performance problems when using it. The thing is that strace is implemented with ptrace(2), and this mechanism does not let us specify the set of system calls on which the process should be stopped. That is, for example, the commands

$ time strace du /usr/share/ >/dev/null 2>&1

real    0m3.081s
user    0m0.531s
sys     0m2.073s

and

$ time strace -e open du /usr/share/ >/dev/null 2>&1

real    0m2.404s
user    0m0.193s
sys     0m1.800s

take roughly the same time, although in the second case we want to trace only one system call.

The new --seccomp-bpf option, added in strace version 5.3, speeds the process up many times over, and the run time under tracing of a single system call becomes comparable to the time of a regular run:

$ time strace --seccomp-bpf -e open du /usr/share/ >/dev/null 2>&1

real    0m0.148s
user    0m0.017s
sys     0m0.131s

$ time du /usr/share/ >/dev/null 2>&1

real    0m0.140s
user    0m0.024s
sys     0m0.116s

(Here, of course, there is a small deception: we are not tracing the main system call of this command. If we were tracing, for example, newfstatat, then strace would slow things down just as much as without --seccomp-bpf.)

How does this option work? Without it, strace attaches to the process and starts it using PTRACE_SYSCALL. When the traced process issues a system call (any system call), control is transferred to strace, which looks at the arguments of the system call and resumes it using PTRACE_SYSCALL. After some time the process completes the system call, and on exit from it control is transferred to strace again, which looks at the return values and resumes the process using PTRACE_SYSCALL, and so on.

With seccomp, however, this process can be optimized exactly the way we would like. Namely, if we want to look only at system call X, we can write a BPF filter that returns SECCOMP_RET_TRACE for X and SECCOMP_RET_ALLOW for the calls we are not interested in:

ld [0]
jneq #X, ignore
trace: ret #0x7ff00000
ignore: ret #0x7fff0000

In this case strace initially starts the process with PTRACE_CONT, and our filter is run for every system call. If the system call is not X, the process simply continues. If it is X, seccomp transfers control to strace, which inspects the arguments and resumes the process with PTRACE_SYSCALL (because seccomp has no way to run a program on exit from a system call). When the system call returns, strace restarts the process with PTRACE_CONT and waits for new messages from seccomp.
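
To make the decision logic concrete, here is a minimal Python sketch that encodes the four-instruction filter above as (code, jt, jf, k) tuples and evaluates it against a packed struct seccomp_data. It is an illustration only, not how strace or the kernel implement it: the helper names (trace_only, seccomp_data, run_filter) are made up for this example, the interpreter handles just the three opcodes the filter uses, and the syscall number 2 (open on x86-64) is an assumption chosen for the demo.

```python
import struct

# Return values from <linux/seccomp.h>; opcodes from <linux/bpf_common.h>.
SECCOMP_RET_TRACE = 0x7FF00000
SECCOMP_RET_ALLOW = 0x7FFF0000
LD_W_ABS, JEQ_K, RET_K = 0x20, 0x15, 0x06


def trace_only(nr):
    """The filter from the text, for syscall number nr (hypothetical helper)."""
    return [
        (LD_W_ABS, 0, 0, 0),               # ld [0]: load seccomp_data.nr
        (JEQ_K, 0, 1, nr),                 # jneq #X, ignore (jeq with jt=0, jf=1)
        (RET_K, 0, 0, SECCOMP_RET_TRACE),  # trace:  hand control to the tracer
        (RET_K, 0, 0, SECCOMP_RET_ALLOW),  # ignore: let the call through
    ]


def seccomp_data(nr, arch=0, ip=0, args=(0,) * 6):
    """Pack struct seccomp_data: int nr; u32 arch; u64 ip; u64 args[6]."""
    return struct.pack("=iIQ6Q", nr, arch, ip, *args)


def run_filter(prog, data):
    """Interpret only the three opcodes used above; seccomp loads are host-endian."""
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == LD_W_ABS:
            (acc,) = struct.unpack_from("=I", data, k)
        elif code == JEQ_K:
            pc += jt if acc == k else jf
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")


OPEN_NR = 2  # assumption: open(2) on x86-64
for nr in (OPEN_NR, 1):
    print(nr, hex(run_filter(trace_only(OPEN_NR), seccomp_data(nr))))
```

Only the call whose number matches X yields SECCOMP_RET_TRACE; every other call is allowed without waking the tracer, which is where the speedup comes from.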

There are two restrictions when using the --seccomp-bpf option. First, it is not possible to attach to an already running process (the -p option of strace), because seccomp does not support this. Second, there is no way to avoid tracing child processes, because seccomp filters are inherited by all child processes and this cannot be disabled.

A little more detail on exactly how strace works with seccomp can be found in a recent talk. For us, the most interesting fact is that classic BPF, in the form of seccomp, is still in use today.

xt_bpf

Let's now return to the world of networking.

Background: long ago, in 2007, the xt_u32 module for netfilter was added to the kernel. It was written by analogy with the even older traffic classifier cls_u32 and lets you write arbitrary binary rules for iptables using a few simple operations: load 32 bits from the packet and perform a set of arithmetic operations on them. For example,

sudo iptables -A INPUT -m u32 --u32 "6&0xFF=1" -j LOG --log-prefix "seen-by-xt_u32"

loads 32 bits of the IP header starting at offset 6 and applies the mask 0xFF to them (taking the low byte). This is the protocol field of the IP header, and we compare it with 1 (ICMP). Several checks can be combined in one rule, and there is also the @ operator, which moves the base offset X bytes to the right. For example, the rule

iptables -m u32 --u32 "6&0xFF=0x6 && 0>>22&0x3C@4=0x29"

checks whether the TCP Sequence Number is equal to 0x29. I won't go into further detail, since it is already clear that writing such rules by hand is not very convenient. The article BPF - the forgotten bytecode contains several links with examples of using and generating rules for xt_u32. See also the links at the end of this article.
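
The arithmetic behind the two u32 expressions above can be spelled out in a few lines of Python. This is only a sketch of what the expressions compute, not of how xt_u32 is implemented; the function names and the hand-built sample headers are assumptions made for the illustration.

```python
import struct


def u32(packet, offset):
    """Load 4 bytes at `offset` as a big-endian 32-bit value."""
    return struct.unpack_from("!I", packet, offset)[0]


def is_icmp(packet):
    # "6&0xFF=1": bytes 6..9 of the IP header, low byte is the protocol field.
    return (u32(packet, 6) & 0xFF) == 1


def tcp_seq_equals(packet, seq):
    # "6&0xFF=0x6 && 0>>22&0x3C@4=seq"
    if (u32(packet, 6) & 0xFF) != 6:          # protocol must be TCP
        return False
    ihl_bytes = (u32(packet, 0) >> 22) & 0x3C  # IHL field * 4 = IP header length
    # "@" moves the base past the IP header; offset 4 there is the sequence number.
    return u32(packet, ihl_bytes + 4) == seq


def ipv4_header(proto):
    """Minimal 20-byte IPv4 header (checksum left at 0; sample data only)."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, proto, 0,
                       bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))


def tcp_header(seq):
    """Minimal 20-byte TCP header with the given sequence number."""
    return struct.pack("!HHIIHHHH", 1234, 80, seq, 0, 0x5010, 65535, 0, 0)


print(is_icmp(ipv4_header(1)))
print(tcp_seq_equals(ipv4_header(6) + tcp_header(0x29), 0x29))
```

The shift by 22 followed by the 0x3C mask is the non-obvious part: it extracts the 4-bit IHL field from the first word and multiplies it by 4 in one step.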

Since 2013, the BPF-based xt_bpf module can be used instead of xt_u32. Anyone who has read this far should already understand how it works: it runs BPF bytecode as an iptables rule. A new rule can be created, for example, like this:

iptables -A INPUT -m bpf --bytecode <bytecode> -j LOG

where <bytecode> is the code in the default output format of the bpf_asm assembler, for example,

$ cat /tmp/test.bpf
ldb [9]
jneq #17, ignore
ret #1
ignore: ret #0

$ bpf_asm /tmp/test.bpf
4,48 0 0 9,21 0 1 17,6 0 0 1,6 0 0 0,

# iptables -A INPUT -m bpf --bytecode "$(bpf_asm /tmp/test.bpf)" -j LOG

In this example we match all UDP packets. The context for a BPF program in the xt_bpf module naturally points to the packet data, which in the case of iptables means the start of the IPv4 header. The return value of the BPF program is boolean, where false means that the packet did not match.
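
To see what the bytecode string actually contains, the following Python sketch parses the default bpf_asm output shown above (an instruction count followed by comma-separated "code jt jf k" groups) and interprets it against a sample header. The parser and the interpreter are written for this illustration and cover only the three opcodes that appear in the example; they are not part of bpf_asm or iptables.

```python
def parse_bytecode(text):
    """Parse 'N,code jt jf k,...,' into a list of (code, jt, jf, k) tuples."""
    fields = [f for f in text.strip().split(",") if f.strip()]
    count = int(fields[0])
    prog = [tuple(int(x) for x in f.split()) for f in fields[1:]]
    if len(prog) != count:
        raise ValueError(f"expected {count} instructions, got {len(prog)}")
    return prog


def run(prog, packet):
    """Interpret only the opcodes used in the example program."""
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == 0x30:      # 48: ldb [k], load one byte at absolute offset k
            acc = packet[k]
        elif code == 0x15:    # 21: jeq #k, skip jt instructions if equal, else jf
            pc += jt if acc == k else jf
        elif code == 0x06:    # 6: ret #k
            return k
        else:
            raise ValueError(f"unsupported opcode {code}")


prog = parse_bytecode("4,48 0 0 9,21 0 1 17,6 0 0 1,6 0 0 0,")

# Sample 20-byte IPv4 headers that differ only in the protocol byte (offset 9).
udp = bytes([0x45, 0, 0, 28, 0, 0, 0, 0, 64, 17]) + bytes(10)
tcp = bytes([0x45, 0, 0, 40, 0, 0, 0, 0, 64, 6]) + bytes(10)

print(bool(run(prog, udp)), bool(run(prog, tcp)))
```

Note that bpf_asm encodes jneq as a jeq whose "true" branch falls through (jt=0) and whose "false" branch jumps (jf=1), which is why the second instruction reads 21 0 1 17.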

Clearly, the xt_bpf module supports more complex filters than the example above. Let's look at real examples from Cloudflare. Until recently they used the xt_bpf module to protect against DDoS attacks. In the article Introducing the BPF Tools they explain how (and why) they generate BPF filters and publish links to a set of utilities for creating such filters. For example, with the bpfgen utility you can create a BPF program that matches a DNS query for the name habr.com:

$ ./bpfgen --assembly dns -- habr.com
ldx 4*([0]&0xf)
ld #20
add x
tax

lb_0:
    ld [x + 0]
    jneq #0x04686162, lb_1
    ld [x + 4]
    jneq #0x7203636f, lb_1
    ldh [x + 8]
    jneq #0x6d00, lb_1
    ret #65535

lb_1:
    ret #0

In this program we first load into register X the address of the start of the string \x04habr\x03com\x00 inside the UDP datagram, and then check the query: 0x04686162 <-> "\x04hab", and so on.
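
The constants in the generated program follow directly from the DNS wire format, in which each label is prefixed with its length and the name ends with a zero byte. The Python sketch below reproduces them; it illustrates the encoding only and is not taken from bpfgen, and both function names are invented for the example. The offset arithmetic in the program's first four instructions also becomes clear: X is the IP header length plus 20, that is, the 8-byte UDP header plus the 12-byte DNS header.

```python
def encode_qname(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def compare_constants(qname):
    """Split the encoded name into (offset, size, value) big-endian chunks,
    using 4-byte loads where possible, then 2-byte, then 1-byte."""
    consts, pos = [], 0
    while pos < len(qname):
        remaining = len(qname) - pos
        size = 4 if remaining >= 4 else 2 if remaining >= 2 else 1
        value = int.from_bytes(qname[pos:pos + size], "big")
        consts.append((pos, size, value))
        pos += size
    return consts


qname = encode_qname("habr.com")
print(qname)
for offset, size, value in compare_constants(qname):
    print(f"[x + {offset}] ({size} bytes) == {value:#x}")
```

The three chunks correspond to the ld, ld and ldh instructions at offsets 0, 4 and 8 in the listing above.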

A little later Cloudflare published the code of a p0f -> BPF compiler. In the article Introducing the p0f BPF compiler they describe what p0f is and how to convert p0f signatures into BPF:

$ ./bpfgen p0f -- 4:64:0:0:*,0::ack+:0
39,0 0 0 0,48 0 0 8,37 35 0 64,37 0 34 29,48 0 0 0,
84 0 0 15,21 0 31 5,48 0 0 9,21 0 29 6,40 0 0 6,
...

Cloudflare no longer uses xt_bpf, because they have moved to XDP, one of the ways of using the new version of BPF; see L4Drop: XDP DDoS Mitigations.

cls_bpf

The last example of using classic BPF in the kernel is the cls_bpf classifier for the traffic control subsystem in Linux. It was added to Linux at the end of 2013 and conceptually replaces the ancient cls_u32.

However, we will not describe how cls_bpf works here: as far as classic BPF is concerned it would teach us nothing new, since we are already familiar with all of the functionality. Besides, we will meet this classifier more than once in the following articles about Extended BPF.

Another reason not to discuss using classic BPF with cls_bpf is that, compared with Extended BPF, the range of applications in this case is radically narrower: classic programs cannot change the contents of packets and cannot keep state between calls.

So it is time to say goodbye to classic BPF and look to the future.

Farewell to classic BPF

We have looked at how BPF technology, developed in the early nineties, lived successfully for a quarter of a century and kept finding new applications to the very end. However, much like the transition from stack machines to RISC that prompted the development of classic BPF, the 2000s saw a transition from 32-bit to 64-bit machines, and classic BPF began to show its age. On top of the outdated architecture, the capabilities of classic BPF are very limited: there is no way to keep state between calls of BPF programs, no way to interact directly with the user, and no way to interact with the kernel beyond reading a limited number of fields of the sk_buff structure and calling the simplest helper functions. You cannot change the contents of packets or redirect them.

In fact, all that remains of classic BPF in Linux today is the API, while inside the kernel all classic programs, whether socket filters or seccomp filters, are automatically translated into the new format, Extended BPF. (We will discuss exactly how this happens in the next article.)

The transition to the new architecture began in 2013, when Alexei Starovoitov proposed a scheme for updating BPF. In 2014 the corresponding patches began to appear in the kernel. As far as I understand, the original plan was only to optimize the architecture and the JIT compiler to run more efficiently on 64-bit machines, but these optimizations ended up marking the start of a new chapter in Linux development.

The following articles in this series will cover the architecture and applications of the new technology, which was first known as internal BPF, then as extended BPF, and is now simply called BPF.

References

  1. Steven McCanne and Van Jacobson, "The BSD Packet Filter: A New Architecture for User-level Packet Capture", https://www.tcpdump.org/papers/bpf-usenix93.pdf
  2. Steven McCanne, "libpcap: An Architecture and Optimization Methodology for Packet Capture", https://sharkfestus.wireshark.org/sharkfest.11/presentations/McCanne-Sharkfest'11_Keynote_Address.pdf
  3. tcpdump, libpcap: https://www.tcpdump.org/
  4. IPtables U32 Match Tutorial.
  5. BPF - the forgotten bytecode: https://blog.cloudflare.com/bpf-the-forgotten-bytecode/
  6. Introducing the BPF Tools: https://blog.cloudflare.com/introducing-the-bpf-tools/
  7. bpf_cls: http://man7.org/linux/man-pages/man8/tc-bpf.8.html
  8. A seccomp overview: https://lwn.net/Articles/656307/
  9. https://github.com/torvalds/linux/blob/master/Documentation/userspace-api/seccomp_filter.rst
  10. habr: Containers and security: seccomp
  11. habr: Isolating daemons with systemd, or "you don't need Docker for this!"
  12. Paul Chaignon, "strace --seccomp-bpf: a look under the hood", https://fosdem.org/2020/schedule/event/debugging_strace_bpf/
  13. netsniff-ng: http://netsniff-ng.org/

Source: www.habr.com
