BPF for the little ones, part one: extended BPF

In the beginning there was a technology, and it was called BPF. We looked at it in the previous, Old Testament article of this series. In 2013, through the efforts of Alexei Starovoitov and Daniel Borkmann, an improved version, designed for modern 64-bit machines, was developed and merged into the Linux kernel. This new technology was briefly called Internal BPF, then renamed Extended BPF, and now, several years later, everyone simply calls it BPF.

Roughly speaking, BPF lets you run arbitrary user-supplied code in Linux kernel space, and the new architecture turned out to be so successful that we will need a dozen more articles to describe all of its applications. (The only thing the developers did not do well is create a decent logo.)

This article describes how the BPF virtual machine is structured: everything we will need later for a deeper study of the practical applications of BPF.

Summary of the article

Introduction to BPF architecture. First, we will take a bird's eye view of the BPF architecture and outline its main components.

Registers and instruction set of the BPF virtual machine. With an idea of the architecture as a whole, we will describe the structure of the BPF virtual machine.

Lifecycle of BPF objects, bpffs file system. In this section we will take a closer look at the life cycle of BPF objects: programs and maps.

Managing objects with the bpf system call. With some understanding of the system already in place, we will finally look at how to create and manage objects from user space using a dedicated system call, bpf(2).

Writing BPF programs with libbpf. Of course, you can write programs using the system call alone. But it is difficult. For a more realistic scenario, the kernel developers created the libbpf library. We will build a basic BPF application skeleton that we will reuse in later examples.

Kernel helpers. Here we will learn how BPF programs can call kernel helper functions, a tool that, together with maps, fundamentally expands the capabilities of the new BPF compared to the classic one.

Access to maps from BPF programs. By this point we will know enough to understand exactly how to create programs that use maps. We will even take a quick peek at the great and mighty verifier.

Development tools. A reference section on how to build the utilities and the kernel required for the experiments.

Conclusion. At the end of the article, those who have read this far will find motivating words and a brief description of what will happen in the following articles. We will also list several links for self-study for those who have no desire or ability to wait for the continuation.

Introduction to BPF Architecture

Before we start looking at the BPF architecture, we will refer one last time (oh) to classic BPF, which was developed in response to the advent of RISC machines and solved the problem of efficient packet filtering. The architecture turned out to be so successful that, having been born in the nineties in Berkeley UNIX, it was ported to most existing operating systems, survived into the twenties, and is still finding new applications.

The new BPF was developed in response to the ubiquity of 64-bit machines, cloud services and the growing need for tools to build SDN (Software-defined networking). Developed by kernel engineers as an improved replacement for classic BPF, the new BPF found applications in the difficult task of tracing Linux systems just six months later, and now, six years after its appearance, we will need a whole separate article just to list the different program types.

Funny pictures

At its core, BPF is a sandbox virtual machine that lets you run "arbitrary" code in kernel space without compromising security. BPF programs are created in user space, loaded into the kernel, and attached to some event source. An event can be, for example, the delivery of a packet to a network interface, the start of some kernel function, and so on. In the case of a packet, the BPF program has access to the packet's data and metadata (for reading and, possibly, writing, depending on the program type); in the case of a kernel function, it has access to the function's arguments, including pointers to kernel memory, and so on.

Let's take a closer look at this process. To begin with, let's talk about the first difference from classic BPF, for which programs were written in assembler. In the new version the architecture was extended so that programs can be written in high-level languages, primarily, of course, in C. For this purpose a backend for llvm was developed, which makes it possible to generate bytecode for the BPF architecture.


The BPF architecture was designed, in part, to run efficiently on modern machines. To make this work in practice, BPF bytecode, once loaded into the kernel, is translated into native code by a component called a JIT compiler (Just In Time). Next, if you remember, in classic BPF the program was loaded into the kernel and attached to the event source atomically, within a single system call. In the new architecture this happens in two steps: first, the code is loaded into the kernel with the bpf(2) system call, and then, later, the program is attached to the event source through other mechanisms that depend on the program type.

Here the reader may have a question: was that even possible? How is the safe execution of such code guaranteed? Safety is guaranteed by a stage of BPF program loading called the verifier.

The verifier is a static analyzer that ensures that a program does not disrupt the normal operation of the kernel. This, by the way, does not mean that the program cannot interfere with the operation of the system: BPF programs, depending on their type, can read and rewrite regions of kernel memory, return values from functions, and trim, extend, rewrite and even forward network packets. The verifier guarantees that running a BPF program will not crash the kernel and that a program which, according to the rules, has write access to, say, the data of an outgoing packet cannot overwrite kernel memory outside the packet. We will look at the verifier in a little more detail in the corresponding section, after we have become acquainted with all the other components of BPF.

So what have we learned so far? The user writes a program in C and loads it into the kernel with the bpf(2) system call, where it is checked by the verifier and translated into native code. Then the same or another user attaches the program to an event source and it begins to execute. Separating loading from attaching is necessary for several reasons. First, running the verifier is relatively expensive, and by loading the same program several times we waste computer time. Second, exactly how a program is attached depends on its type, and a "universal" interface designed a year ago may not suit new program types. (Although now that the architecture is maturing, there is an idea to unify this interface at the libbpf level.)

The attentive reader may notice that we are not done with the pictures yet. Indeed, all of the above does not explain why BPF fundamentally changes the picture compared to classic BPF. Two innovations that significantly widen its applicability are the ability to use shared memory and kernel helper functions. In BPF, shared memory is implemented with so-called maps: shared data structures with a specific API. They probably got this name because the first type of map to appear was a hash table. Then came arrays, local (per-CPU) hash tables and local arrays, search trees, maps containing pointers to BPF programs, and much more. What matters to us now is that BPF programs can persist state between invocations and share it with other programs and with user space.

Maps are accessed from user processes with the bpf(2) system call, and from BPF programs running in the kernel with helper functions. Moreover, helpers exist not only for working with maps but also for accessing other kernel capabilities. For example, BPF programs can use helper functions to forward packets to other interfaces, generate perf events, access kernel structures, and so on.


In summary, BPF provides the ability to load arbitrary, that is, verifier-checked, user code into kernel space. This code can keep state between invocations and exchange data with user space, and it also has access to the kernel subsystems allowed for its program type.

This is already similar to the capabilities provided by kernel modules, compared to which BPF has some advantages (of course, you can only compare similar applications, for example, system tracing; you cannot write an arbitrary driver with BPF). These include a lower entry threshold (some utilities that use BPF do not require kernel programming skills, or programming skills at all), runtime safety (raise your hand in the comments if you never broke the system while writing or testing modules), and atomicity: reloading a module involves downtime, whereas the BPF subsystem ensures that no events are missed (to be fair, this is not true for all BPF program types).

These capabilities make BPF a universal tool for extending the kernel, which is confirmed in practice: more and more new program types are added to BPF, more and more large companies use BPF on production servers 24×7, and more and more startups build their business on BPF-based solutions. BPF is used everywhere: in protection against DDoS attacks, in building SDN (for example, implementing networks for kubernetes), as the main system tracing tool and statistics collector, in intrusion detection systems and sandbox systems, and so on.

Let's finish the overview part of the article here and look at the virtual machine and the BPF ecosystem in more detail.

Digression: utilities

To be able to run the examples in the following sections you may need a number of utilities, at a minimum llvm/clang with bpf support and bpftool. In the Development tools section you can read instructions for building the utilities, as well as your own kernel. That section is placed further down so as not to disturb the harmony of our presentation.

BPF Virtual Machine Registers and Instruction Set

The BPF architecture and instruction set were designed with the understanding that programs will be written in C and, after being loaded into the kernel, translated into native code. Therefore, the number of registers and the set of instructions were chosen with an eye to the intersection, in the mathematical sense, of the capabilities of modern machines. In addition, various restrictions were imposed on programs: for example, until recently it was not possible to write loops and subroutines, and the number of instructions was limited to 4096 (privileged programs can now load up to a million instructions).

BPF has eleven user-accessible 64-bit registers, r0-r10, and a program counter. Register r10 holds the frame pointer and is read-only. At runtime, programs have access to a 512-byte stack and an unlimited amount of shared memory in the form of maps.

BPF programs are allowed to call a specific, program-type-dependent set of kernel helpers and, more recently, regular functions. Each called function can take up to five arguments, passed in registers r1-r5, and the return value is passed in r0. It is guaranteed that the contents of registers r6-r9 will not change across a function call.

For efficient translation, registers r0-r11 are mapped one-to-one onto real registers on all supported architectures, taking into account the ABI of the target architecture. For example, on x86_64 the registers r1-r5, used to pass function parameters, are mapped onto rdi, rsi, rdx, rcx, r8, which are used to pass parameters to functions on x86_64. For example, the code on the left is translated into the code on the right like this:

1:  (b7) r1 = 1                    mov    $0x1,%rdi
2:  (b7) r2 = 2                    mov    $0x2,%rsi
3:  (b7) r3 = 3                    mov    $0x3,%rdx
4:  (b7) r4 = 4                    mov    $0x4,%rcx
5:  (b7) r5 = 5                    mov    $0x5,%r8
6:  (85) call pc+1                 callq  0x0000000000001ee8

Register r0 is also used to return the result of program execution, and in register r1 the program is passed a pointer to its context. Depending on the program type this could be, for example, a struct xdp_md (for XDP), a struct __sk_buff (for various network programs), a struct pt_regs (for various types of tracing programs), and so on.

So, we had a set of registers, kernel helpers, a stack, a context pointer and shared memory in the form of maps. Not that all of this is absolutely necessary on the trip, but...

Let's continue the description and talk about the instruction set for working with these objects. All (well, almost all) BPF instructions have a fixed 64-bit size. If you look at one instruction on a 64-bit Big Endian machine you will see five fields in this order: Code (8 bits), Dst (4 bits), Src (4 bits), Off (16 bits) and Imm (32 bits).

Here Code is the encoding of the instruction, Dst/Src are the encodings of the destination and source registers, respectively, Off is a 16-bit signed offset, and Imm is a 32-bit signed integer used in some instructions (similar to the cBPF constant K). The Code field has one of two layouts, depending on the instruction class: for arithmetic and jump instructions it consists of a 4-bit operation, a 1-bit source flag and a 3-bit class; for load and store instructions it consists of a 3-bit mode, a 2-bit size and a 3-bit class.

Instruction classes 0, 1, 2, 3 define the instructions for working with memory. They are called BPF_LD, BPF_LDX, BPF_ST, BPF_STX, respectively. Classes 4 and 7 (BPF_ALU, BPF_ALU64) make up the set of ALU instructions. Classes 5 and 6 (BPF_JMP, BPF_JMP32) contain the jump instructions.
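As a minimal sketch of how the Code byte is split, the helpers below extract the class from the three least significant bits and, depending on the class, either the operation and source flag (ALU and jump classes) or the size and mode (load and store classes). The names `insn_class`, `code_class`, `code_source`, `code_op`, `code_size` and `code_mode` are invented for this illustration; they are not kernel identifiers.

```c
#include <stdint.h>

/* Instruction classes, numbered as in the list above. */
enum insn_class {
    CLASS_LD = 0, CLASS_LDX = 1, CLASS_ST = 2, CLASS_STX = 3,
    CLASS_ALU = 4, CLASS_JMP = 5, CLASS_JMP32 = 6, CLASS_ALU64 = 7,
};

/* Class: bits 0..2 of the Code byte (all instructions). */
static inline unsigned code_class(uint8_t code)  { return code & 0x07; }

/* ALU/JMP classes: bit 3 is the source flag S (0 = use Imm, 1 = use Src register). */
static inline unsigned code_source(uint8_t code) { return (code >> 3) & 0x01; }

/* ALU/JMP classes: bits 4..7 are the operation. */
static inline unsigned code_op(uint8_t code)     { return code >> 4; }

/* Load/store classes: bits 3..4 are the operand size. */
static inline unsigned code_size(uint8_t code)   { return (code >> 3) & 0x03; }

/* Load/store classes: bits 5..7 are the addressing mode. */
static inline unsigned code_mode(uint8_t code)   { return code >> 5; }
```

For example, `code_class(0xb7)` yields `CLASS_ALU64` with operation `0xb` and a cleared source flag, while `code_class(0x95)` yields `CLASS_JMP` with operation `9`.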

The plan for studying the BPF instruction set is as follows: instead of meticulously listing all the instructions and their parameters, we will look at a couple of examples in this section, from which it will become clear how the instructions actually work and how to disassemble any BPF binary by hand. To consolidate the material, later in the article we will meet individual instructions again in the sections about the verifier, the JIT compiler, the translation of classic BPF, and when studying maps, function calls, and so on.

When we talk about individual instructions, we will refer to the kernel files bpf.h and bpf_common.h, which define the numeric codes of BPF instructions. When studying the architecture on your own and/or parsing binaries, you can find the semantics in the following sources, sorted by increasing complexity: Unofficial eBPF spec, BPF and XDP Reference Guide, Instruction Set, Documentation/networking/filter.txt and, of course, the Linux source code: verifier, JIT, BPF interpreter.

Example: disassembling BPF in your head

Let's look at an example in which we compile the program readelf-example.c and look at the resulting binary. We will reveal the original content of readelf-example.c below, after we have reconstructed its logic from the binary code:

$ clang -target bpf -c readelf-example.c -o readelf-example.o -O2
$ llvm-readelf -x .text readelf-example.o
Hex dump of section '.text':
0x00000000 b7000000 01000000 15010100 00000000 ................
0x00000010 b7000000 02000000 95000000 00000000 ................

The first column in the readelf output is the offset, so our program consists of four instructions:

Code Dst Src Off  Imm
b7   0   0   0000 01000000
15   1   0   0100 00000000
b7   0   0   0000 02000000
95   0   0   0000 00000000

The instruction codes are b7, 15, b7 and 95. Recall that the three least significant bits are the instruction class. In our case the fourth bit of every instruction is zero, so the instruction classes are 7, 5, 7, 5, respectively. Class 7 is BPF_ALU64 and class 5 is BPF_JMP. On a little-endian target the register byte keeps Dst in its low four bits and Src in its high four bits, so the byte 01 of the second instruction means Dst = 1 and Src = 0. For both classes the instruction format is the same (see above), and we can rewrite our program like this (at the same time rewriting the remaining columns in human-readable form):

Op S  Class   Dst Src Off  Imm
b  0  ALU64   0   0   0    1
1  0  JMP     1   0   1    0
b  0  ALU64   0   0   0    2
9  0  JMP     0   0   0    0

Operation b of class ALU64 is BPF_MOV. It assigns a value to the destination register. If the S (source) bit is set, the value is taken from the source register; if, as in our case, it is not set, the value is taken from the Imm field. So in the first and third instructions we perform the operation r0 = Imm. Next, operation 1 of class JMP is BPF_JEQ (jump if equal). In our case, since the S bit is zero, it compares the value of the destination register with the Imm field. If the values are equal, control transfers to PC + Off, where PC, as usual, holds the address of the next instruction. Finally, operation 9 of class JMP is BPF_EXIT. This instruction terminates the program, returning r0 to the kernel. Let's add a new column to our table:

Op    S  Class   Dst Src Off  Imm    Disassm
MOV   0  ALU64   0   0   0    1      r0 = 1
JEQ   0  JMP     1   0   1    0      if (r1 == 0) goto pc+1
MOV   0  ALU64   0   0   0    2      r0 = 2
EXIT  0  JMP     0   0   0    0      exit

We can rewrite this in a more convenient form:

     r0 = 1
     if (r1 == 0) goto END
     r0 = 2
END:
     exit

If we remember that in register r1 the program receives a pointer to the context from the kernel, and that the value in register r0 is returned to the kernel, then we can see that if the context pointer is zero we return 1, and otherwise 2. Let's check that we are right by looking at the source:

$ cat readelf-example.c
int foo(void *ctx)
{
        return ctx ? 2 : 1;
}

Yes, it is a meaningless program, but it translates into just four simple instructions.
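The manual decoding in this example can be sketched in C. This is an illustrative helper rather than kernel code: the names `decoded_insn`, `decode_insn`, `jump_target` and `example_text` are made up here, and a little-endian target is assumed, as in the dump above. Note that in the register byte the destination register occupies the low four bits, so the byte 01 of the second instruction decodes to dst = 1, src = 0.

```c
#include <stdint.h>

/* Decoded view of one 8-byte BPF instruction (illustrative type only). */
struct decoded_insn {
    uint8_t code;  /* opcode byte */
    uint8_t dst;   /* destination register: low nibble of byte 1 */
    uint8_t src;   /* source register: high nibble of byte 1 */
    int16_t off;   /* signed 16-bit offset */
    int32_t imm;   /* signed 32-bit immediate */
};

/* Decode 8 raw bytes as dumped by readelf for a little-endian BPF target. */
static struct decoded_insn decode_insn(const uint8_t raw[8])
{
    struct decoded_insn d;

    d.code = raw[0];
    d.dst  = raw[1] & 0x0f;
    d.src  = raw[1] >> 4;
    d.off  = (int16_t)(uint16_t)(raw[2] | (raw[3] << 8));
    d.imm  = (int32_t)((uint32_t)raw[4] | (uint32_t)raw[5] << 8 |
                       (uint32_t)raw[6] << 16 | (uint32_t)raw[7] << 24);
    return d;
}

/* Jump offsets are relative to the instruction that follows the jump. */
static int jump_target(int index, int16_t off)
{
    return index + 1 + off;
}

/* The 32 bytes of the .text section of readelf-example.o. */
static const uint8_t example_text[32] = {
    0xb7, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
    0x15, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00,
    0xb7, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
    0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};
```

Calling `decode_insn(example_text + 8)` gives the conditional jump, and `jump_target(1, d.off)` gives index 3, the exit instruction, which is why `r0 = 2` is skipped when the context pointer is zero.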

Exception example: a 16-byte instruction

We mentioned earlier that some instructions take up more than 64 bits. This applies, for example, to the instruction lddw (Code = 0x18 = BPF_LD | BPF_DW | BPF_IMM), which loads a double word from the Imm fields into a register. The point is that Imm is 32 bits wide and a double word is 64 bits, so loading a 64-bit immediate value into a register with a single 64-bit instruction will not work. Instead, two adjacent instructions are used, and the second one stores the second half of the 64-bit value in its Imm field. Example:

$ cat x64.c
long foo(void *ctx)
{
        return 0x11223344aabbccdd;
}
$ clang -target bpf -c x64.c -o x64.o -O2
$ llvm-readelf -x .text x64.o
Hex dump of section '.text':
0x00000000 18000000 ddccbbaa 00000000 44332211 ............D3".
0x00000010 95000000 00000000                   ........

There are only two instructions in the binary program:

Binary                                 Disassm
18000000 ddccbbaa 00000000 44332211    r0 = Imm[0]|Imm[1]
95000000 00000000                      exit

We will meet the lddw instruction again when we talk about relocations and working with maps.
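As a small sketch of how the two Imm fields of lddw form one 64-bit value: the first 8-byte half carries the low 32 bits and the second half carries the high 32 bits. The function name `lddw_value` is hypothetical and exists only for this illustration.

```c
#include <stdint.h>

/* Combine the Imm fields of the two halves of an lddw instruction.
 * imm_first  - Imm of the first 8 bytes (low 32 bits of the value)
 * imm_second - Imm of the second 8 bytes (high 32 bits of the value) */
static uint64_t lddw_value(uint32_t imm_first, uint32_t imm_second)
{
    return ((uint64_t)imm_second << 32) | imm_first;
}
```

With the bytes from the dump above, `lddw_value(0xaabbccdd, 0x11223344)` reproduces the constant 0x11223344aabbccdd from x64.c.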

Example: disassembling BPF using standard tools

So, we have learned to read BPF binary code and are ready to parse any instruction if necessary. However, it is worth saying that in practice it is more convenient and faster to disassemble programs with standard tools, for example:

$ llvm-objdump -d x64.o

Disassembly of section .text:

0000000000000000 <foo>:
 0: 18 00 00 00 dd cc bb aa 00 00 00 00 44 33 22 11 r0 = 1234605617868164317 ll
 2: 95 00 00 00 00 00 00 00 exit

Lifecycle of BPF objects, bpffs file system

(I first learned some of the details described in this subsection from Alexei Starovoitov's post on the BPF Blog.)

BPF objects, that is, programs and maps, are created from user space with the commands BPF_PROG_LOAD and BPF_MAP_CREATE of the bpf(2) system call; we will talk about exactly how this happens in the next section. This creates kernel data structures, and for each of them the refcount (reference count) is set to one and a file descriptor pointing to the object is returned to the user. When the descriptor is closed, the object's refcount is decreased by one, and when it reaches zero the object is destroyed.

If a program uses maps, the refcount of those maps is increased by one when the program is loaded, so their file descriptors can be closed in the user process and the refcount still will not drop to zero.

After a program has been loaded successfully, we usually attach it to some kind of event generator. For example, we can put it on a network interface to process incoming packets, or connect it to some tracepoint in the kernel. At this point the reference count also increases by one, and we can close the file descriptor in the loader program.
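The reference counting described above can be illustrated with a toy model. This is only a sketch of the bookkeeping, not how the kernel implements it: `toy_obj`, `toy_create`, `toy_get` and `toy_put` are invented names, and the real kernel structures and locking are different.

```c
#include <stdbool.h>

/* Toy stand-in for a BPF object (program or map). */
struct toy_obj {
    int  refcount;   /* number of live references */
    bool destroyed;  /* set once the last reference is dropped */
};

/* Creation: the object starts with one reference, held by the returned fd. */
static void toy_create(struct toy_obj *o)
{
    o->refcount = 1;
    o->destroyed = false;
}

/* Taking a reference: attaching to a hook, or a program using a map. */
static void toy_get(struct toy_obj *o)
{
    o->refcount++;
}

/* Dropping a reference: closing an fd or detaching from a hook. */
static void toy_put(struct toy_obj *o)
{
    if (--o->refcount == 0)
        o->destroyed = true;
}
```

In this model, loading a program is `toy_create`, attaching it is `toy_get`, and closing the loader's descriptor is `toy_put`; the object survives the close because the attachment still holds a reference.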

What happens if we now shut down the loader? It depends on the type of event generator (hook). All network hooks continue to exist after the loader exits; these are the so-called global hooks. Tracing programs, on the other hand, are released after the process that created them terminates (and are therefore called local, from "local to the process"). Technically, local hooks always have a corresponding file descriptor in user space and are therefore closed when the process exits, while global hooks do not. In the following figure I try to show, using red crosses, how the termination of the loader program affects the lifetime of objects in the case of local and global hooks.

[Figure: effect of loader termination on object lifetime for local and global hooks]

Why is there a distinction between local and global hooks? Running some types of network programs makes sense without any user-space part. Imagine DDoS protection, for example: the loader writes the rules and attaches the BPF program to the network interface, after which the loader can go and kill itself. On the other hand, imagine a debugging trace program that you threw together in ten minutes: when it finishes, you want no garbage left in the system, and local hooks ensure exactly that.

On the other hand, imagine that you want to attach to a tracepoint in the kernel and collect statistics over many years. In this case you would want to terminate the user-space part and come back to the statistics from time to time. The bpf file system provides this opportunity. It is an in-memory-only pseudo file system that lets you create files that reference BPF objects and thereby increase the refcount of those objects. After this the loader can exit, and the objects it created stay alive.

Creating files in bpffs that reference BPF objects is called "pinning" (as in the phrase "process can pin a BPF program or map"). Creating file objects for BPF objects makes sense not only for extending the life of local objects but also for the usability of global objects: going back to the example with the global DDoS protection program, we want to be able to come and look at the statistics from time to time.

The BPF file system is usually mounted at /sys/fs/bpf, but it can also be mounted locally, for example, like this:

$ mkdir bpf-mountpoint
$ sudo mount -t bpf none bpf-mountpoint

Names in the file system are created with the BPF_OBJ_PIN command of the bpf system call. To illustrate, let's take a program, compile it, load it, and pin it to bpffs. Our program does nothing useful; we show the code only so that you can reproduce the example:

$ cat test.c
__attribute__((section("xdp"), used))
int test(void *ctx)
{
        return 0;
}

char _license[] __attribute__((section("license"), used)) = "GPL";

Let's compile this program and create a local mount of the bpffs file system:

$ clang -target bpf -c test.c -o test.o
$ mkdir bpf-mountpoint
$ sudo mount -t bpf none bpf-mountpoint

Now let's load our program with the bpftool utility and look at the accompanying bpf(2) system calls (some irrelevant lines have been removed from the strace output):

$ sudo strace -e bpf bpftool prog load ./test.o bpf-mountpoint/test
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, prog_name="test", ...}, 120) = 3
bpf(BPF_OBJ_PIN, {pathname="bpf-mountpoint/test", bpf_fd=3}, 120) = 0

Here we loaded the program with BPF_PROG_LOAD, received file descriptor 3 from the kernel and, using the BPF_OBJ_PIN command, pinned this file descriptor as the file "bpf-mountpoint/test". After that the loader program bpftool finished its work, but our program remained in the kernel, even though we did not attach it to any network interface:

$ sudo bpftool prog | tail -3
783: xdp  name test  tag 5c8ba0cf164cb46c  gpl
        loaded_at 2020-05-05T13:27:08+0000  uid 0
        xlated 24B  jited 41B  memlock 4096B

We can delete the file object in the usual way with unlink(2), after which the corresponding program is removed:

$ sudo rm ./bpf-mountpoint/test
$ sudo bpftool prog show id 783
Error: get by id (783): No such file or directory
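For reference, the BPF_OBJ_PIN call that appears in the strace output above can also be issued directly. The following is a minimal sketch, assuming a Linux system with the kernel UAPI header linux/bpf.h installed; the helper name `pin_bpf_object` is invented for this example. It fills only the `pathname` and `bpf_fd` fields of `union bpf_attr`, the same two fields strace shows, and leaves everything else zeroed.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

/* Hypothetical helper: pin the BPF object referenced by `fd` at `path`,
 * which must be located inside a mounted bpffs.
 * Returns 0 on success and -1 on error, with errno set by the kernel. */
static int pin_bpf_object(int fd, const char *path)
{
    union bpf_attr attr;

    /* Fields not used by this command must be zero. */
    memset(&attr, 0, sizeof(attr));
    attr.pathname = (uint64_t)(uintptr_t)path;
    attr.bpf_fd   = (uint32_t)fd;

    return (int)syscall(__NR_bpf, BPF_OBJ_PIN, &attr, sizeof(attr));
}
```

A loader would call this with the descriptor returned by BPF_PROG_LOAD, for example `pin_bpf_object(prog_fd, "bpf-mountpoint/test")`, and check the return value; with an invalid descriptor the call fails and returns -1.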

Deleting objects

Speaking of deleting objects, it is worth clarifying that after we have detached a program from its hook (event generator), no new event will trigger it; however, all currently running instances of the program will complete in the normal way.

Some BPF program types allow you to replace the program on the fly, that is, they make the sequence replace = detach old program, attach new program atomic. In this case all active instances of the old version of the program finish their work, and new event handlers are created from the new program; "atomicity" here means that not a single event will be missed.

Attaching programs to event sources

In this article we will not describe attaching programs to event sources separately, since it makes sense to study this in the context of a specific program type. See the example below, in which we show how programs such as XDP are attached.

Managing Objects with the bpf System Call

BPF programs

All BPF objects are created and managed from user space with the bpf system call, which has the following prototype:

#include <linux/bpf.h>

int bpf(int cmd, union bpf_attr *attr, unsigned int size);

Here the command cmd is one of the values of type enum bpf_cmd, attr is a pointer to the parameters for the specific command, and size is the size of the object the pointer refers to, which is usually sizeof(*attr). In kernel 5.8 the bpf system call supports 34 different commands, and the definition of union bpf_attr takes up 200 lines. But we should not be intimidated by this, since we will get to know the commands and parameters over the course of several articles.

Let's start with the command BPF_PROG_LOAD, which creates BPF programs: it takes a set of BPF instructions and loads it into the kernel. At load time the verifier is run, then the JIT compiler, and after successful completion a file descriptor for the program is returned to the user. We saw what happens to it next in the previous section, on the life cycle of BPF objects.

We will now write a custom program that loads a simple BPF program, but first we need to decide what kind of program we want to load: we have to choose a type and, within that type, write a program that passes the verifier. To keep things simple, here is a ready-made solution: we will take a program of type BPF_PROG_TYPE_XDP that returns the value XDP_PASS (let all packets through). In BPF assembler it looks very simple:

r0 = 2
exit

Now that we have decided what we will load, we can show how we will do it:

#define _GNU_SOURCE
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

static inline __u64 ptr_to_u64(const void *ptr)
{
        return (__u64) (unsigned long) ptr;
}

int main(void)
{
    struct bpf_insn insns[] = {
        {
            .code = BPF_ALU64 | BPF_MOV | BPF_K,
            .dst_reg = BPF_REG_0,
            .imm = XDP_PASS
        },
        {
            .code = BPF_JMP | BPF_EXIT
        },
    };

    union bpf_attr attr = {
        .prog_type = BPF_PROG_TYPE_XDP,
        .insns     = ptr_to_u64(insns),
        .insn_cnt  = sizeof(insns)/sizeof(insns[0]),
        .license   = ptr_to_u64("GPL"),
    };

    strncpy(attr.prog_name, "woo", sizeof(attr.prog_name));
    syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));

    for ( ;; )
        pause();
}

The interesting part of the program begins with the definition of the array insns, our BPF program in machine code. Each instruction of the BPF program is packed into a bpf_insn structure. The first element of insns corresponds to the instruction r0 = 2, the second to exit.

Digression. The kernel defines more convenient macros for writing machine code, and using the kernel header file tools/include/linux/filter.h we could write

struct bpf_insn insns[] = {
    BPF_MOV64_IMM(BPF_REG_0, XDP_PASS),
    BPF_EXIT_INSN()
};

But since writing BPF programs in machine code is only necessary for writing kernel tests and articles about BPF, the absence of these macros does not really complicate a developer's life.
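To show what such macros boil down to, here is a sketch that encodes the same two instructions into raw bytes by hand. It does not use the kernel headers: the constants and the functions `encode_insn`, `encode_mov64_imm` and `encode_exit` are defined locally for illustration, and the byte order is the little-endian one used throughout this article.

```c
#include <stdint.h>

/* Opcode pieces, with the values used earlier in the article:
 * operation MOV = 0xb, operation EXIT = 0x9, classes JMP = 5 and ALU64 = 7. */
enum {
    OP_MOV    = 0xb0,
    OP_EXIT   = 0x90,
    SRC_IMM   = 0x00,  /* source flag cleared: operand comes from Imm */
    CLS_JMP   = 0x05,
    CLS_ALU64 = 0x07,
};

/* Write one instruction as 8 bytes in little-endian layout. */
static void encode_insn(uint8_t out[8], uint8_t code, uint8_t dst,
                        uint8_t src, int16_t off, int32_t imm)
{
    uint16_t uoff = (uint16_t)off;
    uint32_t uimm = (uint32_t)imm;

    out[0] = code;
    out[1] = (uint8_t)((src << 4) | (dst & 0x0f));  /* Dst in the low nibble */
    out[2] = (uint8_t)(uoff & 0xff);
    out[3] = (uint8_t)(uoff >> 8);
    out[4] = (uint8_t)(uimm & 0xff);
    out[5] = (uint8_t)((uimm >> 8) & 0xff);
    out[6] = (uint8_t)((uimm >> 16) & 0xff);
    out[7] = (uint8_t)(uimm >> 24);
}

/* Same effect as "dst = imm" on a 64-bit register. */
static void encode_mov64_imm(uint8_t out[8], uint8_t dst, int32_t imm)
{
    encode_insn(out, OP_MOV | SRC_IMM | CLS_ALU64, dst, 0, 0, imm);
}

/* The exit instruction has no operands. */
static void encode_exit(uint8_t out[8])
{
    encode_insn(out, OP_EXIT | CLS_JMP, 0, 0, 0, 0);
}
```

Encoding `r0 = 2` followed by `exit` this way yields 16 bytes that start with b7 and 95 respectively, the same opcodes that appear for our program in the examples of this article.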

After defining the BPF program, we move on to loading it into the kernel. Our minimalist set of parameters attr includes the program type, the array and number of instructions, the required license, and the name "woo", which we use to find our program in the system after loading. The program, as promised, is loaded into the system with the bpf system call.

At the end of the program we enter an infinite loop that simulates a payload. Without it, the program would be destroyed by the kernel when the file descriptor returned to us by the bpf system call is closed, and we would not see it in the system.

Well, we are ready for testing. Let's build the program and run it under strace to check that everything works as it should:

$ clang -g -O2 simple-prog.c -o simple-prog

$ sudo strace ./simple-prog
execve("./simple-prog", ["./simple-prog"], 0x7ffc7b553480 /* 13 vars */) = 0
...
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, insn_cnt=2, insns=0x7ffe03c4ed50, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="woo", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = 3
pause(

All is well: bpf(2) returned descriptor 3 to us and we went into an infinite loop with pause(). Let's try to find our program on the system. To do this, we go to another terminal and use the bpftool utility:

# bpftool prog | grep -A3 woo
390: xdp  name woo  tag 3b185187f1855c4c  gpl
        loaded_at 2020-08-31T24:66:44+0000  uid 0
        xlated 16B  jited 40B  memlock 4096B
        pids simple-prog(10381)

We see that the system has a loaded program woo whose global ID is 390, and that the running simple-prog process holds an open file descriptor pointing to the program (if simple-prog exits, woo will disappear). As expected, the program woo takes 16 bytes - two instructions - of binary code in the BPF architecture, but in its native form (x86_64) it is already 40 bytes. Let's look at our program in its original form:

# bpftool prog dump xlated id 390
   0: (b7) r0 = 2
   1: (95) exit

No surprises. Now let's look at the code generated by the JIT compiler:

# bpftool prog dump jited id 390
bpf_prog_3b185187f1855c4c_woo:
   0:   nopl   0x0(%rax,%rax,1)
   5:   push   %rbp
   6:   mov    %rsp,%rbp
   9:   sub    $0x0,%rsp
  10:   push   %rbx
  11:   push   %r13
  13:   push   %r14
  15:   push   %r15
  17:   pushq  $0x0
  19:   mov    $0x2,%eax
  1e:   pop    %rbx
  1f:   pop    %r15
  21:   pop    %r14
  23:   pop    %r13
  25:   pop    %rbx
  26:   leaveq
  27:   retq

Not very efficient for exit(2), but in fairness, our program is too simple, and for non-trivial programs the prologue and epilogue added by the JIT compiler are, of course, needed.

Maps

BPF programs can use structured memory areas that are accessible both to other BPF programs and to programs in user space. These objects are called maps, and in this section we show how to manipulate them using the bpf system call.

Let's say right away that the capabilities of maps are not limited to access to shared memory. There are special-purpose maps containing, for example, pointers to BPF programs or pointers to network interfaces, maps for working with perf events, etc. We will not talk about them here, so as not to confuse the reader. Apart from this, we ignore synchronization issues, since they are not important for our examples. A complete list of available map types can be found in <linux/bpf.h>, and in this section we take as an example the historically first type, the hash table BPF_MAP_TYPE_HASH.

If you create a hash table in, say, C++, you write unordered_map<int,long> woo, which in plain English means "I need a table woo of unlimited size, whose keys are of type int and whose values are of type long". To create a BPF hash table we need to do much the same thing, except that we have to specify the maximum size of the table, and instead of specifying the types of keys and values we have to specify their sizes in bytes. Maps are created with the BPF_MAP_CREATE command of the bpf system call. Let's look at a more or less minimal program that creates a map. After the previous program, which loaded a BPF program, this one should look simple:

$ cat simple-map.c
#define _GNU_SOURCE
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

int main(void)
{
    union bpf_attr attr = {
        .map_type = BPF_MAP_TYPE_HASH,
        .key_size = sizeof(int),
        .value_size = sizeof(int),
        .max_entries = 4,
    };
    strncpy(attr.map_name, "woo", sizeof(attr.map_name));
    syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));

    for ( ;; )
        pause();
}

Here we define a set of attr parameters in which we say "I need a hash table with keys and values of size sizeof(int), into which I can put at most four elements." When creating BPF maps you can specify other parameters as well; for example, just as in the program example, we set the name of the object to "woo".

Let's compile and run the program:

$ clang -g -O2 simple-map.c -o simple-map
$ sudo strace ./simple-map
execve("./simple-map", ["./simple-map"], 0x7ffd40a27070 /* 14 vars */) = 0
...
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=4, map_name="woo", ...}, 72) = 3
pause(

Here the bpf(2) system call returned descriptor number 3 to us, and then the program, as expected, waits for further instructions in the pause(2) system call.

Now let's send our program to the background or open another terminal and look at our object using the bpftool utility (we can distinguish our map from the others by its name):

$ sudo bpftool map
...
114: hash  name woo  flags 0x0
        key 4B  value 4B  max_entries 4  memlock 4096B
...

The number 114 is the global ID of our object. Any program on the system can use this ID to open an existing map using the BPF_MAP_GET_FD_BY_ID command of the bpf system call.

Now we can play with our hash table. Let's look at its contents:

$ sudo bpftool map dump id 114
Found 0 elements

Empty. Let's put the value hash[1] = 1 into it:

$ sudo bpftool map update id 114 key 1 0 0 0 value 1 0 0 0

Let's look at the table again:

$ sudo bpftool map dump id 114
key: 01 00 00 00  value: 01 00 00 00
Found 1 element

Hooray! We managed to add one element. Note that we have to work at the byte level to do this, since bpftool does not know what type the values in the hash table have. (This knowledge can be passed to it using BTF, but more on that later.)

How exactly does bpftool read and add elements? Let's take a look under the hood:

$ sudo strace -e bpf bpftool map dump id 114
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=114, next_id=0, open_flags=0}, 120) = 3
bpf(BPF_MAP_GET_NEXT_KEY, {map_fd=3, key=NULL, next_key=0x55856ab65280}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x55856ab65280, value=0x55856ab652a0}, 120) = 0
key: 01 00 00 00  value: 01 00 00 00
bpf(BPF_MAP_GET_NEXT_KEY, {map_fd=3, key=0x55856ab65280, next_key=0x55856ab65280}, 120) = -1 ENOENT

First we opened the map by its global ID using the BPF_MAP_GET_FD_BY_ID command, and bpf(2) returned descriptor 3 to us. Then, using the BPF_MAP_GET_NEXT_KEY command, we found the first key in the table by passing NULL as the pointer to the "previous" key. Once we have a key we can do BPF_MAP_LOOKUP_ELEM, which returns the value through the value pointer. The next step is to try to find the next element by passing a pointer to the current key, but our table contains only one element, so the BPF_MAP_GET_NEXT_KEY command returns ENOENT.

Okay, let's change the value at key 1; say our business logic requires setting hash[1] = 2:

$ sudo strace -e bpf bpftool map update id 114 key 1 0 0 0 value 2 0 0 0
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=114, next_id=0, open_flags=0}, 120) = 3
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x55dcd72be260, value=0x55dcd72be280, flags=BPF_ANY}, 120) = 0

As expected, it is very simple: the BPF_MAP_GET_FD_BY_ID command opens our map by ID, and the BPF_MAP_UPDATE_ELEM command overwrites the element.

So, after creating a hash table from one program, we can read and write its contents from another. Note that if we were able to do this from the command line, then any other program on the system can do it too. In addition to the commands shown above, the following are available for working with maps from user space:

  • BPF_MAP_LOOKUP_ELEM: look up a value by key
  • BPF_MAP_UPDATE_ELEM: update or create a value
  • BPF_MAP_DELETE_ELEM: delete a key
  • BPF_MAP_GET_NEXT_KEY: find the next (or the first) key
  • BPF_MAP_GET_NEXT_ID: iterate over all existing maps; this is how bpftool map works
  • BPF_MAP_GET_FD_BY_ID: open an existing map by its global ID
  • BPF_MAP_LOOKUP_AND_DELETE_ELEM: look up an element, delete it, and return its old value
  • BPF_MAP_FREEZE: make the map immutable from user space (this operation cannot be undone)
  • BPF_MAP_LOOKUP_BATCH, BPF_MAP_LOOKUP_AND_DELETE_BATCH, BPF_MAP_UPDATE_BATCH, BPF_MAP_DELETE_BATCH: batch operations. For example, BPF_MAP_LOOKUP_AND_DELETE_BATCH is the only reliable way to read and reset all values of a map

Not all of these commands work for all map types, but in general, working with other map types from user space looks exactly the same as working with hash tables.

For the sake of order, let's finish our hash table experiments. Remember that we created a table that can hold up to four keys? Let's add a few more elements:

$ sudo bpftool map update id 114 key 2 0 0 0 value 1 0 0 0
$ sudo bpftool map update id 114 key 3 0 0 0 value 1 0 0 0
$ sudo bpftool map update id 114 key 4 0 0 0 value 1 0 0 0

So far so good:

$ sudo bpftool map dump id 114
key: 01 00 00 00  value: 01 00 00 00
key: 02 00 00 00  value: 01 00 00 00
key: 04 00 00 00  value: 01 00 00 00
key: 03 00 00 00  value: 01 00 00 00
Found 4 elements

Let's try to add one more:

$ sudo bpftool map update id 114 key 5 0 0 0 value 1 0 0 0
Error: update failed: Argument list too long

As expected, we did not succeed. Let's look at the error in more detail:

$ sudo strace -e bpf bpftool map update id 114 key 5 0 0 0 value 1 0 0 0
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=114, next_id=0, open_flags=0}, 120) = 3
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=3, info_len=80, info=0x7ffe6c626da0}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x56049ded5260, value=0x56049ded5280, flags=BPF_ANY}, 120) = -1 E2BIG (Argument list too long)
Error: update failed: Argument list too long
+++ exited with 255 +++

Everything is fine: as expected, the BPF_MAP_UPDATE_ELEM command tries to create a new, fifth key, but fails with E2BIG.

So, we can create and load BPF programs, as well as create and manage maps from user space. Now it is logical to look at how we can use maps from the BPF programs themselves. We could talk about this in the language of hard-to-read programs written with machine-code macros, but in fact the time has come to show how BPF programs are actually written and maintained - using libbpf.

(For readers who are dissatisfied with the lack of a low-level example: we will analyze in detail programs that use maps and helper functions built with libbpf, and explain what happens at the instruction level. For readers who are very dissatisfied, we have added an example at the appropriate place in the article.)

Writing BPF programs using libbpf

Writing BPF programs in machine code is interesting only the first time, and then satiety sets in. At this point you should turn your attention to llvm, which has a backend for generating code for the BPF architecture, and to the libbpf library, which lets you write the user-space side of BPF applications and load the code of BPF programs generated with llvm/clang.

In fact, as we will see in this and subsequent articles, libbpf does quite a lot of work, and living without it (or without similar tools - iproute2, libbcc, libbpf-go, etc.) is impossible. One of the killer features of the libbpf project is BPF CO-RE (Compile Once, Run Everywhere) - a project that lets you write BPF programs that are portable from one kernel to another, with the ability to run against different APIs (for example, when a kernel structure changes from version to version). To be able to work with CO-RE, your kernel must be compiled with BTF support (we describe how to do this in the Development Tools section). Checking whether your kernel is built with BTF is very simple - look for the following file:

$ ls -lh /sys/kernel/btf/vmlinux
-r--r--r-- 1 root root 2.6M Jul 29 15:30 /sys/kernel/btf/vmlinux

This file stores information about all the data types used in the kernel and is used in all our examples that use libbpf. We will talk about CO-RE in detail in the next article, but for this one, just build yourself a kernel with CONFIG_DEBUG_INFO_BTF.

The libbpf library lives right in the tools/lib/bpf directory of the kernel, and its development is carried out via the mailing list [email protected]. However, a separate repository is maintained for the needs of applications living outside the kernel, https://github.com/libbpf/libbpf, in which the kernel library is mirrored more or less as is, for read access.

In this section we look at how to create a project that uses libbpf, write a few (more or less meaningless) test programs, and analyze in detail how it all works. This will make it easier to explain in the following sections exactly how BPF programs interact with maps, kernel helpers, BTF, etc.

Projects that use libbpf usually add the GitHub repository as a git submodule, and we will do the same:

$ mkdir /tmp/libbpf-example
$ cd /tmp/libbpf-example/
$ git init-db
Initialized empty Git repository in /tmp/libbpf-example/.git/
$ git submodule add https://github.com/libbpf/libbpf.git
Cloning into '/tmp/libbpf-example/libbpf'...
remote: Enumerating objects: 200, done.
remote: Counting objects: 100% (200/200), done.
remote: Compressing objects: 100% (103/103), done.
remote: Total 3354 (delta 101), reused 118 (delta 79), pack-reused 3154
Receiving objects: 100% (3354/3354), 2.05 MiB | 10.22 MiB/s, done.
Resolving deltas: 100% (2176/2176), done.

Building libbpf is very simple:

$ cd libbpf/src
$ mkdir build
$ OBJDIR=build DESTDIR=root make -s install
$ find root
root
root/usr
root/usr/include
root/usr/include/bpf
root/usr/include/bpf/bpf_tracing.h
root/usr/include/bpf/xsk.h
root/usr/include/bpf/libbpf_common.h
root/usr/include/bpf/bpf_endian.h
root/usr/include/bpf/bpf_helpers.h
root/usr/include/bpf/btf.h
root/usr/include/bpf/bpf_helper_defs.h
root/usr/include/bpf/bpf.h
root/usr/include/bpf/libbpf_util.h
root/usr/include/bpf/libbpf.h
root/usr/include/bpf/bpf_core_read.h
root/usr/lib64
root/usr/lib64/libbpf.so.0.1.0
root/usr/lib64/libbpf.so.0
root/usr/lib64/libbpf.a
root/usr/lib64/libbpf.so
root/usr/lib64/pkgconfig
root/usr/lib64/pkgconfig/libbpf.pc

Our further plan in this section is as follows: we will write a BPF program of type BPF_PROG_TYPE_XDP, the same as in the previous example but in C, compile it with clang, and write a helper program that loads it into the kernel. In the following sections we will extend the capabilities of both the BPF program and the helper program.

Example: creating a full-fledged application using libbpf

To begin with, we use the file /sys/kernel/btf/vmlinux, mentioned above, and create its equivalent in the form of a header file:

$ bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

This file stores all the data structures available in our kernel; for example, this is how the IPv4 header is defined in the kernel:

$ grep -A 12 'struct iphdr {' vmlinux.h
struct iphdr {
    __u8 ihl: 4;
    __u8 version: 4;
    __u8 tos;
    __be16 tot_len;
    __be16 id;
    __be16 frag_off;
    __u8 ttl;
    __u8 protocol;
    __sum16 check;
    __be32 saddr;
    __be32 daddr;
};

Now we write our BPF program in C:

$ cat xdp-simple.bpf.c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("xdp/simple")
int simple(void *ctx)
{
        return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";

Although our program turned out to be very simple, there are still many details to pay attention to. First, the first header file we include is vmlinux.h, which we have just generated using bpftool btf dump - now we do not need to install the kernel-headers package to find out what kernel structures look like. The next header file, <bpf/bpf_helpers.h>, comes to us from the libbpf library. For now we need it only to define the SEC macro, which places a symbol in the appropriate section of the ELF object file. Our program lives in the section xdp/simple, where the part before the slash defines the type of the BPF program - this is a convention used in libbpf, which uses the section name to substitute the correct type when calling bpf(2). The BPF program itself, in C, is very simple and consists of a single line, return XDP_PASS. Finally, a separate section "license" contains the name of the license.

We can compile our program using llvm/clang, version >= 10.0.0, or better yet, newer (see the Development Tools section):

$ clang --version
clang version 11.0.0 (https://github.com/llvm/llvm-project.git afc287e0abec710398465ee1f86237513f2b5091)
...

$ clang -O2 -g -c -target bpf -I libbpf/src/root/usr/include xdp-simple.bpf.c -o xdp-simple.bpf.o

Among the interesting details: we specify the target architecture with -target bpf and the path to the libbpf headers that we installed a moment ago. Also, do not forget about -O2; without this option you may be in for surprises later. Let's look at our code - did we manage to write the program we wanted?

$ llvm-objdump --section=xdp/simple --no-show-raw-insn -D xdp-simple.bpf.o

xdp-simple.bpf.o:       file format elf64-bpf

Disassembly of section xdp/simple:

0000000000000000 <simple>:
       0:       r0 = 2
       1:       exit

Yes, it worked! Now we have a binary file with the program, and we want to create an application that loads it into the kernel. For this purpose the libbpf library offers us two options - a lower-level API or a higher-level API. We will go the second way, since we want to learn how to write, load, and attach BPF programs with minimal effort so that we can study them afterwards.

First, we need to generate the "skeleton" of our program from its binary using the same bpftool utility - the Swiss army knife of the BPF world (which can be taken literally, since Daniel Borkmann, one of the creators and maintainers of BPF, is Swiss):

$ bpftool gen skeleton xdp-simple.bpf.o > xdp-simple.skel.h

The file xdp-simple.skel.h contains the binary code of our program and functions for managing it - loading, attaching, and destroying our object. In our simple case this looks like overkill, but it also works when the object file contains many BPF programs and maps: to load such a giant ELF we only need to generate the skeleton and call one or two functions from the custom application we are writing. Let's move on to that now.

Strictly speaking, our loader program is trivial:

#include <err.h>
#include <unistd.h>
#include "xdp-simple.skel.h"

int main(int argc, char **argv)
{
    struct xdp_simple_bpf *obj;

    obj = xdp_simple_bpf__open_and_load();
    if (!obj)
        err(1, "failed to open and/or load BPF object\n");

    pause();

    xdp_simple_bpf__destroy(obj);
}

Here struct xdp_simple_bpf is defined in the file xdp-simple.skel.h and describes our object file:

struct xdp_simple_bpf {
    struct bpf_object_skeleton *skeleton;
    struct bpf_object *obj;
    struct {
        struct bpf_program *simple;
    } progs;
    struct {
        struct bpf_link *simple;
    } links;
};

We can see traces of a lower-level API here: the structures struct bpf_program *simple and struct bpf_link *simple. The first structure describes our program itself, written in the section xdp/simple, and the second describes how the program attaches to an event source.

The function xdp_simple_bpf__open_and_load opens an ELF object, parses it, creates all the structures and substructures (besides the program, the ELF also contains other sections - data, read-only data, debugging information, license, and so on), and then loads it into the kernel using the bpf system call, which we can verify by compiling and running the program:

$ clang -O2 -I ./libbpf/src/root/usr/include/ xdp-simple.c -o xdp-simple ./libbpf/src/root/usr/lib64/libbpf.a -lelf -lz

$ sudo strace -e bpf ./xdp-simple
...
bpf(BPF_BTF_LOAD, 0x7ffdb8fd9670, 120)  = 3
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, insn_cnt=2, insns=0xdfd580, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 8, 0), prog_flags=0, prog_name="simple", prog_ifindex=0, expected_attach_type=0x25 /* BPF_??? */, ...}, 120) = 4

Let's now look at our program using bpftool. First we find its ID:

# bpftool p | grep -A4 simple
463: xdp  name simple  tag 3b185187f1855c4c  gpl
        loaded_at 2020-08-01T01:59:49+0000  uid 0
        xlated 16B  jited 40B  memlock 4096B
        btf_id 185
        pids xdp-simple(16498)

and dump it (we use a shortened form of the command bpftool prog dump xlated):

# bpftool p d x id 463
int simple(void *ctx):
; return XDP_PASS;
   0: (b7) r0 = 2
   1: (95) exit

Something new! The dump now shows pieces of our C source file. This was done by the libbpf library, which found the debug section in the binary, compiled it into a BTF object, loaded it into the kernel using BPF_BTF_LOAD, and then passed the resulting file descriptor when loading the program with the BPF_PROG_LOAD command.

Kernel Helpers

BPF programs can run "external" functions - kernel helpers. These helper functions allow BPF programs to access kernel structures, manage maps, and communicate with the "real world" - create perf events, control hardware (for example, redirect packets), etc.

Example: bpf_get_smp_processor_id

Following the "learning by example" paradigm, let's consider one of the helper functions, bpf_get_smp_processor_id(), defined in the file kernel/bpf/helpers.c. It returns the number of the processor on which the calling BPF program is running. But we are interested less in its semantics than in the fact that its implementation takes one line:

BPF_CALL_0(bpf_get_smp_processor_id)
{
    return smp_processor_id();
}

BPF helper function definitions are similar to Linux system call definitions. Here, for example, a function with no arguments is defined. (A function that takes, say, three arguments is defined using the macro BPF_CALL_3. The maximum number of arguments is five.) However, this is only the first part of the definition. The second part is to define a structure of type struct bpf_func_proto, which contains a description of the helper function that the verifier understands:

const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
    .func     = bpf_get_smp_processor_id,
    .gpl_only = false,
    .ret_type = RET_INTEGER,
};

Registering Helper Functions

For BPF programs of a particular type to use this function, it must be registered. For example, for the type BPF_PROG_TYPE_XDP the kernel defines a function xdp_func_proto, which decides from the helper function ID whether XDP supports that function or not. Our function is supported:

static const struct bpf_func_proto *
xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
{
    switch (func_id) {
    ...
    case BPF_FUNC_get_smp_processor_id:
        return &bpf_get_smp_processor_id_proto;
    ...
    }
}

New BPF program types are "defined" in the file include/linux/bpf_types.h using the macro BPF_PROG_TYPE. The word is in quotation marks because this is a logical definition; in C terms, the definition of a whole set of concrete structures happens in other places. In particular, in the file kernel/bpf/verifier.c all definitions from the file bpf_types.h are used to create an array of structures bpf_verifier_ops[]:

static const struct bpf_verifier_ops *const bpf_verifier_ops[] = {
#define BPF_PROG_TYPE(_id, _name, prog_ctx_type, kern_ctx_type) \
    [_id] = & _name ## _verifier_ops,
#include <linux/bpf_types.h>
#undef BPF_PROG_TYPE
};

That is, for each type of BPF program a pointer to a data structure of type struct bpf_verifier_ops is defined, initialized with the value _name ## _verifier_ops, i.e., xdp_verifier_ops for xdp. The structure xdp_verifier_ops is defined in the file net/core/filter.c as follows:

const struct bpf_verifier_ops xdp_verifier_ops = {
    .get_func_proto     = xdp_func_proto,
    .is_valid_access    = xdp_is_valid_access,
    .convert_ctx_access = xdp_convert_ctx_access,
    .gen_prologue       = bpf_noop_prologue,
};

Here we see our familiar function xdp_func_proto, which the verifier runs every time it encounters a call to some function inside a BPF program, see verifier.c.

Let's look at how a hypothetical BPF program uses the function bpf_get_smp_processor_id. To do this, we rewrite the program from the previous section as follows:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("xdp/simple")
int simple(void *ctx)
{
    if (bpf_get_smp_processor_id() != 0)
        return XDP_DROP;
    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";

The symbol bpf_get_smp_processor_id is defined in <bpf/bpf_helper_defs.h> of the libbpf library as

static __u32 (*bpf_get_smp_processor_id)(void) = (void *) 8;

that is, bpf_get_smp_processor_id is a function pointer whose value is 8, where 8 is the value of BPF_FUNC_get_smp_processor_id of type enum bpf_func_id, which is defined for us in the file vmlinux.h (the file bpf_helper_defs.h is generated in the kernel by a script, so the "magic" numbers are fine). This function takes no arguments and returns a value of type __u32. When we call it in our program, clang generates a BPF_CALL instruction "of the right kind". Let's compile the program and look at the section xdp/simple:

$ clang -O2 -g -c -target bpf -I libbpf/src/root/usr/include xdp-simple.bpf.c -o xdp-simple.bpf.o
$ llvm-objdump -D --section=xdp/simple xdp-simple.bpf.o

xdp-simple.bpf.o:       file format elf64-bpf

Disassembly of section xdp/simple:

0000000000000000 <simple>:
       0:       85 00 00 00 08 00 00 00 call 8
       1:       bf 01 00 00 00 00 00 00 r1 = r0
       2:       67 01 00 00 20 00 00 00 r1 <<= 32
       3:       77 01 00 00 20 00 00 00 r1 >>= 32
       4:       b7 00 00 00 02 00 00 00 r0 = 2
       5:       15 01 01 00 00 00 00 00 if r1 == 0 goto +1 <LBB0_2>
       6:       b7 00 00 00 01 00 00 00 r0 = 1

0000000000000038 <LBB0_2>:
       7:       95 00 00 00 00 00 00 00 exit

In the first line we see the call instruction, whose IMM parameter equals 8 and whose SRC_REG is zero. According to the ABI convention used by the verifier, this is a call to helper function number eight. After the call, the logic is simple. The return value in register r0 is copied to r1, and in lines 2,3 it is converted to type u32 - the upper 32 bits are cleared. In lines 4,5,6,7 we return 2 (XDP_PASS) or 1 (XDP_DROP) depending on whether the helper called in line 0 returned a zero or a non-zero value.

Let's check ourselves: load the program and look at the output of bpftool prog dump xlated:

$ bpftool gen skeleton xdp-simple.bpf.o > xdp-simple.skel.h
$ clang -O2 -g -I ./libbpf/src/root/usr/include/ -o xdp-simple xdp-simple.c ./libbpf/src/root/usr/lib64/libbpf.a -lelf -lz
$ sudo ./xdp-simple &
[2] 10914

$ sudo bpftool p | grep simple
523: xdp  name simple  tag 44c38a10c657e1b0  gpl
        pids xdp-simple(10915)

$ sudo bpftool p d x id 523
int simple(void *ctx):
; if (bpf_get_smp_processor_id() != 0)
   0: (85) call bpf_get_smp_processor_id#114128
   1: (bf) r1 = r0
   2: (67) r1 <<= 32
   3: (77) r1 >>= 32
   4: (b7) r0 = 2
; }
   5: (15) if r1 == 0x0 goto pc+1
   6: (b7) r0 = 1
   7: (95) exit

Okay, the verifier found the correct kernel helper.

Example: passing arguments and finally running the program!

All run-level helper functions have the prototype

u64 fn(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)

Parameters are passed to helper functions in registers r1-r5, and the value is returned in register r0. There are no functions that take more than five arguments, and support for them is not expected to be added in the future.

Let's look at a new kernel helper and at how BPF passes parameters. We rewrite xdp-simple.bpf.c as follows (the rest of the lines are unchanged):

SEC("xdp/simple")
int simple(void *ctx)
{
    bpf_printk("running on CPU%u\n", bpf_get_smp_processor_id());
    return XDP_PASS;
}

Our program prints the number of the CPU on which it is running. Let's compile it and look at the code:

$ llvm-objdump -D --section=xdp/simple --no-show-raw-insn xdp-simple.bpf.o

0000000000000000 <simple>:
       0:       r1 = 10
       1:       *(u16 *)(r10 - 8) = r1
       2:       r1 = 8441246879787806319 ll
       4:       *(u64 *)(r10 - 16) = r1
       5:       r1 = 2334956330918245746 ll
       7:       *(u64 *)(r10 - 24) = r1
       8:       call 8
       9:       r1 = r10
      10:       r1 += -24
      11:       r2 = 18
      12:       r3 = r0
      13:       call 6
      14:       r0 = 2
      15:       exit

In lines 0-7 we write the string running on CPU%u\n onto the stack, and then in line 8 we call the familiar bpf_get_smp_processor_id. In lines 9-12 we prepare the arguments of the helper bpf_printk - registers r1, r2, r3. Why are there three of them and not two? Because bpf_printk is a macro wrapper around the real helper bpf_trace_printk, which must be passed the size of the format string.

Let's now add a couple of lines to xdp-simple.c so that our program attaches to the lo interface and really starts running!

$ cat xdp-simple.c
#include <linux/if_link.h>
#include <err.h>
#include <unistd.h>
#include "xdp-simple.skel.h"

int main(int argc, char **argv)
{
    __u32 flags = XDP_FLAGS_SKB_MODE;
    struct xdp_simple_bpf *obj;

    obj = xdp_simple_bpf__open_and_load();
    if (!obj)
        err(1, "failed to open and/or load BPF object\n");

    bpf_set_link_xdp_fd(1, -1, flags);
    bpf_set_link_xdp_fd(1, bpf_program__fd(obj->progs.simple), flags);

    /* the attached program outlives the loader, so the skeleton can be freed */
    xdp_simple_bpf__destroy(obj);
    return 0;
}

Here we use the function bpf_set_link_xdp_fd, which attaches XDP-type BPF programs to network interfaces. (libbpf 1.0 removed this function in favor of bpf_xdp_attach.) We hardcoded the index of the lo interface, which is always 1. We call the function twice so that the first call detaches an old program if one was attached. Note that we now need neither a pause call nor an infinite loop: our loader program will exit, but the BPF program will not be killed, because it is attached to an event source. After a successful load and attach, the program will run for every network packet arriving on lo.

Let's load the program and look at the lo interface:

$ sudo ./xdp-simple
$ sudo bpftool p | grep simple
669: xdp  name simple  tag 4fca62e77ccb43d6  gpl
$ ip l show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    prog/xdp id 669

The program we loaded has ID 669, and we see the same ID on the lo interface. Let's send a couple of packets to 127.0.0.1 (request + reply):

$ ping -c1 localhost

and now let's look at the contents of the debug virtual file /sys/kernel/debug/tracing/trace_pipe, into which bpf_printk writes its messages:

# cat /sys/kernel/debug/tracing/trace_pipe
ping-13937 [000] d.s1 442015.377014: bpf_trace_printk: running on CPU0
ping-13937 [000] d.s1 442015.377027: bpf_trace_printk: running on CPU0

Two packets were seen on lo and processed on CPU0: our first complete, if pointless, BPF program worked!

It is worth noting that bpf_printk writes to a debug file for a reason: it is not the most suitable helper for production use, but our goal was to show something simple.

Accessing maps from BPF programs

Example: using a map from a BPF program

In the previous sections we learned how to create and use maps from user space; now let's look at the kernel side. Let's start, as usual, with an example. Rewrite our program xdp-simple.bpf.c as follows:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __uint(max_entries, 8);
    __type(key, u32);
    __type(value, u64);
} woo SEC(".maps");

SEC("xdp/simple")
int simple(void *ctx)
{
    u32 key = bpf_get_smp_processor_id();
    u32 *val; /* the map value is declared u64; a u32 pointer updates only 32 bits of it */

    val = bpf_map_lookup_elem(&woo, &key);
    if (!val)
        return XDP_ABORTED;

    *val += 1;

    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";

At the beginning of the program we added the definition of the map woo: it is an 8-element array storing values of type u64 (in C we would define such an array as u64 woo[8]). In the program "xdp/simple" we get the current processor number into the variable key, and then with the helper bpf_map_lookup_elem we obtain a pointer to the corresponding entry in the array, which we increment by one. In plain words: we collect statistics on which CPU processed incoming packets. Let's try to run the program:

$ clang -O2 -g -c -target bpf -I libbpf/src/root/usr/include xdp-simple.bpf.c -o xdp-simple.bpf.o
$ bpftool gen skeleton xdp-simple.bpf.o > xdp-simple.skel.h
$ clang -O2 -g -I ./libbpf/src/root/usr/include/ -o xdp-simple xdp-simple.c ./libbpf/src/root/usr/lib64/libbpf.a -lelf -lz
$ sudo ./xdp-simple

Let's check that it is attached to lo and send some packets:

$ ip l show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    prog/xdp id 108

$ for s in `seq 234`; do sudo ping -f -c 100 127.0.0.1 >/dev/null 2>&1; done

Now let's look at the contents of the array:

$ sudo bpftool map dump name woo
[
    { "key": 0, "value": 0 },
    { "key": 1, "value": 400 },
    { "key": 2, "value": 0 },
    { "key": 3, "value": 0 },
    { "key": 4, "value": 0 },
    { "key": 5, "value": 0 },
    { "key": 6, "value": 0 },
    { "key": 7, "value": 46400 }
]

Almost all of the processing happened on CPU7. That is not important to us; what matters is that the program works and that we understand how to access maps from BPF programs: with the bpf_map_* helpers.

The index

So, we can access the map from a BPF program using calls like

val = bpf_map_lookup_elem(&woo, &key);

where the helper function looks like

void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

but we are passing a pointer &woo to an unnamed structure struct { ... }...

If we look at the program's assembly, we see that the value of &woo is not actually defined (line 4):

$ llvm-objdump -D --section xdp/simple xdp-simple.bpf.o

xdp-simple.bpf.o:       file format elf64-bpf

Disassembly of section xdp/simple:

0000000000000000 <simple>:
       0:       85 00 00 00 08 00 00 00 call 8
       1:       63 0a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r0
       2:       bf a2 00 00 00 00 00 00 r2 = r10
       3:       07 02 00 00 fc ff ff ff r2 += -4
       4:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
       6:       85 00 00 00 01 00 00 00 call 1
...

and is kept in the relocations:

$ llvm-readelf -r xdp-simple.bpf.o | head -4

Relocation section '.relxdp/simple' at offset 0xe18 contains 1 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name
0000000000000020  0000002700000001 R_BPF_64_64            0000000000000000 woo
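The relocation row can be read by hand. The sketch below assumes the standard ELF64 convention (symbol index in the high 32 bits of Info, relocation type in the low 32 bits) and the 8-byte width of a BPF instruction slot; the struct and function names are invented for this illustration.

```c
#include <stdint.h>

/* One row of the relocation table, as printed by llvm-readelf. */
struct reloc_row {
    uint64_t offset; /* byte offset inside the section */
    uint64_t info;   /* symbol index in the high 32 bits, type in the low 32 */
};

/* Each BPF instruction slot is 8 bytes wide. */
static uint64_t reloc_insn_index(const struct reloc_row *r)
{
    return r->offset / 8;
}

static uint32_t reloc_symbol(const struct reloc_row *r)
{
    return (uint32_t)(r->info >> 32);
}

static uint32_t reloc_type(const struct reloc_row *r)
{
    return (uint32_t)(r->info & 0xffffffffu);
}
```

For the row above, offset 0x20 gives instruction 4, which is exactly the r1 = 0 ll load in the disassembly, and type 1 corresponds to the R_BPF_64_64 shown in the Type column.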

But if we look at the already loaded program, we see a pointer to the correct map (line 4):

$ sudo bpftool prog dump x name simple
int simple(void *ctx):
   0: (85) call bpf_get_smp_processor_id#114128
   1: (63) *(u32 *)(r10 -4) = r0
   2: (bf) r2 = r10
   3: (07) r2 += -4
   4: (18) r1 = map[id:64]
...

Thus we can conclude that when our loader program ran, the reference to &woo was replaced with something by the libbpf library. First, let's look at the strace output:

$ sudo strace -e bpf ./xdp-simple
...
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=8, map_name="woo", ...}, 120) = 4
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, prog_name="simple", ...}, 120) = 5

We see that libbpf created the map woo and then loaded our program simple. Let's look more closely at how the program is loaded:

  • we call xdp_simple_bpf__open_and_load from the file xdp-simple.skel.h
  • which calls xdp_simple_bpf__load from the file xdp-simple.skel.h
  • which calls bpf_object__load_skeleton from the file libbpf/src/libbpf.c
  • which calls bpf_object__load_xattr from libbpf/src/libbpf.c

The last function, among other things, calls bpf_object__create_maps, which creates or opens existing maps, turning them into file descriptors. (This is where we see BPF_MAP_CREATE in the strace output.) Next the function bpf_object__relocate is called, and it is the one that interests us, since we remember seeing woo in the relocation table. Digging into it, we eventually end up in the function bpf_program__relocate, which handles map relocations:

case RELO_LD64:
    insn[0].src_reg = BPF_PSEUDO_MAP_FD;
    insn[0].imm = obj->maps[relo->map_idx].fd;
    break;

So, we take our instruction

18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll

and replace its source register with BPF_PSEUDO_MAP_FD, and the first IMM with the file descriptor of our map. If that descriptor were equal to, say, 0xdeadbeef, we would end up with the instruction bytes

18 11 00 00 ef be ad de 00 00 00 00 00 00 00 00

This is how map information is passed into a specific loaded BPF program. The map can either be created with BPF_MAP_CREATE or opened by ID with BPF_MAP_GET_FD_BY_ID.
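The patching step can be replayed in user space. The sketch below is not libbpf code: insn_slot is a hypothetical stand-in for struct bpf_insn that keeps both 4-bit register fields in one byte, and it assumes a little-endian host.

```c
#include <stdint.h>

#define SKETCH_PSEUDO_MAP_FD 1 /* value of BPF_PSEUDO_MAP_FD in linux/bpf.h */

/* Hypothetical mirror of one 8-byte instruction slot. The kernel's
 * struct bpf_insn uses two 4-bit bitfields; a single byte is used here
 * so the layout does not depend on the compiler. */
struct insn_slot {
    uint8_t code;
    uint8_t regs; /* low nibble: dst_reg, high nibble: src_reg */
    int16_t off;
    int32_t imm;
};

/* Mimic the RELO_LD64 case: mark the load as a map reference and
 * store the map's file descriptor in the first immediate. */
static void patch_ld64(struct insn_slot *insn, int32_t map_fd)
{
    insn[0].regs = (uint8_t)((SKETCH_PSEUDO_MAP_FD << 4) | (insn[0].regs & 0x0f));
    insn[0].imm = map_fd;
}
```

Starting from the two slots of r1 = 0 ll (opcode 0x18, dst_reg 1) and patching in 0xdeadbeef turns the second byte into 0x11 and the immediate into the little-endian bytes ef be ad de.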

In total, when libbpf is used, the algorithm is as follows:

  • during compilation, entries are created in the relocation table for references to maps
  • libbpf opens the ELF object file, finds all the maps in use and creates file descriptors for them
  • the file descriptors are loaded into the kernel as part of the LD64 instruction

As you can guess, there is more to come, and we have to look into the kernel. Fortunately, we have a clue: we wrote the value BPF_PSEUDO_MAP_FD into the source register, so we can grep for it, which leads us to the holy of holies, kernel/bpf/verifier.c, where a function with a telling name replaces the file descriptor with the address of a structure of type struct bpf_map:

static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env) {
    ...

    f = fdget(insn[0].imm);
    map = __bpf_map_get(f);
    if (insn->src_reg == BPF_PSEUDO_MAP_FD) {
        addr = (unsigned long)map;
    }
    insn[0].imm = (u32)addr;
    insn[1].imm = addr >> 32;

(the full code is available at the link). So we can extend our algorithm:

  • while loading the program, the verifier checks that the map is used correctly and writes in the address of the corresponding struct bpf_map structure
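The kernel excerpt above stores a 64-bit address in two 32-bit immediates. A minimal sketch of that split and of the reverse operation, with made-up helper names and an arbitrary example address:

```c
#include <stdint.h>

/* Split a 64-bit value across the two immediates of an LD64 pair:
 * low half into the first slot, high half into the second. */
static void split_addr(uint64_t addr, int32_t *imm_lo, int32_t *imm_hi)
{
    *imm_lo = (int32_t)(uint32_t)addr;
    *imm_hi = (int32_t)(uint32_t)(addr >> 32);
}

/* Put the two halves back together. The casts through uint32_t keep a
 * negative immediate from sign-extending into the other half. */
static uint64_t join_addr(int32_t imm_lo, int32_t imm_hi)
{
    return ((uint64_t)(uint32_t)imm_hi << 32) | (uint32_t)imm_lo;
}
```

For example, splitting 0xffff888012345678 yields 0x12345678 for the first immediate and 0xffff8880 for the second, and join_addr returns the original value.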

There is much more to loading an ELF binary with libbpf, but we will discuss that in other articles.

Loading programs and maps without libbpf

As promised, here is an example for readers who want to know how to create and load a program that uses maps without the help of libbpf. This can be useful when you work in an environment for which you cannot build dependencies, or where every bit counts, or when you are writing a program like ply, which generates BPF bytecode on the fly.

To make the logic easier to follow, we will rewrite our xdp-simple example for this purpose. The complete and slightly extended code of the program discussed here is available at the link.

The logic of our application is as follows:

  • create a map of type BPF_MAP_TYPE_ARRAY using the BPF_MAP_CREATE command,
  • create a program that uses this map,
  • attach the program to the lo interface,

which, put into human terms, translates to

int main(void)
{
    int map_fd, prog_fd;

    map_fd = map_create();
    if (map_fd < 0)
        err(1, "bpf: BPF_MAP_CREATE");

    prog_fd = prog_load(map_fd);
    if (prog_fd < 0)
        err(1, "bpf: BPF_PROG_LOAD");

    xdp_attach(1, prog_fd);
}

Here map_create creates the map the same way we did in the first example about the bpf system call: "kernel, please make me a new map in the form of an array of 8 elements of type __u64 and give me back a file descriptor":

static int map_create(void)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.map_type = BPF_MAP_TYPE_ARRAY;
    attr.key_size = sizeof(__u32);
    attr.value_size = sizeof(__u64);
    attr.max_entries = 8;
    strncpy(attr.map_name, "woo", sizeof(attr.map_name));
    return syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
}

Loading the program is also simple:

static int prog_load(int map_fd)
{
    union bpf_attr attr;
    struct bpf_insn insns[] = {
        ...
    };

    memset(&attr, 0, sizeof(attr));
    attr.prog_type = BPF_PROG_TYPE_XDP;
    attr.insns     = ptr_to_u64(insns);
    attr.insn_cnt  = sizeof(insns)/sizeof(insns[0]);
    attr.license   = ptr_to_u64("GPL");
    strncpy(attr.prog_name, "woo", sizeof(attr.prog_name));
    return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
}
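The listing uses ptr_to_u64 without showing it. One plausible definition is sketched below; it is an assumption about the helper, written with the stdint types rather than the kernel's __u64.

```c
#include <stdint.h>

/* Assumed definition: union bpf_attr carries user-space pointers as
 * 64-bit integers, so the layout is the same for 32- and 64-bit callers. */
static inline uint64_t ptr_to_u64(const void *ptr)
{
    return (uint64_t)(uintptr_t)ptr;
}
```

Going through uintptr_t first avoids a compiler warning on 32-bit targets, where a pointer is narrower than 64 bits.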

The tricky part of prog_load is defining our BPF program as an array of structures struct bpf_insn insns[]. But since we are using a program that we already have in C, we can cheat a little:

$ llvm-objdump -D --section xdp/simple xdp-simple.bpf.o

0000000000000000 <simple>:
       0:       85 00 00 00 08 00 00 00 call 8
       1:       63 0a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r0
       2:       bf a2 00 00 00 00 00 00 r2 = r10
       3:       07 02 00 00 fc ff ff ff r2 += -4
       4:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
       6:       85 00 00 00 01 00 00 00 call 1
       7:       b7 01 00 00 00 00 00 00 r1 = 0
       8:       15 00 04 00 00 00 00 00 if r0 == 0 goto +4 <LBB0_2>
       9:       61 01 00 00 00 00 00 00 r1 = *(u32 *)(r0 + 0)
      10:       07 01 00 00 01 00 00 00 r1 += 1
      11:       63 10 00 00 00 00 00 00 *(u32 *)(r0 + 0) = r1
      12:       b7 01 00 00 02 00 00 00 r1 = 2

0000000000000068 <LBB0_2>:
      13:       bf 10 00 00 00 00 00 00 r0 = r1
      14:       95 00 00 00 00 00 00 00 exit

In total, we need to write 14 instructions as structures of type struct bpf_insn; they occupy 15 slots, since the 64-bit load takes two. (Hint: take the dump above, reread the section on instructions, open linux/bpf.h and linux/bpf_common.h, and try to work out struct bpf_insn insns[] yourself):

struct bpf_insn insns[] = {
    /* 85 00 00 00 08 00 00 00 call 8 */
    {
        .code = BPF_JMP | BPF_CALL,
        .imm = 8,
    },

    /* 63 0a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r0 */
    {
        .code = BPF_MEM | BPF_STX,
        .off = -4,
        .src_reg = BPF_REG_0,
        .dst_reg = BPF_REG_10,
    },

    /* bf a2 00 00 00 00 00 00 r2 = r10 */
    {
        .code = BPF_ALU64 | BPF_MOV | BPF_X,
        .src_reg = BPF_REG_10,
        .dst_reg = BPF_REG_2,
    },

    /* 07 02 00 00 fc ff ff ff r2 += -4 */
    {
        .code = BPF_ALU64 | BPF_ADD | BPF_K,
        .dst_reg = BPF_REG_2,
        .imm = -4,
    },

    /* 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll */
    {
        .code = BPF_LD | BPF_DW | BPF_IMM,
        .src_reg = BPF_PSEUDO_MAP_FD,
        .dst_reg = BPF_REG_1,
        .imm = map_fd,
    },
    { }, /* placeholder */

    /* 85 00 00 00 01 00 00 00 call 1 */
    {
        .code = BPF_JMP | BPF_CALL,
        .imm = 1,
    },

    /* b7 01 00 00 00 00 00 00 r1 = 0 */
    {
        .code = BPF_ALU64 | BPF_MOV | BPF_K,
        .dst_reg = BPF_REG_1,
        .imm = 0,
    },

    /* 15 00 04 00 00 00 00 00 if r0 == 0 goto +4 <LBB0_2> */
    {
        .code = BPF_JMP | BPF_JEQ | BPF_K,
        .off = 4,
        .dst_reg = BPF_REG_0, /* the register compared against imm */
        .imm = 0,
    },

    /* 61 01 00 00 00 00 00 00 r1 = *(u32 *)(r0 + 0) */
    {
        .code = BPF_MEM | BPF_LDX,
        .off = 0,
        .src_reg = BPF_REG_0,
        .dst_reg = BPF_REG_1,
    },

    /* 07 01 00 00 01 00 00 00 r1 += 1 */
    {
        .code = BPF_ALU64 | BPF_ADD | BPF_K,
        .dst_reg = BPF_REG_1,
        .imm = 1,
    },

    /* 63 10 00 00 00 00 00 00 *(u32 *)(r0 + 0) = r1 */
    {
        .code = BPF_MEM | BPF_STX,
        .src_reg = BPF_REG_1,
        .dst_reg = BPF_REG_0,
    },

    /* b7 01 00 00 02 00 00 00 r1 = 2 */
    {
        .code = BPF_ALU64 | BPF_MOV | BPF_K,
        .dst_reg = BPF_REG_1,
        .imm = 2,
    },

    /* <LBB0_2>: bf 10 00 00 00 00 00 00 r0 = r1 */
    {
        .code = BPF_ALU64 | BPF_MOV | BPF_X,
        .src_reg = BPF_REG_1,
        .dst_reg = BPF_REG_0,
    },

    /* 95 00 00 00 00 00 00 00 exit */
    {
        .code = BPF_JMP | BPF_EXIT
    },
};

An exercise for those who did not write this themselves: find map_fd.
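One way to check a hand-written entry against the dump is to decode the raw bytes back into fields. The decoder below is a sketch with invented names; it assumes the little-endian layout shown by llvm-objdump, where the destination register sits in the low nibble of the second byte.

```c
#include <stdint.h>

/* Fields of one 8-byte BPF instruction slot, unpacked for inspection. */
struct decoded_insn {
    uint8_t code;
    uint8_t dst_reg;
    uint8_t src_reg;
    int16_t off;
    int32_t imm;
};

/* Decode the byte layout printed by llvm-objdump: opcode, registers
 * (dst in the low nibble), 16-bit offset, 32-bit immediate. */
static struct decoded_insn decode_insn(const uint8_t raw[8])
{
    struct decoded_insn d;

    d.code = raw[0];
    d.dst_reg = raw[1] & 0x0f;
    d.src_reg = raw[1] >> 4;
    d.off = (int16_t)(uint16_t)(raw[2] | (raw[3] << 8));
    d.imm = (int32_t)((uint32_t)raw[4] | ((uint32_t)raw[5] << 8) |
                      ((uint32_t)raw[6] << 16) | ((uint32_t)raw[7] << 24));
    return d;
}
```

Feeding it 63 0a fc ff 00 00 00 00 gives code 0x63, dst_reg 10, src_reg 0 and off -4, which matches the second entry of the array above.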

One undisclosed part of our program remains: xdp_attach. Unfortunately, programs such as XDP cannot be attached with the bpf system call. The people who created BPF and XDP came from the Linux networking community, which means they used the interface most familiar to them (but not to normal people) for talking to the kernel: netlink sockets, see also RFC3549. The simplest way to implement xdp_attach is to copy code from libbpf, namely from the file netlink.c, which is what we did, shortening it a little.

Welcome to the world of netlink sockets

Open a netlink socket of type NETLINK_ROUTE:

int netlink_open(__u32 *nl_pid)
{
    struct sockaddr_nl sa;
    socklen_t addrlen;
    int one = 1;
    int sock;

    memset(&sa, 0, sizeof(sa));
    sa.nl_family = AF_NETLINK;

    sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (sock < 0)
        err(1, "socket");

    if (setsockopt(sock, SOL_NETLINK, NETLINK_EXT_ACK, &one, sizeof(one)) < 0)
        warnx("netlink error reporting not supported");

    if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0)
        err(1, "bind");

    addrlen = sizeof(sa);
    if (getsockname(sock, (struct sockaddr *)&sa, &addrlen) < 0)
        err(1, "getsockname");

    *nl_pid = sa.nl_pid;
    return sock;
}

We read from this socket:

static int bpf_netlink_recv(int sock, __u32 nl_pid, int seq)
{
    bool multipart = true;
    struct nlmsgerr *errm;
    struct nlmsghdr *nh;
    char buf[4096];
    int len, ret;

    while (multipart) {
        multipart = false;
        len = recv(sock, buf, sizeof(buf), 0);
        if (len < 0)
            err(1, "recv");

        if (len == 0)
            break;

        for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
                nh = NLMSG_NEXT(nh, len)) {
            if (nh->nlmsg_pid != nl_pid)
                errx(1, "wrong pid");
            if (nh->nlmsg_seq != seq)
                errx(1, "INVSEQ");
            if (nh->nlmsg_flags & NLM_F_MULTI)
                multipart = true;
            switch (nh->nlmsg_type) {
                case NLMSG_ERROR:
                    errm = (struct nlmsgerr *)NLMSG_DATA(nh);
                    if (!errm->error)
                        continue;
                    ret = errm->error;
                    // libbpf_nla_dump_errormsg(nh); too much code to copy...
                    goto done;
                case NLMSG_DONE:
                    return 0;
                default:
                    break;
            }
        }
    }
    ret = 0;
done:
    return ret;
}

Finally, here is our function that opens a socket and sends it a special message containing the file descriptor:

static int xdp_attach(int ifindex, int prog_fd)
{
    int sock, seq = 0, ret;
    struct nlattr *nla, *nla_xdp;
    struct {
        struct nlmsghdr  nh;
        struct ifinfomsg ifinfo;
        char             attrbuf[64];
    } req;
    __u32 nl_pid = 0;

    sock = netlink_open(&nl_pid);
    if (sock < 0)
        return sock;

    memset(&req, 0, sizeof(req));
    req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
    req.nh.nlmsg_type = RTM_SETLINK;
    req.nh.nlmsg_pid = 0;
    req.nh.nlmsg_seq = ++seq;
    req.ifinfo.ifi_family = AF_UNSPEC;
    req.ifinfo.ifi_index = ifindex;

    /* started nested attribute for XDP */
    nla = (struct nlattr *)(((char *)&req)
            + NLMSG_ALIGN(req.nh.nlmsg_len));
    nla->nla_type = NLA_F_NESTED | IFLA_XDP;
    nla->nla_len = NLA_HDRLEN;

    /* add XDP fd */
    nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
    nla_xdp->nla_type = IFLA_XDP_FD;
    nla_xdp->nla_len = NLA_HDRLEN + sizeof(int);
    memcpy((char *)nla_xdp + NLA_HDRLEN, &prog_fd, sizeof(prog_fd));
    nla->nla_len += nla_xdp->nla_len;

    /* if user passed in any flags, add those too */
    __u32 flags = XDP_FLAGS_SKB_MODE;
    nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
    nla_xdp->nla_type = IFLA_XDP_FLAGS;
    nla_xdp->nla_len = NLA_HDRLEN + sizeof(flags);
    memcpy((char *)nla_xdp + NLA_HDRLEN, &flags, sizeof(flags));
    nla->nla_len += nla_xdp->nla_len;

    req.nh.nlmsg_len += NLA_ALIGN(nla->nla_len);

    if (send(sock, &req, req.nh.nlmsg_len, 0) < 0)
        err(1, "send");
    ret = bpf_netlink_recv(sock, nl_pid, seq);

    close(sock);
    return ret;
}
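The attribute arithmetic inside xdp_attach is easier to follow in isolation. The sketch below builds only the nested IFLA_XDP attribute into a caller-supplied buffer; put_u32_attr and build_xdp_attr are names invented here, and the code assumes Linux UAPI headers are installed.

```c
#include <linux/if_link.h>
#include <linux/netlink.h>
#include <string.h>

/* Write one attribute with a 4-byte payload at `pos` and return its
 * aligned length. */
static int put_u32_attr(char *pos, unsigned short type, __u32 value)
{
    struct nlattr *nla = (struct nlattr *)pos;

    nla->nla_type = type;
    nla->nla_len = NLA_HDRLEN + sizeof(value);
    memcpy(pos + NLA_HDRLEN, &value, sizeof(value));
    return NLA_ALIGN(nla->nla_len);
}

/* Build the nested IFLA_XDP attribute into `buf` (which must be 4-byte
 * aligned and large enough) and return its total length in bytes. */
static int build_xdp_attr(char *buf, int prog_fd, __u32 flags)
{
    struct nlattr *nest = (struct nlattr *)buf;
    int len = NLA_HDRLEN; /* header of the nested container */

    len += put_u32_attr(buf + len, IFLA_XDP_FD, (__u32)prog_fd);
    len += put_u32_attr(buf + len, IFLA_XDP_FLAGS, flags);

    nest->nla_type = NLA_F_NESTED | IFLA_XDP;
    nest->nla_len = (unsigned short)len;
    return len;
}
```

With one file descriptor and one flags word, the container comes to 20 bytes: a 4-byte header plus two 8-byte attributes. That is the amount the function above adds to nlmsg_len before sending.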

So, everything is ready for testing:

$ cc nolibbpf.c -o nolibbpf
$ sudo strace -e bpf ./nolibbpf
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, map_name="woo", ...}, 72) = 3
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, insn_cnt=15, prog_name="woo", ...}, 72) = 4
+++ exited with 0 +++

Let's check whether our program attached to lo:

$ ip l show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    prog/xdp id 160

Let's send some pings and look at the map:

$ for s in `seq 234`; do sudo ping -f -c 100 127.0.0.1 >/dev/null 2>&1; done
$ sudo bpftool m dump name woo
key: 00 00 00 00  value: 90 01 00 00 00 00 00 00
key: 01 00 00 00  value: 00 00 00 00 00 00 00 00
key: 02 00 00 00  value: 00 00 00 00 00 00 00 00
key: 03 00 00 00  value: 00 00 00 00 00 00 00 00
key: 04 00 00 00  value: 00 00 00 00 00 00 00 00
key: 05 00 00 00  value: 00 00 00 00 00 00 00 00
key: 06 00 00 00  value: 40 b5 00 00 00 00 00 00
key: 07 00 00 00  value: 00 00 00 00 00 00 00 00
Found 8 elements

Hurray, everything works. Note, by the way, that our map is again displayed as raw bytes. This is because, unlike libbpf, we did not load type information (BTF). But we will talk more about that next time.

Development tools

In this section we look at a minimal BPF developer toolkit.

Generally speaking, you do not need anything special to develop BPF programs: BPF runs on any decent distribution kernel, and programs are built with clang, which can be installed from a package. However, because BPF is under active development, the kernel and the tools change constantly, so if you do not want to write BPF programs with the old-fashioned methods of 2019, you will have to build:

  • llvm/clang
  • pahole
  • your own kernel
  • bpftool

(For reference, this section and all the examples in the article were run on Debian 10.)

llvm/clang

BPF is friends with LLVM and, although programs for BPF can lately also be compiled with gcc, all current development is done for LLVM. So, first of all, let's build the current version of clang from git:

$ sudo apt install ninja-build
$ git clone --depth 1 https://github.com/llvm/llvm-project.git
$ mkdir -p llvm-project/llvm/build/install
$ cd llvm-project/llvm/build
$ cmake .. -G "Ninja" -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
                      -DLLVM_ENABLE_PROJECTS="clang" \
                      -DBUILD_SHARED_LIBS=OFF \
                      -DCMAKE_BUILD_TYPE=Release \
                      -DLLVM_BUILD_RUNTIME=OFF
$ time ninja
... much later
$

Now we can check that everything was built correctly:

$ ./bin/llc --version
LLVM (http://llvm.org/):
  LLVM version 11.0.0git
  Optimized build.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: znver1

  Registered Targets:
    bpf    - BPF (host endian)
    bpfeb  - BPF (big endian)
    bpfel  - BPF (little endian)
    x86    - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64

(I took the clang build instructions from bpf_devel_QA.)

We will not install the programs we just built; instead we simply add them to PATH, for example:

export PATH="`pwd`/bin:$PATH"

(This can be added to .bashrc or to a separate file. Personally, I put things like this into ~/bin/activate-llvm.sh and run . activate-llvm.sh when needed.)

Pahole and BTF

The pahole utility is used when building the kernel to generate debug information in the BTF format. We will not go into the details of BTF technology in this article, beyond the fact that it is convenient and we want to use it. So if you are going to build your own kernel, build pahole first (without pahole you will not be able to build the kernel with the CONFIG_DEBUG_INFO_BTF option):

$ git clone https://git.kernel.org/pub/scm/devel/pahole/pahole.git
$ cd pahole/
$ sudo apt install cmake
$ mkdir build
$ cd build/
$ cmake -D__LIB=lib ..
$ make
$ sudo make install
$ which pahole
/usr/local/bin/pahole

Kernels for experimenting with BPF

When exploring the possibilities of BPF, I like to build my own kernel. Generally speaking, this is not necessary, since you can compile and load BPF programs on a distribution kernel. However, having your own kernel lets you use the very latest BPF features, which will appear in your distribution in months at best or, as with some debugging tools, will not be packaged at all in the foreseeable future. Besides, your own kernel is a good place to experiment with the code.

To build a kernel you need, first, the kernel itself and, second, a kernel configuration file. To experiment with BPF we can use the usual vanilla kernel or one of the development kernels. Historically, BPF development takes place within the Linux networking community, so sooner or later all changes go through David Miller, the Linux networking maintainer. Depending on their nature (fixes or new features), networking changes land in one of two kernels: net or net-next. Changes for BPF are distributed in the same way between bpf and bpf-next, which are then merged into net and net-next, respectively. For more details, see bpf_devel_QA and netdev-FAQ. So choose a kernel according to your taste and the stability needs of the system you are testing on (*-next kernels are the most unstable of those listed).

Managing kernel configuration files is beyond the scope of this article: it is assumed that you either already know how to do this or are ready to learn on your own. However, the following instructions should be more or less enough to give you a working BPF-enabled system.

Download one of the kernels above:

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
$ cd bpf-next

Build a minimal working kernel config:

$ cp /boot/config-`uname -r` .config
$ make localmodconfig

Enable the BPF options in the .config file to your own taste (most likely CONFIG_BPF itself will already be enabled, since systemd uses it). Here is the list of options from the kernel used for this article:

CONFIG_CGROUP_BPF=y
CONFIG_BPF=y
CONFIG_BPF_LSM=y
CONFIG_BPF_SYSCALL=y
CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
CONFIG_IPV6_SEG6_BPF=y
# CONFIG_NETFILTER_XT_MATCH_BPF is not set
# CONFIG_BPFILTER is not set
CONFIG_NET_CLS_BPF=y
CONFIG_NET_ACT_BPF=y
CONFIG_BPF_JIT=y
CONFIG_BPF_STREAM_PARSER=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_BPF_KPROBE_OVERRIDE=y
CONFIG_DEBUG_INFO_BTF=y

Then we can easily build and install the modules and the kernel (by the way, you can build the kernel with the freshly built clang by adding CC=clang):

$ make -s -j $(getconf _NPROCESSORS_ONLN)
$ sudo make modules_install
$ sudo make install

and reboot into the new kernel (I use kexec from the kexec-tools package for this):

v=5.8.0-rc6+ # if you are rebuilding the current kernel, you can use v=`uname -r`
sudo kexec -l -t bzImage /boot/vmlinuz-$v --initrd=/boot/initrd.img-$v --reuse-cmdline &&
sudo kexec -e

bpftool

The most used utility in this article is bpftool, which is shipped as part of the Linux kernel. It is written and maintained by BPF developers for BPF developers and can be used to manage all kinds of BPF objects: load programs, create and modify maps, explore the life of the BPF ecosystem, and so on. Documentation in the form of man page sources can be found in the kernel tree or, already compiled, on the web.

At the time of writing, bpftool comes ready-made only for RHEL, Fedora and Ubuntu (see, for example, this thread, which tells the unfinished story of packaging bpftool in Debian). But if you have already built your kernel, then building bpftool is as easy as pie:

$ cd ${linux}/tools/bpf/bpftool
# ... set up the paths to the latest clang, as described above
$ make -s

Auto-detecting system features:
...                        libbfd: [ on  ]
...        disassembler-four-args: [ on  ]
...                          zlib: [ on  ]
...                        libcap: [ on  ]
...               clang-bpf-co-re: [ on  ]

Auto-detecting system features:
...                        libelf: [ on  ]
...                          zlib: [ on  ]
...                           bpf: [ on  ]

$

(Here ${linux} is your kernel directory.) After running these commands, bpftool will be built in the directory ${linux}/tools/bpf/bpftool, and it can be added to the path (first of all for the root user) or simply copied to /usr/local/sbin.

It is best to build bpftool with the latest clang, built as described above, and to check that it was built correctly, for example with the command

$ sudo bpftool feature probe kernel
Scanning system configuration...
bpf() syscall for unprivileged users is enabled
JIT compiler is enabled
JIT compiler hardening is disabled
JIT compiler kallsyms exports are enabled for root
...

which shows which BPF features are enabled in your kernel.

By the way, the previous command can be run as

# bpftool f p k

This is done by analogy with the utilities from the iproute2 package, where we can, for example, say ip a s eth0 instead of ip addr show dev eth0.
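The idea behind such abbreviations is plain prefix matching against a table of known subcommands. The helper below is a hypothetical illustration of that idea, not code taken from bpftool or iproute2, and real tools may resolve ambiguous prefixes differently.

```c
#include <stddef.h>
#include <string.h>

/* Return the first command in `cmds` that starts with `word`, or NULL.
 * Hypothetical helper: the first table entry wins when several match. */
static const char *match_cmd(const char *word, const char *const cmds[], size_t n)
{
    size_t len = strlen(word);

    for (size_t i = 0; i < n; i++)
        if (len > 0 && strncmp(cmds[i], word, len) == 0)
            return cmds[i];
    return NULL;
}
```

With a table such as { "prog", "map", "feature" }, the word f resolves to feature and p resolves to prog, while an unknown word returns NULL. Because the first match wins, the order of the table decides what a one-letter abbreviation means.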

Conclusion

BPF lets you shoe a flea, so to speak: precisely measure and change the behavior of the kernel. The system turned out to be very successful, in the best traditions of UNIX: a simple mechanism that lets you (re)program the kernel has allowed a huge number of people and organizations to experiment. And although the experiments, as well as the development of the BPF infrastructure itself, are far from finished, the system already has a stable ABI that lets you build reliable and, most importantly, efficient business logic.

I would like to note that, in my opinion, the technology has become so popular because, on the one hand, you can play with it (the architecture of the machine can be understood in more or less one evening), and on the other hand, it solves problems that could not be solved (elegantly) before it appeared. These two components together push people to experiment and dream, which leads to more and more innovative solutions.

This article, though not particularly short, is only an introduction to the world of BPF and does not describe "advanced" features and important parts of the architecture. The plan going forward is roughly this: the next article will be an overview of BPF program types (the 5.8 kernel supports 30 program types), then we will finally look at writing real BPF applications using kernel tracing programs as an example, then it will be time for a more in-depth course on the BPF architecture, followed by examples of BPF networking and security applications.

Previous articles in this series

  1. BPF for the little ones, part zero: classic BPF

Links

  1. BPF and XDP Reference Guide: documentation on BPF from cilium, or more precisely from Daniel Borkmann, one of the creators and maintainers of BPF. This is one of the first serious descriptions, and it differs from the rest in that Daniel knows exactly what he is writing about and there are no blunders in it. In particular, this document describes how to work with BPF programs of the XDP and TC types using the well-known ip utility from the iproute2 package.

  2. Documentation/networking/filter.txt: the original file with documentation for classic and then extended BPF. A good read if you want to dig into the assembly language and the technical details of the architecture.

  3. Blog about BPF from facebook. It is updated rarely, but aptly, since Alexei Starovoitov (the author of eBPF) and Andrii Nakryiko (the maintainer of libbpf) write there.

  4. Secrets of bpftool. An entertaining twitter thread from Quentin Monnet with examples and secrets of using bpftool.

  5. Dive into BPF: a list of reading material. A giant (and still maintained) list of links to BPF documentation from Quentin Monnet.

Source: www.habr.com
