In the beginning there was a technology, and it was called BPF. We have looked at it before.
Roughly speaking, BPF lets you run arbitrary user-supplied code in Linux kernel space, and the new architecture turned out to be so successful that it will take many more articles to describe all of its applications. (The only thing the developers did not manage was to create a decent logo.)
This article describes the structure of the BPF virtual machine, the kernel interfaces for working with BPF, the development tools, and a brief overview of the existing capabilities, that is, everything we will need later for a deeper study of the practical applications of BPF.
Article contents
Managing objects with the bpf(2) system call.
Writing BPF programs with libbpf. We build a basic BPF application skeleton that is used in the later examples.
Introduction to the BPF architecture
Before we start examining the BPF architecture, let us refer one last time (or is it?) to classic BPF.
The new BPF was developed in response to the spread of 64-bit machines and cloud services, and to the growing need for tools to build SDN (Software-defined networking). Developed by kernel network engineers as an improved replacement for classic BPF, the new BPF found use within months in the hard task of tracing Linux systems, and now, several years after its appearance, we would need a whole separate article just to list the different types of programs.
Funny pictures
At its core, BPF is a sandboxed virtual machine that lets you run "arbitrary" code in kernel space without compromising security. BPF programs are created in user space, loaded into the kernel, and attached to some event source. An event can be, for example, the delivery of a packet to a network interface or the start of some kernel function. In the case of a packet, the BPF program has access to the packet's data and metadata (for reading and possibly writing, depending on the program type). In the case of a kernel function, it has access to the function's arguments, including pointers to kernel memory.
Let's look at this process in more detail. First, the first difference from classic BPF, where programs were written in assembler: in the new version the architecture was extended so that programs can be written in high-level languages, primarily C. For this purpose an llvm backend was developed that can generate bytecode for the BPF architecture.
The BPF architecture was designed, in part, to run efficiently on modern machines. To make this work in practice, BPF bytecode, once loaded into the kernel, is translated into native code by a component called the JIT compiler (Just In Time). Next, as you may remember, in classic BPF the program was loaded into the kernel and attached to the event source atomically, in the context of a single system call. In the new architecture this happens in two stages: first the code is loaded into the kernel with the bpf(2) system call, and later, through other mechanisms that depend on the program type, the program is attached to the event source.
Here the reader may ask: is that even possible? How is the safety of executing such code guaranteed? Safety is guaranteed by a stage of BPF program loading called the verifier.
The verifier is a static analyzer that guarantees that a program will not disrupt the normal operation of the kernel. This, by the way, does not mean that the program cannot interfere with the operation of the system: depending on their type, BPF programs can read and rewrite sections of kernel memory, change the return values of functions, and trim, extend, rewrite, and even forward network packets. The verifier guarantees that running a BPF program will not crash the kernel, and that a program which, by the rules, has write access to, say, the data of an outgoing packet cannot overwrite kernel memory outside that packet. We will look at the verifier in a bit more detail in the corresponding section, after we have met all the other components of BPF.
So what have we learned so far? The user writes a program in C and loads it into the kernel with the bpf(2) system call, where it is checked by the verifier and translated into native code. Then the same or another user attaches the program to an event source, and it starts executing. Loading and attaching are separated for several reasons. First, running the verifier is relatively expensive, and loading the same program several times wastes computer time. Second, exactly how a program is attached depends on its type, and a single "universal" interface designed some time ago may not suit new types of programs. (Although now that the architecture is maturing, there is an idea to unify this interface at the libbpf level.)
The attentive reader may notice that we are not done with the pictures yet. Indeed, nothing said above explains why BPF fundamentally changes the picture compared with classic BPF. The two innovations that greatly widen its applicability are shared memory and the ability to use kernel helper functions. In BPF, shared memory is implemented with so-called maps: shared data structures with a specific API. They probably got this name because the first type of map to appear was a hash table. Later came arrays, per-CPU hash tables and per-CPU arrays, search trees, maps containing pointers to BPF programs, and much more. What matters to us now is that BPF programs gained the ability to keep state between invocations and to share it with other programs and with user space.
Maps are accessed from user processes with the bpf(2) system call, and from BPF programs running in the kernel with helper functions. Moreover, helpers exist not only for working with maps but also for accessing other kernel capabilities. For example, BPF programs can use helper functions to forward packets to other interfaces, generate perf events, access kernel structures, and so on.
In summary, BPF provides the ability to load arbitrary, that is, verifier-checked, user code into kernel space. This code can keep state between invocations and exchange data with user space, and it also has access to the kernel subsystems permitted for its program type.
This already resembles the capabilities provided by kernel modules, compared with which BPF has some advantages (of course, only similar applications can be compared, such as system tracing; you cannot write an arbitrary driver in BPF). These include a lower entry threshold (some utilities that use BPF do not require kernel programming skills from the user, or programming skills at all), runtime safety (raise your hand in the comments if you have never broken a system while writing or testing modules), and atomicity: reloading a module involves downtime, whereas the BPF subsystem guarantees that no event is missed (to be fair, this is not true for all types of BPF programs).
These capabilities make BPF a universal tool for extending the kernel, and this is confirmed in practice: more and more program types are added to BPF, more and more large companies use BPF on production servers 24x7, and more and more startups build their business on BPF-based solutions. BPF is used everywhere: for protection against DDoS attacks, for building SDN (for example, implementing networking for Kubernetes), as the main system tracing tool and statistics collector, in intrusion detection systems, in sandboxing systems, and so on.
Let's finish the overview part of the article here and look at the virtual machine and the BPF ecosystem in more detail.
Digression: utilities
To be able to run the examples in the following sections, you may need at least a few utilities: llvm/clang with BPF support, and bpftool.
Registers and instruction set of the BPF virtual machine
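The architecture and instruction set of BPF were designed with the fact in mind that programs would be written in C and, after being loaded into the kernel, translated into native code. Therefore the number of registers and the set of instructions were chosen with an eye to the intersection, in the mathematical sense, of the capabilities of modern machines. In addition, various restrictions were imposed on programs: for example, until recently it was impossible to write loops and subroutines, and the number of instructions was limited to 4096 (privileged programs can now load up to a million instructions).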
BPF has eleven user-accessible 64-bit registers, r0-r10, and a program counter. Register r10 contains the frame pointer and is read-only. At run time programs have access to a 512-byte stack and to an unlimited amount of shared memory in the form of maps.
BPF programs may call a set of kernel helpers specific to the program type and, more recently, ordinary functions. Each called function can take up to five arguments, passed in registers r1-r5, and the return value is passed in r0. It is guaranteed that after returning from a function the contents of registers r6-r9 are unchanged.
For efficient translation of programs, registers r0-r11 are mapped one-to-one to real registers on every supported architecture, taking into account the ABI of that architecture. For example, on x86_64 registers r1-r5, which are used to pass function parameters, are mapped to rdi, rsi, rdx, rcx, r8, the registers used to pass parameters to functions on x86_64. The code on the left, for example, is translated into the code on the right:
1: (b7) r1 = 1 mov $0x1,%rdi
2: (b7) r2 = 2 mov $0x2,%rsi
3: (b7) r3 = 3 mov $0x3,%rdx
4: (b7) r4 = 4 mov $0x4,%rcx
5: (b7) r5 = 5 mov $0x5,%r8
6: (85) call pc+1 callq 0x0000000000001ee8
Register r0 is also used to return the result of program execution, and register r1 passes the program a pointer to its context. Depending on the program type this can be, for example, a struct xdp_md, a struct __sk_buff, or a struct pt_regs.
So, we had a set of registers, kernel helpers, a stack, a context pointer, and shared memory in the form of maps. Not that all of this is absolutely necessary for the trip, but...
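Let's continue the description and talk about the instruction set for working with these objects. All (well, almost all) BPF instructions have a fixed size of 64 bits and consist of five fields: Code, Dst, Src, Off, and Imm. Here Code is the instruction encoding (8 bits), Dst and Src are the encodings of the destination and source registers (4 bits each), Off is a 16-bit signed offset, and Imm is a 32-bit signed integer used in some instructions (similar to the cBPF constant K). The Code encoding takes one of two forms, and in both the three least significant bits hold the instruction class. For the memory classes the remaining bits encode the access mode and the operand size; for the ALU and jump classes they encode a 4-bit operation and a 1-bit source flag S.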
Instruction classes 0, 1, 2, 3 define the commands for working with memory. They are called BPF_LD, BPF_LDX, BPF_ST, and BPF_STX, respectively. Classes 4 and 7 (BPF_ALU, BPF_ALU64) make up the set of ALU instructions. Classes 5 and 6 (BPF_JMP, BPF_JMP32) contain the jump instructions.
The further plan for studying the BPF instruction set is as follows: instead of meticulously listing all instructions and their parameters, we will look at a couple of examples in this section, from which it will become clear how instructions actually work and how to disassemble any BPF binary file by hand. To consolidate the material, later in the article we will meet individual instructions again in the sections on the verifier, the JIT compiler, and the translation of classic BPF, as well as when studying maps, function calls, and so on.
When we talk about individual instructions, we will refer to the kernel header files bpf.h and bpf_common.h.
Example: disassembling BPF in your head
Let's look at an example in which we compile the program readelf-example.c and examine the resulting binary. We will reveal the original contents of readelf-example.c below, after we have recovered its logic from the binary code:
$ clang -target bpf -c readelf-example.c -o readelf-example.o -O2
$ llvm-readelf -x .text readelf-example.o
Hex dump of section '.text':
0x00000000 b7000000 01000000 15010100 00000000 ................
0x00000010 b7000000 02000000 95000000 00000000 ................
The first column of the readelf output is the offset, so our program consists of four instructions:
Code Src Dst Off Imm
b7 0 0 0000 01000000
15 0 1 0100 00000000
b7 0 0 0000 02000000
95 0 0 0000 00000000
The instruction codes are b7, 15, b7, and 95. Recall that the three least significant bits are the instruction class. In our case the fourth bit of every instruction is zero, so the instruction classes are 7, 5, 7, 5, respectively. Class 7 is BPF_ALU64, and class 5 is BPF_JMP. The instruction format is the same for both classes (see above), so we can rewrite our program like this (rewriting the remaining columns in human-readable form at the same time):
Op S Class Src Dst Off Imm
b 0 ALU64 0 0 0 1
1 0 JMP 0 1 1 0
b 0 ALU64 0 0 0 2
9 0 JMP 0 0 0 0
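Operation b of class ALU64 is BPF_MOV. It assigns a value to the destination register. If the S (source) bit is set, the value is taken from the source register; if, as in our case, it is not set, the value is taken from the Imm field. So the first and third instructions perform the operation r0 = Imm. Next, operation 1 of class JMP is BPF_JEQ (jump if equal). In our case, since the S bit is zero, it compares the value of the destination register (here r1) with the Imm field. If the values are equal, a jump to PC + Off occurs, where PC, as usual, holds the address of the next instruction. Finally, operation 9 of class JMP is BPF_EXIT. This instruction terminates the program and returns r0 to the kernel. Let's add a new column to our table: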
Op S Class Src Dst Off Imm Disassm
MOV 0 ALU64 0 0 0 1 r0 = 1
JEQ 0 JMP 0 1 1 0 if (r1 == 0) goto pc+1
MOV 0 ALU64 0 0 0 2 r0 = 2
EXIT 0 JMP 0 0 0 0 exit
We can rewrite this in a more convenient form:
r0 = 1
if (r1 == 0) goto END
r0 = 2
END:
exit
If we recall that register r1 holds the pointer to the context passed to the program by the kernel, and that register r0 holds the value returned to the kernel, we can see that the program returns 1 if the context pointer is zero and 2 otherwise. Let's check that we are right by looking at the source:
$ cat readelf-example.c
int foo(void *ctx)
{
return ctx ? 2 : 1;
}
Yes, it is a meaningless program, but it translates into just four simple instructions.
Exception example: a 16-byte instruction
We mentioned earlier that some instructions take more than 64 bits. This applies, for example, to the lddw instruction (Code = 0x18 = BPF_LD | BPF_DW | BPF_IMM), which loads a double word from the Imm fields into a register. The point is that Imm is 32 bits wide while a double word is 64 bits, so a 64-bit immediate cannot be loaded into a register with a single 64-bit instruction. Instead, two adjacent instructions are used, and the second half of the 64-bit value is stored in the Imm field of the second one. Example:
$ cat x64.c
long foo(void *ctx)
{
return 0x11223344aabbccdd;
}
$ clang -target bpf -c x64.c -o x64.o -O2
$ llvm-readelf -x .text x64.o
Hex dump of section '.text':
0x00000000 18000000 ddccbbaa 00000000 44332211 ............D3".
0x00000010 95000000 00000000 ........
The binary program contains only two instructions:
Binary Disassm
18000000 ddccbbaa 00000000 44332211 r0 = Imm[0]|Imm[1]
95000000 00000000 exit
We will meet the lddw instruction again when we talk about relocations and working with maps.
Example: disassembling BPF with standard tools
So, we have learned to read BPF binary code and are ready to parse any instruction if necessary. It is worth saying, however, that in practice it is more convenient and faster to disassemble programs with standard tools, for example:
$ llvm-objdump -d x64.o
Disassembly of section .text:
0000000000000000 <foo>:
0: 18 00 00 00 dd cc bb aa 00 00 00 00 44 33 22 11 r0 = 1234605617868164317 ll
2: 95 00 00 00 00 00 00 00 exit
Lifecycle of BPF objects, the bpffs file system
(Some of the details described in this subsection were new to me.)
BPF objects (programs and maps) are created from user space with the BPF_PROG_LOAD and BPF_MAP_CREATE commands of the bpf(2) system call; we will talk about exactly how this happens in the next section. This creates kernel data structures, and for each of them the refcount (reference count) is set to one and a file descriptor pointing to the object is returned to the user. When the descriptor is closed, the object's refcount decreases by one, and when it reaches zero the object is destroyed.
If a program uses maps, the refcount of those maps increases by one after the program is loaded, so the user process can close their file descriptors and the refcount will still not drop to zero.
After a program has been loaded successfully, we usually attach it to some event generator. For example, we can put it on a network interface to process incoming packets, or attach it to some tracepoint in the kernel. At this point the reference counter also increases by one, and we can close the file descriptor in the loader program.
What happens if we now shut down the loader? It depends on the type of event generator (hook). All network hooks persist after the loader exits; these are the so-called global hooks. Tracing programs, on the other hand, are released after the process that created them terminates (and are therefore called local, as in "local to the process"). Technically, local hooks always have a corresponding file descriptor in user space and are therefore closed when the process is closed, while global hooks do not. The figure that follows uses red crosses to show how the termination of the loader program affects the lifetime of objects for local and global hooks.
Why is there a distinction between local and global hooks? Running some types of network programs makes sense without any user space. Imagine DDoS protection, for example: the loader writes the rules and attaches the BPF program to a network interface, after which the loader can exit. On the other hand, imagine a debugging trace program that you threw together in a few minutes: when it finishes, you want no garbage left in the system, and local hooks guarantee that.
On the other hand, imagine that you want to attach to a tracepoint in the kernel and collect statistics over many years. In this case you would want to terminate the user part and come back to the statistics from time to time. The bpf file system provides this opportunity. It is an in-memory pseudo file system that allows you to create files that reference BPF objects and thereby increase the refcount of those objects. After that the loader can exit, and the objects it created will stay alive.
Creating files in bpffs that reference BPF objects is called "pinning" (as in the phrase "a process can pin a BPF program or map"). Creating file objects for BPF objects makes sense not only for extending the life of local objects but also for the usability of global ones: going back to the example of the global DDoS protection program, we want to be able to come and look at the statistics from time to time.
The BPF file system is usually mounted at /sys/fs/bpf, but it can also be mounted locally, for example like this:
$ mkdir bpf-mountpoint
$ sudo mount -t bpf none bpf-mountpoint
Names in the file system are created with the BPF_OBJ_PIN command of the bpf system call. To illustrate, let's take a program, compile it, load it, and pin it in bpffs. Our program does nothing useful; we show the code only so that you can reproduce the example:
$ cat test.c
__attribute__((section("xdp"), used))
int test(void *ctx)
{
return 0;
}
char _license[] __attribute__((section("license"), used)) = "GPL";
Let's compile this program and create a local copy of the bpffs file system:
$ clang -target bpf -c test.c -o test.o
$ mkdir bpf-mountpoint
$ sudo mount -t bpf none bpf-mountpoint
Now let's load our program with the bpftool utility and look at the accompanying bpf(2) system calls (some irrelevant lines have been removed from the strace output):
$ sudo strace -e bpf bpftool prog load ./test.o bpf-mountpoint/test
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, prog_name="test", ...}, 120) = 3
bpf(BPF_OBJ_PIN, {pathname="bpf-mountpoint/test", bpf_fd=3}, 120) = 0
Here we loaded the program with BPF_PROG_LOAD, received file descriptor 3 from the kernel, and pinned this descriptor as the file "bpf-mountpoint/test" with the BPF_OBJ_PIN command. After that the loader program, bpftool, exited, but our program stayed in the kernel even though we had not attached it to any network interface:
$ sudo bpftool prog | tail -3
783: xdp name test tag 5c8ba0cf164cb46c gpl
loaded_at 2020-05-05T13:27:08+0000 uid 0
xlated 24B jited 41B memlock 4096B
We can delete the file object in the usual way with unlink(2), after which the corresponding program is deleted as well:
$ sudo rm ./bpf-mountpoint/test
$ sudo bpftool prog show id 783
Error: get by id (783): No such file or directory
Deleting objects
Speaking of deleting objects, it is worth clarifying that once we have detached a program from its hook (event generator), no new event will trigger it, but all current instances of the program will run to completion in the normal order.
Some types of BPF programs allow the program to be replaced on the fly, that is, they provide atomicity for the sequence replace = detach old program, attach new program. In this case all active instances of the old version finish their work, and new event handlers are created from the new program. "Atomicity" here means that no event is missed.
Attaching programs to event sources
In this article we will not describe attaching programs to event sources separately, since it makes more sense to study this in the context of a specific program type.
Managing objects with the bpf system call
BPF programs
All BPF objects are created and managed from user space with the bpf system call, which has the following prototype:
#include <linux/bpf.h>
int bpf(int cmd, union bpf_attr *attr, unsigned int size);
Here the command cmd is one of the values of type enum bpf_cmd, attr is a pointer to the parameters of the specific command, and size is the size of the object behind the pointer, usually sizeof(*attr). In kernel 5.8 the bpf system call supports 34 different commands, and the definition of union bpf_attr takes 200 lines. We should not be frightened by this, though, because we will get to know the commands and parameters gradually over several articles.
Let's start with the BPF_PROG_LOAD command, which creates BPF programs: it takes a set of BPF instructions and loads it into the kernel. At load time the verifier runs, then the JIT compiler, and on success a file descriptor for the program is returned to the user. We saw what happens to it next in the previous section.
We will now write a custom program that loads a simple BPF program, but first we need to decide what kind of program to load. We will take a program of type BPF_PROG_TYPE_XDP that returns the value XDP_PASS (let all packets through). In BPF assembler it looks very simple:
r0 = 2
exit
Having decided what we will load, here is how we will do it:
#define _GNU_SOURCE
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>
static inline __u64 ptr_to_u64(const void *ptr)
{
return (__u64) (unsigned long) ptr;
}
int main(void)
{
struct bpf_insn insns[] = {
{
.code = BPF_ALU64 | BPF_MOV | BPF_K,
.dst_reg = BPF_REG_0,
.imm = XDP_PASS
},
{
.code = BPF_JMP | BPF_EXIT
},
};
union bpf_attr attr = {
.prog_type = BPF_PROG_TYPE_XDP,
.insns = ptr_to_u64(insns),
.insn_cnt = sizeof(insns)/sizeof(insns[0]),
.license = ptr_to_u64("GPL"),
};
strncpy(attr.prog_name, "woo", sizeof(attr.prog_name));
syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
for ( ;; )
pause();
}
The interesting part of the program begins with the definition of the array insns, our BPF program in machine code. Each instruction of the BPF program is packed into a struct bpf_insn. The first element of insns corresponds to the instruction r0 = 2, the second to exit.
A digression. The kernel defines more convenient macros for writing machine code, and with the kernel header file tools/include/linux/filter.h we could write
struct bpf_insn insns[] = {
BPF_MOV64_IMM(BPF_REG_0, XDP_PASS),
BPF_EXIT_INSN()
};
But since writing BPF programs in native code is needed only for writing kernel tests and articles about BPF, the absence of these macros does not really complicate a developer's life.
Having defined the BPF program, we move on to loading it into the kernel. Our minimal set of parameters attr includes the program type, the instructions and their number, the required license, and the name "woo", which we use to find our program in the system after loading. The program is loaded into the system, as promised, with the bpf system call.
At the end of the program we enter an infinite loop that simulates a payload. Without it, the program would be destroyed by the kernel as soon as the file descriptor returned by the bpf system call was closed, and we would not see it in the system.
Well, we are ready to test. Let's build the program and run it under strace to check that everything works as it should:
$ clang -g -O2 simple-prog.c -o simple-prog
$ sudo strace ./simple-prog
execve("./simple-prog", ["./simple-prog"], 0x7ffc7b553480 /* 13 vars */) = 0
...
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, insn_cnt=2, insns=0x7ffe03c4ed50, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="woo", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = 3
pause(
Everything is fine: bpf(2) returned descriptor 3 to us, and we entered the infinite loop with pause(). Let's try to find our program in the system. To do this we switch to another terminal and use the bpftool utility:
# bpftool prog | grep -A3 woo
390: xdp name woo tag 3b185187f1855c4c gpl
loaded_at 2020-08-31T24:66:44+0000 uid 0
xlated 16B jited 40B memlock 4096B
pids simple-prog(10381)
We can see that the system has a loaded program woo whose global ID is 390, and that simple-prog currently holds an open file descriptor pointing to it (and if simple-prog exits, woo will disappear). As expected, the program woo takes 16 bytes (two instructions) of binary code in the BPF architecture, but in native form (x86_64) it already takes 40 bytes. Let's look at our program in its original form:
# bpftool prog dump xlated id 390
0: (b7) r0 = 2
1: (95) exit
No surprises. Now let's look at the code generated by the JIT compiler:
# bpftool prog dump jited id 390
bpf_prog_3b185187f1855c4c_woo:
0: nopl 0x0(%rax,%rax,1)
5: push %rbp
6: mov %rsp,%rbp
9: sub $0x0,%rsp
10: push %rbx
11: push %r13
13: push %r14
15: push %r15
17: pushq $0x0
19: mov $0x2,%eax
1e: pop %rbx
1f: pop %r15
21: pop %r14
23: pop %r13
25: pop %rbx
26: leaveq
27: retq
Not very efficient for an exit(2), but in fairness our program is too simple, and for non-trivial programs the prologue and epilogue added by the JIT compiler are, of course, necessary.
Maps
BPF programs can use structured memory areas that are accessible both to other BPF programs and to programs in user space. These objects are called maps, and in this section we show how to manipulate them with the bpf system call.
Let's say right away that the capabilities of maps are not limited to access to shared memory. There are special-purpose maps containing, for example, pointers to BPF programs or pointers to network interfaces, maps for working with perf events, and so on. We will not discuss them here, so as not to confuse the reader. We also ignore synchronization issues, since they are not important for our examples. The full list of available map types can be found in <linux/bpf.h>; this section uses the hash table BPF_MAP_TYPE_HASH as its example.
If you create a hash table in, say, C++, you write unordered_map<int,long> woo, which in plain language means "I need a table woo of unlimited size whose keys are of type int and whose values are of type long". To create a BPF hash table we need to do much the same, except that we have to specify the maximum size of the table, and instead of specifying the types of keys and values we specify their sizes in bytes. Maps are created with the BPF_MAP_CREATE command of the bpf system call. Let's look at a more or less minimal program that creates a map. After the previous program, which loads BPF programs, this one should look simple:
$ cat simple-map.c
#define _GNU_SOURCE
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>
int main(void)
{
union bpf_attr attr = {
.map_type = BPF_MAP_TYPE_HASH,
.key_size = sizeof(int),
.value_size = sizeof(int),
.max_entries = 4,
};
strncpy(attr.map_name, "woo", sizeof(attr.map_name));
syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
for ( ;; )
pause();
}
Here we define the set of parameters attr, in which we say "I need a hash table with keys and values of size sizeof(int), into which I can put at most four elements". Other parameters can be specified when creating BPF maps; for example, just as in the program example, we set the name of the object to "woo".
Let's compile and run the program:
$ clang -g -O2 simple-map.c -o simple-map
$ sudo strace ./simple-map
execve("./simple-map", ["./simple-map"], 0x7ffd40a27070 /* 14 vars */) = 0
...
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4, max_entries=4, map_name="woo", ...}, 72) = 3
pause(
Here the bpf(2) system call returned the map descriptor number 3, after which the program, as expected, waits for further instructions in the pause(2) system call.
Now let's send our program to the background or open another terminal, and look at our object with the bpftool utility (we can tell our map from the others by its name):
$ sudo bpftool map
...
114: hash name woo flags 0x0
key 4B value 4B max_entries 4 memlock 4096B
...
The number 114 is the global ID of our object. Any program in the system can use this ID to open the existing map with the BPF_MAP_GET_FD_BY_ID command of the bpf system call.
Now we can play with our hash table. Let's look at its contents:
$ sudo bpftool map dump id 114
Found 0 elements
Empty. Let's put the value hash[1] = 1 into it:
$ sudo bpftool map update id 114 key 1 0 0 0 value 1 0 0 0
Let's look at the table again:
$ sudo bpftool map dump id 114
key: 01 00 00 00 value: 01 00 00 00
Found 1 element
Hooray! We managed to add one element. Note that we have to work at the byte level to do this, because bpftool does not know what type the values in the hash table have. (This knowledge can be passed to it with BTF, but more on that later.)
How exactly does bpftool read and add elements? Let's look under the hood:
$ sudo strace -e bpf bpftool map dump id 114
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=114, next_id=0, open_flags=0}, 120) = 3
bpf(BPF_MAP_GET_NEXT_KEY, {map_fd=3, key=NULL, next_key=0x55856ab65280}, 120) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x55856ab65280, value=0x55856ab652a0}, 120) = 0
key: 01 00 00 00 value: 01 00 00 00
bpf(BPF_MAP_GET_NEXT_KEY, {map_fd=3, key=0x55856ab65280, next_key=0x55856ab65280}, 120) = -1 ENOENT
First we opened the map by its global ID with the BPF_MAP_GET_FD_BY_ID command, and bpf(2) returned descriptor 3 to us. Then, with the BPF_MAP_GET_NEXT_KEY command, we found the first key in the table by passing NULL as the pointer to the "previous" key. With the key in hand we can call BPF_MAP_LOOKUP_ELEM, which stores the value at the pointer value. The next step is to try to find the next element by passing a pointer to the current key, but our table contains only one element, so the BPF_MAP_GET_NEXT_KEY command returns ENOENT.
OK, let's change the value at key 1; say our business logic requires hash[1] = 2:
$ sudo strace -e bpf bpftool map update id 114 key 1 0 0 0 value 2 0 0 0
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=114, next_id=0, open_flags=0}, 120) = 3
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x55dcd72be260, value=0x55dcd72be280, flags=BPF_ANY}, 120) = 0
As expected, it is very simple: the BPF_MAP_GET_FD_BY_ID command opens our map by its ID, and the BPF_MAP_UPDATE_ELEM command overwrites the element.
So, after creating a hash table from one program, we can read and write its contents from another. Note that if we were able to do this from the command line, then any other program in the system can do it too. The following commands are available for working with maps from user space:
BPF_MAP_LOOKUP_ELEM: find a value by key
BPF_MAP_UPDATE_ELEM: update or create a value
BPF_MAP_DELETE_ELEM: delete a key
BPF_MAP_GET_NEXT_KEY: find the next (or the first) key
BPF_MAP_GET_NEXT_ID: walk through all existing maps; this is how bpftool map works
BPF_MAP_GET_FD_BY_ID: open an existing map by its global ID
BPF_MAP_LOOKUP_AND_DELETE_ELEM: atomically look up the value of an element and delete it
BPF_MAP_FREEZE: make the map immutable from user space (this operation cannot be undone)
BPF_MAP_LOOKUP_BATCH, BPF_MAP_LOOKUP_AND_DELETE_BATCH, BPF_MAP_UPDATE_BATCH, BPF_MAP_DELETE_BATCH: batch operations. For example, BPF_MAP_LOOKUP_AND_DELETE_BATCH is the only reliable way to read and reset all values of a map
Not all of these commands work for all map types, but in general working with other map types from user space looks exactly like working with hash tables.
For the sake of order, let's finish our experiments with the hash table. Remember that we created a table that can hold up to four keys? Let's add a few more elements:
$ sudo bpftool map update id 114 key 2 0 0 0 value 1 0 0 0
$ sudo bpftool map update id 114 key 3 0 0 0 value 1 0 0 0
$ sudo bpftool map update id 114 key 4 0 0 0 value 1 0 0 0
So far so good:
$ sudo bpftool map dump id 114
key: 01 00 00 00 value: 01 00 00 00
key: 02 00 00 00 value: 01 00 00 00
key: 04 00 00 00 value: 01 00 00 00
key: 03 00 00 00 value: 01 00 00 00
Found 4 elements
Let's try to add one more:
$ sudo bpftool map update id 114 key 5 0 0 0 value 1 0 0 0
Error: update failed: Argument list too long
As expected, we did not succeed. Let's look at the error in more detail:
$ sudo strace -e bpf bpftool map update id 114 key 5 0 0 0 value 1 0 0 0
bpf(BPF_MAP_GET_FD_BY_ID, {map_id=114, next_id=0, open_flags=0}, 120) = 3
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=3, info_len=80, info=0x7ffe6c626da0}}, 120) = 0
bpf(BPF_MAP_UPDATE_ELEM, {map_fd=3, key=0x56049ded5260, value=0x56049ded5280, flags=BPF_ANY}, 120) = -1 E2BIG (Argument list too long)
Error: update failed: Argument list too long
+++ exited with 255 +++
Everything is fine: as expected, the BPF_MAP_UPDATE_ELEM command tries to create a new, fifth key and fails with E2BIG.
So, we can create and load BPF programs, and we can create and manage maps from user space. The logical next step is to look at how maps can be used from BPF programs themselves. We could talk about this in the language of hard-to-read programs written in machine macro codes, but in fact the time has come to show how BPF programs are actually written and maintained: with libbpf.
(For readers dissatisfied with the lack of a low-level example: we will analyze in detail programs that use maps and helper functions created with libbpf, and describe what happens at the instruction level. For readers who are very dissatisfied, an example has been added.)
Writing BPF programs with libbpf
Writing BPF programs in machine code is interesting only at first, and then it gets tiresome. At that point you should turn your attention to llvm, which has a backend that generates code for the BPF architecture, and to the libbpf library, which lets you write the user side of BPF applications and load the code of BPF programs generated with llvm/clang.
In fact, as we will see in this and later articles, libbpf does quite a lot of work, and living without it (or similar tools such as iproute2, libbcc, libbpf-go, and so on) is impossible. One of the killer features of the libbpf project is BPF CO-RE (Compile Once, Run Everywhere), which lets you write BPF programs that are portable from one kernel to another and can run against different APIs (for example, when a kernel structure changes from version to version). To work with CO-RE, your kernel must be compiled with BTF support (how to do this is described in a separate section). You can check for BTF support by looking for the following file:
$ ls -lh /sys/kernel/btf/vmlinux
-r--r--r-- 1 root root 2.6M Jul 29 15:30 /sys/kernel/btf/vmlinux
This file stores information about all data types used in the kernel and is used in all of our examples that use libbpf. We will talk about CO-RE in detail in the next article; for this one, just build yourself a kernel with CONFIG_DEBUG_INFO_BTF.
The libbpf library lives in the tools/lib/bpf directory of the kernel, and its development is conducted through the BPF mailing list. However, a separate repository is maintained for the needs of applications that live outside the kernel.
In this section we will look at how to create a project that uses libbpf, write a few (more or less meaningless) test programs, and analyze in detail how it all works. This will make it easier to explain in the following sections exactly how BPF programs interact with maps, kernel helpers, BTF, and so on.
Projects that use libbpf usually add its GitHub repository as a git submodule, and we will do the same:
$ mkdir /tmp/libbpf-example
$ cd /tmp/libbpf-example/
$ git init-db
Initialized empty Git repository in /tmp/libbpf-example/.git/
$ git submodule add https://github.com/libbpf/libbpf.git
Cloning into '/tmp/libbpf-example/libbpf'...
remote: Enumerating objects: 200, done.
remote: Counting objects: 100% (200/200), done.
remote: Compressing objects: 100% (103/103), done.
remote: Total 3354 (delta 101), reused 118 (delta 79), pack-reused 3154
Receiving objects: 100% (3354/3354), 2.05 MiB | 10.22 MiB/s, done.
Resolving deltas: 100% (2176/2176), done.
Building libbpf is very simple:
$ cd libbpf/src
$ mkdir build
$ OBJDIR=build DESTDIR=root make -s install
$ find root
root
root/usr
root/usr/include
root/usr/include/bpf
root/usr/include/bpf/bpf_tracing.h
root/usr/include/bpf/xsk.h
root/usr/include/bpf/libbpf_common.h
root/usr/include/bpf/bpf_endian.h
root/usr/include/bpf/bpf_helpers.h
root/usr/include/bpf/btf.h
root/usr/include/bpf/bpf_helper_defs.h
root/usr/include/bpf/bpf.h
root/usr/include/bpf/libbpf_util.h
root/usr/include/bpf/libbpf.h
root/usr/include/bpf/bpf_core_read.h
root/usr/lib64
root/usr/lib64/libbpf.so.0.1.0
root/usr/lib64/libbpf.so.0
root/usr/lib64/libbpf.a
root/usr/lib64/libbpf.so
root/usr/lib64/pkgconfig
root/usr/lib64/pkgconfig/libbpf.pc
Our further plan for this section is as follows: we will write a BPF program of type BPF_PROG_TYPE_XDP, the same as in the previous example but in C, compile it with clang, and write a helper program that loads it into the kernel. In the following sections we will extend the capabilities of both the BPF program and the helper program.
Example: creating a full-fledged application with libbpf
To begin, we take the file /sys/kernel/btf/vmlinux, which was mentioned above, and create its equivalent in the form of a header file:
$ bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h
This file contains all the data structures available in our kernel. For example, this is how the IPv4 header is defined in the kernel:
$ grep -A 12 'struct iphdr {' vmlinux.h
struct iphdr {
__u8 ihl: 4;
__u8 version: 4;
__u8 tos;
__be16 tot_len;
__be16 id;
__be16 frag_off;
__u8 ttl;
__u8 protocol;
__sum16 check;
__be32 saddr;
__be32 daddr;
};
Now we write our BPF program in C:
$ cat xdp-simple.bpf.c
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
SEC("xdp/simple")
int simple(void *ctx)
{
return XDP_PASS;
}
char LICENSE[] SEC("license") = "GPL";
Although our program turned out to be very simple, there are still many details to pay attention to. The first header file we include is vmlinux.h, which we have just generated with bpftool btf dump. We no longer need to install the kernel-headers package to find out what kernel structures look like. The next header file comes from the libbpf library. For now we need it only for the SEC macro, which places a symbol in the appropriate section of the ELF object file. Our program lives in the section xdp/simple, where the part before the slash defines the BPF program type. This is a libbpf convention: based on the section name, libbpf substitutes the correct type when it calls bpf(2). The BPF program itself, in C, is very simple and consists of a single line, return XDP_PASS. Finally, a separate "license" section contains the name of the license.
We can compile the program with llvm/clang version >= 10.0.0, or better, newer (see the Development Tools section):
$ clang --version
clang version 11.0.0 (https://github.com/llvm/llvm-project.git afc287e0abec710398465ee1f86237513f2b5091)
...
$ clang -O2 -g -c -target bpf -I libbpf/src/root/usr/include xdp-simple.bpf.c -o xdp-simple.bpf.o
Among the interesting details: we specify the target architecture with -target bpf and the path to the libbpf headers we installed a moment ago. Also, do not forget -O2; without this option you may run into surprises later. Let's look at the code: did we manage to write the program we wanted?
$ llvm-objdump --section=xdp/simple --no-show-raw-insn -D xdp-simple.bpf.o
xdp-simple.bpf.o: file format elf64-bpf
Disassembly of section xdp/simple:
0000000000000000 <simple>:
0: r0 = 2
1: exit
Yes, it worked! Now we have a binary file with the program, and we want to create an application that loads it into the kernel. For this purpose libbpf offers two options: a lower-level API or a higher-level API. We take the second route, since we want to learn how to write, load, and attach BPF programs with minimal effort for later study.
First, we need to generate a "skeleton" of our program from its binary, using the same bpftool utility, the Swiss army knife of the BPF world (which can be taken literally, since Daniel Borkmann, one of the creators and maintainers of BPF, is Swiss):
$ bpftool gen skeleton xdp-simple.bpf.o > xdp-simple.skel.h
The file xdp-simple.skel.h contains the binary code of our program and functions for managing our object: loading, attaching, and deleting it. In our simple case this looks like overkill, but it also works when the object file contains many BPF programs and maps: to load such a giant ELF we only need to generate the skeleton and call one or two functions from the custom application, which we write next.
Strictly speaking, our loader program is trivial:
#include <err.h>
#include <unistd.h>
#include "xdp-simple.skel.h"
int main(int argc, char **argv)
{
struct xdp_simple_bpf *obj;
obj = xdp_simple_bpf__open_and_load();
if (!obj)
err(1, "failed to open and/or load BPF object\n");
pause();
xdp_simple_bpf__destroy(obj);
}
Here struct xdp_simple_bpf is defined in the file xdp-simple.skel.h and describes our object file:
struct xdp_simple_bpf {
struct bpf_object_skeleton *skeleton;
struct bpf_object *obj;
struct {
struct bpf_program *simple;
} progs;
struct {
struct bpf_link *simple;
} links;
};
Here we can see traces of the low-level API: the structures struct bpf_program *simple and struct bpf_link *simple. The first describes our program, written in the section xdp/simple, and the second describes how the program is attached to an event source.
The function xdp_simple_bpf__open_and_load opens the ELF object, parses it, creates all the structures and substructures (besides the program, the ELF contains other sections: data, read-only data, debug information, license, and so on), and then loads it into the kernel with the bpf system call. We can verify this by compiling and running the program:
$ clang -O2 -I ./libbpf/src/root/usr/include/ xdp-simple.c -o xdp-simple ./libbpf/src/root/usr/lib64/libbpf.a -lelf -lz
$ sudo strace -e bpf ./xdp-simple
...
bpf(BPF_BTF_LOAD, 0x7ffdb8fd9670, 120) = 3
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, insn_cnt=2, insns=0xdfd580, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 8, 0), prog_flags=0, prog_name="simple", prog_ifindex=0, expected_attach_type=0x25 /* BPF_??? */, ...}, 120) = 4
Now let's look at our program with bpftool. First we find its ID:
# bpftool p | grep -A4 simple
463: xdp name simple tag 3b185187f1855c4c gpl
loaded_at 2020-08-01T01:59:49+0000 uid 0
xlated 16B jited 40B memlock 4096B
btf_id 185
pids xdp-simple(16498)
and dump it (we use the abbreviated form of the command bpftool prog dump xlated):
# bpftool p d x id 463
int simple(void *ctx):
; return XDP_PASS;
0: (b7) r0 = 2
1: (95) exit
Something new! The output now includes pieces of our C source file. This was done by libbpf, which found the debug section in the binary, compiled it into a BTF object, loaded that object into the kernel with BPF_BTF_LOAD, and then passed the resulting file descriptor when loading the program with the BPF_PROG_LOAD command.
Kernel Helpers
BPF programs can run "external" functions, the kernel helpers. These helper functions let BPF programs access kernel structures, manage maps, and communicate with the "real world": create perf events, control hardware (for example, redirect packets), and so on.
Example: bpf_get_smp_processor_id
In the spirit of learning by example, consider one of the helper functions, bpf_get_smp_processor_id(), defined in kernel/bpf/helpers.c. It returns the number of the processor on which the calling BPF program is running. We are less interested in its semantics, though, than in the fact that its implementation takes a single line:
BPF_CALL_0(bpf_get_smp_processor_id)
{
return smp_processor_id();
}
The definitions of BPF helper functions resemble those of Linux system calls. Here, for example, a function with no arguments is defined. (A function taking, say, three arguments is defined with the BPF_CALL_3 macro. The maximum number of arguments is five.) This is only the first part of the definition, however. The second part is a structure of type struct bpf_func_proto, which contains a description of the helper function that the verifier understands:
const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
.func = bpf_get_smp_processor_id,
.gpl_only = false,
.ret_type = RET_INTEGER,
};
Registering helper functions
For BPF programs of a given type to use this function, the program type must register it. For example, for the type BPF_PROG_TYPE_XDP the kernel defines the function xdp_func_proto, which decides from the helper function ID whether XDP supports that helper. It supports ours:
static const struct bpf_func_proto *
xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
{
switch (func_id) {
...
case BPF_FUNC_get_smp_processor_id:
return &bpf_get_smp_processor_id_proto;
...
}
}
New BPF program types are "defined" in the file include/linux/bpf_types.h using the BPF_PROG_TYPE macro. "Defined" is in quotes because this is a logical definition; in C terms, the whole set of concrete structures is defined elsewhere. In particular, in kernel/bpf/verifier.c all the definitions from bpf_types.h are used to build the array of structures bpf_verifier_ops[]:
static const struct bpf_verifier_ops *const bpf_verifier_ops[] = {
#define BPF_PROG_TYPE(_id, _name, prog_ctx_type, kern_ctx_type) \
[_id] = & _name ## _verifier_ops,
#include <linux/bpf_types.h>
#undef BPF_PROG_TYPE
};
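The trick above, defining a macro and then expanding a list of entries with it, is commonly called an X-macro. Below is a minimal user-space sketch of the same idea. Everything in it is made up for illustration: the list macro DEMO_PROG_TYPES, the demo_* names, and the two entries. The list also lives in a macro here, rather than in a separate header such as bpf_types.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct bpf_verifier_ops. */
struct demo_ops {
    const char *name;
};

/* Hypothetical stand-in for bpf_types.h: one X(id, name) entry per type. */
#define DEMO_PROG_TYPES(X) \
    X(DEMO_TYPE_SOCKET, sk) \
    X(DEMO_TYPE_XDP, xdp)

/* First expansion: an enum of program type IDs. */
#define AS_ENUM(_id, _name) _id,
enum demo_prog_type { DEMO_PROG_TYPES(AS_ENUM) DEMO_TYPE_MAX };
#undef AS_ENUM

/* Second expansion: one ops object per type, named <name>_demo_ops. */
#define AS_OPS(_id, _name) \
    static const struct demo_ops _name##_demo_ops = { #_name };
DEMO_PROG_TYPES(AS_OPS)
#undef AS_OPS

/* Third expansion: a table indexed by type ID, like bpf_verifier_ops[]. */
#define AS_ENTRY(_id, _name) [_id] = &_name##_demo_ops,
static const struct demo_ops *const demo_ops_table[] = {
    DEMO_PROG_TYPES(AS_ENTRY)
};
#undef AS_ENTRY

/* Return the ops registered for a type, or NULL for an unknown ID. */
static const struct demo_ops *demo_ops_for(int type)
{
    if (type < 0 || type >= DEMO_TYPE_MAX)
        return NULL;
    assert(demo_ops_table[type] != NULL); /* every listed ID has an entry */
    return demo_ops_table[type];
}
```

Adding a new type then means adding a single line to the list; the enum, the per-type object, and the table entry all follow from it.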
That is, for each BPF program type a pointer to a data structure of type struct bpf_verifier_ops is defined and initialized with the value _name ## _verifier_ops, that is, xdp_verifier_ops for xdp. The structure xdp_verifier_ops is defined in net/core/filter.c as follows:
const struct bpf_verifier_ops xdp_verifier_ops = {
.get_func_proto = xdp_func_proto,
.is_valid_access = xdp_is_valid_access,
.convert_ctx_access = xdp_convert_ctx_access,
.gen_prologue = bpf_noop_prologue,
};
Here we see our familiar function xdp_func_proto, which the verifier runs every time it encounters a call to some function inside a BPF program; see verifier.c.
Let's look at how a hypothetical BPF program uses the function bpf_get_smp_processor_id. To do so, we rewrite the program from the previous section as follows:
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
SEC("xdp/simple")
int simple(void *ctx)
{
if (bpf_get_smp_processor_id() != 0)
return XDP_DROP;
return XDP_PASS;
}
char LICENSE[] SEC("license") = "GPL";
The symbol bpf_get_smp_processor_id is defined in <bpf/bpf_helper_defs.h> of the libbpf library as
static u32 (*bpf_get_smp_processor_id)(void) = (void *) 8;
That is, bpf_get_smp_processor_id is a function pointer whose value is 8, where 8 is the value of BPF_FUNC_get_smp_processor_id of type enum bpf_func_id, which is defined for us in vmlinux.h (the file bpf_helper_defs.h is generated by a script in the kernel, so the "magic" numbers are fine). This function takes no arguments and returns a value of type __u32. When we call it in our program, clang generates a BPF_CALL instruction "of the right kind". Let's compile the program and look at the section xdp/simple:
$ clang -O2 -g -c -target bpf -I libbpf/src/root/usr/include xdp-simple.bpf.c -o xdp-simple.bpf.o
$ llvm-objdump -D --section=xdp/simple xdp-simple.bpf.o
xdp-simple.bpf.o: file format elf64-bpf
Disassembly of section xdp/simple:
0000000000000000 <simple>:
0: 85 00 00 00 08 00 00 00 call 8
1: bf 01 00 00 00 00 00 00 r1 = r0
2: 67 01 00 00 20 00 00 00 r1 <<= 32
3: 77 01 00 00 20 00 00 00 r1 >>= 32
4: b7 00 00 00 02 00 00 00 r0 = 2
5: 15 01 01 00 00 00 00 00 if r1 == 0 goto +1 <LBB0_2>
6: b7 00 00 00 01 00 00 00 r0 = 1
0000000000000038 <LBB0_2>:
7: 95 00 00 00 00 00 00 00 exit
In the first line we see the call instruction, whose IMM parameter equals 8 and whose SRC_REG is zero. By the ABI convention used by the verifier, this is a call to helper function number eight. After the call the logic is simple. The return value in register r0 is copied to r1, and on lines 2 and 3 it is converted to type u32: the upper 32 bits are cleared. On lines 4, 5, 6, and 7 we return 2 (XDP_PASS) or 1 (XDP_DROP), depending on whether the helper from line 0 returned a zero or a non-zero value.
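The register logic just described can be replayed in ordinary C. The sketch below is only a model of instructions 1 through 7 of the listing: the function name and the DEMO_* constants are hypothetical, and the helper's result is passed in as an argument instead of being produced by a call.

```c
#include <assert.h>
#include <stdint.h>

/* Numeric values of XDP_DROP and XDP_PASS. */
enum { DEMO_XDP_DROP = 1, DEMO_XDP_PASS = 2 };

/* Model of instructions 1..7, given the helper's return value in r0. */
static uint64_t demo_simple_tail(uint64_t r0)
{
    uint64_t r1 = r0;        /* 1: r1 = r0 */

    r1 <<= 32;               /* 2: push the upper 32 bits out ... */
    r1 >>= 32;               /* 3: ... and zero-extend the lower 32 */
    r0 = DEMO_XDP_PASS;      /* 4: r0 = 2 */
    if (r1 != 0)             /* 5: if r1 == 0, skip instruction 6 */
        r0 = DEMO_XDP_DROP;  /* 6: r0 = 1 */
    assert(r0 == DEMO_XDP_PASS || r0 == DEMO_XDP_DROP);
    return r0;               /* 7: exit */
}
```

The shift pair matters only when the upper half of r0 holds something other than zero: after it, the comparison looks at the 32-bit CPU number alone.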
Let's check ourselves: load the program and look at the output of bpftool prog dump xlated:
$ bpftool gen skeleton xdp-simple.bpf.o > xdp-simple.skel.h
$ clang -O2 -g -I ./libbpf/src/root/usr/include/ -o xdp-simple xdp-simple.c ./libbpf/src/root/usr/lib64/libbpf.a -lelf -lz
$ sudo ./xdp-simple &
[2] 10914
$ sudo bpftool p | grep simple
523: xdp name simple tag 44c38a10c657e1b0 gpl
pids xdp-simple(10915)
$ sudo bpftool p d x id 523
int simple(void *ctx):
; if (bpf_get_smp_processor_id() != 0)
0: (85) call bpf_get_smp_processor_id#114128
1: (bf) r1 = r0
2: (67) r1 <<= 32
3: (77) r1 >>= 32
4: (b7) r0 = 2
; }
5: (15) if r1 == 0x0 goto pc+1
6: (b7) r0 = 1
7: (95) exit
OK, the verifier found the right kernel helper.
Example: passing arguments and, finally, running the program!
All runtime helper functions have the prototype
u64 fn(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
Parameters are passed to helper functions in registers r1 through r5, and the value is returned in register r0. There are no functions that take more than five arguments, and support for them is not expected to be added.
Let's look at a new kernel helper and at how BPF passes parameters. We rewrite xdp-simple.bpf.c as follows (the remaining lines are unchanged):
SEC("xdp/simple")
int simple(void *ctx)
{
bpf_printk("running on CPU%u\n", bpf_get_smp_processor_id());
return XDP_PASS;
}
Our program prints the number of the CPU it is running on. Let's compile it and look at the code:
$ llvm-objdump -D --section=xdp/simple --no-show-raw-insn xdp-simple.bpf.o
0000000000000000 <simple>:
0: r1 = 10
1: *(u16 *)(r10 - 8) = r1
2: r1 = 8441246879787806319 ll
4: *(u64 *)(r10 - 16) = r1
5: r1 = 2334956330918245746 ll
7: *(u64 *)(r10 - 24) = r1
8: call 8
9: r1 = r10
10: r1 += -24
11: r2 = 18
12: r3 = r0
13: call 6
14: r0 = 2
15: exit
On lines 0 through 7 we store the string running on CPU%u\n on the stack, and then on line 8 we call the familiar bpf_get_smp_processor_id. On lines 9 through 12 we prepare the arguments of the bpf_printk helper: registers r1, r2, and r3. Why three and not two? Because bpf_printk is a macro wrapper around the real helper, bpf_trace_printk, which must also be passed the size of the format string.
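To see that the two large immediates in the listing really are the format string, we can replay the three stores in user space. This is a sketch under one assumption: a little-endian target, as in the x86-64 build used here. The function names are hypothetical, and a 24-byte buffer stands in for the stack slots r10-24 through r10-1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store the low n bytes of v at dst, least significant byte first. */
static void demo_store_le(unsigned char *dst, uint64_t v, size_t n)
{
    assert(n <= 8);
    for (size_t i = 0; i < n; i++)
        dst[i] = (unsigned char)(v >> (8 * i));
}

/* Replay stores 0..7 of the listing; out[0] corresponds to r10-24. */
static void demo_build_fmt(char out[24])
{
    unsigned char *p = (unsigned char *)out;

    memset(out, 0, 24);
    demo_store_le(p + 16, 10ULL, 2);                  /* *(u16 *)(r10 - 8)  */
    demo_store_le(p + 8, 8441246879787806319ULL, 8);  /* *(u64 *)(r10 - 16) */
    demo_store_le(p + 0, 2334956330918245746ULL, 8);  /* *(u64 *)(r10 - 24) */
}
```

The 16-bit store of 10 supplies the trailing newline and the terminating zero byte, which brings the total to the 18 bytes loaded into r2.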
Now let's add a couple of lines to xdp-simple.c so that our program attaches to the lo interface and really runs!
$ cat xdp-simple.c
#include <linux/if_link.h>
#include <err.h>
#include <unistd.h>
#include "xdp-simple.skel.h"
int main(int argc, char **argv)
{
__u32 flags = XDP_FLAGS_SKB_MODE;
struct xdp_simple_bpf *obj;
obj = xdp_simple_bpf__open_and_load();
if (!obj)
err(1, "failed to open and/or load BPF object\n");
bpf_set_link_xdp_fd(1, -1, flags);
bpf_set_link_xdp_fd(1, bpf_program__fd(obj->progs.simple), flags);
cleanup:
xdp_simple_bpf__destroy(obj);
}
Here we use the function bpf_set_link_xdp_fd, which attaches XDP-type BPF programs to network interfaces. We hard-coded the index of the lo interface, which is always 1. We call the function twice so that an old program, if one was attached, is detached first. Note that we no longer need the pause call or an infinite loop: our loader program exits, but the BPF program is not killed, because it is attached to an event source. After a successful load and attach, the program runs for every network packet arriving on lo.
Let's load the program and look at the lo interface:
$ sudo ./xdp-simple
$ sudo bpftool p | grep simple
669: xdp name simple tag 4fca62e77ccb43d6 gpl
$ ip l show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
prog/xdp id 669
The program we loaded has ID 669, and we see the same ID on the lo interface. Let's send a couple of packets to 127.0.0.1 (request + reply):
$ ping -c1 localhost
and now look at the contents of the debug virtual file /sys/kernel/debug/tracing/trace_pipe, to which bpf_printk writes its messages:
# cat /sys/kernel/debug/tracing/trace_pipe
ping-13937 [000] d.s1 442015.377014: bpf_trace_printk: running on CPU0
ping-13937 [000] d.s1 442015.377027: bpf_trace_printk: running on CPU0
Two packets were seen on lo and processed on CPU0: our first full-fledged meaningless BPF program has worked!
It is worth noting that bpf_printk writes to a debug file for a reason: it is not the best helper for production use, but our goal was to show something simple.
Accessing maps from BPF programs
Example: using a map from a BPF program
In the previous sections we learned how to create and use maps from user space; now let's look at the kernel side. As usual, we start with an example. Let's rewrite our program xdp-simple.bpf.c as follows:
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__uint(max_entries, 8);
__type(key, u32);
__type(value, u64);
} woo SEC(".maps");
SEC("xdp/simple")
int simple(void *ctx)
{
u32 key = bpf_get_smp_processor_id();
u32 *val;
val = bpf_map_lookup_elem(&woo, &key);
if (!val)
return XDP_ABORTED;
*val += 1;
return XDP_PASS;
}
char LICENSE[] SEC("license") = "GPL";
At the beginning of the program we added the definition of the map woo: an array of 8 elements holding values of type u64 (in C we would define such an array as u64 woo[8]). In the program "xdp/simple" we read the current processor number into the variable key, then use the helper function bpf_map_lookup_elem to obtain a pointer to the corresponding array entry, which we increment by one. In plain language: we collect statistics on which CPU processed incoming packets. Let's try running the program:
$ clang -O2 -g -c -target bpf -I libbpf/src/root/usr/include xdp-simple.bpf.c -o xdp-simple.bpf.o
$ bpftool gen skeleton xdp-simple.bpf.o > xdp-simple.skel.h
$ clang -O2 -g -I ./libbpf/src/root/usr/include/ -o xdp-simple xdp-simple.c ./libbpf/src/root/usr/lib64/libbpf.a -lelf -lz
$ sudo ./xdp-simple
Let's check that it attached to lo and send some packets:
$ ip l show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
prog/xdp id 108
$ for s in `seq 234`; do sudo ping -f -c 100 127.0.0.1 >/dev/null 2>&1; done
Now let's look at the contents of the array:
$ sudo bpftool map dump name woo
[
{ "key": 0, "value": 0 },
{ "key": 1, "value": 400 },
{ "key": 2, "value": 0 },
{ "key": 3, "value": 0 },
{ "key": 4, "value": 0 },
{ "key": 5, "value": 0 },
{ "key": 6, "value": 0 },
{ "key": 7, "value": 46400 }
]
Almost all packets were processed on CPU7. That does not matter to us; what matters is that the program works and that we now understand how to access maps from BPF programs: with the bpf_map_* helpers.
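For readers who prefer to see the access pattern without a kernel in the way, here is a user-space model of it. It is a sketch, not the kernel's implementation: demo_woo, demo_map_lookup, and demo_count_packet are hypothetical names, and the model assumes only the array-map behaviour that matters here, namely that a lookup with an out-of-range index yields NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ENTRIES 8

/* User-space stand-in for the 8-element array map woo. */
static uint64_t demo_woo[DEMO_MAX_ENTRIES];

/* Stand-in for bpf_map_lookup_elem on an array map: a pointer to the
 * slot, or NULL when the index is out of range. */
static uint64_t *demo_map_lookup(const uint32_t *key)
{
    assert(key != NULL);
    if (*key >= DEMO_MAX_ENTRIES)
        return NULL;
    return &demo_woo[*key];
}

/* Same shape as the XDP program: count one packet for the given CPU.
 * Returns 2 (XDP_PASS) on success, 0 (XDP_ABORTED) on a failed lookup. */
static int demo_count_packet(uint32_t cpu)
{
    uint64_t *val = demo_map_lookup(&cpu);

    if (!val)
        return 0;
    *val += 1;
    return 2;
}
```

The NULL check is not optional in the real program either: the verifier rejects a program that dereferences the lookup result without testing it first.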
The mysterious index
So, we can access a map from a BPF program with calls such as
val = bpf_map_lookup_elem(&woo, &key);
where the helper function looks like
void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)
but we are passing a pointer, &woo, to an unnamed structure struct { ... }...
If we look at the assembly of the program, we see that the value of &woo is not actually defined (line 4):
llvm-objdump -D --section xdp/simple xdp-simple.bpf.o
xdp-simple.bpf.o: file format elf64-bpf
Disassembly of section xdp/simple:
0000000000000000 <simple>:
0: 85 00 00 00 08 00 00 00 call 8
1: 63 0a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r0
2: bf a2 00 00 00 00 00 00 r2 = r10
3: 07 02 00 00 fc ff ff ff r2 += -4
4: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
6: 85 00 00 00 01 00 00 00 call 1
...
Instead, it is recorded in the relocations:
$ llvm-readelf -r xdp-simple.bpf.o | head -4
Relocation section '.relxdp/simple' at offset 0xe18 contains 1 entries:
Offset Info Type Symbol's Value Symbol's Name
0000000000000020 0000002700000001 R_BPF_64_64 0000000000000000 woo
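The relocation entry can be decoded by hand. The sketch below assumes only the standard ELF64 convention that r_info packs the symbol index into its upper 32 bits and the relocation type into its lower 32 bits; the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one BPF instruction slot in bytes. */
#define DEMO_INSN_SIZE 8u

/* Symbol-table index: upper 32 bits of r_info (what ELF64_R_SYM yields). */
static uint32_t demo_r_sym(uint64_t info)
{
    return (uint32_t)(info >> 32);
}

/* Relocation type: lower 32 bits of r_info (what ELF64_R_TYPE yields). */
static uint32_t demo_r_type(uint64_t info)
{
    return (uint32_t)(info & 0xffffffffu);
}

/* Index of the instruction slot that a section offset points at. */
static uint64_t demo_insn_index(uint64_t offset)
{
    assert(offset % DEMO_INSN_SIZE == 0);
    return offset / DEMO_INSN_SIZE;
}
```

Applied to the entry above, offset 0x20 is instruction 4, exactly the r1 = 0 ll load, and the lower half of Info is relocation type 1, shown as R_BPF_64_64.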
But if we look at the already loaded program, we see a pointer to the correct map (line 4):
$ sudo bpftool prog dump x name simple
int simple(void *ctx):
0: (85) call bpf_get_smp_processor_id#114128
1: (63) *(u32 *)(r10 -4) = r0
2: (bf) r2 = r10
3: (07) r2 += -4
4: (18) r1 = map[id:64]
...
We can therefore conclude that when our loader program started, the reference to &woo was replaced with something by the libbpf library. First, let's look at the strace output:
$ sudo strace -e bpf ./xdp-simple
...
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=8, max_entries=8, map_name="woo", ...}, 120) = 4
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, prog_name="simple", ...}, 120) = 5
We see that libbpf created the map woo and then loaded our program simple. Let's look more closely at how we load the program:
- we call xdp_simple_bpf__open_and_load from the file xdp-simple.skel.h
- which calls xdp_simple_bpf__load from the file xdp-simple.skel.h
- which calls bpf_object__load_skeleton from the file libbpf/src/libbpf.c
- which calls bpf_object__load_xattr from libbpf/src/libbpf.c
The last function, among other things, calls bpf_object__create_maps, which creates or opens existing maps and turns them into file descriptors. (This is where we see BPF_MAP_CREATE in the strace output.) Next the function bpf_object__relocate is called, and this is the one that interests us, since we remember seeing woo in the relocation table. Exploring it, we eventually arrive at the function bpf_program__relocate, which handles map relocations:
case RELO_LD64:
insn[0].src_reg = BPF_PSEUDO_MAP_FD;
insn[0].imm = obj->maps[relo->map_idx].fd;
break;
So we take our instruction
18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
and replace its source register with BPF_PSEUDO_MAP_FD and its first IMM with the file descriptor of our map. If that descriptor equals, say, 0xdeadbeef, we end up with the instruction
18 11 00 00 ef be ad de 00 00 00 00 00 00 00 00 r1 = 0 ll
This is how information about maps is passed to a specific loaded BPF program. The map can either be created with BPF_MAP_CREATE or opened by ID with BPF_MAP_GET_FD_BY_ID.
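The patching step can be tried outside libbpf. The following sketch uses a simplified, hypothetical struct demo_insn rather than the kernel's struct bpf_insn, and encodes the slot byte by byte in little-endian order so that the result can be compared with the hex dumps above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of one 8-byte instruction slot. */
struct demo_insn {
    uint8_t code;
    uint8_t dst_reg;  /* low nibble of the second byte  */
    uint8_t src_reg;  /* high nibble of the second byte */
    int16_t off;
    int32_t imm;
};

#define DEMO_PSEUDO_MAP_FD 1  /* numeric value of BPF_PSEUDO_MAP_FD */

/* What the relocation does to the first slot of the 64-bit load. */
static void demo_relocate_map(struct demo_insn *insn, int32_t map_fd)
{
    assert(insn->code == 0x18);  /* BPF_LD | BPF_DW | BPF_IMM */
    insn->src_reg = DEMO_PSEUDO_MAP_FD;
    insn->imm = map_fd;
}

/* Serialize a slot the way it appears in a little-endian dump. */
static void demo_encode(const struct demo_insn *insn, uint8_t out[8])
{
    uint16_t off = (uint16_t)insn->off;
    uint32_t imm = (uint32_t)insn->imm;

    out[0] = insn->code;
    out[1] = (uint8_t)((insn->src_reg << 4) | (insn->dst_reg & 0x0f));
    out[2] = (uint8_t)off;
    out[3] = (uint8_t)(off >> 8);
    for (int i = 0; i < 4; i++)
        out[4 + i] = (uint8_t)(imm >> (8 * i));
}
```

Starting from a slot with code 0x18 and dst_reg 1, relocating with descriptor 3 gives the bytes 18 11 00 00 03 00 00 00: the second byte changes from 01 to 11, and the descriptor lands in the immediate.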
In summary, when libbpf is used, the algorithm is as follows:
- at compile time, entries are created in the relocation table for references to maps
- libbpf opens the ELF object, finds all the maps in use, and creates file descriptors for them
- the file descriptors are loaded into the kernel as part of the LD64 instruction
As you can guess, that is not all, and we have to look into the kernel. Fortunately we have a clue: we wrote the value BPF_PSEUDO_MAP_FD into the source register, so we can grep for it. That leads us to the holy of holies, kernel/bpf/verifier.c, where a function with a telling name replaces the file descriptor with the address of a structure of type struct bpf_map:
static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env) {
...
f = fdget(insn[0].imm);
map = __bpf_map_get(f);
if (insn->src_reg == BPF_PSEUDO_MAP_FD) {
addr = (unsigned long)map;
}
insn[0].imm = (u32)addr;
insn[1].imm = addr >> 32;
(The full code is in the kernel source.) This adds one more step to our algorithm:
- while loading the program, the verifier checks that the map is used correctly and writes in the address of the corresponding struct bpf_map
Many more things happen when an ELF binary is loaded with libbpf, but we will discuss them in other articles.
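The last two assignments in the verifier excerpt split a 64-bit address across the two 32-bit immediates of the load. A small sketch of that split and its inverse is shown below; struct demo_imm_pair and both functions are hypothetical helpers, and the address used with them is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* The two consecutive immediates of a 64-bit load (hypothetical type). */
struct demo_imm_pair {
    uint32_t lo;  /* goes to insn[0].imm */
    uint32_t hi;  /* goes to insn[1].imm */
};

/* Reassemble the 64-bit value from the two halves. */
static uint64_t demo_join(struct demo_imm_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* Split an address the way the excerpt does:
 * insn[0].imm = (u32)addr; insn[1].imm = addr >> 32. */
static struct demo_imm_pair demo_split(uint64_t addr)
{
    struct demo_imm_pair p = { (uint32_t)addr, (uint32_t)(addr >> 32) };

    assert(demo_join(p) == addr);  /* the split loses nothing */
    return p;
}
```

This is why the load occupies two instruction slots: a single 32-bit immediate cannot hold a kernel pointer.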
Loading programs and maps without libbpf
As promised, here is an example for readers who want to know how to create and load a program that uses maps without the help of libbpf. This can be useful when you work in an environment for which you cannot build dependencies, when you are saving every bit, or when you write a program like ply, which generates BPF bytecode on the fly.
To make the logic easier to follow, we rewrite our xdp-simple example for this purpose. The complete and slightly extended code of the program discussed in this example is available separately.
The logic of our application is as follows:
- create a map of type BPF_MAP_TYPE_ARRAY with the BPF_MAP_CREATE command,
- create a program that uses this map,
- attach the program to the lo interface,
which translates into human language as
int main(void)
{
int map_fd, prog_fd;
map_fd = map_create();
if (map_fd < 0)
err(1, "bpf: BPF_MAP_CREATE");
prog_fd = prog_load(map_fd);
if (prog_fd < 0)
err(1, "bpf: BPF_PROG_LOAD");
xdp_attach(1, prog_fd);
}
Here map_create creates the map in exactly the same way as in the first example about the bpf system call: "kernel, please make me a new map in the form of an array of 8 elements of type __u64 and give me back a file descriptor":
static int map_create()
{
union bpf_attr attr;
memset(&attr, 0, sizeof(attr));
attr.map_type = BPF_MAP_TYPE_ARRAY;
attr.key_size = sizeof(__u32);
attr.value_size = sizeof(__u64);
attr.max_entries = 8;
strncpy(attr.map_name, "woo", sizeof(attr.map_name));
return syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
}
Loading the program is also simple:
static int prog_load(int map_fd)
{
union bpf_attr attr;
struct bpf_insn insns[] = {
...
};
memset(&attr, 0, sizeof(attr));
attr.prog_type = BPF_PROG_TYPE_XDP;
attr.insns = ptr_to_u64(insns);
attr.insn_cnt = sizeof(insns)/sizeof(insns[0]);
attr.license = ptr_to_u64("GPL");
strncpy(attr.prog_name, "woo", sizeof(attr.prog_name));
return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
}
The hard part of prog_load is defining our BPF program as an array of structures, struct bpf_insn insns[]. But since we are using a program that we already have in C, we can cheat a little:
$ llvm-objdump -D --section xdp/simple xdp-simple.bpf.o
0000000000000000 <simple>:
0: 85 00 00 00 08 00 00 00 call 8
1: 63 0a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r0
2: bf a2 00 00 00 00 00 00 r2 = r10
3: 07 02 00 00 fc ff ff ff r2 += -4
4: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
6: 85 00 00 00 01 00 00 00 call 1
7: b7 01 00 00 00 00 00 00 r1 = 0
8: 15 00 04 00 00 00 00 00 if r0 == 0 goto +4 <LBB0_2>
9: 61 01 00 00 00 00 00 00 r1 = *(u32 *)(r0 + 0)
10: 07 01 00 00 01 00 00 00 r1 += 1
11: 63 10 00 00 00 00 00 00 *(u32 *)(r0 + 0) = r1
12: b7 01 00 00 02 00 00 00 r1 = 2
0000000000000068 <LBB0_2>:
13: bf 10 00 00 00 00 00 00 r0 = r1
14: 95 00 00 00 00 00 00 00 exit
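In total, we need to write 14 instructions as structures of type struct bpf_insn; they occupy 15 array elements, because the 64-bit load takes two slots. (Tip: take the dump above, reread the section on instructions, open linux/bpf.h and linux/bpf_common.h, and try to define struct bpf_insn insns[] yourself):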
struct bpf_insn insns[] = {
/* 85 00 00 00 08 00 00 00 call 8 */
{
.code = BPF_JMP | BPF_CALL,
.imm = 8,
},
/* 63 0a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r0 */
{
.code = BPF_MEM | BPF_STX,
.off = -4,
.src_reg = BPF_REG_0,
.dst_reg = BPF_REG_10,
},
/* bf a2 00 00 00 00 00 00 r2 = r10 */
{
.code = BPF_ALU64 | BPF_MOV | BPF_X,
.src_reg = BPF_REG_10,
.dst_reg = BPF_REG_2,
},
/* 07 02 00 00 fc ff ff ff r2 += -4 */
{
.code = BPF_ALU64 | BPF_ADD | BPF_K,
.dst_reg = BPF_REG_2,
.imm = -4,
},
/* 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll */
{
.code = BPF_LD | BPF_DW | BPF_IMM,
.src_reg = BPF_PSEUDO_MAP_FD,
.dst_reg = BPF_REG_1,
.imm = map_fd,
},
{ }, /* placeholder */
/* 85 00 00 00 01 00 00 00 call 1 */
{
.code = BPF_JMP | BPF_CALL,
.imm = 1,
},
/* b7 01 00 00 00 00 00 00 r1 = 0 */
{
.code = BPF_ALU64 | BPF_MOV | BPF_K,
.dst_reg = BPF_REG_1,
.imm = 0,
},
/* 15 00 04 00 00 00 00 00 if r0 == 0 goto +4 <LBB0_2> */
{
.code = BPF_JMP | BPF_JEQ | BPF_K,
.off = 4,
.dst_reg = BPF_REG_0,
.imm = 0,
},
/* 61 01 00 00 00 00 00 00 r1 = *(u32 *)(r0 + 0) */
{
.code = BPF_MEM | BPF_LDX,
.off = 0,
.src_reg = BPF_REG_0,
.dst_reg = BPF_REG_1,
},
/* 07 01 00 00 01 00 00 00 r1 += 1 */
{
.code = BPF_ALU64 | BPF_ADD | BPF_K,
.dst_reg = BPF_REG_1,
.imm = 1,
},
/* 63 10 00 00 00 00 00 00 *(u32 *)(r0 + 0) = r1 */
{
.code = BPF_MEM | BPF_STX,
.src_reg = BPF_REG_1,
.dst_reg = BPF_REG_0,
},
/* b7 01 00 00 02 00 00 00 r1 = 2 */
{
.code = BPF_ALU64 | BPF_MOV | BPF_K,
.dst_reg = BPF_REG_1,
.imm = 2,
},
/* <LBB0_2>: bf 10 00 00 00 00 00 00 r0 = r1 */
{
.code = BPF_ALU64 | BPF_MOV | BPF_X,
.src_reg = BPF_REG_1,
.dst_reg = BPF_REG_0,
},
/* 95 00 00 00 00 00 00 00 exit */
{
.code = BPF_JMP | BPF_EXIT
},
};
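Each .code value in the array is simply the first byte of the corresponding line of the dump, assembled from a class, an operation or mode, and a size or operand-source field. The sketch below checks a few of them. The DEMO_* enumerators are local copies of the numeric values of the BPF_* constants from linux/bpf_common.h and linux/bpf.h, renamed so that the block does not depend on those headers; demo_opcode is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Instruction classes (low three bits of the opcode). */
enum { DEMO_LD = 0x00, DEMO_LDX = 0x01, DEMO_STX = 0x03,
       DEMO_JMP = 0x05, DEMO_ALU64 = 0x07 };
/* Size and mode fields used by the load/store classes. */
enum { DEMO_W = 0x00, DEMO_DW = 0x18, DEMO_IMM = 0x00, DEMO_MEM = 0x60 };
/* Operation codes used by the ALU and JMP classes. */
enum { DEMO_ADD = 0x00, DEMO_MOV = 0xb0,
       DEMO_JEQ = 0x10, DEMO_CALL = 0x80, DEMO_EXIT = 0x90 };
/* Operand source: immediate (K) or register (X). */
enum { DEMO_K = 0x00, DEMO_X = 0x08 };

/* OR the fields together and check that the result fits in one byte. */
static uint8_t demo_opcode(unsigned fields)
{
    assert(fields <= 0xff);
    return (uint8_t)fields;
}
```

For example, demo_opcode(DEMO_JMP | DEMO_CALL) is 0x85, the first byte of both call lines in the dump, and demo_opcode(DEMO_ALU64 | DEMO_MOV | DEMO_X) is 0xbf, the first byte of r2 = r10.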
An exercise for those who did not write this themselves: find map_fd.
One more part of our program remains to be shown: xdp_attach. Unfortunately, programs such as XDP cannot be attached with the bpf system call. The people who created BPF and XDP came from the Linux networking community, so they used the kernel interface most familiar to them (though not to ordinary people): netlink sockets. Our xdp_attach copies its code from libbpf, namely from the file netlink.c.
Welcome to the world of netlink sockets
We open a netlink socket of type NETLINK_ROUTE:
int netlink_open(__u32 *nl_pid)
{
struct sockaddr_nl sa;
socklen_t addrlen;
int one = 1, ret;
int sock;
memset(&sa, 0, sizeof(sa));
sa.nl_family = AF_NETLINK;
sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
if (sock < 0)
err(1, "socket");
if (setsockopt(sock, SOL_NETLINK, NETLINK_EXT_ACK, &one, sizeof(one)) < 0)
warnx("netlink error reporting not supported");
if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0)
err(1, "bind");
addrlen = sizeof(sa);
if (getsockname(sock, (struct sockaddr *)&sa, &addrlen) < 0)
err(1, "getsockname");
*nl_pid = sa.nl_pid;
return sock;
}
We read the reply from this socket:
static int bpf_netlink_recv(int sock, __u32 nl_pid, int seq)
{
bool multipart = true;
struct nlmsgerr *errm;
struct nlmsghdr *nh;
char buf[4096];
int len, ret;
while (multipart) {
multipart = false;
len = recv(sock, buf, sizeof(buf), 0);
if (len < 0)
err(1, "recv");
if (len == 0)
break;
for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
nh = NLMSG_NEXT(nh, len)) {
if (nh->nlmsg_pid != nl_pid)
errx(1, "wrong pid");
if (nh->nlmsg_seq != seq)
errx(1, "INVSEQ");
if (nh->nlmsg_flags & NLM_F_MULTI)
multipart = true;
switch (nh->nlmsg_type) {
case NLMSG_ERROR:
errm = (struct nlmsgerr *)NLMSG_DATA(nh);
if (!errm->error)
continue;
ret = errm->error;
// libbpf_nla_dump_errormsg(nh); too many code to copy...
goto done;
case NLMSG_DONE:
return 0;
default:
break;
}
}
}
ret = 0;
done:
return ret;
}
Finally, here is our function that opens a socket and sends it a special message containing the file descriptor:
static int xdp_attach(int ifindex, int prog_fd)
{
int sock, seq = 0, ret;
struct nlattr *nla, *nla_xdp;
struct {
struct nlmsghdr nh;
struct ifinfomsg ifinfo;
char attrbuf[64];
} req;
__u32 nl_pid = 0;
sock = netlink_open(&nl_pid);
if (sock < 0)
return sock;
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
req.nh.nlmsg_type = RTM_SETLINK;
req.nh.nlmsg_pid = 0;
req.nh.nlmsg_seq = ++seq;
req.ifinfo.ifi_family = AF_UNSPEC;
req.ifinfo.ifi_index = ifindex;
/* started nested attribute for XDP */
nla = (struct nlattr *)(((char *)&req)
+ NLMSG_ALIGN(req.nh.nlmsg_len));
nla->nla_type = NLA_F_NESTED | IFLA_XDP;
nla->nla_len = NLA_HDRLEN;
/* add XDP fd */
nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
nla_xdp->nla_type = IFLA_XDP_FD;
nla_xdp->nla_len = NLA_HDRLEN + sizeof(int);
memcpy((char *)nla_xdp + NLA_HDRLEN, &prog_fd, sizeof(prog_fd));
nla->nla_len += nla_xdp->nla_len;
/* if user passed in any flags, add those too */
__u32 flags = XDP_FLAGS_SKB_MODE;
nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
nla_xdp->nla_type = IFLA_XDP_FLAGS;
nla_xdp->nla_len = NLA_HDRLEN + sizeof(flags);
memcpy((char *)nla_xdp + NLA_HDRLEN, &flags, sizeof(flags));
nla->nla_len += nla_xdp->nla_len;
req.nh.nlmsg_len += NLA_ALIGN(nla->nla_len);
if (send(sock, &req, req.nh.nlmsg_len, 0) < 0)
err(1, "send");
ret = bpf_netlink_recv(sock, nl_pid, seq);
cleanup:
close(sock);
return ret;
}
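The length bookkeeping in xdp_attach is easy to get wrong, so it is worth redoing on paper. The sketch below recomputes the lengths with local constants instead of the netlink headers. The DEMO_* values are assumptions that mirror the usual Linux definitions: a 16-byte struct nlmsghdr, a 16-byte struct ifinfomsg, a 4-byte attribute header, and 4-byte alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the netlink layout constants (assumed values). */
#define DEMO_ALIGNTO       4u
#define DEMO_ALIGN(len)    (((len) + DEMO_ALIGNTO - 1) & ~(DEMO_ALIGNTO - 1))
#define DEMO_NLMSG_HDRLEN  16u  /* sizeof(struct nlmsghdr)         */
#define DEMO_IFINFOMSG_LEN 16u  /* sizeof(struct ifinfomsg)        */
#define DEMO_NLA_HDRLEN    4u   /* aligned sizeof(struct nlattr)   */

/* Length of one attribute carrying the given number of payload bytes. */
static unsigned demo_nla_len(unsigned payload)
{
    return DEMO_NLA_HDRLEN + payload;
}

/* Length of the nested IFLA_XDP attribute: its header, fd, and flags. */
static unsigned demo_xdp_nest_len(void)
{
    unsigned len = DEMO_NLA_HDRLEN;                    /* nla_len = NLA_HDRLEN */

    len += demo_nla_len((unsigned)sizeof(int32_t));    /* IFLA_XDP_FD    */
    len += demo_nla_len((unsigned)sizeof(uint32_t));   /* IFLA_XDP_FLAGS */
    return len;
}

/* Final nlmsg_len of the RTM_SETLINK request. */
static unsigned demo_request_len(void)
{
    unsigned len = DEMO_NLMSG_HDRLEN + DEMO_IFINFOMSG_LEN;

    len += DEMO_ALIGN(demo_xdp_nest_len());
    /* The attributes must fit into attrbuf[64] of the request struct. */
    assert(len <= DEMO_NLMSG_HDRLEN + DEMO_IFINFOMSG_LEN + 64);
    return len;
}
```

Under these assumptions the nested attribute is 20 bytes and the whole request is 52 bytes, comfortably inside the 64-byte attribute buffer.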
Now everything is ready for testing:
$ cc nolibbpf.c -o nolibbpf
$ sudo strace -e bpf ./nolibbpf
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, map_name="woo", ...}, 72) = 3
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, insn_cnt=15, prog_name="woo", ...}, 72) = 4
+++ exited with 0 +++
Let's check whether our program attached to lo:
$ ip l show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
prog/xdp id 160
Let's send some pings and look at the map:
$ for s in `seq 234`; do sudo ping -f -c 100 127.0.0.1 >/dev/null 2>&1; done
$ sudo bpftool m dump name woo
key: 00 00 00 00 value: 90 01 00 00 00 00 00 00
key: 01 00 00 00 value: 00 00 00 00 00 00 00 00
key: 02 00 00 00 value: 00 00 00 00 00 00 00 00
key: 03 00 00 00 value: 00 00 00 00 00 00 00 00
key: 04 00 00 00 value: 00 00 00 00 00 00 00 00
key: 05 00 00 00 value: 00 00 00 00 00 00 00 00
key: 06 00 00 00 value: 40 b5 00 00 00 00 00 00
key: 07 00 00 00 value: 00 00 00 00 00 00 00 00
Found 8 elements
Hooray, everything works. Note, by the way, that our map is again displayed as raw bytes. This is because, unlike libbpf, we did not load type information (BTF). We will talk more about that next time.
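Until then, the raw bytes can be decoded by hand. The sketch below assumes a little-endian machine, as in the x86-64 setup used for this article; demo_le_to_u64 is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interpret n raw bytes (n <= 8) as a little-endian unsigned integer. */
static uint64_t demo_le_to_u64(const uint8_t *bytes, size_t n)
{
    uint64_t v = 0;

    assert(n <= 8);
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)bytes[i] << (8 * i);
    return v;
}
```

Fed the value bytes of key 0 (90 01 00 ...) it yields 400, and for key 6 (40 b5 00 ...) it yields 46400, the same kind of counters bpftool printed as numbers when BTF was available.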
Development Tools
In this section we look at a minimal BPF developer toolkit.
Generally speaking, you need nothing special to develop BPF programs: BPF runs on any decent distribution kernel, and programs are built with clang, which can be installed from a package. However, because BPF is under active development, the kernel and the tools change constantly, and if you do not want to write BPF programs with the old-fashioned methods of 2019, you will have to build
- llvm/clang
- pahole
- your own kernel
- bpftool
(For reference, this section and all examples in the article were run on Debian 10.)
llvm/clang
BPF is on friendly terms with LLVM, and although programs for BPF can recently also be compiled with gcc, all current development is done against LLVM. So, first of all, we build the current version of clang from git:
$ sudo apt install ninja-build
$ git clone --depth 1 https://github.com/llvm/llvm-project.git
$ mkdir -p llvm-project/llvm/build/install
$ cd llvm-project/llvm/build
$ cmake .. -G "Ninja" -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
-DLLVM_ENABLE_PROJECTS="clang" \
-DBUILD_SHARED_LIBS=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_BUILD_RUNTIME=OFF
$ time ninja
... a long time later
$
Now we can check that everything was built correctly:
$ ./bin/llc --version
LLVM (http://llvm.org/):
LLVM version 11.0.0git
Optimized build.
Default target: x86_64-unknown-linux-gnu
Host CPU: znver1
Registered Targets:
bpf - BPF (host endian)
bpfeb - BPF (big endian)
bpfel - BPF (little endian)
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
(I borrowed these clang build instructions rather than writing them myself.)
We will not install the freshly built programs; instead we simply add them to PATH, for example:
export PATH="`pwd`/bin:$PATH"
(This can be added to .bashrc or to a separate file. Personally, I keep such things in ~/bin/activate-llvm.sh and, when needed, run . activate-llvm.sh.)
pahole and BTF
The pahole utility is used when building the kernel to create debug information in BTF format. In this article we do not go into the details of BTF technology beyond the fact that it is convenient and we want to use it. So if you are going to build your own kernel, build pahole first (without pahole you cannot build the kernel with the CONFIG_DEBUG_INFO_BTF option):
$ git clone https://git.kernel.org/pub/scm/devel/pahole/pahole.git
$ cd pahole/
$ sudo apt install cmake
$ mkdir build
$ cd build/
$ cmake -D__LIB=lib ..
$ make
$ sudo make install
$ which pahole
/usr/local/bin/pahole
Kernels for experimenting with BPF
While exploring what BPF can do, I like to build my own kernel. Generally speaking this is not necessary, because BPF programs can be compiled and loaded on a distribution kernel. Having your own kernel, however, lets you use the newest BPF features, which will reach your distribution in months at best or, as with some debugging tools, will not be packaged at all in the foreseeable future. Besides, with your own kernel, experimenting with the code feels more significant.
To build a kernel you need, first, the kernel itself and, second, a kernel configuration file. For experimenting with BPF you can use the usual mainline kernel or one of the development trees: net, net-next, bpf, or bpf-next (the *-next kernels are the least stable of those listed).
Explaining how to manage kernel configuration files is beyond the scope of this article; we assume that you either already know how to do it or are ready to learn it on your own.
Download one of the kernels above:
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
$ cd bpf-next
Build a minimal working kernel config:
$ cp /boot/config-`uname -r` .config
$ make localmodconfig
Enable the BPF options in the .config file as you see fit (most likely CONFIG_BPF itself will already be enabled, since systemd uses it). Here is the list of options from the kernel used for this article:
CONFIG_CGROUP_BPF=y
CONFIG_BPF=y
CONFIG_BPF_LSM=y
CONFIG_BPF_SYSCALL=y
CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
CONFIG_IPV6_SEG6_BPF=y
# CONFIG_NETFILTER_XT_MATCH_BPF is not set
# CONFIG_BPFILTER is not set
CONFIG_NET_CLS_BPF=y
CONFIG_NET_ACT_BPF=y
CONFIG_BPF_JIT=y
CONFIG_BPF_STREAM_PARSER=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_BPF_KPROBE_OVERRIDE=y
CONFIG_DEBUG_INFO_BTF=y
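Before starting a long build it can help to confirm that the options you care about really ended up in .config. The following is only a sketch: check_bpf_config is a hypothetical helper written for this illustration, not a kernel tool, and it only recognises options set to "y".

```shell
# check_bpf_config FILE OPTION...: report which options are not set to "y".
# The function name and interface are illustrative, not part of the kernel tree.
check_bpf_config() {
    config_file=$1
    shift
    missing=0
    for opt in "$@"; do
        # An enabled built-in option appears as a whole line "CONFIG_FOO=y".
        if ! grep -qx "${opt}=y" "$config_file"; then
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check a few of the options listed above in ./.config, if it exists.
if [ -f .config ]; then
    check_bpf_config .config CONFIG_BPF CONFIG_BPF_SYSCALL \
        CONFIG_BPF_JIT CONFIG_DEBUG_INFO_BTF || true
fi
```

The function returns a non-zero status when at least one option is missing, so it can also be used as a guard in front of the make invocation.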
After that, we can easily build and install the modules and the kernel (by the way, you can build the kernel with the freshly built clang by adding CC=clang):
$ make -s -j $(getconf _NPROCESSORS_ONLN)
$ sudo make modules_install
$ sudo make install
and reboot into the new kernel (I use kexec from the kexec-tools package for this):
v=5.8.0-rc6+ # if you are rebuilding the current kernel, you can use v=`uname -r`
sudo kexec -l -t bzImage /boot/vmlinuz-$v --initrd=/boot/initrd.img-$v --reuse-cmdline &&
sudo kexec -e
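Since kexec -e switches kernels immediately, it may be worth checking that both files exist for the chosen version first. This is a sketch under assumptions: kernel_files_present and its optional directory argument are hypothetical, and the vmlinuz-/initrd.img- naming simply follows the commands above.

```shell
# kernel_files_present VERSION [BOOTDIR]: succeed only if both the kernel image
# and its initrd exist. The helper and its BOOTDIR parameter are hypothetical.
kernel_files_present() {
    version=$1
    bootdir=${2:-/boot}
    [ -f "$bootdir/vmlinuz-$version" ] && [ -f "$bootdir/initrd.img-$version" ]
}
```

It could be used as a guard, for example kernel_files_present "$v" && sudo kexec -l ..., with the remaining arguments as shown above.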
bpftool
The utility used most often in this article is bpftool, which ships as part of the Linux kernel. It is written and maintained by BPF developers for BPF developers, and it can be used to manage all types of BPF objects: loading programs, creating and modifying maps, exploring the BPF ecosystem, and so on. Documentation in the form of man page sources can be found in the kernel tree.
At the time of writing, bpftool is available prepackaged only for RHEL, Fedora, and Ubuntu (see, for example, the story of packaging bpftool in Debian). But if you have already built your kernel, building bpftool is as easy as pie:
$ cd ${linux}/tools/bpf/bpftool
# ... add the latest clang to your PATH, as described above
$ make -s
Auto-detecting system features:
... libbfd: [ on ]
... disassembler-four-args: [ on ]
... zlib: [ on ]
... libcap: [ on ]
... clang-bpf-co-re: [ on ]
Auto-detecting system features:
... libelf: [ on ]
... zlib: [ on ]
... bpf: [ on ]
$
(Here ${linux} is your kernel directory.) After running these commands, bpftool will be built in the ${linux}/tools/bpf/bpftool directory, and you can add that directory to your PATH (first of all for the root user) or simply copy the binary to /usr/local/sbin.
It is best to build bpftool with the latest clang, built as described above, and to check that it was built correctly using, for example, the following command:
$ sudo bpftool feature probe kernel
Scanning system configuration...
bpf() syscall for unprivileged users is enabled
JIT compiler is enabled
JIT compiler hardening is disabled
JIT compiler kallsyms exports are enabled for root
...
It shows which BPF features are enabled in your kernel.
By the way, the previous command can also be run as
# bpftool f p k
This is done by analogy with the utilities from the iproute2 package, where we can, for example, say ip a s eth0 instead of ip addr show dev eth0.
Conclusion
BPF makes it possible to efficiently measure kernel functionality and change it on the fly. The system turned out to be very successful, in the best traditions of UNIX: a simple mechanism that lets you (re)program the kernel has allowed a huge number of people and organizations to experiment. And although the experiments, like the development of the BPF infrastructure itself, are far from finished, the system already has a stable ABI on which reliable and, most importantly, efficient business logic can be built.
In my opinion, the technology became so popular because, on the one hand, you can play with it (the machine architecture can be understood more or less in one evening), and on the other hand, it solves problems that could not be solved (elegantly) before it appeared. Together, these two components make people experiment and dream, which leads to more and more innovative solutions.
This article, although not particularly short, is only an introduction to the world of BPF and does not describe the "advanced" features and important parts of the architecture. The plan going forward is roughly as follows: the next article will be an overview of BPF program types (the 5.8 kernel supports 30 program types); then we will finally look at how to write real BPF applications, using kernel tracing programs as an example; then it will be time for a more in-depth course on the BPF architecture, followed by examples of BPF networking and security applications.
Links
- BPF and XDP Reference Guide: BPF documentation from cilium, or more precisely from Daniel Borkmann, one of the creators and maintainers of BPF. It is one of the first serious descriptions, and it differs from the others in that Daniel knows exactly what he is writing about, so there are no blunders in it. In particular, this document describes how to work with BPF programs of the XDP and TC types using the well-known ip utility from the iproute2 package.
- Documentation/networking/filter.txt: the original file documenting classic and then extended BPF. Worth reading if you want to dig into the assembly language and the technical details of the architecture.
- The BPF blog from Facebook. It is updated rarely but aptly, since it is written by Alexei Starovoitov (the author of eBPF) and Andrii Nakryiko (the maintainer of libbpf).
- bpftool secrets. An entertaining Twitter thread by Quentin Monnet with examples and secrets of using bpftool.
- Dive into BPF: a list of reading material. A huge (and still maintained) list of links to BPF documentation by Quentin Monnet.
Source: habr.com