The bpftime project develops a user-space implementation of eBPF

The bpftime project develops a runtime and a virtual machine for executing eBPF handlers in user space. It lets eBPF programs for tracing and for intervening in process behavior run entirely in user space, using mechanisms such as uprobes and programmatic system call interception. The project reports that, by eliminating unnecessary context switches, bpftime reduces overhead roughly tenfold compared with the uprobe and uretprobe functionality provided by the Linux kernel. In addition, bpftime greatly simplifies debugging, can potentially be used on systems without a Linux kernel, and does not need the elevated privileges required to load an eBPF program into the kernel. The project code is written in C/C++ and is distributed under the MIT license.

System call interception and uprobe hooks are implemented through binary rewriting: the machine code of the running application is modified so that system calls, function entry points and local functions jump to debugging handlers. This is considerably more efficient than interception through uprobes at the Linux kernel level.

Supported operations include replacing or modifying functions; attaching handlers (hooks) and filters; redirecting or blocking system calls or replacing their parameters; intercepting function entry and exit points; and inserting a handler at an arbitrary offset in the code. Bpftime can attach to any running process on the system without restarting or rebuilding it. It is injected into already running processes via ptrace and into newly started ones via LD_PRELOAD.
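The semantics of entry hooks, exit hooks and blocking filters can be sketched in plain C. This is only an illustration under stated assumptions: it routes calls through an explicit wrapper instead of patching machine code, and every name in it (hook_point, call_hooked, clamp_or_block and so on) is hypothetical rather than part of bpftime.

```c
#include <assert.h>
#include <stddef.h>

/* Verdict returned by an entry hook (hypothetical names throughout). */
enum verdict { HOOK_ALLOW, HOOK_BLOCK };

typedef enum verdict (*entry_hook_fn)(long *arg, long *blocked_ret);
typedef void (*exit_hook_fn)(long ret);

struct hook_point {
    long (*target)(long);   /* the function being intercepted */
    entry_hook_fn on_entry; /* may rewrite the argument or block the call */
    exit_hook_fn on_exit;   /* observes the return value */
};

/* Calls hp->target through its hooks. A real binary rewriter would reach
   this logic by patching a jump into the target; here it is called directly. */
long call_hooked(const struct hook_point *hp, long arg)
{
    long ret = 0;
    assert(hp != NULL && hp->target != NULL);
    if (hp->on_entry && hp->on_entry(&arg, &ret) == HOOK_BLOCK)
        return ret;              /* blocked: the target never runs */
    ret = hp->target(arg);       /* arg may have been rewritten by the hook */
    if (hp->on_exit)
        hp->on_exit(ret);
    return ret;
}

/* Demo target and hooks. */
static long last_seen_ret;

static long twice(long x) { return 2 * x; }

static enum verdict clamp_or_block(long *arg, long *blocked_ret)
{
    if (*arg > 100) {            /* filter: refuse large arguments */
        *blocked_ret = -1;
        return HOOK_BLOCK;
    }
    if (*arg < 0)                /* parameter replacement */
        *arg = 0;
    return HOOK_ALLOW;
}

static void record_ret(long ret) { last_seen_ret = ret; }

long hooked_twice(long x)
{
    const struct hook_point hp = { twice, clamp_or_block, record_ret };
    return call_hooked(&hp, x);
}

long last_recorded_return(void) { return last_seen_ret; }
```

In this sketch the entry hook plays the role of a uprobe-style handler or filter and the exit hook the role of a uretprobe-style handler; a blocked call returns the value chosen by the filter and skips both the target and the exit hook.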

The project develops the following components: a runtime that attaches eBPF programs to system call tracepoints and uprobes; an eBPF virtual machine with a JIT compiler for isolated execution of eBPF programs at the user-process level (AOT compilation is also supported); and a background process that interacts with the kernel and provides compatibility with the kernel's uprobe subsystem. Bpftime also supports loading eBPF programs from the kernel into user space so that they can work together with eBPF programs in the kernel, which is used, for example, to handle kprobes or install network filters.
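To give a feel for what a user-space eBPF virtual machine does at its core, here is a toy interpreter for four eBPF instructions. It is a sketch under stated assumptions, not bpftime's VM: there is no JIT, no verifier, no memory access and no helper calls, and the names toy_vm_exec and toy_vm_demo are hypothetical. The instruction layout and opcode values follow the standard eBPF encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as the kernel's struct bpf_insn (8 bytes). */
struct insn {
    uint8_t code;        /* opcode */
    uint8_t dst_reg : 4; /* destination register */
    uint8_t src_reg : 4; /* source register */
    int16_t off;         /* jump/memory offset (unused in this toy) */
    int32_t imm;         /* immediate operand */
};

/* The four eBPF opcodes this toy understands. */
enum {
    OP_ADD64_IMM = 0x07, /* dst += imm */
    OP_ADD64_REG = 0x0f, /* dst += src */
    OP_MOV64_IMM = 0xb7, /* dst = imm  */
    OP_EXIT      = 0x95  /* return r0  */
};

/* Runs prog with r1 = ctx. Returns 0 and stores r0 in *result, or -1 on an
   unknown opcode, a bad register, or a program that runs off its end. */
int toy_vm_exec(const struct insn *prog, size_t len, uint64_t ctx,
                uint64_t *result)
{
    uint64_t reg[11] = {0};      /* r0..r10 */
    assert(prog != NULL && result != NULL);
    reg[1] = ctx;
    for (size_t pc = 0; pc < len; pc++) {
        const struct insn *i = &prog[pc];
        if (i->dst_reg > 10 || i->src_reg > 10)
            return -1;
        switch (i->code) {
        case OP_MOV64_IMM:       /* immediates are sign-extended to 64 bits */
            reg[i->dst_reg] = (uint64_t)(int64_t)i->imm;
            break;
        case OP_ADD64_IMM:
            reg[i->dst_reg] += (uint64_t)(int64_t)i->imm;
            break;
        case OP_ADD64_REG:
            reg[i->dst_reg] += reg[i->src_reg];
            break;
        case OP_EXIT:
            *result = reg[0];
            return 0;
        default:
            return -1;
        }
    }
    return -1;
}

/* Demo program: returns 40 + 2 + ctx. */
uint64_t toy_vm_demo(uint64_t ctx)
{
    const struct insn prog[] = {
        { OP_MOV64_IMM, 0, 0, 0, 40 }, /* r0 = 40   */
        { OP_ADD64_IMM, 0, 0, 0, 2 },  /* r0 += 2   */
        { OP_ADD64_REG, 0, 1, 0, 0 },  /* r0 += r1  */
        { OP_EXIT,      0, 0, 0, 0 },  /* return r0 */
    };
    uint64_t r0 = 0;
    int rc = toy_vm_exec(prog, sizeof prog / sizeof prog[0], ctx, &r0);
    assert(rc == 0);
    (void)rc;
    return r0;
}
```

A production VM additionally has to verify programs before running them, bound their memory accesses and expose helper functions; the interpreter loop above only shows the register-machine model that a JIT or AOT compiler would translate into native code.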

The eBPF virtual machine is packaged as a pluggable library with an API similar to that of ubpf, so it can be used to add eBPF functionality to other projects. To aggregate data from several processes, bpftime supports shared eBPF maps located in shared memory. Standard eBPF handlers written for the kernel can be used with bpftime, and they can be built with the standard clang and libbpf-based tools.
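The idea of aggregating data from several processes in shared memory can be sketched as follows. This is a minimal illustration, assuming Linux or another POSIX system with MAP_ANONYMOUS; it says nothing about how bpftime actually lays out its maps. The names shared_counters and aggregate_hits are hypothetical, and the mapping is inherited across fork(), whereas unrelated processes would need a named shared-memory object instead.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SLOTS 4

/* A tiny fixed-size "array map": one atomic counter per key. */
struct shared_counters {
    _Atomic uint64_t hits[SLOTS];
};

/* Forks nproc workers; each adds per_proc hits to slot `key`.
   Returns the aggregated count, or -1 on error. */
long aggregate_hits(int nproc, int per_proc, unsigned key)
{
    assert(key < SLOTS);

    /* MAP_SHARED makes the writes of every child visible to the parent.
       Anonymous mappings start out zero-filled. */
    struct shared_counters *map =
        mmap(NULL, sizeof *map, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return -1;

    int started = 0;
    for (int p = 0; p < nproc; p++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            /* Child: atomic increments avoid lost updates between processes. */
            for (int i = 0; i < per_proc; i++)
                atomic_fetch_add(&map->hits[key], 1);
            _exit(0);
        }
        started++;
    }

    /* Wait for every worker that was started before reading the total. */
    for (int p = 0; p < started; p++)
        wait(NULL);

    long total = (started == nproc) ? (long)atomic_load(&map->hits[key]) : -1;
    munmap(map, sizeof *map);
    return total;
}
```

With four workers that each record 1000 hits, the parent reads a total of 4000 from the shared slot, which is the kind of cross-process summary a shared map makes possible.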

With bpftime, tracing systems such as BCC, bpftrace and Deepflow can run in user space. For example, the developers demonstrated the sslsniff script from the BCC framework being used to analyze and save encrypted traffic in nginx. In the tests conducted, nginx performance dropped by 58% when sslsniff ran on the kernel side, and by 12.3% when the handler was moved to user space.
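The two percentages are easier to compare as retained performance. The small calculation below assumes that "performance" means throughput relative to an untraced baseline, which the article does not spell out; the function names are hypothetical.

```c
#include <assert.h>

/* Fraction of baseline performance left after a drop of drop_percent. */
double retained_fraction(double drop_percent)
{
    assert(drop_percent >= 0.0 && drop_percent < 100.0);
    return 1.0 - drop_percent / 100.0;
}

/* How many times more performance the user-space run retains
   compared with the kernel-side run. */
double userspace_advantage(double kernel_drop, double user_drop)
{
    return retained_fraction(user_drop) / retained_fraction(kernel_drop);
}
```

For the reported drops of 58% and 12.3%, the retained fractions are 0.42 and 0.877, so under this assumption nginx with the user-space handler delivers roughly 2.09 times the performance it delivers with the kernel-side handler.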

Figure: process tracing architecture using the original eBPF in the kernel.

Figure: user-space tracing architecture using bpftime.

Figure: hybrid mode, in which bpftime works together with eBPF in the kernel, for example, to install network filters or move individual handlers to user space.

Future plans include fault injection; hot patching to change program logic or fix bugs in binary builds; an Nginx module that allows extensions to be written as eBPF programs (for example, for dynamic route selection, caching, applying security policies and load balancing); and extensions to the FUSE subsystem (for example, file system extensions for caching or access control implemented as eBPF programs).

Source: opennet.ru
