Linux 5.11 kernel release

After two months of development, Linus Torvalds has announced the release of the Linux 5.11 kernel. Notable changes include support for Intel SGX enclaves, a new mechanism for intercepting system calls, an auxiliary virtual bus, a build failure for modules lacking MODULE_LICENSE(), a fast system call filtering mode in seccomp, the end of maintenance for the ia64 architecture, the move of WiMAX technology to the "staging" branch, and the ability to encapsulate SCTP in UDP.

The new version includes 15480 changes from 1991 developers; the patch is 72 MB in size (12090 files were changed, 868025 lines of code were added, and 261456 lines were deleted). About 46% of all changes introduced in 5.11 are related to device drivers, about 16% to updating code specific to hardware architectures, 13% to the networking stack, 3% to file systems, and 4% to internal kernel subsystems.

Main innovations:

  • Disk Subsystem, I/O and File Systems
    • Btrfs has gained several mount options for recovering data from a damaged file system: "rescue=ignorebadroots" mounts the file system despite damage to some root trees (extent, uuid, data reloc, device, csum, free space), "rescue=ignoredatacsums" disables data checksum verification, and "rescue=all" enables the 'ignorebadroots', 'ignoredatacsums' and 'nologreplay' modes at once. Support for the previously deprecated "inode_cache" mount option has been removed. The code has been prepared for blocks of metadata and data smaller than a page (PAGE_SIZE) and for the zoned allocation mode. Unbuffered (Direct IO) requests have been moved to the iomap infrastructure. A number of operations have been optimized, in some cases with speedups of tens of percent.
    • XFS implements a "needsrepair" flag to signal the need for a repair. When this flag is set, the file system cannot be mounted until the flag is reset by the xfs_repair utility.
    • Ext4 received only bug fixes, optimizations and code cleanup.
    • Re-export of NFS-mounted filesystems is allowed (i.e. an NFS-mounted partition can now be exported via NFS and used as an intermediate cache).
    • The close_range() system call, which lets a process close an entire range of open file descriptors at once, has gained the CLOSE_RANGE_CLOEXEC flag. With this flag the descriptors in the range are marked close-on-exec instead of being closed immediately.
    • New ioctl() calls have been added to the F2FS file system to allow user-space control over which files are compressed. Added a 'compress_mode=' mount option to choose whether to place the compression control handler in kernel or user space.
    • Unprivileged processes are now allowed to mount Overlayfs from within a separate user namespace. A full audit of the code was carried out to verify that the implementation complies with the security model. Overlayfs can also now work with copies of file system images by optionally disabling UUID checking.
    • Support for the msgr2.1 protocol has been added to the Ceph file system, which allows using the AES-GCM algorithm when transferring data in encrypted form.
    • The dm-multipath module can now take CPU affinity ("IO affinity") into account when choosing the path for I/O requests.
  • Memory and system services
    • A new mechanism for intercepting system calls has been added. It is based on prctl() and makes it possible to raise an exception in user space when a specific system call is invoked and to emulate its execution. This functionality is needed by Wine and Proton to emulate Windows system calls, which is required for compatibility with games and programs that execute system calls directly, bypassing the Windows API (for example, as protection against unauthorized use).
    • The userfaultfd() system call, which is designed to handle page faults (accesses to unallocated memory pages) in user space, can now disable the handling of faults that occur at the kernel level, which complicates the exploitation of some vulnerabilities.
    • Support for a task-local storage has been added to the BPF subsystem, which provides data binding to a specific BPF handler.
    • Accounting for memory consumption by BPF programs has been completely redesigned - a cgroup controller has been proposed instead of memlock rlimit to manage memory usage in BPF objects.
    • The BTF (BPF Type Format) mechanism, which provides information for type checking in BPF pseudocode, supports kernel modules.
    • Added support for the shutdown(), renameat2(), and unlinkat() system calls to the io_uring asynchronous I/O interface. When calling io_uring_enter(), the ability to specify a timeout has been added (you can check the support for an argument to specify a timeout using the IORING_FEAT_EXT_ARG flag).
    • The ia64 architecture used in Intel Itanium processors has been moved to the "orphaned" category, which implies the termination of testing. Hewlett Packard Enterprise stopped accepting orders for new Itanium hardware, and Intel did so last year.
    • Support has been dropped for MicroBlaze-based systems without a memory management unit (MMU). Such systems are no longer encountered in practice.
    • For the MIPS architecture, support for code coverage testing has been added using the gcov utility.
    • Added support for the auxiliary virtual bus to interact with multifunctional devices that combine functionality that requires different drivers (for example, network cards with Ethernet and RDMA support). The bus can be used to assign a primary and secondary driver to a device, in a situation where the use of the MFD (Multi-Function Devices) subsystem is problematic.
    • For the RISC-V architecture, support has been added for the CMA (Contiguous Memory Allocator) memory allocation system, which is optimized for allocating large contiguous memory areas using the memory page movement technique. For RISC-V, there are also tools for restricting access to /dev/mem and accounting for interrupt processing time.
    • For 32-bit ARM systems, support has been added for the KASAN (Kernel Address Sanitizer) debugging tool, which detects errors when working with memory. For 64-bit ARM, the KASAN implementation has been switched to using MTE tags (MemTag).
    • Added the epoll_pwait2() system call, which accepts timeouts with nanosecond precision (epoll_wait() takes its timeout in milliseconds).
    • The build system provides an error when attempting to build loadable kernel modules that do not have a code license defined using the MODULE_LICENSE() macro. From now on, a build error will also be caused by using the EXPORT_SYMBOL() macro for static functions.
    • Added support for mapping GEM objects from memory used for I/O, which made it possible to speed up work with the framebuffer on some architectures.
    • Kconfig dropped support for Qt4 (support for Qt5, GTK and Ncurses remains).
  • Virtualization and Security
    • The seccomp() system call has gained a fast path that makes it possible to determine very quickly whether a particular system call is allowed or denied, using a table of permissions (constant-action bitmap) attached to the process, without running a BPF handler.
    • Kernel components have been integrated to create and manage enclaves based on Intel SGX (Software Guard eXtensions) technology, which allows applications to execute code in isolated, encrypted memory areas, to which the rest of the system has limited access.
    • As part of the initiative to restrict user-space access to MSRs (model-specific registers), writing to the MSR_IA32_ENERGY_PERF_BIAS register, which is used to change the processor's energy efficiency mode ("normal", "performance", "powersave"), is now prohibited.
    • The ability to disable migration of high-priority tasks between CPUs has been ported from the kernel-rt branch for real-time systems.
    • For ARM64 systems, the ability to use MTE tags (MemTag, Memory Tagging Extension) for signal handler memory addresses has been added. The use of MTE is enabled by specifying the SA_EXPOSE_TAGBITS option in sigaction() and allows checking the correctness of using pointers to block the exploitation of vulnerabilities caused by accessing already freed memory blocks, buffer overflows, pre-initialization calls, and use outside the current context.
    • Added parameter "DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING" to allow the dm-verity subsystem to verify the hash signatures of certificates placed in the secondary keystore (keyring). In practice, the setting allows you to verify not only certificates built into the kernel, but also certificates loaded during operation, which makes it possible to update certificates without updating the entire kernel.
    • User-mode Linux adds support for suspend-to-idle mode, which allows you to freeze the environment and use the SIGUSR1 signal to wake up.
    • The virtio-mem mechanism, which allows memory to be hot-plugged into and unplugged from virtual machines, has gained support for Big Block Mode (BBM), which makes it possible to add or remove memory in blocks larger than the kernel memory block size; this is needed to optimize VFIO in QEMU.
    • Support for the CHACHA20-POLY1305 cipher has been added to the kernel implementation of TLS.
  • Network subsystem
    • For 802.1Q (VLAN), a connection failure management mechanism (CFM, Connectivity Fault Management) is implemented, which allows you to detect, verify and isolate failures in networks with virtual bridges (Virtual Bridged Networks). For example, CFM can be used to isolate problems in networks spanning multiple independent organizations whose employees only have access to their own equipment.
    • Added support for encapsulating SCTP protocol packets into UDP packets (RFC 6951), which allows using SCTP in networks with old address translators that do not directly support SCTP, as well as implementing SCTP on systems that do not provide direct access to the IP layer.
    • The implementation of WiMAX technology has been moved to the staging branch and is slated for removal in the future if no users turn out to need it. WiMAX is no longer used in public networks, and the only driver in the kernel that can use WiMAX is the outdated Intel 2400m driver. WiMAX was deprecated in NetworkManager in 2015. WiMAX has now been almost completely replaced by technologies such as LTE, HSPA+ and Wi-Fi 802.11n.
    • Work has been done to optimize the performance of processing incoming TCP traffic in the zerocopy mode, i.e. without additional copying to new buffers. For medium sized traffic spanning tens or hundreds of kilobytes of data, using zerocopy instead of recvmsg() is noticeably more efficient. For example, the implemented changes made it possible to increase the efficiency of processing RPC-style traffic with 32 KB messages when using zerocopy by 60-70%.
    • Added new ioctl() calls to create network bridges spanning multiple PPP links. The proposed capability allows frames to move from one channel to another, for example from PPPoE to a PPPoL2TP session.
    • Integration of MPTCP (MultiPath TCP) into the kernel has continued. MPTCP is an extension of the TCP protocol that lets a single TCP connection deliver packets simultaneously along several routes through different network interfaces bound to different IP addresses. The new release adds support for the ADD_ADDR option, which advertises available IP addresses that can be connected to when adding new streams to an existing MPTCP connection.
    • Added the ability to configure what happens when the busy-polling budget is exhausted. The previously available SO_BUSY_POLL mode meant switching to softirq when the budget was exhausted. For applications that need to continue polling, a new SO_PREFER_BUSY_POLL option has been proposed.
    • IPv6 supports the SRv6 End.DT4 and End.DT6 modes used to create multi-tenant IPv4 L3 VPNs and VRF (Virtual routing and forwarding) devices.
    • Netfilter unified the implementation of set expressions, which made it possible to specify multiple expressions for each element of set lists.
    • APIs have been added to the 802.11 wireless stack to configure SAR power limits, as well as SAE PWE and HE MCS settings. Support for the 6 GHz (Ultra High Band) band has been added to the Intel iwlwifi driver. The Qualcomm Ath11k driver adds support for FILS (Fast Initial Link Setup, standardized as IEEE 802.11ai) technology to eliminate roaming delays during migration from one access point to another.
  • Equipment
    • The amdgpu driver introduces support for AMD "Green Sardine" (Ryzen 5000) APUs and "Dimgrey Cavefish" (Navi 2) GPUs, as well as initial support for AMD Van Gogh APUs with Zen 2 cores and RDNA 2 (Navi 2) GPUs. Added support for new Renoir APU IDs (based on Zen 2 CPU and Vega GPU).
    • The i915 driver for Intel video cards implements support for IS (Integer scaling) technology with the implementation of a filter for scaling up taking into account the state of neighboring pixels (Nearest-neighbor interpolation) to determine the color of missing pixels. Expanded support for Intel DG1 discrete cards. Implemented support for the "Big Joiner" technology, which has been present since Ice Lake / Gen11 chips and allows you to use one transcoder to process two streams, for example, to output to an 8K screen through one DisplayPort. Added asynchronous switching between two buffers in video memory (async flip).
    • The nouveau driver has added initial support for NVIDIA GPUs based on the "Ampere" microarchitecture (GA100, GeForce RTX 30xx), currently limited to video mode setting.
    • Added support for the 3WIRE protocol used in LCD panels. Added support for Novatek nt36672a, TDO tl070wsh30, Innolux N125HCE-GN1 and ABT Y030XX067A 3.0 panels. Also worth noting is support for the panel used in OnePlus 6 and 6T smartphones, which makes it possible to boot an unmodified kernel on these devices.
    • Added support for the first Intel Maple Ridge Discrete USB4 Host Controller.
    • Added support for Allwinner H6 I2S, Analog Devices ADAU1372, Intel Alderlake-S, Mediatek MT8192, NXP i.MX HDMI and XCVR, Realtek RT715 and Qualcomm SM8250 audio codecs.
    • Added support for ARM boards, devices and platforms: Galaxy Note 10.1, Microsoft Lumia 950 XL, NanoPi R1, FriendlyArm ZeroPi, Elimo Initium SBC, Broadcom BCM4908, Mediatek MT8192/MT6779/MT8167, MStar Infinity2M, Nuvoton NPCM730, Marvell Armada 382, Mikrotik based on Marvell Prestera 98DX3236, servers with Nuvoton NPCM750 BMC, Kontron i.MX8M Mini, Espressobin Ultra, "Trogdor" Chromebook, Kobol Helios64, Engicam PX30.Core.
    • Built-in support for the Ouya game console based on NVIDIA Tegra 3.
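
As an illustration of the CLOSE_RANGE_CLOEXEC flag described in the file systems section, the following Python sketch invokes close_range() through the C library's generic syscall() wrapper. It assumes Linux, the syscall number 436 and the flag value 1 << 2 from the kernel UAPI headers (both hard-coded here); the helper name mark_range_cloexec is hypothetical. On kernels or sandboxes that do not support the flag, the helper simply returns False.

```python
import ctypes
import os
import sys

# Assumptions: values copied from the Linux UAPI headers. The number 436 is
# shared by most architectures, but it is not guaranteed for every one.
SYS_CLOSE_RANGE = 436
CLOSE_RANGE_CLOEXEC = 1 << 2


def mark_range_cloexec(first: int, last: int) -> bool:
    """Mark every open descriptor in [first, last] as close-on-exec.

    Returns True on success and False if the call is unavailable or fails
    (for example on kernels older than 5.11 or inside a restrictive sandbox).
    """
    if not sys.platform.startswith("linux"):
        return False
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    rc = libc.syscall(
        ctypes.c_long(SYS_CLOSE_RANGE),
        ctypes.c_long(first),
        ctypes.c_long(last),
        ctypes.c_long(CLOSE_RANGE_CLOEXEC),
    )
    return rc == 0


if __name__ == "__main__":
    r, w = os.pipe()
    # Python creates descriptors as non-inheritable, so undo that first
    # to make the effect of the flag visible.
    os.set_inheritable(r, True)
    os.set_inheritable(w, True)
    ok = mark_range_cloexec(min(r, w), max(r, w))
    print("close_range(CLOSE_RANGE_CLOEXEC) succeeded:", ok)
    print("still inheritable:", os.get_inheritable(r), os.get_inheritable(w))
    os.close(r)
    os.close(w)
```

The descriptors stay open and usable in the calling process; they are only closed automatically when the process later calls exec().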
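
The difference between epoll_wait() and the new epoll_pwait2() mentioned above is only in how the timeout is expressed. The sketch below is one possible way to call epoll_pwait2() from Python with a sub-millisecond timeout. It assumes Linux and the syscall number 441 from the UAPI headers; the helper name epoll_wait_ns and the structure name KernelTimespec are made up for this example.

```python
import ctypes
import select
import sys

SYS_EPOLL_PWAIT2 = 441  # assumption: number taken from the Linux UAPI headers


class KernelTimespec(ctypes.Structure):
    """Mirror of struct __kernel_timespec: two signed 64-bit fields."""
    _fields_ = [("tv_sec", ctypes.c_int64), ("tv_nsec", ctypes.c_int64)]


def epoll_wait_ns(epfd: int, timeout_ns: int):
    """Wait on an epoll descriptor for at most timeout_ns nanoseconds.

    Returns the number of ready events (0 on timeout), or None if the call
    is unavailable or fails (for example on kernels older than 5.11).
    """
    if not sys.platform.startswith("linux"):
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ts = KernelTimespec(timeout_ns // 1_000_000_000, timeout_ns % 1_000_000_000)
    events = ctypes.create_string_buffer(64)  # room for one struct epoll_event
    rc = libc.syscall(
        ctypes.c_long(SYS_EPOLL_PWAIT2),
        ctypes.c_long(epfd),
        events,                 # output buffer for events
        ctypes.c_long(1),       # maxevents
        ctypes.byref(ts),       # timeout with nanosecond resolution
        None,                   # no signal mask
        ctypes.c_size_t(0),     # sigsetsize, ignored when the mask is NULL
    )
    return rc if rc >= 0 else None


if __name__ == "__main__" and hasattr(select, "epoll"):
    ep = select.epoll()
    # 250 microseconds: a value that cannot be expressed in whole milliseconds.
    print("ready events:", epoll_wait_ns(ep.fileno(), 250_000))
    ep.close()
```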
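
The idea behind the seccomp constant-action bitmap can be shown with a small toy model. This is not kernel code: the filter is an ordinary Python function rather than a BPF program, and every name in it (example_filter, TrapArgs, Checker) is hypothetical. The model assumes that a system call number gets a bit in the bitmap only when the filter allows it without looking at the call's arguments, so that later checks for that number can skip the filter entirely.

```python
ALLOW, ERRNO = "ALLOW", "ERRNO"


class ArgsInspected(Exception):
    """Raised when a filter looks at syscall arguments during precomputation."""


class TrapArgs:
    """Stand-in for syscall arguments that reports any attempt to read them."""

    def __getitem__(self, index):
        raise ArgsInspected(index)


def example_filter(nr, args):
    """Hypothetical policy: 0 and 1 are always allowed, 2 depends on args."""
    if nr in (0, 1):
        return ALLOW
    if nr == 2:
        return ALLOW if args[0] == 0 else ERRNO
    return ERRNO


def build_allow_bitmap(filt, nr_syscalls):
    """Set bit nr when the filter allows syscall nr regardless of arguments."""
    bitmap = 0
    for nr in range(nr_syscalls):
        try:
            if filt(nr, TrapArgs()) == ALLOW:
                bitmap |= 1 << nr
        except ArgsInspected:
            pass  # the decision depends on arguments, so it cannot be cached
    return bitmap


class Checker:
    def __init__(self, filt, nr_syscalls=8):
        self.filt = filt
        self.nr_syscalls = nr_syscalls
        self.bitmap = build_allow_bitmap(filt, nr_syscalls)
        self.filter_runs = 0  # how often the slow path was taken

    def check(self, nr, args):
        # Fast path: one bit test, the filter is not executed.
        if 0 <= nr < self.nr_syscalls and (self.bitmap >> nr) & 1:
            return ALLOW
        # Slow path: run the full filter.
        self.filter_runs += 1
        return self.filt(nr, args)


if __name__ == "__main__":
    checker = Checker(example_filter)
    print(bin(checker.bitmap))
    print(checker.check(0, (5,)), checker.filter_runs)
    print(checker.check(2, (0,)), checker.filter_runs)
```

In this model only system calls 0 and 1 end up in the bitmap; system call 2 always goes through the filter because its result depends on an argument.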
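
The SO_PREFER_BUSY_POLL option from the network section is set like any other socket option. The following sketch assumes Linux and the option numbers 46 (SO_BUSY_POLL) and 69 (SO_PREFER_BUSY_POLL) from asm-generic/socket.h, which Python's socket module may not export; the helper name and the 50-microsecond default are illustrative only. The call may be rejected on older kernels or for processes without sufficient privileges, in which case the helper returns False.

```python
import socket
import sys

# Assumptions: numeric values from asm-generic/socket.h; a few architectures
# use different numbers.
SO_BUSY_POLL = 46
SO_PREFER_BUSY_POLL = 69


def enable_preferred_busy_poll(sock: socket.socket, usecs: int = 50) -> bool:
    """Ask the kernel to busy-poll for this socket and to keep preferring it.

    Returns True if both options were accepted and False otherwise. If only
    the first option is accepted, it stays set on the socket.
    """
    if not sys.platform.startswith("linux"):
        return False
    try:
        # Busy-poll for up to `usecs` microseconds on blocking receives.
        sock.setsockopt(socket.SOL_SOCKET, SO_BUSY_POLL, usecs)
        # Prefer continued busy polling over falling back to softirq.
        sock.setsockopt(socket.SOL_SOCKET, SO_PREFER_BUSY_POLL, 1)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    print("busy polling preferred:", enable_preferred_busy_poll(s))
    s.close()
```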

At the same time, the Latin American Free Software Foundation released a completely free variant of the 5.11 kernel, Linux-libre 5.11-gnu, stripped of firmware and driver elements containing non-free components or code sections whose scope of use is restricted by the manufacturer. In the new release, the drivers for qat_4xxx (crypto), lt9611uxcm (dsi/hdmi bridge), ccs/smia++ (sensor), ath11k_pci, nxp audio transceiver and mhi pci controller have been cleaned. The blob-cleaning code has been updated for the amdgpu, btqca, btrtl, btusb and i915 csr drivers and subsystems. New blobs have been disabled in m3 rproc, idt82p33 ptp clock and qualcomm arm64.

Source: opennet.ru
