Linux 5.8 kernel release

After two months of development, Linus Torvalds released the Linux 5.8 kernel. Among the most notable changes: the KCSAN race condition detector, a universal mechanism for delivering notifications to user space, support for inline encryption hardware, the Shadow Call Stack and BTI protection mechanisms for ARM64, support for the Russian Baikal-T1 processor, and the ability to mount separate procfs instances.

The 5.8 kernel is the largest release in the project's history by number of changes. The changes are not concentrated in any one subsystem; they span different parts of the kernel and mostly involve internal rework and cleanup. Drivers account for the largest share. The new version includes 17606 patches from 2081 developers, touching approximately 20% of all files in the kernel source repository. The patch size is 65 MB (16180 files changed, 1043240 lines of code added, 489854 lines deleted). By comparison, the 5.7 branch had 15033 patches and a patch size of 39 MB. About 37% of all changes in 5.8 relate to device drivers, approximately 16% to code specific to hardware architectures, 11% to the networking stack, 3% to file systems, and 4% to internal kernel subsystems.
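The figures above can be cross-checked with a little arithmetic. The short Python sketch below uses only the numbers quoted in the paragraph; the variable names are made up for illustration.

```python
# Figures quoted in the paragraph above.
patches_58 = 17606
patches_57 = 15033
lines_added = 1_043_240
lines_deleted = 489_854

# Net growth of the source tree, in lines.
net_lines = lines_added - lines_deleted

# Relative increase in the number of patches compared with 5.7.
patch_growth_pct = (patches_58 - patches_57) / patches_57 * 100

# Shares of changes by area, in percent, as listed in the text.
shares = {
    "drivers": 37,
    "architectures": 16,
    "networking": 11,
    "file systems": 3,
    "internal subsystems": 4,
}
# Whatever is left is not itemized in the text.
other_pct = 100 - sum(shares.values())

print(f"net lines added: {net_lines}")                 # 553386
print(f"patch growth vs 5.7: {patch_growth_pct:.1f}%")  # 17.1%
print(f"share not itemized: {other_pct}%")             # 29%
```

In other words, the tree grew by a little over half a million lines, and the listed categories cover 71% of the changes.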

Main changes:

  • Virtualization and Security
    • Loading of kernel modules that contain code sections flagged as both writable and executable is now blocked. The change is part of a larger effort to rid the kernel of memory pages that allow simultaneous execution and writing.
    • It is now possible to create separate procfs instances, so that several procfs mount points can be mounted with different options while reflecting the same PID namespace. Previously, all procfs mount points mirrored a single internal representation, and any change to the mount options affected every other mount point associated with the same PID namespace. One cited use case for mounting with different options is lightweight isolation for embedded systems, with the ability to hide certain types of processes and information nodes in procfs.
    • For the ARM64 platform, support has been added for the Shadow Call Stack mechanism, provided by the Clang compiler to protect against overwriting a function's return address when a buffer on the stack overflows. The protection works by saving the return address in a separate "shadow" stack after control is transferred to the function and retrieving that address before the function returns.

    • Added support for ARMv8.5-BTI (Branch Target Identification) instructions on the ARM64 platform, which protect against branching to instructions that are not meant to be branch targets. Blocking jumps to arbitrary code locations hinders the construction of gadgets in exploits that use return-oriented programming (ROP), in which the attacker does not try to place their own code in memory but instead chains together existing fragments of machine instructions, each ending with a return instruction, to obtain the desired functionality.
    • Added support for hardware inline encryption of block devices (Inline Encryption). Inline encryption devices are typically built into the drive but sit logically between system memory and the disk, transparently encrypting and decrypting I/O using keys and an encryption algorithm specified by the kernel.
    • Added the "initrdmem" kernel command-line option, which specifies the physical memory location of the initrd when the initial boot image is placed in RAM.
    • Added new capabilities: CAP_PERFMON, for accessing the perf subsystem and performing performance monitoring, and CAP_BPF, which permits some BPF operations (such as loading BPF programs) that previously required CAP_SYS_ADMIN (the privileges formerly covered by CAP_SYS_ADMIN are now split into a combination of CAP_BPF, CAP_PERFMON, and CAP_NET_ADMIN).
    • Added a new virtio-mem device that allows memory to be hot-plugged into and unplugged from guest systems.
    • Implemented revocation of /dev/mem mappings when a device driver uses overlapping memory areas.
    • Added protection against the CROSSTalk/SRBDS vulnerability, which makes it possible to recover the results of some instructions executed on another CPU core.
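The Shadow Call Stack idea from the list above can be illustrated with a toy model. The Python sketch below is not how Clang or the kernel implement the mechanism; the ToyCpu class, its methods, and the addresses are hypothetical names invented for illustration. It only shows why keeping a second copy of the return address out of reach of an overflowing buffer defeats the overwrite.

```python
class ToyCpu:
    """Toy model: a regular stack that an overflow can corrupt,
    plus a separate shadow stack that holds only return addresses."""

    def __init__(self):
        self.stack = []         # return address slot followed by a local buffer
        self.shadow_stack = []  # return addresses only, not reachable by the buffer

    def enter(self, return_address, buffer_size):
        # On function entry the return address is saved in both places.
        self.stack.append(return_address)
        self.stack.extend([0] * buffer_size)
        self.shadow_stack.append(return_address)

    def unchecked_write(self, payload):
        # Writes from the top of the stack downward with no bounds check,
        # so a payload longer than the buffer reaches the return slot.
        top = len(self.stack) - 1
        for i, value in enumerate(payload):
            self.stack[top - i] = value

    def leave(self, buffer_size, use_shadow):
        # Drop the local buffer, then pick the address to return to.
        del self.stack[len(self.stack) - buffer_size:]
        on_stack = self.stack.pop()
        saved = self.shadow_stack.pop()
        return saved if use_shadow else on_stack


LEGIT = 0x1000  # hypothetical legitimate return address
EVIL = 0xBAD    # hypothetical attacker-controlled value


def run(use_shadow):
    cpu = ToyCpu()
    cpu.enter(LEGIT, buffer_size=4)
    cpu.unchecked_write([EVIL] * 5)  # one element more than the buffer holds
    return cpu.leave(buffer_size=4, use_shadow=use_shadow)


print(hex(run(use_shadow=False)))  # 0xbad: the overwritten address is used
print(hex(run(use_shadow=True)))   # 0x1000: the shadow copy is used instead
```

The overflow still corrupts the regular stack in both runs; the difference is only which copy of the return address is trusted on exit.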
  • Memory and system services
    • Recommendations on inclusive terminology have been adopted in the document defining the coding style rules. Developers are advised not to use the pairs 'master / slave' and 'blacklist / whitelist', or the word 'slave' on its own. The recommendations apply only to new uses of these terms; existing occurrences in the kernel remain untouched. In new code, the terms are still permitted where needed to maintain the API and ABI exposed to user space, and when updating code to support existing hardware or protocols whose specifications mandate particular terms.
    • The KCSAN (Kernel Concurrency Sanitizer) debugging tool has been included; it dynamically detects race conditions inside the kernel. KCSAN is supported when building with GCC and Clang, and requires special compile-time instrumentation to track memory accesses (watchpoints are used that trigger when memory is read or modified). KCSAN development focuses on preventing false positives, scalability, and ease of use.
    • Added a universal mechanism for delivering notifications from the kernel to user space. The mechanism is built on the standard pipe driver and efficiently distributes kernel notifications through pipes opened in user space. Notifications are received through pipes opened in a special mode, which accumulate messages from the kernel in a ring buffer. Reading is done with the usual read() call. The pipe owner determines which kernel sources to monitor and can define a filter to ignore messages and events of certain types. Currently the only supported events are key operations, such as adding or removing keys and changing their attributes. These events are planned to be used in GNOME.
    • Development of the 'pidfd' functionality continues; it helps handle PID reuse (a pidfd is bound to a particular process and does not change, whereas a PID can be assigned to another process after the process currently associated with it terminates). The new version adds support for using a pidfd to attach a process to namespaces (a pidfd can be passed to the setns system call). With a pidfd, a process can be attached to several types of namespaces in a single call, which significantly reduces the number of system calls needed and makes the attachment atomic (if attaching to one of the namespaces fails, none of the others are joined).
    • A new faccessat2() system call has been added. It differs from faccessat() by an additional flags argument, in line with POSIX recommendations (previously these flags were emulated in the C library; faccessat2 allows them to be implemented in the kernel).

    • In cgroups, a memory.swap.high setting has been added, which can be used to throttle tasks that use too much swap space.
    • Support for the tee() system call has been added to the io_uring asynchronous I/O interface.
    • Added the "BPF iterator" mechanism, designed to output the contents of kernel structures to user space.
    • Added the ability to use a ring buffer for data exchange between BPF programs.
    • The padata mechanism, designed to organize parallel execution of tasks in the kernel, has gained support for multi-threaded jobs with load balancing.
    • The pstore mechanism, which saves debugging information about the cause of a crash in a memory area that survives reboots, has gained a backend for saving information to block devices.
    • The implementation of local locks has been carried over from the PREEMPT_RT kernel branch.
    • Added a new buffer allocation API for AF_XDP, aimed at making it easier to write network drivers with XDP (eXpress Data Path) support.
    • For the RISC-V architecture, support for debugging kernel components using KGDB has been implemented.
    • The minimum version of GCC that can be used to build the kernel has been raised to 4.8. A future release is planned to raise the bar to GCC 4.9.
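The flags that faccessat2() accepts can be exercised indirectly from user space. As a minimal sketch, Python's os.access() takes an effective_ids parameter that corresponds to the AT_EACCESS flag of faccessat(). Whether the C library forwards that flag to the new faccessat2() system call or emulates it depends on the libc and kernel versions in use, which this example does not check; the helper name and file names are hypothetical.

```python
import os
import tempfile


def readable_by_effective_ids(path):
    """Check read access using the effective rather than the real UID/GID."""
    if os.access in os.supports_effective_ids:
        # Maps to faccessat(..., AT_EACCESS) in the C library.
        return os.access(path, os.R_OK, effective_ids=True)
    # Platforms without AT_EACCESS support: fall back to the real IDs.
    return os.access(path, os.R_OK)


with tempfile.TemporaryDirectory() as tmpdir:
    existing = os.path.join(tmpdir, "present.txt")  # hypothetical file name
    with open(existing, "w") as f:
        f.write("demo")
    missing = os.path.join(tmpdir, "absent.txt")    # never created

    existing_ok = readable_by_effective_ids(existing)
    missing_ok = readable_by_effective_ids(missing)

print("existing file readable:", existing_ok)
print("missing file readable:", missing_ok)
```

The distinction between real and effective IDs matters mainly for set-user-ID programs; for an ordinary process the two checks give the same answer.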
  • Disk Subsystem, I/O and File Systems
    • Device mapper has gained a new dm-ebs (emulate block size) target that can be used to emulate a smaller logical block size (for example, to emulate 512-byte sectors on disks with a 4K sector size).
    • The F2FS file system now supports compression using the LZO-RLE algorithm.
    • dm-crypt now supports encrypted keys.
    • Btrfs has improved handling of direct I/O read operations. On mount, the check for deleted subvolumes and orphaned directories has been sped up.
    • The "nodelete" mount parameter has been added to CIFS; it leaves the usual permission checks on the server in place but prevents the client from deleting files or directories.
    • Ext4 has improved handling of ENOSPC errors when using multithreading. Support for the gnu.* namespace used by GNU Hurd has been added to xattr.
    • For Ext4 and XFS, DAX operations (direct access to the file system, bypassing the page cache and the block device layer) can now be enabled for individual files and directories.
    • The statx() system call has a new STATX_ATTR_DAX flag, which indicates that the file is accessed using the DAX mechanism.
    • exFAT now supports boot area verification.
    • FAT has improved readahead of file system elements. A test on a slow 2TB USB drive showed the test completion time drop from 383 to 51 seconds.
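The dm-ebs item above comes down to address arithmetic: requests expressed in 512-byte sectors have to be mapped onto 4K blocks of the underlying device. The Python sketch below shows only that arithmetic and does not describe how dm-ebs itself is implemented. The function name and the read-modify-write flag are assumptions made for illustration, based on the general observation that a device with 4K blocks cannot write part of a block directly.

```python
EMULATED_SECTOR = 512    # logical block size presented to upper layers
UNDERLYING_BLOCK = 4096  # logical block size of the real device
SECTORS_PER_BLOCK = UNDERLYING_BLOCK // EMULATED_SECTOR  # 8


def map_request(first_sector, sector_count):
    """Translate a request in 512-byte sectors into underlying 4K blocks.

    Returns (first_block, block_count, partial), where partial is True
    when the request does not cover whole 4K blocks. For a write, a
    partial request would need a read-modify-write cycle in this model.
    """
    if first_sector < 0 or sector_count <= 0:
        raise ValueError("sector numbers must be non-negative, count positive")
    last_sector = first_sector + sector_count - 1
    first_block = first_sector // SECTORS_PER_BLOCK
    last_block = last_sector // SECTORS_PER_BLOCK
    aligned = (first_sector % SECTORS_PER_BLOCK == 0
               and sector_count % SECTORS_PER_BLOCK == 0)
    return first_block, last_block - first_block + 1, not aligned


print(map_request(0, 8))    # (0, 1, False): exactly one whole 4K block
print(map_request(7, 2))    # (0, 2, True): straddles two blocks, both partial
print(map_request(16, 16))  # (2, 2, False): two whole blocks
```

The second call shows the costly case: a 1 KB request touches two 4K blocks, neither of them completely.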
  • Network subsystem
    • Support for MRP (Media Redundancy Protocol) has been added to the network bridge code; the protocol provides fault tolerance by connecting several Ethernet switches in a ring.
    • A new "gate" action has been added to the traffic control system (tc); it allows defining time intervals during which certain packets are processed or discarded.
    • Support for testing the connected network cable and self-diagnosis of network devices has been added to the kernel and the ethtool utility.
    • Support for MPLS (Multiprotocol Label Switching) packet routing has been added to the IPv6 stack (MPLS was already supported for IPv4).
    • Added support for IKE (Internet Key Exchange) and IPSec over TCP (RFC 8229) to bypass possible UDP blocking.
    • Added the rnbd network block device, which provides remote access to a block device over an RDMA transport (InfiniBand, RoCE, iWARP) using the RTRS protocol.
    • The TCP stack now supports range compression in selective acknowledgment (SACK) responses.
    • TCP-LCD support (RFC 6069, Long Connectivity Disruptions) has been implemented for IPv6.
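The idea behind the tc "gate" action mentioned above, passing or discarding packets depending on the time interval in which they arrive, can be sketched as a toy model. The Python code below is not the kernel implementation and does not use the tc command-line syntax. The GateEntry type, the cyclically repeating schedule, and all timing values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateEntry:
    is_open: bool     # True: packets pass; False: packets are dropped
    duration_ns: int  # how long this state lasts


def gate_is_open(schedule, now_ns, base_time_ns=0):
    """Return the gate state at now_ns for a schedule that repeats cyclically."""
    cycle_ns = sum(entry.duration_ns for entry in schedule)
    if cycle_ns <= 0:
        raise ValueError("schedule must cover a positive amount of time")
    # Position inside the current cycle.
    offset = (now_ns - base_time_ns) % cycle_ns
    for entry in schedule:
        if offset < entry.duration_ns:
            return entry.is_open
        offset -= entry.duration_ns
    raise AssertionError("unreachable: offset is always inside the cycle")


# Hypothetical schedule: open for 200 us, then closed for 300 us, repeating.
schedule = [
    GateEntry(is_open=True, duration_ns=200_000),
    GateEntry(is_open=False, duration_ns=300_000),
]

arrivals_ns = [0, 150_000, 200_000, 499_999, 500_000, 650_000, 720_000]
verdicts = ["pass" if gate_is_open(schedule, t) else "drop" for t in arrivals_ns]
print(verdicts)
# ['pass', 'pass', 'drop', 'drop', 'pass', 'pass', 'drop']
```

Schedules of this kind are typical for time-sensitive networking, where traffic classes get fixed transmission windows.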
  • Hardware
    • The i915 DRM driver for Intel graphics now enables support for Intel Tiger Lake (GEN12) chips by default; for these chips it also implements the ability to use SAGV (System Agent Geyserville) to dynamically adjust frequency and voltage depending on power consumption or performance requirements.
    • Added support for the FP16 pixel format to the amdgpu driver and implemented the ability to work with encrypted buffers in video memory (TMZ, Trusted Memory Zone).
    • Added support for AMD Zen and Zen2 processor power consumption sensors, as well as AMD Ryzen 4000 Renoir temperature sensors. For AMD Zen and Zen2, power consumption information can be retrieved through the RAPL (Running Average Power Limit) interface.
    • Added support for the NVIDIA modifier format to the Nouveau driver. For gv100, interlaced scanning modes can now be used. Added vGPU detection.
    • Added support for Adreno A405, A640 and A650 GPUs to the MSM (Qualcomm) driver.
    • Added an internal framework for managing DRM (Direct Rendering Manager) resources.
    • Added support for Xiaomi Redmi Note 7 and Samsung Galaxy S2 smartphones, as well as Elm/Hana Chromebook laptops.
    • Added drivers for LCD panels: ASUS TM5P5 NT35596, Starry KR070PE2T, Leadtek LTK050H3146W, Visionox rm69299, Boe tv105wum-nw0.
    • Added support for ARM boards and platforms: Renesas "RZ/G1H", Realtek RTD1195, Realtek RTD1395/RTD1619, Rockchips RK3326, AMLogic S905D, S905X3, S922XH, Olimex A20-OLinuXino-LIME-eMMC, Check Point L-50, Beacon i.MX8m-Mini, Qualcomm SDM660/SDM630, Xnano X5 TV Box, Stinger96, Beaglebone-AI.

    • Added support for the Loongson-2K MIPS processor (abbreviated as Loongson64). Added support for virtualization with the KVM hypervisor on Loongson 3 CPUs.
    • Added support for the Russian Baikal-T1 processor and the BE-T1000 system-on-a-chip based on it. The Baikal-T1 processor contains two superscalar P5600 MIPS32 r5 cores operating at 1.2 GHz. The chip includes an L2 cache (1 MB), a DDR3-1600 ECC memory controller, one 10Gb Ethernet port, two 1Gb Ethernet ports, a PCIe Gen.3 x4 controller, two SATA 3.0 ports, USB 2.0, GPIO, UART, SPI, and I2C. The processor provides hardware support for virtualization, SIMD instructions, and an integrated hardware accelerator for cryptographic operations that supports GOST 28147-89. The chip was developed using a MIPS32 P5600 Warrior processor core block licensed from Imagination Technologies.

At the same time, the Latin American Free Software Foundation released a completely free version of the 5.8 kernel, linux-libre 5.8-gnu, stripped of firmware and driver elements containing non-free components or code sections whose scope of use is restricted by the manufacturer. The new release disables blob loading in drivers for Atom ISP Video, MediaTek 7663 USB/7915 PCIe, Realtek 8723DE WiFi, Renesas PCI xHCI, HabanaLabs Gaudi, Enhanced Asynchronous Sample Rate Converter, Maxim Integrated MAX98390 Speaker Amplifier, Microsemi ZL38060 Connected Home Audio Processor, and I2C EEPROM Slaves. The blob cleanup code has been updated for the Adreno GPU, HabanaLabs Goya, x86 touchscreen, vt6656 and btbcm drivers and subsystems.

Source: opennet.ru
