After two months of development, Linus Torvalds released the Linux 7.0 kernel. Among the most notable changes: rules for using AI assistants, promotion of Rust support to a core kernel feature, improved swap performance, PREEMPT_LAZY mode enabled by default, filters for io_uring operations, the new Nullfs file system, the fserror infrastructure, XFS monitoring tools, remapping support in Btrfs, NFS 4.1 enabled by default, integration of the post-quantum ML-DSA signature algorithm, activation of AccECN in the network subsystem, and initial support for WiFi 8.
The number 7.0 was assigned because the 6.x branch had accumulated enough releases to warrant bumping the first digit of the version number (release 6.0 similarly followed 5.19). The renumbering is purely aesthetic: a formal step that relieves the discomfort of a long run of releases in one series.
The new version incorporates 15624 changes from 2477 developers, with a patch size of 56 MB (the changes touched 18053 files, adding 704060 lines of code and removing 278132 lines). The previous release included 15657 changes from 2237 developers, with a patch size of 52 MB. About 51% of all changes in 7.0 relate to device drivers, approximately 11% to code specific to hardware architectures, 14% to the network stack, 5% to file systems, and 3% to internal kernel subsystems.
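The figures above imply a few derived numbers that are easy to check. The short sketch below only restates arithmetic on the quoted statistics; the variable names are illustrative.

```python
# Figures quoted in the paragraph above.
added, removed = 704_060, 278_132
changes, developers = 15_624, 2_477
shares = {"drivers": 51, "architectures": 11, "network": 14,
          "file systems": 5, "internal subsystems": 3}

net_growth = added - removed                  # net lines added to the tree
changes_per_dev = changes / developers        # average changes per developer
uncategorized = 100 - sum(shares.values())    # share outside the listed categories

print(f"net growth: {net_growth} lines")
print(f"changes per developer: {changes_per_dev:.1f}")
print(f"share outside the listed categories: {uncategorized}%")
```

The listed categories add up to 84%, so roughly 16% of the changes fall outside them.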
Key new features in kernel 7.0 (kernelnewbies.org, lwn.net, opennet):
- Disk Subsystem, I/O and File Systems
- Implemented the fserror infrastructure and added an API for retrieving information about I/O errors and metadata corruption when working with files. The infrastructure unifies how file systems report errors to user space via the fsnotify mechanism.
- XFS gained new capabilities for monitoring file system health from user space. The new XFS_IOC_HEALTH_MONITOR ioctl returns a file descriptor that can be used to obtain information about failures related to metadata corruption or I/O errors, and to track file system state changes such as unmounting and shutdown. In addition, a systemd-managed background process, xfs_healer, is provided; it processes file system health events in user space and automatically starts recovery procedures when necessary.
- Experimental support for a "remap tree" structure has been added to Btrfs; in the future it could serve as an intermediate layer for I/O operations. After data is moved on the drive, instead of updating every structure that references it, the old and new addresses are recorded in the additional remap tree, and addresses are translated when the data is accessed. The new approach is described as more reliable and flexible, and it simplifies future extension of Btrfs functionality.
- Btrfs gained support for direct I/O in situations where the block size exceeds the system's memory page size.
- A new file system, Nullfs, has been added; it can serve as a stub for the root file system. Nullfs is always empty, contains no data, and does not support modifications. Its purpose is to act as the initial file system and simplify the boot process: other file systems are mounted on top of Nullfs, and the pivot_root() system call is used to switch the root file system, instead of clearing the contents of the initramfs and reusing the root file system associated with it.
- Implemented non-blocking updates of file modification time. Previously, calling file_update_time_flags() with the IOCB_NOWAIT flag returned -EAGAIN, which prevented direct writes from being used in non-blocking mode.
- Support for file leases (locks with notification) has been moved to the category of options that each file system enables separately. The mechanism is no longer activated by default because of problems with file systems that were not originally designed for it; for example, 9p and cephfs do not support it.
- EROFS (Enhanced Read-Only File System), designed for use on read-only partitions, now uses the LZMA compression algorithm by default. DEFLATE and Zstandard, which are no longer marked as experimental, are optionally available. Page cache entries are now shared across identical files in separate EROFS file systems.
- Removed laptop_mode, a power-saving mode that defers and consolidates hard drive writes to extend the drive's sleep time and reduce the number of wake-ups. This mode has become obsolete as hard drives have been replaced by solid-state drives in modern mobile devices.
- The F2FS file system has been switched to using large folios.
- Work on the ntfs3 driver, developed by Paragon Software, has been revived. Support for iomap-based file operations was added, the llseek SEEK_DATA/SEEK_HOLE options were implemented, and a delalloc mode for delayed block allocation was added. Meanwhile, in February a message on the kernel developers' mailing list approved the inclusion of a new NTFS implementation, ntfsplus, developed to replace ntfs3, in a future kernel version.
- NFS protocol version 4.1 (CONFIG_NFS_V4_1) is now enabled by default at build time. NFS export of specialized pseudo-file systems, such as pidfs and nsfs, is now blocked. NFSD gained experimental support for POSIX ACLs and can dynamically resize its thread pool depending on the load.
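The idea behind the Btrfs remap tree item above (record old and new addresses once, translate on access) can be illustrated with a toy model. This is a minimal sketch, not Btrfs code: the `RemapTable` class, its methods, and the flat address space are all hypothetical, and overlapping ranges are not handled.

```python
import bisect


class RemapTable:
    """Toy stand-in for a remap structure: maps relocated ranges to new locations."""

    def __init__(self):
        self._starts = []    # sorted old start addresses
        self._entries = {}   # old start -> (length, new start)

    def relocate(self, old_start, length, new_start):
        """Record that [old_start, old_start + length) now lives at new_start."""
        if old_start not in self._entries:
            bisect.insort(self._starts, old_start)
        self._entries[old_start] = (length, new_start)

    def resolve(self, addr):
        """Translate an address at access time; unmoved addresses pass through."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0:
            old_start = self._starts[i]
            length, new_start = self._entries[old_start]
            if addr < old_start + length:
                return new_start + (addr - old_start)
        return addr


table = RemapTable()
table.relocate(old_start=4096, length=8192, new_start=1_048_576)
print(table.resolve(4096))    # start of the moved range
print(table.resolve(5000))    # inside the moved range
print(table.resolve(20000))   # never moved, returned unchanged
```

The point of the design is that a move costs one table entry, while every structure that still holds the old address keeps working because the translation happens on lookup.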
- Memory and system services
- Official rules have been approved for the use of AI assistants and for including automatically generated content in the kernel. Generated code must be marked at submission with an Assisted-by tag that names the AI assistant used. AI assistants are prohibited from adding the Signed-off-by tag: the person submitting the patch is considered its author, is responsible for the submitted change, and vouches for its quality. Developers are required to manually review AI-generated code and verify that the result complies with licensing requirements.
- Rust support has been moved from experimental status to a main kernel feature.
- Completed the integration of the "Swap Table" mechanism, which improves swap performance. The speedup comes from less contention on swap cache access, more efficient cache lookups, and reduced fragmentation. The swap cache now uses a Swap Table-based backend instead of the XArray backend, which yielded a 22% increase in the number of requests processed in redis-benchmark with BGSAVE.
- Added support for the Thread Safety Analysis extension introduced in Clang 22, which detects potential race conditions and errors caused by improper lock acquisition at compile time. The extension offers a set of attributes, such as GUARDED_BY(…), REQUIRES(…), RELEASE(…), and ACQUIRE(…), for marking functions covered by locks and delimiting lock scopes (defining a context). Correct use of synchronization primitives, such as mutexes, is verified at compile time based on whether the associated context is active or inactive.
- The open_tree system call gained the OPEN_TREE_NAMESPACE flag, which simplifies setting up isolated containers and speeds up container startup on systems with a large number of mount points. Like OPEN_TREE_CLONE, the new flag copies only the specified mount tree, but instead of a local file descriptor it returns a file descriptor for a new mount namespace in which the copied tree is mounted over a copy of the real root file system. OPEN_TREE_NAMESPACE makes it possible to avoid the separate unshare(CLONE_NEWNS) and pivot_root() operations used when creating containers.
- The rseq system call gained a time slice extension mechanism that lets a task request additional CPU time to finish a critical section without interruption. The idea is to keep the task scheduler from preempting a thread inside a critical section while it holds a lock, which would hand control to other threads that need the same resource while the lock is still held. The extension adds no extra overhead, but it also does not provide the strict guarantees of full priority control.
- For the arm64, loongarch, powerpc, riscv, s390, and x86 architectures, the default scheduler preemption mode has been changed from PREEMPT_NONE to PREEMPT_LAZY. The number of available modes has been reduced from four to two: PREEMPT_FULL and PREEMPT_LAZY (the PREEMPT_NONE and PREEMPT_VOLUNTARY modes remain only for architectures that do not support PREEMPT_FULL and PREEMPT_LAZY). PREEMPT_LAZY applies the full preemption model (PREEMPT_FULL) to realtime tasks (RR/FIFO/DEADLINE) but delays the preemption of normal tasks (SCHED_NORMAL) until the tick boundary. The delay reduces the number of lock holder preemptions, which brings performance closer to configurations using the voluntary preemption model; that is, PREEMPT_LAZY preserves full preemption for realtime tasks while minimizing the performance penalty for normal tasks. Enabling PREEMPT_LAZY led to a serious regression that halves PostgreSQL performance on ARM64 systems. To address the drop, it was suggested that PostgreSQL developers enable the PR_RSEQ_SLICE_EXTENSION option to reduce the likelihood of lock holder preemption.
- Continued the migration of changes from the Rust-for-Linux branch related to using Rust as a second language for developing drivers and kernel modules (Rust support is not enabled by default and does not make Rust a mandatory build dependency of the kernel). Thanks to the previously integrated syn library (crates.io), which simplifies writing complex macros, the amount of Rust code in the kernel was reduced by simplifying the definitions of existing procedural macros. The capabilities of the kernel, macros, and pin-init libraries have been expanded.
- The io_uring asynchronous I/O system gained an option to use non-circular submission queues, which are cached more efficiently in situations where a request completes before the system call returns.
- In the eBPF subsystem, the BTF (BPF Type Format) mechanism, which provides type-checking information for BPF bytecode, now uses binary search to look up debugging information, which makes loading BPF programs more efficient. eBPF also gained support for implicit arguments when calling kfuncs (kernel functions available to BPF programs) defined with the KF_IMPLICIT_ARGS flag.
- Removed the code supporting the linuxrc-based initial RAM disk (initrd), which has long been deprecated. The remaining initrd implementations are scheduled for removal in 2027. Initramfs should be used instead of initrd (the difference is that initrd places the initial boot environment in a disk image, while initramfs places it in a file system).
- In the zram block device, used to keep a compressed swap partition in memory, the logic for handling compressed memory pages has been changed for the optional writeback of data to persistent storage when available RAM runs out. Previously, memory pages were decompressed before being written to physical storage; now they are stored as is, in compressed form, which reduces CPU load and saves power when running on battery.
- The timerlat utility, designed to measure task scheduling latency, gained a --bpf-action option for running BPF programs when a specified threshold is exceeded.
- The ftrace tracing system now has a bitmask-list setting for printing bitmasks in a readable format (as a list of bits rather than a hexadecimal number). Auditing of filters and triggers has been added to tracefs. A perf sched stats command has been added that collects and displays statistics about the task scheduler.
- Added the LOGO_LINUX_MONO_FILE, LOGO_LINUX_VGA16_FILE, and LOGO_LINUX_CLUT224_FILE build options, which specify a file containing the logo image shown at kernel boot instead of the standard Tux penguin logo.
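The difference between a hexadecimal bitmask and a list of bits, as mentioned in the ftrace bitmask-list item above, can be shown with a small conversion function. This is an illustrative sketch only: the exact output format of the ftrace option is an assumption here, modeled on the comma-separated range style used for CPU lists (for example `0-3,8`), and `bitmask_to_list` is a hypothetical helper name.

```python
def bitmask_to_list(mask: int) -> str:
    """Render the set bits of a non-negative mask as ranges, e.g. 0x10f -> '0-3,8'."""
    if mask < 0:
        raise ValueError("mask must be non-negative")
    ranges = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            start = bit
            # Extend the run while the next bit is also set.
            while (mask >> (bit + 1)) & 1:
                bit += 1
            ranges.append(str(start) if start == bit else f"{start}-{bit}")
        bit += 1
    return ",".join(ranges)


print(hex(0x10f), "->", bitmask_to_list(0x10f))
```

For a mask covering many CPUs, the list form makes it immediately visible which bits are set, which is the readability gain the setting is aiming for.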
- Virtualization and Security
- The io_uring asynchronous I/O system gained the ability to attach BPF programs as filters that control what specific SQE (Submission Queue Entry) operations, the io_uring analog of system calls, are allowed to do. The feature is analogous to system call filters. Filters can be attached to specific tasks and are inherited by child processes created with fork(). If filters are already active, filters added on top can only impose additional restrictions, not disable existing ones. The feature makes it possible to block methods of bypassing system call filtering in sandbox environments that rely on performing equivalent operations through io_uring instead of system calls.
- SELinux gained the ability to manage access to BPF tokens, which allow unprivileged processes to perform some privileged BPF operations, such as loading BPF programs into the kernel and creating map structures.
- Added support for the ML-DSA (CRYSTALS-Dilithium) digital signature algorithm, which is based on lattice theory and resistant to attacks using a quantum computer. ML-DSA can be used to authenticate kernel modules.
- Removed the ability to sign kernel modules using digital signature schemes based on the SHA-1 algorithm (support for loading such signed modules is retained).
- The NETFILTER_PKT audit record gained sport and dport fields, so that network port numbers can be inspected in addition to IP addresses.
- For RISC-V systems, implemented support for the Zicfiss and Zicfilp extensions, which provide hardware support for Control Flow Integrity (CFI) protection. CFI blocks violations of the normal order of instruction execution (control flow) caused by exploits that modify function pointers stored in memory.
- The KVM hypervisor can now inform guest systems that the processor supports the ERAPS (Enhanced Return Address Predictor Security) extension, which makes it possible to skip some CPU state reset operations when the guest returns control to the host. In addition, support was added for assigning performance monitoring units (PMUs) to guest systems, which improves profiling accuracy compared to emulated PMUs.
- The driver for the Hyper-V hypervisor gained support for a debugfs interface for viewing statistics about the hypervisor's operation.
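The stacking rule described in the io_uring filter item above (filters are inherited on fork, and later filters can only add restrictions) can be modeled in a few lines. This is a toy sketch of the semantics, not the kernel interface: real filters are BPF programs, while the `Task` class and the operation names below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A filter inspects a (hypothetical) operation name and returns True to allow it.
Filter = Callable[[str], bool]


@dataclass
class Task:
    filters: List[Filter] = field(default_factory=list)

    def attach(self, flt: Filter) -> None:
        """Stack another filter; there is deliberately no way to remove one."""
        self.filters.append(flt)

    def fork(self) -> "Task":
        """A child starts with a copy of the parent's filters."""
        return Task(filters=list(self.filters))

    def allowed(self, op: str) -> bool:
        """An operation runs only if every stacked filter allows it."""
        return all(flt(op) for flt in self.filters)


parent = Task()
parent.attach(lambda op: op != "OPENAT")   # deny opening files
child = parent.fork()
child.attach(lambda op: True)              # a permissive filter cannot undo the denial
child.attach(lambda op: op != "CONNECT")   # but a new filter can add a restriction

print(child.allowed("READ"), child.allowed("OPENAT"), child.allowed("CONNECT"))
print(parent.allowed("CONNECT"))
```

Because the decision is the conjunction of all filters, adding a filter can only shrink the set of permitted operations, which is what makes the mechanism safe to expose to sandboxed code.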
- Network subsystem
- The AccECN (Accurate Explicit Congestion Notification) extension is enabled by default. It implements an improved version of the ECN extension, which lets hosts mark IP packets in the event of congestion instead of dropping them, making it possible to detect the onset of congestion on communication channels without packet loss. The original ECN extension allows only one congestion signal per TCP round trip (RTT, Round-Trip Time, the time to send a request and receive a response). AccECN removes this limitation and lets the receiver pass more than one congestion signal to the sender in the TCP packet header. Congestion control algorithms can use this information to respond to congestion more accurately and avoid abruptly reducing the sending rate when congestion is minor.
- The implementation of the CAKE network queue management algorithm gained the ability to process multiple queues in order to distribute the load across multiple CPU cores. CAKE is used to reduce the negative impact of intermediate packet buffering on edge network equipment and aims to achieve the highest possible throughput and minimal latency, even on slow communication channels.
- VSOCK sockets, used to communicate with virtual machines, gained support for network namespaces.
- Added an initial implementation of the upcoming WiFi 8 standard (802.11bn, Ultra High Reliability WiFi).
- Added optimizations that increased the performance of incoming UDP packet processing by 12% during stress testing on a 100 Gigabit network.
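The difference between one congestion signal per round trip and per-packet feedback, as described in the AccECN item above, can be sketched with a toy model. Everything here is an assumption for illustration: the function names are hypothetical, and the sender policies (halving on any signal versus cutting in proportion to the marked fraction) are simplified stand-ins, not the behavior of any specific TCP congestion control algorithm.

```python
def classic_feedback(ce_marks):
    """Classic ECN model: at most one congestion signal per round trip."""
    return 1 if any(ce_marks) else 0


def accecn_feedback(ce_marks):
    """AccECN model: the receiver reports how many packets were marked."""
    return sum(1 for marked in ce_marks if marked)


def classic_rate(rate, ce_marks):
    """Hypothetical policy: halve the rate on any signal, however small the congestion."""
    return rate / 2 if classic_feedback(ce_marks) else rate


def scaled_rate(rate, ce_marks):
    """Hypothetical policy: cut up to half the rate, in proportion to the marked fraction."""
    total = len(ce_marks)
    if total == 0:
        return rate
    return rate * (2 * total - accecn_feedback(ce_marks)) / (2 * total)


light = [True] + [False] * 9    # 1 of 10 packets marked in this round trip
heavy = [True] * 10             # every packet marked

print(classic_rate(100, light), scaled_rate(100, light))
print(classic_rate(100, heavy), scaled_rate(100, heavy))
```

With light marking the two policies diverge sharply, while under heavy marking they agree; this is the sense in which more detailed feedback lets a sender avoid overreacting to minor congestion.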
- Hardware
- The AMDGPU driver gained support for IP blocks used in new AMD GPUs, such as SMUIO 15.x, PSP 15.x, IH 6.1.1/7.1, MMHUB 3.4/4.2, GC 11.5.4/12.1, SDMA 6.1.4/7.1/7.11.4 and JPEG 5.3.
- The Nouveau driver has improved frequency management on Tegra 186+ systems.
- The i915 driver gained initial support for the Xe3p_LPD display IP used in Intel Nova Lake-P processors.
- Continued work on the Xe DRM (Direct Rendering Manager) driver for GPUs based on the Intel Xe architecture, which is used in Intel Arc graphics cards and in integrated graphics starting with Tiger Lake processors. A Multi Queue mode has been added, along with the components needed to diagnose GPU hangs in Mesa and support for the MERT mechanism for managing access to GPU memory.
- Continued integration of components of the Nova driver for NVIDIA GPUs equipped with GSP firmware, which has been used since the NVIDIA GeForce RTX 2000 series based on the Turing microarchitecture. The driver is written in Rust. The new version lays the groundwork for supporting Turing-based GPUs and includes various internal changes.
- Added support for controllers and peripheral devices with a multi-channel SPI (Serial Peripheral Interface) interface, which allows data to be transmitted in several parallel streams.
- Added a driver for the combined Type-C connectors used on devices based on Apple Silicon chips, which combine USB3, DP-AltMode, and Thunderbolt/USB4 interfaces.
- Added support for sound subsystems of Tegra238, Minisforum V3 SE, iBasso DC04U, Intel Nova Lake, Nova Lake S and Focusrite Forte chips.
- Added support for ARM boards, SoCs and devices: Arduino UnoQ, OrangePi 6 Plus, OrangePi CM5, Anbernic RG-DS, Realtek Kent, Qualcomm Kaanapali, Mediatek Ezurio, Facebook Anacapa, Microchip LAN9668, Khadas VIM1S, QNAP TS133, i.MX952, i.MX93, i.MX94, VHIP4 EvalBoard, TQ-Systems MBLS1028A, Agilex5, Radxa CM3J, Glymur.
- Added support for smartphones and tablets: Fairphone Gen 6 (SoC Qualcomm Milos/Snapdragon 7s Gen 3), Pixel 3/3 XL, Microsoft Surface Pro 11.
Simultaneously, the Free Software Foundation Latin America released a completely free variant of the 7.0 kernel, Linux-libre 7.0-gnu, stripped of firmware and driver elements that contain proprietary components or code whose scope of use is restricted by the manufacturer. Release 7.0 adds blob cleanup for the iwlwifi driver. The cleanup code for the amdgpu, adreno, TI PRUeth, air_en8811h, ath12k, TI VPE, rtw8852b, rt1320, rt5575 SPI, tas2783, and Intel catpt drivers has been updated. Blob names in devicetree (dts) files for ARM chips have been cleaned.
Source: linux.org.ru
