Release of systemd 252 system manager with UKI (Unified Kernel Image) support

After five months of development, systemd 252 has been released. The key change in this version is integrated support for a modernized boot process that allows verifying, by digital signature, not only the kernel and bootloader but also components of the base system environment.

The proposed method relies on booting a unified kernel image (UKI), which combines a handler for loading the kernel from UEFI (the UEFI boot stub), a Linux kernel image, and the initrd system environment that is loaded into memory and used for early initialization before the root file system is mounted. A UKI is packaged as a single PE executable that can be loaded by traditional boot loaders or called directly from the UEFI firmware. When it is called from UEFI, the integrity and digital signature of not only the kernel but also the initrd contents can be verified.

To calculate the values of the TPM PCRs (Trusted Platform Module Platform Configuration Registers) that are used for integrity checking and for generating a digital signature of the UKI image, a new systemd-measure utility is included. The public key used for the signature and the associated PCR information can be embedded directly into the UKI boot image (the key and the signature are stored in the PE file in the '.pcrkey' and '.pcrsig' sections, respectively) and extracted from it by external or internal utilities.
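Because a UKI is an ordinary PE file, the presence of the '.pcrkey' and '.pcrsig' sections can be checked by reading the PE section table. The following is a minimal sketch in Python, not a systemd tool: the function name pe_section_names is hypothetical, and the code only lists section names without validating signatures or section contents.

```python
import struct


def pe_section_names(data: bytes) -> list[str]:
    """Return the section names found in a PE image held in memory."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # The offset of the "PE\0\0" signature is stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 20-byte COFF header follows the signature.
    coff = pe_offset + 4
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (optional_size,) = struct.unpack_from("<H", data, coff + 16)
    # The section table starts right after the optional header;
    # each entry is 40 bytes and begins with an 8-byte, NUL-padded name.
    table = coff + 20 + optional_size
    names = []
    for index in range(num_sections):
        start = table + index * 40
        raw_name = data[start:start + 8]
        names.append(raw_name.rstrip(b"\0").decode("ascii", errors="replace"))
    return names
```

Given the bytes of an image file, checking for the embedded signature data could then look like `{".pcrkey", ".pcrsig"} <= set(pe_section_names(data))`.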

In particular, the systemd-cryptsetup, systemd-cryptenroll and systemd-creds utilities have been adapted to use this information, making it possible to bind encrypted disk partitions to a digitally signed kernel: access to the encrypted partition is granted only if the UKI image has passed digital signature verification against the parameters stored in the TPM.

Additionally, the systemd-pcrphase utility is included, which makes it possible to bind individual boot stages to parameters stored in cryptoprocessors that support the TPM 2.0 specification (for example, the LUKS2 partition decryption key can be made available only in the initrd and blocked at later boot stages).
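The reason a key can be limited to one boot stage follows from how a PCR is updated: a register cannot be written directly, only extended, so each new measurement irreversibly changes its value. The sketch below models only that arithmetic for a SHA-256 bank, under stated assumptions: the phase strings "enter-initrd" and "leave-initrd" are used for illustration, the register starts from all zeros (on a real system it would already hold earlier measurements of the UKI), and the sealing itself, which the TPM performs, is not modeled.

```python
import hashlib

PCR_SIZE = 32  # size of one register in the SHA-256 bank


def pcr_extend(pcr: bytes, data: bytes) -> bytes:
    """TPM-style extend: new value = SHA256(old value || SHA256(data))."""
    return hashlib.sha256(pcr + hashlib.sha256(data).digest()).digest()


def replay(phases: list[str]) -> bytes:
    """Replay a sequence of phase measurements starting from a zeroed PCR."""
    pcr = bytes(PCR_SIZE)
    for phase in phases:
        pcr = pcr_extend(pcr, phase.encode())
    return pcr


# Value a secret would be bound to while the initrd is running (illustrative).
in_initrd = replay(["enter-initrd"])
# Value after the system has left the initrd: one more extend has happened.
after_initrd = replay(["enter-initrd", "leave-initrd"])

print(in_initrd.hex())
print(after_initrd.hex())
print("values match:", in_initrd == after_initrd)
```

Since the two values differ and the extend operation cannot be undone, a secret bound to the first value can no longer be released once the later phase has been measured.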

Some other changes:

  • Ensured that the default locale is C.UTF-8 if no other locale is specified in the settings.
  • Implemented the ability to perform a full service preset operation ("systemctl preset") during the first boot. Enabling presets at boot time requires a build with the "-Dfirst-boot-full-preset" option, but is planned to be enabled by default in future releases.
  • The units that manage user sessions now use the CPU resource controller, so that CPUWeight settings take effect on all the slice units used to partition the system (app.slice, background.slice, session.slice) and isolate resources between user services competing for CPU time. CPUWeight also supports the "idle" value to activate the corresponding resource provisioning mode.
  • In temporary ("transient") units and in the systemd-repart utility, settings can be overridden by creating drop-in files in the /etc/systemd/system/name.d/ directory.
  • System images whose support period has ended are now marked with the 'support-ended' flag; this is determined from the value of the new 'SUPPORT_END=' parameter in the /etc/os-release file.
  • Added "ConditionCredential=" and "AssertCredential=" settings that can be used to skip or fail units when certain credentials are missing from the system.
  • Added "DefaultSmackProcessLabel=" and "DefaultDeviceTimeoutSec=" settings to system.conf and user.conf to define the default SMACK security label and the default timeout for waiting for device units.
  • In the "ConditionFirmware=" and "AssertFirmware=" settings, the ability to specify individual SMBIOS fields has been added. For example, to start a unit only if the /sys/class/dmi/id/board_name field contains the value "Custom Board", you can specify "ConditionFirmware=smbios-field(board_name = "Custom Board")".
  • The initialization process (PID 1) can now import credentials from SMBIOS fields (Type 11, "OEM vendor strings") in addition to receiving them via qemu_fwcfg, which makes it easier to provide credentials to virtual machines and eliminates the need for third-party tools such as cloud-init and ignition.
  • During shutdown, the logic for unmounting virtual file systems (proc, sys) has been changed and information about processes blocking the unmounting of file systems is saved in the log.
  • The system call filter (SystemCallFilter) allows access to the riscv_flush_icache system call by default.
  • The sd-boot loader can now boot in mixed mode, in which a 64-bit Linux kernel is started from 32-bit UEFI firmware. Added an experimental ability to automatically apply SecureBoot keys from files found in the ESP (EFI system partition).
  • New options have been added to the bootctl utility: "--all-architectures" to install binaries for all supported EFI architectures, "--root=" and "--image=" to work with a directory or disk image, "--install-source=" to define the source to install from, and "--efi-boot-option-description=" to manage boot entry names.
  • Added the 'list-automounts' command to the systemctl utility to display a list of automatically mounted directories and the '--image=' option to execute commands bound to a specified disk image. Added "--state=" and "--type=" options to 'show' and 'status' commands.
  • Added options to systemd-networkd: "TCPCongestionControlAlgorithm=" to select the TCP congestion control algorithm, "KeepFileDescriptor=" to keep the file descriptor of TUN/TAP interfaces open, "NetLabel=" to set NetLabel labels, and "RapidCommit=" to speed up configuration via DHCPv6 (RFC 3315). Routing table names are now allowed in the "RouteTable=" parameter.
  • systemd-nspawn allows the use of relative file paths in the "--bind=" and "--overlay=" options. Added support for the 'rootidmap' option to the '--bind=' option to bind the root user ID in the container to the owner of the mounted directory on the host side.
  • systemd-resolved uses the OpenSSL package as the encryption backend by default (gnutls support is retained as an option). Unsupported DNSSEC algorithms are now treated as insecure instead of returning an error (SERVFAIL).
  • systemd-sysusers, systemd-tmpfiles and systemd-sysctl implement the ability to pass settings through the credential storage mechanism.
  • Added 'compare-versions' command to systemd-analyze to compare strings with version numbers (similar to 'rpmdev-vercmp' and 'dpkg --compare-versions'). Added the ability to filter units by mask to the 'systemd-analyze dump' command.
  • When a multi-stage sleep mode is chosen (suspend-then-hibernate, that is, hibernation after a period of suspend), the time spent in suspend is now selected based on a forecast of the remaining battery life. The system hibernates immediately when less than 5% of the battery charge remains.
  • A new output mode "-o short-delta" has been added to 'journalctl' to display the time difference between different messages in the log.
  • systemd-repart now supports creating partitions with the Squashfs file system and partitions for dm-verity, including those with digital signatures.
  • Added "StopIdleSessionSec=" setting to systemd-logind to end an inactive session after a specified timeout has elapsed.
  • Added "--unlock-key-file=" option to systemd-cryptenroll to retrieve the decryption key from a file instead of prompting the user.
  • The systemd-growfs utility can be run in non-udev environments.
  • systemd-backlight has improved support for systems with multiple graphics cards.
  • The license for code examples in the documentation has been changed from CC0 to MIT-0.
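To illustrate the 'SUPPORT_END=' item from the list above, here is a minimal Python sketch of how such a check could work. It is not systemd's implementation: the function names are hypothetical, quoting is handled in a simplified way, the value is assumed to be a date in YYYY-MM-DD form, and treating the end date itself as still supported is an assumption of this sketch.

```python
from datetime import date


def parse_os_release(text: str) -> dict[str, str]:
    """Parse KEY=value lines from os-release style text (simplified quoting)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip("\"'")
    return fields


def support_ended(fields: dict[str, str], today: date) -> bool:
    """Return True if SUPPORT_END is set and lies before `today`."""
    raw = fields.get("SUPPORT_END")
    if raw is None:
        return False  # no end-of-support date declared
    # Assumption: the day named by SUPPORT_END still counts as supported.
    return today > date.fromisoformat(raw)


sample = 'NAME="Example OS"\nSUPPORT_END=2022-12-31\n'
fields = parse_os_release(sample)
print(support_ended(fields, date(2023, 1, 1)))
```

On a real system the text would come from /etc/os-release, and the comparison would use the current date.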

Compatibility changes:

  • When checking the kernel version number with the ConditionKernelVersion directive, the '=' and '!=' operators now perform a simple string comparison, and if no comparison operator is specified at all, glob pattern matching with the characters '*', '?', '[' and ']' is used. The '<', '>', '<=' and '>=' operators should be used to compare versions in the style of the strverscmp() function.
  • The SELinux label used to check access from a unit file is now read at the stage of loading the file, and not at the time of the access check.
  • The "ConditionFirstBoot" condition now evaluates to true on the first boot of the system only during the boot stage itself, and returns "false" when units are started after the boot has completed.
  • In 2024, systemd plans to stop supporting the cgroup v1 resource limiting mechanism, which was deprecated in systemd release 248. Administrators are advised to migrate services tied to cgroup v1 to cgroup v2 in advance. The key difference between cgroup v2 and v1 is the use of a common cgroup hierarchy for all resource types, instead of separate hierarchies for CPU allocation, memory management, and I/O. Separate hierarchies make it difficult to organize interaction between handlers and incur additional kernel resource costs when applying rules to a process that appears in different hierarchies.
  • In the second half of 2023, it is planned to stop supporting split directory hierarchies, when /usr is mounted separately from the root, or the directories /bin and /usr/bin, /lib and /usr/lib are separated.
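The three matching modes described for ConditionKernelVersion in the list above can be sketched in Python as follows. This is only an approximation for illustration: kernel_condition is a hypothetical name, a single expression is handled, and the ordering used for '<', '>', '<=' and '>=' is a simplified numeric-aware comparison rather than the actual strverscmp()-style algorithm.

```python
import fnmatch
import re


def _version_key(version: str) -> list[tuple]:
    """Split a version into digit and non-digit runs (simplified ordering)."""
    parts = re.findall(r"[0-9]+|[^0-9]+", version)
    return [(0, int(p)) if p.isdigit() else (1, p) for p in parts]


def kernel_condition(expr: str, running: str) -> bool:
    """Evaluate one ConditionKernelVersion-style expression against `running`."""
    expr = expr.strip()
    # Two-character operators are listed first so "<=" is not read as "<".
    for op in ("<=", ">=", "!=", "<", ">", "="):
        if expr.startswith(op):
            wanted = expr[len(op):].strip()
            if op == "=":
                return running == wanted  # plain string comparison
            if op == "!=":
                return running != wanted  # plain string comparison
            a, b = _version_key(running), _version_key(wanted)
            return {"<": a < b, "<=": a <= b, ">": a > b, ">=": a >= b}[op]
    # No operator at all: glob matching with *, ? and [ ].
    return fnmatch.fnmatchcase(running, expr)


print(kernel_condition(">= 5.10", "6.0.5"))
print(kernel_condition("= 6.0", "6.0.5"))
print(kernel_condition("6.*", "6.0.5"))
```

The second call shows the practical effect of the change: with '=', "6.0" no longer matches a running kernel "6.0.5", whereas the glob "6.*" without an operator does.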

Source: opennet.ru
