Facebook introduces TMO, a mechanism that saves 20-32% of memory on servers

Last year, Facebook engineers published a report on their deployment of TMO (Transparent Memory Offloading), a technology that significantly reduces RAM usage on servers by offloading rarely used data that is not needed for the current workload to cheaper storage, such as NVMe SSDs. Facebook estimates that TMO saves between 20% and 32% of RAM on each server. The solution is designed for infrastructures where applications run in isolated containers. The kernel-side components of TMO are already included in the mainline Linux kernel.

On the kernel side, the technology relies on the PSI (Pressure Stall Information) subsystem, available since Linux 4.20. PSI is already used by various out-of-memory handlers and exposes information about how long tasks stall while waiting for resources (CPU, memory, I/O). With PSI, user-space daemons can assess system load and slowdown patterns more accurately, detecting anomalies early, before they have a noticeable impact on performance.
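PSI exposes pressure data as plain text files such as /proc/pressure/memory (system-wide) and the per-cgroup memory.pressure file. Below is a minimal Python sketch of reading them; the line format ("some avg10=... avg60=... avg300=... total=...") follows the kernel's PSI documentation:

```python
def parse_psi_line(line):
    """Parse one PSI line, e.g.
    'some avg10=0.12 avg60=0.05 avg300=0.01 total=123456'
    into (kind, {metric: value}). 'some' means at least one task
    stalled on the resource; 'full' means all tasks stalled."""
    kind, *fields = line.split()
    metrics = {}
    for field in fields:
        key, value = field.split("=")
        metrics[key] = float(value)
    return kind, metrics

def read_memory_pressure(path="/proc/pressure/memory"):
    """Return {'some': {...}, 'full': {...}} for memory pressure.
    Pass a cgroup's memory.pressure path to read per-container data."""
    result = {}
    with open(path) as f:
        for line in f:
            kind, metrics = parse_psi_line(line.strip())
            result[kind] = metrics
    return result
```

The avg10/avg60/avg300 fields are the percentage of time tasks were stalled over 10-, 60-, and 300-second windows, which is what a monitoring daemon would watch to spot slowdowns early.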

In user space, TMO is driven by the Senpai component, which dynamically adjusts the memory limit of application containers through cgroup v2 based on PSI data. Senpai watches PSI for early signs of resource shortage, evaluates how sensitive an application is to slow memory access, and tries to determine the minimum amount of memory a container needs: the data required for work stays in RAM, while data that has settled in the file cache or is not currently in use is evicted to the swap partition.
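The control loop can be sketched roughly as follows. This is a simplified illustration of the idea (probe the limit downward while pressure is low, back off when pressure rises), not Senpai's actual algorithm; target_pressure and step_bytes are hypothetical tuning parameters:

```python
def next_memory_limit(current_limit, pressure_avg10,
                      target_pressure=0.1, step_bytes=1 << 20):
    """One step of a Senpai-style control loop (simplified sketch):
    shrink the container's limit while memory pressure stays below
    the target, grow it back when pressure exceeds the target."""
    if pressure_avg10 > target_pressure:
        return current_limit + step_bytes          # back off
    return max(current_limit - step_bytes, step_bytes)  # probe lower

def apply_limit(cgroup_path, limit_bytes):
    """Apply the limit via the cgroup v2 memory.high knob, which
    throttles and reclaims rather than OOM-killing on overrun."""
    with open(f"{cgroup_path}/memory.high", "w") as f:
        f.write(str(limit_bytes))
```

Lowering memory.high forces the kernel to reclaim pages from the container, which is how cold pages end up evicted; the PSI feedback tells the controller when it has squeezed too far.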


The essence of TMO, then, is to keep processes on a strict memory diet, pushing out to the swap partition pages whose eviction does not noticeably affect performance (for example, pages of code used only during initialization, or single-use data in the disk cache). Unlike conventional swapping, which reacts to memory shortage, TMO evicts data proactively, based on prediction.

One eviction criterion is the absence of any access to a memory page for 5 minutes. Such pages are called cold, and on average they make up about 35% of application memory (depending on the application type, the figure ranges from 19% to 65%). Eviction accounts for both anonymous pages (memory allocated by the application) and pages used for file caching (allocated by the kernel). In some applications anonymous memory dominates consumption, while in others the file cache matters just as much. To avoid imbalance, TMO uses a new swap algorithm that reclaims anonymous pages and file-cache pages proportionally.
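The proportional split between the two page types can be illustrated with a toy calculation. This is only an illustration of the balancing idea, not the kernel's actual reclaim code:

```python
def proportional_reclaim(anon_bytes, file_bytes, reclaim_target):
    """Split a reclaim target between anonymous memory and file cache
    in proportion to their current sizes, so neither pool is drained
    disproportionately. Returns (anon_to_reclaim, file_to_reclaim)."""
    total = anon_bytes + file_bytes
    if total == 0:
        return 0, 0
    anon_share = reclaim_target * anon_bytes // total
    return anon_share, reclaim_target - anon_share
```

For a container with 600 MB of anonymous memory and 400 MB of file cache, a 100 MB reclaim target would be split 60 MB / 40 MB, keeping the anon-to-file ratio stable as memory is squeezed.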

Offloading infrequently used pages to slower memory has little impact on performance but can significantly reduce hardware costs. Data is evicted either to SSDs or to compressed swap space in RAM; per byte stored, NVMe SSDs are up to 10 times cheaper than compression in RAM.

Source: opennet.ru