Advantages and disadvantages of HugePages

Translation of the article prepared for students of the course "Linux Administrator".

Earlier, I described how to check whether hugepages are enabled on Linux and how to turn them on.
This article will only be useful if you actually have a use case for hugepages. I have met many people who are misled by the prospect that hugepages will magically improve performance. However, hugepages are a complex topic, and used incorrectly they can degrade performance.

Part 1: Checking that hugepages are enabled on Linux (original here)

Problem:
You need to check whether hugepages are enabled on your system.

Solution:
It's pretty simple:

cat /sys/kernel/mm/transparent_hugepage/enabled

You will get something like this:

always [madvise] never

You will see a list of the available options (always, madvise, never); the currently active option is enclosed in brackets (madvise by default).

madvise means that transparent hugepages are only enabled for memory areas that explicitly request hugepages with madvise(2).

always means that transparent hugepages are always enabled for all processes. This usually improves performance, but if many processes each use only a small amount of memory, overall memory usage can increase dramatically.

never means that transparent hugepages will not be enabled even when requested with madvise. To find out more, refer to the Linux kernel documentation.
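If you want to perform the same check from a program, the bracketed option can be extracted with a few string operations. The following is only a minimal sketch: the function names thp_active_mode and thp_read_active_mode are made up for this illustration, and the code assumes the file has the one-line format shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the bracketed (active) option from a line such as
 * "always [madvise] never" into out. Returns 0 on success, -1 otherwise. */
int thp_active_mode(const char *line, char *out, size_t out_size)
{
    assert(line != NULL && out != NULL && out_size > 0);
    const char *open = strchr(line, '[');
    if (open == NULL)
        return -1;
    const char *close = strchr(open + 1, ']');
    if (close == NULL)
        return -1;
    size_t len = (size_t)(close - open - 1);
    if (len >= out_size)
        return -1; /* output buffer too small */
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}

/* Hypothetical helper: read the sysfs file used above and extract the mode. */
int thp_read_active_mode(char *out, size_t out_size)
{
    char line[128];
    FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");
    if (f == NULL)
        return -1; /* file missing, e.g. kernel built without THP */
    char *ok = fgets(line, sizeof line, f);
    fclose(f);
    return ok != NULL ? thp_active_mode(line, out, out_size) : -1;
}
```

Calling thp_read_active_mode(buf, sizeof buf) with a small char buffer fills it with always, madvise or never, or returns -1 if the file cannot be read.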

How to change the default

Option 1: Change sysfs directly (after a reboot, the parameter reverts to its default value):

echo always >/sys/kernel/mm/transparent_hugepage/enabled
echo madvise >/sys/kernel/mm/transparent_hugepage/enabled
echo never >/sys/kernel/mm/transparent_hugepage/enabled

Option 2: Change the system default by recompiling the kernel with the modified configuration (this option is only recommended if you are using your own kernel):

  • To make always the default, use:
    CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
    # Comment out CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
  • To set madvise as default, use:
    CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
    # Comment out CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y

Part 2: Advantages and disadvantages of HugePages

We will try to explain, in simplified form, the advantages, disadvantages, and possible pitfalls of using hugepages. Since a technically rigorous and pedantic article would probably be hard to follow for people who have been misled into thinking hugepages are a panacea, I will sacrifice accuracy in favor of simplicity. Just keep in mind that many of these topics are genuinely complex and are therefore greatly simplified here.

Note that we are talking about 64-bit x86 systems running Linux, and that I simply assume the system supports transparent hugepages (so that hugepages not being swappable is not a drawback), as is the case in almost every modern Linux environment.

More technical descriptions are given in the links below.

Virtual memory

If you are a C++ programmer, you know that objects in memory have specific addresses (pointer values).

However, these addresses do not necessarily reflect physical addresses in memory (addresses in RAM). They represent addresses in virtual memory. The processor has a special MMU (memory management unit) module that helps the kernel map virtual memory to a physical location.

This approach has many advantages, but the main ones are:

  • Performance (for various reasons);
  • Program isolation, that is, no program can read from another program's memory.

What are pages?

Virtual memory is divided into pages. Each page points to a specific piece of physical memory: it can point to an area in RAM, or to an address assigned to a physical device, such as a video card.

Most of the pages you deal with either point to RAM or are swapped out, i.e. stored on your hard drive or SSD. The kernel manages the physical location of each page. If a swapped-out page is accessed, the kernel stops the thread that is trying to access the memory, reads the page from the hard drive/SSD into RAM, and then resumes the thread.

This process is transparent to the thread, meaning the thread does not have to read from the hard drive/SSD explicitly. The size of normal pages is 4096 bytes. The size of hugepages is 2 megabytes.
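To make the page sizes more tangible, a virtual address can be viewed as a page number plus an offset inside that page. The sketch below is purely illustrative: the names page_number, page_offset, SMALL_PAGE_SIZE and HUGE_PAGE_SIZE are invented for this example, and the sizes are the ones given in the article.

```c
#include <assert.h>
#include <stdint.h>

#define SMALL_PAGE_SIZE 4096ULL               /* normal page, as in the article */
#define HUGE_PAGE_SIZE  (2ULL * 1024 * 1024)  /* 2 MB hugepage */

/* Index of the page that contains addr. page_size must be a power of two. */
uint64_t page_number(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr / page_size;
}

/* Byte offset of addr inside its page. page_size must be a power of two. */
uint64_t page_offset(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & (page_size - 1);
}
```

For the address 0x201234, this gives page 0x201 with offset 0x234 for normal pages, but page 1 with offset 0x1234 for hugepages: the same address, but far fewer distinct pages for the hardware to keep track of.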

Translation Lookaside Buffer (TLB)

When a program accesses a page of memory, the CPU must know which physical page to read the data from (that is, it needs the mapping for the virtual address).

The kernel maintains a data structure (the page table) that contains all the information about the pages in use. With this data structure, a virtual address can be mapped to a physical address.

However, the page table is quite complex and slow, so we simply cannot walk the entire data structure every time a process accesses memory.

Fortunately, the processor has a TLB that caches the mapping between virtual and physical addresses. This means that although the page table has to be walked on the first access to a page, all subsequent accesses to that page can be served from the TLB, which is fast.

Since the TLB is implemented in hardware (which is what makes it fast in the first place), its capacity is limited. So if you access more pages than it can hold, the TLB cannot store the mapping for all of them, and your program will run much slower as a result.

Hugepages come to the rescue

So what can we do to avoid overflowing the TLB? (We assume that the program still needs the same amount of memory.)

This is where hugepages come in. Instead of one TLB entry covering just 4096 bytes, one TLB entry can now cover a whopping 2 megabytes. Let's assume the TLB has 512 entries. Without hugepages, it can map:

4096 B ⋅ 512 = 2 MB

Whereas with hugepages it can map:

2 MB ⋅ 512 = 1 GB
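The same arithmetic can be written as a small sketch. The 512-entry TLB is the article's assumption, not a property of any particular CPU, and the function names tlb_reach and pages_needed are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Amount of memory (in bytes) that a TLB with the given number of entries
 * can map at once for a given page size. */
uint64_t tlb_reach(uint64_t entries, uint64_t page_size)
{
    return entries * page_size;
}

/* Number of pages (and therefore TLB entries) needed to cover a working set
 * of the given size, rounding up to whole pages. */
uint64_t pages_needed(uint64_t bytes, uint64_t page_size)
{
    assert(page_size != 0);
    return (bytes + page_size - 1) / page_size;
}
```

With these helpers, tlb_reach(512, 4096) is 2 MB and tlb_reach(512, 2 MB) is 1 GB, matching the numbers above. Looking at it from the other side, a 256 MB buffer needs 65536 normal pages but only 128 hugepages.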

That's why hugepages are cool: they can improve performance without much effort. But there are significant caveats.

Swapping hugepages

The kernel automatically keeps track of how often each memory page is used. If there is not enough physical memory (RAM), the kernel moves less important (less frequently used) pages to the hard drive to free up some RAM for more important pages.
Basically, the same goes for hugepages. However, the kernel can only swap entire pages, not individual bytes.

Let's say we have a program like this:

char* mymemory = malloc(2*1024*1024); // Let's assume this is one hugepage!
// Fill mymemory with some data
// Do lots of other things
// that cause the mymemory page to be swapped out
// ...
// Access only the first byte
putchar(mymemory[0]);

In this case, the kernel will need to swap in (read) a full 2 megabytes from the hard drive/SSD just so you can read one byte. With normal pages, only 4096 bytes need to be read from the hard drive/SSD.

So if a hugepage has been swapped out, reading it back is only faster if you need to access the entire page. This means that if you access different parts of memory at random and only read a couple of kilobytes, you should use regular pages and not worry about anything else.
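The trade-off can be estimated with a small calculation. This sketch follows the article's simplified model, in which every page touched by an access is swapped out and has to be read back in full; the function name swap_in_bytes is invented for this illustration, and real kernel behavior is more nuanced.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes that must be read back from swap to access len bytes starting at
 * offset, assuming (as in the article's simplified model) that every touched
 * page is swapped out and whole pages are read. */
uint64_t swap_in_bytes(uint64_t offset, uint64_t len, uint64_t page_size)
{
    assert(page_size != 0);
    if (len == 0)
        return 0;
    uint64_t first_page = offset / page_size;
    uint64_t last_page = (offset + len - 1) / page_size;
    return (last_page - first_page + 1) * page_size;
}
```

Under these assumptions, reading a single byte costs 4096 bytes of I/O with normal pages and 2 MB with a hugepage, while reading a whole 2 MB region costs 2 MB either way.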

On the other hand, if you need to access a lot of memory sequentially, hugepages will improve your performance. However, you need to test this with your own program (not with abstract software) and see what works faster.

Allocation in memory

If you write in C, you know that you can request arbitrarily small (or almost arbitrarily large) amounts of memory from the heap with malloc(). Let's say you need 30 bytes of memory:

char* mymemory = malloc(30);

To the programmer, it may look as if you are "requesting" 30 bytes of memory from the operating system and receiving a pointer to some virtual memory. But actually malloc() is just a C function that internally calls brk and sbrk to request memory from the operating system or release it.

However, requesting more memory from the operating system for every allocation is inefficient; most likely some memory segment has already been freed with free() and can be reused. malloc() implements rather complex algorithms for reusing freed memory.

All of this happens behind the scenes, so why should you care? Because a call to free() does not mean that the memory is immediately returned to the operating system.

There is such a thing as memory fragmentation. In extreme cases, there are heap segments where only a few bytes are in use while everything in between has been freed with free().

Please note that memory fragmentation is an incredibly complex topic, and even minor changes to a program can make a big difference. In most cases, programs do not cause significant memory fragmentation, but you should be aware that if there is a fragmentation problem in some area of the heap, hugepages can only make the situation worse.
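A rough way to see why hugepages make fragmentation worse is to count how much memory a few scattered live allocations keep occupied. The sketch below is a simplified model, not a description of how malloc() or the kernel account for memory: it assumes that a page stays in use as long as at least one live byte lies on it, and the function name pinned_bytes is invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Given the heap offsets of the live (not yet freed) allocations, sorted in
 * ascending order, return how many bytes stay occupied if a page can only be
 * released once nothing on it is in use. Each allocation is treated as a
 * single byte at its offset, which is a deliberate simplification. */
uint64_t pinned_bytes(const uint64_t *live_offsets, size_t count,
                      uint64_t page_size)
{
    assert(page_size != 0);
    uint64_t pages = 0;
    uint64_t last_page = 0;
    for (size_t i = 0; i < count; i++) {
        uint64_t page = live_offsets[i] / page_size;
        /* Offsets are sorted, so a new page index means a new pinned page. */
        if (i == 0 || page != last_page) {
            pages++;
            last_page = page;
        }
    }
    return pages * page_size;
}
```

For four small live allocations at offsets 0, 3 MB, 6 MB and 9 MB, this model gives 16 KB kept in use with normal pages, but 8 MB with hugepages.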

Selective application of hugepages

After reading this article, you can determine which parts of your program could benefit from hugepages and which could not. So should hugepages be enabled at all?

Luckily, you can use madvise() to enable hugepages only for the areas of memory where they would be useful.

First, check that hugepages are running in madvise mode, using the instructions at the beginning of the article.

Then, use madvise() to tell the kernel exactly where to use hugepages.

#include <stdlib.h>
#include <sys/mman.h>
// Allocate a large amount of memory that you will use.
// madvise() requires a page-aligned address, so align to the 2 MB hugepage size.
size_t size = 256*1024*1024;
char* mymemory = aligned_alloc(2*1024*1024, size);
// Just enable hugepages...
madvise(mymemory, size, MADV_HUGEPAGE);
// ... and, optionally, hint at sequential access. Advice values are not
// bit flags and cannot be combined with |, so this is a separate call.
madvise(mymemory, size, MADV_SEQUENTIAL);

Note that this method is just a recommendation to the kernel about memory management. This does not mean that the kernel will automatically use hugepages for the given memory.
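One way to see whether the kernel actually followed the hint is to look at the AnonHugePages fields in /proc/self/smaps, which report how much of each mapping is currently backed by transparent hugepages. The following is a minimal sketch; the function names parse_kb_field and anon_hugepages_kb are invented for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if line has the form "<key>:   <number> kB", store the
 * number in *kb and return 1; otherwise return 0. */
int parse_kb_field(const char *line, const char *key, unsigned long long *kb)
{
    assert(line != NULL && key != NULL && kb != NULL);
    size_t key_len = strlen(key);
    if (strncmp(line, key, key_len) != 0 || line[key_len] != ':')
        return 0;
    return sscanf(line + key_len + 1, "%llu", kb) == 1;
}

/* Hypothetical helper: sum the AnonHugePages fields of all mappings of the
 * current process. Returns the total in kB, or -1 if the file is unreadable. */
long long anon_hugepages_kb(void)
{
    FILE *f = fopen("/proc/self/smaps", "r");
    if (f == NULL)
        return -1;
    char line[256];
    unsigned long long total = 0, kb = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_kb_field(line, "AnonHugePages", &kb))
            total += kb;
    }
    fclose(f);
    return (long long)total;
}
```

Call anon_hugepages_kb() after the memory has been written to at least once: pages are only backed by physical memory when they are first touched, so the value is normally 0 right after allocation.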

Refer to the madvise manpage to learn more about memory management and madvise(). This topic has an incredibly steep learning curve, so if you want to get really good at it, be prepared to read and test for a few weeks before expecting any positive results.

What to read?


Source: habr.com
