Configuring Out-Of-Memory Killer on Linux for PostgreSQL

When a database server crashes unexpectedly on Linux, you need to find the cause. There can be several. One is SIGSEGV, a failure caused by a bug in the backend server, but this is rare. Most often, the server simply runs out of disk space or memory. If it runs out of disk space, there is only one way out: free up space and restart the database.

Out Of Memory Killer

When a server or process runs out of memory, Linux offers two solutions: crash the entire system or terminate the process (application) that is eating up memory. It is better, of course, to terminate the process and save the OS from crashing. In a nutshell, the Out-Of-Memory Killer is a kernel mechanism that terminates an application in order to save the kernel from crashing. It sacrifices the application to keep the OS running. Let's first discuss how the OOM Killer works and how it decides which application to terminate, and then see how to control it.

One of the main tasks of Linux is to allocate memory to processes when they ask for it. Usually, a process asks the OS for memory but does not use all of it. If the OS physically handed out memory to every process that asked for it but never used it, memory would run out very quickly and the system would fail. To avoid this, the OS reserves memory for the process but does not actually hand it over. Physical memory is allocated only when the process really starts to use it. The OS may even promise memory that is not free at the moment and allocate it later, when the process needs it, if it can. The downside is that sometimes the OS has promised memory, but when the process needs it there is none free, and the system would crash. This is where OOM comes in: it terminates processes to keep the kernel from panicking. When a PostgreSQL process is forcibly terminated, a message appears in the log:

Out of memory: Killed process 12345 (postgres).

If the system is short of memory and none can be freed, the function out_of_memory() is called. At this stage it has only one option left: terminate one or more processes. Should the OOM-Killer kill a process right away, or can it wait? When out_of_memory() is called, the shortage may be temporary, for example because the system is waiting for an I/O operation or for pages to be written out to disk. Therefore, the OOM-Killer must first perform checks and, based on them, decide whether a process needs to be terminated. If all of the checks below are positive, OOM will end the process.

Process selection

When memory runs out, the function out_of_memory() is called. It calls select_bad_process(), which relies on the score computed by the function badness(). The "worst" process becomes the victim. The badness() function rates processes according to these rules:

  1. The kernel needs some minimum memory for itself.
  2. A lot of memory has to be freed.
  3. Processes that use little memory should not be terminated.
  4. As few processes as possible should be terminated.
  5. Heuristics raise the chances of termination for processes that the user would want to terminate.

After all these checks, OOM knows the score (oom_score) of each process. The score grows with the amount of memory the process uses, so processes with higher values are more likely to fall victim to the OOM Killer. Processes owned by the root user get a lower score and are less likely to be forcibly terminated.
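The relationship between memory use, the adjustment value and the score can be illustrated with a deliberately simplified model. This is a sketch, not the kernel's code: it assumes the score is the share of memory in use on a 0 to 1000 scale plus the oom_score_adj value discussed further down, and estimate_oom_score is a hypothetical helper name.

```shell
#!/bin/sh
# Simplified model of the OOM score, NOT the kernel's exact formula.
# Assumption: score = 1000 * used / total + oom_score_adj, clamped to 0..1000.
# estimate_oom_score is a hypothetical helper.
# Usage: estimate_oom_score USED_KB TOTAL_KB [OOM_SCORE_ADJ]
estimate_oom_score() (
    used_kb=$1
    total_kb=$2
    adj=${3:-0}
    if [ "$adj" -le -1000 ]; then
        # An adjustment of -1000 exempts the process entirely.
        score=0
    else
        score=$(( used_kb * 1000 / total_kb + adj ))
        if [ "$score" -lt 0 ]; then score=0; fi
        if [ "$score" -gt 1000 ]; then score=1000; fi
    fi
    echo "$score"
)
```

In this model a process that holds a quarter of memory scores 250, and an adjustment of -100 lowers that to 150. Real kernels differ in the details, so treat the numbers as an illustration only.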

To look at the score of a real process, first get the PID of a PostgreSQL backend:

postgres=# SELECT pg_backend_pid();
pg_backend_pid 
----------------
    3813
(1 row)

The Postgres process ID is 3813, so in another shell you can read its score from the kernel's oom_score file:

vagrant@vagrant:~$ sudo cat /proc/3813/oom_score
2
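To compare that score with the rest of the system, the same file can be read for every process. The sketch below assumes the usual /proc layout; top_oom_scores is a hypothetical helper, and its optional second argument exists only so the function can be pointed at a test directory.

```shell
#!/bin/sh
# Print "score pid name" for the processes with the highest oom_score.
# top_oom_scores is a hypothetical helper, not a standard tool.
# Usage: top_oom_scores [COUNT] [PROC_ROOT]
top_oom_scores() (
    count=${1:-10}
    proc_root=${2:-/proc}
    for dir in "$proc_root"/[0-9]*; do
        # Processes may exit while we iterate, so skip unreadable entries.
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || continue
        printf '%s %s %s\n' "$score" "${dir##*/}" "$name"
    done | sort -rn | head -n "$count"
)

# Example: top_oom_scores 5
```

The processes at the top of this list are the ones the OOM Killer is most likely to pick first.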

If you don't want the OOM-Killer to end the process at all, there is another kernel setting: oom_score_adj. Write a large negative value to it to reduce the chances that a process you care about is terminated. A value of -1000 exempts the process completely.

echo -100 | sudo tee /proc/3813/oom_score_adj
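Because the kernel only accepts integers from -1000 to 1000 here, a small wrapper can catch typos before anything is written. This is a sketch: set_oom_score_adj is a hypothetical helper, and the optional third argument is only there so it can be tried against a scratch directory instead of /proc.

```shell
#!/bin/sh
# set_oom_score_adj is a hypothetical helper, not a standard tool.
# Usage: set_oom_score_adj PID VALUE [PROC_ROOT]
set_oom_score_adj() (
    pid=$1
    value=$2
    proc_root=${3:-/proc}
    # Reject anything that is not an optionally negative integer.
    case $value in
        ''|-|*[!0-9-]*|?*-*)
            echo "invalid value: $value" >&2
            exit 1
            ;;
    esac
    # The kernel accepts only values from -1000 to 1000.
    if [ "$value" -lt -1000 ] || [ "$value" -gt 1000 ]; then
        echo "value out of range: $value" >&2
        exit 1
    fi
    target=$proc_root/$pid/oom_score_adj
    if [ ! -w "$target" ]; then
        echo "cannot write $target (root needed?)" >&2
        exit 1
    fi
    echo "$value" > "$target"
)

# Example (as root): set_oom_score_adj 3813 -100
```

Lowering the value normally requires root privileges, so run the function from a root shell. A value written this way is lost when the process exits.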

To set oom_score_adj permanently for a systemd service, set OOMScoreAdjust in the [Service] section of its unit file:

[Service]
OOMScoreAdjust=-1000

Or use oomprotect with the rcctl command:

rcctl set <servicename> oomprotect -1000

Force termination of a process

Once one or more processes have been selected, the OOM-Killer calls the function oom_kill_task(), which sends a termination signal to the process. When memory is low, oom_kill() calls this function to send SIGKILL to the process. A message is written to the kernel log:

Out of memory: Killed process [pid] [name].
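To find out after the fact which processes were killed, the kernel log can be filtered for these messages. The sketch below assumes kill lines contain "Killed process <pid> (<name>)", as in the PostgreSQL example earlier; oom_kills is a hypothetical helper that reads log text on standard input.

```shell
#!/bin/sh
# oom_kills is a hypothetical helper. It reads kernel log text on stdin
# and prints "pid name" for every OOM kill line it finds.
# Assumption: kill lines contain "Killed process <pid> (<name>)".
oom_kills() {
    sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Examples:
#   sudo dmesg | oom_kills
#   journalctl -k | oom_kills
```

If postgres shows up in the output, the crash you are investigating was most likely caused by the OOM Killer and not by a bug in the server.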

How to control OOM-Killer

On Linux, you can enable and disable the OOM-Killer (although disabling it is not recommended). On kernels that provide it, the vm.oom-kill parameter does this; it is not available in every kernel. To enable the OOM-Killer at runtime, use the sysctl command:

sudo sysctl -w vm.oom-kill=1

To disable OOM-Killer, specify the value 0 in the same command:

sudo sysctl -w vm.oom-kill=0

This setting is not permanent: it lasts only until the next reboot. To make it persistent, add this line to the file /etc/sysctl.conf:

echo "vm.oom-kill = 1" | sudo tee -a /etc/sysctl.conf

Another way to control the behavior is the panic_on_oom variable, which decides whether the kernel panics when memory runs out instead of calling the OOM-Killer. The current value can always be checked in /proc:

$ cat /proc/sys/vm/panic_on_oom
0

If you set the value to 0, the kernel does not panic when memory runs out; the OOM-Killer is called instead.

$ echo 0 | sudo tee /proc/sys/vm/panic_on_oom

If you set the value to 1, a kernel panic occurs when memory runs out.

$ echo 1 | sudo tee /proc/sys/vm/panic_on_oom

The OOM-Killer can do more than be turned on and off. We have already said that Linux can reserve more memory for processes than is actually available, without allocating it. This behavior is controlled by the kernel parameter vm.overcommit_memory.

You can specify the following values for it:

0: the kernel itself decides whether a reservation is too large. This is the default on most versions of Linux.
1: the kernel always allows over-reservation. This is risky, because memory can run out: sooner or later the processes will most likely claim what they were promised.
2: the kernel will not reserve more memory than the overcommit_ratio parameter allows.

In mode 2, overcommit_ratio is the percentage of physical RAM that, together with swap space, makes up the limit for reservations. If a request does not fit within that limit, no memory is allocated and the reservation is denied. This is the safest option and the one recommended for PostgreSQL.

The OOM-Killer is also affected by swapping, which is controlled by the variable /proc/sys/vm/swappiness. Its value tells the kernel how to handle paging. The larger the value, the less likely it is that OOM will terminate the process, but the extra I/O hurts database performance. Conversely, the lower the value, the more likely the OOM-Killer is to step in, but the database performs better. The default value is 60, but if the entire database fits in memory, it is better to set the value to 1.
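For vm.overcommit_memory set to 2, the resulting limit can be estimated before the setting is changed. The sketch below follows the documented formula but ignores huge pages, which is an assumption; commit_limit_kb is a hypothetical helper name, and 50 is the kernel's default overcommit_ratio.

```shell
#!/bin/sh
# commit_limit_kb is a hypothetical helper. It estimates the reservation
# limit for vm.overcommit_memory=2, ignoring huge pages (assumption):
#   limit = swap + ram * overcommit_ratio / 100
# Usage: commit_limit_kb RAM_KB SWAP_KB [OVERCOMMIT_RATIO]
commit_limit_kb() (
    ram_kb=$1
    swap_kb=$2
    ratio=${3:-50}
    echo $(( swap_kb + ram_kb * ratio / 100 ))
)

# Example: 16 GB of RAM, 2 GB of swap, overcommit_ratio raised to 80
#   commit_limit_kb 16000000 2000000 80
```

On a running system, the kernel's own figure appears as CommitLimit in /proc/meminfo, next to Committed_AS, the amount currently reserved. With little swap and the default ratio, the limit can be well below the size of physical RAM, so check it before switching a production server to mode 2.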

Summary

Don't let the "killer" in OOM-Killer scare you. In this case, the killer is the savior of your system. It "kills" the worst processes and saves the system from crashing. To avoid having the OOM-Killer terminate PostgreSQL, set vm.overcommit_memory to 2. This does not guarantee that the OOM-Killer will never have to step in, but it reduces the chance of a PostgreSQL process being forcibly terminated.

Source: habr.com