Operating Systems: Three Easy Pieces. Part 5: Scheduling: The Multi-Level Feedback Queue (translation)

Introduction to Operating Systems

Hey Habr! I would like to bring to your attention a series of translated articles from a book I find interesting: OSTEP. This material takes a fairly deep look at how unix-like operating systems work: processes, the various schedulers, memory, and the other components that make up a modern OS. You can see the original of all the materials here. Please note that the translation is unprofessional (and quite free), but I hope I have preserved the general meaning.

Lab work on this subject can be found here:

Other parts:

You can also check out my Telegram channel =)

Scheduling: The Multi-Level Feedback Queue

In this lecture we will discuss the design of one of the best-known approaches to scheduling, the Multi-Level Feedback Queue (MLFQ). The MLFQ scheduler was first described in 1962 by Fernando J. Corbató in a system called the Compatible Time-Sharing System (CTSS). This work (including later work on Multics) was subsequently recognized with a Turing Award. The scheduler was later refined into the form that can be found in some modern systems.

The MLFQ algorithm tries to solve two fundamental, conflicting problems. First, it tries to optimize turnaround time, which, as we discussed in the previous lecture, is achieved by running the shortest tasks first. However, the OS does not know in advance how long a given process will run, and exactly that knowledge is required by the SJF and STCF algorithms. Second, MLFQ tries to make the system feel responsive to users (for example, those sitting and staring at the screen while waiting for a task to finish) and thus minimize response time. Unfortunately, algorithms like RR reduce response time but hurt the turnaround time metric. Hence our problem: how do we design a scheduler that meets both requirements while knowing nothing in advance about the nature of a process? How can the scheduler learn the characteristics of the tasks it runs and thereby make better scheduling decisions?

The crux of the problem: how to schedule tasks without perfect knowledge? How do we design a scheduler that minimizes response time for interactive tasks while also minimizing turnaround time, without advance knowledge of task execution times?

Note: learning from previous events

MLFQ is an excellent example of a system that learns from past events to predict the future. Such approaches are common in operating systems (and in many other branches of computer science, including hardware branch predictors and caching algorithms). They work when tasks have behavioral phases and are thus predictable. One should be careful with this technique, however, because predictions can easily turn out to be wrong and lead the system to make worse decisions than it would with no knowledge at all.

MLFQ: Basic Rules

Let us consider the basic rules of the MLFQ algorithm. Although there are several implementations of this algorithm, the basic approaches are similar.
In the implementation we will consider, MLFQ has several separate queues, each with a different priority. At any moment, a task that is ready to run sits in exactly one queue. MLFQ uses the priorities to decide which task to run: a task with higher priority (i.e., a task in a higher-priority queue) runs first.
Of course, there may be more than one task in a given queue, and such tasks all have the same priority. In that case, round-robin (RR) scheduling is used among them.
Thus we arrive at the two basic rules of MLFQ:

  • Rule 1: If Priority(A) > Priority(B), task A runs (B does not).
  • Rule 2: If Priority(A) = Priority(B), A and B run in round-robin.
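A minimal sketch (in Python, with illustrative structures not taken from any real kernel) of how Rules 1 and 2 pick the next task: scan the queues from highest priority down, and rotate within a level to get round-robin among equals.

```python
from collections import deque

# Hypothetical illustration of Rules 1 and 2: queues[0] is the highest
# priority level; each level is a FIFO deque of task names.
def pick_next(queues):
    for level in queues:            # Rule 1: highest non-empty level wins
        if level:
            task = level.popleft()  # Rule 2: rotate tasks at this level
            level.append(task)      # so equals share the CPU round-robin
            return task
    return None                     # nothing is runnable

queues = [deque(["A", "B"]), deque(["C"]), deque(["D"])]
# A and B alternate; C and D never run while A or B are ready.
```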

It follows from the above that the key to MLFQ scheduling is the priorities. Instead of giving each task a fixed priority, MLFQ varies a task's priority based on its observed behavior. For example, if a task repeatedly relinquishes the CPU while waiting for keyboard input, MLFQ keeps its priority high, because that is how an interactive process should behave. If, on the contrary, a task uses the CPU intensively for long periods, MLFQ lowers its priority. In this way, MLFQ learns about processes as they run and uses their history to predict future behavior.
Let us sketch what the queues might look like at some moment in time; we would get something like this:
[Figure: an example snapshot of the MLFQ queues]

In this snapshot, two processes, A and B, are in the highest-priority queue. Process C is somewhere in the middle, and process D is in the lowest queue. According to the MLFQ rules above, the scheduler will run only the tasks with the highest priority, A and B, in round-robin, and tasks C and D will never run.
Naturally, a static snapshot does not give a complete picture of how MLFQ works; it is important to understand exactly how the picture changes over time.

Attempt 1: How to change the priority

At this point we need to decide how MLFQ will change a task's priority level (and thus its position in the queues) over the task's lifetime. To do this, keep the workload in mind: a certain number of interactive tasks with short run times (which therefore release the CPU frequently), and several long-running tasks that use the CPU for all of their time and for which response time does not matter. Here is a first attempt at the MLFQ algorithm, with the following rules:

  • Rule 3: When a task enters the system, it is placed in the highest-priority queue.
  • Rule 4a: If a task uses up its entire time slice while running, its priority is lowered.
  • Rule 4b: If a task releases the CPU before its time slice expires, it stays at the same priority.
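The demotion logic of Rules 3, 4a and 4b can be sketched as follows (a toy Python illustration; the level count and names are assumptions, not from any real system):

```python
# Hypothetical sketch of Rules 3, 4a and 4b. A task's priority is an
# index into the queue list: 0 is highest, larger numbers are lower.
NUM_LEVELS = 3

def on_arrival():
    return 0                      # Rule 3: new tasks enter at the top

def after_slice(priority, used_full_slice):
    if used_full_slice:           # Rule 4a: burned the whole quantum
        return min(priority + 1, NUM_LEVELS - 1)
    return priority               # Rule 4b: yielded early, stay put

p = on_arrival()                           # starts at level 0
p = after_slice(p, used_full_slice=True)   # CPU-bound: demoted to 1
p = after_slice(p, used_full_slice=False)  # did I/O: stays at 1
```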

Example 1: Single long-running task

As you can see in this example, the task enters with the highest priority. After one 10ms time slice, the scheduler lowers its priority. After the next time slice, the task is finally demoted to the lowest priority in the system, where it remains.
[Figure: a single long-running task descending through the queues]

Example 2: Along came a short task

Now let's see an example of how MLFQ tries to approximate SJF. In this example there are two tasks: A, a long-running task that continuously occupies the CPU, and B, a short interactive task. Suppose A has already been running for some time when task B arrives.
[Figure: long-running task A joined by short task B]

The graph shows the result of this scenario. Task A, like any CPU-bound task, has sunk to the very bottom. Task B arrives at time T=100 and is placed in the highest-priority queue. Since its run time is short, it completes before reaching the bottom queue.

This example illustrates the main goal of the algorithm: since it does not know whether a task is long or short, it first assumes the task is short and gives it the highest priority. If it really is a short task, it runs and finishes quickly; otherwise, if it is a long task, it slowly moves down in priority and soon proves that it is indeed a long task that does not require a fast response.

Example 3: What about I/O?

Now let's look at an I/O example. As Rule 4b states, if a process releases the processor without fully using its time slice, it stays at the same priority level. The intent of this rule is quite simple: if an interactive job performs a lot of I/O, for example waiting for the user's keystrokes or mouse input, it will release the processor before its allotted slice expires. We would not want to penalize such a task by lowering its priority, so it stays at the same level.
[Figure: interactive task B performing I/O alongside long-running task A]

This example shows how the algorithm handles such processes: an interactive task B, which needs the CPU for only 1ms before performing an I/O, and a long-running job A, which uses the CPU the whole time. MLFQ keeps process B at the highest priority, since B keeps releasing the CPU. If B is an interactive task, the algorithm has achieved its goal of running interactive tasks quickly.

Problems with the current MLFQ algorithm

In the previous examples we built a basic version of MLFQ, and it seems to do its job well, sharing CPU time fairly between long tasks and letting short or I/O-intensive tasks run quickly. Unfortunately, this approach contains several serious problems.
First, starvation: if the system has many interactive tasks, they will consume all the CPU time, and no long-running task will ever get a chance to execute (they starve).

Second, smart users could write their programs to fool the scheduler. The trick is to do something that forces the scheduler to give the process more than its fair share of CPU time. The algorithm described above is quite vulnerable to such an attack: just before the time slice is over, issue an I/O operation (to some file, it does not matter which) and thereby release the CPU. Such behavior lets the task remain in the same queue and again receive a large share of CPU time. Done right (e.g., by running for 99% of the slice before releasing the CPU), a task can simply monopolize the processor.
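The attack on Rules 4a/4b can be made concrete with a toy sketch (illustrative numbers and names; this is not code from any real scheduler):

```python
# Sketch of gaming Rules 4a/4b: by issuing an I/O just before the
# quantum expires, a task never "uses its entire time slice" and is
# therefore never demoted.
QUANTUM_MS = 10

def gaming_task_step():
    compute_ms = QUANTUM_MS * 0.99   # burn 99% of the slice...
    return compute_ms                # ...then release the CPU via an I/O

def after_slice(priority, used_full_slice):
    return priority + 1 if used_full_slice else priority  # Rules 4a/4b

priority = 0
for _ in range(100):                 # one hundred slices later...
    used = gaming_task_step()
    priority = after_slice(priority, used >= QUANTUM_MS)
# priority is still 0: the task kept ~99% of the CPU at top priority.
```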

Finally, a program may change its behavior over time: a task that was CPU-bound may become interactive. With our current rules, such a task will not be treated by the scheduler the way other (originally) interactive tasks are.

Question to the audience: what attacks on the scheduler could be made in the modern world?

Attempt 2: Increase the priority

Let's try to change the rules and see if we can avoid the starvation problem. What could we do to guarantee that CPU-bound tasks get at least some CPU time (even if not much)?
As a simple solution, we can periodically boost the priority of all such tasks in the system. There are many ways to achieve this; as an example, let's try something simple: move all tasks to the highest priority at once. Hence a new rule:

  • Rule 5: After some period S, move all tasks in the system to the topmost queue.
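Rule 5 itself is a few lines of code. A toy Python sketch (the queue structures are illustrative assumptions):

```python
from collections import deque

# Sketch of Rule 5: every S milliseconds, move every task back to the
# top queue. queues[0] is the highest-priority level.
def boost(queues):
    top = queues[0]
    for level in queues[1:]:
        while level:
            top.append(level.popleft())  # everyone returns to the top

queues = [deque(["A"]), deque(["B"]), deque(["C", "D"])]
boost(queues)
# queues[0] now holds A, B, C, D; the lower queues are empty, so even a
# long-running job is guaranteed a round-robin turn at the top.
```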

Our new rule solves two problems at once. First, processes are guaranteed not to starve: tasks in the topmost queue share the processor in round-robin, so all processes receive CPU time. Second, if a process that previously used only the CPU becomes interactive, it will remain in the highest-priority queue after receiving the boost to the top.
Consider an example with a single long-running, CPU-bound process and two interactive, short processes.

[Figure: behavior without a priority boost (left) and with one every 50ms (right)]

On the left of the figure, the behavior without a priority boost is shown: the long-running task begins to starve once the two interactive tasks arrive in the system. On the right, a priority boost is performed every 50ms, so all processes are guaranteed to receive processor time and are run periodically. 50ms here is just an example; in reality this number is somewhat higher.
Obviously, adding the periodic boost period S leads to an inevitable question: what value should it be set to? The distinguished systems engineer John Ousterhout called such values in systems voo-doo constants, since getting them right seemed to require black magic. Unfortunately, S has exactly that flavor: if it is set too large, long-running tasks will starve; if too small, interactive tasks will not receive their proper share of CPU time.

Attempt 3: Better Accounting

Now we have one more problem to solve: how do we prevent our scheduler from being gamed? The culprits are rules 4a and 4b, which let a job keep its priority by releasing the processor before its allotted time expires. How do we deal with this?
The solution is better accounting of CPU time at each MLFQ level. Instead of forgetting how much of its allotment a program has used, the scheduler should track and accumulate it. Once the process has used up its allotment, it is demoted to the next priority level. It no longer matters how the process spends its time, whether computing continuously on the processor or in a series of short bursts. Rule 4 is therefore rewritten as follows:

  • Rule 4: Once a task has used up its time allotment at its current level (regardless of how many times it has released the CPU), its priority is reduced (it moves down one queue).
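The improved accounting can be sketched like this (a toy Python illustration; the 10ms allotment and class layout are assumptions for the example):

```python
# Sketch of the improved Rule 4: track total CPU consumed at the current
# level, regardless of how many times the task yielded for I/O.
ALLOTMENT_MS = 10

class Task:
    def __init__(self):
        self.priority = 0      # current queue level (0 = highest)
        self.used_ms = 0       # CPU consumed at this level so far

    def account(self, ran_ms):
        self.used_ms += ran_ms
        if self.used_ms >= ALLOTMENT_MS:   # allotment exhausted...
            self.priority += 1             # ...demote, no matter how the
            self.used_ms = 0               # time was spent, and reset

t = Task()
for _ in range(10):
    t.account(1)   # ten 1ms bursts, each ending in an I/O call
# Even though t never used a full slice in one go, it is demoted.
```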

Let's look at an example:
[Figure: gaming the scheduler with the old rules 4a/4b (left) and with the new Rule 4 (right)]

The figure shows what happens when a process tries to trick the scheduler: with the previous rules 4a and 4b, the result is on the left; with the new rule, on the right. Without the protection, a process could issue an I/O just before the end of its slice and thus dominate the CPU; with the protection enabled, regardless of its I/O behavior, the process still moves down the queues and cannot unfairly seize CPU resources.

Improving MLFQ and other issues

With the above improvements, new questions arise, one of the main ones being how to parameterize such a scheduler. How many queues should there be? What should the time slice be in each queue? How often should priorities be boosted to avoid starvation and to account for changes in program behavior? There are no simple answers to these questions: only experiments with workloads and subsequent tuning of the scheduler can lead to a satisfactory balance.

For example, most MLFQ implementations allow different time slices for different queues. High-priority queues usually get short slices: they consist of interactive tasks, and switching between them is latency-sensitive and should take 10ms or less. In contrast, low-priority queues consist of long-running, CPU-bound tasks, for which long time slices work very well (e.g., 100ms).
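As a configuration sketch, such a parameterization might look like this (the numbers below follow the example in the text and are illustrative, not from any real system):

```python
# Illustrative per-queue parameterization: shorter slices at high
# priority for interactive tasks, longer ones at the bottom for
# CPU-bound work.
time_slice_ms = {
    0: 10,    # high-priority queue: 10ms slices
    1: 20,    # middle queue: 20ms slices
    2: 40,    # low-priority queue: 40ms slices
}

def slice_for(priority):
    return time_slice_ms[priority]
```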
[Figure: an example with time slices that grow from queue to queue]

In this example, two tasks run for 20ms in the high-priority queue, divided into 10ms slices, then for 40ms in the middle queue (20ms slices); in the low-priority queue the slice grows to 40ms, and there the tasks complete their work.

The MLFQ implementation in the Solaris OS is the time-sharing scheduling class. It provides a set of tables that define exactly how a process's priority changes over its lifetime, how large the time slices should be, and how often task priorities should be boosted. A system administrator can modify this table and make the scheduler behave differently. By default, the table has 60 queues, with slice sizes gradually growing from 20ms (highest priority) to several hundred ms (lowest priority), and with a boost of all tasks once per second.

Other MLFQ schedulers do not use a table or the specific rules described in this chapter; instead, they compute priorities using mathematical formulas. For example, the scheduler in FreeBSD used a formula that calculates a task's current priority based on how much CPU the process has used; in addition, CPU usage decays over time, so the priority boost works somewhat differently than described above. These are called decay-usage algorithms. Since version 7.1, FreeBSD has used the ULE scheduler.

Finally, many schedulers have other features. For example, some reserve the highest priority levels for the operating system, so that no user process can obtain the highest priority in the system. Some systems allow users to give advice to help the scheduler set priorities correctly: for example, using the nice command you can raise or lower a task's priority and thereby increase or decrease its share of CPU time.
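On POSIX systems the same advice can be given programmatically; for instance, Python exposes it as `os.nice` (a thin wrapper over the nice(2) system call). A minimal sketch:

```python
import os

# os.nice() adjusts the calling process's niceness by the given
# increment: a positive value lowers its scheduling priority ("be
# nicer"). Lowering niceness (raising priority) normally requires
# elevated privileges, so we only raise it here.
new_niceness = os.nice(5)   # yield CPU share to other processes
# os.nice returns the process's new niceness value.
```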

MLFQ: Summary

We have described a scheduling approach called MLFQ. Its name reflects how it works: it has multiple queue levels and uses feedback to set a task's priority.
The final form of the rules is as follows:

  • Rule 1: If Priority(A) > Priority(B), task A runs (B does not).
  • Rule 2: If Priority(A) = Priority(B), A and B run in round-robin.
  • Rule 3: When a task enters the system, it is placed in the highest-priority queue.
  • Rule 4: Once a task has used up its time allotment at its current level (regardless of how many times it has released the CPU), its priority is reduced (it moves down one queue).
  • Rule 5: After some period S, move all tasks in the system to the topmost queue.
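All five rules can be combined into a toy discrete-time simulator (a Python sketch in 1ms ticks; the level count, allotment, boost period, and the two jobs are illustrative assumptions, not parameters of any real system):

```python
from collections import deque

# A toy MLFQ simulator combining Rules 1-5, running in 1ms ticks.
NUM_LEVELS, ALLOTMENT_MS, BOOST_MS = 3, 10, 50

def simulate(jobs, total_ms):
    # jobs: {name: remaining_ms}; used: CPU consumed at current level
    queues = [deque(jobs)] + [deque() for _ in range(NUM_LEVELS - 1)]  # Rule 3
    used = {name: 0 for name in jobs}
    finished = []
    for t in range(total_ms):
        if t and t % BOOST_MS == 0:                  # Rule 5: boost everyone
            for q in queues[1:]:
                while q:
                    queues[0].append(q.popleft())
            used = {n: 0 for n in used}
        level = next((i for i, q in enumerate(queues) if q), None)
        if level is None:
            break                                    # nothing left to run
        name = queues[level].popleft()               # Rules 1+2: RR at top level
        jobs[name] -= 1                              # run one 1ms tick
        used[name] += 1
        if jobs[name] == 0:
            finished.append((name, t + 1))           # record completion time
        elif used[name] >= ALLOTMENT_MS:             # Rule 4: allotment spent
            used[name] = 0
            queues[min(level + 1, NUM_LEVELS - 1)].append(name)
        else:
            queues[level].append(name)
    return finished

# A long CPU-bound job and a short one arriving together: the short job
# finishes quickly, while still near the top of the queue hierarchy.
done = simulate({"long": 100, "short": 15}, 200)
```

The short job completes far earlier than the long one, which is exactly the SJF-like behavior the rules aim for.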

MLFQ is interesting for the following reason: instead of requiring advance knowledge of a task's nature, it observes a task's past behavior and sets priorities accordingly. In this way it tries to have it both ways: achieving good performance for short tasks (as SJF and STCF do) while still honestly running long, CPU-bound jobs. For this reason many systems, including BSD and its derivatives, Solaris, Windows, and Mac, use some form of MLFQ as their base scheduler.

Additional materials:

  1. manpages.debian.org/stretch/manpages/sched.7.en.html
  2. en.wikipedia.org/wiki/Scheduling_(computing)
  3. pages.lip6.fr/Julia.Lawall/atc18-bouron.pdf
  4. www.usenix.org/legacy/event/bsdcon03/tech/full_papers/roberson/roberson.pdf
  5. chebykin.org/freebsd-process-scheduling

Source: habr.com
