Virtual file systems in Linux: why are they needed and how do they work? Part 1

Hi all! We keep launching new cohorts of the courses you already love, and we are pleased to announce a new intake for the "Linux Administrator" course, which starts at the end of April. This publication is timed to that event. The original article is available here.

Virtual file systems are the magic abstraction that makes the Linux philosophy of "everything is a file" possible.


What is a file system? According to early Linux contributor and author Robert Love, "A filesystem is a hierarchical storage of data adhering to a specific structure." However, this description applies equally well to VFAT (Virtual File Allocation Table), Git, and Cassandra (a NoSQL database). So what exactly distinguishes a file system?

Filesystem Basics

The Linux kernel has specific requirements for an entity to be considered a file system. It must implement the open(), read(), and write() methods for persistent objects that have names. From the point of view of object-oriented programming, the kernel defines the generic file system as an abstract interface, and these big-three functions are "virtual", with no default definition. Accordingly, the kernel's default file system implementation is called a virtual file system (VFS).
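The object-oriented analogy can be sketched in a few lines of Python. This is only an illustration of the idea, not kernel code: the names FileLike and MemoryFS are hypothetical, and the real kernel expresses the same thing with C structures of function pointers.

```python
from abc import ABC, abstractmethod


class FileLike(ABC):
    """Hypothetical stand-in for the generic interface: three methods, no bodies."""

    @abstractmethod
    def open(self, name):
        """Return a handle for the named object."""

    @abstractmethod
    def read(self, handle, size):
        """Return up to `size` bytes from the object behind `handle`."""

    @abstractmethod
    def write(self, handle, data):
        """Append `data` to the object and return the number of bytes accepted."""


class MemoryFS(FileLike):
    """Toy concrete implementation that keeps named objects in a dictionary."""

    def __init__(self):
        self._files = {}     # name -> bytearray with the object's contents
        self._handles = []   # handle (list index) -> name

    def open(self, name):
        self._files.setdefault(name, bytearray())
        self._handles.append(name)
        return len(self._handles) - 1

    def read(self, handle, size):
        return bytes(self._files[self._handles[handle]][:size])

    def write(self, handle, data):
        self._files[self._handles[handle]] += data
        return len(data)
```

Trying to instantiate FileLike directly raises a TypeError, which mirrors the point that the generic interface has no definition of its own; only a concrete implementation such as MemoryFS can actually be used.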


If we can open, read, and write to an entity, then that entity is considered a file, as the console example above shows.
VFS underlies the famous observation that in Unix-like systems "everything is a file." Consider how odd it is that the little /dev/console example above shows how the console actually works. The picture shows an interactive Bash session. Sending a string to the virtual console device makes it appear on a virtual screen. VFS has other, even stranger properties. For example, it is possible to seek within such files.

Familiar file systems such as ext4, NFS, and /proc all provide definitions of the big-three functions in a C data structure called file_operations. In addition, particular file systems extend and override the VFS functions in the familiar object-oriented way. As Robert Love points out, the VFS abstraction lets Linux users blithely copy files to and from foreign operating systems or abstract entities like pipes without worrying about their internal data format. On behalf of userspace, via a system call, a process can copy from a file into the kernel's data structures with the read() method of one file system, then use the write() method of another file system to output the data.
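A minimal userspace sketch of that last point follows, assuming a POSIX system. It copies bytes from a pipe into an ordinary temporary file using nothing but read() and write() on file descriptors; the helper names copy_fd and demo are made up for this illustration.

```python
import os
import tempfile


def copy_fd(src_fd, dst_fd, chunk_size=4096):
    """Copy bytes from one descriptor to another using only read() and write()."""
    total = 0
    while True:
        chunk = os.read(src_fd, chunk_size)
        if not chunk:              # an empty read means end of file
            break
        view = memoryview(chunk)
        while view:                # write() may accept fewer bytes than offered
            written = os.write(dst_fd, view)
            view = view[written:]
            total += written
    return total


def demo():
    """Source is a pipe, destination is a regular file: same two calls for both."""
    read_end, write_end = os.pipe()
    os.write(write_end, b"hello from a pipe\n")
    os.close(write_end)            # closing the writer lets the reader see EOF
    with tempfile.TemporaryFile() as regular_file:
        copied = copy_fd(read_end, regular_file.fileno())
        os.close(read_end)
        regular_file.seek(0)
        return copied, regular_file.read()
```

The copy loop never asks what kind of object sits behind either descriptor, which is exactly the convenience the VFS layer provides.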

The function definitions that belong to the base VFS types are found in the fs/*.c files of the kernel source code, while the subdirectories of fs/ contain the specific file systems. The kernel also contains file-system-like entities such as cgroups, /dev, and tmpfs, which are needed early in the boot process and are therefore defined in the kernel's init/ subdirectory. Note that cgroups, /dev, and tmpfs do not call the big-three functions of file_operations, but read from and write to memory directly.
The diagram below shows how userspace accesses the different types of file systems commonly mounted on Linux systems. Not shown are constructs such as pipes, dmesg, and POSIX clocks, which also implement the file_operations structure and are therefore accessed through the VFS layer.


VFS is a "wrapper layer" between system calls and the implementations of specific file_operations, such as ext4 and procfs. The file_operations functions can then communicate either with device drivers or with memory accessors. tmpfs, devtmpfs, and cgroups do not use file_operations, but access memory directly.
The existence of VFS promotes code reuse, since the basic methods associated with file systems do not have to be re-implemented by every file system type. Code reuse is a widely accepted best practice in software engineering! However, if the reused code contains serious bugs, every implementation that inherits the common methods suffers from them.

/tmp: A simple tip

An easy way to find out which VFSes are present on a system is to type mount | grep -v sd | grep -v :/, which on most computers lists all mounted file systems that are neither resident on a disk nor NFS. One of the listed VFS mounts will assuredly be /tmp, right?
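The same filter can be written out in Python to make its logic explicit. This is a rough sketch: the function name virtual_mounts is hypothetical, and the SAMPLE text is invented for illustration rather than taken from a real machine.

```python
def virtual_mounts(mount_output):
    """Keep lines containing neither 'sd' nor ':/', roughly like
    `grep -v sd | grep -v :/` (blank lines are also dropped here)."""
    return [
        line
        for line in mount_output.splitlines()
        if line and "sd" not in line and ":/" not in line
    ]


# Invented sample in the usual `mount` output format, for illustration only.
SAMPLE = """\
/dev/sda1 on / type ext4 (rw,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev)
server:/export on /mnt/nfs type nfs4 (rw,relatime)
"""

if __name__ == "__main__":
    for line in virtual_mounts(SAMPLE):
        print(line)
```

Like the shell pipeline, this is a plain substring match, so it is a heuristic: disks that are not named sd* (NVMe devices, for instance) pass straight through the filter.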


Everyone knows that keeping /tmp on a physical storage device is crazy!

Why is it undesirable to keep /tmp on physical media? Because the files in /tmp are temporary, and storage devices are slower than memory, where tmpfs lives. Moreover, physical media wears out from repeated writes faster than memory does. Finally, files in /tmp may contain sensitive information, so having them disappear at every reboot is a feature.

Unfortunately, the installation scripts of some Linux distributions still create /tmp on a storage device by default. Do not despair if this has happened to your system. Follow the simple instructions on the Arch Wiki to fix it, and keep in mind that memory allocated to tmpfs becomes unavailable for other purposes. In other words, a system with a gigantic tmpfs holding large files can run out of memory and crash. Another tip: when editing the /etc/fstab file, remember that it must end with a newline, otherwise your system will not boot.

/proc and /sys

Besides /tmp, the VFSes most familiar to Linux users are /proc and /sys. (/dev resides in shared memory and has no file_operations.) Why these two? Let's look into it.

procfs offers userspace a snapshot of the kernel and the processes it controls. In /proc, the kernel publishes information about the facilities it provides, such as interrupts, virtual memory, and the scheduler. In addition, /proc/sys is where the parameters configurable with the sysctl command are made accessible to userspace. The status and statistics of individual processes are reported in per-process directories under /proc/.
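The relationship between sysctl names and /proc/sys can be sketched as follows. The helper names are hypothetical, and the mapping shown (dots become directory separators) is the simple case; a name whose components themselves contain dots would need extra handling.

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name such as 'kernel.ostype' to its path under root."""
    return Path(root, *name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Read the current value of a parameter straight from the file."""
    return sysctl_path(name, root).read_text().strip()


if __name__ == "__main__":
    # On a Linux system this reads the same value that `sysctl kernel.ostype` reports.
    print(sysctl_path("kernel.ostype"), "=", read_sysctl("kernel.ostype"))
```

The root parameter exists only so the functions can be pointed at a scratch directory for experiments; on a real system the default is what matters.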


Here /proc/meminfo is an empty file that nevertheless contains valuable information.

The behavior of /proc files shows how unlike on-disk file systems a VFS can be. On the one hand, /proc/meminfo contains the information that the free command displays. On the other hand, it is empty! How can that be? The situation is reminiscent of the famous article "Is the moon there when nobody looks? Reality and the quantum theory," written by Cornell University physics professor N. David Mermin in 1985. The fact is that the kernel gathers memory statistics when a process requests them from /proc, and there really is nothing in the /proc files when no one is looking. As Mermin said, "It is a fundamental quantum doctrine that a measurement does not, in general, reveal a preexisting value of the measured property." (And consider the question about the moon as homework!)
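The "empty but full" effect is easy to observe from userspace. The sketch below, with a hypothetical helper name, compares the size that stat() reports for a file with the number of bytes that actually come back when the file is read.

```python
import os


def size_and_length(path):
    """Return (size reported by stat, number of bytes actually read)."""
    reported = os.stat(path).st_size
    with open(path, "rb") as handle:
        actual = len(handle.read())
    return reported, actual


if __name__ == "__main__":
    # Assumes a Linux system with procfs mounted at /proc.
    reported, actual = size_and_length("/proc/meminfo")
    print(f"stat reports {reported} bytes, read() returned {actual} bytes")
```

For an ordinary file on disk the two numbers agree. For /proc/meminfo the content is generated at the moment of the read, which is why the reported size does not reflect what a reader receives.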
The apparent emptiness of procfs makes sense, because the information there is dynamic. The situation with sysfs is somewhat different. Let's compare how many files of at least one byte in size there are in /proc and in /sys.


procfs has exactly one such file, namely the exported kernel configuration, which is an exception because it needs to be generated only once per boot. /sys, on the other hand, has many larger files, many of which occupy an entire page of memory. Typically, sysfs files contain exactly one number or string, in contrast to the tables of information produced by reading files such as /proc/meminfo.
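One possible way to make this comparison is sketched below. The function name is hypothetical, and the counts depend on the kernel version and configuration, so no particular result is claimed here.

```python
import os
import stat


def count_nonempty_files(root):
    """Count regular files under root whose reported size is at least one byte."""
    count = 0
    # os.walk does not follow symbolic links and skips unreadable directories
    # by default, which matters in trees like /proc and /sys.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                info = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue           # entries may vanish while we are looking
            if stat.S_ISREG(info.st_mode) and info.st_size >= 1:
                count += 1
    return count


if __name__ == "__main__":
    for root in ("/proc", "/sys"):
        print(root, count_nonempty_files(root))
```

Only the size reported by lstat() is examined; no file is opened, so the walk does not trigger the kernel to generate any file contents.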

The purpose of sysfs is to expose to userspace the readable and writable properties of what the kernel calls "kobjects". The only purpose of kobjects is reference counting: when the last reference to a kobject is removed, the system reclaims the resources associated with it. Yet /sys makes up most of the kernel's famous "stable ABI to userspace," which no one may ever, under any circumstances, "break." This does not mean that the files in sysfs are static, which would be inconsistent with reference counting of volatile objects.
The kernel's stable ABI constrains what can appear in /sys, not what is actually present at any given moment. Listing the permissions on files in sysfs gives an idea of how the configurable parameters of devices, modules, file systems, and so on can be set or read. The logical conclusion is that procfs is also part of the kernel's stable ABI, although the documentation does not state this explicitly.


Files in sysfs each describe exactly one property of an entity and may be readable, writable, or both. The "0" in the file shows that the SSD is not removable.
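Reading such a one-value file takes only a couple of lines. In the sketch below the helper names are hypothetical, and the device name "sda" is an assumption; substitute whatever block device appears under /sys/block on your system.

```python
from pathlib import Path


def read_attribute(path):
    """Return the single value stored in a sysfs-style attribute file."""
    return Path(path).read_text().strip()


def is_removable(device, sys_block="/sys/block"):
    """True if the block device's 'removable' attribute reads '1'."""
    return read_attribute(Path(sys_block, device, "removable")) == "1"


if __name__ == "__main__":
    # "sda" is only an example device name.
    print("sda removable:", is_removable("sda"))
```

Because each file holds a single value, no parsing is needed beyond stripping the trailing newline, in contrast to the multi-line tables found in files such as /proc/meminfo.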

We will begin the second part of the translation with how to monitor VFS using the eBPF and bcc tools. In the meantime, we look forward to your comments and, as usual, invite you to the open webinar that our instructor Vladimir Drozdetsky will hold on April 9.

Source: habr.com
