NBD-VRAM Published

An open source project, NBD-VRAM, has been published that lets you use part of an NVIDIA GPU's video memory as swap space in Linux. The project is primarily aimed at laptops with soldered RAM that cannot be expanded but that have a discrete NVIDIA RTX/GTX graphics card with unused VRAM. The code is written in C and shell and is distributed under the MIT license.

The idea behind NBD-VRAM is simple: if the system is already starting to swap to the SSD, you can add another intermediate layer, video memory, in front of the SSD. The author gives the example of a laptop with an RTX 3070 Laptop GPU: of the 8 GB of VRAM, 7 GB were allocated for swap, and together with RAM, zRAM, and SSD swap the system had approximately 46 GB of addressable memory in total. The expected order of overflow is as follows: RAM is used first, then VRAM as fast swap, then zRAM, and only then the SSD.

Technically, NBD-VRAM doesn't add a new kernel driver. A small daemon allocates video card memory via the CUDA Driver API, then exposes it to the Linux kernel as a block device through NBD (Network Block Device) over a Unix socket. After connecting with the standard nbd-client, /dev/nbdX appears in the system; it can be formatted as a regular swap device with mkswap and enabled with swapon.
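The attach sequence described above can be sketched as a short command plan. This is only an illustration and not the project's own script: the socket path /run/vram-swap.sock and the helper name vram_swap_commands are hypothetical, and the exact nbd-client options a given daemon needs may differ. The sketch only builds and prints the commands, it does not run them.

```python
import shlex


def vram_swap_commands(socket_path, device="/dev/nbd0", priority=None):
    """Build (but do not run) the commands that attach an NBD export as swap.

    socket_path is a hypothetical Unix socket on which the daemon listens.
    """
    swapon = ["swapon"]
    if priority is not None:
        # swapon -p sets the swap priority; higher values are used first.
        swapon += ["-p", str(priority)]
    swapon.append(device)
    return [
        ["modprobe", "nbd"],                            # load the nbd kernel module
        ["nbd-client", "-unix", socket_path, device],   # connect over the Unix socket
        ["mkswap", device],                             # write a swap signature
        swapon,                                         # enable the device as swap
    ]


if __name__ == "__main__":
    for cmd in vram_swap_commands("/run/vram-swap.sock", priority=100):
        print(shlex.join(cmd))
```

All four commands require root when actually executed, which is why the sketch stops at printing them.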

The author emphasizes that this approach was chosen because of the limitations of consumer NVIDIA graphics cards. A more direct approach via the NVIDIA P2P API fails with EINVAL on GeForce cards, he notes, because the corresponding capabilities are effectively available only on professional and server models. Direct access to BAR1 also failed: only a small mapped region is accessible, and reading from the rest returns zeros. The NBD approach circumvents these limitations by using the standard CUDA copy operations, cuMemcpyHtoD and cuMemcpyDtoH.
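To make the data path concrete, here is a minimal sketch of how a daemon of this kind might serve NBD read and write requests. It is not the project's code: the FakeVram class is a hypothetical stand-in that uses a host bytearray where a real daemon would call cuMemcpyHtoD and cuMemcpyDtoH, and the NBD handshake phase is omitted. The header layouts and constants follow the NBD protocol's transmission phase with simple replies.

```python
import struct

NBD_REQUEST_MAGIC = 0x25609513
NBD_SIMPLE_REPLY_MAGIC = 0x67446698
NBD_CMD_READ, NBD_CMD_WRITE = 0, 1
NBD_EINVAL = 22

REQUEST = struct.Struct(">IHHQQI")  # magic, flags, type, handle, offset, length
REPLY = struct.Struct(">IIQ")       # magic, error, handle


class FakeVram:
    """Hypothetical stand-in: a host bytearray instead of a CUDA allocation."""

    def __init__(self, size):
        self.buf = bytearray(size)

    def copy_to_device(self, offset, data):
        # A real daemon would issue cuMemcpyHtoD here.
        self.buf[offset:offset + len(data)] = data

    def copy_from_device(self, offset, length):
        # A real daemon would issue cuMemcpyDtoH here.
        return bytes(self.buf[offset:offset + length])


def handle_request(vram, header, payload=b""):
    """Serve one request and return the bytes to send back to the kernel."""
    magic, _flags, cmd, handle, offset, length = REQUEST.unpack(header)
    ok = (
        magic == NBD_REQUEST_MAGIC
        and cmd in (NBD_CMD_READ, NBD_CMD_WRITE)
        and offset + length <= len(vram.buf)
        and (cmd == NBD_CMD_READ or len(payload) == length)
    )
    if not ok:
        return REPLY.pack(NBD_SIMPLE_REPLY_MAGIC, NBD_EINVAL, handle)
    if cmd == NBD_CMD_WRITE:
        vram.copy_to_device(offset, payload)
        return REPLY.pack(NBD_SIMPLE_REPLY_MAGIC, 0, handle)
    data = vram.copy_from_device(offset, length)
    return REPLY.pack(NBD_SIMPLE_REPLY_MAGIC, 0, handle) + data
```

Every swapped page crosses this path twice (out on write, back on read), which is why the approach works on consumer cards but adds latency compared with real RAM.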

Capabilities

  • Using VRAM as normal Linux swap. Once started, the daemon exposes video memory as /dev/nbd0 or another NBD device, which appears to the kernel as a standard block device.

  • Works without its own kernel module. The project does not require writing, building, and maintaining a separate kernel module, does not use internal NVIDIA driver symbols, and should survive kernel and driver updates without rebuilding.

  • Focused on consumer NVIDIA GPUs. Requirements include Linux with the nbd kernel module, a CUDA-enabled NVIDIA GPU (consumer RTX/GTX cards included), the official NVIDIA driver with libcuda.so.1, nbd-client, gcc, and make. The CUDA Toolkit is not required.

  • systemd integration. Installation via install.sh adds the vram-swap-nbd service, which can be started via systemctl; after installation, the service is enabled to start automatically at boot.

  • Setting swap size and priority. In the systemd unit, you can set VRAM_SETUP_SIZE_MB, which is the upper limit of the allocated VRAM, and VRAM_SWAP_PRIORITY, which is the swap device priority. The higher the priority, the sooner Linux will use this swap layer.

  • Automatic reduction of the requested size. If the requested amount of VRAM is not available, for example because some of the memory is already occupied by the compositor or graphics session, the daemon reduces the size in 512 MiB steps until the allocation succeeds.

  • Test scenarios. The repository contains test-nbd.sh for a smoke test with 1 MiB read/write and test-fill.sh for stress testing the entire VRAM partition.

  • The declared performance is about 1.3 GB/s. On an RTX 3070 Laptop, the author measured sequential writes of 7 GB in 4 MB blocks at approximately 1.3 GB/s.
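The size fallback mentioned in the list above can be sketched as a simple loop. This is an assumption about the general shape of such logic, not the daemon's actual code: allocate_with_fallback and the try_alloc callback are hypothetical names, with try_alloc standing in for a real device allocation attempt.

```python
STEP_MB = 512  # reduction step named in the project description


def allocate_with_fallback(requested_mb, try_alloc, step_mb=STEP_MB):
    """Try requested_mb first, then shrink by step_mb until try_alloc succeeds.

    try_alloc is a hypothetical callback that returns True when an
    allocation of the given size (in MiB) works.
    Returns the size that was allocated, or 0 if every attempt failed.
    """
    size = requested_mb
    while size > 0:
        if try_alloc(size):
            return size
        size -= step_mb
    return 0


if __name__ == "__main__":
    # Pretend only 6300 MiB are free because a compositor holds the rest.
    free_mb = 6300
    print(allocate_with_fallback(7168, lambda mb: mb <= free_mb))
```

With 7168 MiB requested and 6300 MiB free, the attempts at 7168 and 6656 MiB fail and the loop settles on 6144 MiB.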

Application scenarios

Laptops with soldered RAM. The main scenario is modern laptops, where 16 or 32 GB of RAM is no longer sufficient, but expansion is not possible. If such a machine has a discrete RTX graphics card, some of the VRAM can be used as an additional swap layer. This doesn't turn the VRAM into full-fledged RAM, but it can prevent the system from suddenly switching to slow SSD swap or from OOM-killer crashes under peak loads.

Heavy developer work environments. IDEs, browsers with dozens of tabs, Docker containers, local databases, large project builds, and test environments can easily create short-term memory consumption spikes. In such a scenario, NBD-VRAM can act as a buffer: not speeding up regular work, but rather softening the moment when RAM runs out.

Reduced load on SSD swap. Frequent use of swap on an SSD not only slows down the system but also creates unnecessary writes to the drive. VRAM swap can be set to a higher priority so that when RAM is full, the system first flushes pages to video memory and then accesses the SSD. This is especially important for laptops, where the SSD is often non-removable or expensive to replace.

Combination with zRAM. The author explicitly describes a scheme where VRAM swap gets the highest priority and receives the first memory "spill," zRAM is used next, and the SSD remains the last line of defense. This scheme can be useful for workstations and laptops, where maintaining system responsiveness under memory pressure is more important than achieving maximum latency predictability.
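One way to check that the layers will be used in the intended order is to read /proc/swaps and sort the entries by priority, since the kernel fills higher-priority swap areas first. The sketch below parses text in that format; the sample sizes and priorities are made-up illustrative values, not measurements from the author's machine, and spill_order is a hypothetical helper name.

```python
# Illustrative sample in /proc/swaps format (values are invented).
SAMPLE = """\
Filename                Type        Size        Used    Priority
/swapfile               file        16777212    0       -2
/dev/zram0              partition   8388604     0       50
/dev/nbd0               partition   7340028     0       100
"""


def spill_order(proc_swaps_text):
    """Return (name, size_kib, priority) tuples, highest priority first."""
    entries = []
    for line in proc_swaps_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore blank or malformed lines
        name, _type, size_kib, _used, priority = fields
        entries.append((name, int(size_kib), int(priority)))
    return sorted(entries, key=lambda e: e[2], reverse=True)


if __name__ == "__main__":
    for name, size_kib, priority in spill_order(SAMPLE):
        print(f"{priority:>5}  {size_kib // 1024:>6} MiB  {name}")
```

On a live system the same function can be fed the contents of /proc/swaps; the VRAM-backed NBD device should appear at the top if its priority is set above zRAM and the SSD swap.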

Local AI/LLM tasks around the GPU, but not instead of VRAM for the model. NBD-VRAM does not increase the video memory available to a CUDA application as VRAM for the model. This is the reverse scenario: instead of RAM being used as VRAM, VRAM is used as swap space for regular Linux memory. Therefore, the project won't allow directly loading a larger model onto the GPU. However, it could be useful on a machine that runs a browser, IDE, indexers, Python environments, and containers alongside LLM inference, and where system RAM is starting to run low.

Home and experimental workstations. The project is of interest to users whose graphics cards are often idle outside of gaming, rendering, or ML tasks. For example, 8–12 GB of VRAM on a desktop GeForce can be temporarily converted into an additional swap layer for heavy compilation, data processing, or running virtual machines.

Restrictions

NBD-VRAM is not a replacement for RAM. Access to this swap space follows the chain: kernel swap → /dev/nbdX → nbd driver → Unix socket → daemon → CUDA copy → VRAM, so latency and behavior will differ from real RAM. It's more of a fallback or intermediate layer between RAM and SSD than a way to "add memory" without consequences.

The project also relies on NVIDIA's official CUDA stack. Nouveau and Nova aren't suitable, because the project requires libcuda.so.1. Phoronix also notes that NBD-VRAM is designed specifically for consumer NVIDIA GPUs, where alternative approaches via the NVIDIA P2P API do not work.

In the end, NBD-VRAM is a small but interesting system hack for Linux. It doesn't perform miracles and doesn't replace a RAM upgrade, but it does allow you to use idle video memory as an additional swap layer in front of the SSD. For laptops with soldered memory and a discrete RTX card, this could be a practical way to handle peak loads without immediate application crashes or a painful spill of memory onto a slower drive.

Source: linux.org.ru
