Researchers from the University of Toronto have presented the first Rowhammer-class attack that corrupts the contents of video memory. They demonstrated an attack that flips up to 8 bits of data on a discrete NVIDIA A6000 GPU with GDDR6 video memory. As a practical example, they showed that changing the value of just one bit can interfere with the execution of machine learning models and reduce the accuracy of their results from 80% to 0.1%.
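To see why a single flipped bit can be so damaging, consider how a floating-point model weight is stored. The sketch below is an illustration only, not the researchers' code. It assumes the weight is a 16-bit half-precision float and that the flipped bit is the most significant exponent bit (bit 14); the article states neither the number format nor the bit position, and the helper name flip_bit_fp16 is hypothetical.

```python
import struct


def flip_bit_fp16(value: float, bit: int) -> float:
    """Flip one bit in the IEEE 754 half-precision encoding of `value`.

    Hypothetical helper for illustration; bit 0 is the least significant
    mantissa bit, bits 10-14 are the exponent, bit 15 is the sign.
    """
    # Encode the value as a half-precision float and read it back as a raw 16-bit integer.
    (raw,) = struct.unpack("<H", struct.pack("<e", value))
    # Toggle the chosen bit and decode the result as a half-precision float again.
    (flipped,) = struct.unpack("<e", struct.pack("<H", raw ^ (1 << bit)))
    return flipped


weight = 0.5  # assumed example weight, encoded as 0x3800
corrupted = flip_bit_fp16(weight, 14)  # assumed target: most significant exponent bit
print(weight, "->", corrupted)  # 0.5 -> 32768.0
```

In this example one bit turns a weight of 0.5 into 32768.0, a change of several orders of magnitude that can dominate every later computation in the model.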

Until now, Rowhammer attacks on video memory have been hard to build for several reasons: the physical memory layout of GDDR chips is difficult to determine, memory access latencies are high (4 times slower), and memory refresh rates are higher. Research was also hampered by the proprietary mechanisms that GDDR chips use to protect against manipulations that cause premature loss of charge; analyzing them required building special FPGA-based hardware test benches.
To study video memory, the researchers developed a new technique for reverse engineering GDDR DRAM. The attack itself exploits the memory access optimizations that GPUs use for parallel computing, turning them into amplifiers of the access intensity to individual cells. The attack on NVIDIA GPUs was implemented in low-level CUDA code.
CUDA user code cannot access physical memory addresses directly, but the NVIDIA driver consistently maps virtual memory to the same physical memory, which the researchers used to calculate virtual memory offsets and determine the memory bank layout. Accesses to different memory banks were told apart by timing analysis: the delay differs depending on whether two accesses go to the same bank or to different banks.
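The timing-based bank recognition described above can be sketched as a simple grouping procedure. This is a simulation under stated assumptions, not the researchers' tool: the latency function, the 4-bank layout, the 256-byte stride, the nanosecond values and the threshold are all hypothetical, and the sketch assumes that two addresses in the same bank (in different rows) are slower to access together than two addresses in different banks.

```python
from typing import Callable, List


def group_by_bank(
    addresses: List[int],
    pair_latency: Callable[[int, int], float],
    threshold_ns: float,
) -> List[List[int]]:
    """Group addresses whose paired access latency exceeds the threshold.

    Each address is compared against the first member of every known group;
    a slow pair is treated as a same-bank conflict.
    """
    banks: List[List[int]] = []
    for addr in addresses:
        for bank in banks:
            if pair_latency(bank[0], addr) > threshold_ns:
                bank.append(addr)
                break
        else:
            # No conflict with any known group: start a new one.
            banks.append([addr])
    return banks


def simulated_latency(a: int, b: int) -> float:
    """Stand-in for a real timing measurement (hypothetical layout and values)."""
    same_bank = (a // 256) % 4 == (b // 256) % 4
    return 400.0 if same_bank else 300.0


addresses = [i * 256 for i in range(8)]
banks = group_by_bank(addresses, simulated_latency, threshold_ns=350.0)
print(banks)  # [[0, 1024], [256, 1280], [512, 1536], [768, 1792]]
```

On real hardware the latency function would be replaced by an actual measurement, and the threshold would have to be derived from the observed latency distribution.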
On systems where GPUs are shared, such as servers for running machine learning models, the attack allows one user to modify another user's data in video memory. As a security measure, NVIDIA recommends enabling error correction codes (ECC) with the "nvidia-smi -e 1" command or using graphics cards with On-Die ECC (OD-ECC) support, such as the GeForce RTX 50, RTX PRO, GB200, B200, B100, H100, H200, H20, and GH200.
According to the researchers, enabling ECC on systems with A6000 GPUs results in a 10% decrease in machine learning model performance and a 6.25% reduction in available memory. The toolkit for conducting the attack and reverse engineering the low-level video memory layout on systems with NVIDIA GPUs is published on GitHub.
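To put the 6.25% memory cost of ECC in concrete terms, the arithmetic can be worked through as below. The 48 GB total is an assumption based on the capacity of the RTX A6000 and is not stated in the article; the variable names are illustrative.

```python
TOTAL_MEMORY_GB = 48.0      # assumed capacity of an RTX A6000 (not from the article)
ECC_MEMORY_OVERHEAD = 0.0625  # 6.25% reduction reported by the researchers

reserved_gb = TOTAL_MEMORY_GB * ECC_MEMORY_OVERHEAD
usable_gb = TOTAL_MEMORY_GB - reserved_gb
print(f"reserved for ECC: {reserved_gb} GB, usable: {usable_gb} GB")
# reserved for ECC: 3.0 GB, usable: 45.0 GB
```

Under that assumption, 6.25% corresponds to one sixteenth of the memory, or 3 GB out of 48 GB.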
A Rowhammer attack corrupts individual bits of DRAM by repeatedly reading data from neighboring memory cells. DRAM is a two-dimensional array of cells, each consisting of a capacitor and a transistor, so continuously reading the same memory area causes voltage fluctuations and anomalies that drain a small amount of charge from neighboring cells. If the reads are intense enough, a neighboring cell can lose so much charge that the next refresh cycle comes too late to restore its original state, and the value stored in the cell changes.
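The race between charge loss and refresh can be illustrated with a toy model. It is a deliberate simplification, and every number in it is hypothetical: a victim cell starts each refresh window fully charged, loses a fixed fraction of charge per read of a neighboring row, and flips if its charge falls below a threshold before the next refresh.

```python
def victim_flips(
    activations_per_window: int,
    leak_per_activation: float = 1e-5,  # hypothetical charge lost per neighboring read
    flip_threshold: float = 0.5,        # hypothetical charge level below which the bit is misread
) -> bool:
    """Return True if the victim cell flips within one refresh window (toy model)."""
    charge_after_refresh = 1.0  # a refresh restores the full charge
    charge_before_next_refresh = (
        charge_after_refresh - activations_per_window * leak_per_activation
    )
    return charge_before_next_refresh < flip_threshold


print(victim_flips(10_000))  # False: the refresh arrives in time
print(victim_flips(60_000))  # True: too much charge is lost before the refresh
```

The model also shows why higher refresh rates make the attack harder: a shorter window leaves room for fewer reads between refreshes.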
The Rowhammer attack method was proposed in 2014, and a game of "cat and mouse" between security researchers and hardware manufacturers followed: memory chip manufacturers tried to block the vulnerability, and researchers found new ways to bypass the protections. For example, chip manufacturers added the TRR (Target Row Refresh) mechanism to protect against Rowhammer, but it turned out to block cell corruption only in special cases and not to protect against all possible attack variants. Attack methods were developed for DDR3, DDR4 and DDR5 chips on systems with Intel, AMD and ARM processors. Methods for bypassing ECC error correction were also found, and variants of the attack carried out over the network and through JavaScript code executed in the browser were proposed.
Source: opennet.ru
