Intel Xeon outperformed eight Tesla V100 GPUs several times over when training a neural network

A CPU outperformed a cluster of eight GPUs several times over at training a neural network. Sounds like something out of science fiction, right? But researchers at Rice University, using an Intel Xeon, have shown it is real.


GPUs have long been considered far better suited to training neural networks than CPUs. This comes down to GPU architecture: many small cores that execute a large number of small tasks in parallel, which is exactly what neural network training requires. It turns out, however, that with the right approach a central processor can also be very effective at deep learning.

It is reported that, using the SLIDE deep learning algorithm, a 44-core Intel Xeon processor was 3.5 times faster than a system of eight NVIDIA Tesla V100 accelerators. This may be the first time a CPU has not merely matched GPUs in such a scenario but surpassed them, and by a wide margin.

A press release issued by the university states that the SLIDE algorithm does not need GPUs because it takes a completely different approach. Neural networks are usually trained with error backpropagation, which relies on matrix multiplication, an ideal workload for a GPU. SLIDE, by contrast, turns training into a search problem that is solved with hash tables.
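To make the hash-table idea a bit more concrete, here is a minimal sketch of the general technique SLIDE is built on: using locality-sensitive hashing (random hyperplanes, in this illustration) to pick a small set of likely-relevant neurons per input and evaluating only those, instead of a full matrix multiplication. This is not the authors' SLIDE code; a real implementation uses multiple hash tables, rebuilds them periodically, and applies the trick in both the forward and backward passes. All names and sizes below are made up for illustration.

```python
# Illustrative sketch only (not the authors' SLIDE implementation):
# random-hyperplane (SimHash-style) hashing selects a few "active" neurons
# per input, so only a tiny matrix-vector product is computed.
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_neurons, n_bits = 128, 10_000, 16  # hypothetical layer sizes

# Weight vectors of one dense layer: one row per neuron.
weights = rng.standard_normal((n_neurons, n_inputs))

# Random hyperplanes shared by inputs and neurons.
planes = rng.standard_normal((n_bits, n_inputs))

def hash_code(v):
    """Sign pattern of v against the random hyperplanes, packed into an int."""
    bits = (planes @ v > 0).astype(np.int64)
    return int(bits @ (1 << np.arange(n_bits)))

# Build the hash table once: bucket id -> list of neuron ids.
table = {}
for i, w in enumerate(weights):
    table.setdefault(hash_code(w), []).append(i)

def forward_sparse(x):
    """Evaluate only the neurons that fall into the input's hash bucket."""
    active = table.get(hash_code(x), [])
    return active, weights[active] @ x  # small matmul over a few neurons

x = rng.standard_normal(n_inputs)
active, activations = rng_out = forward_sparse(x)
print(f"evaluated {len(active)} of {n_neurons} neurons")
```

The point of the sketch is the asymmetry it exploits: looking up a bucket costs a handful of dot products against the hash planes, while evaluating every neuron would cost a full pass over the weight matrix, which is exactly the workload GPUs are built for and the one SLIDE avoids.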



According to the researchers, this significantly reduces the computational cost of training neural networks. As a point of reference, they first trained the neural network with Google's TensorFlow library on the Rice University lab's system of eight Tesla V100 accelerators, which took 3.5 hours. A similar network was then trained with the SLIDE algorithm on a system with a single 44-core Xeon processor, and it took only 1 hour.

It is worth noting that Intel currently has no 44-core models in its lineup. The researchers could have used some custom or unreleased chip, but that is unlikely. It is much more likely that the system had two 22-core Intel Xeons, or that the press release simply meant 44 threads provided by a single 22-core processor. Either way, this does not detract from the achievement itself.

Of course, the SLIDE algorithm still has to go through many tests and prove its effectiveness, as well as the absence of hidden caveats and pitfalls. But what we see now is very impressive and could genuinely influence the direction of the industry.



Source: 3dnews.ru
