Your first neural network on a graphics processing unit (GPU). Beginner's Guide

In this article, I will show you how to set up a machine learning environment in 30 minutes, create a neural network for image recognition, and then run the same network on a graphics processing unit (GPU).

First, let's define what a neural network is.

In our case, it is a mathematical model (along with its software or hardware implementation) built on the principles of organization and functioning of biological neural networks, the networks of nerve cells in a living organism. The concept arose from studying the processes that occur in the brain and from attempts to model them.

Neural networks are not programmed in the usual sense of the word; they are trained. The ability to learn is one of the main advantages of neural networks over traditional algorithms. Technically, learning means finding the weights of the connections between neurons. During training, the neural network is able to identify complex relationships between inputs and outputs and to generalize.
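
To make the idea of "finding the weights of connections" concrete, here is a minimal sketch (my own illustration, not part of the original article) that adjusts a single connection weight by gradient descent so that a neuron's output approaches a target value:

# Learning a single connection weight w so that w * x approaches the target y.
# All numbers here are purely illustrative.
x, y = 2.0, 10.0          # input and desired output
w = 0.0                   # initial connection weight
learning_rate = 0.1

for step in range(20):
    prediction = w * x                # the neuron's output
    error = prediction - y            # how far off we are
    gradient = 2 * error * x          # derivative of the squared error w.r.t. w
    w -= learning_rate * gradient     # nudge the weight to reduce the error

print(w)  # converges towards y / x = 5.0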

From the machine learning point of view, a neural network is a special case of pattern recognition, discriminant analysis, clustering, and similar methods.

Hardware

Let's take a look at the hardware first. We need a server with the Linux operating system installed. The hardware required to run machine learning workloads is quite powerful and, as a result, expensive. For those who do not have a good machine at hand, I recommend looking at the offerings of cloud providers. The necessary server can be rented quickly, and you pay only for the time you actually use it.

In projects that involve building neural networks, I use the servers of one of the Russian cloud providers. The company rents out cloud servers intended specifically for machine learning, equipped with powerful NVIDIA Tesla V100 graphics processors (GPUs). In short: a server with a GPU can be dozens of times more efficient (faster) than a similarly priced server that uses a CPU (the familiar central processing unit) for the computation. This is due to the GPU architecture, which handles this kind of calculation much faster.
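
To get a feel for this difference, once the environment from the sections below is set up you can time the same large matrix multiplication on the CPU and on the GPU. This is my own rough sketch, not part of the original article; the absolute numbers will vary from machine to machine.

import time
import torch

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

start = time.time()
a @ b                                    # matrix multiplication on the CPU
cpu_time = time.time() - start

a_gpu, b_gpu = a.cuda(), b.cuda()        # copy the matrices to the GPU
a_gpu @ b_gpu                            # warm-up run (initializes CUDA libraries)
torch.cuda.synchronize()                 # GPU calls are asynchronous, so wait
start = time.time()
a_gpu @ b_gpu
torch.cuda.synchronize()                 # wait for the result before stopping the timer
gpu_time = time.time() - start

print('CPU: %.3f s, GPU: %.3f s' % (cpu_time, gpu_time))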

To run the examples described below, we rented the following server for a few days:

  • SSD drive 150 GB
  • RAM 32 GB
  • Tesla V100 16 GB GPU and a 4-core processor

Ubuntu 18.04 was installed on the machine.

Setting up the environment

Now let's install everything we need for work on the server. Since this article is aimed primarily at beginners, I will dwell on a few points that will be useful for them.

Much of the environment setup is done through the command line. Most users run Windows as their working OS, and the standard console there leaves much to be desired, so we will use the handy Cmder tool. Download the mini version and run Cmder.exe. Next, connect to the server via SSH:

ssh root@server-ip-or-hostname

Instead of server-ip-or-hostname, specify the IP address or DNS name of your server. Then enter the password; upon successful connection you should see a message like this:

Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-74-generic x86_64)

The main language for developing ML models is Python, and the most popular platform for using it on Linux is Anaconda.

Let's install it on our server.

We start by updating the local package index:

sudo apt-get update

Install curl (command line utility):

sudo apt-get install curl

Download the latest version of Anaconda Distribution:

cd /tmp
curl -O https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh

We start the installation:

bash Anaconda3-2019.10-Linux-x86_64.sh

During the installation process, you will be asked to confirm the license agreement. Upon successful installation, you should see this:

Thank you for installing Anaconda3!
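
Depending on how you answered the installer's questions, the conda command may not be available until the shell is reloaded. A quick check (these two commands are my addition, not part of the original article):

source ~/.bashrc
conda --version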

Many frameworks have been created for developing ML models; we will work with the two most popular ones: PyTorch and TensorFlow.

Using a framework speeds up development and lets you use ready-made tools for standard tasks.

In this example, we will work with PyTorch. Let's install it:

conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
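
Before moving on, it is worth checking that PyTorch was installed and can see the GPU. This one-line check is my addition, not part of the original article:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"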

Now we need to launch Jupyter Notebook, a development tool popular with ML specialists. It allows you to write code and immediately see the results of its execution. Jupyter Notebook is included with Anaconda and is already installed on our server. We now need to connect to it from our desktop machine.

To do this, we will first start Jupyter on the server, specifying port 8080:

jupyter notebook --no-browser --port=8080 --allow-root

Next, we open another tab in the Cmder console (top menu: New console dialog) and connect to the server again over SSH, forwarding local port 8080 to the server:

ssh -L 8080:localhost:8080 root@server-ip-or-hostname

The output of the first command (the one that started Jupyter) contains links for opening Jupyter in a browser:

To access the notebook, open this file in a browser:
        file:///root/.local/share/jupyter/runtime/nbserver-18788-open.html
    Or copy and paste one of these URLs:
        http://localhost:8080/?token=cca0bd0b30857821194b9018a5394a4ed2322236f116d311
     or http://127.0.0.1:8080/?token=cca0bd0b30857821194b9018a5394a4ed2322236f116d311

Let's use the link for localhost:8080. Copy the full path and paste it into the address bar of your PC's local browser. Jupyter Notebook will open.

Let's create a new notebook: New - Notebook - Python 3.

Let's check that all the components we installed work correctly. Enter the following PyTorch example into Jupyter and run it (the Run button):

from __future__ import print_function
import torch
x = torch.rand(5, 3)
print(x)

The result should be something like this:

[Screenshot: the printed random 5x3 tensor]

If you have a similar result, then we have everything set up correctly and we can start developing a neural network!

Create a neural network

We will create a neural network for image recognition, taking this guide as a basis.

To train the network, we will use the publicly available CIFAR10 dataset. It contains the classes "airplane", "car", "bird", "cat", "deer", "dog", "frog", "horse", "ship", and "truck". The images in CIFAR10 are 3x32x32, that is, 3-channel color images of 32x32 pixels.

[Image: example images from the CIFAR10 dataset]
We will use torchvision, the package created by PyTorch for working with images.

We will do the following steps in order:

  • Loading and normalizing training and test datasets
  • Neural network definition
  • Network training on training data
  • Network testing on test data
  • Repeating training and testing using the GPU

We will be executing all of the code below in a Jupyter Notebook.

Loading and normalizing CIFAR10

Copy and execute the following code in Jupyter:


import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

The output should be:

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz
Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified
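
As a quick sanity check (my addition, not part of the original article), one batch from the training loader should contain four 3-channel 32x32 images and four integer labels:

images, labels = next(iter(trainloader))
print(images.shape)   # expected: torch.Size([4, 3, 32, 32])
print(labels.shape)   # expected: torch.Size([4])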

Let's display some of the training images as a check:


import matplotlib.pyplot as plt
import numpy as np

# functions to show an image

def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()

# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))

[Screenshot: a grid of four training images with their labels]

Neural network definition

Let us first consider how the neural network for image recognition works. This is a simple feed-forward network. It takes input, runs it through multiple layers one by one, and then finally produces output.

[Diagram: the feed-forward convolutional network]

Let's create a similar network in our environment:


import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
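
A note on where the 16 * 5 * 5 in fc1 comes from: a 32x32 image becomes 28x28 after conv1, 14x14 after the first pooling, 10x10 after conv2 and 5x5 after the second pooling, leaving 16 feature maps of 5x5 values. You can verify the whole forward pass with a dummy input (this check is my addition, not part of the original article):

dummy = torch.randn(1, 3, 32, 32)     # a fake batch with one CIFAR10-sized image
print(net(dummy).shape)               # expected: torch.Size([1, 10]) - one score per class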

We also define the loss function and the optimizer:


import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

Network training on training data

Let's start training our neural network. Note that after you run this code, you will have to wait a while for it to finish: training takes time. For me, it took about 5 minutes.

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

We get the following result:

[Screenshot: the training loss printed every 2000 mini-batches]

Save our trained model:

PATH = './cifar_net.pth'
torch.save(net.state_dict(), PATH)

Network testing on test data

We trained the network on the training dataset, but we need to check whether it has learned anything at all.

We will check this by asking the neural network to predict a class label and comparing the prediction with the true label. If the prediction is correct, we add the sample to the list of correct predictions.

Let's display some images from the test set:

dataiter = iter(testloader)
images, labels = next(dataiter)

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

[Screenshot: four test images with their GroundTruth labels]

Now let's ask the neural network to tell us what is in these pictures:


net = Net()
net.load_state_dict(torch.load(PATH))

outputs = net(images)

_, predicted = torch.max(outputs, 1)

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
                              for j in range(4)))

[Screenshot: the predicted labels for the four test images]

The results seem pretty good: the network correctly identified three out of four pictures.

Let's see how the network performs on the whole test set.


correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

[Screenshot: the overall accuracy on the 10000 test images]

It looks like the network has learned something and works. If it were picking classes at random, the accuracy would be only 10%.

Now let's see which classes the network recognizes best:

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))

[Screenshot: per-class accuracy figures]

The network seems to be the best at detecting cars and ships: 71% accuracy.

So the network is working. Now let's try to transfer its work to the graphics processor (GPU) and see what changes.

Training a neural network on the GPU

First, I will briefly explain what CUDA is. CUDA (Compute Unified Device Architecture) is a parallel computing platform developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically accelerate computing applications by harnessing the power of GPUs. This platform is already installed on the server we rented.

First, let's define our GPU as the first visible CUDA device:

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Assuming that we are on a CUDA machine, this should print a CUDA device:
print(device)

[Screenshot: the output cuda:0]

We send the network to the GPU:

net.to(device)
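
You can confirm that the parameters have actually moved to the GPU (this check is my addition, not part of the original article):

print(next(net.parameters()).device)   # expected: cuda:0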

We also have to send the inputs and targets to the GPU at every step:

inputs, labels = data[0].to(device), data[1].to(device)

Now let's retrain the network, this time on the GPU:

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data[0].to(device), data[1].to(device)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

This time the training took about 3 minutes. Recall that the same stage on a conventional processor took about 5 minutes. The difference is not dramatic, because our network is quite small. With larger models and larger amounts of training data, the gap between the GPU and the CPU grows considerably.
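
If you want to quantify the difference on your own machine, here is a rough sketch of mine (not part of the original article) that times forward and backward passes for this small network on a synthetic mini-batch, first on the CPU and then on the GPU:

import time

def time_one_batch(model, dev, n=200):
    # Average time of n training steps on a synthetic mini-batch of 4 images.
    model = model.to(dev)
    x = torch.randn(4, 3, 32, 32).to(dev)
    y = torch.randint(0, 10, (4,)).to(dev)
    opt = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
    start = time.time()
    for _ in range(n):
        opt.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        opt.step()
    if dev.type == 'cuda':
        torch.cuda.synchronize()       # wait for all queued GPU work to finish
    return (time.time() - start) / n

print('CPU: %.4f s per batch' % time_one_batch(Net(), torch.device('cpu')))
print('GPU: %.4f s per batch' % time_one_batch(Net(), device))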

That's about it. What we managed to do:

  • We looked at what a GPU is and chose the server on which it is installed;
  • We have set up a software environment for creating a neural network;
  • We created a neural network for image recognition and trained it;
  • We repeated the training of the network using the GPU and got an increase in speed.

I will be glad to answer questions in the comments.

Source: habr.com
