In this article, I will show you how to set up a machine learning environment in 30 minutes, create a neural network for image recognition, and then run the same network on a graphics processing unit (GPU).
First, let's define what a neural network is.
In our case, this is a mathematical model, as well as its software or hardware implementation, built on the principle of organization and functioning of biological neural networks - networks of nerve cells of a living organism. This concept arose when studying the processes occurring in the brain, and when trying to model these processes.
Neural networks are not programmed in the usual sense of the word; they are trained. The ability to learn is one of the main advantages of neural networks over traditional algorithms. Technically, learning means finding the coefficients (weights) of the connections between neurons. During training, the network can identify complex relationships between inputs and outputs and generalize from them.
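To make "finding the coefficients" concrete, here is a minimal sketch in plain Python. It is not part of the setup described in this article: a single "neuron" with one weight learns the rule y = 2x by gradient descent. The sample data, the learning rate, and the name train_single_weight are assumptions chosen only for illustration.

```python
def train_single_weight(samples, lr=0.05, epochs=200):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # the single connection coefficient we want to learn
    for _ in range(epochs):
        # gradient of mean((w * x - y) ** 2) with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad  # move the weight against the gradient
    return w


samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # points on the line y = 2x
w = train_single_weight(samples)
print(round(w, 3))
```

Nobody writes "multiply by two" into the program; the weight ends up close to 2 only because that value minimizes the error on the examples. A real network does the same thing for many weights at once.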
From the point of view of machine learning, a neural network is a special case of pattern recognition methods, discriminant analysis, clustering methods, and other methods.
Equipment
Let's take a look at the hardware first. We need a server running Linux. Machine learning systems need quite powerful and, as a result, expensive hardware. For those who do not have a good machine at hand, I recommend looking at what cloud providers offer: the server you need can be rented quickly, and you pay only for the time you use it.
In projects that require neural networks, I use the servers of one of the Russian cloud providers. The company rents out cloud servers built specifically for machine learning, with powerful NVIDIA Tesla V100 graphics processors (GPUs). In short, a server with a GPU can be dozens of times faster than a similarly priced server that computes on a CPU (central processing unit). The GPU architecture makes this possible: it performs many simple calculations in parallel, which is what neural network workloads largely consist of.
To run the examples described below, we rented the following server for a few days:
- SSD drive 150 GB
- RAM 32 GB
- Processor Tesla V100 16 GB with 4 cores
Ubuntu 18.04 was installed on the machine.
Setting up the environment
Now let's install everything necessary for work on the server. Since our article is primarily for beginners, I will talk about some points in it that will be useful for them.
A lot of the work in setting up the environment is done through the command line. Most users run Windows as their working OS, and its standard console leaves much to be desired. Therefore, we will use a handy tool, Cmder, and connect to the server from it via SSH:
ssh root@server-ip-or-hostname
Replace server-ip-or-hostname with the IP address or DNS name of your server. Then enter the password. On a successful connection, you should see a message like this:
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-74-generic x86_64)
The main language for developing ML models is Python, and the most popular platform for using it on Linux is Anaconda.
Let's install it on our server.
We start by updating the local package index:
sudo apt-get update
Install curl (command line utility):
sudo apt-get install curl
Download the latest version of Anaconda Distribution:
cd /tmp
curl -O https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh
We start the installation:
bash Anaconda3-2019.10-Linux-x86_64.sh
During the installation process, you will be asked to confirm the license agreement. Upon successful installation, you should see this:
Thank you for installing Anaconda3!
Many frameworks have been created for developing ML models; we work with the most popular ones:
Using the framework allows you to increase the speed of development and use ready-made tools for standard tasks.
In this example, we will work with PyTorch. Let's install it:
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
Now we need to launch Jupyter Notebook, a development tool popular with ML specialists. It allows you to write code and immediately see the results of its execution. Jupyter Notebook is included with Anaconda and is already installed on our server. You need to connect to it from our desktop system.
To do this, we will first start Jupyter on the server, specifying port 8080:
jupyter notebook --no-browser --port=8080 --allow-root
Next, we open another tab in our Cmder console (top menu, New console dialog) and forward local port 8080 to port 8080 on the server through an SSH tunnel:
ssh -L 8080:localhost:8080 root@server-ip-or-hostname
The first command (the one that starts Jupyter) prints links for opening Jupyter in a browser:
To access the notebook, open this file in a browser:
file:///root/.local/share/jupyter/runtime/nbserver-18788-open.html
Or copy and paste one of these URLs:
http://localhost:8080/?token=cca0bd0b30857821194b9018a5394a4ed2322236f116d311
or http://127.0.0.1:8080/?token=cca0bd0b30857821194b9018a5394a4ed2322236f116d311
Let's use the link for localhost:8080. Copy the full URL, including the token, and paste it into the address bar of the browser on your local PC. Jupyter Notebook will open.
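If the page does not open, it helps to check whether anything is listening on the forwarded port at all. The following is a small sketch using only the Python standard library, run on the local PC. The function name port_is_open and the default values are assumptions for illustration; it only tests that a TCP connection can be made and says nothing about whether Jupyter itself is healthy.

```python
import socket


def port_is_open(host="localhost", port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and unreachable hosts
        return False


# True while the SSH tunnel is up, False otherwise
print(port_is_open())
```

A False result usually means the ssh -L session was closed or Jupyter was started on a different port.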
Let's create a new notebook: New - Notebook - Python 3.
Let's check the correct operation of all the components that we installed. Let's enter the PyTorch code example into Jupyter and start the execution (Run button):
from __future__ import print_function
import torch
x = torch.rand(5, 3)
print(x)
The result should be something like this:
If you have a similar result, then we have everything set up correctly and we can start developing a neural network!
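Since we will later move the network to the GPU, it can be useful to check right away which versions are installed and whether PyTorch sees a CUDA device. Below is a minimal sketch; the helper name describe_environment is an assumption, and the function is written so that it also runs on a machine where PyTorch is missing.

```python
import importlib.util
import sys


def describe_environment():
    """Collect basic facts about the Python / PyTorch setup."""
    info = {"python": sys.version.split()[0], "torch": None, "cuda": False}
    # find_spec returns None when the package is not installed
    if importlib.util.find_spec("torch") is not None:
        import torch
        info["torch"] = torch.__version__
        info["cuda"] = torch.cuda.is_available()
    return info


print(describe_environment())
```

On the GPU server, the "cuda" entry should be True. If it is False there, the most likely causes are a CPU-only PyTorch build or a driver problem.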
Create a neural network
We will create a neural network for image recognition. Let's take this as a basis
To train the network, we will use the publicly available CIFAR10 dataset. It has classes: "airplane", "car", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck". Images in CIFAR10 are 3x32x32, that is, 3-channel color images of 32x32 pixels.
For work, we will use the package created by PyTorch for working with images - torchvision.
We will do the following steps in order:
- Loading and normalizing training and test datasets
- Neural network definition
- Network training on training data
- Network testing on test data
- Repeating the training and testing on the GPU
We will be executing all of the code below in a Jupyter Notebook.
Loading and normalizing CIFAR10
Copy and execute the following code in Jupyter:
import torch
import torchvision
import torchvision.transforms as transforms
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
The output should be:
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz
Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified
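The loaders above use batch_size=4. CIFAR10 contains 50,000 training images and 10,000 test images, so the number of mini-batches per pass over the data follows from simple arithmetic. The helper batches_per_epoch below is a hypothetical name used only for this sketch; it assumes the DataLoader default of keeping the last incomplete batch.

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of mini-batches per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


train_batches = batches_per_epoch(50_000, 4)  # CIFAR10 training set
test_batches = batches_per_epoch(10_000, 4)   # CIFAR10 test set
print(train_batches, test_batches)  # 12500 2500
```

This is why a training loop that reports the loss every 2000 mini-batches prints six lines per epoch: 12500 // 2000 is 6.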
Let's display a few training images as a check:
import matplotlib.pyplot as plt
import numpy as np
# function to show an image
def imshow(img):
    img = img / 2 + 0.5  # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()

# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)  # dataiter.next() fails on recent PyTorch versions
# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
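The "unnormalize" step in imshow undoes the Normalize transform applied when loading the data. With a mean of 0.5 and a standard deviation of 0.5 per channel, pixel values from [0, 1] are mapped to [-1, 1], and img / 2 + 0.5 maps them back. The sketch below shows this on single numbers in plain Python; the function names are assumptions for illustration, not torchvision APIs.

```python
def normalize(value, mean=0.5, std=0.5):
    """What Normalize does to one channel value: (value - mean) / std."""
    return (value - mean) / std


def unnormalize(value):
    """The inverse used in imshow: value / 2 + 0.5."""
    return value / 2 + 0.5


for pixel in (0.0, 0.25, 1.0):
    scaled = normalize(pixel)
    print(pixel, "->", scaled, "->", unnormalize(scaled))
```

Without the inverse step, matplotlib would receive negative values and the picture would look washed out or clipped.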
Neural network definition
Let us first consider how the neural network for image recognition works. This is a simple feed-forward network. It takes input, runs it through multiple layers one by one, and then finally produces output.
Let's create a similar network in our environment:
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
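The value 16 * 5 * 5 in the first linear layer is not arbitrary: it is the number of values left after a 32x32 image has passed through both convolution and pooling stages. The sketch below traces the spatial size step by step with the standard output-size formula for convolution and pooling layers; conv_out is a hypothetical helper written for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32                                  # CIFAR10 images are 32x32
size = conv_out(size, kernel=5)            # conv1: 32 -> 28
size = conv_out(size, kernel=2, stride=2)  # pool:  28 -> 14
size = conv_out(size, kernel=5)            # conv2: 14 -> 10
size = conv_out(size, kernel=2, stride=2)  # pool:  10 -> 5
flat_features = 16 * size * size           # conv2 produces 16 channels
print(size, flat_features)  # 5 400
```

If you change the image size or a kernel size, this number changes too, and the input size of the first linear layer has to be updated to match.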
We also define the loss function and the optimizer:
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
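To show what these two objects compute, here is a simplified sketch in plain Python for a single sample and a single scalar parameter. It is an illustration, not PyTorch's implementation: cross_entropy and sgd_momentum_step are hypothetical names, and the momentum update assumes the plain form (no dampening, no Nesterov), which is how I understand the defaults of optim.SGD.

```python
import math


def cross_entropy(logits, target):
    """Loss for one sample: -log(softmax(logits)[target])."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[target]


def sgd_momentum_step(param, grad, velocity, lr=0.001, momentum=0.9):
    """One SGD-with-momentum update for a single scalar parameter."""
    velocity = momentum * velocity + grad
    return param - lr * velocity, velocity


# A network that scores all 10 classes equally has loss ln(10)
print(round(cross_entropy([0.0] * 10, target=3), 3))  # 2.303

# Made-up parameter and gradient values
param, velocity = sgd_momentum_step(param=1.0, grad=0.5, velocity=0.0)
print(param, velocity)
```

The first number is a handy reference point: a loss near 2.3 at the start of training means the network is still guessing, and anything clearly below it means it has started to learn.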
Network training on training data
Let's start training our neural network. Note that after you run this code, you will have to wait for it to finish, because training a network takes time. It took me 5 minutes.
for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0
print('Finished Training')
We get the following result:
Save our trained model:
PATH = './cifar_net.pth'
torch.save(net.state_dict(), PATH)
Network testing on test data
We trained the network using the training data set. But we need to check if the network has learned anything at all.
We will check this by comparing the class label that the network predicts with the true label. If the prediction is correct, we count the sample as a correct prediction.
Let's display an image from the test set:
dataiter = iter(testloader)
images, labels = next(dataiter)  # dataiter.next() fails on recent PyTorch versions
# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))
Now let's ask the neural network to tell us what is in these pictures:
net = Net()
net.load_state_dict(torch.load(PATH))
outputs = net(images)
_, predicted = torch.max(outputs, 1)
print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
                              for j in range(4)))
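The network returns ten raw scores per image, one for each class, and torch.max(outputs, 1) picks the index of the largest one. The same idea in plain Python for a single image looks roughly like this. The scores are made-up numbers and predict_class is a hypothetical helper for illustration.

```python
def predict_class(scores, class_names):
    """Return the name of the class with the highest score."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best_index]


classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
# Hypothetical raw outputs of the network for one image
scores = [0.1, -1.2, 0.3, 2.4, 0.0, 1.9, -0.5, 0.2, -2.0, -0.7]
print(predict_class(scores, classes))  # cat
```

Note that "dog" (1.9) is a close second here. The raw scores are not probabilities, but their relative size still shows which classes the network confuses.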
The results seem pretty good: the network correctly identified three out of four pictures.
Let's see how the network performs across the entire dataset.
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
It looks like the network knows something and works. If it determined the classes at random, then the accuracy would be 10%.
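The 10% baseline can be checked with a quick simulation: guess one of ten classes uniformly at random for each of 10,000 samples and count the hits. This is a plain-Python sketch with a fixed seed; the labels here are random as well, which is an assumption standing in for the balanced CIFAR10 test set.

```python
import random


def random_guess_accuracy(num_samples=10_000, num_classes=10, seed=0):
    """Accuracy of uniformly random guesses against uniformly random labels."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        label = rng.randrange(num_classes)
        guess = rng.randrange(num_classes)
        hits += guess == label
    return hits / num_samples


print(random_guess_accuracy())
```

The result lands close to 0.1, so any test accuracy well above 10% means the network has picked up real structure in the images.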
Now let's see which classes the network recognizes better:
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1
for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))
The network seems to be the best at detecting cars and ships: 71% accuracy.
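The loop above relies on every batch containing exactly 4 images. If you change batch_size, a variant that works on flat lists of labels may be easier to reuse. Below is a sketch in plain Python; per_class_accuracy is a hypothetical helper and the labels are made-up.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Map each class index to the fraction of its samples predicted correctly."""
    total = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: correct[label] / total[label] for label in sorted(total)}


# Made-up labels: class 0 is right 2 times out of 3, class 1 every time
truth = [0, 0, 0, 1, 1]
preds = [0, 0, 1, 1, 1]
print(per_class_accuracy(truth, preds))
```

To use it with the network, you could collect labels.tolist() and predicted.tolist() from each batch into two lists and pass them in after the loop.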
So the network is working. Now let's try to transfer its work to the graphics processor (GPU) and see what changes.
Training a neural network on the GPU
First, I will briefly explain what CUDA is. CUDA (Compute Unified Device Architecture) is a parallel computing platform developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically accelerate computing applications by harnessing the power of GPUs. On the server we rented, this platform is already installed.
Let's first define our GPU as the first visible CUDA device.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Assuming that we are on a CUDA machine, this should print a CUDA device:
print(device)
We send the network to the GPU:
net.to(device)
We also have to send the inputs and targets to the GPU at each step:
inputs, labels = data[0].to(device), data[1].to(device)
Now let's retrain the network on the GPU:
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data[0].to(device), data[1].to(device)
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0
print('Finished Training')
This time, training took about 3 minutes, compared with 5 minutes for the same stage on a conventional processor. The difference is modest because our network is not that big. When training on large arrays of data, the gap between the speed of the GPU and the traditional processor will grow.
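If you want to measure such timings yourself rather than watch the clock, a small helper is enough. The sketch below is plain Python; the name timed and its sync parameter are assumptions for illustration. GPU operations in PyTorch are launched asynchronously, so when timing GPU code it is safer to pass torch.cuda.synchronize as sync, otherwise the clock may stop before the GPU has finished.

```python
import time


def timed(fn, sync=None):
    """Run fn() and return (result, elapsed_seconds).

    sync is an optional callable run before each clock reading,
    for example torch.cuda.synchronize when fn works on the GPU.
    """
    if sync is not None:
        sync()
    start = time.perf_counter()
    result = fn()
    if sync is not None:
        sync()
    return result, time.perf_counter() - start


# Stand-in workload; in the notebook this would be the training loop
result, seconds = timed(lambda: sum(i * i for i in range(100_000)))
print(result, round(seconds, 4))
```

In the notebook, the training loop could be wrapped in a function and passed as fn, once for the CPU run and once for the GPU run, to compare the two numbers directly.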
That seems to be everything. Here is what we managed to do:
- We looked at what a GPU is and chose the server on which it is installed;
- We have set up a software environment for creating a neural network;
- We created a neural network for image recognition and trained it;
- We repeated the training of the network using the GPU and got an increase in speed.
I will be glad to answer questions in the comments.
Source: habr.com