Your First Neural Network on a Graphics Processing Unit (GPU): A Beginner's Guide

In this article, I will show you how to set up a machine-learning environment in 30 minutes, create a neural network for image recognition, and then run the same network on a graphics processing unit (GPU).

First, let's define what a neural network is.

In our case, it is a mathematical model, together with its software or hardware embodiment, built on the principle of the organization and functioning of biological neural networks: the networks of nerve cells in a living organism. The concept arose from studying the processes occurring in the brain and attempting to model those processes.

Neural networks are not programmed in the usual sense of the word; they are trained. The ability to learn is one of the main advantages of neural networks over traditional algorithms. Technically, training consists of finding the coefficients of the connections between neurons. During training, the neural network identifies complex dependencies between the input data and the output data, and learns to generalize.
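To make "training is finding the connection coefficients" concrete, here is a toy sketch (illustrative only, not part of the original guide): a single weight w in the model y = w * x is adjusted by gradient descent until it fits the data.

```python
# Toy illustration: "train" a one-weight model y = w * x on data
# generated by the target relationship y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, target) pairs

w = 0.0    # initial connection coefficient
lr = 0.05  # learning rate

for epoch in range(100):
    for x, target in data:
        pred = w * x
        grad = 2 * (pred - target) * x  # gradient of the squared error
        w -= lr * grad                  # adjust the weight

print(round(w, 3))  # converges to 2.0
```

The network we build below does essentially the same thing, only with thousands of weights and with PyTorch computing the gradients automatically.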

From the point of view of machine learning, a neural network is a special case of pattern-recognition methods, discriminant analysis, clustering methods, and so on.

Hardware

First, let's look at the hardware. We need a server with a Linux operating system installed. The equipment required to run machine-learning systems is quite powerful and, as a result, expensive. For those who do not have a good machine at hand, I recommend paying attention to cloud services. You can rent the necessary server quickly and pay only for the time you actually use it.

In projects where it is necessary to build neural networks, I use the servers of one of the Russian cloud providers. The company offers cloud servers for rent specifically for machine learning, equipped with powerful Tesla V100 graphics processors (GPUs) from NVIDIA. In short: using a server with a GPU can be tens of times more efficient (faster) than a comparably priced server that uses a CPU (the familiar central processing unit) for computation. This is achieved thanks to the peculiarities of the GPU architecture, which copes with these computations faster.

To follow the examples described below, we purchased the following server for a few days:

  • SSD disk, 150 GB
  • RAM, 32 GB
  • Tesla V100 GPU with 16 GB of memory, 4 cores

We installed Ubuntu 18.04 on our machine.

Setting up the environment

Now let's install everything necessary for working on the server. Since our article is aimed mainly at beginners, I will cover some points that will be useful to them.

A lot of the work of setting up an environment is done through the command line. Most users run Windows as their working OS, and the standard console in this OS leaves much to be desired, so we will use the convenient Cmder tool. Download the mini version and run Cmder.exe. Next you need to connect to the server via SSH:

ssh root@server-ip-or-hostname

Instead of server-ip-or-hostname, specify the IP address or DNS name of your server. Then enter the password; if the connection is successful, we should receive a message similar to this:

Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-74-generic x86_64)

The main language for developing ML models is Python, and the most popular platform for using it on Linux is Anaconda.

Let's install it on our server.

We start by updating the local package manager:

sudo apt-get update

Install curl (a command-line utility):

sudo apt-get install curl

Download the latest version of the Anaconda Distribution:

cd /tmp
curl -O https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh

Let's start the installation:

bash Anaconda3-2019.10-Linux-x86_64.sh

During the installation process, you will be asked to confirm the license agreement. Upon successful installation you should see this:

Thank you for installing Anaconda3!

Many frameworks have now been created for developing ML models; we will work with the most popular ones: PyTorch and TensorFlow.

Using a framework allows you to speed up development and use ready-made tools for standard tasks.

In this example, we will work with PyTorch. Let's install it:

conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

Now we need to launch Jupyter Notebook, a development tool popular with ML specialists. It allows you to write code and immediately see the results of its execution. Jupyter Notebook is included in Anaconda and is already installed on our server. We need to connect to it from our desktop.

To do this, we first launch Jupyter on the server, specifying port 8080:

jupyter notebook --no-browser --port=8080 --allow-root

Next, in another tab of our Cmder console (top menu - New console dialog), we connect to the server over SSH, forwarding port 8080:

ssh -L 8080:localhost:8080 root@server-ip-or-hostname

After we enter the first command, we will be offered links for opening Jupyter in our browser:

To access the notebook, open this file in a browser:
        file:///root/.local/share/jupyter/runtime/nbserver-18788-open.html
    Or copy and paste one of these URLs:
        http://localhost:8080/?token=cca0bd0b30857821194b9018a5394a4ed2322236f116d311
     or http://127.0.0.1:8080/?token=cca0bd0b30857821194b9018a5394a4ed2322236f116d311

Let's use the link for localhost:8080. Copy the full path and paste it into the address bar of your local PC's browser. Jupyter Notebook will open.

Let's create a new notebook: New - Notebook - Python 3.

Let's check that all the components we installed work correctly. Enter this example of PyTorch code into Jupyter and run it (the Run button):

from __future__ import print_function
import torch
x = torch.rand(5, 3)
print(x)

The result should be something like this:

(screenshot: a printed 5×3 tensor of random values)

If you got a similar result, then everything is configured correctly and we can start building a neural network!

Building a neural network

We will create a neural network for image recognition, taking this guide as a basis.

To train the network, we will use the publicly available CIFAR10 dataset. It has the classes "airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck". The images in CIFAR10 are 3x32x32, that is, 3-channel color images of 32x32 pixels.
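To make the 3x32x32 figure concrete: each image carries 3 × 32 × 32 = 3072 values and belongs to one of 10 classes. A quick arithmetic check:

```python
# Dimensions of a single CIFAR10 example: 3 channels of 32x32 pixels.
channels, height, width = 3, 32, 32
values_per_image = channels * height * width
print(values_per_image)  # 3072

# The 10 class labels used later in the article:
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
print(len(classes))  # 10
```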

(image: sample CIFAR10 images from each class)
For this work, we will use the package that PyTorch provides for working with images: torchvision.

We will perform the following steps in order:

  • Load and normalize the training and test datasets
  • Define the neural network
  • Train the network on the training data
  • Test the network on the test data
  • Repeat the training and testing using the GPU

We will execute all of the following code in Jupyter Notebook.

Loading and normalizing CIFAR10

Copy and run the following code in Jupyter:


import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

The output should be:

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz
Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified
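The Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) transform in the code above applies out = (in - mean) / std to each channel, shifting pixel values from the [0, 1] range produced by ToTensor to [-1, 1]. A minimal plain-Python sketch of that formula:

```python
def normalize(value, mean=0.5, std=0.5):
    """Per-channel normalization, as in transforms.Normalize above."""
    return (value - mean) / std

print(normalize(0.0))  # -1.0 (darkest pixel)
print(normalize(0.5))  #  0.0 (mid gray)
print(normalize(1.0))  #  1.0 (brightest pixel)
```

Note that the imshow helper below undoes exactly this with img / 2 + 0.5.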

Let's display a few of the training images to check:


import matplotlib.pyplot as plt
import numpy as np

# functions to show an image

def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()

# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)  # dataiter.next() was removed in newer PyTorch

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))

(image: a grid of four training images with their printed labels)

Defining the neural network

Let's first consider how a neural network for image recognition works. It is a simple feed-forward network: it takes the input data, passes it through several layers one by one, and finally produces the output data.

(diagram: a feed-forward convolutional network)

Let's create a similar network in our environment:


import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
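A note on the 16 * 5 * 5 input size of fc1 (a worked check, given the layer shapes above): each 5×5 convolution without padding shrinks a side by 4 pixels, and each 2×2 max-pool halves it, so a 32×32 input reaches the fully connected layers as 16 feature maps of 5×5:

```python
# Trace the spatial size of the image through Net's layers.
def conv(side, kernel=5):  # valid convolution, stride 1
    return side - kernel + 1

def pool(side):            # 2x2 max-pooling, stride 2
    return side // 2

side = 32
side = pool(conv(side))  # conv1: 32 -> 28, pool: 28 -> 14
side = pool(conv(side))  # conv2: 14 -> 10, pool: 10 -> 5
print(side)              # 5, hence fc1 takes 16 * 5 * 5 = 400 inputs
```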

We also define the loss function and the optimizer:


import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
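Under the hood, SGD with momentum keeps a velocity for every weight: v = momentum * v + grad, then w = w - lr * v. A hand-rolled single step for one scalar weight (a simplified sketch; PyTorch's optim.SGD adds options such as dampening and weight decay):

```python
def sgd_momentum_step(w, grad, velocity, lr=0.001, momentum=0.9):
    """One SGD-with-momentum update for a single scalar weight."""
    velocity = momentum * velocity + grad  # accumulate the gradient
    w = w - lr * velocity                  # move against the gradient
    return w, velocity

w, v = 1.0, 0.0
w, v = sgd_momentum_step(w, grad=0.5, velocity=v)
print(v)  # 0.5: the velocity picked up the gradient
```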

Training the network on the training data

Let's start training our neural network. Keep in mind that after you run this code, you will have to wait a while until the work completes. It took me 5 minutes; training the network takes time.

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

We get a result like the following:

(screenshot: the loss values printed every 2000 mini-batches)

Let's save our trained model:

PATH = './cifar_net.pth'
torch.save(net.state_dict(), PATH)

Testing the network on the test data

We trained the network using the training dataset, but we need to check whether the network has actually learned anything.

We will check this by having the neural network predict a class label and testing the prediction against the ground truth. If the prediction is correct, we add the sample to the list of correct predictions.
Let's display an image from the test set:

dataiter = iter(testloader)
images, labels = next(dataiter)  # dataiter.next() was removed in newer PyTorch

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

(image: four test images with their ground-truth labels)

Now let's ask the neural network to tell us what is in these images:


net = Net()
net.load_state_dict(torch.load(PATH))

outputs = net(images)

_, predicted = torch.max(outputs, 1)

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
                              for j in range(4)))

(output: the four predicted class labels)

The results look pretty good: the network correctly identified three out of the four images.

Let's see how the network performs across the whole dataset.


correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

(output: overall accuracy on the 10000 test images)

It looks like the network knows something and is working. If it were picking classes at random, the accuracy would be 10%.

Now let's see which classes the network identifies best:

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))

(output: per-class accuracy percentages)

It seems the network is best at identifying cars and ships: 71% accuracy.

So the network works. Now let's try to transfer its work to the graphics processing unit (GPU) and see what changes.

Training the neural network on the GPU

First, I will briefly explain what CUDA is. CUDA (Compute Unified Device Architecture) is a parallel computing platform developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can significantly speed up computing applications by harnessing the power of GPUs. This platform is already installed on the server we purchased.

Let's first define our GPU as the first visible cuda device:

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Assuming that we are on a CUDA machine, this should print a CUDA device:
print(device)

(output: cuda:0)

Send the network to the GPU:

net.to(device)

We will also have to send the inputs and targets to the GPU at every step:

inputs, labels = data[0].to(device), data[1].to(device)

Let's retrain the network, now on the GPU:

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data[0].to(device), data[1].to(device)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

This time, training the network took about 3 minutes. Recall that the same stage on a conventional processor took 5 minutes. The difference is not dramatic; this is because our network is not that large. When using larger arrays for training, the gap between the speed of the GPU and a conventional processor will grow.

That seems to be everything. Let's recap what we did:

  • We looked at what a GPU is and chose a server with one installed;
  • We set up a software environment for creating a neural network;
  • We created a neural network for image recognition and trained it;
  • We repeated the training using the GPU and obtained an increase in speed.

I will be happy to answer questions in the comments.

Source: www.hab.com
