Your first neural network on a graphics processing unit (GPU). A beginner's guide

In this article, I will show you how to set up a machine learning environment in 30 minutes, create a neural network for image recognition, and then run the same network on a graphics processing unit (GPU).

First, let's define what a neural network is.

In our case, it is a mathematical model, as well as its software or hardware implementation, built on the principles of the organization and functioning of biological neural networks, the networks of nerve cells of a living organism. This concept arose while studying the processes occurring in the brain and trying to model them.

Neural networks are not programmed in the usual sense of the word; they are trained. The ability to learn is one of the main advantages of neural networks over traditional algorithms. Technically, learning consists of finding the coefficients of the connections between neurons. During training, a neural network is able to detect complex dependencies between input and output data and to generalize.
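To make the idea of "learning as finding connection coefficients" concrete, here is a minimal sketch (plain Python, not part of the environment setup): a single artificial neuron with one weight w is trained to reproduce the mapping x -> 2x by repeatedly nudging w in the direction that reduces the squared error.

```python
# A single "neuron" with one connection weight w, trained to map x -> 2*x.
# Learning here is literally just finding the right value of w.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0    # initial weight
lr = 0.05  # learning rate

for epoch in range(200):
    for x, y in data:
        pred = w * x               # forward pass
        grad = 2 * (pred - y) * x  # gradient of (pred - y)**2 with respect to w
        w -= lr * grad             # gradient descent step

print(round(w, 3))  # the learned weight converges to 2.0
```

Real networks do exactly this, only with millions of weights and automatic differentiation instead of a hand-derived gradient.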

From a machine learning point of view, a neural network is a special case of pattern recognition methods, discriminant analysis, clustering methods, and other such techniques.

Hardware

First, let's deal with the hardware. We need a server with the Linux operating system installed. The hardware required to run machine learning systems is quite powerful and, as a result, expensive. For those who do not have a good machine at hand, I recommend paying attention to the offerings of cloud providers. You can rent the required server quickly and pay only for the time you use it.

In projects that involve building neural networks, I use the servers of one of the Russian cloud providers. The company offers cloud servers for rent specifically for machine learning, with Tesla V100 graphics processing units (GPUs) from NVIDIA. In short: using a server with a GPU can be tens of times more efficient (faster) than a server of similar cost that uses a CPU (the familiar central processing unit) for computations. This is achieved thanks to the peculiarities of the GPU architecture, which copes with such computations faster.

To run the examples described below, we rented the following server for a few days:

  • 150 GB SSD disk
  • 32 GB RAM
  • Tesla V100 16 GB GPU with 4 cores

We installed Ubuntu 18.04 on our machine.

Setting up the environment

Now let's install on the server everything we need for the work. Since our article is aimed primarily at beginners, I will mention a few points that will be useful to them.

A lot of the work when setting up an environment is done through the command line. Most users run Windows as their working OS, and the standard console in that OS leaves much to be desired, so we will use the convenient Cmder tool. Download the mini version and run cmder.exe. Next, connect to the server via SSH:

ssh root@server-ip-or-hostname

Instead of server-ip-or-hostname, specify the IP address or DNS name of your server. Then enter the password; if the connection succeeds, we should receive a message similar to this:

Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-74-generic x86_64)

The main language used for building ML models is Python, and the most popular platform for using it on Linux is Anaconda.

Let's install it on our server.

We start by updating the local package manager:

sudo apt-get update

Install curl (a command-line utility):

sudo apt-get install curl

Download the latest Anaconda distribution:

cd /tmp
curl -O https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh

Let's start the installation:

bash Anaconda3-2019.10-Linux-x86_64.sh

During the installation, you will be asked to confirm the license agreement. Upon successful installation, you should see this:

Thank you for installing Anaconda3!

Many frameworks have been created for developing ML models; we will work with the most popular ones: PyTorch and TensorFlow.

Using a framework lets you increase the speed of development and use ready-made tools for standard tasks.

In this example we will work with PyTorch. Let's install it:

conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

Now we need to launch Jupyter Notebook, a development tool popular with ML specialists. It lets you write code and immediately see the results of running it. Jupyter Notebook ships with Anaconda and is already installed on our server. We need to connect to it from our desktop system.

To do this, we first launch Jupyter on the server, specifying port 8080:

jupyter notebook --no-browser --port=8080 --allow-root

Next, opening another tab in our Cmder console (top menu, New console dialog), we connect through port 8080 to the server over SSH:

ssh -L 8080:localhost:8080 root@server-ip-or-hostname

When we enter the first command, we will be offered links for opening Jupyter in our browser:

To access the notebook, open this file in a browser:
        file:///root/.local/share/jupyter/runtime/nbserver-18788-open.html
    Or copy and paste one of these URLs:
        http://localhost:8080/?token=cca0bd0b30857821194b9018a5394a4ed2322236f116d311
     or http://127.0.0.1:8080/?token=cca0bd0b30857821194b9018a5394a4ed2322236f116d311

Let's use the link for localhost:8080. Copy the full path and paste it into the address bar of the browser on your local PC. Jupyter Notebook will open.

Let's create a new notebook: New - Notebook - Python 3.

Let's check that all the components we installed are working. Let's enter the example PyTorch code into Jupyter and run it (the Run button):

from __future__ import print_function
import torch
x = torch.rand(5, 3)
print(x)

The result should be something like this:

(screenshot: a 5×3 tensor of random values)

If you got a similar result, then everything is configured correctly and we can start creating a neural network!

Creating a neural network

We will create a neural network for image recognition, taking this guide as a basis.

We will use the publicly available CIFAR10 dataset to train the network. It has the classes "airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", and "truck". The images in CIFAR10 are 3x32x32, that is, 3-channel color images of 32×32 pixels.

(image: sample CIFAR10 images)
For the task, we will use torchvision, a package created by the PyTorch team for working with images.

We will do the following steps in order:

  • Load and normalize the training and test datasets
  • Define the neural network
  • Train the network on the training data
  • Test the network on the test data
  • Repeat the training and testing using the GPU

We will execute all of the code below in Jupyter Notebook.

Loading and normalizing CIFAR10

Copy and run this code in Jupyter:


import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

The response should be:

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz
Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified
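A note on the Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) transform used above: for each channel it computes (x - mean) / std, which maps pixel values from [0, 1] to [-1, 1]. The arithmetic can be checked with a small sketch (plain Python; the helper names are mine, not torchvision's):

```python
# What Normalize(mean=0.5, std=0.5) does to a single pixel value.
def normalize(x, mean=0.5, std=0.5):
    return (x - mean) / std

def unnormalize(x, mean=0.5, std=0.5):
    # the inverse operation (the plotting helper below uses img / 2 + 0.5
    # for the same reason)
    return x * std + mean

print(normalize(0.0), normalize(0.5), normalize(1.0))  # -1.0 0.0 1.0
print(unnormalize(normalize(0.25)))                    # 0.25
```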

Let's display several of the training images as a check:


import matplotlib.pyplot as plt
import numpy as np

# functions to show an image

def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()

# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))

(image: a grid of four training images with their printed labels)

Defining the neural network

First, let's consider how a neural network for image recognition works. It is a simple feed-forward network: it takes input data, passes it through several layers one by one, and finally produces output data.

(image: diagram of the convolutional neural network)

Let's create a similar network in our environment:


import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
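
A note on where the 16 * 5 * 5 in self.fc1 comes from: each 5×5 convolution here (stride 1, no padding) shrinks each side of the feature map by 4 pixels, and each 2×2 max-pool halves it. Starting from the 32×32 CIFAR10 images, the arithmetic can be checked with a small sketch (plain Python, helper names mine):

```python
# Track the feature-map side length through the layers of Net.
def conv_out(side, kernel):    # valid convolution, stride 1, no padding
    return side - kernel + 1

def pool_out(side, window=2):  # 2x2 max-pooling
    return side // window

side = 32                           # CIFAR10 image side
side = pool_out(conv_out(side, 5))  # conv1: 32 -> 28, pool: 28 -> 14
side = pool_out(conv_out(side, 5))  # conv2: 14 -> 10, pool: 10 -> 5

# conv2 outputs 16 channels, so fc1 takes 16 * 5 * 5 = 400 inputs
print(side, 16 * side * side)
```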

We also define a loss function and an optimizer:


import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

Training the network on the training data

Let's start training our neural network. Note that after you run this code, you will have to wait a while until the work is finished. It took me 5 minutes. Training the network takes time.

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

We get the following result:

(screenshot: the training loss printed every 2000 mini-batches)

Let's save our trained model:

PATH = './cifar_net.pth'
torch.save(net.state_dict(), PATH)

Testing the network on the test data

We trained the network using the training dataset. But we need to check whether the network has learned anything at all.

We will test this by predicting the class label that the neural network outputs and checking it against the ground truth. If the prediction is correct, we add the sample to the list of correct predictions.
Let's display an image from the test set:

dataiter = iter(testloader)
images, labels = next(dataiter)

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

(image: four test images with their ground-truth labels)

Now let's ask the neural network to tell us what is in these pictures:


net = Net()
net.load_state_dict(torch.load(PATH))

outputs = net(images)

_, predicted = torch.max(outputs, 1)

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
                              for j in range(4)))

(screenshot: the predicted labels)

The results look good: the network correctly identified three of the four pictures.

Let's see how the network performs across the whole dataset.


correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

(screenshot: the overall accuracy on the 10000 test images)

It looks like the network knows something and is working. If it were choosing classes at random, the accuracy would be 10%.
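Where the 10% baseline comes from: CIFAR10 has 10 equally likely classes, so a classifier that guesses at random is correct about 1 time in 10. A quick simulation (plain Python sketch, seeded for reproducibility):

```python
import random

random.seed(0)
num_classes = 10
trials = 100_000

# Guess a class uniformly at random and compare it with a random true label.
hits = sum(
    random.randrange(num_classes) == random.randrange(num_classes)
    for _ in range(trials)
)

print(round(hits / trials, 2))  # close to 1 / num_classes = 0.1
```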

Now let's see which classes the network identifies best:

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))

(screenshot: per-class accuracy figures)

It seems the network is best at identifying cars and ships: 71% accuracy.

So, the network is working. Now let's try to transfer its work to the graphics processor (GPU) and see what changes.

Training the neural network on the GPU

First, I will briefly explain what CUDA is. CUDA (Compute Unified Device Architecture) is a parallel computing platform developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can significantly speed up computing applications by harnessing the power of GPUs. This platform is already installed on the server we rented.

First, let's define our GPU as the first visible cuda device:

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Assuming that we are on a CUDA machine, this should print a CUDA device:
print(device)

(screenshot: the printed device, cuda:0)

Send the network to the GPU:

net.to(device)

We will also have to send the inputs and targets to the GPU at every step:

inputs, labels = data[0].to(device), data[1].to(device)

Now let's retrain the network on the GPU:

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data[0].to(device), data[1].to(device)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

This time the network training took about 3 minutes. Recall that the same training on a conventional processor took 5 minutes. The difference is not significant, because our network is not that big. When training on large arrays of data, the gap between the speed of the GPU and a traditional processor increases.
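If you want to measure the difference on your own server, you can wrap the training loop with a timer from the standard library. A minimal sketch of the pattern (the train_step function here is a placeholder workload, not the real training loop):

```python
import time

def train_step():
    # Placeholder workload; in the notebook this would be the training loop above.
    return sum(i * i for i in range(10_000))

start = time.perf_counter()
train_step()
elapsed = time.perf_counter() - start

print('Training took %.4f seconds' % elapsed)
```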

That seems to be all. What we managed to do:

  • We looked at what a GPU is and chose a server with one installed;
  • We set up a software environment for creating a neural network;
  • We created a neural network for image recognition and trained it;
  • We repeated the network training using the GPU and got an increase in speed.

I will be happy to answer questions in the comments.

Source: www.habr.com
