Distributed Training with Apache MXNet and Horovod

This translation of the article was prepared ahead of the start of the course "Industrial ML on Big Data"

Distributed training on multiple high-performance computing instances can reduce the training time of modern deep neural networks on large amounts of data from weeks to hours or even minutes, which makes this technique prevalent in practical applications of deep learning. Users need to understand how to share and synchronize data across multiple instances, which in turn has a major impact on scaling efficiency. In addition, users should know how to deploy a training script that runs on a single instance to multiple instances.

In this article we will talk about a quick and easy way to do distributed training using the open deep learning library Apache MXNet and the Horovod distributed training framework. We will show the performance benefits of the Horovod framework and demonstrate how to write an MXNet training script so that it runs in a distributed manner with Horovod.

What is Apache MXNet

Apache MXNet is an open-source deep learning framework used to build, train, and deploy deep neural networks. MXNet abstracts away the complexity of implementing neural networks, is highly performant and scalable, and offers APIs for popular programming languages such as Python, C++, Clojure, Java, Julia, R, Scala, and others.

Distributed training in MXNet with a parameter server

The standard distributed training module in MXNet uses a parameter server approach. It uses a set of parameter servers to collect gradients from each worker, perform aggregation, and send the updated gradients back to the workers for the next optimization iteration. Determining the right ratio of servers to workers is the key to effective scaling. If there is only one parameter server, it may become a bottleneck in the computation. Conversely, if too many servers are used, many-to-many communication can clog all network connections.
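To make the aggregation step concrete, here is a minimal in-process sketch of what a single parameter server does with the gradients pushed by its workers. It is a toy illustration only: the names `aggregate_on_server` and `apply_on_worker` are hypothetical and are not part of the MXNet API, and plain averaging followed by an SGD step is an assumption about the update rule.

```python
from typing import Dict, List

Gradients = Dict[str, List[float]]


def aggregate_on_server(worker_grads: List[Gradients]) -> Gradients:
    """Average the gradients pushed by every worker, key by key (toy server)."""
    num_workers = len(worker_grads)
    return {
        name: [sum(values) / num_workers
               for values in zip(*(grads[name] for grads in worker_grads))]
        for name in worker_grads[0]
    }


def apply_on_worker(params: Gradients, mean_grads: Gradients,
                    learning_rate: float) -> Gradients:
    """Plain SGD step that each worker applies after pulling the aggregate."""
    return {
        name: [p - learning_rate * g for p, g in zip(values, mean_grads[name])]
        for name, values in params.items()
    }


# Two hypothetical workers push gradients for the same parameter "w".
pushed = [{"w": [0.2, 0.4]}, {"w": [0.6, 0.0]}]
mean_grads = aggregate_on_server(pushed)
new_params = apply_on_worker({"w": [1.0, 2.0]}, mean_grads, learning_rate=0.5)
print(mean_grads, new_params)
```

Every gradient passes through `aggregate_on_server`, which is why a single server can turn into the bottleneck described above once the number of workers grows.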

What is Horovod

Horovod is an open distributed deep learning framework developed at Uber. It uses efficient cross-GPU and cross-node technologies such as the NVIDIA Collective Communications Library (NCCL) and the Message Passing Interface (MPI) to distribute and aggregate model parameters across workers. It optimizes the use of network bandwidth and scales well when working with deep neural network models. It currently supports several popular machine learning frameworks, namely MXNet, TensorFlow, Keras, and PyTorch.

MXNet and Horovod integration

MXNet integrates with Horovod through the distributed training APIs defined in Horovod. The Horovod communication APIs horovod.broadcast(), horovod.allgather(), and horovod.allreduce() are implemented using asynchronous callbacks of the MXNet engine, as part of its task graph. This way, data dependencies between communication and computation are handled by the MXNet engine, which avoids performance losses due to synchronization. The distributed optimizer object defined in Horovod, horovod.DistributedOptimizer, extends Optimizer in MXNet so that it calls the corresponding Horovod APIs for distributed parameter updates. All of these implementation details are transparent to end users.
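The semantics of the three collective operations can be illustrated without a cluster. The sketch below simulates them in a single process, with one Python list standing in for the tensor held by each worker. It does not call Horovod at all; the function names simply mirror the operations, and averaging in the allreduce step is an assumption about the variant used for gradients.

```python
from typing import List

Tensor = List[float]


def broadcast(worker_tensors: List[Tensor], root_rank: int = 0) -> List[Tensor]:
    """Every worker receives a copy of the root worker's tensor."""
    root = worker_tensors[root_rank]
    return [list(root) for _ in worker_tensors]


def allgather(worker_tensors: List[Tensor]) -> List[Tensor]:
    """Every worker receives the concatenation of all workers' tensors."""
    gathered = [x for tensor in worker_tensors for x in tensor]
    return [list(gathered) for _ in worker_tensors]


def allreduce_average(worker_tensors: List[Tensor]) -> List[Tensor]:
    """Every worker receives the element-wise average of all tensors."""
    num_workers = len(worker_tensors)
    averaged = [sum(column) / num_workers for column in zip(*worker_tensors)]
    return [list(averaged) for _ in worker_tensors]


# Two simulated workers, each holding a local gradient of length 2.
local_grads = [[1.0, 2.0], [3.0, 4.0]]
print(allreduce_average(local_grads))
print(broadcast(local_grads, root_rank=0))
print(allgather(local_grads))
```

Unlike the parameter server approach, no dedicated aggregation node appears here: after `allreduce_average`, every worker already holds the same averaged gradient.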

Quick start

You can quickly start training a small convolutional neural network on the MNIST dataset using MXNet and Horovod on your MacBook.
First, install mxnet and horovod from PyPI:

pip install mxnet
pip install horovod

Note: If you encounter an error during pip install horovod, you may need to add the variable MACOSX_DEPLOYMENT_TARGET=10.vv, where vv is the version of your macOS. For example, for macOS Sierra you would write MACOSX_DEPLOYMENT_TARGET=10.12 pip install horovod

Then install OpenMPI from here.

Finally, download the test script mxnet_mnist.py from here and run the following command in the MacBook terminal from the working directory:

mpirun -np 2 -H localhost:2 -bind-to none -map-by slot python mxnet_mnist.py

This will run training on two cores of your processor. The output will look like this:

INFO:root:Epoch[0] Batch [0-50] Speed: 2248.71 samples/sec      accuracy=0.583640
INFO:root:Epoch[0] Batch [50-100] Speed: 2273.89 samples/sec      accuracy=0.882812
INFO:root:Epoch[0] Batch [50-100] Speed: 2273.39 samples/sec      accuracy=0.870000

Performance demo

When training a ResNet50-v1 model on the ImageNet dataset on 64 GPUs with eight p3.16xlarge EC2 instances, each containing 8 NVIDIA Tesla V100 GPUs in the AWS cloud, we achieved a training throughput of 45000 images/sec (that is, the number of training samples processed per second). Training completed in 44 minutes after 90 epochs with a best accuracy of 75.7%.
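As a rough sanity check, the reported throughput and epoch count can be turned into an expected wall-clock time. The sketch below assumes the ImageNet-1k (ILSVRC-2012) training set of 1,281,167 images, which the article does not state explicitly, and it ignores validation passes, data loading stalls, and startup time.

```python
# Assumption: ImageNet-1k (ILSVRC-2012) training set size; not stated in the article.
IMAGENET_TRAIN_IMAGES = 1_281_167


def estimated_training_minutes(epochs: int, images_per_epoch: int,
                               images_per_second: float) -> float:
    """Pure compute time in minutes implied by a sustained throughput."""
    total_images = epochs * images_per_epoch
    return total_images / images_per_second / 60.0


minutes = estimated_training_minutes(90, IMAGENET_TRAIN_IMAGES, 45_000)
print(f"{minutes:.1f} minutes")
```

Under these assumptions the estimate comes out slightly below the reported 44 minutes, so the throughput and training time figures are consistent with each other.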

We compared this with the MXNet distributed training approach that uses parameter servers on 8, 16, 32, and 64 GPUs, with a single parameter server and with server-to-worker ratios of 1 to 1 and 2 to 1, respectively. You can see the result in Figure 1 below. The bars, read on the left y-axis, represent the number of images trained per second; the lines, read on the right y-axis, show the scaling efficiency (that is, the ratio of actual to ideal throughput). As you can see, the choice of the number of servers affects scaling efficiency. If there is only one parameter server, scaling efficiency drops to 38% on 64 GPUs. To achieve the same scaling efficiency as with Horovod, you need to double the number of servers relative to the number of workers.
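The scaling efficiency metric used in this comparison is simple to compute. The sketch below follows the definition given above (actual throughput divided by ideal throughput), taking the ideal to be the single-GPU throughput multiplied by the number of GPUs. The throughput numbers in the example are made up for illustration and are not measurements from this benchmark.

```python
def scaling_efficiency(actual_throughput: float,
                       single_gpu_throughput: float,
                       num_gpus: int) -> float:
    """Ratio of measured throughput to perfectly linear (ideal) throughput."""
    ideal_throughput = single_gpu_throughput * num_gpus
    return actual_throughput / ideal_throughput


# Hypothetical numbers: 1000 images/sec on 1 GPU, 7200 images/sec on 8 GPUs.
efficiency = scaling_efficiency(7200.0, 1000.0, 8)
print(f"{efficiency:.0%}")
```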

Figure 1. Comparison of distributed training using MXNet with Horovod and with a parameter server

In Table 1 below, we compare the final cost per instance when running the experiments on 64 GPUs. Using MXNet with Horovod delivers the best throughput at the lowest cost.

Table 1. Cost comparison between Horovod and a parameter server with a server-to-worker ratio of 2 to 1.

Steps to reproduce

In the following steps, we will show you how to reproduce the distributed training result using MXNet and Horovod. To learn more about distributed training with MXNet, read this post.

Step 1

Create a cluster of homogeneous instances with MXNet version 1.4.0 or higher and Horovod version 0.16.0 or higher to use distributed training. You will also need to install the libraries for GPU training. For our instances, we chose Ubuntu 16.04 Linux with GPU Driver 396.44, CUDA 9.2, the cuDNN 7.2.1 library, the NCCL 2.2.13 communicator, and OpenMPI 3.1.1. You can also use the Amazon Deep Learning AMI, where these libraries are already pre-installed.

Step 2

Add the ability to work with the Horovod API to your MXNet training script. The script below, based on the MXNet Gluon API, can be used as a simple template. The highlighted lines are the ones you need to add if you already have a corresponding training script. Here are a few critical changes you need to make to train with Horovod:

  • Set the context according to the Horovod local rank (line 8) so that training runs on the correct GPU.
  • Broadcast the initial parameters from one worker to all the others (line 18) to ensure that all workers start with the same initial parameters.
  • Create a Horovod DistributedOptimizer (line 25) to update the parameters in a distributed manner.

For the full scripts, please see the Horovod-MXNet MNIST and ImageNet examples.

1  import mxnet as mx
2  import horovod.mxnet as hvd
3
4  # Horovod: initialize Horovod
5  hvd.init()
6
7  # Horovod: pin a GPU to be used to local rank
8  context = mx.gpu(hvd.local_rank())
9
10 # Build model
11 model = ...
12
13 # Initialize parameters
14 model.initialize(initializer, ctx=context)
15 params = model.collect_params()
16
17 # Horovod: broadcast parameters
18 hvd.broadcast_parameters(params, root_rank=0)
19
20 # Create optimizer
21 optimizer_params = ...
22 opt = mx.optimizer.create('sgd', **optimizer_params)
23
24 # Horovod: wrap optimizer with DistributedOptimizer
25 opt = hvd.DistributedOptimizer(opt)
26
27 # Create trainer and loss function
28 trainer = mx.gluon.Trainer(params, opt, kvstore=None)
29 loss_fn = ...
30
31 # Train model
32 for epoch in range(num_epoch):
33    ...

Step 3

Log in to one of the workers to start distributed training using the MPI command. In this example, distributed training runs on four instances with 4 GPUs each, for a total of 16 GPUs in the cluster. The Stochastic Gradient Descent (SGD) optimizer will be used with the following hyperparameters:

  • mini-batch size: 256
  • learning rate: 0.1
  • momentum: 0.9
  • weight decay: 0.0001

As we scaled from one GPU to 64 GPUs, we linearly scaled the learning rate with the number of GPUs (from 0.1 for 1 GPU to 6.4 for 64 GPUs), while keeping the number of images per GPU at 256 (from a batch of 256 images for 1 GPU to 16,384 for 64 GPUs). The weight decay and momentum parameters changed as the number of GPUs increased. We used mixed precision training with the float16 data type for the forward pass and float32 for the gradients to take advantage of the accelerated float16 computation supported by NVIDIA Tesla GPUs.
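The linear scaling rule described here can be written as a small helper. This is a sketch rather than code from the benchmark scripts: the function name `scale_hyperparameters` is hypothetical, and the defaults are the single-GPU values listed above.

```python
from typing import Tuple


def scale_hyperparameters(num_gpus: int,
                          base_learning_rate: float = 0.1,
                          images_per_gpu: int = 256) -> Tuple[float, int]:
    """Scale the learning rate and global batch size linearly with GPU count."""
    learning_rate = base_learning_rate * num_gpus
    global_batch_size = images_per_gpu * num_gpus
    return learning_rate, global_batch_size


for gpus in (1, 16, 64):
    lr, batch = scale_hyperparameters(gpus)
    print(f"{gpus:>2} GPUs: learning rate {lr:.1f}, global batch size {batch}")
```

For the 16-GPU cluster used in this step, the same rule gives a learning rate of 1.6 and a global batch size of 4,096.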

$ mpirun -np 16 \
    -H server1:4,server2:4,server3:4,server4:4 \
    -bind-to none -map-by slot \
    -mca pml ob1 -mca btl ^openib \
    python mxnet_imagenet_resnet50.py
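If the cluster layout changes often, the host list and process count in this command can be derived from a single mapping instead of being edited by hand. The helper below is a hypothetical convenience wrapper, not part of Horovod or OpenMPI; it only assembles the same arguments shown above and assumes one process per GPU.

```python
from typing import Dict, List


def build_mpirun_command(gpus_per_host: Dict[str, int], script: str) -> List[str]:
    """Assemble the mpirun argument list for one process per GPU."""
    total_processes = sum(gpus_per_host.values())
    host_spec = ",".join(f"{host}:{gpus}" for host, gpus in gpus_per_host.items())
    return [
        "mpirun", "-np", str(total_processes),
        "-H", host_spec,
        "-bind-to", "none", "-map-by", "slot",
        "-mca", "pml", "ob1", "-mca", "btl", "^openib",
        "python", script,
    ]


cluster = {"server1": 4, "server2": 4, "server3": 4, "server4": 4}
command = build_mpirun_command(cluster, "mxnet_imagenet_resnet50.py")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run` on the worker you logged in to. Passing the arguments as a list avoids shell interpretation of the `^` character.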

Conclusion

In this article, we looked at a scalable approach to distributed model training using Apache MXNet and Horovod. We demonstrated its scaling efficiency and cost-effectiveness compared with the parameter server approach on the ImageNet dataset, on which the ResNet50-v1 model was trained. We also included steps you can use to modify an existing script to run multi-instance training with Horovod.

If you are just getting started with MXNet and deep learning, go to the MXNet installation page to build MXNet first. We also strongly recommend reading the article MXNet in 60 minutes to get started.

If you have already worked with MXNet and want to try distributed training with Horovod, take a look at the Horovod installation page, build it with MXNet support, and follow the MNIST or ImageNet example.

*Cost is calculated based on the AWS hourly rates for EC2 instances.

Learn more about the course "Industrial ML on Big Data"

Source: www.habr.com
