Distributed Training with Apache MXNet and Horovod

This translation of the article was prepared ahead of the start of the course "Industrial ML on Big Data".

Distributed training on multiple high-performance computing instances can cut the time needed to train modern deep neural networks on large amounts of data from weeks to hours or even minutes, which is why this technique is so widely used in practical deep learning. Users need to understand how data is shared and synchronized across multiple instances, because this has a major impact on scaling efficiency. In addition, users need to know how to deploy a training script that runs on a single instance to multiple instances.

In this article we discuss a quick and easy way to run distributed training using the open source deep learning library Apache MXNet and the Horovod distributed training framework. We show the performance benefits of the Horovod framework and demonstrate how to write an MXNet training script so that it runs in a distributed manner with Horovod.

What is Apache MXNet

Apache MXNet is an open source deep learning framework used to build, train, and deploy deep neural networks. MXNet abstracts away the complexity of implementing neural networks, is highly performant and scalable, and offers APIs for popular programming languages such as Python, C++, Clojure, Java, Julia, R, Scala, and others.

Distributed training in MXNet with a parameter server

The standard distributed training module in MXNet uses the parameter server approach. It uses a set of parameter servers to collect gradients from each worker, aggregate them, and send the updated gradients back to the workers for the next optimization iteration. Choosing the right ratio of servers to workers is the key to efficient scaling. If there is only one parameter server, it can become a bottleneck in the computation. Conversely, if too many servers are used, many-to-many communication can saturate all network connections.
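
The push, aggregate, and pull cycle described above can be sketched in a few lines of plain Python. This is a toy, single-process illustration and not MXNet's actual implementation; the class ToyParameterServer and its methods are hypothetical names, and the aggregation is assumed to be a simple average.

```python
# Toy single-process model of the parameter-server cycle (not MXNet code).
# ToyParameterServer, push, pull and next_iteration are hypothetical names.

class ToyParameterServer:
    def __init__(self, num_workers):
        self.num_workers = num_workers
        self.pending = []  # gradients pushed during the current iteration

    def push(self, gradient):
        """A worker sends its local gradient (a list of floats) to the server."""
        self.pending.append(list(gradient))

    def pull(self):
        """Return the element-wise average once every worker has pushed."""
        if len(self.pending) != self.num_workers:
            raise RuntimeError("not all workers have pushed their gradients")
        return [sum(vals) / self.num_workers for vals in zip(*self.pending)]

    def next_iteration(self):
        """Clear state before the next optimization step."""
        self.pending = []


server = ToyParameterServer(num_workers=2)
server.push([1.0, 2.0])   # gradient from worker 0
server.push([3.0, 6.0])   # gradient from worker 1
aggregated = server.pull()
print(aggregated)  # [2.0, 4.0]
```

With a single server, every push and pull from every worker goes through one endpoint, which illustrates why it can become the bottleneck as the number of workers grows.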

What is Horovod

Horovod is an open source distributed deep learning framework developed at Uber. It uses efficient cross-GPU and cross-node technologies such as the NVIDIA Collective Communications Library (NCCL) and the Message Passing Interface (MPI) to distribute and aggregate model parameters across workers. It optimizes the use of network bandwidth and scales well when working with deep neural network models. It currently supports several popular machine learning frameworks, namely MXNet, TensorFlow, Keras, and PyTorch.

MXNet and Horovod integration

MXNet integrates with Horovod through the distributed training APIs defined in Horovod. The Horovod communication APIs horovod.broadcast(), horovod.allgather() and horovod.allreduce() are implemented using asynchronous callbacks of the MXNet engine, as part of its task graph. In this way, data dependencies between communication and computation are handled by the MXNet engine itself, which avoids performance losses due to synchronization. The distributed optimizer object defined in Horovod, horovod.DistributedOptimizer, extends Optimizer in MXNet so that it calls the corresponding Horovod APIs for distributed parameter updates. All of these implementation details are transparent to end users.
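
To make the meaning of these collective operations concrete, here is a single-process sketch of what broadcast and allreduce compute. It does not call Horovod at all; sim_broadcast and sim_allreduce are hypothetical helpers, and averaging in the allreduce is an assumption made for the illustration.

```python
# Single-process illustration of the semantics of broadcast and allreduce.
# sim_broadcast and sim_allreduce are hypothetical helpers, not Horovod calls.

def sim_broadcast(values_per_worker, root_rank=0):
    """Every worker ends up with a copy of the root worker's values."""
    root = values_per_worker[root_rank]
    return [list(root) for _ in values_per_worker]


def sim_allreduce(values_per_worker, average=True):
    """Every worker ends up with the element-wise sum (or mean) over workers."""
    n = len(values_per_worker)
    reduced = [sum(col) for col in zip(*values_per_worker)]
    if average:
        reduced = [v / n for v in reduced]
    return [list(reduced) for _ in values_per_worker]


# Three workers start with different initial parameters ...
params = sim_broadcast([[0.5, 0.5], [0.1, 0.9], [0.7, 0.3]], root_rank=0)
# ... and compute different local gradients on their own mini-batches.
grads = sim_allreduce([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])
print(params[2])  # [0.5, 0.5]
print(grads[0])   # [2.0, 5.0]
```

Unlike the parameter server approach, no dedicated server process appears here: every worker both contributes to and receives the reduced result.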

Quick start

You can quickly start training a small convolutional neural network on the MNIST dataset using MXNet and Horovod on your MacBook.
First, install mxnet and horovod from PyPI:

pip install mxnet
pip install horovod

Note: If you get an error during pip install horovod, you may need to set the variable MACOSX_DEPLOYMENT_TARGET=10.vv, where vv is your macOS version. For example, on macOS Sierra you would run MACOSX_DEPLOYMENT_TARGET=10.12 pip install horovod
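
If you are unsure which value to use, the following sketch derives it from the version that Python reports. The helper deployment_target is hypothetical, and the assumption that the first two version components are the right value is taken from the 10.12 example above; it may not hold for every macOS release.

```python
# Hypothetical helper: derive MACOSX_DEPLOYMENT_TARGET from a macOS version string.
import platform


def deployment_target(mac_version):
    """Return 'major.minor' for a version string such as '10.12.6'."""
    parts = mac_version.split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError("unexpected macOS version: %r" % mac_version)
    return ".".join(parts[:2])


version = platform.mac_ver()[0]  # empty string on non-macOS systems
if version:
    target = deployment_target(version)
    print("MACOSX_DEPLOYMENT_TARGET=%s pip install horovod" % target)
```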

Next, install OpenMPI.

Finally, download the test script mxnet_mnist.py and run the following command in a MacBook terminal from the working directory:

mpirun -np 2 -H localhost:2 -bind-to none -map-by slot python mxnet_mnist.py

This runs training on two cores of your processor. The output looks like this:

INFO:root:Epoch[0] Batch [0-50] Speed: 2248.71 samples/sec      accuracy=0.583640
INFO:root:Epoch[0] Batch [50-100] Speed: 2273.89 samples/sec      accuracy=0.882812
INFO:root:Epoch[0] Batch [50-100] Speed: 2273.39 samples/sec      accuracy=0.870000

Performance demonstration

When training a ResNet50-v1 model on the ImageNet dataset on 64 GPUs, using eight p3.16xlarge EC2 instances with 8 NVIDIA Tesla V100 GPUs each in the AWS cloud, we achieved a training throughput of 45000 images/sec (that is, the number of training samples processed per second). Training completed in 44 minutes after 90 epochs with a best accuracy of 75.7%.

We compared this with MXNet's distributed training approach using parameter servers on 8, 16, 32 and 64 GPUs, with a single parameter server and with server-to-worker ratios of 1 to 1 and 2 to 1, respectively. You can see the result in Figure 1 below. The bars show the number of images trained per second on the left y-axis, and the lines show the scaling efficiency (that is, the ratio of actual to ideal throughput) on the right y-axis. As you can see, the choice of the number of servers affects the scaling efficiency. With only one parameter server, the scaling efficiency drops to 38% on 64 GPUs. To reach the same scaling efficiency as with Horovod, you need twice as many servers as workers.
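
Scaling efficiency as defined above is a simple ratio, sketched here in Python. The throughput numbers in the example are made up purely for illustration and are not measurements from this article; scaling_efficiency is a hypothetical helper.

```python
# Scaling efficiency = measured throughput / ideal (linearly scaled) throughput.
# The throughput numbers below are made up for illustration only.

def scaling_efficiency(throughput_n, throughput_1, num_gpus):
    """Ratio of measured throughput on num_gpus to perfect linear scaling."""
    ideal = throughput_1 * num_gpus
    return throughput_n / ideal


# Hypothetical: 500 images/sec on 1 GPU, 3600 images/sec on 8 GPUs.
eff = scaling_efficiency(throughput_n=3600.0, throughput_1=500.0, num_gpus=8)
print("%.0f%%" % (eff * 100))  # 90%
```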

Figure 1. Comparison of distributed training using MXNet with Horovod and with a parameter server

In Table 1 below, we compare the final cost per instance when running the experiments on 64 GPUs. Using MXNet with Horovod provides the best throughput at the lowest cost.

Table 1. Cost comparison between Horovod and a parameter server with a server-to-worker ratio of 2 to 1.

Steps to reproduce

In the following steps, we show how to reproduce the distributed training results using MXNet and Horovod. To learn more about distributed training with MXNet, read this post.

Step 1

Create a cluster of homogeneous instances with MXNet version 1.4.0 or higher and Horovod version 0.16.0 or higher to use distributed training. You will also need to install the libraries for GPU training. For our instances, we chose Ubuntu 16.04 Linux with GPU Driver 396.44, CUDA 9.2, the cuDNN 7.2.1 library, the NCCL 2.2.13 communicator, and OpenMPI 3.1.1. You can also use the Amazon Deep Learning AMI, where these libraries are already pre-installed.

Step 2

Add support for the Horovod API to your MXNet training script. The script below, based on the MXNet Gluon API, can be used as a simple template. The lines under the "# Horovod:" comments are the ones you need to add if you already have a working training script. These are the key changes you need to make to train with Horovod:

  • Set the context according to the Horovod local rank (line 8) so that training runs on the correct GPU.
  • Broadcast the initial parameters from one worker to all the others (line 18) to ensure that all workers start with the same initial parameters.
  • Create a Horovod DistributedOptimizer (line 25) to update the parameters in a distributed manner.

For the complete script, refer to the Horovod-MXNet examples for MNIST and ImageNet.

1  import mxnet as mx
2  import horovod.mxnet as hvd
3
4  # Horovod: initialize Horovod
5  hvd.init()
6
7  # Horovod: pin a GPU to be used to local rank
8  context = mx.gpu(hvd.local_rank())
9
10 # Build model
11 model = ...
12
13 # Initialize parameters
14 model.initialize(initializer, ctx=context)
15 params = model.collect_params()
16
17 # Horovod: broadcast parameters
18 hvd.broadcast_parameters(params, root_rank=0)
19
20 # Create optimizer
21 optimizer_params = ...
22 opt = mx.optimizer.create('sgd', **optimizer_params)
23
24 # Horovod: wrap optimizer with DistributedOptimizer
25 opt = hvd.DistributedOptimizer(opt)
26
27 # Create trainer and loss function
28 trainer = mx.gluon.Trainer(params, opt, kvstore=None)
29 loss_fn = ...
30
31 # Train model
32 for epoch in range(num_epoch):
33    ...
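
The template above does not show how each worker obtains its own portion of the training data. One common arrangement, sketched here in plain Python as an assumption rather than as the method used in this article, is to give the worker with rank r every size-th sample starting at r. In a real script, rank and size would come from hvd.rank() and hvd.size(); shard_indices is a hypothetical helper.

```python
# Hypothetical sketch: split sample indices across workers by rank.

def shard_indices(num_samples, rank, size):
    """Indices of the samples assigned to worker `rank` out of `size` workers."""
    if not 0 <= rank < size:
        raise ValueError("rank must be in [0, size)")
    return list(range(rank, num_samples, size))


# 10 samples split over 4 workers: shards are disjoint and cover everything.
shards = [shard_indices(10, rank, 4) for rank in range(4)]
print(shards[0])  # [0, 4, 8]
print(shards[3])  # [3, 7]
```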

Step 3

Log in to one of the workers to start distributed training using the MPI command. In this example, distributed training runs on four instances with 4 GPUs each, for a total of 16 GPUs in the cluster. The Stochastic Gradient Descent (SGD) optimizer is used with the following hyperparameters:

  • mini-batch size: 256
  • learning rate: 0.1
  • momentum: 0.9
  • weight decay: 0.0001

As we scaled from one GPU to 64 GPUs, we scaled the learning rate linearly with the number of GPUs (from 0.1 for 1 GPU to 6.4 for 64 GPUs), while keeping the number of images per GPU at 256 (from a batch of 256 images for 1 GPU to 16,384 for 64 GPUs). The weight decay and momentum parameters did not change as the number of GPUs increased. We used mixed-precision training, with the float16 data type for the forward pass and float32 for gradients, to take advantage of the accelerated float16 computation supported by NVIDIA Tesla GPUs.
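
The linear scaling described above is simple arithmetic, sketched below. The base values come from the text; the function name scaled_hyperparameters is hypothetical.

```python
# Linear scaling of learning rate and global batch size with the GPU count.
BASE_LR = 0.1          # learning rate for 1 GPU (from the text)
BATCH_PER_GPU = 256    # images per GPU (from the text)


def scaled_hyperparameters(num_gpus):
    """Return (learning_rate, global_batch_size) under linear scaling."""
    return BASE_LR * num_gpus, BATCH_PER_GPU * num_gpus


for n in (1, 16, 64):
    lr, batch = scaled_hyperparameters(n)
    print(n, round(lr, 4), batch)
# 1 0.1 256
# 16 1.6 4096
# 64 6.4 16384
```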

$ mpirun -np 16 \
    -H server1:4,server2:4,server3:4,server4:4 \
    -bind-to none -map-by slot \
    -mca pml ob1 -mca btl ^openib \
    python mxnet_imagenet_resnet50.py
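
For clusters of a different size, the same command can be assembled programmatically. The sketch below uses only the flags that appear in the command above; build_mpirun_command is a hypothetical helper, and the host names are the placeholders from this example.

```python
# Hypothetical helper that assembles the mpirun command from a mapping of
# host name -> number of processes (one process per GPU).

def build_mpirun_command(hosts, script):
    """Return the mpirun argument list for the given hosts and script."""
    total = sum(hosts.values())  # -np must equal the total number of slots
    host_arg = ",".join("%s:%d" % (name, slots) for name, slots in hosts.items())
    return [
        "mpirun", "-np", str(total),
        "-H", host_arg,
        "-bind-to", "none", "-map-by", "slot",
        "-mca", "pml", "ob1", "-mca", "btl", "^openib",
        "python", script,
    ]


hosts = {"server1": 4, "server2": 4, "server3": 4, "server4": 4}
cmd = build_mpirun_command(hosts, "mxnet_imagenet_resnet50.py")
print(" ".join(cmd))
```

The returned list can be passed directly to subprocess.run(cmd), which avoids shell quoting issues with arguments such as ^openib.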

Conclusion

In this article, we looked at a scalable approach to distributed model training using Apache MXNet and Horovod. We demonstrated its scaling efficiency and cost effectiveness compared with the parameter server approach on the ImageNet dataset, on which a ResNet50-v1 model was trained. We also covered the steps you can use to modify an existing script so that it runs multi-instance training with Horovod.

If you are just getting started with MXNet and deep learning, go to the MXNet installation page to build MXNet first. We also strongly recommend reading the article "MXNet in 60 minutes" to get started.

If you have already worked with MXNet and want to try distributed training with Horovod, take a look at the Horovod installation page, build it with MXNet support, and follow the MNIST or ImageNet example.

*cost is calculated based on AWS hourly rates for EC2 instances


Source: www.habr.com
