Pavel Klemenkov, NVIDIA: We are trying to narrow the gap between what a data scientist can do and what they should be able to do

The second intake of students for Ozon Masters, a master's program in data science and business intelligence, has begun. To make it easier to decide to submit an application and take the online test, we asked the program's teachers what to expect from studying and from working with data.

Pavel Klemenkov, Chief Data Scientist at NVIDIA and teacher of the Big Data and data engineering courses, explained why mathematicians need to write code and why they should study at Ozon Masters for two years.

- Are there many companies that use data science algorithms?

- Actually, quite a lot. Many large companies that have really big data are either starting to work with it effectively or have been working with it for a long time. Clearly, part of the market uses data that fits in an Excel spreadsheet or can be processed on a single large server, but it cannot be said that only a few businesses are able to work with data.

- Tell us a little about projects where data science is used.

- For example, while working at Rambler, we built an advertising system that runs on RTB (Real Time Bidding) principles. We needed to build many models that would optimize ad buying or, for example, predict the probability of a click, a conversion, and so on. At the same time, an advertising auction generates a lot of data: logs of site requests to potential ad buyers, ad impression logs, click logs. That comes to tens of terabytes of data per day.

Moreover, in these tasks we observed an interesting phenomenon: the more data you give the model for training, the higher its quality. Usually, after a certain amount of data, prediction quality stops improving, and to keep increasing accuracy you need a fundamentally different model, a different approach to preparing data and features, and so on. Here we loaded more data and quality kept growing.

This is a typical case where analysts first had to work with large data sets just to run an experiment at all, and where it was impossible to get by with a small sample that fits on a cozy MacBook. At the same time, we needed distributed models, because otherwise they could not be trained. With the arrival of computer vision in production, such examples are becoming more common, since images are a large amount of data, and training a large model requires millions of images.

The question immediately arises: how to store all this information, how to process it efficiently, and how to use distributed learning algorithms. The focus shifts from pure mathematics to engineering. Even if you do not write production code, you need to be able to work with engineering tools in order to run an experiment.

- How has the approach to data science vacancies changed in recent years?

- Big data has stopped being hype and has become reality. Hard drives are cheap, which means it is possible to collect all the data so that in the future there is enough to test any hypothesis. As a result, knowledge of tools for working with big data is becoming very popular, and consequently more and more vacancies for data engineers are appearing.

In my understanding, the result of a data scientist's work is not an experiment but a product that has reached production. From that point of view, before the hype around big data arrived, the process was simpler: engineers did machine learning to solve specific problems, and there were no problems with bringing algorithms to production.

- What does it take to remain a sought-after specialist?

- Many people have now come to data science after studying mathematics and the theory of machine learning and taking part in data analysis competitions, where a ready-made infrastructure is provided: the data is cleaned, the metrics are defined, and there is no requirement that the solution be reproducible and fast.

As a result, people come to work poorly prepared for the realities of business, and a gap forms between newcomers and experienced developers.

With the development of tools that let you assemble your own model from ready-made modules (Microsoft, Google and many others already have such solutions) and with the automation of machine learning, this gap will become even more pronounced. In the future, the profession will need serious researchers who come up with new algorithms, and employees with advanced engineering skills who will deploy models and automate processes. The Ozon Masters course in data engineering is designed to develop engineering skills and the ability to use distributed machine learning algorithms on big data. We are trying to narrow the gap between what a data scientist can do and what they should be able to do in practice.

- Why should a mathematician with a degree go and study at a business?

- The Russian data science community has come to understand that skill and experience are converted into money very quickly. As soon as a specialist gains practical experience, their price starts to rise fast, and the most skilled people are very expensive. This is true at the current stage of the market's development.

A large part of a data scientist's job is to go into the data, understand what lies there, consult with the people who are responsible for the business processes and who generate that data, and only then use it to build models. To start working with big data, it is extremely important to have engineering skills: this makes it much easier to avoid sharp corners, of which there are many in data science.

A typical story: you wrote a SQL query that is executed by the Hive framework running on big data. The query is processed in ten minutes, in the worst case in an hour or two, and often, when you get the results back, you realize that you forgot to take some factor or additional information into account. You have to resubmit the query and wait those minutes and hours again. If you are an efficiency genius, you will pick up another task in the meantime, but, as practice shows, we have few efficiency geniuses, and people just wait. Therefore, in the course we will devote a lot of time to working efficiently, so that from the start you write queries that run not for two hours but for a few minutes. This skill multiplies productivity, and with it the value of a specialist.

- How does Ozon Masters differ from other courses?

- Ozon Masters is taught by Ozon employees, and the assignments are based on real business cases that are solved in companies. In fact, besides the lack of engineering skills, a person who studied data science at a university has another problem: a business task is formulated in business language, and its goal is quite simple: to make more money. A mathematician knows well how to optimize mathematical metrics, but coming up with an indicator that correlates with a business metric is hard. You need to understand that you are solving a business problem and, together with the business, formulate metrics that can be optimized mathematically. This skill is acquired through real cases, and Ozon provides them.

And even if we set the cases aside, the school is taught by many practitioners who solve business problems at real companies. As a result, the teaching approach itself is more practice-oriented. At least in my course, I will try to shift the focus to how to use the tools, what approaches exist, and so on. Together with the students, we will see that every task has its own tool, and every tool has its own area of applicability.

- The best-known data analysis training program is, of course, ShAD. What exactly is the difference from it?

- Clearly, both ShAD and Ozon Masters, in addition to their educational function, solve the local problem of training personnel. Top ShAD graduates are primarily recruited into Yandex, but the catch is that Yandex, because of its specifics (it is big, and it was created when there were few good tools for working with big data), has its own infrastructure and tools for working with data, which means you will have to master them. Ozon Masters has a different message: if you have completed the program and Ozon or one of the other 99% of companies invites you to work, it will be much easier to start benefiting the business. The skill set acquired at Ozon Masters will be enough to simply start working.

- The course lasts two years. Why do you need to spend so much time on this?

- Good question. It takes a long time because, in terms of content and the level of the teachers, this is a full-fledged master's program that takes a lot of time to master, including homework.

From the perspective of my course, expecting a student to spend 2-3 hours a week on assignments is normal. First, the assignments are done on a training cluster, and any shared cluster means that several people use it at the same time. That is, you will have to wait for your job to start running, and some resources may be taken away and given to a higher-priority queue. Second, any work with big data takes a lot of time.

If you have more questions about the program, working with big data, or engineering skills, Ozon Masters is holding an online open day on Saturday, April 25 at 12:00. We will meet with teachers and students on Zoom and on YouTube.

Source: www.habr.com
