Industrial Machine Learning: 10 Design Principles

Today, new services, applications and other important programs appear every day that make it possible to build remarkable things: from the software that controls a SpaceX rocket to talking to a kettle in the next room from a smartphone.

Sooner or later every novice programmer, whether a passionate startup founder or an ordinary Full Stack developer or Data Scientist, comes to realize that there are certain rules for programming and building software that make life much easier.

In this article I will briefly describe 10 principles for designing industrial machine learning so that it can be integrated easily into an application or service. They are based on the 12-factor App methodology proposed by the Heroku team. My aim is to raise awareness of this approach, which can help many developers and data science people.

This article is a prologue to a series of articles about industrial machine learning. In them I will go on to discuss how to actually build a model and launch it into production, how to create an API for it, and examples from various fields and companies that have ML built into their systems.

Principle 1: One codebase

In the early stages, some programmers forget about Git, out of laziness to figure it out (or for reasons of their own). Either they forget the word entirely, that is, they toss files to each other on a shared drive, paste text around or send it by carrier pigeon, or they do not think through their workflow, and everyone commits to their own branch and then to master.

This principle states: have one codebase and many deployments.

Git can be used both in production and in research and development (R&D), where it is used less often.

For example, in the R&D phase you can leave commits with different data processing methods and models, so that you can later select the best one and easily continue working with it.

In production it is indispensable: you will need to look constantly at how your code changes and know which model produced the best results, which code worked in the end, and what happened that caused it to stop working or start producing incorrect results. That is what commits are for!

You can also build a package from your project, host it on Gemfury, for example, and then simply import functions from it in other projects so that you do not rewrite them 1000 times. More on that later.

Principle 2: Clearly declare and isolate dependencies

Every project uses libraries that you import from outside in order to apply them somewhere. Whether they are Python libraries, libraries in other languages for various purposes, or system tools, your task is to:

  • Declare dependencies explicitly, that is, keep a file that lists all the libraries, tools and their versions that are used in your project and must be installed (in Python, for example, this can be done with a Pipfile or requirements.txt; a helpful guide: realpython.com/pipenv-guide)
  • Isolate the dependencies of your program during development. You do not want to keep changing versions and reinstalling, for example, Tensorflow, do you?

This way, developers who join your team later can quickly get familiar with the libraries and versions used in your project, and you can manage the versions and the libraries installed for a specific project, which helps you avoid incompatibilities between libraries or their versions.

Your application should also not rely on system tools that happen to be installed on a particular OS. These tools must be declared in the dependency manifest as well. This avoids situations where the version of a tool (or its availability) does not match the system tools of a particular OS.

So even though curl is available on almost every computer, you should still declare it as a dependency, because when you migrate to another platform it may be missing, or its version may not be the one you originally needed.
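A declared system tool can also be checked at startup. The following is only a minimal sketch, assuming the project keeps its own list of required tools; the list name `REQUIRED_TOOLS` and the helper `missing_tools` are hypothetical and not part of any library.

```python
import shutil

# Hypothetical list of the system tools this project shells out to.
REQUIRED_TOOLS = ["curl", "git"]


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

Calling `missing_tools(REQUIRED_TOOLS)` before the application starts gives an early, readable failure instead of a crash halfway through a pipeline.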

For example, your requirements.txt might look like this:

# Model Building Requirements
numpy>=1.18.1,<1.19.0
pandas>=0.25.3,<0.26.0
scikit-learn>=0.22.1,<0.23.0
joblib>=0.14.1,<0.15.0

# testing requirements
pytest>=5.3.2,<6.0.0

# packaging
setuptools>=41.4.0,<42.0.0
wheel>=0.33.6,<0.34.0

# fetching datasets
kaggle>=1.5.6,<1.6.0

Principle 3: Configurations

Many have heard stories of developers who accidentally pushed code with passwords and AWS keys to a public GitHub repository and woke up the next day owing $6,000, or even $50,000.

Of course, these cases are extreme, but they are very telling. If you store your credentials or other configuration data inside the code, you are making a mistake, and I think there is no need to explain why.

The alternative is to store configuration in environment variables. You can read more about environment variables here.

Examples of data that is typically stored in environment variables:

  • Domain names
  • API URLs/URIs
  • Public and private keys
  • Contacts (email addresses, phone numbers, etc.)

This way you do not have to change the code every time your configuration changes, which saves you time, effort and money.

For example, if you use the Kaggle API to run tests (say, you download a dataset and run the model on it to check at deployment time that the model works well), then the private Kaggle credentials, such as KAGGLE_USERNAME and KAGGLE_KEY, should be stored in environment variables.
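Reading those two variables might look like the sketch below. It uses only the standard library; the function name `load_kaggle_credentials` and the returned dictionary layout are assumptions made for illustration, not part of the Kaggle client.

```python
import os


def load_kaggle_credentials(env=None):
    """Read Kaggle credentials from environment variables.

    Raises a clear error instead of falling back to hard-coded values.
    `env` defaults to os.environ; passing a dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("KAGGLE_USERNAME", "KAGGLE_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {"username": env["KAGGLE_USERNAME"], "key": env["KAGGLE_KEY"]}
```

Failing loudly when a variable is absent is a deliberate choice here: a missing secret is caught at startup rather than in the middle of a test run.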

Principle 4: Third-party services

The idea here is to write the program so that, as far as the code is concerned, there is no difference between local and third-party resources. For example, you can connect either a local MySQL database or a third-party one. The same goes for various APIs such as Google Maps or the Twitter API.

To disconnect a third-party service or connect a different one, you only need to change the keys in the environment-variable configuration that I described in the section above.

So, for example, instead of hard-coding the path to the dataset files every time, it is better to use the pathlib library and declare the path to the datasets in config.py. Then, whichever service you use (CircleCI, for example), the program can work out the path to the datasets given the file system layout of the new service.
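A config.py along those lines could be sketched as follows. The directory layout, the `DATASET_DIR` environment variable and the file name `train.csv` are all hypothetical; only `pathlib` and `os` from the standard library are used.

```python
# config.py (assumed to sit in the project root)
import os
from pathlib import Path

# Resolve paths relative to this file rather than the current working directory.
# Fall back to the working directory when __file__ is undefined (e.g. in a REPL).
PROJECT_ROOT = Path(__file__).resolve().parent if "__file__" in globals() else Path.cwd()

# DATASET_DIR is an assumed variable name; it lets a CI service override the default.
DATASET_DIR = Path(os.environ.get("DATASET_DIR", PROJECT_ROOT / "datasets"))

# Hypothetical dataset file used by the training code.
TRAIN_FILE = DATASET_DIR / "train.csv"
```

The rest of the code would then import `TRAIN_FILE` from this module instead of spelling out a path, so moving to a new service means changing one environment variable.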

Principle 5: Build, release, run

Many people in Data Science find it useful to improve their software engineering skills. If we want our program to crash as rarely as possible and to run without failures for as long as possible, we need to split the process of shipping a new version into 3 stages:

  1. Build stage. You turn your bare code and its individual resources into a so-called package that contains all the necessary code and data. This package is called a build.
  2. Release stage. Here we attach our configuration to the build; without it we could not release the program. The result is a release that is fully ready to launch.
  3. Run stage. Here we launch the application by running the required processes from our release.

Such a system for shipping new versions of a model or of the whole pipeline lets you separate roles between administrators and developers, lets you track versions, and prevents unwanted downtime.
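The three stages can be pictured with a toy model. This is only an illustrative sketch and not how any particular CI service represents builds: the `Build` and `Release` classes and the `make_release` and `run` functions are invented for this example.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class Build:
    """Build stage output: packaged code and data, identified by a version."""
    version: str
    artifact: str


@dataclass(frozen=True)
class Release:
    """Release stage output: one build plus the configuration it runs with."""
    build: Build
    config: MappingProxyType  # read-only view of the configuration


def make_release(build, config):
    """Release stage: combine an immutable build with a copy of its configuration."""
    return Release(build=build, config=MappingProxyType(dict(config)))


def run(release):
    """Run stage: start the processes described by the release (stubbed as a string)."""
    keys = sorted(release.config)
    return f"running {release.build.artifact} v{release.build.version} with {keys}"
```

Because both objects are read-only, changing the code or the configuration means producing a new build or a new release, which is what makes versions traceable and rollbacks simple.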

Many services exist for the release task. In them you describe the processes to run in a .yml file (in CircleCI, for example, this is config.yml). Wheel is a good fit for building packages for your projects.

You can build packages with different versions of your machine learning model, and then refer to the required packages and versions in order to use the functions you wrote in them. This will help you create an API for your model, and your package can be hosted on Gemfury, for example.

Principle 6: Run your model as one or more processes

Processes should not share data. That is, processes must exist separately, and all kinds of data must live separately too, for example in third-party services such as MySQL or others, depending on what you need.

In other words, it is not a good idea to store data inside the file system of the process, because that data may be wiped during the next release, a configuration change, or a migration of the system the program runs on.

There is one exception: in machine learning projects you can keep a cache of libraries so that you do not reinstall them every time you launch a new version, as long as no libraries have been added and no versions have changed. This reduces the time it takes to launch your model in production.

To run the model as several processes, you can create a .yml file in which you specify the required processes and their order.

Principle 7: Disposability

The processes that run your model in the application should be easy to start and stop. This lets you deploy code and configuration changes quickly, scale quickly and flexibly, and avoid breaking the working version.

That is, your model process should:

  • Minimize startup time. Ideally, startup time (from the moment the start command is issued until the process is up and running) should be no more than a few seconds. The library caching described above is one way to reduce startup time.
  • Shut down correctly. That is, the process actually stops listening on the service port, and new requests sent to that port are not processed. Here you either need good communication with the DevOps engineers or need to understand how it works yourself (the latter is preferable, of course, but communication should be maintained on any project!)
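A correct shutdown usually starts with handling the stop signal. The sketch below uses the standard `signal` and `threading` modules; the `serve` loop and its `get_request` and `handle_request` callables are hypothetical stand-ins for a real server's accept and predict steps.

```python
import signal
import threading

# Set once the process has been asked to stop.
shutdown_requested = threading.Event()


def handle_stop_signal(signum, frame):
    """Signal handler: only record the request; the loop finishes the current item."""
    shutdown_requested.set()


def install_handlers():
    """Register the handler for the signals process managers commonly send.

    Must be called from the main thread, as the signal module requires.
    """
    signal.signal(signal.SIGTERM, handle_stop_signal)
    signal.signal(signal.SIGINT, handle_stop_signal)


def serve(get_request, handle_request):
    """Handle requests until a stop is requested; return how many were handled."""
    handled = 0
    while not shutdown_requested.is_set():
        request = get_request()
        if request is None:  # nothing left to process
            break
        handle_request(request)
        handled += 1
    return handled
```

The request being handled when the signal arrives is completed, and no new request is picked up afterwards, which matches the behaviour described in the second bullet.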

Principle 8: Continuous Deployment/Integration

Many companies separate the team that develops an application from the team that deploys it (that is, makes it available to end users). This can slow down software development and improvement considerably. It also undermines the DevOps culture, in which development and integration are, roughly speaking, combined.

This principle therefore states that your development environment should be as close as possible to your production environment.

This allows you to:

  1. Reduce release time by as much as tenfold.
  2. Reduce the number of errors caused by code incompatibility.
  3. Reduce the workload on staff, since the developers and the people who deploy the application are now one team.

Tools that help with this include CircleCI, Travis CI, GitLab CI and others.

You can quickly add to the model, update it and launch it right away, and if something fails it is easy to roll back to the working version so quickly that the end user does not even notice. This is especially easy and fast if you have good tests.

Minimize the differences!!!

Principle 9: Your logs

Logs are events, usually recorded in text form, that occur inside the application (an event stream). A simple example: "2020-02-02 - system level - process name." They are designed so that the developer can literally see what is happening while the program runs: they follow the progress of the processes and check whether it matches what was intended.

This principle states that you should not store your logs inside your file system. You should simply "output" them to the screen, for example by writing them to the system's standard output. This also makes it possible to watch the stream in the terminal during development.

Does this mean that there is no need to save logs at all? Of course not. Your application just should not do it; leave that to third-party services. Your application can only forward logs to a specific file or to a terminal for real-time viewing, or forward them to a general-purpose data storage system (such as Hadoop). The application itself should not store or interact with logs.
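With Python's standard `logging` module, writing the event stream to standard output takes only a few lines. This is a minimal sketch; the logger name `model_service` and the message format are assumptions chosen to resemble the example above.

```python
import logging
import sys


def get_logger(name="model_service"):
    """Return a logger that writes to standard output only (no log files)."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(levelname)s - %(name)s - %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records from being emitted again by the root logger
    return logger
```

A call such as `get_logger().info("model loaded")` then prints one line to the terminal, and whatever runs the process decides where that stream is collected or stored.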

Principle 10: Test!

For industrial machine learning this stage is extremely important, because you need to be sure that the model works correctly and produces what you wanted.

Tests can be written with pytest and run against a small dataset if you have a regression or classification task.
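As an illustration, the tests below follow pytest's naming convention (`test_*` functions with plain `assert` statements), so pytest would collect them if they were saved in a file such as `test_model.py`. The tiny least-squares "model" and the four-point dataset are stand-ins invented for this sketch; a real project would test its own model in the same way.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a single input."""
    slope, intercept = model
    return slope * x + intercept


# Tiny hand-made dataset that follows y = 2x + 1 exactly.
XS = [0.0, 1.0, 2.0, 3.0]
YS = [1.0, 3.0, 5.0, 7.0]


def test_model_recovers_known_relationship():
    slope, intercept = fit_line(XS, YS)
    assert abs(slope - 2.0) < 1e-9
    assert abs(intercept - 1.0) < 1e-9


def test_prediction_on_unseen_point():
    model = fit_line(XS, YS)
    assert abs(predict(model, 10.0) - 21.0) < 1e-9
```

Keeping the dataset this small means the tests run in milliseconds, so they can run on every commit in the CI service.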

Do not forget to set the same seed for deep learning models so that they do not produce different results on every run.
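A seeding helper might look like the sketch below. The function names `set_seed` and `noisy_init` are hypothetical; only the standard `random` module is required, NumPy is seeded if it happens to be installed, and the framework-specific calls are mentioned in a comment rather than executed.

```python
import random


def set_seed(seed):
    """Seed the random number generators this sketch knows about."""
    random.seed(seed)
    try:
        import numpy as np  # optional: only seeded when NumPy is installed
        np.random.seed(seed)
    except ImportError:
        pass
    # Deep learning frameworks have their own seeding calls, for example
    # torch.manual_seed(seed) in PyTorch or tf.random.set_seed(seed) in TensorFlow.


def noisy_init(n):
    """Stand-in for random weight initialisation."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]
```

Calling `set_seed` with the same value before each run makes `noisy_init` return the same numbers, which is what lets a test compare results between runs.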

This was a brief description of the 10 principles. Of course, it is hard to use them without trying them out and seeing how they work, so this article is only a prologue to a series of articles in which I will show how to create industrial machine learning models, how to integrate them into systems, and how these principles can make life easier for all of us.

I will also try to make use of any good principles that readers leave in the comments.

Source: www.habr.com
