Industrial Machine Learning: 10 Design Principles

Nowadays, new services, applications and other important programs appear every day that make it possible to do remarkable things: from software that controls a SpaceX rocket to talking to a kettle in the next room from a smartphone.

And sooner or later every novice programmer, whether an enthusiastic beginner, an ordinary Full Stack developer or a Data Scientist, comes to realize that there are certain rules for programming and building software that make life much easier.

In this article I will briefly describe 10 principles for programming industrial machine learning so that it can be integrated easily into an application or service. They are based on the 12-factor App methodology proposed by the Heroku team. My aim is to raise awareness of this technique, which can help many developers and data science people.

This article is a prologue to a series of articles about industrial Machine Learning. In them I will go on to talk about how to actually build a model and launch it in production, how to create an API for it, and give examples from various fields and companies that have ML built into their systems.

Principle 1: One codebase

Some programmers, in the early stages, forget about Git, either because they are too lazy to figure it out or for reasons of their own. Either they forget the word entirely, that is, they toss files to each other on a shared drive, paste text around or send it by carrier pigeon, or they do not think through their workflow, so everyone commits to their own branch and then to master.

This principle states: have one codebase and many deployments.

Git can be used both in production and in research and development (R&D), where it tends to be used less often.

For example, in the R&D phase you can leave commits with different data processing methods and models, so that you can later pick the best one and easily continue working with it.

In production, in turn, Git is indispensable: you need to keep track of how your code changes and know which model produced the best results, which code worked in the end, and what happened when it stopped working or started producing incorrect results. That is what commits are for!

You can also build a package from your project, host it on Gemfury, for example, and then simply import functions from it into other projects instead of rewriting them 1000 times. More on that later.

Principle 2: Explicitly declare and isolate dependencies

Every project uses libraries that you import from outside in order to apply them somewhere. Whether they are Python libraries, libraries in other languages for various purposes, or system tools, your task is to:

  • Declare dependencies explicitly, that is, keep a file that lists every library and tool used in your project, together with its version, that must be installed (in Python, for example, this can be done with a Pipfile or requirements.txt; a helpful guide: realpython.com/pipenv-guide).
  • Isolate the dependencies for your program during development. You do not want to keep changing versions and reinstalling, say, Tensorflow, do you?

This way, developers who join your team later can quickly get familiar with the libraries and versions used in your project, and you can control the versions and libraries installed for a specific project, which helps you avoid incompatibilities between libraries or their versions.

Your application should also not rely on system tools that happen to be installed on a particular OS. These tools must be declared in the dependency manifest as well. This avoids situations where the version of a tool (or its very availability) does not match the system tools of a particular OS.

So even though curl is available on almost every computer, you should still declare it as a dependency, because when you migrate to another platform it may be missing, or its version may not be the one you originally needed.

For example, your requirements.txt might look like this:

# Model Building Requirements
numpy>=1.18.1,<1.19.0
pandas>=0.25.3,<0.26.0
scikit-learn>=0.22.1,<0.23.0
joblib>=0.14.1,<0.15.0

# testing requirements
pytest>=5.3.2,<6.0.0

# packaging
setuptools>=41.4.0,<42.0.0
wheel>=0.33.6,<0.34.0

# fetching datasets
kaggle>=1.5.6,<1.6.0
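One way to make such a manifest useful at startup is to report which declared packages are actually installed. The sketch below is only an illustration: the function names are hypothetical, it uses the standard library's importlib.metadata, and it lists installed versions without evaluating the version ranges themselves.

```python
import re
from importlib import metadata


def declared_packages(requirements_text):
    """Return the package names listed in requirements.txt-style text."""
    names = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def installed_versions(names):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Any package that maps to None is declared but not installed in the current environment, which is usually a sign that the isolated environment was not set up from the manifest.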

Principle 3: Configuration

Many people have heard stories of developers who accidentally pushed code containing passwords and AWS keys to public GitHub repositories and woke up the next day owing $6,000, or even $50,000.

Of course, these cases are extreme, but they are very telling. If you store your credentials or other configuration data inside the code, you are making a mistake, and I think there is no need to explain why.

The alternative is to store configuration in environment variables.

Examples of data typically stored in environment variables:

  • Domain names
  • API URLs/URIs
  • Public and private keys
  • Contacts (email, phone numbers, etc.)

This way you do not have to change the code every time your configuration values change. It saves you time, effort and money.

For example, if you use the Kaggle API to run tests (say, you download a dataset and run the model on it to check at startup that the model works well), then the private Kaggle keys, such as KAGGLE_USERNAME and KAGGLE_KEY, should be stored in environment variables.
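A minimal sketch of what this could look like in Python, assuming the two variable names mentioned above. The function and the exception class are hypothetical helpers for illustration and are not part of the Kaggle client.

```python
import os


class MissingConfigError(RuntimeError):
    """Raised when a required environment variable is not set."""


def load_kaggle_credentials(env=None):
    """Return Kaggle credentials read from environment variables.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    required = ("KAGGLE_USERNAME", "KAGGLE_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail early and loudly instead of falling back to hard-coded values.
        raise MissingConfigError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in required}
```

Failing at startup when a variable is missing keeps secrets out of the repository and makes a misconfigured environment obvious right away.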

Principle 4: Third-party services

The idea here is to write the program so that, as far as the code is concerned, there is no difference between local and third-party resources. For example, you can connect either a local MySQL instance or a third-party one. The same goes for various APIs such as Google Maps or the Twitter API.

To disconnect a third-party service or connect a different one, you only need to change the keys in the configuration held in environment variables, which I covered in the section above.

So, for example, instead of writing the path to the dataset files in the code every time, it is better to use the pathlib library and declare the path to the datasets in config.py. Then, whichever service you run on (CircleCI, for example), the program can work out the path to the datasets from the file system layout of the new service.
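A config.py along these lines might look roughly like the sketch below. The DATASET_DIR variable name, the datasets folder and both function names are assumptions made for illustration.

```python
import os
from pathlib import Path


def get_dataset_dir(env=None, project_root=None):
    """Resolve the dataset directory without hard-coding an absolute path.

    DATASET_DIR is a hypothetical variable name; when it is unset, the
    directory falls back to <project_root>/datasets.
    """
    env = os.environ if env is None else env
    root = Path.cwd() if project_root is None else Path(project_root)
    override = env.get("DATASET_DIR")
    return Path(override) if override else root / "datasets"


def dataset_path(filename, env=None, project_root=None):
    """Build the full path to one dataset file."""
    return get_dataset_dir(env, project_root) / filename
```

The rest of the code only ever calls dataset_path("train.csv"), so moving to another service means changing one environment variable rather than editing paths scattered through the project.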

Principle 5: Build, release, run

Many people in Data Science find it useful to improve their software engineering skills. If we want our program to crash as rarely as possible and to run without failures for as long as possible, we need to split the process of releasing a new version into 3 stages:

  1. The build stage. You turn your bare code and its individual resources into a package that contains all the necessary code and data. This package is called a build.
  2. The release stage. Here we attach our config to the build; without it we could not release our program. The result is a release that is fully ready to launch.
  3. Next comes the run stage. Here we launch the application by starting the necessary processes from our release.
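The three stages above can be sketched as plain data structures. This is a toy model for illustration only: the class names, the MODEL_PATH key and the release numbering scheme are assumptions, and a real run stage would start processes rather than return a string.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class Build:
    """Immutable artifact produced from one commit of the code."""
    version: str
    artifact: str  # e.g. the file name of a built package


@dataclass(frozen=True)
class Release:
    """A build combined with the configuration of one environment."""
    build: Build
    config: MappingProxyType
    release_id: str


def make_release(build, config, number):
    # Copy the config into a read-only mapping so a release cannot drift.
    read_only = MappingProxyType(dict(config))
    return Release(build, read_only, f"{build.version}-r{number}")


def run(release):
    # A real run stage would start processes; here we only describe the launch.
    return (
        f"running {release.build.artifact} ({release.release_id}) "
        f"with MODEL_PATH={release.config['MODEL_PATH']}"
    )
```

The point of the separation is that one build can be released to staging and production with different configs, and rolling back means running an earlier release again rather than rebuilding anything.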

A system like this for releasing new versions of a model or of the whole pipeline lets you separate the roles of administrators and developers, lets you track versions, and prevents unwanted stops of the program.

Many services have been created for the release task, in which you describe the processes to run in a .yml file (in CircleCI, for example, this is config.yml, which drives the process itself). The wheel package is well suited to building project packages.

You can build packages for different versions of your machine learning model, and then refer to the packages and versions you need in order to use the functions you wrote there. This helps you create an API for your model, and the package can be hosted on Gemfury, for example.

Principle 6: Run your model as one or more processes

The processes should not share data. That is, the processes must exist separately and all data must live separately, for example in third-party services such as MySQL or others, depending on what you need.

In other words, you should definitely not store data in the process's own file system, because that data may be wiped at the next release, on a configuration change, or when the program is moved to another system.

There is one exception: in machine learning projects you can keep a cache of libraries so that they are not reinstalled every time you launch a new version, provided no libraries were added and no versions changed. This shortens the time it takes to launch your model in production.

To run the model as several processes, you can create a .yml file in which you list the required processes and their order.

Principle 7: Disposability

The processes that run in your model's application should be easy to start and stop. This lets you deploy code and configuration changes quickly, scale quickly and flexibly, and avoid possible breakage of the working version.

That is, your process with the model should:

  • Minimize startup time. Ideally, startup time (from the moment the start command is issued until the process is up and working) should be no more than a few seconds. Caching libraries, described above, is one way to reduce startup time.
  • Shut down correctly. That is, the process actually stops listening on the service port, and new requests sent to that port are no longer processed. Here you either need good communication with the DevOps engineers or need to understand how it works yourself (preferably the latter, of course, but communication should be maintained on any project!).
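A correct shutdown can be sketched with the standard signal module. This is a simplified, hypothetical worker rather than a real server: on a termination signal it finishes the request in progress and then stops accepting new ones.

```python
import signal


class GracefulWorker:
    """Processes queued requests until a termination signal arrives."""

    def __init__(self):
        self.accepting = True
        self.handled = []

    def request_shutdown(self, signum=None, frame=None):
        # Signal handlers receive (signum, frame); we only flip a flag here.
        self.accepting = False

    def install_signal_handlers(self):
        # SIGTERM is what most process managers send; SIGINT is Ctrl+C.
        # Must be called from the main thread.
        signal.signal(signal.SIGTERM, self.request_shutdown)
        signal.signal(signal.SIGINT, self.request_shutdown)

    def serve(self, requests, handler):
        """Handle requests one by one; stop taking new ones after shutdown."""
        for request in requests:
            if not self.accepting:
                break
            self.handled.append(handler(request))
        return self.handled
```

Because the handler only sets a flag, the request being processed when the signal arrives still completes, and nothing is cut off halfway.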

Principle 8: Continuous deployment / integration

Many companies keep the application development team separate from the deployment team (the one that makes the application available to end users). This can greatly slow down software development and progress in improving it. It also undermines the DevOps culture, in which development and integration are, roughly speaking, combined.

Therefore, this principle states that your development environment should be as close as possible to your production environment.

This lets you:

  1. Cut release time by as much as a factor of ten.
  2. Reduce the number of errors caused by code incompatibility.
  3. Reduce the workload on staff, since the developers and the people deploying the application are now one team.

Tools that support this way of working include CircleCI, Travis CI, GitLab CI and others.

You can quickly add to the model, update it and launch it right away, and if something fails it is easy to roll back to the working version quickly, so that the end user does not even notice. This is especially easy and fast if you have good tests.

Minimize the differences!!!

Principle 9: Your logs

Logs are events, usually recorded in text form, that occur inside the application (an event stream). A simple example: "2020-02-02 - system level - process name." They exist so that the developer can see exactly what happens while the program runs: how the processes progress, and whether that matches what the developer intended.

This principle states that you should not store logs inside your own file system. You should simply "print" them to the screen, for example by writing them to the system's standard output. That also makes it possible to watch the stream in the terminal during development.

Does this mean logs should not be saved at all? Of course not. Your application just should not do it itself; leave that to third-party services. The execution environment can route the stream to a particular file or to a terminal for real-time viewing, or send it to a general-purpose data storage system (such as Hadoop). The application itself should not store or manage logs.
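In Python this can be done with the standard logging module. The sketch below is one possible setup: the get_logger helper and the line format are assumptions, loosely following the "date - level - name" example above.

```python
import logging
import sys


def get_logger(name, stream=None):
    """Return a logger that writes one event per line to a stream.

    The stream defaults to standard output; the application never opens
    or rotates log files itself.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:  # do not stack handlers on repeated calls
        target = stream if stream is not None else sys.stdout
        handler = logging.StreamHandler(target)
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(levelname)s - %(name)s - %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Calling get_logger("train_pipeline").info("model fitted") prints a single timestamped line, and whatever runs the process decides whether that stream goes to a terminal, a file or a log collector.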

Principle 10: Test!

For industrial machine learning this stage is extremely important, because you need to be sure that the model works correctly and produces what you wanted.

Tests can be written with pytest and, if you have a regression or classification task, run against a small dataset.

Do not forget to set the same seed for deep learning models so that they do not produce different results on every run.
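As a rough illustration of both points, here is a pytest-style test on a tiny synthetic dataset with a fixed seed. A pure-Python least-squares fit stands in for a real model, and all names are hypothetical; with an actual framework you would fix the seed through that framework's own seeding function.

```python
import random


def make_dataset(n, seed):
    """Generate a tiny synthetic regression set: y = 3x + 2 plus small noise."""
    rng = random.Random(seed)  # local generator, so the global state is untouched
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [3.0 * x + 2.0 + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def test_fit_is_reproducible_and_accurate():
    first = fit_line(*make_dataset(50, seed=42))
    second = fit_line(*make_dataset(50, seed=42))
    assert first == second  # same seed, same data, same model
    slope, intercept = first
    assert abs(slope - 3.0) < 0.1
    assert abs(intercept - 2.0) < 0.3
```

pytest collects any function whose name starts with test_, so saving this in a file such as test_model.py (a hypothetical name) and running pytest is enough to execute it in a CI pipeline.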

This was a brief description of the 10 principles. They are, of course, hard to use without trying them and seeing how they work, so this article is only a prologue to a series of articles in which I will show how to build industrial machine learning models, how to integrate them into systems, and how these principles can make life easier for all of us.

I will also try to cover any good principles that readers care to leave in the comments.

Source: www.habr.com
