Industrial Machine Learning: 10 Design Principles

Nowadays new services, applications and other important programs appear every day, and they make remarkable things possible: from software that controls a SpaceX rocket to an app that talks to the kettle in the next room through a smartphone.

Sooner or later every novice programmer, whether an enthusiastic startup founder or an ordinary Full Stack developer or Data Scientist, realizes that there are certain rules for writing and building software that make life much simpler.

In this article I briefly describe 10 principles for building industrial machine learning so that it can be integrated easily into an application or service. They are based on the 12-factor App methodology proposed by the Heroku team. My aim is to raise awareness of this approach, which can help many developers and data science people.

This article is a prologue to a series about industrial Machine Learning. In the series I will also talk about how to build a model and take it to production, how to create an API for it, and give examples from different fields and companies that have built ML into their systems.

Principle 1: One codebase

Some programmers in the early stages, too lazy to figure it out (or for reasons of their own), forget about Git. Either they forget the word entirely, that is, they pass files to each other through a shared drive, paste text around or send it by carrier pigeon, or they do not think through their workflow and each commit to their own branch and then to master.

This principle states: have one codebase and many deployments.

Git can be used in production as well as in research and development (R&D), where it also sees frequent use.

For example, in the R&D phase you can keep commits with different data processing methods and models, so that you can pick the best one and easily continue working from it.

Secondly, in production it is indispensable: you need to track how your code changes all the time and know which model produced the best results, which code worked in the end, and what change caused it to stop working or start producing incorrect results. That is what a commit history is for!

You can also build a package from your project, host it, for example, on Gemfury, and then simply import functions from it into other projects so that you do not rewrite them 1000 times. More on that later.

Principle 2: Explicitly declare and isolate dependencies

Every project uses various libraries that you import from outside. Whether they are Python libraries, libraries in other languages for other purposes, or system tools, your task is to:

  • Declare dependencies explicitly, that is, keep a file that lists all the libraries, tools and versions that your project uses and that must be installed (in Python, for example, this can be done with a Pipfile or requirements.txt; a link that explains this well: realpython.com/pipenv-guide)
  • Isolate the dependencies for your program during development. You do not want to keep changing versions and reinstalling, say, Tensorflow, do you?

This way, developers who join your team later can quickly see which libraries and versions the project uses, and you can control the versions and libraries installed for a specific project, which helps you avoid incompatibilities between libraries or their versions.

Your application also should not rely on system tools that happen to be installed on a particular OS. These tools must be declared in the dependency manifest too. This avoids situations where the version of a tool (or its very presence) does not match the system tools of a particular OS.

So even though curl is available on almost every computer, you should still declare it as a dependency, because after a move to another platform it may be missing, or its version may not be the one you originally needed.

For example, your requirements.txt could look like this:

# Model Building Requirements
numpy>=1.18.1,<1.19.0
pandas>=0.25.3,<0.26.0
scikit-learn>=0.22.1,<0.23.0
joblib>=0.14.1,<0.15.0

# testing requirements
pytest>=5.3.2,<6.0.0

# packaging
setuptools>=41.4.0,<42.0.0
wheel>=0.33.6,<0.34.0

# fetching datasets
kaggle>=1.5.6,<1.6.0
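
A requirements.txt file only covers Python packages, so system tools such as curl need a separate check. The sketch below is one possible way to do that at startup using only the standard library; the list REQUIRED_SYSTEM_TOOLS and the function name are hypothetical and not part of the original article.

```python
import shutil

# Hypothetical list of system tools this project shells out to.
REQUIRED_SYSTEM_TOOLS = ["curl"]


def missing_system_tools(tools):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_system_tools(REQUIRED_SYSTEM_TOOLS)
if missing:
    print(f"Missing system tools: {', '.join(missing)}")
else:
    print("All declared system tools are available.")
```

In a real service you would probably stop the launch instead of printing a message when something is missing.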

Principle 3: Configuration

Many people have heard stories about developers who accidentally pushed code with passwords and AWS keys to a public GitHub repository and woke up the next day owing $6,000, or even $50,000.

Of course, these cases are extreme, but they are very telling. If you store your credentials or other configuration data inside the code, you are making a mistake, and I think there is no need to explain why.

The alternative is to store configuration in environment variables.

Examples of data that is usually stored in environment variables:

  • Domain names
  • API URLs/URIs
  • Public and private keys
  • Contacts (email addresses, phone numbers, etc.)

This way you do not have to change the code every time your configuration changes, which saves you time, effort and money.

For example, if you use the Kaggle API for testing (say, you download a dataset and run the model on it to check at launch that the model works well), then the private keys from Kaggle, such as KAGGLE_USERNAME and KAGGLE_KEY, should be stored in environment variables.
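
As a minimal sketch of this idea, the function below reads the two Kaggle variables named above from the environment and fails loudly if either is missing. The function name and the returned dictionary shape are assumptions made for illustration.

```python
import os


def load_kaggle_credentials(environ=os.environ):
    """Read Kaggle credentials from environment variables, failing loudly if absent."""
    required = ("KAGGLE_USERNAME", "KAGGLE_KEY")
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {"username": environ["KAGGLE_USERNAME"], "key": environ["KAGGLE_KEY"]}
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.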

Principle 4: Third-party services

The idea here is to write the program so that, from the code's point of view, there is no difference between local and third-party resources. For example, you can connect to a local MySQL instance or to a third-party one in the same way. The same goes for APIs such as Google Maps or the Twitter API.

To disconnect a third-party service or connect a different one, you only need to change the keys in the configuration stored in environment variables, which I described in the previous section.
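
One common way to make local and third-party databases interchangeable is to describe the connection with a single URL taken from the environment. The sketch below assumes a variable called DATABASE_URL and a local default; both names are hypothetical and only illustrate the idea.

```python
import os
from urllib.parse import urlsplit

# Hypothetical default pointing at a local MySQL instance used during development.
DEFAULT_DATABASE_URL = "mysql://localhost:3306/models"


def database_settings(environ=os.environ):
    """Parse the database location from DATABASE_URL (an assumed variable name)."""
    url = urlsplit(environ.get("DATABASE_URL", DEFAULT_DATABASE_URL))
    return {
        "scheme": url.scheme,
        "host": url.hostname,
        "port": url.port,
        "user": url.username,
        "database": url.path.lstrip("/"),
    }
```

Switching from the local database to a hosted one then means changing one environment variable, with no code changes.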

So, for example, instead of writing the path to the dataset files inside the code every time, it is better to use the pathlib library and declare the dataset path in config.py. Then, whichever service you run on (CircleCI, for example), the program can find the datasets given the file system layout of the new service.
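
A config.py along these lines might look like the sketch below. The directory and file names, and the DATASET_DIR environment variable, are assumptions for illustration; the article does not prescribe a particular layout.

```python
# config.py (a hypothetical layout; adjust names to your project)
import os
from pathlib import Path

# Root of the package, resolved relative to this file rather than hard-coded.
# Falls back to the current directory when __file__ is unavailable.
PACKAGE_ROOT = Path(__file__).resolve().parent if "__file__" in globals() else Path.cwd()

# DATASET_DIR is an assumed environment variable that lets a service such as
# a CI runner point the code at a different location without code changes.
DATASET_DIR = Path(os.environ.get("DATASET_DIR", PACKAGE_ROOT / "datasets"))

TRAINING_DATA_FILE = DATASET_DIR / "train.csv"
TESTING_DATA_FILE = DATASET_DIR / "test.csv"
```

Other modules would then import TRAINING_DATA_FILE from config instead of building paths themselves.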

Principle 5: Build, release, run

Many people in Data Science find it useful to improve their software engineering skills. If we want our program to crash as rarely as possible and to run without failures for as long as possible, we need to split the process of releasing a new version into 3 stages:

  1. The build stage. You turn your bare code and its individual resources into a so-called package that contains all the necessary code and data. This package is called a build.
  2. The release stage. Here we attach our config to the build; without it we could not release the program. The result is a release that is fully ready to launch.
  3. The run stage. Here we launch the application by starting the required processes from our release.

A system like this for releasing new versions of a model, or of the whole pipeline, lets you separate roles between administrators and developers, track versions, and prevent unwanted stoppages of the program.

Many services have been created for the release task. In them you describe the processes to run in a .yml file (in CircleCI, for example, this is the config.yml that drives the process). The wheel format is well suited to building packages for projects.

You can build packages with different versions of your machine learning model, then publish them and refer to the required packages and versions to use the functions you wrote. This helps you create an API for your model, and the package can be hosted on Gemfury, for example.

Principle 6: Run your model as one or more processes

Processes should not share data. That is, processes must exist separately, and every kind of data must live separately too, for example in third-party services such as MySQL or others, depending on what you need.

In other words, it is not a good idea to store data inside the process's file system, because that data may be wiped during the next release, a configuration change, or a migration of the system the program runs on.

There is one exception: for machine learning projects you can keep a cache of libraries so that they are not reinstalled every time you launch a new version, provided no libraries were added and no versions changed. This shortens the time it takes to launch your model in production.

To run the model as several processes, you can create a .yml file in which you list the required processes and their order.

Principle 7: Disposability

The processes that run in your model application should be easy to start and stop. This lets you deploy code and configuration changes quickly, scale quickly and flexibly, and prevent breakage of the working version.

That is, the process that runs your model should:

  • Minimize startup time. Ideally, the startup time (from the moment the start command is issued until the process is up and working) should be no more than a few seconds. Library caching, described above, is one way to reduce startup time.
  • Shut down correctly. That is, listening on the service port stops, and new requests sent to that port are no longer processed. Here you need either good communication with the DevOps engineers or your own understanding of how this works (preferably the latter, but communication should be maintained in every project anyway!)
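
The "shut down correctly" point can be sketched with the standard signal module: on SIGTERM the worker stops taking new jobs instead of dying mid-request. The GracefulWorker class and its doubling "inference" step are hypothetical stand-ins, not code from the article.

```python
import signal


class GracefulWorker:
    """Processes jobs one at a time and stops cleanly once a stop is requested."""

    def __init__(self):
        self.stop_requested = False
        self.processed = []

    def request_stop(self, signum=None, frame=None):
        # Signature matches what signal.signal() expects from a handler.
        self.stop_requested = True

    def run(self, jobs):
        for job in jobs:
            if self.stop_requested:
                break  # Accept no new work; earlier jobs already finished.
            self.processed.append(job * 2)  # Stand-in for model inference.
        return self.processed


if __name__ == "__main__":
    worker = GracefulWorker()
    # SIGTERM is what most process managers send to ask a process to exit.
    signal.signal(signal.SIGTERM, worker.request_stop)
    print(worker.run([1, 2, 3]))
```

A real service would apply the same idea to its request loop: finish the request in flight, stop listening, then exit.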

Principle 8: Continuous deployment / integration

Many companies keep the application development team separate from the deployment team (the people who make the application available to end users). This can slow software development and improvement down considerably. It also undermines the DevOps culture, in which development and integration are, roughly speaking, combined.

Therefore, this principle states that your development environment should be as close as possible to your production environment.

This will let you:

  1. Reduce release time roughly tenfold.
  2. Reduce the number of errors caused by code incompatibility.
  3. Reduce the workload on staff, since the developers and the people who deploy the application are now one team.

Tools that support this include CircleCI, Travis CI, GitLab CI and others.

You can quickly extend the model, update it and launch it right away, and if something fails it is easy to roll back to the working version fast enough that the end user does not notice. This is especially easy and quick if you have good tests.

Minimize the differences!

Principle 9: Your logs

Logs are events, usually recorded as text, that occur inside an application (an event stream). A simple example: "2020-02-02 - system level - process name". They exist so that the developer can see what is happening while the program runs, follow the progress of its processes, and check whether it matches what was intended.

This principle states that you should not store logs inside your own file system. You should simply "print" them, for example to the system's standard output. That also makes it possible to watch the stream in the terminal during development.

Does this mean logs do not need to be saved at all? Of course not. Your application just should not do it itself; leave that to third-party services. The application's output can be routed to a specific file or a terminal for real-time viewing, or forwarded to a general-purpose data storage system (such as Hadoop). The application itself should not store or manage logs.
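
With Python's standard logging module, writing to standard output instead of a file takes only a few lines. The logger name and the message format below are assumptions chosen to resemble the example line above.

```python
import logging
import sys


def build_logger(name="model_service"):
    """Return a logger that writes to stdout only; storage is left to the platform."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # Avoid duplicate handlers on repeated calls.
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(levelname)s - %(name)s - %(message)s")
        )
        logger.addHandler(handler)
    return logger


logger = build_logger()
logger.info("model loaded")
```

Whatever runs the process (a terminal, a container runtime, a log collector) can then decide where that stream ends up.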

Principle 10: Test!

For industrial machine learning this stage is extremely important, because you need to know that the model works correctly and produces what you wanted.

Tests can be written with pytest and run on a small dataset if you have a regression or classification task.
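
A pytest test is just a function whose name starts with test_ and that uses plain assert statements. The sketch below checks a tiny least-squares fit on a hand-made dataset; fit_line is a hypothetical stand-in for your real model, written in pure Python so the example has no dependencies.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in for a real model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def test_fit_line_recovers_known_relationship():
    # Tiny hand-made dataset generated from y = 2x + 1.
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]
    slope, intercept = fit_line(xs, ys)
    assert abs(slope - 2.0) < 1e-9
    assert abs(intercept - 1.0) < 1e-9
```

Saved in a file such as test_model.py (a hypothetical name), this would be picked up by running pytest in the project directory.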

Do not forget to fix the same seed for deep learning models so that they do not produce different results on every run.
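
A minimal sketch of seeding, assuming only the standard library plus NumPy if it happens to be installed. Deep learning frameworks have their own seeding calls (for example torch.manual_seed in PyTorch or tf.random.set_seed in TensorFlow), which you would add in the same place; sample_weights is a hypothetical stand-in for random weight initialisation.

```python
import random


def set_seed(seed=42):
    """Seed Python's RNG, plus NumPy's if it happens to be installed."""
    random.seed(seed)
    try:
        import numpy as np  # Optional; skipped when NumPy is not available.
    except ImportError:
        return
    np.random.seed(seed)


def sample_weights(n=3):
    # Stand-in for random weight initialisation in a real model.
    return [random.random() for _ in range(n)]
```

Calling set_seed with the same value before each run makes sample_weights return the same numbers, which is what keeps test results comparable between runs.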

This was a brief description of the 10 principles. Of course, it is hard to apply them without trying them and seeing how they work, so this article is only a prologue to a series of articles in which I will show how to build industrial machine learning models, how to integrate them into systems, and how these principles can make life easier for all of us.

I will also try to use any good principles that readers leave in the comments.

Source: www.habr.com
