Open Data Hub project: a machine learning platform based on Red Hat OpenShift

The future has arrived: artificial intelligence and machine learning are already being used successfully by your favorite stores, transportation companies, and even turkey farms.

And if something exists, then there is already something about it on the Internet... an open source project! See how Open Data Hub helps you scale new technologies and avoid implementation challenges.

For all the benefits of artificial intelligence (AI) and machine learning (ML), organizations often have difficulty scaling these technologies. The main problems are usually the following:

  • Data exchange and collaboration - it is almost impossible to share data effortlessly and collaborate in fast iterations.
  • Data access - for every task it has to be built anew and by hand, which takes a lot of time.
  • On-demand access - there is no way to get on-demand access to machine learning tools and platforms, or to compute infrastructure.
  • Production - models remain at the prototype stage and are not brought to production use.
  • Tracking and explaining AI results - reproducing, tracking, and explaining AI/ML results is difficult.

Left unresolved, these problems hurt the speed, efficiency, and productivity of valuable data scientists. That leads to frustration and disappointment in their work, and as a result, business expectations for AI/ML go unmet.

Responsibility for solving these problems falls to IT professionals, who must provide data scientists with, that's right, something like a cloud. In more detail, we need a platform that gives freedom of choice and offers convenient, easy access. At the same time, it must be fast, easily reconfigurable, scalable on demand, and fault tolerant. Building such a platform on open source technologies helps avoid vendor lock-in and preserves a long-term strategic advantage in cost control.

A few years ago, something similar was happening in application development, and it led to the emergence of microservices, hybrid clouds, IT automation, and agile processes. To cope with all of this, IT professionals turned to containers, Kubernetes, and open hybrid clouds.

Now this experience is being applied to the challenges of AI. That is why IT professionals are building platforms that are container-based, allow AI/ML services to be created within agile processes, accelerate innovation, and are built with the hybrid cloud in mind.

We will start building such a platform with Red Hat OpenShift, our container Kubernetes platform for the hybrid cloud, which has a rapidly growing ecosystem of software and hardware ML solutions (NVIDIA, H2O.ai, Starburst, PerceptiLabs, and others). Some Red Hat customers, such as BMW Group, ExxonMobil, and others, have already deployed containerized ML toolchains and DevOps processes on top of the platform and its ecosystem to bring their ML architectures into production and speed up the work of data scientists.

Another reason we launched the Open Data Hub project is to demonstrate an example architecture based on several open source software projects and to show how the entire lifecycle of an ML solution can be implemented on top of OpenShift.

Open Data Hub Project

This is an open source project developed within the corresponding developer community. It implements the full cycle of operations, from loading and transforming the initial data to building, training, and maintaining a model, for solving AI/ML problems using containers and Kubernetes on the OpenShift platform. The project can be considered a reference implementation, an example of how to build an open AI/ML-as-a-service solution based on OpenShift and related open source tools such as TensorFlow, JupyterHub, Spark, and others. It is important to note that Red Hat itself uses this project to provide its AI/ML services. In addition, OpenShift integrates with key software and hardware ML solutions from NVIDIA, Seldon, Starburst, and other vendors, which makes it easier to build and run your own machine learning systems.

The Open Data Hub project is aimed at the following categories of users and use cases:

  • A data scientist who needs a solution for running ML projects that is organized like a cloud, with self-service capabilities.
  • A data scientist who needs the widest choice of the latest open source AI/ML tools and platforms.
  • A data scientist who needs access to data sources when training models.
  • A data scientist who needs access to compute resources (CPU, GPU, memory).
  • A data scientist who needs to collaborate and share work with colleagues, receive feedback, and make improvements through rapid iteration.
  • A data scientist who wants to work with developers (and DevOps teams) so that their ML models and results make it into production.
  • A data engineer who needs to give data scientists access to a variety of data sources while complying with regulations and security requirements.
  • An IT system administrator/operator who needs to easily manage the lifecycle (installation, configuration, upgrades) of open source components and technologies. Appropriate management tools are also needed.

The Open Data Hub project combines a range of open source tools to implement the full cycle of AI/ML operations. Jupyter Notebook serves as the primary working tool for data analysis. It is widely popular among data scientists today, and Open Data Hub lets them easily create and manage Jupyter Notebook workspaces using the built-in JupyterHub. In addition to creating and importing Jupyter notebooks, the Open Data Hub project also includes a number of ready-made notebooks in the form of the AI Library.

This library is a collection of open source machine learning components and solutions for common scenarios that simplify rapid prototyping. JupyterHub is integrated with OpenShift's RBAC access model, which lets you use existing OpenShift accounts and implement single sign-on. In addition, JupyterHub offers a user-friendly interface called the spawner, through which the user can easily configure the amount of compute resources (CPU cores, memory, GPU) for the selected Jupyter Notebook.
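
To make the spawner idea more concrete, here is a minimal sketch of how a user's resource choice could be turned into the `resources` section of a Kubernetes container. The `NotebookProfile` class and the `to_container_resources` function are hypothetical names invented for this illustration and are not part of JupyterHub or Open Data Hub. The sketch also assumes that GPUs are exposed under the `nvidia.com/gpu` resource name used by the NVIDIA device plugin.

```python
from dataclasses import dataclass


@dataclass
class NotebookProfile:
    """Hypothetical record of what a user picks in the spawner form."""
    cpu_cores: float
    memory_gib: int
    gpus: int = 0


def to_container_resources(profile: NotebookProfile) -> dict:
    """Translate the profile into a Kubernetes-style `resources` section."""
    if profile.cpu_cores <= 0 or profile.memory_gib <= 0 or profile.gpus < 0:
        raise ValueError("CPU and memory must be positive, GPUs non-negative")
    limits = {
        # ":g" drops a trailing ".0", so 2.0 cores becomes "2".
        "cpu": f"{profile.cpu_cores:g}",
        "memory": f"{profile.memory_gib}Gi",
    }
    if profile.gpus:
        # Assumption: GPUs are advertised by the NVIDIA device plugin.
        limits["nvidia.com/gpu"] = str(profile.gpus)
    # Simplifying assumption: requests are set equal to limits.
    return {"requests": dict(limits), "limits": limits}


resources = to_container_resources(NotebookProfile(cpu_cores=2, memory_gib=8, gpus=1))
print(resources)
```

Setting requests equal to limits is only one possible policy. A real deployment might request less than the limit so that more notebooks can share a node.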

After the data scientist creates and configures a notebook, everything else is handled by the Kubernetes scheduler, which is part of OpenShift. Users simply run their experiments and save and share the results of their work. In addition, advanced users can access the OpenShift CLI shell directly from Jupyter notebooks to use Kubernetes primitives such as Job, or OpenShift functionality such as Tekton or Knative. Alternatively, you can use the convenient OpenShift GUI, called the "OpenShift web console".
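
As an illustration of the Job primitive mentioned above, the following sketch builds a minimal `batch/v1` Job manifest as a Python dictionary and prints it as JSON. The function name, the image reference, and the training command are placeholders chosen for this example only. The resulting JSON could then be passed to the CLI, for instance with `oc apply -f`.

```python
import json
from typing import List


def build_job_manifest(name: str, image: str, command: List[str]) -> dict:
    """Return a minimal Kubernetes Job manifest for a one-off task."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Retry a failed pod at most twice before marking the Job failed.
            "backoffLimit": 2,
            "template": {
                "spec": {
                    # A Job's pod template must use Never or OnFailure.
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": name, "image": image, "command": command}
                    ],
                }
            },
        },
    }


# Placeholder image and command, not taken from the Open Data Hub project.
manifest = build_job_manifest(
    "train-model",
    "registry.example.com/ml/train:latest",
    ["python", "train.py"],
)
print(json.dumps(manifest, indent=2))
```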

Moving on to the next stage, Open Data Hub makes it possible to manage data pipelines. For this, it uses Ceph, which is provided as S3-compatible object storage. Apache Spark lets you stream data from external sources or from the built-in Ceph S3 storage, and also lets you perform initial data transformations. Apache Kafka provides advanced management of data pipelines (where data can be loaded multiple times, along with data transformation, analysis, and persistence operations).
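
Because Ceph is exposed through an S3-compatible API, Spark can usually reach it through the Hadoop S3A connector. The sketch below only assembles the relevant `fs.s3a.*` settings and turns them into `spark-submit` arguments. The endpoint and credentials are placeholders, and the helper names are hypothetical; how Open Data Hub wires these values in practice is not described in this article.

```python
from typing import Dict, List


def s3a_spark_conf(endpoint: str, access_key: str, secret_key: str) -> Dict[str, str]:
    """Spark settings for an S3-compatible store reached via Hadoop S3A."""
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Non-AWS object stores are commonly addressed by path
        # (endpoint/bucket/key) instead of by bucket subdomain.
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.connection.ssl.enabled":
            "true" if endpoint.startswith("https://") else "false",
    }


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Flatten the settings into `--conf key=value` pairs for spark-submit."""
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Placeholder endpoint and credentials for illustration only.
conf = s3a_spark_conf("http://ceph.example.internal:8080", "ACCESS_KEY", "SECRET_KEY")
print(" ".join(to_submit_args(conf)))
```

With PySpark, the same key/value pairs could instead be passed one by one to `SparkSession.builder.config(key, value)`, after which data would be read from paths of the form `s3a://bucket/path`.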

So, the data scientist has gained access to the data and built a model. Now they want to share the results with colleagues or application developers and offer them the model as a service. This requires an inference server, and Open Data Hub has one: it is called Seldon, and it lets you publish a model as a RESTful service.
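
From the consumer's side, a model published this way is just an HTTP endpoint. The following sketch prepares, but does not send, a prediction request using only the Python standard library. The base URL is a placeholder, and the `/api/v1.0/predictions` path with a `data.ndarray` payload is an assumption based on the Seldon Core v1 REST protocol; the path and payload shape may differ depending on the Seldon version deployed.

```python
import json
import urllib.request
from typing import List


def build_prediction_request(base_url: str, rows: List[List[float]]) -> urllib.request.Request:
    """Prepare a POST request carrying feature rows as a JSON payload."""
    # Assumed payload shape: {"data": {"ndarray": [[...], ...]}}
    payload = {"data": {"ndarray": rows}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1.0/predictions",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL; a real route would be created by OpenShift for the model.
request = build_prediction_request(
    "http://models.example.internal", [[5.1, 3.5, 1.4, 0.2]]
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` would send it to a live endpoint and return the model's response.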

At some point there are several such models on the Seldon server, and it becomes necessary to monitor how they are used. For this, Open Data Hub offers a collection of relevant metrics and a reporting engine based on the widely used open source monitoring tools Prometheus and Grafana. As a result, we get feedback for monitoring the use of AI models, particularly in a production environment.
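
Usage metrics collected by Prometheus can be read back through its HTTP API, which is also what Grafana dashboards rely on. The sketch below builds the URL for an instant query against the `/api/v1/query` endpoint. The Prometheus address and the metric name `model_requests_total` are hypothetical; the actual metric names depend on what the deployed inference server exports.

```python
from urllib.parse import urlencode


def instant_query_url(prometheus_url: str, promql: str) -> str:
    """Build the URL of a Prometheus instant query for a PromQL expression."""
    return prometheus_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


# Hypothetical metric: per-model request rate over the last five minutes.
promql = "sum(rate(model_requests_total[5m])) by (model)"
print(instant_query_url("http://prometheus.example.internal:9090", promql))
```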

In this way, Open Data Hub provides a cloud-like approach throughout the entire AI/ML lifecycle, from data access and preparation to model training and production.

Putting it all together

Now the question is how to organize all of this for the OpenShift administrator. This is where the dedicated Kubernetes operator for the Open Data Hub project comes into play.

This operator manages the installation, configuration, and lifecycle of the Open Data Hub project, including the deployment of the tools mentioned above, such as JupyterHub, Ceph, Spark, Kafka, Seldon, Prometheus, and Grafana. The Open Data Hub project can be found in the OpenShift web console, in the community operators section. The OpenShift administrator can thus specify that the relevant OpenShift projects are categorized as an "Open Data Hub project". This is done once. After that, the data scientist logs in to their project space through the OpenShift web console and sees that the Kubernetes operator is installed and available for their projects. They then create an Open Data Hub project instance with a single click and immediately get access to the tools described above. All of this can be configured in a highly available, fault-tolerant mode.
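
Behind that single click, an operator typically receives a custom resource that declares which components should be running and then reconciles the cluster toward that state. The sketch below shows the general shape of such a resource. Everything specific in it is a placeholder: the API group, the kind, and the `enabled` flags under `spec` are invented for illustration and do not reproduce the real schema of the Open Data Hub operator.

```python
import json
from typing import Dict


def build_instance(api_version: str, kind: str, name: str,
                   namespace: str, components: Dict[str, bool]) -> dict:
    """Describe a desired deployment as a Kubernetes-style custom resource."""
    return {
        "apiVersion": api_version,
        "kind": kind,
        "metadata": {"name": name, "namespace": namespace},
        # Hypothetical spec layout: one on/off switch per component.
        "spec": {component: {"enabled": enabled}
                 for component, enabled in components.items()},
    }


# Placeholder API group, kind, and component keys (not the real schema).
instance = build_instance(
    api_version="example.org/v1alpha1",
    kind="ExampleDataHub",
    name="my-data-hub",
    namespace="data-science",
    components={"jupyterhub": True, "spark": True, "kafka": False},
)
print(json.dumps(instance, indent=2))
```

The point of the declarative form is that the data scientist states what they want, and the operator decides how to install, configure, and upgrade it.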

If you want to try the Open Data Hub project yourself, start with the installation instructions and the introductory tutorial. Technical details of the Open Data Hub architecture can be found here, and the project's development plans here. Going forward, we plan to implement additional integration with Kubeflow, resolve a number of issues around data governance and security, and set up integration with the rules-based systems Drools and OptaPlanner. You can share your opinion and become a participant in the Open Data Hub project on the community page.

To recap: serious scaling challenges prevent organizations from realizing the full potential of artificial intelligence and machine learning. Red Hat OpenShift has long been used successfully to solve similar problems in the software industry. The Open Data Hub project, developed within the open source community, offers a reference architecture for organizing the full cycle of AI/ML operations on top of the OpenShift hybrid cloud. We have a clear and well-considered plan for the development of this project, and we are serious about building an active and productive community around it to develop open AI solutions on the OpenShift platform.

source: www.habr.com
