Gartner MQ 2020 Review: Machine Learning and Artificial Intelligence Platforms

I can't really explain why I read this. I simply had the time and was curious how the market works. And according to Gartner it has been a full-fledged market since 2018. In 2014-2016 it was called Advanced Analytics (with roots in BI), in 2017 Data Science (I don't know how to translate that into Russian). For those who like to watch vendors move around the square, you can look here. I will talk about the 2020 square, especially since the changes from 2019 are minimal: SAP dropped out and Altair bought Datawatch.

This is not a systematic analysis or a table. It is a personal view, and a geophysicist's view at that. But I am always curious to read the Gartner MQ; they formulate some points very well. So here are the things I noticed, on the technical side, the market side, and the philosophical side.

This is not for people who are deep into ML, but for people who are curious about what is happening in the market in general.

The DSML market itself logically sits between BI and Cloud AI developer services.

Favorite quotes and terms first:

  • "A Leader may not be the best choice" - the market leader is not necessarily what you need. Very much to the point! For lack of a functional customer, people keep looking for the "best" solution rather than the one that fits.
  • "Model operationalization" - abbreviated as MOPs. And everyone has a hard time with the pugs! (MOPs sounds like the Russian word for pugs; a cool pug theme: making the model actually work.)
  • "Notebook environment" - an important concept where code, comments, data and results come together. It is very clear, promising, and can significantly reduce the amount of UI code.
  • "Rooted in OpenSource" - well said: roots in open source.
  • "Citizen Data Scientists" - easygoing dudes, lamers of a sort, not experts, who need a visual environment and all kinds of helper tools. They will not write code.
  • "Democratise" - often used to mean "make available to a wide range of people." We can say "democratise the data" instead of the dangerous "free the data" that we used to use. "Democratise" is always a long tail, and all the vendors are running after it. Lose in knowledge intensity, gain in accessibility!
  • "Exploratory Data Analysis - EDA" - looking over the data with whatever means are at hand. Some statistics. A little visualization. Something everyone does to one degree or another. I didn't know there was a name for it.
  • "Reproducibility" - preserving as much as possible of the environment parameters, inputs and outputs so that an experiment can be repeated after it has been run. The most important term for an experimental test environment!
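To make the "Reproducibility" bullet concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the "experiment" is a made-up seeded split, and the function names (`fingerprint`, `record_run`, `replay_matches`) are hypothetical, not taken from any of the platforms in the review.

```python
import hashlib
import json
import platform
import random
import sys


def fingerprint(obj):
    """Return a stable SHA-256 hash of any JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_experiment(data, params):
    """Hypothetical experiment: a seeded random split and a mean."""
    rng = random.Random(params["seed"])  # local RNG, no hidden global state
    shuffled = list(data)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * params["train_fraction"])
    train = shuffled[:cut]
    return {"train_size": len(train), "train_mean": sum(train) / len(train)}


def record_run(data, params):
    """Capture environment, inputs, parameters and outputs in one record."""
    result = run_experiment(data, params)
    return {
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "params": params,
        "input_hash": fingerprint(data),
        "result": result,
        "result_hash": fingerprint(result),
    }


def replay_matches(record, data):
    """Re-run with the stored parameters and compare the outputs."""
    if fingerprint(data) != record["input_hash"]:
        return False  # the input changed, so the runs are not comparable
    rerun = run_experiment(data, record["params"])
    return fingerprint(rerun) == record["result_hash"]


data = [3.0, 1.5, 4.0, 1.0, 5.5, 9.0, 2.5, 6.0]
record = record_run(data, {"seed": 42, "train_fraction": 0.75})
print(replay_matches(record, data))
```

The point of the record is that nothing needed for a re-run lives only in someone's head: the seed, the parameters, and a hash of the input all travel with the result.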

So:

Alteryx

A nice interface, like a toy. Scalability, of course, is a bit of a struggle. Accordingly, the Citizen community of engineers gathers around the same trinkets to play with. All the analytics you need in one bottle. It reminded me of Coscad, a spectral-correlation data analysis package that was written in the 90s.

Anaconda

A community built around Python and R experts. The open source side is correspondingly large. It turned out that my colleagues use it all the time. And I didn't know.

Databricks

Consists of three open-source projects. The Spark developers have raised a hell of a lot of money since 2013. I have to quote the wiki:

“In September 2013, Databricks announced that it had raised $13.9 million from Andreessen Horowitz. The company raised an additional $33 million in 2014, $60 million in 2016, $140 million in 2017, $250 million in 2019 (Feb) and $400 million in 2019 (Oct)”!!!

Some serious people built Spark. I don't know who they are, sorry to say!

And the projects are:

  • Delta Lake - ACID on Spark, released recently (something we dreamed of with Elasticsearch). It turns Spark storage into a database: strict schema, ACID, auditing, versions...
  • MLflow - tracking, packaging, management and storage of models.
  • Koalas - the pandas DataFrame API on Spark. pandas is a Python API for working with tables and data in general.
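As a rough picture of what "strict schema, ACID, versions" means in the Delta Lake bullet, here is a toy sketch in plain Python. It is not the Delta Lake API and says nothing about how Delta Lake is implemented; the class `VersionedTable` and its methods are hypothetical and exist only for this illustration.

```python
import copy


class VersionedTable:
    """Toy append-only table: strict schema, atomic appends, version history."""

    def __init__(self, schema):
        self.schema = dict(schema)   # column name -> expected Python type
        self._versions = [[]]        # version 0 is the empty table

    def append(self, rows):
        """Validate every row first, then commit all of them or none."""
        for row in rows:
            if set(row) != set(self.schema):
                raise ValueError(f"columns {sorted(row)} do not match schema")
            for column, expected in self.schema.items():
                if not isinstance(row[column], expected):
                    raise TypeError(f"{column!r} must be {expected.__name__}")
        # Only reached when the whole batch is valid: commit a new version.
        self._versions.append(self._versions[-1] + copy.deepcopy(rows))
        return len(self._versions) - 1

    def read(self, version=None):
        """Read the latest snapshot, or an older one by version number."""
        index = len(self._versions) - 1 if version is None else version
        return copy.deepcopy(self._versions[index])


table = VersionedTable({"well": str, "depth_m": float})
v1 = table.append([{"well": "A-1", "depth_m": 2450.0}])
try:
    # The second row has the wrong type, so the whole batch is rejected.
    table.append([
        {"well": "A-2", "depth_m": 2500.0},
        {"well": "A-3", "depth_m": "deep"},
    ])
except TypeError:
    pass
print(len(table.read()), len(table.read(version=0)))
```

The rejected batch leaves the table untouched (the "all or nothing" part), and older snapshots stay readable by version number.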

For those who don't know Spark or have forgotten, you can read up on it here: link. I watched videos with examples from slightly tedious but thorough presenters: Databricks for Data Science (link) and for Data Engineering (link).

In short, Databricks pulls Spark along. Anyone who wants to use Spark properly in the cloud takes Databricks without hesitation, as intended 🙂 Spark is the main differentiator here.
I learned that Spark Streaming is not true real time but "fake" real time, that is, micro-batching. And if you need real real time, that is Apache Storm. Everyone also says and writes that Spark is better than MapReduce. That is the slogan.
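To show what micro-batching means compared with handling each event as it arrives, here is a small sketch in plain Python. It is only an illustration of the idea, not Spark or Storm code; the helper names `micro_batches` and `per_event` and the sample events are made up.

```python
from collections import defaultdict


def micro_batches(events, batch_seconds):
    """Group (timestamp, value) events into fixed-width time batches.

    A batch is only processed once its window has closed, so a result can
    lag an event by up to batch_seconds. That delay is the "fake" part.
    """
    buckets = defaultdict(list)
    for timestamp, value in events:
        window_start = int(timestamp // batch_seconds) * batch_seconds
        buckets[window_start].append(value)
    return [(start, buckets[start]) for start in sorted(buckets)]


def per_event(events, handler):
    """True streaming, by contrast: call the handler on each event at once."""
    return [handler(value) for _, value in events]


events = [(0.2, 10), (0.9, 20), (1.1, 30), (2.7, 40), (2.8, 50)]
batch_sums = [(start, sum(values)) for start, values in micro_batches(events, 1)]
print(batch_sums)
```

With one-second windows, the five events collapse into three batches, each summed only after its window ends.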

Dataiku

A cool end-to-end thing. Lots of advertising. I don't understand how it differs from Alteryx.

DataRobot

Paxata, for data preparation, was a separate company that DataRobot bought in December 2019. They raised 20 MUSD and sold. All within 7 years.

Data preparation in Paxata rather than in Excel, see here: link.
There are automatic lookups and suggestions for joins between two datasets. A nice thing. For understanding the data, there could be more emphasis on textual information (link).
The Data Catalog is an excellent catalog of useless "live" datasets.
It is also interesting how indexes are built in Paxata (link).

“According to the analyst firm Ovum, the software is made possible through advances in predictive analytics, machine learning and the NoSQL data caching methodology.[15] The software uses semantic algorithms to understand the meaning of a data table's columns and pattern recognition algorithms to find potential duplicates in a data-set.[15][7] It also uses indexing, text pattern recognition and other technologies traditionally found in social media and search software.”
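The quote mentions pattern recognition to find potential duplicates. As a rough idea of what that can look like, here is a small sketch using Python's standard `difflib`. This is an assumption for illustration only: Paxata's actual algorithms are not described in the source, and the function names, the company list and the 0.8 threshold are made up.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences vanish."""
    return " ".join(text.lower().split())


def likely_duplicates(values, threshold=0.8):
    """Return pairs of values whose similarity ratio reaches the threshold."""
    pairs = []
    for left, right in combinations(values, 2):
        ratio = SequenceMatcher(None, normalize(left), normalize(right)).ratio()
        if ratio >= threshold:
            pairs.append((left, right))
    return pairs


companies = [
    "Schlumberger Ltd",
    "Halliburton",
    "Baker Hughes",
    "Schlumberger Limited",
    "Haliburton",
]
print(likely_duplicates(companies))
```

Comparing every pair is quadratic, so this only works for small columns; a real tool would need blocking or indexing first, which may be why the quote mentions indexing alongside pattern recognition.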

DataRobot's main product is here. Their slogan: from Model to Enterprise Application! I found some consulting material for the oil industry in connection with the crisis, but it was trivial and uninteresting: link. I watched their videos on MOPs, or MLOps (link). It is a Frankenstein of sorts, assembled from 6-7 acquisitions of various products.

Of course, it is clear that a large team of Data Scientists must have an environment like this for working with models, otherwise they will produce lots of models and never deploy anything. And in our upstream oil and gas reality, if we could build just one successful model, that would already be great progress!

The process itself strongly reminded me of working with design systems in geology and geophysics, for example Petrel. Everyone who isn't too lazy builds and modifies models. They collect data into the model. Then they make a reference model and send it to production! Between, say, a geological model and an ML model, you can find a lot in common.

Domino

Emphasis on an open platform and collaboration. Business users are let in for free. Their Data Lab is very similar to SharePoint. (And the name strongly smacks of IBM.) All experiments link back to the original dataset. How familiar this is :) As in our practice: some data was dragged into the model, then cleaned and put in order inside the model, and now all of it lives there in the model and the trail back to the source data cannot be found.

Domino has cool infrastructure virtualization. You assemble a machine with as many cores as you need in a second and go compute. How it is done is not immediately clear. Docker is everywhere. Lots of freedom! Any workspaces with the latest versions can be connected. Experiments launch in parallel. Tracking and selection of the successful ones.

Same as DataRobot: results are published to business users in the form of applications. For especially gifted "stakeholders". And the actual usage of the models is monitored too. Everything for the pugs (MOPs)!

I did not fully understand how complex models end up in production. Some kind of API is provided to feed them data and get results back.

H2O

Driverless AI is a very compact and clear system for Supervised ML. Everything in one box. It is not entirely clear at first glance what the backend is.

The model is automatically packaged into a REST server or a Java App. That is a great idea. A lot has been done for Interpretability and Explainability, that is, interpreting and explaining the model's results (which by nature should not be explainable, otherwise a person could compute the same thing?).
For the first time, a case study about unstructured data and NLP. A high-quality architecture diagram. And in general I liked the pictures.

There is a large open-source H2O framework that I do not fully understand (a set of algorithms/libraries?). It has its own visual notebook without programming, like Jupyter (link). I also read about POJO and MOJO models: H2O models wrapped in Java. The first is straightforward, the second comes with optimization. H2O is the only(!) vendor for which Gartner listed text analytics and NLP as strengths, along with their efforts on explainability. This is very important!

Also among the strengths: high performance, optimization, and an industry standard in integration with hardware and clouds.

And the weakness is logical: Driverless AI is weak and narrow compared to their own open source. Data preparation is lame compared to Paxata! And they ignore industrial data: streaming, graph, geo. Well, not everything can be good.

KNIME

I liked the 6 very specific, very interesting business cases on the main page. Strong OpenSource.

Gartner demoted them from Leaders to Visionaries. Making money poorly is a good sign for users, given that the Leader is not always the best choice.

The key word, as with H2O, is "augmented", which means helping the poor citizen data scientists. This is the first time in the review that someone has been criticized for performance! Interesting? That is, is there so much computing power around that performance cannot be a systemic problem at all? Gartner has a separate article on this word "Augmented", which I could not get to.
And KNIME seems to be the first non-American in the review! (And our designers really liked their landing page. Strange people.)

MathWorks

MatLab is an old and honorable comrade known to everyone! Toolboxes for all areas of life and all situations. Something very different. In fact, lots and lots of math for everything in life!

An additional product, Simulink, for system design. I dug into the toolboxes for Digital Twins. I don't understand anything about it, but a lot has been written here. For the oil industry. In general, this is a fundamentally different product, coming from the depths of mathematics and engineering. It is for picking specific mathematical toolkits. According to Gartner, their problems are the same as those of smart engineers: no collaboration, everyone rummages around in their own model, no democracy, no explainability.

RapidMiner

I have come across it and heard a lot about it before (along with Matlab) in the context of good open source. I dug a little into TurboPrep, as usual. I am interested in how to get clean data out of dirty data.

And you can see that the people are good, judging by the 2018 marketing materials and the terrible English spoken in the feature demo.

And people from Dortmund, since 2001, with a strong German background :)

I still don't understand from the site what exactly is available as open source; you need to dig deeper. Good videos about deployment and AutoML concepts.

There is nothing special about the RapidMiner Server backend either. It will probably be compact and work well on premise out of the box. Packaged in Docker. A shared environment exists only on the RapidMiner server. And then there is Radoop: data from Hadoop, with Spark computations inside the Studio workflow.

As expected, the hot young vendors, the "sellers of striped sticks", pushed them down. Gartner, however, predicts their future success in the Enterprise space. There is money to be made there. The Germans know how to do this, holy-holy :) Don't mention SAP!!!

They do a lot for citizens! But from the page you can see that Gartner says they have a problem with sales innovation and that they are fighting not for breadth of coverage but for profitability.

That leaves SAS and Tibco, typical BI vendors to me... And both are at the very top, which confirms my confidence that normal DataScience grows logically out of BI, and not out of clouds and Hadoop infrastructures. Out of the business, that is, and not out of IT. As for example at Gazpromneft (link): a mature DSML environment grows out of strong BI practices. But perhaps that is a stretch and skewed toward MDM and other things, who knows.

SAS

There is not much to say. Only obvious things.

TIBCO

The strategy can be read from the shopping list on the lengthy Wiki page. Yes, a long story, but 28!!! Carl! They bought BI Spotfire (2007) back in my techno-youth. And also reporting from Jaspersoft (2014), then as many as three predictive analytics vendors, Insightful (S-plus) (2008), Statistica (2017) and Alpine Data (2017), event processing and streaming from Streambase System (2013), MDM from Orchestra Networks (2018), and Snappy Data (2019) for an in-memory platform.

Hello, Frankie!

Source: www.habr.com
