Gartner MQ 2020 Review: Machine Learning and Artificial Intelligence Platforms

It is impossible to explain why I read this. I simply had the time and was curious how the market works. And according to Gartner this has been a fully fledged market since 2018. From 2014 to 2016 it was called advanced analytics (with roots in BI), and in 2017 Data Science (I don't know how to translate that into Russian). Those interested in how the vendors move around the quadrant can look here. I will talk about the 2020 quadrant, especially since the changes from 2019 are minimal: SAP dropped out and Altair bought Datawatch.

This is not a systematic analysis or a table. It is a personal view, and from the standpoint of a geophysicist at that. But I always enjoy reading the Gartner MQ; they formulate some points perfectly. So here are the things I paid attention to, technically, commercially and philosophically.

This is not for people who are deep into ML, but for people who are interested in what is happening in the market in general.

The DSML market itself logically sits between BI and AI developer services.

Favorite quotes and terms first:

  • "A Leader may not be the best choice" - the market leader is not necessarily what you need. Very relevant! For lack of a functional customer, people always look for the "best" solution rather than the "suitable" one.
  • "Model operationalization" - abbreviated as MOPs. And everyone has a hard time with the pugs! (In Russian "mops" means pug; the fun image is of a pug making the model work.)
  • "Notebook environment" - an important concept where code, comments, data and results come together. This is very clear, promising, and can significantly reduce the amount of UI code.
  • "Rooted in OpenSource" - well said: it takes root in open source.
  • "Citizen Data Scientists" - easy-going dudes, lamers rather than experts, who need a visual environment and all sorts of auxiliary things. They will not write code.
  • "Democratise" - often used to mean "make available to a wide range of people." We can say "democratise the data" instead of the dangerous "free the data" that we used to say. "Democratise" is always a long tail, and all the vendors run after it. You lose depth of knowledge and gain accessibility!
  • "Exploratory Data Analysis (EDA)" - looking over the data with the means at hand. Some statistics. A little visualization. Something everyone does to one degree or another. I did not know there was a name for it.
  • "Reproducibility" - preserving as much as possible of the environment parameters, inputs and outputs so that an experiment can be repeated after it has been run. The most important term for an experimental test environment!
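To make the last term concrete, here is a minimal sketch of what "preserving the environment, inputs and outputs" could look like using only the Python standard library. The experiment itself (`run_experiment`, a seeded random sample) and the record layout are my own assumptions for illustration; none of the platforms in the review are claimed to work this way.

```python
import hashlib
import json
import platform
import random
import sys


def fingerprint(obj):
    """Stable SHA-256 of any JSON-serialisable object."""
    blob = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def run_experiment(data, params):
    """Hypothetical experiment: a seeded random sample and its mean."""
    rng = random.Random(params["seed"])
    sample = rng.sample(data, params["sample_size"])
    return {"sample": sample, "mean": sum(sample) / len(sample)}


def record_run(data, params):
    """Run the experiment and store what is needed to repeat and verify it."""
    result = run_experiment(data, params)
    return {
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "params": params,
        "input_sha256": fingerprint(data),
        "output_sha256": fingerprint(result),
        "result": result,
    }


data = list(range(100))
params = {"seed": 42, "sample_size": 10}
first = record_run(data, params)
second = record_run(data, params)

# Same inputs, same parameters, same seed: the output fingerprints should match.
print(first["output_sha256"] == second["output_sha256"])
```

Because the seed is part of the recorded parameters, a rerun on the same data produces the same output fingerprint, which is the property the term is about.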

So:

Alteryx

A cool interface, like a toy. Scalability, of course, is a bit harder. Accordingly, there is a Citizen community of engineers around it, with the same kind of tchotchkes to play with. All your analytics in one bottle. It reminded me of Coscad, a spectral data analysis package written in the 1990s.

Anaconda

A community around Python and R experts. The open source is correspondingly large. It turned out that my colleagues use it all the time. And I did not know.

Databricks

It consists of three open-source projects. The Spark developers have raised a hell of a lot of money since 2013. I simply have to quote the wiki:

“In September 2013, Databricks announced that it had raised $13.9 million from Andreessen Horowitz. The company raised an additional $33 million in 2014, $60 million in 2016, $140 million in 2017, $250 million in 2019 (Feb) and $400 million in 2019 (Oct)”!!!

Some great people built Spark. I don't know who they are, I confess!

And the projects are:

  • Delta Lake - ACID on Spark, released recently (something we used to dream about with Elasticsearch). It turns Spark storage into a database: strict schema, ACID, auditing, versions...
  • MLflow - tracking, packaging, management and storage of models.
  • Koalas - the Pandas DataFrame API on Spark. Pandas is the Python API for working with tables and data in general.
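The "strict schema, auditing, versions" idea from the Delta Lake bullet can be illustrated with a toy in plain Python. To be clear, this is not Delta Lake's API or its implementation; `VersionedTable` and everything in it are hypothetical names of mine. It only shows the behaviour described: a batch is validated against the schema before it is committed, every commit is logged with its author, and older versions stay readable.

```python
class VersionedTable:
    """Toy append-only table: strict schema, audit log, readable old versions."""

    def __init__(self, schema):
        self.schema = dict(schema)   # column name -> required Python type
        self._log = []               # one entry per committed batch

    def append(self, rows, author):
        """Validate the whole batch first, then commit it as one new version."""
        for row in rows:
            if set(row) != set(self.schema):
                raise ValueError(f"columns {sorted(row)} do not match schema")
            for column, expected in self.schema.items():
                if not isinstance(row[column], expected):
                    raise TypeError(f"{column!r} must be {expected.__name__}")
        self._log.append({"author": author, "rows": [dict(r) for r in rows]})
        return len(self._log)        # version number after this commit

    def read(self, version=None):
        """Return rows as of a version (default: the latest)."""
        upto = len(self._log) if version is None else version
        return [row for entry in self._log[:upto] for row in entry["rows"]]

    def history(self):
        """Audit trail: (version, author, row count) for every commit."""
        return [(i + 1, e["author"], len(e["rows"])) for i, e in enumerate(self._log)]


wells = VersionedTable({"well": str, "depth_m": float})
wells.append([{"well": "A-1", "depth_m": 2150.0}], author="alice")
wells.append([{"well": "B-7", "depth_m": 3010.5}], author="bob")

try:
    # Wrong type for depth_m: the batch is rejected and nothing is committed.
    wells.append([{"well": "C-3", "depth_m": "deep"}], author="carol")
except TypeError as err:
    print("rejected:", err)

print(len(wells.read()), len(wells.read(version=1)))
print(wells.history())
```

The real projects do this at scale on top of files and a transaction log; the toy only mirrors the guarantees named in the bullet.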

For those who don't know or have forgotten what Spark is, you can look here: link. I watched videos with examples, somewhat tedious but detailed: Databricks for Data Science (link) and for Data Engineering (link).

In short, Databricks pulls Spark along. Anyone who wants to use Spark properly in the cloud takes Databricks without hesitation, as intended 🙂 Spark is the main differentiator here.
I learned that Spark Streaming is not true real time but micro-batching. And if you need real real time, that is Apache Storm. Everyone also says and writes that Spark is better than MapReduce. That is the slogan.
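The difference between micro-batching and event-at-a-time processing is easy to show in a few lines. This is a conceptual sketch in plain Python, not Spark or Storm code; the function names and the one-second interval are assumptions for illustration.

```python
from collections import defaultdict


def micro_batches(events, interval):
    """Group (timestamp, value) events into fixed windows of `interval` seconds."""
    windows = defaultdict(list)
    for timestamp, value in events:
        window_start = int(timestamp // interval) * interval
        windows[window_start].append(value)
    return sorted(windows.items())


def batch_latency(events, interval):
    """Delay between each event and the close of the window it falls into."""
    return [(int(t // interval) + 1) * interval - t for t, _ in events]


events = [(0.2, 1), (0.7, 2), (1.1, 3), (2.9, 4)]

for window_start, values in micro_batches(events, interval=1):
    # A micro-batch engine emits a result only when the window closes.
    print(window_start, sum(values))

# Worst-case extra delay an event pays for being batched (bounded by the interval).
print(max(batch_latency(events, interval=1)))
```

An event that arrives just after a window opens waits almost a full interval before it is processed, which is why micro-batching is "near" real time rather than real time.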

Dataiku

A cool end-to-end thing. There is a lot of advertising. I don't understand how it differs from Alteryx.

DataRobot

Paxata, for data preparation, was a separate company that DataRobot bought in December 2019. They raised 20 MUSD and sold. All within 7 years.

Data preparation in Paxata rather than in Excel; see here: link.
There are automatic lookups and suggestions for joins between two datasets. A great thing; for understanding the data, there could be more emphasis on textual information (link).
The Data Catalog is an excellent catalog of useless "live" datasets.
It is also interesting how directories are formed in Paxata (link).

“According to analyst firm Ovum, the software is made possible through advances in predictive analytics, machine learning and the NoSQL data caching methodology.[15] The software uses semantic algorithms to understand the meaning of a data table's columns and pattern recognition algorithms to find potential duplicates in a data-set.[15][7] It also uses indexing, text pattern recognition and other technologies traditionally found in social media and search software.”
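As an illustration of "finding potential duplicates in a data-set", here is a small sketch using `difflib.SequenceMatcher` from the Python standard library. It is a stand-in technique chosen for brevity, not Paxata's algorithm; the `normalise` helper, the 0.85 threshold and the sample values are all my assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalise(text):
    """Lower-case and replace punctuation with spaces before comparing."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def likely_duplicates(values, threshold=0.85):
    """Return pairs whose normalised similarity ratio reaches the threshold."""
    pairs = []
    for left, right in combinations(values, 2):
        ratio = SequenceMatcher(None, normalise(left), normalise(right)).ratio()
        if ratio >= threshold:
            pairs.append((left, right, round(ratio, 2)))
    return pairs


companies = ["Gazprom Neft", "gazprom-neft", "Gazpromneft", "DataRobot", "Databricks"]
for left, right, ratio in likely_duplicates(companies):
    print(left, "~", right, ratio)
```

Comparing every pair is quadratic, so this only works for small columns; a real tool would need indexing or blocking to scale, which is presumably where the "indexing" in the quote comes in.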

DataRobot's main product is here. Their slogan is "from Model to Enterprise Application"! I found their consulting for the oil industry in connection with the crisis, but it was very banal and uninteresting: link. I watched their videos on MOPs, or MLOps (link). This is a Frankenstein assembled from 6-7 acquisitions of different products.

Of course, it is clear that a large team of Data Scientists needs exactly this kind of environment for working with models, otherwise they will produce lots of them and never deploy anything. And in our oil and gas upstream reality, if we could build just one successful model, that would already be great progress!

The process itself strongly reminded me of working with design systems in geology and geophysics, for example Petrel. Everyone who is not too lazy builds and modifies models. They collect data into the model. Then they make a reference model and send it to production! Between, say, a geological model and an ML model you can find a lot in common.

Domino

The emphasis is on an open platform and collaboration. Business users are let in for free. Their Data Lab looks a lot like SharePoint. (And the name strongly smacks of IBM.) All experiments link back to the original dataset. How familiar this is 🙂 As in our practice: some data was dragged into the model, then cleaned and put in order inside the model, and all of it now lives in the model, and you cannot trace it back to the source data.

Domino has cool infrastructure virtualization. You assemble a machine as large as you need in a second and go off to compute. How it is done is not immediately clear. Docker is everywhere. Lots of freedom! Any workspaces of the latest versions can be connected. Experiments run in parallel. Tracking and selection of the successful ones.

Same as DataRobot: the results are published to business users in the form of applications. For the especially gifted "stakeholders". And the actual usage of the models is monitored too. Everything for the pugs!

I did not fully understand how complex models end up in production. Some kind of API is provided to feed them data and get results back.
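My guess at what "an API to feed data and get results" boils down to is sketched below as a tiny WSGI application from the Python standard library's conventions. This is not Domino's API; the toy linear model, the feature names `porosity` and `thickness_m`, and the helper `call` are all hypothetical.

```python
import io
import json

WEIGHTS = {"porosity": 2.0, "thickness_m": 0.5}   # hypothetical coefficients
BIAS = 1.0


def predict(features):
    """Toy linear model standing in for a real trained one."""
    return BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)


def app(environ, start_response):
    """WSGI application: POST JSON features, receive a JSON prediction."""
    if environ.get("REQUEST_METHOD") != "POST":
        start_response("405 Method Not Allowed", [("Content-Type", "application/json")])
        return [b'{"error": "use POST"}']
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        features = json.loads(environ["wsgi.input"].read(length))
        body = {"prediction": predict(features)}
        status = "200 OK"
    except (ValueError, KeyError, TypeError) as err:
        # Malformed JSON, a missing feature, or a wrong payload shape.
        body = {"error": str(err)}
        status = "400 Bad Request"
    start_response(status, [("Content-Type", "application/json")])
    return [json.dumps(body).encode("utf-8")]


def call(payload):
    """Invoke the app in-process, the way a WSGI server would."""
    raw = json.dumps(payload).encode("utf-8")
    environ = {
        "REQUEST_METHOD": "POST",
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": io.BytesIO(raw),
    }
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = app(environ, start_response)
    return captured["status"], json.loads(b"".join(chunks))


print(call({"porosity": 0.2, "thickness_m": 10.0}))
print(call({"porosity": 0.2}))
```

To serve it over HTTP, the same `app` object could be passed to `wsgiref.simple_server.make_server`; a production platform would add authentication, versioning and the usage monitoring mentioned above.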

H2O

Driverless AI is a very compact and clear system for supervised ML. Everything in one box. The backend is not entirely clear at first glance.

The model is automatically packaged into a REST server or a Java app. This is a great idea. A lot has been done for Interpretability and Explainability, that is, interpreting and explaining the model's results. (What, by its nature, should not be explainable, since otherwise a human could compute the same thing?)
For the first time, a case study about unstructured data and NLP. A high-quality architecture diagram. And in general I liked the pictures.

There is a large open-source H2O framework that is not entirely clear to me (a set of algorithms or libraries?). It has its own visual notebook without programming, similar to Jupyter (link). I also read about POJO and MOJO, H2O models wrapped in Java. The first is straightforward, the second is optimized. H2O is the only vendor (!) for which Gartner listed text analytics and NLP as strengths, along with their efforts on explainability. This is very important!

In the same place: high performance, optimization and an industry standard in integration with hardware and clouds.

And the weakness is logical: Driverless AI is weak and narrow compared with their open source. Data preparation is lame compared with Paxata! And they ignore industrial data: stream, graph, geo. Well, not everything can be good.

KNIME

I liked the 6 very specific, very interesting business cases on the main page. Strong OpenSource.

Gartner demoted them from Leaders to Visionaries. Earning money poorly is a good sign for users, since the Leader is not always the best choice.

The key word, as with H2O, is "augmented", which means helping the citizen data scientists. This is the first time in the review that someone is criticized for performance! Interesting? So is there so much computing power around that performance cannot be a systemic problem at all? Gartner has a separate article about this word "Augmented", which I could not get access to.
And KNIME seems to be the first non-American vendor in the review! (And our designers really liked their landing page. Strange people.)

MathWorks

MATLAB is an honorable old comrade known to everyone! Toolboxes for every area of life and every situation. Something very different. In fact, lots and lots of mathematics for everything in life!

The add-on product Simulink is for system design. I dug into the toolboxes for Digital Twins; I understand nothing about it, but a lot has been written here. For the oil industry. In general, this is a fundamentally different product, from the depths of mathematics and engineering. It is for choosing specific mathematical toolkits. According to Gartner, their problems are the same as those of smart engineers: no collaboration, everyone digs around in their own model, no democratisation, no explainability.

RapidMiner

I had come across it and heard a lot about it before (along with Matlab) in the context of good open source. I dug a little into TurboPrep, as usual. I am interested in how to get clean data out of dirty data.

Again you can tell these are good people from the 2018 marketing materials and from the terrible English of the speakers in the feature demo.

And they are people from Dortmund, since 2001, with a strong German background :)

I still could not work out from the site what exactly is available as open source; you have to dig deeper. Good videos about deployment and AutoML concepts.

There is nothing special about the RapidMiner Server backend either. It will probably be compact and work well on premise out of the box. It is packaged in Docker. The shared environment exists only on the RapidMiner server. And then there is Radoop: data from Hadoop, and computations from Spark inside a Studio workflow.

As expected, the hot young vendors, the "sellers of striped sticks", pushed them down. Gartner, however, predicts their future success in the Enterprise space. You can raise money there. The Germans know how to do that, heaven forbid :) Don't mention SAP!!!

They do a lot for the citizens! But from the page you can see Gartner saying that they struggle with sales innovation and are fighting not for breadth of coverage but for profitability.

That leaves SAS and Tibco, which for me are typical BI vendors... And both are at the very top, which confirms my belief that normal Data Science grows logically out of BI, and not out of clouds and Hadoop infrastructures. Out of the business, that is, and not out of IT. As at Gazpromneft, for example: link. A mature DSML environment grows out of strong BI practices. But maybe that is far-fetched and biased towards MDM and other things, who knows.

SAS

There is not much to say. Only the obvious things.

TIBCO

The strategy can be read from the shopping list on a page-long Wiki page. Yes, it is a long history, but 28!!! Carl! They bought BI Spotfire (2007) back in my techno-youth. And also reporting from Jaspersoft (2014), then as many as three predictive analytics vendors, Insightful (S-plus) (2008), Statistica (2017) and Alpine Data (2017), event processing and streaming from Streambase System (2013), MDM from Orchestra Networks (2018) and the in-memory platform Snappy Data (2019).

Hello, Frankie!

Source: www.habr.com
