Machine learning without Python, Anaconda and other reptiles

No, of course, I'm not serious. There has to be a limit to how far a subject can be simplified. But for the early stages, for grasping the basic concepts and quickly "getting into" the topic, it may be acceptable. As for what to call this material properly (options: "Machine learning for dummies", "Data analysis from the cradle", "Algorithms for the little ones"), we will discuss that at the end.

Down to business. I wrote several application programs in MS Excel to visualize the processes that take place in various machine learning methods during data analysis. Seeing is believing, after all, as the bearers of the culture that developed most of these methods say (not all of them, by the way. The most powerful one, the "support vector machine", or SVM, is the invention of our compatriot Vladimir Vapnik, Institute of Control Sciences in Moscow. 1963, by the way! Now, however, he teaches and works in the USA).

Three files for review

1. K-means clustering

Problems of this type belong to "unsupervised learning": we need to split the source data into a number of categories known in advance, but we have no "correct answers" at all and must extract them from the data itself. The fundamental classic problem of finding the subspecies of iris flowers (Ronald Fisher, 1936!), which is regarded as the first harbinger of this field of knowledge, is exactly of this kind.

The method is quite simple. We have a set of objects represented as vectors (sets of N numbers). For the irises these are sets of 4 numbers describing a flower: the length and width of the outer and inner perianth lobes, respectively (Fisher's irises, Wikipedia). The usual Cartesian (Euclidean) metric is chosen as the distance, or measure of proximity, between objects.

Next, the cluster centers are chosen at random (or not at random, see below), and the distances from each object to the cluster centers are computed. At a given iteration step, each object is marked as belonging to the nearest center. Then the center of each cluster is moved to the arithmetic mean of the coordinates of its members (by analogy with physics it is also called the "center of mass"), and the procedure is repeated.
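
For readers who would like to see the same loop outside a spreadsheet, here is a minimal sketch in plain Python (standard library only). The function name `kmeans`, its parameters and the sample points are illustrative assumptions and are not taken from the Excel file; the initial centers are simply k randomly chosen data points.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster `points` (tuples of equal length) into k groups."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: label each point with its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[c])
        if new_centers == centers:
            break  # centers stopped moving: converged
        centers = new_centers
    return centers, labels


# Hypothetical sample: two small groups of points on the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, k=2)
print(centers, labels)
```

Starting from data points rather than arbitrary coordinates is one simple choice among several; it guarantees that no cluster is empty on the first pass.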

The process converges quite quickly. In two-dimensional pictures it looks like this:

1. Initial random distribution of points on the plane and the number of clusters

2. Setting the cluster centers and assigning the points to their clusters

3. Moving the coordinates of the cluster centers and reassigning the points until the centers stabilize. The trajectory of each cluster center moving to its final position is visible.

At any time you can set new cluster centers (without generating a new distribution of points!) and see that the partitioning is not always unambiguous. Mathematically, this means that for the function being optimized (the sum of squared distances from the points to the centers of their clusters) we find not the global minimum but a local one. This problem can be overcome either by a non-random choice of the initial cluster centers, or by enumerating the possible centers (sometimes it pays to place them exactly on some of the points; then at least there is a guarantee that we will not get empty clusters). In any case, a finite set always has an infimum.
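
A tiny example makes the local minimum concrete. The sketch below (plain Python; the four points and both sets of centers are made up for illustration) checks that two different placements of the centers are both stable under the k-means step, yet give very different values of the sum of squared distances.

```python
import math


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))


def cost(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(math.dist(p, centers[nearest(p, centers)]) ** 2 for p in points)


def is_fixed_point(points, centers):
    """True if one more k-means step would leave the centers unchanged."""
    for c, center in enumerate(centers):
        members = [p for p in points if nearest(p, centers) == c]
        if not members:
            return False
        mean = tuple(sum(x) / len(members) for x in zip(*members))
        if any(abs(a - b) > 1e-12 for a, b in zip(mean, center)):
            return False
    return True


# Hypothetical data: the corners of a 4 x 1 rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
left_right = [(0.0, 0.5), (4.0, 0.5)]  # the natural split
top_bottom = [(2.0, 0.0), (2.0, 1.0)]  # an unlucky but stable split

for name, centers in [("left/right", left_right), ("top/bottom", top_bottom)]:
    print(name, is_fixed_point(points, centers), cost(points, centers))
```

Both placements are fixed points, but the left/right split has a cost of 1 while the top/bottom split has a cost of 16, so an unlucky start can get stuck in the worse one.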

You can play with this file at this link (don't forget to enable macro support; the files have been checked for viruses).

Description of the method on Wikipedia: the k-means method

2. Approximation by polynomials and data splitting. Overfitting

The remarkable scientist and popularizer of data science K.V. Vorontsov briefly describes machine learning methods as "the science of drawing curves through points". In this example we will find a pattern in the data using the least squares method.

The example shows the technique of splitting the source data into "training" and "control" sets, as well as the phenomenon of overfitting, or "over-adjusting" to the data. With a correct approximation we get some error on the training data and a slightly larger error on the control data. With an incorrect one, we get a precise fit to the training data and a huge error on the control data.

(It is a well-known fact that through N points one can draw a unique curve of degree N-1, and in general this approach does not give the desired result. Lagrange interpolation polynomial on Wikipedia)

1. Set the initial distribution

2. We split the points into "training" and "control" sets in a 70 to 30 ratio.

3. We draw an approximating curve along the training points and see the error it gives on the control data.

4. We draw an exact curve through the training points and see a monstrous error on the control data (and zero error on the training data, but what is the point?).

Of course, this is the simplest variant, with a single split into "training" and "control" subsets; in the general case the split is repeated many times to get the best tuning of the coefficients.
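
The same experiment can be sketched in plain Python. Everything here is an illustrative assumption rather than the contents of the Excel file: the data is a noisy straight line, the "approximating curve" is a least-squares polynomial obtained from the normal equations, and the "exact curve" is the Lagrange interpolation polynomial through all training points.

```python
import random


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first."""
    n = degree + 1
    # Normal equations A c = b for the monomial basis 1, x, x^2, ...
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for j in range(col, n):
                a[row][j] -= factor * a[col][j]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def polyval(coeffs, x):
    """Evaluate a polynomial given lowest-power-first coefficients."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def lagrange(xs, ys, x):
    """Value at x of the unique polynomial through all (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def rmse(predict, xs, ys):
    """Root-mean-square error of predict(x) against ys."""
    return (sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5


# Hypothetical data: a noisy straight line, split 70/30.
rng = random.Random(1)
data = [(x, 2.0 * x + 1.0 + rng.gauss(0, 1)) for x in range(10)]
rng.shuffle(data)
train, control = data[:7], data[7:]
tx, ty = zip(*train)
cx, cy = zip(*control)

coeffs = polyfit(tx, ty, 1)
fit = lambda x: polyval(coeffs, x)      # approximating curve
exact = lambda x: lagrange(tx, ty, x)   # curve through every training point

for name, model in [("least squares", fit), ("exact curve", exact)]:
    print(name, "train:", rmse(model, tx, ty), "control:", rmse(model, cx, cy))
```

The exact curve always has zero training error by construction; how badly it behaves on the control points depends on the noise and on where the control points fall.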

The file is available here and has been checked by antivirus. Enable macros for it to work correctly.

3. Gradient descent and the dynamics of the error

Here we have a 4-dimensional case and linear regression. The linear regression coefficients are determined step by step using gradient descent; initially all coefficients are zero. A separate graph shows how the error decreases as the coefficients are tuned more and more accurately. All four 2-dimensional projections can be viewed.

If the gradient descent step is set too large, each time we will overshoot the minimum and reach the result in more steps, although in the end we will still get there (unless we push the descent step too far, in which case the algorithm goes "haywire"). And the graph of the error against the iteration number will not be smooth but "jerky".
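
The effect of the step size can be reproduced with a short sketch in plain Python. The function name `gradient_descent`, the data and the three step values are assumptions chosen for illustration; three features plus an intercept is only one possible reading of the 4-dimensional case, and the points sit on the corners of a cube so that the behaviour of each step value is easy to reason about.

```python
from itertools import product


def gradient_descent(xs, ys, step, iterations):
    """Fit y ~ w . x + b by gradient descent on the mean squared error.

    Returns the final weights, the intercept and the error recorded
    before each update.
    """
    dim = len(xs[0])
    n = len(xs)
    w, b = [0.0] * dim, 0.0  # all coefficients start at zero
    errors = []
    for _ in range(iterations):
        residuals = [
            sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for x, y in zip(xs, ys)
        ]
        errors.append(sum(r * r for r in residuals) / n)
        # Gradient of the mean squared error.
        grad_w = [
            2.0 / n * sum(r * x[k] for r, x in zip(residuals, xs))
            for k in range(dim)
        ]
        grad_b = 2.0 / n * sum(residuals)
        w = [wi - step * g for wi, g in zip(w, grad_w)]
        b -= step * grad_b
    return w, b, errors


# Hypothetical data: 8 cube corners, target is an exact linear function.
xs = [tuple(float(v) for v in corner) for corner in product((-1, 1), repeat=3)]
ys = [1.0 + 2.0 * x1 - 3.0 * x2 + 0.5 * x3 for x1, x2, x3 in xs]

for step in (0.1, 0.95, 1.1):
    w, b, errors = gradient_descent(xs, ys, step, iterations=30)
    print(step, errors[-1])
```

For this particular data set any step below 1 converges: 0.1 approaches the minimum steadily, 0.95 jumps over the minimum on every iteration and therefore needs more iterations, and 1.1 moves further away from the minimum with each step. Other data will have other thresholds.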

1. Generate the data and set the gradient descent step

2. With a correct choice of the gradient descent step, we arrive at the minimum smoothly and fairly quickly

3. If the gradient descent step is chosen badly, we overshoot the minimum, the error graph is "jerky", and convergence takes a larger number of steps.

4. With a completely wrong choice of the gradient descent step, we move away from the minimum

(To reproduce the process with the gradient descent step values shown in the pictures, check the "reference data" box.)

The file is at this link; you need to enable macros, and there are no viruses.

In the opinion of the respected community, is this kind of simplification and way of presenting the material acceptable? Is it worth translating the article into English?

Source: www.habr.com
