No, of course, I'm not serious. There has to be a limit to how far a subject can be simplified. But for the first stages, for grasping the basic concepts and getting into the topic quickly, it may be acceptable. We'll discuss what to call this material (options: "Machine learning for dummies", "Data analysis from the cradle", "Algorithms for the little ones") at the end.
To the point. I wrote several small applications in MS Excel to visualize the processes that take place inside various machine learning methods when they analyze data. Seeing is believing, after all, as they say in the culture that developed most of these methods (not all of them, by the way: the very powerful support vector machine, or SVM, is the invention of our compatriot Vladimir Vapnik of the Institute of Control Sciences in Moscow. In 1963, by the way! Now, however, he teaches and works in the USA).
Three files are reviewed here.
1. K-means clustering
Problems of this kind belong to "unsupervised learning": we need to split the initial data into a number of categories that is known in advance, but we have no set of "right answers" and must extract them from the data itself. The fundamental classic problem of identifying subspecies of iris flowers (Ronald Fisher, 1936!), which is considered the first sign of this field of knowledge, is of exactly this kind.
The method is quite simple. We have a set of objects represented as vectors (sets of N numbers). For the irises, these are sets of 4 numbers describing a flower: the length and width of the outer and inner lobes of the perianth (the sepals and petals), respectively.
Next, the cluster centers are chosen at random (or not at random, see below), and the distance from each object to each cluster center is computed. At each iteration step, every object is marked as belonging to its nearest center. Then the center of each cluster is moved to the arithmetic mean of the coordinates of its members (by analogy with physics, this is also called the "center of mass"), and the process is repeated.
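The loop described above can be sketched in a few lines of Python with NumPy. This is only a minimal illustration, not the Excel implementation: the function name `k_means`, the tiny data set, and the choice of ordinary Euclidean distance are all assumptions made for the example.

```python
import numpy as np

def k_means(points, centers, max_iter=100):
    """Plain k-means: assign each point to its nearest center, then move centers."""
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(max_iter):
        # Euclidean distance from every point to every center, shape (n_points, k)
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)          # index of the nearest center
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = points[labels == j]
            if len(members) > 0:               # an empty cluster keeps its old center
                new_centers[j] = members.mean(axis=0)   # the "center of mass"
        if np.allclose(new_centers, centers):  # centers stopped moving
            break
        centers = new_centers
    return centers, labels

# Hypothetical toy data: two obvious groups of three points each.
points = np.array([[0, 0], [0, 1], [1, 0],
                   [10, 10], [10, 11], [11, 10]], dtype=float)
centers, labels = k_means(points, centers=[points[0], points[3]])
print(labels)    # cluster index of every point
print(centers)   # final cluster centers
```

Here the initial centers are placed on two of the data points; any other starting positions can be passed in the same way.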
The process converges quite quickly. In two-dimensional pictures it looks like this:
1. The initial random distribution of points on the plane and the number of clusters
2. Setting the cluster centers and assigning the points to their clusters
3. Moving the cluster centers and recomputing the point assignments until the centers stabilize. The trajectory of each cluster center toward its final position is visible.
At any moment you can set new cluster centers (without generating a new distribution of points!) and see that the partitioning is not always unambiguous. Mathematically, this means that for the function being optimized (the sum of squared distances from the points to the centers of their clusters) we find a local minimum rather than the global one. This problem can be overcome either by a non-random choice of the initial cluster centers or by enumerating candidate centers (sometimes it pays to place them exactly on some of the points, which at least guarantees that we won't get empty clusters). In any case, a finite set always has an infimum.
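A small sketch makes the local minimum concrete. The four points, the helper name `run_k_means`, and both starting configurations are hypothetical, chosen so the outcome is easy to check by hand: the points are the corners of a 4-by-1 rectangle, and both runs start with centers placed on data points.

```python
import numpy as np

def run_k_means(points, centers, n_iter=20):
    """Fixed number of k-means iterations; returns centers, labels and the
    sum of squared distances from the points to their cluster centers."""
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        sq_dists = ((points[:, None] - centers[None]) ** 2).sum(axis=2)
        labels = sq_dists.argmin(axis=1)
        for j in range(len(centers)):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    sse = float(((points - centers[labels]) ** 2).sum())
    return centers, labels, sse

# Corners of a 4 x 1 rectangle.
points = np.array([[0, 0], [0, 1], [4, 0], [4, 1]], dtype=float)

# Start A: both centers on the same short side -> splits into bottom and top rows.
_, labels_a, sse_a = run_k_means(points, [[0, 0], [0, 1]])
# Start B: centers on opposite short sides -> splits into left and right pairs.
_, labels_b, sse_b = run_k_means(points, [[0, 0], [4, 0]])

print(labels_a, sse_a)
print(labels_b, sse_b)

# Enumerating several starts and keeping the lowest objective picks the better split.
best = min([(sse_a, "A"), (sse_b, "B")])
print("best start:", best[1])
```

Both runs settle into a stable partition, but start A stays stuck at a much larger sum of squared distances than start B, which is exactly the local-versus-global situation described above.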
Description of the method on Wikipedia -
2. Approximation by polynomials and splitting the data. Overfitting
The remarkable scientist and popularizer of data science K.V. Vorontsov briefly describes machine learning methods as "the science of drawing curves through points." In this example we will find a pattern in the data using the least squares method.
The technique of splitting the source data into "training" and "control" sets is shown, together with the phenomenon of overfitting, that is, over-adjusting to the data. With a proper approximation we get some error on the training data and a somewhat larger error on the control data. With an improper one we get an exact fit to the training data and a huge error on the control data.
(It is a well-known fact that through N points one can draw exactly one curve of degree N-1, and in the general case this approach does not give the desired result.)
1. Set the initial distribution
2. Split the points into "training" and "control" sets in a 70 to 30 ratio.
3. Draw an approximating curve through the training points and see the error it gives on the control data
4. Draw an exact curve through the training points and see a monstrous error on the control data (and zero error on the training data, but what's the point?).
What is shown is, of course, the simplest variant, with a single split into "training" and "control" subsets; in the general case this is done many times to find the best fit for the coefficients.
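The single 70/30 split described above can be sketched as follows. The data points are hand-picked, illustrative values (an assumption, not the data from the Excel file): the underlying pattern is the line y = x with some fixed "noise" added, so the run is reproducible. `Polynomial.fit` from NumPy does the least squares fit.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical data: pattern y = x plus fixed noise, 10 points in total.
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # 7 points, "training" (70%)
y_train = np.array([0.5, 0.5, 2.5, 2.5, 4.5, 4.5, 6.5])
x_ctrl = np.array([0.5, 2.5, 5.5])                         # 3 points, "control" (30%)
y_ctrl = np.array([1.1, 1.9, 6.1])

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on the given points."""
    return float(np.mean((model(x) - y) ** 2))

# Approximating curve: a straight line fitted by least squares.
line = Polynomial.fit(x_train, y_train, deg=1)
# "Exact" curve: degree N-1 through all N training points.
exact = Polynomial.fit(x_train, y_train, deg=len(x_train) - 1)

train_line, ctrl_line = mse(line, x_train, y_train), mse(line, x_ctrl, y_ctrl)
train_exact, ctrl_exact = mse(exact, x_train, y_train), mse(exact, x_ctrl, y_ctrl)

print("line : train error", train_line, "control error", ctrl_line)
print("exact: train error", train_exact, "control error", ctrl_exact)
```

On this data the straight line has a moderate error on both sets, while the degree-6 curve passes through every training point and misses the control points by a wide margin.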
3. Gradient descent and the dynamics of the error
Here we have a 4-dimensional case and linear regression. The linear regression coefficients are determined step by step by gradient descent, with all coefficients initially set to zero. A separate graph shows how the error decreases as the coefficients are tuned more and more precisely. All four 2-dimensional projections can be viewed.
If the gradient descent step is set too large, you can see that we overshoot the minimum every time and reach the result in a larger number of steps, although we still get there in the end (unless we push the step too far, in which case the algorithm goes haywire). The graph of the error against the iteration step will then be "jerky" rather than smooth.
1. Generate the data and set the gradient descent step
2. With a well-chosen gradient descent step, we reach the minimum smoothly and quickly
3. If the gradient descent step is chosen poorly, we overshoot the minimum, the error graph is "jerky", and convergence takes a larger number of steps
4. If the gradient descent step is chosen completely wrong, we move away from the minimum
(To reproduce the process with the gradient descent step values shown in the pictures, check the "reference data" box.)
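The three regimes from the list above can be reproduced with a short sketch. Everything specific here is an assumption for illustration: the synthetic data, the names `gradient_descent` and `true_w`, the choice of 4 coefficients, and in particular the way the step sizes are picked. Rather than typing a step by hand as in the Excel file, the sketch expresses each step as a fraction of the largest eigenvalue of the loss curvature, because for this quadratic loss gradient descent stops converging once the step exceeds 2 divided by that eigenvalue.

```python
import numpy as np

def gradient_descent(X, y, step, max_iter=1000, tol=1e-8):
    """Fit y ~ X @ w by gradient descent on the mean squared error, starting at w = 0.
    Returns the coefficients and the error recorded at every iteration."""
    n = len(y)
    w = np.zeros(X.shape[1])
    errors = []
    for _ in range(max_iter):
        residual = X @ w - y
        errors.append(float(residual @ residual) / n)
        grad = 2.0 * X.T @ residual / n
        if np.linalg.norm(grad) < tol:      # close enough to the minimum
            break
        w = w - step * grad
    return w, errors

# Hypothetical synthetic data: 200 samples, 4 coefficients to recover.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
true_w = np.array([1.0, -2.0, 3.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

# Largest curvature of the loss; steps above 2 / lam_max diverge.
lam_max = np.linalg.eigvalsh(2.0 * X.T @ X / len(y)).max()

w_good, err_good = gradient_descent(X, y, step=1.0 / lam_max)
w_large, err_large = gradient_descent(X, y, step=1.9 / lam_max)   # overshoots each step
w_bad, err_bad = gradient_descent(X, y, step=2.1 / lam_max, max_iter=100)

print("good step :", len(err_good), "iterations, final error", err_good[-1])
print("large step:", len(err_large), "iterations, final error", err_large[-1])
print("bad step  : error went from", err_bad[0], "to", err_bad[-1])
```

With the large step the coefficients jump from one side of the minimum to the other on every iteration and need many more iterations to settle; with the step beyond the limit the error grows instead of shrinking.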
In the opinion of the respected community, is this kind of simplification and way of presenting the material acceptable? Is it worth translating the article into English?
Source: www.habr.com