No, well, I'm not being serious, of course. There has to be a limit to how far a subject can be simplified. But for the first stages, for understanding the basic concepts and quickly "getting into" the topic, it may be acceptable. As for how to title this material properly (options: "Machine learning for dummies", "Data analysis from the cradle", "Algorithms for the little ones"), we will discuss that at the end.
Down to business. I wrote several application programs in MS Excel to visualize and illustrate the processes that take place in various machine learning methods when analyzing data. Seeing is believing, after all, as the bearers of the culture that developed most of these methods say (by the way, far from all of them: the very powerful "support vector machine", or SVM, is the invention of our compatriot Vladimir Vapnik, Institute of Control Sciences, Moscow. In 1963, by the way! Now, however, he teaches and works in the USA).
1. K-means clustering
Problems of this kind belong to "unsupervised learning": we need to split the source data into a number of categories that is known in advance, but we have no set of "correct answers" and must extract them from the data itself. The fundamental classical problem of finding the subspecies of iris flowers (Ronald Fisher, 1936!), which is considered the first swallow of this field of knowledge, is exactly of this kind.
The method is quite simple. We have a set of objects represented as vectors (sets of N numbers). For the irises these are sets of 4 numbers that characterize a flower: the length and width of the outer and inner perianth lobes (sepals and petals), respectively.
Next, the cluster centers are chosen at random (or not at random, see below), and the distance from each object to each cluster center is computed. At a given iteration step, every object is labeled as belonging to the nearest center. Then the center of each cluster is moved to the arithmetic mean of the coordinates of its members (by analogy with physics, it is also called the "center of mass"), and the procedure is repeated.
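The loop described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the code behind the Excel workbook: the function names `squared_distance` and `k_means`, the choice of starting centers (k distinct data points picked at random) and the toy points are all assumptions made for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iter=100, seed=0):
    """Plain k-means. Returns (centers, labels)."""
    rng = random.Random(seed)
    # Assumption: the initial centers are k distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: every point is labeled with its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # labels stopped changing, so the centers are stable too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster simply keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = k_means(points, k=2)
print(centers, labels)
```

On data this clearly separated the result does not depend on the seed; on less tidy data a different `seed` can end in a different partition.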
The process converges fairly quickly. In two-dimensional pictures it looks like this:
1. The initial random distribution of points on the plane and the number of clusters
2. Setting the cluster centers and assigning the points to their clusters
3. Moving the coordinates of the cluster centers and recomputing the membership of the points until the centers stabilize. The trajectory of each cluster center to its final position is visible.
At any moment you can set new cluster centers (without generating a new distribution of points!) and see that the resulting partition is not always the same. Mathematically, this means that for the function being optimized (the sum of squared distances from the points to the centers of their clusters) we find not the global minimum but a local one. The problem can be dealt with either by a non-random choice of the initial cluster centers, or by enumerating candidate centers (sometimes it is convenient to place them exactly at some of the data points, which guarantees that we will not get empty clusters). In any case, a finite set always has an infimum.
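For reference, the function being optimized can be written out explicitly. The notation is introduced here only for illustration and does not come from the workbook: there are K clusters C_k with centers \mu_k, and the objects are x_i.

```latex
J(C_1,\dots,C_K;\ \mu_1,\dots,\mu_K)
  \;=\; \sum_{k=1}^{K} \; \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^{2},
\qquad
\mu_k \;=\; \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i
```

Neither step of an iteration can increase J: relabeling an object to its nearest center does not enlarge its term, and for a fixed membership the mean is the point that minimizes the inner sum. Since the number of possible partitions is finite, the procedure stops, but which minimum it stops at depends on the starting centers.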
A description of the method is available on Wikipedia.
2. Approximation by polynomials and splitting the data. Overfitting
The remarkable scientist and popularizer of data science K.V. Vorontsov briefly describes machine learning methods as "the science of drawing curves through points". In this example we will find a pattern in the data using the least squares method.
The example shows the technique of splitting the source data into "training" and "control" sets, as well as the phenomenon of overfitting, that is, "over-adjusting" to the data. With a correct approximation we get some error on the training data and a slightly larger error on the control data. With an incorrect one we get a precise fit to the training data and a huge error on the control data.
(It is a well-known fact that through N points with distinct x-coordinates one can draw exactly one polynomial curve of degree at most N-1, and in general this approach does not give the desired result.)
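The fact in parentheses can be made concrete with the Lagrange form of the interpolating polynomial. The symbols (x_i, y_i) for the N points are notation assumed here, and the x_i must be pairwise distinct.

```latex
L(x) \;=\; \sum_{i=1}^{N} y_i \prod_{\substack{j=1 \\ j \neq i}}^{N} \frac{x - x_j}{x_i - x_j},
\qquad
L(x_i) = y_i \quad (i = 1, \dots, N)
```

The degree of L is at most N-1, and it reproduces every training point exactly, noise included. That is why the training error is zero while the error on points it has not seen can be very large.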
1. Set the initial distribution
2. Split the points into "training" and "control" sets in a 70 to 30 ratio
3. Draw the approximating curve through the training points and look at the error it gives on the control data
4. Draw an exact curve through the training points and see a monstrous error on the control data (and zero on the training data, but what is the use?).
Of course, this is the simplest variant, with a single split into "training" and "control" subsets; in the general case the split is repeated many times to tune the coefficients as well as possible.
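The experiment above can be reproduced in miniature without a spreadsheet. The sketch below is an illustration under stated assumptions, not the workbook's method: the data are made up (a line y = 2x + 1 with hand-picked noise of 0.5 in alternating directions), the 7/3 split is fixed by hand so that the run is reproducible, and the helper names `fit_polynomial`, `predict` and `mse` are hypothetical. The least squares fit is done through the normal equations, solved by Gaussian elimination.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.
    Returns coefficients c such that y ~ c[0] + c[1]*x + ... + c[degree]*x**degree."""
    n = degree + 1
    # Normal equations A c = b, with A[r][c] = sum(x**(r+c)) and b[r] = sum(y * x**r).
    a = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs


def predict(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def mse(coeffs, xs, ys):
    """Mean squared error of the polynomial on the given points."""
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up data: 7 training points and 3 control points (a 70/30 split).
train_x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
train_y = [2 * x + 1 + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(train_x)]
control_x = [-2.5, 0.5, 2.5]
control_y = [2 * x + 1 + noise for x, noise in zip(control_x, [-0.5, 0.5, -0.5])]

line = fit_polynomial(train_x, train_y, degree=1)
exact = fit_polynomial(train_x, train_y, degree=len(train_x) - 1)
for name, coeffs in (("degree 1", line), ("degree 6", exact)):
    print(name,
          "| training MSE:", mse(coeffs, train_x, train_y),
          "| control MSE:", mse(coeffs, control_x, control_y))
```

The straight line leaves a moderate error on both sets, while the degree 6 curve passes through every training point and pays for it on the control points.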
3. Gradient descent and the dynamics of the error
Here we have a 4-dimensional case and linear regression. The linear regression coefficients are determined step by step using gradient descent; initially all coefficients are zero. A separate chart shows how the error changes as the coefficients are tuned more and more precisely. All four 2-dimensional projections can be viewed.
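The step-by-step update can be written compactly. The notation is assumed here for illustration: w is the vector of regression coefficients, (x_i, y_i) are the m data rows, \eta is the descent step, and the error is taken to be half the mean squared error; the workbook may scale it differently.

```latex
E(w) = \frac{1}{2m} \sum_{i=1}^{m} \bigl( w^{\top} x_i - y_i \bigr)^{2},
\qquad
\nabla E(w) = \frac{1}{m} \sum_{i=1}^{m} \bigl( w^{\top} x_i - y_i \bigr)\, x_i,
\qquad
w^{(t+1)} = w^{(t)} - \eta \, \nabla E\bigl( w^{(t)} \bigr),
\quad w^{(0)} = 0
```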
If the gradient descent step is set too large, then clearly we will overshoot the minimum every time and need more steps to reach the result, although in the end we will still get there (unless the step is cranked up far too high, in which case the algorithm goes haywire). And the plot of the error against the iteration step will not be smooth but "jerky".
1. Generate the data, set the gradient descent step
2. With a proper choice of the gradient descent step, we reach the minimum smoothly and quickly
3. If the gradient descent step is chosen poorly, we overshoot the minimum, the error plot is "jerky", and convergence takes more steps
4. With a completely wrong choice of the gradient descent step, we move away from the minimum
(To reproduce the process with the descent step values shown in the pictures, tick the "reference data" checkbox.)
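The three situations in the list can be tried out with a short script. It is a sketch under explicit assumptions rather than a copy of the workbook: the four-coefficient data set, the "hidden" coefficients `true_w` and the function name `gradient_descent` are invented for the example, the data are noise-free, and the whole data set is used at every step. For this particular data set the descent converges only for steps below 0.5, which is why 0.25, 0.49 and 0.6 were picked.

```python
def gradient_descent(rows, targets, step, max_iter=1000, tol=1e-6):
    """Full-batch gradient descent for linear regression, starting from zeros.
    Returns (coefficients, list of error values, one per iteration)."""
    m, n = len(rows), len(rows[0])
    w = [0.0] * n
    errors = []
    for _ in range(max_iter):
        residuals = [sum(wj * xj for wj, xj in zip(w, x)) - y
                     for x, y in zip(rows, targets)]
        error = sum(r * r for r in residuals) / (2 * m)
        errors.append(error)
        if error < tol or error > 1e12:  # converged, or clearly running away
            break
        # Gradient of the error with respect to each coefficient.
        grad = [sum(r * x[j] for r, x in zip(residuals, rows)) / m
                for j in range(n)]
        w = [wj - step * gj for wj, gj in zip(w, grad)]
    return w, errors


# Hypothetical data: the first column of ones plays the role of the intercept.
true_w = [1.0, 2.0, -1.0, 0.5]
rows = [
    [1.0,  1.0,  1.0,  2.0],
    [1.0, -1.0,  1.0, -2.0],
    [1.0,  1.0, -1.0, -2.0],
    [1.0, -1.0, -1.0,  2.0],
]
targets = [sum(w * x for w, x in zip(true_w, row)) for row in rows]

for step in (0.25, 0.49, 0.6):
    w, errors = gradient_descent(rows, targets, step)
    print(f"step={step}: {len(errors)} iterations, "
          f"first error {errors[0]:.3g}, last error {errors[-1]:.3g}")
```

With the middle step the last coefficient jumps from one side of its target value to the other on every iteration, so the run needs many more iterations than with the first step; with the largest step the error grows instead of shrinking.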
In the opinion of the respected community, are this kind of simplification and this way of presenting the material acceptable? Is the article worth translating into English?
Source: www.habr.com