The Magic of Ensemble Learning

Hi Habr! We invite Data Engineers and Machine Learning specialists to the free Demo lesson "Deploying ML models to production, using online recommendations as an example". We are also publishing an article by Luca Monno, Head of Financial Analytics at CDP SpA.

One of the most useful and simplest machine learning techniques is Ensemble Learning. Ensemble Learning is the technique behind XGBoost, Bagging, Random Forest and many other algorithms.

There are plenty of great articles on Towards Data Science, but I picked two (the first and the second) that I like best. So why write yet another article about EL? Because I want to show you how it works on a simple example, the one that made me understand there is no magic here.

When I first saw EL in action (on a few very simple regression models) I could not believe my eyes, and I still remember the professor who taught me this technique.

I had two different models (two weak learners) with out-of-sample R² of 0.90 and 0.93, respectively. Before looking at the result, I expected to get an R² somewhere between the two original values. In other words, I believed EL could make a model that performs not as badly as the worst model, but not as well as the best one either.

To my great surprise, simply averaging the predictions produced an R² of 0.95.

At first I went looking for a mistake, but then I thought that maybe there was some magic hiding here!

What is Ensemble Learning

With EL you can combine the predictions of two or more models to get a more robust and better-performing model. There are many ways of working with model ensembles. Here I will touch on the two most useful ones to give an overview.

With regression, you can average the predictions of the available models.
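As a minimal sketch of what that averaging looks like, assuming each model's predictions are already available as a plain list of numbers: the function name `average_predictions` and the sample values are made up for illustration and are not from the original article.

```python
def average_predictions(*model_predictions):
    """Element-wise mean of several equally long prediction lists."""
    if not model_predictions:
        raise ValueError("need predictions from at least one model")
    n = len(model_predictions[0])
    if any(len(p) != n for p in model_predictions):
        raise ValueError("all models must predict the same number of points")
    # zip(*...) groups the predictions that belong to the same input.
    return [sum(values) / len(values) for values in zip(*model_predictions)]


# Hypothetical predictions of two models for the same three inputs.
model_a = [2.0, 4.0, 6.0]
model_b = [3.0, 3.0, 9.0]

ensemble = average_predictions(model_a, model_b)
print(ensemble)  # [2.5, 3.5, 7.5]
```

Equal weights are the simplest choice; a weighted mean would be a natural variation, but that is beyond what the text describes.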

With classification, you can let the models vote on the labels. The label chosen most often is the one the new model will pick.
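The voting idea can be sketched in a few lines. This is only an illustration: `majority_vote` and the three lists of labels are hypothetical, and ties are simply resolved in favour of the label seen first.

```python
from collections import Counter


def majority_vote(*model_labels):
    """For each sample, return the label predicted by the most models."""
    voted = []
    for labels in zip(*model_labels):
        # most_common(1) returns [(label, count)] for the most frequent label;
        # on a tie, the label that appeared first among the models wins.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical labels predicted by three models for the same four samples.
model_a = ["cat", "dog", "dog", "cat"]
model_b = ["cat", "cat", "dog", "dog"]
model_c = ["dog", "cat", "dog", "cat"]

votes = majority_vote(model_a, model_b, model_c)
print(votes)  # ['cat', 'cat', 'dog', 'cat']
```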

Why EL works better

The main reason EL performs better is that every prediction carries an error (we know this from probability theory), and combining two predictions can help reduce that error and therefore improve the performance metrics (RMSE, R², etc.).
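To make this a little more concrete, here is a short derivation under assumptions that are mine, not the article's: model i predicts the true value y plus a zero-mean error e_i with variance \sigma_i^2, and \rho is the correlation between e_1 and e_2. The error of the simple average of the two predictions is then:

```latex
\hat{y}_{\mathrm{avg}} - y = \tfrac{1}{2}\,(e_1 + e_2),
\qquad
\operatorname{Var}\!\left(\tfrac{1}{2}\,(e_1 + e_2)\right)
  = \tfrac{1}{4}\left(\sigma_1^2 + \sigma_2^2 + 2\,\rho\,\sigma_1\sigma_2\right)
```

When both models are equally noisy (\sigma_1 = \sigma_2 = \sigma) this becomes \sigma^2 (1 + \rho) / 2, which is smaller than \sigma^2 whenever \rho < 1 and approaches zero as \rho approaches -1. If the errors are not zero-mean, the squared bias of the average has to be added on top of this variance.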

The following figure shows how two weak learners perform on a data set. The first has a steeper slope than needed, while the second has a slope of almost zero (possibly because of over-regularization). Together, however, they give a much better result.

Looking at R², the first and second learners score -0.01¹ and 0.22, respectively, while the ensemble scores 0.73.
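The situation described above can be imitated with synthetic data. This is only a sketch: the true slope of 2, the slopes 3.6 and 0.2 of the two weak models, the noise level and the seed are all values I picked for illustration, so the R² values it prints will not match the ones quoted in the article.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic data: assumed true relationship y = 2x plus Gaussian noise.
xs = [x / 2 for x in range(-10, 11)]
ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

# Two deliberately bad models: one too steep, one almost flat.
steep = [3.6 * x for x in xs]
flat = [0.2 * x for x in xs]

# The ensemble is the plain average of the two predictions.
ensemble = [(s + f) / 2 for s, f in zip(steep, flat)]

r2_steep = r_squared(ys, steep)
r2_flat = r_squared(ys, flat)
r2_ensemble = r_squared(ys, ensemble)

print(f"steep: {r2_steep:.2f}  flat: {r2_flat:.2f}  ensemble: {r2_ensemble:.2f}")
```

Because one model overshoots and the other undershoots, their errors largely cancel in the average, so the ensemble's R² ends up well above both individual values.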

[Figure: two weak learners and their ensemble on the same data set]

There are many reasons why an algorithm can turn out to be a bad model even on a basic example like this one: maybe you decided to use regularization to avoid overfitting, or you chose not to exclude some anomalies, or maybe you used polynomial regression and picked the wrong degree (for example, you used a second-degree polynomial while the test data shows a clear asymmetry for which a third degree would fit better).

When EL works better

Let's look at two learners working on the same data.

[Figure: two other weak learners and their ensemble on the same data set]

Here you can see that combining the two models did not improve performance much. The two learners had R² of -0.37 and 0.22, respectively, and the ensemble came out at -0.04. In other words, the EL model ended up with roughly the average of the two scores.

However, there is a big difference between these two examples: in the first one the model errors were negatively correlated, and in the second they were positively correlated (the coefficients of the three models were not estimated, but simply chosen by the author for the sake of the example).

So Ensemble Learning can be used to improve the bias/variance balance in any case, but it is when the model errors are not positively correlated that using EL is likely to improve performance.
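The role of error correlation can be checked with a small simulation. This is a sketch under my own assumptions: both models have zero-mean, unit-variance Gaussian errors, the second error is built to have a chosen correlation `rho` with the first, and the function name `simulate` and the values of 0.8 and -0.8 are arbitrary.

```python
import math
import random


def mse(errors):
    """Mean squared error of a list of prediction errors."""
    return sum(e * e for e in errors) / len(errors)


def simulate(rho, n=2000, seed=0):
    """Return (MSE of model 1, MSE of model 2, MSE of their average)
    for two error series whose correlation is approximately rho."""
    rng = random.Random(seed)
    e1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Mix e1 with independent noise so that corr(e1, e2) is about rho
    # while e2 keeps unit variance.
    e2 = [rho * a + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0) for a in e1]
    averaged = [(a + b) / 2 for a, b in zip(e1, e2)]
    return mse(e1), mse(e2), mse(averaged)


negative = simulate(rho=-0.8)
positive = simulate(rho=0.8)

print("negatively correlated errors:", [round(v, 2) for v in negative])
print("positively correlated errors:", [round(v, 2) for v in positive])
```

With negatively correlated errors the averaged error is a small fraction of either model's error, while with positively correlated errors the average is only slightly better than the individual models.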

Homogeneous and heterogeneous models

EL is most often used with homogeneous models (as in this example, or in a random forest), but you can in fact combine quite different models (linear regression + neural network + XGBoost) built on different sets of explanatory variables. This is likely to produce uncorrelated errors and better performance.
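Mechanically, mixing model types only requires that every model can produce a prediction for the same input. The sketch below is an illustration of that idea, not the author's setup: it swaps in two tiny stand-in models, a least-squares line and a 1-nearest-neighbour regressor, and all function names and data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns a predictor function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest_neighbour(xs, ys):
    """1-nearest-neighbour regressor; returns a predictor function."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def ensemble(predictors):
    """Average the outputs of any number of predictor functions."""
    return lambda x: sum(p(x) for p in predictors) / len(predictors)


# Toy training data following y = x^2 (chosen only for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0]

line = fit_line(xs, ys)
knn = fit_nearest_neighbour(xs, ys)
combined = ensemble([line, knn])

x_new = 1.4
print(line(x_new), knn(x_new), combined(x_new))
```

At this particular point the line overshoots and the nearest neighbour undershoots, so the average lands closer to the true value of 1.96 than either model. That is a property of this toy example, not a guarantee.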

Comparison with portfolio diversification

EL works much like diversification in portfolio theory, but with an outcome that is even better for us.

When you diversify, you try to reduce the variance of your returns by investing in uncorrelated stocks. A well-diversified portfolio will do better than the worst individual stock, but never better than the best one.

To quote Warren Buffett:

"Diversification is a protection against ignorance. [Diversification] makes very little sense for those who know what they're doing."

In machine learning, EL helps reduce the variance of your model, but it can also give you a model whose overall performance is better than that of the best original model.

Summing up

Combining several models into one is a relatively simple technique that can help with the bias/variance problem and improve performance.

If you have two or more models that work well, don't choose between them: use them all (but with caution)!

Interested in developing your skills in this area? Sign up for the free Demo lesson "Deploying ML models to production, using online recommendations as an example" and take part in an online meeting with Andrey Kuznetsov, Machine Learning Engineer at Mail.ru Group.

Source: www.habr.com
