Linear regression is one of the basic algorithms in many areas related to data analysis. The reason is obvious: it is a simple and easily understood algorithm, which has contributed to its widespread use for many decades, if not centuries. The idea is that we assume a linear dependence of one variable on a set of other variables, and then try to recover that dependence.
But this article is not about applying linear regression to practical problems. Here we look at some interesting features of implementing distributed algorithms for recovering it, which we ran into while writing a machine learning module.
What are we talking about?
We face the task of recovering a linear dependence. As input, we are given a set of vectors of supposedly independent variables, each of which is associated with a particular value of the dependent variable. This data can be represented by two matrices, where $n$ is the number of observations and $m$ is the number of variables:
$$X = \begin{pmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nm} \end{pmatrix}, \qquad y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$$
Now, since a dependence is assumed, and a linear one at that, we write our assumption as a product of matrices (to simplify the notation, here and below the free term of the equation is assumed to be hidden inside the weight vector $w$, and the last column of the matrix $X$ consists of ones):
$$Xw = y$$
Looks a lot like a system of linear equations, doesn't it? It does, but most likely such a system will have no solution. The reason is noise, which is present in almost any real data. Another reason may be that there is no linear dependence as such, which can be countered by introducing additional variables that depend nonlinearly on the original ones. Consider the following example.
This is a simple example of linear regression that shows the dependence of one variable (along the $y$ axis) on another variable (along the $x$ axis). For the system of equations corresponding to this example to have a solution, all points would have to lie exactly on one straight line. But they do not, and the reason is noise (or a wrong assumption that the relationship is linear). So, to recover a linear relationship from real data, it is usually necessary to introduce one more assumption: the input data contains noise, and this noise is normally distributed.
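To make the point about noise concrete, here is a minimal sketch in plain Python. The data values, the helper names `add_intercept_column` and `residuals`, and the weights are all hypothetical and chosen only for illustration. It appends the column of ones that hides the free term, then checks that a line fits noiseless points exactly while noisy points always leave nonzero residuals.

```python
def add_intercept_column(rows):
    """Append a constant 1.0 to every row so the last weight acts as the free term."""
    return [list(row) + [1.0] for row in rows]


def residuals(X, y, w):
    """Return y_i - x_i . w for every observation."""
    return [
        y_i - sum(x_ij * w_j for x_ij, w_j in zip(row, w))
        for row, y_i in zip(X, y)
    ]


features = [[0.0], [1.0], [2.0]]      # hypothetical single-variable data
X = add_intercept_column(features)    # last column consists of ones
w = [2.0, 1.0]                        # slope 2, free term 1

exact_y = [1.0, 3.0, 5.0]             # points lying exactly on y = 2x + 1
noisy_y = [1.1, 2.9, 5.2]             # the same points with made-up noise

exact_residuals = residuals(X, exact_y, w)
noisy_residuals = residuals(X, noisy_y, w)
```

The three noisy points are not collinear (the slopes between neighbours are 1.8 and 2.3), so no choice of `w` makes all residuals zero. That is why the problem has to be restated as a minimization rather than solved as an exact system.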
Maximum likelihood method
So, we have assumed the presence of randomly distributed noise. What do we do in such a situation? For this case mathematics has a widely used tool, the maximum likelihood method.
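We return to recovering a linear relationship from data with normal noise. Note that the assumed linear relationship is the expected value $\mu$ of the normal distribution in question. The probability that $y$ takes one value or another, given the observations $X$, is then:
$$p(y \mid X; \mu, \sigma) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(y_i - \mu_i)^2}{2\sigma^2}\right)$$
Now substitute the quantities we need, $\mu_i = x_i^\top w$, where $x_i^\top$ is the $i$-th row of $X$:
$$p(y \mid X; w, \sigma) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(y_i - x_i^\top w)^2}{2\sigma^2}\right)$$
All that remains is to find the vector $w$ for which this probability is maximal. To maximize such a function it is convenient to take its logarithm first (the logarithm of a function reaches its maximum at the same point as the function itself):
$$\log p(y \mid X; w, \sigma) = \sum_{i=1}^{n} \left( -\log\left(\sigma\sqrt{2\pi}\right) - \frac{(y_i - x_i^\top w)^2}{2\sigma^2} \right)$$
Which, in turn, comes down to minimizing the following function:
$$L(w) = \sum_{i=1}^{n} (y_i - x_i^\top w)^2$$
By the way, this approach is called the method of least squares.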
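The function being minimized is just a sum of squared residuals, which the following sketch computes in plain Python. The name `squared_error` and the sample data are assumptions made for this illustration, not part of the original article.

```python
def squared_error(X, y, w):
    """Sum over all observations of (y_i - x_i . w)^2."""
    total = 0.0
    for row, y_i in zip(X, y):
        prediction = sum(x_ij * w_j for x_ij, w_j in zip(row, w))
        total += (y_i - prediction) ** 2
    return total


# Hypothetical noisy observations scattered around y = 2x + 1;
# the last column of X is the column of ones for the free term.
X = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [1.1, 2.9, 5.2]

loss_near_truth = squared_error(X, y, [2.0, 1.0])   # weights close to the generating line
loss_far_away = squared_error(X, y, [0.0, 0.0])     # weights far from it
```

Weights close to the line that generated the data give a much smaller value than weights far from it, which is exactly what the minimization exploits.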
QR decomposition
The minimum of the function above can be found by locating the point at which its gradient is zero. The gradient is written as follows:
$$\nabla L(w) = 2X^\top (Xw - y)$$
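So we decompose the matrix $X$ into matrices $Q$ and $R$ and carry out a series of transformations, starting from the zero-gradient condition $X^\top X w = X^\top y$ (the QR decomposition algorithm itself is not covered here, only its use for the task at hand):
$$X = QR, \qquad (QR)^\top QR\,w = (QR)^\top y, \qquad R^\top Q^\top Q R\,w = R^\top Q^\top y$$
The matrix $Q$ is orthogonal. This allows us to get rid of the product $Q^\top Q$:
$$R^\top R\,w = R^\top Q^\top y$$
And if we multiply both sides on the left by $(R^\top)^{-1}$ (assuming $R$ is invertible), we get $Rw = Q^\top y$. Given that $R$ is an upper triangular matrix, and writing $b = Q^\top y$, this looks as follows:
$$\begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1m} \\ 0 & r_{22} & \cdots & r_{2m} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & r_{mm} \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_m \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$
This can be solved by back substitution. The element $w_m$ is found as $b_m / r_{mm}$, the previous element $w_{m-1}$ is found as $(b_{m-1} - r_{m-1,m} w_m) / r_{m-1,m-1}$, and so on.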
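The whole chain (decompose, form the right-hand side, back-substitute) can be sketched in plain Python. The article does not say which QR algorithm is used, so modified Gram-Schmidt is an assumption made here purely to keep the example short. The function names and the data are hypothetical, and the sketch assumes the matrix has full column rank.

```python
import math


def qr_gram_schmidt(X):
    """Thin QR of an n x m matrix via modified Gram-Schmidt.

    Returns Q as a list of m orthonormal columns and R as an m x m
    upper triangular matrix. Assumes full column rank.
    """
    n, m = len(X), len(X[0])
    columns = [[X[i][j] for i in range(n)] for j in range(m)]
    Q = []
    R = [[0.0] * m for _ in range(m)]
    for j, column in enumerate(columns):
        v = column[:]
        for k, q in enumerate(Q):
            # Remove the component of v along each earlier orthonormal column.
            R[k][j] = sum(q_i * v_i for q_i, v_i in zip(q, v))
            v = [v_i - R[k][j] * q_i for q_i, v_i in zip(q, v)]
        R[j][j] = math.sqrt(sum(v_i * v_i for v_i in v))
        Q.append([v_i / R[j][j] for v_i in v])
    return Q, R


def back_substitution(R, b):
    """Solve R w = b for upper triangular R, starting from the last unknown."""
    m = len(b)
    w = [0.0] * m
    for i in range(m - 1, -1, -1):
        known = sum(R[i][j] * w[j] for j in range(i + 1, m))
        w[i] = (b[i] - known) / R[i][i]
    return w


def solve_least_squares_qr(X, y):
    """Solve min ||Xw - y||^2 through R w = Q^T y."""
    Q, R = qr_gram_schmidt(X)
    b = [sum(q_i * y_i for q_i, y_i in zip(q, y)) for q in Q]  # b = Q^T y
    return back_substitution(R, b)


# Hypothetical noiseless data on the line y = 2x + 1 (last column is ones).
X = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
y = [1.0, 3.0, 5.0, 7.0]
w = solve_least_squares_qr(X, y)
```

For this data the solver recovers a slope close to 2 and a free term close to 1, up to floating-point rounding. A production implementation would more likely rely on Householder reflections, which are numerically more robust than Gram-Schmidt.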
It is worth noting here that the complexity of the resulting algorithm, because of the QR decomposition, is $O(nm^2)$ for an $n \times m$ matrix. Moreover, even though matrix multiplication parallelizes well, it is not possible to write an efficient distributed version of this algorithm.
Gradient descent
When talking about minimizing a function, it is always worth remembering the method of (stochastic) gradient descent. It is a simple and effective method based on computing the gradient of the function at a point and then moving in the direction opposite to the gradient. Each such step brings the solution closer to the minimum. The gradient is still the same:
$$\nabla L(w) = 2X^\top (Xw - y) = 2\sum_{i=1}^{n} x_i \left(x_i^\top w - y_i\right)$$
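As an illustration, here is a minimal full-batch gradient descent loop in plain Python for the least squares objective. The learning rate, the number of steps, the function names and the data are assumptions picked so that this small example converges. The stochastic variant, which uses a random subset of the data per step, is not shown.

```python
def gradient(X, y, w):
    """Gradient of sum_i (y_i - x_i . w)^2, i.e. 2 * sum_i x_i * (x_i . w - y_i)."""
    grad = [0.0] * len(w)
    for row, y_i in zip(X, y):
        residual = sum(x_ij * w_j for x_ij, w_j in zip(row, w)) - y_i
        for j, x_ij in enumerate(row):
            grad[j] += 2.0 * x_ij * residual
    return grad


def gradient_descent(X, y, learning_rate=0.02, steps=2000):
    """Repeatedly step against the gradient, starting from the zero vector."""
    w = [0.0] * len(X[0])
    for _ in range(steps):
        grad = gradient(X, y, w)
        w = [w_j - learning_rate * g_j for w_j, g_j in zip(w, grad)]
    return w


# Hypothetical noiseless data on the line y = 2x + 1 (last column is ones).
X = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
y = [1.0, 3.0, 5.0, 7.0]
w = gradient_descent(X, y)
```

Even on four points the loop needs many iterations to approach the slope 2 and the free term 1, and too large a learning rate makes it diverge. This hints at the convergence cost of the method.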
This method also parallelizes and distributes well, thanks to the linearity of the gradient operator. Note that the terms under the summation sign in the gradient formula are independent. In other words, we can compute the gradient for all indices $i$ from $1$ to $k$ and, in parallel, compute the gradient for the indices from $k+1$ to $n$, and then add the two results. The sum is exactly what we would get by computing the gradient for the indices from $1$ to $n$ in one go. So, if the data is distributed across several partitions with index sets $I_1, \dots, I_P$, the gradient can be computed independently on each partition, and the partial results can then be summed to obtain the final result:
$$\nabla L(w) = \sum_{p=1}^{P} \nabla L_p(w), \qquad \nabla L_p(w) = 2\sum_{i \in I_p} x_i \left(x_i^\top w - y_i\right)$$
From the implementation point of view, this fits the MapReduce paradigm.
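The additivity of the gradient over partitions can be checked with a small map-and-reduce style sketch in plain Python. The two-way split, the function names and the data are hypothetical. In a real cluster each call to `partial_gradient` would run on the node that holds the partition.

```python
from functools import reduce


def partial_gradient(partition, w):
    """Map step: gradient contribution of one partition of (row, y_i) pairs."""
    grad = [0.0] * len(w)
    for row, y_i in partition:
        residual = sum(x_ij * w_j for x_ij, w_j in zip(row, w)) - y_i
        for j, x_ij in enumerate(row):
            grad[j] += 2.0 * x_ij * residual
    return grad


def add_vectors(a, b):
    """Reduce step: element-wise sum of two partial gradients."""
    return [a_j + b_j for a_j, b_j in zip(a, b)]


# Hypothetical observations; the last feature is the constant 1 for the free term.
data = [
    ([0.0, 1.0], 1.0),
    ([1.0, 1.0], 3.0),
    ([2.0, 1.0], 5.0),
    ([3.0, 1.0], 7.0),
]
w = [0.0, 0.0]

partitions = [data[:2], data[2:]]   # pretend these live on two different nodes
distributed = reduce(add_vectors, (partial_gradient(p, w) for p in partitions))
single_node = partial_gradient(data, w)
```

Both ways of computing the gradient give the same vector, and only one vector of length equal to the number of weights has to leave each partition per step.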
Despite its ease of implementation and its fit with the MapReduce paradigm, gradient descent has its drawbacks too. In particular, the number of steps needed to reach convergence is considerably higher than for other, more specialized methods.
LSQR
The LSQR method is based on the Golub-Kahan bidiagonalization procedure.
But if we assume that the matrix $X$ is partitioned horizontally, then each iteration can be represented as two MapReduce steps. This makes it possible to minimize the amount of data transferred during each iteration (only vectors whose length equals the number of unknowns).
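The source does not show how the two steps are laid out, so the following is only a sketch of the general idea under stated assumptions, not the article's implementation and not LSQR itself. Each LSQR iteration needs one product with the matrix and one with its transpose. With row-wise partitions, the first product can stay local to each node, and the second only requires summing vectors whose length equals the number of unknowns. All function names and data below are hypothetical.

```python
def local_forward(X_part, v):
    """First step on one partition: its slice of u = X v, kept on the node."""
    return [sum(x_ij * v_j for x_ij, v_j in zip(row, v)) for row in X_part]


def local_backward(X_part, u_part):
    """Second step on one partition: its contribution X_p^T u_p (length m)."""
    contribution = [0.0] * len(X_part[0])
    for row, u_i in zip(X_part, u_part):
        for j, x_ij in enumerate(row):
            contribution[j] += x_ij * u_i
    return contribution


def distributed_xt_x_v(partitions, v):
    """Compute X^T (X v) by broadcasting v and summing length-m contributions."""
    total = [0.0] * len(v)
    for X_part in partitions:                      # conceptually runs in parallel
        u_part = local_forward(X_part, v)          # never leaves the partition
        contribution = local_backward(X_part, u_part)
        total = [t + c for t, c in zip(total, contribution)]
    return total


# Hypothetical 4 x 2 matrix split horizontally into two partitions.
partitions = [
    [[0.0, 1.0], [1.0, 1.0]],
    [[2.0, 1.0], [3.0, 1.0]],
]
result = distributed_xt_x_v(partitions, [1.0, 0.0])
```

The only data crossing partition boundaries is the broadcast vector `v` and one contribution of the same length per partition, regardless of how many rows each partition holds.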
This is the approach used to implement linear regression in the machine learning module mentioned at the start.
Conclusion
There are many algorithms for recovering linear regression, but not all of them are applicable in every situation. QR decomposition is excellent for an exact solution on small datasets. Gradient descent is simple to implement and lets you find an approximate solution quickly. And LSQR combines the best properties of the previous two algorithms: it can be distributed, it converges faster than gradient descent, and, unlike QR decomposition, it can be stopped early to obtain an approximate solution.
source: www.habr.com