This article covers several ways to determine the equation of a simple (paired) linear regression line.
All the methods of solving the equation discussed here are based on the least squares method. We will demonstrate the following approaches:
- Analytical solution
- Gradient descent
- Stochastic gradient descent
For each way of solving the straight-line equation, the article provides various functions, divided mainly into those written without using the NumPy library and those that use NumPy for the calculations. It is believed that skillful use of NumPy will reduce the computational cost.
All the code in the article is written in Python 2.7 using Jupyter Notebook. The source code and a file with the sample data accompany the article.
The article is aimed both at beginners and at those who have already gradually started studying a very broad branch of artificial intelligence: machine learning.
To illustrate the material, we use a very simple example.
Example conditions
We have five values characterizing the dependence of Y on X (Table No. 1):
Table No. 1 "Example conditions"
We will assume that the values x_i are the months of the year, and y_i is the revenue in the corresponding month. In other words, revenue depends on the month of the year, and the month of the year is the only feature on which the revenue depends.
The example is a toy one, both from the point of view of the conditional dependence of revenue on the month of the year and from the point of view of the number of values, of which there are very few. However, such a simplification will make it possible to explain, as they say, on one's fingers, the material that beginners absorb not always with ease. And the simplicity of the numbers will allow those who wish to solve the example on paper without significant labor costs.
Let us assume that the dependence given in the example can be approximated fairly well by the equation of a simple (paired) linear regression line of the form:

\begin{equation*}
y = a + bx
\end{equation*}

where x is the month in which the revenue was received, y is the revenue corresponding to the month, and a and b are the coefficients of the estimated line.
Note that the coefficient b is often called the slope or gradient of the estimated line; it represents the amount by which y changes when x changes by one.
Obviously, our task in the example is to pick such coefficients a and b in the equation that the deviations of our calculated monthly revenue values from the true answers, i.e. the values presented in the sample, are minimal.
The least squares method
According to the least squares method, the deviations should be squared. This technique avoids mutual cancellation of deviations that have opposite signs. For example, if in one case the deviation is +5 (plus five) and in another −5 (minus five), the sum of the deviations cancels out and equals 0 (zero). Instead of squaring the deviations, one could take their absolute values; then all deviations would be positive and would accumulate. We will not dwell on this point in detail, but simply note that for convenience of calculation it is customary to square the deviations.
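The cancellation effect described above is easy to see with a tiny illustration (a sketch that is not part of the article's original code; written in Python 3 for brevity):

```python
# two deviations of equal magnitude but opposite signs
deviations = [5, -5]

# the plain sum cancels out to zero, hiding the error
plain_sum = sum(deviations)

# squaring makes every deviation positive, so errors accumulate
squared_sum = sum(d ** 2 for d in deviations)

# absolute values would also work, but squares are easier to differentiate
abs_sum = sum(abs(d) for d in deviations)

print(plain_sum, squared_sum, abs_sum)  # 0 50 10
```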
This is what the formula looks like with which we will determine the minimum sum of squared deviations (errors):

\begin{equation*}
ERR(a,b) = \sum\limits_{i=1}^{n}\big(f(x_i) - y_i\big)^2 \to \min
\end{equation*}

where f(x_i) = a + bx_i is the function approximating the true answers (that is, the revenue we have calculated),
y_i are the true answers (the revenue given in the sample),
i is the sample index (the number of the month in which the deviation is determined).
Let us differentiate the function, define the first-order partial differential equations, and be ready to proceed to the analytical solution. But first, let us take a short excursion into what differentiation is and recall the geometric meaning of the derivative.
Differentiation
Differentiation is the operation of finding the derivative of a function.
What is the derivative for? The derivative of a function characterizes the rate of change of the function and indicates its direction. If the derivative at a given point is positive, the function is increasing; otherwise, the function is decreasing. And the greater the absolute value of the derivative, the higher the rate of change of the function's values, and the steeper the slope of the function's graph.
For example, in a Cartesian coordinate system, a derivative value of +25 at the point M(0,0) means that at that point, when x is shifted to the right by one conventional unit, y increases by 25 conventional units. On the graph it looks like a rather steep rise of the y values from the given point.
Another example. A derivative value of −0.1 means that when x is shifted by one conventional unit, y decreases by only 0.1 conventional unit. At the same time, on the graph of the function, we can observe a barely noticeable downward slope. Drawing an analogy with a mountain, it is as if we were very slowly descending a gentle slope, unlike the previous example, where we had to climb very steep peaks :)
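The geometric meaning just described can be checked numerically. Here is a short sketch (not from the article) that estimates the derivative of f(x) = x² with a central finite difference:

```python
def numeric_derivative(f, x, h=1e-6):
    # central difference approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

f = lambda x: x ** 2

# f'(x) = 2x, so at x = 3 the slope is positive and steep: the function grows
print(round(numeric_derivative(f, 3.0), 6))
# at x = -0.05 the slope is -0.1: a barely noticeable decline
print(round(numeric_derivative(f, -0.05), 6))
```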
So, after differentiating the error function with respect to the coefficients a and b, we define the first-order partial differential equations. After determining the equations, we obtain a system of two equations; by solving it, we can pick such values of the coefficients a and b for which the values of the corresponding derivatives change by very small amounts at the given points, and in the case of the analytical solution do not change at all. In other words, the error function at the found coefficients reaches a minimum, since the values of the partial derivatives at these points equal zero.
So, according to the rules of differentiation, the first-order partial derivative equation with respect to the coefficient a takes the form:

\begin{equation*}
2na + 2b\sum\limits_{i=1}^{n}x_i - 2\sum\limits_{i=1}^{n}y_i = 0
\end{equation*}

The first-order partial derivative equation with respect to b takes the form:

\begin{equation*}
2a\sum\limits_{i=1}^{n}x_i + 2b\sum\limits_{i=1}^{n}x_i^2 - 2\sum\limits_{i=1}^{n}x_iy_i = 0
\end{equation*}
As a result, after dividing both equations by 2, we obtain a system of equations that has a simple analytical solution:

\begin{equation*}
\begin{cases}
na + b\sum\limits_{i=1}^{n}x_i - \sum\limits_{i=1}^{n}y_i = 0
\\
a\sum\limits_{i=1}^{n}x_i + b\sum\limits_{i=1}^{n}x_i^2 - \sum\limits_{i=1}^{n}x_iy_i = 0
\end{cases}
\end{equation*}
Before solving the equation, let us first load the data, check that it loaded correctly, and format it.
Loading and formatting the data
It should be noted that, since for the analytical solution, and later for gradient and stochastic gradient descent, we will run the code in two variants — using the NumPy library and without it — we will need the corresponding data formats (see the code).
Data loading and processing code
# import all the libraries we need
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math
import pylab as pl
import random
# display the charts in Jupyter
%matplotlib inline
# set the chart size
from pylab import rcParams
rcParams['figure.figsize'] = 12, 6
# turn off Anaconda warnings
import warnings
warnings.simplefilter('ignore')
# load the values
table_zero = pd.read_csv('data_example.txt', header=0, sep='\t')
# look at the information about the table and at the table itself
print table_zero.info()
print '********************************************'
print table_zero
print '********************************************'
# prepare the data without using NumPy
x_us = []
[x_us.append(float(i)) for i in table_zero['x']]
print x_us
print type(x_us)
print '********************************************'
y_us = []
[y_us.append(float(i)) for i in table_zero['y']]
print y_us
print type(y_us)
print '********************************************'
# prepare the data using NumPy
x_np = table_zero[['x']].values
print x_np
print type(x_np)
print x_np.shape
print '********************************************'
y_np = table_zero[['y']].values
print y_np
print type(y_np)
print y_np.shape
print '********************************************'
Visualization
Now that we have, first, loaded the data, second, checked that it loaded correctly and, finally, formatted it, let us do a first visualization. The seaborn library is often used for this. In our example, because of the limited number of values, there is no point in using seaborn. We will use the ordinary Matplotlib library and just look at the scatter plot.
Scatter plot code
print 'Chart No. 1 "Dependence of revenue on the month of the year"'
plt.plot(x_us,y_us,'o',color='green',markersize=16)
plt.xlabel('$Months$', size=16)
plt.ylabel('$Sales$', size=16)
plt.show()
Chart No. 1 "Dependence of revenue on the month of the year"
Analytical solution
Let us use the most common tools in python and solve the system of equations:
\begin{equation*}
\begin{cases}
na + b\sum\limits_{i=1}^{n}x_i - \sum\limits_{i=1}^{n}y_i = 0
\\
a\sum\limits_{i=1}^{n}x_i + b\sum\limits_{i=1}^{n}x_i^2 - \sum\limits_{i=1}^{n}x_iy_i = 0
\end{cases}
\end{equation*}
According to Cramer's rule, we find the general determinant, as well as the determinants with respect to a and with respect to b; then, dividing the corresponding determinant by the general determinant, we find the coefficient a and, similarly, the coefficient b.
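Cramer's rule for this 2×2 system can be sketched on made-up numbers and cross-checked against NumPy's general solver (the sums below are hypothetical, not the article's data):

```python
import numpy as np

# illustrative normal-equation sums (hypothetical numbers, not the article's data)
n, sx, sy, sx_sq, sxy = 5.0, 15.0, 275.0, 55.0, 885.0

# Cramer's rule for the system:
#   n*a  + sx*b    = sy
#   sx*a + sx_sq*b = sxy
det = n * sx_sq - sx * sx          # general determinant
a = (sy * sx_sq - sx * sxy) / det  # determinant w.r.t. a over the general one
b = (n * sxy - sx * sy) / det      # determinant w.r.t. b over the general one

# cross-check against NumPy's general linear solver
a_np, b_np = np.linalg.solve([[n, sx], [sx, sx_sq]], [sy, sxy])
assert abs(a - a_np) < 1e-9 and abs(b - b_np) < 1e-9
print(a, b)  # 37.0 6.0
```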
Analytical solution code
# define a function to calculate the coefficients a and b by Cramer's rule
def Kramer_method (x,y):
    # sum of the values (all the months)
    sx = sum(x)
    # sum of the true answers (revenue for the whole period)
    sy = sum(y)
    # sum of the products of the values and the true answers
    list_xy = []
    [list_xy.append(x[i]*y[i]) for i in range(len(x))]
    sxy = sum(list_xy)
    # sum of the squared values
    list_x_sq = []
    [list_x_sq.append(x[i]**2) for i in range(len(x))]
    sx_sq = sum(list_x_sq)
    # number of values
    n = len(x)
    # general determinant
    det = sx_sq*n - sx*sx
    # determinant with respect to a
    det_a = sx_sq*sy - sx*sxy
    # sought parameter a
    a = (det_a / det)
    # determinant with respect to b
    det_b = sxy*n - sy*sx
    # sought parameter b
    b = (det_b / det)
    # control values (check)
    check1 = (n*a + b*sx - sy)
    check2 = (a*sx + b*sx_sq - sxy)
    return [round(a,4), round(b,4)]
# run the function and record the correct answers
ab_us = Kramer_method(x_us,y_us)
a_us = ab_us[0]
b_us = ab_us[1]
print '\033[1m' + '\033[4m' + "Optimal values of the coefficients a and b:" + '\033[0m'
print 'a =', a_us
print 'b =', b_us
print
# define a function to calculate the sum of squared errors
def errors_sq_Kramer_method(answers,x,y):
    list_errors_sq = []
    for i in range(len(x)):
        err = (answers[0] + answers[1]*x[i] - y[i])**2
        list_errors_sq.append(err)
    return sum(list_errors_sq)
# run the function and record the error value
error_sq = errors_sq_Kramer_method(ab_us,x_us,y_us)
print '\033[1m' + '\033[4m' + "Sum of squared deviations" + '\033[0m'
print error_sq
print
# measure the calculation time
# print '\033[1m' + '\033[4m' + "Time to calculate the sum of squared deviations:" + '\033[0m'
# %timeit error_sq = errors_sq_Kramer_method(ab_us,x_us,y_us)
Here is what we get:
So, the values of the coefficients are found, and the sum of squared deviations is established. Let us draw a straight line on the scatter plot in accordance with the coefficients found.
Regression line code
# define a function to form an array of the calculated revenue values
def sales_count(ab,x,y):
line_answers = []
[line_answers.append(ab[0]+ab[1]*x[i]) for i in range(len(x))]
return line_answers
# ΠΏΠΎΡΡΡΠΎΠΈΠΌ Π³ΡΠ°ΡΠΈΠΊΠΈ
print 'Chart No. 2 "Correct and calculated answers"'
plt.plot(x_us,y_us,'o',color='green',markersize=16, label = '$True$ $answers$')
plt.plot(x_us, sales_count(ab_us,x_us,y_us), color='red',lw=4,
label='$Function: a + bx,$ $where$ $a='+str(round(ab_us[0],2))+',$ $b='+str(round(ab_us[1],2))+'$')
plt.xlabel('$Months$', size=16)
plt.ylabel('$Sales$', size=16)
plt.legend(loc=1, prop={'size': 16})
plt.show()
Chart No. 2 "Correct and calculated answers"
We can also look at the deviation chart for each month. In our case, we will not derive any significant practical value from it, but we will satisfy our curiosity about how well the simple linear regression equation characterizes the dependence of revenue on the month of the year.
Deviation chart code
# define a function to form an array of deviations in percent
def error_per_month(ab,x,y):
sales_c = sales_count(ab,x,y)
errors_percent = []
for i in range(len(x)):
errors_percent.append(100*(sales_c[i]-y[i])/y[i])
return errors_percent
# draw the chart
print 'Chart No. 3 "Month-by-month deviations, %"'
plt.gca().bar(x_us, error_per_month(ab_us,x_us,y_us), color='brown')
plt.xlabel('Months', size=16)
plt.ylabel('Calculation error, %', size=16)
plt.show()
Chart No. 3 "Deviations, %"
Not ideal, but we have completed our task.
Let us write a function that determines the coefficients a and b using the NumPy library. More precisely, we will write two functions: one using a pseudoinverse matrix (not recommended in practice, since the process is computationally complex and unstable), and another using a matrix equation.
Analytical solution code (NumPy)
# first, add a column of constant values equal to 1.
# This column is needed so that the coefficient a does not have to be handled separately
vector_1 = np.ones((x_np.shape[0],1))
x_np = table_zero[['x']].values # just in case, restore the vector x_np to its original format
x_np = np.hstack((vector_1,x_np))
# check that everything was done correctly
print vector_1[0:3]
print x_np[0:3]
print '***************************************'
print
# write a function that determines the values of the coefficients a and b using the pseudoinverse matrix
def pseudoinverse_matrix(X, y):
    # set an explicit matrix format for the feature matrix
    X = np.matrix(X)
    # determine the transposed matrix
    XT = X.T
    # determine the square matrix
    XTX = XT*X
    # determine the pseudoinverse matrix
    inv = np.linalg.pinv(XTX)
    # set an explicit matrix format for the matrix of answers
    y = np.matrix(y)
    # find the vector of weights
    return (inv*XT)*y
# run the function
ab_np = pseudoinverse_matrix(x_np, y_np)
print ab_np
print '***************************************'
print
# write a function that uses a matrix equation for the solution
def matrix_equation(X,y):
    a = np.dot(X.T, X)
    b = np.dot(X.T, y)
    return np.linalg.solve(a, b)
# run the function
ab_np = matrix_equation(x_np,y_np)
print ab_np
Let us compare the time spent on determining the coefficients a and b with the 3 methods presented.
Code for calculating the computation time
print '\033[1m' + '\033[4m' + "Time to compute the coefficients without the NumPy library:" + '\033[0m'
%timeit ab_us = Kramer_method(x_us,y_us)
print '***************************************'
print
print '\033[1m' + '\033[4m' + "Time to compute the coefficients using the pseudoinverse matrix:" + '\033[0m'
%timeit ab_np = pseudoinverse_matrix(x_np, y_np)
print '***************************************'
print
print '\033[1m' + '\033[4m' + "Time to compute the coefficients using the matrix equation:" + '\033[0m'
%timeit ab_np = matrix_equation(x_np, y_np)
On a small amount of data, the "self-written" function that finds the coefficients by Cramer's method comes out ahead.
Now we can move on to other ways of finding the coefficients a and b.
Gradient descent
First, let us define what a gradient is. Simply put, the gradient is a segment indicating the direction of the steepest growth of a function. By analogy with climbing a mountain, the gradient points to where the steepest ascent to the summit is. Developing the mountain example, we recall that we actually need the steepest descent in order to reach the lowland as quickly as possible, that is, the minimum: the place where the function neither grows nor decreases. At this point the derivative equals zero. Therefore, we need not the gradient, but the antigradient. To find the antigradient, you just need to multiply the gradient by −1 (minus one).
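The gradient-versus-antigradient idea can be sanity-checked on a one-dimensional parabola. A minimal sketch (Python 3, not part of the article's code):

```python
f = lambda x: x ** 2      # a parabola with its single minimum at x = 0
grad = lambda x: 2 * x    # its derivative (the 1-D "gradient")

x = 3.0
step = 0.1
# a step along the antigradient (the gradient times -1) moves us downhill
x_new = x - step * grad(x)

assert f(x_new) < f(x)    # the function value decreased
print(x_new)              # 2.4
```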
Note the fact that a function can have several minima, and, having descended into one of them using the algorithm proposed below, we will not be able to find another minimum, which may be lower than the one found. Let's relax, this is no threat to us! In our case we are dealing with a single minimum, since on the graph our function is an ordinary parabola. And as we all should know very well from our school mathematics course, a parabola has only one minimum.
After we found out why we need the gradient, and also that the gradient is a segment, that is, a vector with given coordinates, which are exactly the coefficients a and b, we can apply gradient descent.
Before starting, I suggest reading just a few sentences about the descent algorithm:
- We determine the coordinates of the coefficients a and b in a pseudo-random way. In our example, we will determine coefficients near zero. This is common practice, but each case may have its own practice.
- From the coordinate a we subtract the value of the first-order partial derivative at the point a. So, if the derivative is positive, the function is increasing. Therefore, by subtracting the value of the derivative, we move in the direction opposite to the growth, that is, in the direction of descent. If the derivative is negative, the function at that point is decreasing, and by subtracting the value of the derivative we also move in the direction of descent.
- We perform the same operation with the coordinate b: subtract the value of the partial derivative at the point b.
- In order not to jump over the minimum and fly off into deep space, it is necessary to set the step size in the direction of descent. In general, one could write a whole article about how to set the step correctly and how to change it during the descent in order to reduce computational cost. But now we have a slightly different task before us, and we will establish the step size by the scientific method of "poking around" or, as they say in common parlance, empirically.
- Once we have subtracted the values of the derivatives from the given coordinates a and b, we obtain new coordinates a and b. We take the next step (subtraction) already from the calculated coordinates. And so the cycle starts again and again, until the required convergence is achieved.
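The steps above can be written compactly as update rules (a reconstruction consistent with the gradient descent code in this article, where λ is the step length and n the number of values):

```latex
\begin{equation*}
a^{(k+1)} = a^{(k)} - \frac{\lambda}{n}\Big(na^{(k)} + b^{(k)}\sum\limits_{i=1}^{n}x_i - \sum\limits_{i=1}^{n}y_i\Big)
\end{equation*}
\begin{equation*}
b^{(k+1)} = b^{(k)} - \frac{\lambda}{n}\Big(a^{(k)}\sum\limits_{i=1}^{n}x_i + b^{(k)}\sum\limits_{i=1}^{n}x_i^2 - \sum\limits_{i=1}^{n}x_iy_i\Big)
\end{equation*}
```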
That's all! Now we are ready to go in search of the deepest gorge of the Mariana Trench. Let's begin.
Gradient descent code
# write the gradient descent function without using the NumPy library.
# The function takes as input the ranges of values x,y, the step length (default=0.1), and the permissible error (tolerance)
def gradient_descent_usual(x_us,y_us,l=0.1,tolerance=0.000000000001):
    # sum of the values (all the months)
    sx = sum(x_us)
    # sum of the true answers (revenue for the whole period)
    sy = sum(y_us)
    # sum of the products of the values and the true answers
    list_xy = []
    [list_xy.append(x_us[i]*y_us[i]) for i in range(len(x_us))]
    sxy = sum(list_xy)
    # sum of the squared values
    list_x_sq = []
    [list_x_sq.append(x_us[i]**2) for i in range(len(x_us))]
    sx_sq = sum(list_x_sq)
    # number of values
    num = len(x_us)
    # initial values of the coefficients, determined pseudo-randomly
    a = float(random.uniform(-0.5, 0.5))
    b = float(random.uniform(-0.5, 0.5))
    # create an array of errors; to start, we use the values 1 and 0
    # after the descent finishes, we will remove the starting values
    errors = [1,0]
    # start the descent loop
    # the loop runs until the deviation of the last sum-of-squares error from the previous one becomes less than the tolerance
    while abs(errors[-1]-errors[-2]) > tolerance:
        a_step = a - l*(num*a + b*sx - sy)/num
        b_step = b - l*(a*sx + b*sx_sq - sxy)/num
        a = a_step
        b = b_step
        ab = [a,b]
        errors.append(errors_sq_Kramer_method(ab,x_us,y_us))
    return (ab),(errors[2:])
# record the array of values
list_parametres_gradient_descence = gradient_descent_usual(x_us,y_us,l=0.1,tolerance=0.000000000001)
print '\033[1m' + '\033[4m' + "Values of the coefficients a and b:" + '\033[0m'
print 'a =', round(list_parametres_gradient_descence[0][0],3)
print 'b =', round(list_parametres_gradient_descence[0][1],3)
print
print '\033[1m' + '\033[4m' + "Sum of squared deviations:" + '\033[0m'
print round(list_parametres_gradient_descence[1][-1],3)
print
print '\033[1m' + '\033[4m' + "Number of iterations in gradient descent:" + '\033[0m'
print len(list_parametres_gradient_descence[1])
print
We descended to the very bottom of the Mariana Trench and found there all the same coefficient values a and b, which is exactly what was to be expected.
Let us dive again, only this time our deep-sea vehicle will be filled with a different technology, namely the NumPy library.
Gradient descent code (NumPy)
# before defining the gradient descent function using the NumPy library,
# let us write a function for the sum of squared deviations that also uses NumPy
def error_square_numpy(ab,x_np,y_np):
    y_pred = np.dot(x_np,ab)
    error = y_pred - y_np
    return sum((error)**2)
# write the gradient descent function using the NumPy library.
# The function takes as input the ranges of values x,y, the step length (default=0.1), and the permissible error (tolerance)
def gradient_descent_numpy(x_np,y_np,l=0.1,tolerance=0.000000000001):
    # sum of the values (all the months)
    sx = float(sum(x_np[:,1]))
    # sum of the true answers (revenue for the whole period)
    sy = float(sum(y_np))
    # sum of the products of the values and the true answers
    sxy = x_np*y_np
    sxy = float(sum(sxy[:,1]))
    # sum of the squared values
    sx_sq = float(sum(x_np[:,1]**2))
    # number of values
    num = float(x_np.shape[0])
    # initial values of the coefficients, determined pseudo-randomly
    a = float(random.uniform(-0.5, 0.5))
    b = float(random.uniform(-0.5, 0.5))
    # create an array of errors; to start, we use the values 1 and 0
    # after the descent finishes, we will remove the starting values
    errors = [1,0]
    # start the descent loop
    # the loop runs until the deviation of the last sum-of-squares error from the previous one becomes less than the tolerance
    while abs(errors[-1]-errors[-2]) > tolerance:
        a_step = a - l*(num*a + b*sx - sy)/num
        b_step = b - l*(a*sx + b*sx_sq - sxy)/num
        a = a_step
        b = b_step
        ab = np.array([[a],[b]])
        errors.append(error_square_numpy(ab,x_np,y_np))
    return (ab),(errors[2:])
# record the array of values
list_parametres_gradient_descence = gradient_descent_numpy(x_np,y_np,l=0.1,tolerance=0.000000000001)
print '\033[1m' + '\033[4m' + "Values of the coefficients a and b:" + '\033[0m'
print 'a =', round(list_parametres_gradient_descence[0][0],3)
print 'b =', round(list_parametres_gradient_descence[0][1],3)
print
print '\033[1m' + '\033[4m' + "Sum of squared deviations:" + '\033[0m'
print round(list_parametres_gradient_descence[1][-1],3)
print
print '\033[1m' + '\033[4m' + "Number of iterations in gradient descent:" + '\033[0m'
print len(list_parametres_gradient_descence[1])
print
The coefficient values a and b are unchanged.
Let us look at how the error changed during gradient descent, that is, how the sum of squared deviations changed with each step.
Code for plotting the sums of squared deviations
print 'Chart No. 4 "Sum of squared deviations step by step"'
plt.plot(range(len(list_parametres_gradient_descence[1])), list_parametres_gradient_descence[1], color='red', lw=3)
plt.xlabel('Steps (Iteration)', size=16)
plt.ylabel('Sum of squared deviations', size=16)
plt.show()
Chart No. 4 "Sum of squared deviations during gradient descent"
On the chart we see that the error decreases with each step, and after a certain number of iterations we observe an almost horizontal line.
Finally, let us estimate the difference in code execution time:
Code to determine the gradient descent computation time
print '\033[1m' + '\033[4m' + "Gradient descent execution time without the NumPy library:" + '\033[0m'
%timeit list_parametres_gradient_descence = gradient_descent_usual(x_us,y_us,l=0.1,tolerance=0.000000000001)
print '***************************************'
print
print '\033[1m' + '\033[4m' + "Gradient descent execution time using the NumPy library:" + '\033[0m'
%timeit list_parametres_gradient_descence = gradient_descent_numpy(x_np,y_np,l=0.1,tolerance=0.000000000001)
Perhaps we are doing something wrong, but once again the simple "home-written" function that does not use the NumPy library beats the computation time of the function using NumPy.
But we are not standing still; we are moving on to study another exciting way to solve the simple linear regression equation. Meet it!
Stochastic gradient descent
To quickly understand how stochastic gradient descent works, it is better to determine how it differs from ordinary gradient descent. In gradient descent, in the equations of the derivatives with respect to a and b, we used the sums of the values of all the features and of all the true answers available in the sample (that is, the sums over all x_i and y_i). In stochastic gradient descent, we will not use all the values available in the sample; instead, we randomly select a so-called sample index and use its values.
For example, if the index is determined to be number 3 (three), we take the values x_3 and y_3, substitute them into the derivative equations, and determine new coordinates. Then, having determined the coordinates, we again randomly determine a sample index, substitute the values corresponding to the index into the partial differential equations, determine the coordinates a and b anew, and so on, until convergence sets in. At first glance it may seem that this could not work at all, but it does. It is true that the error does not decrease with every step, but there is certainly a downward tendency.
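A single stochastic step on one observation can be sketched as follows (the numbers for the point (x_i, y_i) and the coefficients are made up for illustration; the logic mirrors the step function below):

```python
# current coefficients and one randomly chosen sample point (hypothetical numbers)
a, b = 0.0, 0.0
x_i, y_i = 3.0, 50.0
l = 0.1                       # step length

# error of the prediction for this single observation
error = (a + b * x_i) - y_i   # -50.0

# partial derivatives of the squared error for one point (up to a constant factor)
grad_a = error
grad_b = x_i * error

# move against the gradient
a, b = a - l * grad_a, b - l * grad_b
print(a, b)  # 5.0 15.0
```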
What are the advantages of stochastic gradient descent over the ordinary one? If our sample size is very large and measured in tens of thousands of values, it is much easier to process, say, a random thousand of them than the whole sample. This is where stochastic gradient descent comes into play. In our case, of course, we will not notice much difference.
Let's look at the code.
Stochastic gradient descent code
# define the stochastic gradient step function
def stoch_grad_step_usual(vector_init, x_us, ind, y_us, l):
    # pick the x value that corresponds to the random value of the parameter ind
    # (see the function stoch_grad_descent_usual)
    x = x_us[ind]
    # calculate the y value (revenue) that corresponds to the chosen x value
    y_pred = vector_init[0] + vector_init[1]*x_us[ind]
    # calculate the error of the computed revenue relative to the one presented in the sample
    error = y_pred - y_us[ind]
    # determine the first coordinate of the gradient ab
    grad_a = error
    # determine the second coordinate of ab
    grad_b = x_us[ind]*error
    # calculate the new vector of coefficients
    vector_new = [vector_init[0]-l*grad_a, vector_init[1]-l*grad_b]
    return vector_new
# define the stochastic gradient descent function
def stoch_grad_descent_usual(x_us, y_us, l=0.1, steps = 800):
    # set initial coefficient values for the very start of the function's work
    vector_init = [float(random.uniform(-0.5, 0.5)), float(random.uniform(-0.5, 0.5))]
    errors = []
    # start the descent loop
    # the loop is designed for a certain number of steps (steps)
    for i in range(steps):
        ind = random.choice(range(len(x_us)))
        new_vector = stoch_grad_step_usual(vector_init, x_us, ind, y_us, l)
        vector_init = new_vector
        errors.append(errors_sq_Kramer_method(vector_init,x_us,y_us))
    return (vector_init),(errors)
# record the array of values
list_parametres_stoch_gradient_descence = stoch_grad_descent_usual(x_us, y_us, l=0.1, steps = 800)
print '\033[1m' + '\033[4m' + "Values of the coefficients a and b:" + '\033[0m'
print 'a =', round(list_parametres_stoch_gradient_descence[0][0],3)
print 'b =', round(list_parametres_stoch_gradient_descence[0][1],3)
print
print '\033[1m' + '\033[4m' + "Sum of squared deviations:" + '\033[0m'
print round(list_parametres_stoch_gradient_descence[1][-1],3)
print
print '\033[1m' + '\033[4m' + "Number of iterations in stochastic gradient descent:" + '\033[0m'
print len(list_parametres_stoch_gradient_descence[1])
We look carefully at the coefficients and catch ourselves asking the question "How can this be?". We got different coefficient values a and b. Perhaps stochastic gradient descent found better parameters for the equation? Unfortunately, no. It is enough to look at the sum of squared deviations and see that with the new values of the coefficients the error is larger. We are in no hurry to despair. Let us plot the change in the error.
Code for plotting the sum of squared deviations in stochastic gradient descent
print 'Chart No. 5 "Sum of squared deviations step by step"'
plt.plot(range(len(list_parametres_stoch_gradient_descence[1])), list_parametres_stoch_gradient_descence[1], color='red', lw=2)
plt.xlabel('Steps (Iteration)', size=16)
plt.ylabel('Sum of squared deviations', size=16)
plt.show()
Graph No. 5 "Sum of squared deviations in stochastic gradient descent"
Looking at the graph, everything falls into place, and now we will fix everything.
So what happened? The following happened. When we pick a month at random, it is for that chosen month that our algorithm tries to reduce the error in calculating the revenue. Then we pick another month and repeat the calculation, but now we reduce the error for the second chosen month. Now recall that the first two months deviate substantially from the line of the simple linear regression equation. This means that when either of these two months is chosen, by reducing its error, our algorithm sharply increases the error for the whole sample. So what should be done? The answer is simple: we need to reduce the descent step. After all, with a smaller step the error will stop "jumping" up and down. Or rather, the "jumping" will not stop, but it will not happen so fast :) Let's check.
Code to run SGD with a smaller step
# run the function, reducing the step by a factor of 100 and increasing the number of steps accordingly
list_parametres_stoch_gradient_descence = stoch_grad_descent_usual(x_us, y_us, l=0.001, steps = 80000)
print '\033[1m' + '\033[4m' + "Values of coefficients a and b:" + '\033[0m'
print 'a =', round(list_parametres_stoch_gradient_descence[0][0],3)
print 'b =', round(list_parametres_stoch_gradient_descence[0][1],3)
print
print '\033[1m' + '\033[4m' + "Sum of squared deviations:" + '\033[0m'
print round(list_parametres_stoch_gradient_descence[1][-1],3)
print
print '\033[1m' + '\033[4m' + "Number of iterations in stochastic gradient descent:" + '\033[0m'
print len(list_parametres_stoch_gradient_descence[1])
print 'Graph No. 6 "Sum of squared deviations step by step"'
plt.plot(range(len(list_parametres_stoch_gradient_descence[1])), list_parametres_stoch_gradient_descence[1], color='red', lw=2)
plt.xlabel('Steps (Iteration)', size=16)
plt.ylabel('Sum of squared deviations', size=16)
plt.show()
Graph No. 6 "Sum of squared deviations in stochastic gradient descent (80 thousand steps)"
The coefficients have improved, but they are still not right. Hypothetically, this can be fixed as follows. For example, out of the last 1,000 iterations we choose the coefficient values with which the smallest error was made. True, for that we would also have to write down the coefficient values themselves. We will not do that; instead, let us pay attention to the graph. It looks smooth and the error seems to decrease evenly. In fact this is not true. Let us look at the first 1,000 iterations and compare them with the last.
Code for the SGD charts (first and last 1,000 iterations)
print 'Graph No. 7 "Sum of squared deviations step by step. First 1000 iterations"'
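The selection idea just described (write down the coefficients at each step and keep the pair that produced the smallest error) can be sketched roughly as follows. This is a minimal illustration on toy data, not the article's own stoch_grad_descent_usual function; the name sgd_keep_best and the simple one-feature setup are assumptions made for the example.

```python
import random

# Keep the best (error, a, b) triple seen during stochastic gradient descent.
# Illustrative sketch, not the article's original function.
def sgd_keep_best(x, y, l=0.001, steps=20000):
    a, b = 0.0, 0.0
    best = (float('inf'), a, b)          # (error, a, b) with the smallest error so far
    n = len(x)
    for _ in range(steps):
        i = random.randint(0, n - 1)     # pick a random observation
        err = (a + b * x[i]) - y[i]
        a -= l * err                     # gradient step for the intercept
        b -= l * err * x[i]              # gradient step for the slope
        sse = sum(((a + b * xj) - yj) ** 2 for xj, yj in zip(x, y))
        if sse < best[0]:                # remember the best coefficients seen so far
            best = (sse, a, b)
    return best

random.seed(42)
x = [1, 2, 3, 4, 5]                      # months, as in the article's toy example
y = [10, 14, 16, 21, 24]                 # illustrative revenue values
sse, a, b = sgd_keep_best(x, y)
```

Returning the best recorded pair instead of the final one sidesteps the "jumping" of the last iterations, at the cost of evaluating the full error at every step.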
plt.plot(range(len(list_parametres_stoch_gradient_descence[1][:1000])),
list_parametres_stoch_gradient_descence[1][:1000], color='red', lw=2)
plt.xlabel('Steps (Iteration)', size=16)
plt.ylabel('Sum of squared deviations', size=16)
plt.show()
print 'Graph No. 8 "Sum of squared deviations step by step. Last 1000 iterations"'
plt.plot(range(len(list_parametres_stoch_gradient_descence[1][-1000:])),
list_parametres_stoch_gradient_descence[1][-1000:], color='red', lw=2)
plt.xlabel('Steps (Iteration)', size=16)
plt.ylabel('Sum of squared deviations', size=16)
plt.show()
Graph No. 7 "Sum of squared deviations in SGD (first 1,000 steps)"
Graph No. 8 "Sum of squared deviations in SGD (last 1,000 steps)"
At the very beginning of the descent we see a fairly uniform and steep decrease in the error. In the last iterations we see that the error circles around the value of 1.475, at some moments even equals this optimal value, but then still goes up... I repeat: you can write down the values of the coefficients a and b, and then choose those for which the error is smallest. However, we had a more serious problem: we had to take 80 thousand steps (see the code) to get values close to optimal. And this already contradicts the idea of saving computation time with stochastic gradient descent relative to gradient descent. What can be corrected and improved? It is not hard to notice that in the first iterations we are confidently going down, and therefore we should leave a large step in the first iterations and reduce the step as we move forward. We will not do this in this article - it is already too long. Those who wish can think for themselves how to do it; it is not difficult :)
Now let's perform stochastic gradient descent using the NumPy library (and let's not stumble over the stones we identified earlier)
Stochastic gradient descent code (NumPy)
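For those who want a starting point: the decreasing-step idea sketched above can be implemented, for example, with a schedule where the step at iteration i is l0 / (1 + i * decay). This is one common schedule among many; the function name sgd_decaying_step, the parameters l0 and decay, and the toy data are all illustrative assumptions, not part of the article's code.

```python
import random

# SGD with a step that shrinks as the iterations go on:
# large confident steps at the start, small careful steps at the end.
def sgd_decaying_step(x, y, l0=0.05, decay=0.005, steps=5000):
    a, b = 0.0, 0.0
    n = len(x)
    for i in range(steps):
        l = l0 / (1.0 + i * decay)       # the step decays as i grows
        j = random.randint(0, n - 1)     # pick a random observation
        err = (a + b * x[j]) - y[j]
        a -= l * err                     # update the intercept
        b -= l * err * x[j]              # update the slope
    return a, b

random.seed(0)
x = [1, 2, 3, 4, 5]                      # months, as in the toy example
y = [10, 14, 16, 21, 24]                 # illustrative revenue values
a, b = sgd_decaying_step(x, y)
```

With a schedule like this, far fewer steps are needed than the 80 thousand fixed-step iterations, because the early large steps cover most of the distance and the late small steps stop the error from "jumping".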
# first, let's write the gradient step function
def stoch_grad_step_numpy(vector_init, X, ind, y, l):
    x = X[ind]
    y_pred = np.dot(x, vector_init)
    err = y_pred - y[ind]
    grad_a = err
    grad_b = x[1] * err
    return vector_init - l * np.array([grad_a, grad_b])

# define the stochastic gradient descent function
def stoch_grad_descent_numpy(X, y, l=0.1, steps = 800):
    vector_init = np.array([[np.random.randint(X.shape[0])], [np.random.randint(X.shape[0])]])
    errors = []
    for i in range(steps):
        ind = np.random.randint(X.shape[0])
        new_vector = stoch_grad_step_numpy(vector_init, X, ind, y, l)
        vector_init = new_vector
        errors.append(error_square_numpy(vector_init, X, y))
    return (vector_init), (errors)

# write down the array of values
list_parametres_stoch_gradient_descence = stoch_grad_descent_numpy(x_np, y_np, l=0.001, steps = 80000)

print '\033[1m' + '\033[4m' + "Values of coefficients a and b:" + '\033[0m'
print 'a =', round(list_parametres_stoch_gradient_descence[0][0],3)
print 'b =', round(list_parametres_stoch_gradient_descence[0][1],3)
print
print '\033[1m' + '\033[4m' + "Sum of squared deviations:" + '\033[0m'
print round(list_parametres_stoch_gradient_descence[1][-1],3)
print
print '\033[1m' + '\033[4m' + "Number of iterations in stochastic gradient descent:" + '\033[0m'
print len(list_parametres_stoch_gradient_descence[1])
print
The values turned out to be almost the same as when descending without using NumPy. However, that is logical.
Let's find out how much time stochastic gradient descent takes us.
Code to determine the SGD computation time (80 thousand steps)
print '\033[1m' + '\033[4m' + "Execution time of stochastic gradient descent without using the NumPy library:" + '\033[0m'
%timeit list_parametres_stoch_gradient_descence = stoch_grad_descent_usual(x_us, y_us, l=0.001, steps = 80000)
print '***************************************'
print
print '\033[1m' + '\033[4m' + "Execution time of stochastic gradient descent using the NumPy library:" + '\033[0m'
%timeit list_parametres_stoch_gradient_descence = stoch_grad_descent_numpy(x_np, y_np, l=0.001, steps = 80000)
The further into the forest, the darker the clouds: once again the "self-written" formula shows the best result. All this suggests that there must be even more subtle ways of using the NumPy library that really do speed up computations. We will not learn about them in this article. There will be something to think about in your spare time :)
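One hedged illustration of the direction hinted at above: NumPy pays off when a whole array is processed in a single vectorized expression instead of an element-by-element Python loop. The function names below are illustrative, not functions from the article.

```python
import numpy as np

# Element-by-element Python loop - the slow style that can make
# NumPy-based code lose to plain "self-written" functions.
def error_square_loop(a, b, x, y):
    return sum((a + b * xi - yi) ** 2 for xi, yi in zip(x, y))

# One vectorized NumPy expression - no Python-level loop at all;
# the arithmetic runs in compiled code over the whole array.
def error_square_vectorized(a, b, x, y):
    return float(np.sum((a + b * x - y) ** 2))

x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 8.0])
print(error_square_vectorized(1.0, 2.0, x, y))  # -> 1.0
```

The two functions compute the same sum of squared deviations; the vectorized form is what makes NumPy worthwhile on large arrays.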
Summing up
Before summing up, I would like to answer a question that most likely arose for our dear reader. Why, in fact, such "torture" with descents, why do we need to walk up and down a mountain (mostly down) to find the treasured lowland, if we have in our hands such a powerful and simple device, in the form of the analytical solution, which teleports us to the right place instantly?
The answer to this question lies on the surface. Here we have looked at a very simple example, in which the true answer depends on a single feature. You do not see this often in life, so let's imagine that we have 2, 30, 50 or more features. Add to this thousands, or even tens of thousands, of values for each feature. In this case, the analytical solution may not withstand the test and fail. In turn, gradient descent and its variations will slowly but surely bring us closer to the goal - the minimum of the function. And do not worry about speed - we will probably still look at ways that allow us to set and regulate the step length (that is, the speed).
And now the actual brief summary.
Firstly, I hope that the material presented in the article will help beginning "data scientists" understand how to solve simple (and not only simple) linear regression equations.
Secondly, we looked at several ways to solve the equation. Now, depending on the situation, we can choose the one that is best suited to solving the problem.
Thirdly, we saw the power of additional settings, namely the gradient descent step length. This parameter cannot be neglected. As noted above, in order to reduce the cost of computations, the step length should be changed during the descent.
Fourthly, in our case, the "home-written" functions showed the best time results for the calculations. This is probably due to not the most skillful use of the capabilities of the NumPy library. But be that as it may, the following conclusion suggests itself. On the one hand, it is sometimes worth questioning established opinions, and on the other hand, it is not always worth complicating everything - on the contrary, sometimes a simpler way of solving a problem is more effective. And since our goal was to analyze three ways of solving a simple linear regression equation, the use of "self-written" functions was quite enough for us.
Literature (or something like that)
1. Linear regression
2. Least squares method
3. Derivative
4. Gradient
5. Gradient descent
6. NumPy library