The path to type checking 4 million lines of Python code. Part 2

Today we are publishing the second part of the translation of an article about how Dropbox set up type checking for several million lines of Python code.

Read part one

Official type hints (PEP 484)

We ran our first serious experiments with mypy at Dropbox during Hack Week 2014. Hack Week is a week-long event hosted by Dropbox, during which employees can work on anything they want! Some of Dropbox's best-known engineering projects started at events like this. Our conclusion from the experiment was that mypy looked promising, although the project was not yet ready for wide adoption.

At the time, the idea of standardizing Python's type hinting system was in the air. As I said, since Python 3.0 it has been possible to write type annotations for functions, but they were just arbitrary expressions with no defined syntax or semantics. At runtime these annotations were, for the most part, simply ignored. After Hack Week we started working on standardizing the semantics. That work resulted in PEP 484 (Guido van Rossum, Łukasz Langa, and I wrote the document together).
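To make the point about annotations being arbitrary, unenforced expressions concrete, here is a minimal sketch. The function `greet` and its annotation values are invented for this illustration; they only show that the interpreter stores annotations without checking them.

```python
def greet(name: "any string will do", times: 2 + 3 = 1) -> str:
    # Annotations can be arbitrary expressions; Python evaluates and stores them.
    return ", ".join(["hello " + str(name)] * times)


# The annotations live in a plain dict on the function object.
annotations = greet.__annotations__

# Nothing is enforced at runtime: an int is accepted for `name`
# even though the annotation talks about a string.
result = greet(42, times=2)
```

A type checker such as mypy gives these annotations meaning only once they follow the agreed PEP 484 notation; the interpreter itself still runs the code unchanged.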

Our motivation was twofold. First, we hoped that the entire Python ecosystem would adopt a common approach to type hints (the Python term for type annotations). Given the risks, that would be better than ending up with several mutually incompatible approaches. Second, we wanted to discuss type hinting openly with the wider Python community. Part of the reason was that we did not want to look like "heretics" who had abandoned the language's core ideas in the eyes of the broad mass of Python programmers. Python is a dynamically typed language, famous for "duck typing". Some initial suspicion toward static typing in the community was inevitable. That attitude eventually faded once it became clear that static typing would not be mandatory (and after people realized that it is actually useful).

The type hinting syntax that was eventually accepted was very similar to what mypy supported at the time. PEP 484 shipped with Python 3.5 in 2015. Python was no longer purely a dynamically typed language. I like to think of this as an important milestone in Python's history.

The migration begins

In late 2015, Dropbox set up a three-person team to work on mypy. It included Guido van Rossum, Greg Price, and David Fisher. From that point on, things started moving quickly. The first obstacle to growing mypy use was performance. As I indicated above, in the project's early days I had thought about compiling the mypy implementation to C, but that idea had been crossed off the list for the time being. We were stuck with running on the CPython interpreter, which is not fast enough for a tool like mypy. (PyPy, an alternative Python implementation with a JIT compiler, did not help us either.)

Fortunately, some algorithmic improvements came to our rescue. The first big speedup was the implementation of incremental checking. The idea behind it was simple: if none of a module's dependencies have changed since the previous mypy run, we can reuse the data cached during that run when working with those dependencies. We only need to type check the modified files and the files that depend on them. Mypy even goes a bit further: if a module's external interface has not changed, mypy assumes that the other modules importing it do not need to be rechecked.
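The incremental idea can be sketched roughly as follows. This is not mypy's actual cache format or algorithm: the `snapshot` and `modules_to_recheck` helpers, the "interface text" strings, and the use of SHA-256 hashes are all assumptions made for this illustration, and the sketch only looks at direct importers.

```python
import hashlib
from typing import Dict, List, Set


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def snapshot(sources: Dict[str, str], interfaces: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Record source and interface hashes, as a previous run might have cached them."""
    return {
        name: {"source": digest(sources[name]), "interface": digest(interfaces[name])}
        for name in sources
    }


def modules_to_recheck(
    sources: Dict[str, str],
    interfaces: Dict[str, str],
    imports: Dict[str, List[str]],
    cache: Dict[str, Dict[str, str]],
) -> Set[str]:
    """Hypothetical helper: decide which modules need a fresh type check."""
    # A module whose source text changed must always be rechecked.
    changed_source = {
        name for name, text in sources.items()
        if cache.get(name, {}).get("source") != digest(text)
    }
    # Among those, only some also changed what they expose to importers.
    changed_interface = {
        name for name in changed_source
        if cache.get(name, {}).get("interface") != digest(interfaces[name])
    }
    stale = set(changed_source)
    # Importers are rechecked only when an imported interface changed.
    for module, deps in imports.items():
        if any(dep in changed_interface for dep in deps):
            stale.add(module)
    return stale
```

With this shape, editing only the body of a function in module `a` leaves its importers untouched, while changing the signature of that function also marks the modules that import `a` as stale.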

Incremental checking helped us a lot when annotating large amounts of existing code. The point is that this process usually involves many iterative mypy runs, as annotations are gradually added to the code and gradually refined. The first mypy run was still very slow because it had many dependencies to process. To improve the situation, we implemented remote caching. If mypy detects that the local cache is likely to be out of date, it downloads a current cache snapshot for the whole codebase from a central repository. It then performs an incremental check on top of that snapshot. This gave us another big step forward in mypy performance.

This was a period of fast, organic adoption of type checking at Dropbox. By the end of 2016 we already had about 420,000 lines of Python code with type annotations. Many users were enthusiastic about type checking. More and more development teams at Dropbox were using mypy.

Everything looked good at that point, but we still had a lot to do. We started running periodic internal user surveys to find the project's pain points and to understand what needed to be solved first (a practice the company still follows today). Two tasks turned out to be the most important. First, we needed more type coverage of the code; second, we needed mypy to run faster. It was clear that our work on speeding up mypy and rolling it out across the company's projects was not finished. Fully aware of the importance of these two tasks, we set about solving them.

More performance!

Incremental checking made mypy faster, but the tool was still not fast enough. Many incremental checks took about a minute. The cause was cyclic imports. This probably will not surprise anyone who has worked with a large codebase written in Python. We had sets of hundreds of modules, each of which indirectly imported all the others. If any file in an import cycle was changed, mypy had to process every file in that cycle, and often also any modules that imported modules from the cycle. One of these cycles was the infamous "dependency tangle" that caused a great deal of trouble at Dropbox. At one point this structure contained several hundred modules, and it was imported, directly or indirectly, by many tests and also used in production code.
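Import cycles of this kind correspond to strongly connected components of the import graph. The sketch below uses Tarjan's algorithm to find them; the `import_cycles` function and the toy module names are assumptions for this example and say nothing about how Dropbox or mypy located the tangle in practice.

```python
from typing import Dict, List, Set


def import_cycles(imports: Dict[str, List[str]]) -> List[Set[str]]:
    """Return groups of modules that import each other, directly or indirectly."""
    index: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    cycles: List[Set[str]] = []

    def visit(module: str) -> None:
        index[module] = lowlink[module] = len(index)
        stack.append(module)
        on_stack.add(module)
        for dep in imports.get(module, []):
            if dep not in index:
                visit(dep)
                lowlink[module] = min(lowlink[module], lowlink[dep])
            elif dep in on_stack:
                lowlink[module] = min(lowlink[module], index[dep])
        if lowlink[module] == index[module]:
            # `module` is the root of a component: pop its members off the stack.
            component: Set[str] = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == module:
                    break
            # Keep only real cycles: several modules, or a module importing itself.
            if len(component) > 1 or module in imports.get(module, []):
                cycles.append(component)

    for module in imports:
        if module not in index:
            visit(module)
    return cycles


# Toy import graph: a -> b -> c -> a forms a cycle, d only imports into it.
example = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"], "e": []}
found = import_cycles(example)
```

Because a change to any member of a cycle invalidates every other member at module granularity, the size of the largest component gives a rough lower bound on how much work a single edit can trigger. The recursive form is fine for a small graph; a very deep import chain would need an iterative variant.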

We considered "untangling" the cyclic dependencies, but we did not have the resources to do it. There was too much code that we were not familiar with. So we came up with a different approach: we decided to make mypy fast even in the presence of "dependency tangles". We achieved this with the mypy daemon. The daemon is a server process that does two interesting things. First, it keeps information about the whole codebase in memory. This means that each mypy run does not have to load the cached data for thousands of imported dependencies. Second, it tracks fine-grained dependencies between functions and other constructs. For example, if function foo calls function bar, then foo depends on bar. When a file changes, the daemon first processes just the changed file, in isolation. It then looks for externally visible changes in that file, such as changed function signatures. The daemon uses the fine-grained dependency information to recheck only those functions that actually use the changed function. With this approach, usually only a small number of functions need to be rechecked.
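The fine-grained dependency idea can be illustrated with a small reverse-dependency map. This is only a sketch: `build_reverse_deps`, `functions_to_recheck`, and the dotted function names are hypothetical, and the real daemon tracks many more kinds of constructs than function calls.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_reverse_deps(calls: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Map each function to the set of functions that call it."""
    used_by: Dict[str, Set[str]] = defaultdict(set)
    for caller, callees in calls.items():
        for callee in callees:
            used_by[callee].add(caller)
    return used_by


def functions_to_recheck(changed_signatures: Iterable[str],
                         used_by: Dict[str, Set[str]]) -> Set[str]:
    """Return only the functions that directly use something whose signature changed."""
    stale: Set[str] = set()
    for name in changed_signatures:
        stale |= used_by.get(name, set())
    return stale


# foo calls bar, so foo depends on bar; baz and main are unrelated to bar.
calls = {
    "mod_a.foo": ["mod_b.bar"],
    "mod_a.baz": ["mod_c.qux"],
    "mod_d.main": ["mod_a.foo"],
}
used_by = build_reverse_deps(calls)
stale = functions_to_recheck({"mod_b.bar"}, used_by)
```

Here only `mod_a.foo` is rechecked when the signature of `mod_b.bar` changes. In this sketch `mod_d.main` would need another look only if rechecking `mod_a.foo` in turn changed its own externally visible signature.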

Implementing all of this was not easy, because the original mypy implementation was heavily oriented toward processing one file at a time. We had to deal with many edge cases about what needs to be rechecked when something in the code changes, for example when a class gets a new base class. Once we had done what we wanted, we were able to bring most incremental checks down to a few seconds. That felt like a big victory for us.

Even more performance!

Together with the remote caching I described above, the mypy daemon almost completely solved the problems of the incremental use case, where a programmer repeatedly runs type checks while changing a small number of files. However, performance in the worst case was still far from ideal. A clean mypy run could take more than 15 minutes, which was much longer than we were happy with. The situation got worse every week as programmers kept writing new code and adding annotations to existing code. Our users were still hungry for more performance, and we were happy to oblige.

We decided to return to one of the earliest ideas behind mypy: compiling Python code to C. Experiments with Cython (a system that translates code written in Python into C) did not give us any visible speedup, so we decided to revive the idea of writing our own compiler. Since the mypy codebase (which is written in Python) was already fully type annotated, it seemed worthwhile to try using those annotations to speed things up. I quickly built a prototype to test the idea. It showed a performance improvement of more than 10x on various micro-benchmarks. Our idea was to compile Python modules to CPython C extension modules and to turn type annotations into runtime type checks (normally type annotations are ignored at runtime and used only by type checkers). In effect, we were planning to migrate the mypy implementation from Python to a genuinely statically typed language that looks (and, for the most part, behaves) exactly like Python. (This kind of cross-language migration had become something of a tradition for the mypy project. The original mypy implementation was written in Alore, and after that there was a hybrid of Java and Python syntax.)
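The phrase "turning type annotations into runtime type checks" can be illustrated in pure Python with a decorator. This is explicitly not how mypyc works (it generates C code); the `checked` decorator and the `scale` function are hypothetical and handle only annotations that are plain classes.

```python
import functools
import inspect


def checked(func):
    """Enforce simple class annotations on arguments and on the return value."""
    signature = inspect.signature(func)
    hints = func.__annotations__

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes are checked; other annotation forms are skipped.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        result = func(*args, **kwargs)
        expected = hints.get("return")
        if isinstance(expected, type) and not isinstance(result, expected):
            raise TypeError(
                f"return value must be {expected.__name__}, got {type(result).__name__}"
            )
        return result

    return wrapper


@checked
def scale(value: int, factor: int) -> int:
    return value * factor
```

Calling `scale(2, 3)` behaves as usual, while `scale("2", 3)` raises `TypeError` instead of silently returning a repeated string. Once values are guaranteed to match their declared types, a compiler can rely on those types when choosing faster low-level operations.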

Targeting the CPython extension API was the key to keeping the project manageable. We did not need to implement a virtual machine or any of the libraries mypy needs. In addition, the whole Python ecosystem and all its tools (such as pytest) would remain available to us. This meant we could keep using interpreted Python code during development, which let us keep a very fast cycle of changing code and testing it, instead of waiting for the code to compile. It looked like we were managing to have our cake and eat it too, and we liked that.

The compiler, which we named mypyc (since it uses mypy as the front end for type analysis), turned out to be a very successful project. Overall, we achieved roughly a 4x speedup for clean mypy runs with no caching. Developing the core of mypyc took a small team of Michael Sullivan, Ivan Levkivskyi, Hugh Han, and myself about 4 calendar months. That is much less work than would have been needed to rewrite mypy in, say, C++ or Go. We also had to make far fewer changes to the project than a rewrite in another language would have required. We also hoped to bring mypyc to the point where other Dropbox programmers could use it to compile and speed up their own code.

Reaching this level of performance required some interesting engineering. The compiler can speed up many operations by using fast, low-level C constructs. For example, a call to a compiled function is translated into a C function call, which is much faster than an interpreted function call. Some operations, such as dictionary lookups, still fell back to ordinary CPython C API calls, which were only marginally faster when compiled. We were able to remove the overhead added by interpretation, but for these operations that gave only a small performance gain.

To find the most common "slow" operations, we profiled the code. Armed with this data, we tried either to tweak mypyc so that it generated faster C code for those operations, or to rewrite the relevant Python code using faster operations (and sometimes we simply had no easy fix for a given problem). Rewriting the Python code was often a much simpler solution than making the compiler perform the same transformation automatically. In the long term we wanted to automate many of these transformations, but at the time we were focused on speeding up mypy with minimal effort. In pursuit of that goal, we cut quite a few corners.
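The article does not say which profiler was used. As one possible way to look for hot spots in Python code, the sketch below uses the standard library's `cProfile` and `pstats`; the `lookup_many` and `workload` functions are invented stand-ins for a dictionary-heavy code path.

```python
import cProfile
import io
import pstats


def lookup_many(table, keys):
    # A dictionary-heavy loop, standing in for a typical slow operation.
    total = 0
    for key in keys:
        total += table[key]
    return total


def workload():
    table = {i: i for i in range(1000)}
    return lookup_many(table, list(range(1000)) * 50)


profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# Collect the five entries with the largest cumulative time into a string.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
report_text = report.getvalue()
```

Printing `report_text` shows where the time goes per function, which is the kind of data that can guide a decision between rewriting a hot loop and teaching the compiler a new optimization.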

To be continued…

Dear readers! What was your impression of the mypy project when you first learned about it?

source: www.habr.com