A critique of the Telegram protocol and its organizational approaches. Part 1, technical: the experience of writing a client from scratch - TL, MT

Lately, posts about how good Telegram is, how brilliant and experienced the Durov brothers are at building networked systems, and so on, have been appearing more and more often on Habr. At the same time, very few people have actually immersed themselves in the technical internals: at most they use the fairly simple (and quite different from MTProto) JSON-based Bot API, and usually they just take on faith all the praise and PR that surrounds the messenger. Almost a year and a half ago, Vasily, my colleague at NPO Eshelon (unfortunately, his Habr account was erased along with the draft), began writing his own Telegram client from scratch in Perl, and the author of these lines joined in later. Why Perl, some will immediately ask? Because such projects already exist in other languages. In fact, that is not the point: it could have been any other language for which no ready-made library exists yet, so that the author has to walk the whole path from scratch. Moreover, cryptography is a matter of "trust, but verify". With a security-oriented product, you cannot simply rely on a ready-made library from the vendor and trust it blindly (this, however, is more a topic for the second part). At the moment the library works quite well at the "average" level (it lets you make any API request).

However, there will not be much cryptography or mathematics in this series of posts. Instead, there will be plenty of other technical details and architectural crutches (useful also to those who will not write from scratch but will use a library in any language). So, the main goal was to try to implement a client from scratch based on the official documentation. That is, let us assume that the source code of the official clients is closed (in the second part we will cover in more detail the fact that this really does happen), but, as in the good old days, there is a standard along the lines of an RFC: is it possible to write a client from the specification alone, "without peeking" at the source code, be it of the official clients (Telegram Desktop, mobile) or of the unofficial Telethon?


Documentation... it exists, right? Really?..

Fragments of notes for this article started to be collected last summer. All this time, the documentation on the official website https://core.telegram.org was at the state of Layer 23, i.e. stuck somewhere in 2014 (remember, there were not even channels back then?). Of course, in theory this should have allowed us to implement a client with the functionality of that time, 2014. But even in this state the documentation was, first, incomplete and, second, self-contradictory in places. A little over a month ago, in September 2019, it was discovered by accident that the site had received a big documentation update, for the quite recent Layer 105, with a note that everything now needs to be read again. Indeed, many articles were revised, but many remained unchanged. Therefore, when reading the criticism of the documentation below, keep in mind that some of these points are no longer relevant, while others still very much are. After all, 5 years in the modern world is not just a long time, it is a very long time. Since then (especially if you do not count the geochats that were thrown out and resurrected in the meantime), the number of API methods in the schema has grown from a hundred to more than two hundred and fifty!

Where should a young author start?

It does not matter whether you write from scratch or use, for example, ready-made libraries such as Telethon for Python or Madeline for PHP: in any case you will first need to register your application and obtain the parameters api_id and api_hash (those who have worked with the VKontakte API will understand immediately), by which the server will identify the application. You have to do this for legal reasons, but we will talk more in the second part about why library authors cannot publish theirs. You may be satisfied with the test values, although they are very limited. The point is that you can now register only one application, so do not rush into it headlong.

Now, from a technical point of view, we should have been interested in the fact that after registration we ought to receive notifications from Telegram about updates to the documentation, the protocol, and so on. That is, one could assume that the site with the docs was simply abandoned and that work continued specifically with those who had started making clients, because that is easier. But no, nothing of the kind was observed, and no information ever arrived.

And if you write from scratch, actually using the obtained parameters is still a long way off. Although https://core.telegram.org/ talks about them first thing in Getting Started, in reality you will first have to implement the MTProto protocol. But if you believed the layout according to the OSI model at the end of the page with the general description of the protocol, you did so completely in vain.

In fact, both before and after MTProto, at several levels at once (as foreign networking people who work in the OS kernel say, a layer violation), a big, painful and scary topic will get in the way...

Binary serialization: TL (Type Language) and its schema, and layers, and many other scary words

This topic is, in fact, the key to Telegram's problems. And there will be many scary words if you try to dig into it.

So, the schema. If this word brought to mind, say, JSON Schema, you thought correctly. The goal is the same: some language for describing the possible set of transmitted data. That, in fact, is where the similarity ends. If we try to open some schema from the MTProto protocol page, or from the source tree of the official client, we will see something like:

int ? = Int;
long ? = Long;
double ? = Double;
string ? = String;

vector#1cb5c415 {t:Type} # [ t ] = Vector t;

rpc_error#2144ca19 error_code:int error_message:string = RpcError;

rpc_answer_unknown#5e2ad36e = RpcDropAnswer;
rpc_answer_dropped_running#cd78e586 = RpcDropAnswer;
rpc_answer_dropped#a43ad8b7 msg_id:long seq_no:int bytes:int = RpcDropAnswer;

msg_container#73f1f8dc messages:vector<%Message> = MessageContainer;

---functions---

set_client_DH_params#f5045f1f nonce:int128 server_nonce:int128 encrypted_data:bytes = Set_client_DH_params_answer;

ping#7abe77ec ping_id:long = Pong;
ping_delay_disconnect#f3427b8c ping_id:long disconnect_delay:int = Pong;

invokeAfterMsg#cb9f372d msg_id:long query:!X = X;
invokeAfterMsgs#3dc4b4f0 msg_ids:Vector<long> query:!X = X;

account.updateProfile#78515775 flags:# first_name:flags.0?string last_name:flags.1?string about:flags.2?string = User;
account.sendChangePhoneCode#8e57deb flags:# allow_flashcall:flags.0?true phone_number:string current_number:flags.0?Bool = auth.SentCode;

A person seeing this for the first time will intuitively recognize only part of what is written: well, these are apparently structures (although where is the name, on the left or on the right?), and here are the fields in them, each followed by a type after a colon... probably. Here in angle brackets there are probably templates as in C++ (in fact, not quite). And what do all the other symbols mean, the question marks, exclamation marks, percent signs, hash signs (which evidently mean different things in different places), present in some places and absent in others, the hexadecimal numbers? And most importantly, how do you get from this the correct byte stream (one that the server will not reject)? You will have to read the documentation (yes, there are links to the schema in a JSON version nearby, but that does not make it any clearer).

We open the Binary Data Serialization page and dive into the magical world of mushrooms and discrete mathematics, something like calculus in the 4th year of university. Alphabet, type, value, combinator, functional combinator, normal form, composite type, polymorphic type... and that is only the first page! Next, TL Language awaits you, which, although it already contains an example of a trivial request and response, gives no answer at all for more typical cases. This means you will have to wade through a retelling of mathematics translated from Russian into English on eight more nested pages!

Readers familiar with functional languages and automatic type inference will, of course, see much that is familiar in this description language, even from the example alone, and may say that this is actually not bad in principle. The objections to this are:

  • yes, the goal sounds good, but alas, it has not been achieved
  • education in Russian universities varies even among IT specialties: not everyone was taught the corresponding course
  • finally, as we will see, in practice it is not even required, since only a limited subset of even the TL that was described is actually used

As LeoNerd, who tried to implement a gateway from Telegram to Matrix, said on the #perl channel of the FreeNode IRC network (the quote is translated inexactly, from memory):

It feels like someone was introduced to type theory for the first time, got excited, and started trying to play around with it, without really caring whether it is needed in practice.

See for yourself: if the need for bare types (int, long, etc.) as something elementary raises no questions, since in the end they have to be implemented by hand, then take, for example, the attempt to derive a vector from them. That is, in fact, an array, if we call the resulting things by their proper names.

But first

A brief description of a subset of TL syntax for those who will not read the official documentation

constructor = Type;
myVec ids:Vector<long> = Type;

fixed#abcdef34 id:int = Type2;

fixedVec set:Vector<Type2> = FixedVec;

constructorOne#crc32 field1:int = PolymorType;
constructorTwo#2crc32 field_a:long field_b:Type3 field_c:int = PolymorType;
constructorThree#deadcrc bit_flags_of_what_really_present:# optional_field4:bit_flags_of_what_really_present.1?Type = PolymorType;

an_id#12abcd34 id:int = Type3;
a_null#6789cdef = Type3;

A definition always begins with a constructor, followed optionally (in practice, always) by the symbol # and the CRC32 of the normalized description string of this type. Next comes the description of the fields, if there are any; a type can also be empty. All of this ends with an equals sign and the name of the type to which this constructor (that is, in essence, a subtype) belongs. The type to the right of the equals sign is polymorphic, that is, several concrete types can correspond to it.

If the definition occurs after the line ---functions---, the syntax stays the same but the meaning is different: the constructor becomes the name of an RPC function, the fields become its parameters (that is, it remains exactly the same kind of structure as described below, this is merely the meaning attached to it), and the "polymorphic type" becomes the type of the returned result. True, it will still be polymorphic, simply defined in the ---types--- section, and this constructor "will not count" towards it. Overloading of called functions by their arguments, i.e. several functions with the same name but different signatures, as in C++, is for some reason not provided for in TL.

Why "constructor" and "polymorphic" if this is not OOP? Well, in fact, some will find it easier to think about this precisely in OOP terms: a polymorphic type as an abstract class, and the constructors as its direct descendant classes, final ones at that, in the terminology of a number of languages. In reality, of course, there is only a resemblance here to real overloaded constructor methods in OO programming languages. Since there are only data structures here and no methods (although the description of functions and methods further on is quite capable of creating confusion in your head that they do exist, but that is a different matter), you can think of a constructor as a value from which a type is constructed when reading the byte stream.

How does this happen? The deserializer, which always reads 4 bytes, sees the value 0xcrc32 and understands that field1 of type int comes next, i.e. it reads exactly 4 bytes, and with that the enclosing field of type PolymorType has been read. If it sees 0x2crc32, it understands that two fields follow, first a long, which means we read 8 bytes. And then comes a complex type again, which is deserialized in the same way. For example, Type3 could be declared in the schema as just two constructors, so next we must encounter either 0x12abcd34, after which 4 more bytes of int have to be read, or 0x6789cdef, after which there is nothing. Anything else, and an exception must be thrown. In any case, after this we go back to reading the 4 bytes of the int field field_c in constructorTwo, and with that we finish reading our PolymorType.
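The dispatch described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the two constructor IDs are the made-up ones from the toy schema above, the helper names (`read_u32`, `read_int`, `read_type3`) are hypothetical, and little-endian word order is assumed.

```python
import io
import struct

# Hypothetical constructor IDs taken from the toy schema above.
AN_ID = 0x12ABCD34   # an_id#12abcd34 id:int = Type3;
A_NULL = 0x6789CDEF  # a_null#6789cdef = Type3;


def read_u32(stream):
    """Read one little-endian 32-bit word (assumed byte order)."""
    data = stream.read(4)
    if len(data) != 4:
        raise EOFError("stream ended in the middle of a 32-bit word")
    return struct.unpack("<I", data)[0]


def read_int(stream):
    """Read a bare signed 32-bit int."""
    data = stream.read(4)
    if len(data) != 4:
        raise EOFError("stream ended in the middle of an int")
    return struct.unpack("<i", data)[0]


def read_type3(stream):
    """Deserialize a boxed Type3 value by dispatching on its constructor ID."""
    constructor = read_u32(stream)
    if constructor == AN_ID:
        return {"_": "an_id", "id": read_int(stream)}
    if constructor == A_NULL:
        return {"_": "a_null"}  # an empty constructor: nothing follows the ID
    raise ValueError(f"unknown constructor 0x{constructor:08x} for Type3")
```

Reading a field of a polymorphic type then amounts to calling the reader for that type at the right moment and returning to the enclosing constructor afterwards.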

Finally, if you come across 0xdeadcrc for constructorThree, everything becomes more complicated. Our first field is bit_flags_of_what_really_present of type #. In fact, this is just an alias for the type nat, meaning "natural number". That is, in essence, an unsigned int. This is, by the way, the only case where unsigned numbers occur in real schemas. So, next comes a construct with a question mark, meaning that this field will be present on the wire only if the corresponding bit is set in the field being referred to (roughly like a ternary operator). So, suppose this bit was set, which means that next we need to read a field of type Type, which in our example has 2 constructors. One is empty (it consists only of the identifier), the other has a field ids of type Vector<long>.
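A minimal sketch of the conditional-field rule, assuming the constructor ID of constructorThree has already been consumed and that the caller supplies a reader for the nested Type. The function names are hypothetical; only the bit test mirrors the schema line.

```python
import io
import struct


def read_u32(stream):
    """Read one little-endian 32-bit word (assumed byte order)."""
    return struct.unpack("<I", stream.read(4))[0]


def read_constructor_three_body(stream, read_type):
    """Read the fields of constructorThree after its ID has been consumed.

    read_type is a caller-supplied function that deserializes a boxed Type.
    """
    flags = read_u32(stream)  # bit_flags_of_what_really_present:#
    result = {"flags": flags}
    if flags & (1 << 1):      # optional_field4:bit_flags_of_what_really_present.1?Type
        result["optional_field4"] = read_type(stream)
    return result
```

If bit 1 is clear, the field simply does not exist in the byte stream, so the reader must not consume anything for it.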

You might think that both templates and generics are like those in C++ or Java. But no. Almost. This is the only case where angle brackets are used in real schemas, and it is used ONLY for Vector. In the byte stream this will be the 4 bytes of CRC32 for the Vector type itself, always the same, then 4 bytes with the number of array elements, and then the elements themselves.
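The wire layout just described can be sketched as follows for Vector<long>. The constant is the vector ID from the schema excerpt above; the function names and the little-endian packing are assumptions of this sketch.

```python
import io
import struct

VECTOR_ID = 0x1CB5C415  # vector#1cb5c415, as given in the schema excerpt above


def write_vector_of_long(values):
    """Serialize a list of 64-bit integers as Vector<long>."""
    out = struct.pack("<Ii", VECTOR_ID, len(values))  # constructor ID, then element count
    for value in values:
        out += struct.pack("<q", value)               # each long takes 8 bytes
    return out


def read_vector_of_long(stream):
    """Deserialize Vector<long>, checking the fixed constructor ID first."""
    (constructor,) = struct.unpack("<I", stream.read(4))
    if constructor != VECTOR_ID:
        raise ValueError(f"expected vector, got 0x{constructor:08x}")
    (count,) = struct.unpack("<i", stream.read(4))
    return [struct.unpack("<q", stream.read(8))[0] for _ in range(count)]
```

Note that nothing in the stream says the elements are longs: the reader has to know that from the schema.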

Add to this the fact that serialization always happens in words of 4 bytes and all types are multiples of that. The built-in types bytes and string are also described, with manual serialization of the length and this alignment to 4. Well, it seems to sound normal and even relatively efficient? Although TL claims to be an efficient binary serialization, to hell with it, with the expansion of just about anything, even Boolean values and single-character strings, to 4 bytes: JSON will still be much fatter, won't it? Look, even unnecessary fields can be skipped with bit flags, everything is quite nice, and even extensible for the future, so just keep adding new optional fields to the constructor later?..
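To make the "aligned to 4" point concrete, here is a sketch of string/bytes serialization. The exact length encoding (one length byte for short values, a 0xFE marker plus a 3-byte length for long ones) is stated here from memory of the official serialization page and should be treated as an assumption; the function name is hypothetical.

```python
def serialize_bytes(data: bytes) -> bytes:
    """Serialize a TL bytes/string value, padded to a multiple of 4 bytes."""
    if len(data) <= 253:
        header = bytes([len(data)])                          # short form: 1 length byte
    else:
        header = b"\xfe" + len(data).to_bytes(3, "little")   # long form: marker + 3-byte length
    body = header + data
    padding = -len(body) % 4                                 # zero bytes up to the next 4-byte boundary
    return body + b"\x00" * padding
```

With this layout a single-character string occupies a full 4-byte word, which is exactly the expansion the paragraph above grumbles about.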

But no, not if you read the full documentation rather than my brief description, and think about the implementation. First, the CRC32 of a constructor is computed from the normalized line of the textual schema description (extra whitespace removed, etc.), so if a new field is added, the type description line changes, and therefore so do its CRC32 and, consequently, the serialization. And what would an old client do if it received a field with new flags set, and had no idea what to do with them next?..
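A rough sketch of how such an ID could be computed, to show why adding a field changes it. The normalization here is deliberately simplified and is an assumption of this example (drop the explicit `#id`, the braces, the trailing `;` and extra whitespace); real generators apply more rules, and `normalize` and `constructor_id` are hypothetical names.

```python
import re
import zlib


def normalize(definition: str) -> str:
    """Simplified normalization of one schema line (not the complete official rules)."""
    text = definition.strip().rstrip(";")
    text = re.sub(r"^([\w.]+)#[0-9a-fA-F]+", r"\1", text)  # drop the explicit constructor ID
    text = text.replace("{", "").replace("}", "")           # drop braces around optional params
    return " ".join(text.split())                           # collapse runs of whitespace


def constructor_id(definition: str) -> int:
    """CRC32 of the normalized line, as an unsigned 32-bit number."""
    return zlib.crc32(normalize(definition).encode("ascii")) & 0xFFFFFFFF


old = "user id:int = User;"
new = "user id:int name:string = User;"  # the same constructor with one extra field
print(hex(constructor_id(old)), hex(constructor_id(new)))
```

The two printed IDs come from different input strings, so an old client that only knows the first one has no way to recognize the extended constructor.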

Second, let us recall CRC32, which is used here essentially as a hash function to determine unambiguously which type is being (de)serialized. Here we run into the problem of collisions, and no, the probability is not one in 2^32 but much higher. Who remembers that CRC32 was designed to detect (and correct) errors in a communication channel, and accordingly improves those properties at the expense of others? For example, it does not care about the rearrangement of bytes: if you compute CRC32 of two strings, where in the second one you swap the first 4 bytes with the next 4 bytes, it will be the same. When our input consists of text strings in the Latin alphabet (plus a little punctuation), and these names are not particularly random, the probability of such a rearrangement increases greatly.

By the way, has anyone checked that it really is CRC32 in there? One of the early source codes (even before Valtman) had a hash function that multiplied each character by the number 239, so beloved by these people, ha ha!

Finally, fine, we have realized that constructors with a field of type Vector<int> and Vector<PolymorType> will have different CRC32s. But what about the representation on the wire? And from the point of view of theory, does this become part of the type? Say we pass an array of ten thousand numbers: with Vector<int> everything is clear, the length and another 40000 bytes. But what if it is Vector<Type2>, which consists of only one int field and is the only constructor in its type? Do we need to repeat 0xabcdef34 followed by 4 bytes of int 10000 times, or is the language able to DERIVE this for us from the fixedVec constructor and, instead of 80000 bytes, again transfer only 40000?
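The arithmetic can be checked with a short sketch that builds both encodings. It assumes the boxed variant repeats the toy constructor ID `fixed#abcdef34` before every element and that the usual vector header (ID plus count) precedes the elements; the function names are hypothetical.

```python
import struct

VECTOR_ID = 0x1CB5C415  # vector#1cb5c415 from the schema excerpt
FIXED_ID = 0xABCDEF34   # fixed#abcdef34 id:int = Type2; (toy schema above)


def vector_of_bare_int(values):
    """Vector<int>: header, then 4 bytes per element."""
    header = struct.pack("<Ii", VECTOR_ID, len(values))
    return header + b"".join(struct.pack("<i", v) for v in values)


def vector_of_boxed_type2(values):
    """Vector<Type2> with every element boxed: the constructor ID is repeated each time."""
    header = struct.pack("<Ii", VECTOR_ID, len(values))
    return header + b"".join(struct.pack("<Ii", FIXED_ID, v) for v in values)


numbers = list(range(10000))
print(len(vector_of_bare_int(numbers)))     # 8-byte header + 40000 bytes of payload
print(len(vector_of_boxed_type2(numbers)))  # 8-byte header + 80000 bytes of payload
```

Half of the second payload is the same 4-byte constant repeated ten thousand times.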

This is not an idle theoretical question at all. Imagine you receive a list of group users, each of whom has an id, a first name and a last name: the difference in the amount of data transferred over a mobile connection can be significant. It is precisely the efficiency of Telegram's serialization that is advertised to us.

So...

Vector, which they never managed to derive

If you try to wade through the pages describing combinators and related matters, you will see that the vector (and even the matrix) is formally being derived through tuples over several sheets. But in the end they give up, the final step is skipped, and a definition of the vector is simply given, one that is not even tied to a type. What is the matter here? In programming languages, especially functional ones, it is quite typical to describe a structure recursively: the compiler with its lazy evaluation will understand everything and do it by itself. In a data serialization language, however, what is needed is EFFICIENCY: it is enough simply to describe a list, i.e. a structure of two elements, where the first is a data element and the second is the same structure itself or an empty place for the tail (a pair (cons) in Lisp). But this will obviously require spending an extra 4 bytes for each element (the CRC32, in the case of TL) to describe its type. An array of fixed size can also be described easily, but in the case of an array whose length is not known in advance, we are out of luck.

Therefore, since TL does not allow a vector to be derived, it had to be bolted on from the side. In the end the documentation says:

Serialization always uses the same constructor "vector" (const 0x1cb5c415 = crc32("vector t:Type # [ t ] = Vector t")) that is not dependent on the specific value of the variable of type t.

The value of the optional parameter t is not involved in the serialization since it is derived from the result type (always known prior to deserialization).

Look closely: vector {t:Type} # [ t ] = Vector t. But nowhere in this definition itself does it say that the first number must be equal to the length of the vector! And it does not follow from anywhere. It is a given that has to be kept in mind and implemented by hand. Elsewhere, the documentation even honestly mentions that the type is not a real one:

The Vector t polymorphic pseudotype is a "type" whose value is a sequence of values of any type t, either boxed or bare.

... but does not dwell on it. When you, tired of wading through the stretched-on mathematics (perhaps even known to you from a university course), decide to give up and actually look at how to work with it in practice, the impression that remains in your head is that there is Serious Mathematics at the core here, clearly invented by Cool People (two mathematicians, ACM prize winners) and not just anybody. The goal, to show off, has been achieved.

By the way, about the number. Recall that # is a synonym for nat, a natural number:

There are type expressions (type-expr) and numeric expressions (nat-expr). However, they are defined the same way.

type-expr ::= expr
nat-expr ::= expr

but in the grammar they are described identically, i.e. this difference again has to be remembered and built into the implementation by hand.

And yes, template types (vector<int>, vector<User>) share a common identifier (#1cb5c415), i.e. if you know that the call is declared as

users.getUsers#d91a548 id:Vector<InputUser> = Vector<User>;

then you are waiting not just for a vector, but for a vector of users. More precisely, you should be waiting for one: in real code, every element, if it is not a bare type, will have a constructor, and a proper implementation ought to check whether every element of this vector that was sent to us really is of that type. And what if this were some PHP, in which an array can contain different types in different elements?

At this point you start to wonder: is such a TL necessary at all? Maybe Telegram could have used a humane serializer, the same protobuf that already existed back then? That was the theory, now let us look at the practice.

Existing TL implementations in code

TL was born in the depths of VKontakte even before the well-known events with the sale of Durov's stake and (surely) before the development of Telegram began. And in the open-sourced code of the first implementation you can find many funny crutches. The language itself was implemented there more fully than it is now in Telegram. For example, hashes are not used in the schema at all (meaning the built-in pseudotype (like vector) with deviant behavior). Or

Templates are not used now. Instead, the same universal constructors (for example, vector {t:Type} [t] = Vector t) are used w

but let us consider it anyway, for the sake of completeness, in order to trace, so to speak, the evolution of the Giant of Thought.

#define ZHUKOV_BYTES_HACK

#ifdef ZHUKOV_BYTES_HACK

/* dirty hack for Zhukov request */

Or this beauty:

    static const char *reserved_words_polymorhic[] = {

      "alpha", "beta", "gamma", "delta", "epsilon", "zeta", "eta", "theta", NULL

      };

This fragment is about templates such as:

intHash {alpha:Type} vector<coupleInt<alpha>> = IntHash<alpha>;

This is the definition of a hashmap template type as a vector of int - Type pairs. In C++ it would look something like this:

    template <typename T> class IntHash {
      std::vector<std::pair<int, T>> _map;
    };

So, alpha is a keyword! But only in C++ can you write T, whereas here you have to write alpha, beta... And no more than 8 parameters: imagination ran out at theta. So it seems that once upon a time in St. Petersburg there were conversations something like this:

-- We need to make templates in TL
-- Damn... Well, let the parameters be called alpha, beta,... What other letters are there... Oh, theta!
-- Grammar? We'll write it later

-- Look what syntax I came up with for templates and the vector!
-- Have you lost it, how are we going to parse this?
-- Don't sweat it, there's only one of it in the schema, hardcode it and we're fine

But that was about the first published implementation of TL "in general". Let us move on to the implementations in the Telegram clients themselves.

Over to Vasily:

Vasily, [09.10.18 17:07] What burns my ass most is that they piled up a heap of abstractions, then gave up on them and propped the code generator up with crutches.
As a result, first the docs give you pilot.jpg
Then the code gives you dzhekichan.webp

Of course, from people familiar with algorithms and mathematics we can expect that they have read Aho and Ullman and are familiar with the tools for writing compilers for their DSLs that have become the de facto industry standard over the decades, right?..

The author of telegram-cli is Vitaly Valtman who, as one can gather from the occurrence of the TLO format outside its (the cli's) boundaries, is a member of the team. The library for parsing TL has now been split out separately, so what impression does its TL parser give?..

16.12 04:18 Vasily: I think someone failed to master lex+yacc
16.12 04:18 Vasily: I can't explain it any other way
16.12 04:18 Vasily: well, or they were paid per line of code at VK
16.12 04:19 Vasily: 3k+ lines etc.<censored> instead of a parser

Maybe that is an exception? Let us see how the official client, Telegram Desktop, does THIS:

    nametype = re.match(r'([a-zA-Z.0-9_]+)(#[0-9a-f]+)?([^=]*)=\s*([a-zA-Z.<>0-9_]+);', line);
    if (not nametype):
      if (not re.match(r'vector#1cb5c415 \{t:Type\} # \[ t \] = Vector t;', line)):
         print('Bad line found: ' + line);

1100+ lines of Python, a couple of regular expressions plus special cases such as the vector, which is, of course, declared in the schema as it should be according to TL syntax, but they did not bother to rely on that syntax to parse it... The question arises: why build this whole multi-layered marvel if nobody is going to parse it according to the documentation anyway?!

By the way... Remember we talked about checking the CRC32? Well, in the Telegram Desktop code generator there is a list of exceptions for those types whose computed CRC32 does not match the one specified in the schema!

Vasily, [18.12 22:49] and here one ought to wonder whether such a TL is needed at all
if I wanted to mess with alternative implementations, I would start inserting line breaks, and half of the parsers would break on multi-line definitions
tdesktop too, by the way

Remember the point about single-line definitions, we will come back to it a little later.

Okay, telegram-cli is unofficial, Telegram Desktop is official, but what about the others? Who knows?.. In the Android client code no schema parser was found at all (which raises questions about its being open source, but that is for the second part), but several other funny pieces of code were found, and more on them in the subsection below.

What other questions does serialization raise in practice? For example, they certainly made a mess of things with bit fields and conditional fields:

Vasily: flags.0? true
means that the field is present and equals true if the flag is set

Vasily: flags.1? int
means that the field is present and needs to be deserialized

Vasily: Ass, don't burn, what are you doing!
Vasily: Somewhere in the docs there is a mention that true is a bare type of zero length, but it is impossible to piece anything together from their docs
Vasily: The open implementations don't have this either, but there are plenty of crutches and props
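The difference Vasily is pointing at can be shown with a sketch over a made-up definition, `example flags:# silent:flags.0?true ttl:flags.1?int = Example`. Both the definition and the function name are hypothetical and exist only for this illustration; the reading of `true` as a zero-length value follows the remark in the chat above.

```python
import io
import struct


def read_example_body(stream):
    """Read the fields of the hypothetical 'example' constructor (ID already consumed)."""
    (flags,) = struct.unpack("<I", stream.read(4))
    silent = bool(flags & (1 << 0))  # flags.0?true: the bit itself is the value, no bytes are read
    ttl = None
    if flags & (1 << 1):             # flags.1?int: 4 more bytes follow on the wire
        (ttl,) = struct.unpack("<i", stream.read(4))
    return {"silent": silent, "ttl": ttl}
```

So the same `flags.N?` syntax means "a value follows" for int and "nothing follows" for true, which is exactly what a reader cannot guess from the notation alone.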

What about Telethon? Looking ahead to the topic of MTProto, here is an example: the documentation contains pieces like these, but the % sign is described in it only as "corresponding to the given bare type", i.e. in the examples below there is either an error or something undocumented:

Vasily, [22.06.18 18:38] In one place:

msg_container#73f1f8dc messages:vector message = MessageContainer;

In another:

msg_container#73f1f8dc messages:vector<%Message> = MessageContainer;

And these are two big differences; in reality some kind of bare vector arrives

I have not seen a bare definition of vector and have not come across one

In telethon the parsing is written by hand

In its schema the definition of msg_container is commented out

Again, the question about % remains. It is not described.

Vadim Goncharov, [22.06.18 19:22] and in tdesktop?

Vasily, [22.06.18 19:23] But their TL parser built on regular expressions most likely won't swallow this either

// parsed manually

TL is a beautiful abstraction, nobody implements it completely

And % is not in their version of the schema

But here the documentation contradicts itself, so idk

It was found in the grammar, they could simply have forgotten to describe the semantics

You have seen the docs on TL, you can't figure them out without half a liter

"Well, let's say," another reader will say, "you are criticizing something, so show how it should be done."

Vasily answers: "As for the parser, I like things like

    args: /* empty */ { $$ = NULL; }
        | args arg { $$ = g_list_append( $1, $2 ); }
        ;

    arg: LC_ID ':' type-term { $$ = tl_arg_new( $1, $3 ); }
            | LC_ID ':' condition '?' type-term { $$ = tl_arg_new_cond( $1, $5, $3 ); free($3); }
            | UC_ID ':' type-term { $$ = tl_arg_new( $1, $3 ); }
            | type-term { $$ = tl_arg_new( "", $1 ); }
            | '[' LC_ID ']' { $$ = tl_arg_new_mult( "", tl_type_new( $2, TYPE_MOD_NONE ) ); }
            ;

somehow more than

struct tree *parse_args4 (void) {
  PARSE_INIT (type_args4);
  struct parse so = save_parse ();
  PARSE_TRY (parse_optional_arg_def);
  if (S) {
    tree_add_child (T, S);
  } else {
    load_parse (so);
  }
  if (LEX_CHAR ('!')) {
    PARSE_ADD (type_exclam);
    EXPECT ("!");
  }
  PARSE_TRY_PES (parse_type_term);
  PARSE_OK;
}

or

        # Regex to match the whole line
        match = re.match(r'''
            ^                  # We want to match from the beginning to the end
            ([\w.]+)           # The .tl object can contain alpha_name or namespace.alpha_name
            (?:
                \#             # After the name, comes the ID of the object
                ([0-9a-f]+)    # The constructor ID is in hexadecimal form
            )?                 # If no constructor ID was given, CRC32 the 'tl' to determine it

            (?:\s              # After that, we want to match its arguments (name:type)
                {?             # For handling the start of the '{X:Type}' case
                \w+            # The argument name will always be an alpha-only name
                :              # Then comes the separator between name:type
                [\w\d<>#.?!]+  # The type is slightly more complex, since it's alphanumeric and it can
                               # also have Vector<type>, flags:# and flags.0?default, plus :!X as type
                }?             # For handling the end of the '{X:Type}' case
            )*                 # Match 0 or more arguments
            \s                 # Leave a space between the arguments and the equal
            =
            \s                 # Leave another space between the equal and the result
            ([\w\d<>#.?]+)     # The result can again be as complex as any argument type
            ;$                 # Finally, the line should always end with ;
            ''', tl, re.IGNORECASE | re.VERBOSE)

this is the entire lexer:

    ---functions---         return FUNCTIONS;
    ---types---             return TYPES;
    [a-z][a-zA-Z0-9_]*      yylval.string = strdup(yytext); return LC_ID;
    [A-Z][a-zA-Z0-9_]*      yylval.string = strdup(yytext); return UC_ID;
    [0-9]+                  yylval.number = atoi(yytext); return NUM;
    #[0-9a-fA-F]{1,8}       yylval.number = strtol(yytext+1, NULL, 16); return ID_HASH;

    \n                     /* skip new line */
    [ \t]+                 /* skip spaces */
    \/\/.*$                /* skip comments */
    \/\*.*\*\/             /* skip comments */
    .                       return (int)yytext[0];

That is, "simpler" is putting it mildly.

All in all, the parser and code generator for the subset of TL that is actually used fit into roughly 100 lines of grammar and ~300 lines of generator (counting all the print statements that emit the generated code), including niceties such as type information for introspection in every class. Each polymorphic type becomes an empty abstract base class, and the constructors inherit from it and carry the serialization and deserialization methods.
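To make the shape of that generated code concrete, here is a minimal Python sketch (the author's generator targets Perl, so this is only an illustration). The class names, the `FIELDS` table and the `CONSTRUCTORS` registry are assumptions made up for this example, only the `int` field type is handled, and the two constructor IDs are quoted from an older public schema, so treat them as illustrative.

```python
import struct


class TLObject:
    """Hypothetical root class; all names here are illustrative."""
    CONSTRUCTOR_ID = 0
    FIELDS = ()  # (name, type) pairs kept for introspection

    def __init__(self, **kwargs):
        for name, _type in self.FIELDS:
            setattr(self, name, kwargs[name])

    def serialize(self):
        # Constructor ID first, then every field; only "int" is handled here.
        out = struct.pack('<I', self.CONSTRUCTOR_ID)
        for name, _type in self.FIELDS:
            out += struct.pack('<i', getattr(self, name))
        return out


class Peer(TLObject):
    """Polymorphic type: an otherwise empty base that dispatches on the ID."""

    @staticmethod
    def deserialize(data):
        (ctor_id,) = struct.unpack_from('<I', data, 0)
        cls = CONSTRUCTORS[ctor_id]  # a KeyError here means "unknown object type"
        offset, values = 4, {}
        for name, _type in cls.FIELDS:
            (values[name],) = struct.unpack_from('<i', data, offset)
            offset += 4
        return cls(**values)


class PeerUser(Peer):
    CONSTRUCTOR_ID = 0x9db1bc6d  # assumed: peerUser in an older schema
    FIELDS = (('user_id', 'int'),)


class PeerChat(Peer):
    CONSTRUCTOR_ID = 0xbad0e5bb  # assumed: peerChat in an older schema
    FIELDS = (('chat_id', 'int'),)


CONSTRUCTORS = {c.CONSTRUCTOR_ID: c for c in (PeerUser, PeerChat)}
```

Calling `Peer.deserialize(PeerChat(chat_id=42).serialize())` gives back a `PeerChat` instance, which is all the "polymorphism" amounts to at this level.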

The lack of types in a type language

Strong typing is a good thing, right? No, this is not a holy war (although I prefer dynamic languages), it is a postulate within the TL framework. Based on it, the language should provide us with all sorts of checks. Well, okay, maybe not the language itself but the implementation, yet it should at least describe them. So what kind of capabilities do we want?

First of all, constraints. Here is what we see in the documentation on uploading files:

The file's binary content is then split into parts. All parts must have the same size ( part_size ) and the following conditions must be met:

  • part_size % 1024 = 0 (divisible by 1KB)
  • 524288 % part_size = 0 (512KB must be evenly divisible by part_size)

The last part does not have to satisfy these conditions, provided its size is less than part_size.

Each part should have a sequence number, file_part, with a value ranging from 0 to 2,999.

After the file has been partitioned you need to choose a method for saving it on the server. Use upload.saveBigFilePart in case the full size of the file is more than 10 MB and upload.saveFilePart for smaller files.
One of the following data input errors may be returned:

  • FILE_PARTS_INVALID - Invalid number of parts. The value is not between 1..3000

Is any of this in the schema? Can it be expressed somehow by means of TL? No. But excuse me, even grandpa's Turbo Pascal could describe types defined by ranges. And it knew one more thing, now better known as enum: a type consisting of an enumeration of a fixed (small) number of values. In languages like C these are numeric, and note that so far we have only talked about numeric types. But there are also arrays, strings... for example, it would be nice to state that this particular string may contain only a phone number, right?

None of this exists in TL. But it does exist, for example, in JSON Schema. And while someone might still object about the divisibility of 512 KB, saying that it has to be checked in code anyway, surely it would have been possible to make sure the client simply could not send a number outside the range 1..3000 (so that the corresponding error could never arise), right?..
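Since the schema cannot express these constraints, every client ends up re-implementing them by hand. A minimal sketch of such hand-written checks, using only the numbers from the quoted documentation (the function names are hypothetical):

```python
def is_valid_part_size(part_size):
    """Constraints quoted from the upload docs; applies to all but the last part."""
    return (
        part_size > 0
        and part_size % 1024 == 0      # divisible by 1 KB
        and 524288 % part_size == 0    # 512 KB must be evenly divisible by part_size
    )


def is_valid_file_part(file_part):
    """file_part is a sequence number ranging from 0 to 2,999."""
    return 0 <= file_part <= 2999


def is_valid_parts_count(total_parts):
    """Otherwise the server answers FILE_PARTS_INVALID (not between 1..3000)."""
    return 1 <= total_parts <= 3000
```

A range type or a JSON Schema style `minimum`/`maximum` pair would have covered the last two functions declaratively; only the divisibility rule really needs code.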

By the way, about errors and return values. Even those who have worked with TL stop noticing it: it did not dawn on us immediately that every function in TL can actually return not only the declared return type but also an error. Yet this cannot be deduced in any way from TL itself. Of course, it is obvious anyway and nobody needs it in practice (although RPC can in fact be done in different ways, we will come back to this later), but what about the Purity of the concepts of the Mathematics of Abstract Types from the heavenly realm?.. If you take up the yoke, then live up to it.

And finally, what about readability? Well, in general one would like to have a description right in the schema (in JSON Schema, again, it is there), but if that is already too much to ask, then what about the practical side, at least something as trivial as looking at diffs during updates? See for yourself on real examples:

-channelFull#76af5481 flags:# can_view_participants:flags.3?true can_set_username:flags.6?true can_set_stickers:flags.7?true hidden_prehistory:flags.10?true id:int about:string participants_count:flags.0?int admins_count:flags.1?int kicked_count:flags.2?int banned_count:flags.2?int read_inbox_max_id:int read_outbox_max_id:int unread_count:int chat_photo:Photo notify_settings:PeerNotifySettings exported_invite:ExportedChatInvite bot_info:Vector<BotInfo> migrated_from_chat_id:flags.4?int migrated_from_max_id:flags.4?int pinned_msg_id:flags.5?int stickerset:flags.8?StickerSet available_min_id:flags.9?int = ChatFull;
+channelFull#1c87a71a flags:# can_view_participants:flags.3?true can_set_username:flags.6?true can_set_stickers:flags.7?true hidden_prehistory:flags.10?true can_view_stats:flags.12?true id:int about:string participants_count:flags.0?int admins_count:flags.1?int kicked_count:flags.2?int banned_count:flags.2?int online_count:flags.13?int read_inbox_max_id:int read_outbox_max_id:int unread_count:int chat_photo:Photo notify_settings:PeerNotifySettings exported_invite:ExportedChatInvite bot_info:Vector<BotInfo> migrated_from_chat_id:flags.4?int migrated_from_max_id:flags.4?int pinned_msg_id:flags.5?int stickerset:flags.8?StickerSet available_min_id:flags.9?int = ChatFull;

or

-message#44f9b43d flags:# out:flags.1?true mentioned:flags.4?true media_unread:flags.5?true silent:flags.13?true post:flags.14?true id:int from_id:flags.8?int to_id:Peer fwd_from:flags.2?MessageFwdHeader via_bot_id:flags.11?int reply_to_msg_id:flags.3?int date:int message:string media:flags.9?MessageMedia reply_markup:flags.6?ReplyMarkup entities:flags.7?Vector<MessageEntity> views:flags.10?int edit_date:flags.15?int post_author:flags.16?string grouped_id:flags.17?long = Message;
+message#44f9b43d flags:# out:flags.1?true mentioned:flags.4?true media_unread:flags.5?true silent:flags.13?true post:flags.14?true from_scheduled:flags.18?true id:int from_id:flags.8?int to_id:Peer fwd_from:flags.2?MessageFwdHeader via_bot_id:flags.11?int reply_to_msg_id:flags.3?int date:int message:string media:flags.9?MessageMedia reply_markup:flags.6?ReplyMarkup entities:flags.7?Vector<MessageEntity> views:flags.10?int edit_date:flags.15?int post_author:flags.16?string grouped_id:flags.17?long = Message;

Tastes differ, but GitHub, for example, refuses to highlight changes inside such long lines. It is a game of "find 10 differences", and what the brain sees right away is that the beginnings and ends in both examples are the same, so you have to read tediously somewhere in the middle... In my opinion, this is not just a matter of theory: purely visually it looks dirty and sloppy.
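One way to make such changes readable is to compare the definitions field by field rather than as whole lines. The following is a rough sketch, assuming fields are separated by single spaces as in the examples above; `tl_fields` and `tl_diff` are hypothetical helper names.

```python
def tl_fields(definition):
    """Split 'name#id a:T b:U = Result;' into whitespace-separated tokens."""
    return definition.strip().rstrip(';').split()


def tl_diff(old, new):
    """Return (removed, added) tokens between two versions of a definition."""
    old_fields, new_fields = tl_fields(old), tl_fields(new)
    removed = [f for f in old_fields if f not in new_fields]
    added = [f for f in new_fields if f not in old_fields]
    return removed, added


if __name__ == '__main__':
    # Shortened versions of the message example, for illustration only.
    old = 'message#44f9b43d flags:# out:flags.1?true id:int = Message;'
    new = ('message#44f9b43d flags:# out:flags.1?true '
           'from_scheduled:flags.18?true id:int = Message;')
    print(tl_diff(old, new))
```

This ignores field reordering, which is fine for spotting additions but would not catch a change that alters the wire layout without adding or removing a token.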

By the way, about the purity of theory. Why do we need bit fields? Doesn't it seem that they smell bad from the point of view of type theory? The explanation can be seen in earlier versions of the schema. At first, yes, that is how it was: a new type was created for every sneeze. These rudiments still exist in this form, for example:

storage.fileUnknown#aa963b05 = storage.FileType;
storage.filePartial#40bc6f52 = storage.FileType;
storage.fileJpeg#7efe0e = storage.FileType;
storage.fileGif#cae1aadf = storage.FileType;
storage.filePng#a4f63c0 = storage.FileType;
storage.filePdf#ae1e508d = storage.FileType;
storage.fileMp3#528a0677 = storage.FileType;
storage.fileMov#4b09ebbc = storage.FileType;
storage.fileMp4#b3cea0e4 = storage.FileType;
storage.fileWebp#1081464c = storage.FileType;

But now imagine: if you have 5 optional fields in your structure, you will need 32 types for all the possible variants. A combinatorial explosion. And so the crystal purity of TL theory once again shattered against the cast-iron backside of the harsh reality of serialization.
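A small sketch of what the `flags:#` field buys instead: a single integer records which optional fields are present, so n optional fields need one constructor rather than 2^n. The bit numbers below are copied from the `channelFull` definition quoted earlier; the helper names are hypothetical.

```python
# Bit indexes taken from 'name:flags.N?type' in the channelFull definition.
OPTIONAL_BITS = {
    'participants_count': 0,
    'admins_count': 1,
    'kicked_count': 2,
    'migrated_from_chat_id': 4,
    'pinned_msg_id': 5,
}


def make_flags(values, optional_bits=OPTIONAL_BITS):
    """Set bit N for every optional field that actually has a value."""
    flags = 0
    for name, bit in optional_bits.items():
        if values.get(name) is not None:
            flags |= 1 << bit
    return flags


def present_fields(flags, optional_bits=OPTIONAL_BITS):
    """List the optional fields a deserializer should expect to read."""
    return [name for name, bit in optional_bits.items() if flags & (1 << bit)]


# Number of distinct constructors that would be needed without a flags field.
VARIANTS_WITHOUT_FLAGS = 2 ** len(OPTIONAL_BITS)
```

With five optional fields, `VARIANTS_WITHOUT_FLAGS` is 32, which is exactly the explosion described above.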

Besides, in some places these guys violate their own typing. For example, in MTProto (next chapter) a response can be compressed with Gzip, which is all fine, except that layers and the schema are violated. Just like that, it is not the RpcResult itself that gets compressed but its contents. Why do that?.. I had to bolt on a crutch so that decompression would work anywhere.

Or another example: we once discovered a bug where InputPeerUser was being sent instead of InputUser. Or vice versa. But it worked! That is, the server did not care about the type. How can that be? The answer may be suggested by code fragments from telegram-cli:

  if (tgl_get_peer_type (E->id) != TGL_PEER_CHANNEL || (C && (C->flags & TGLCHF_MEGAGROUP))) {
    out_int (CODE_messages_get_history);
    out_peer_id (TLS, E->id);
  } else {    
    out_int (CODE_channels_get_important_history);

    out_int (CODE_input_channel);
    out_int (tgl_get_peer_id (E->id));
    out_long (E->id.access_hash);
  }
  out_int (E->max_id);
  out_int (E->offset);
  out_int (E->limit);
  out_int (0);
  out_int (0);

In other words, here serialization is done BY HAND, not by generated code! Maybe the server is implemented in a similar way?.. In principle, this will do if it is done once, but how do you maintain it later during updates? Isn't that exactly why the schema was invented? And here we move on to the next question.

Versioning. Layers

Why schema versions are called layers can only be guessed from the history of the published schemas. Apparently, at first the authors thought that basic things could be done on an unchanging schema, and that only where necessary, for specific requests, one would indicate that they are made according to a different version. In principle, not even a bad idea: the new would be, as it were, "mixed in", layered on top of the old. But let's see how it was done. True, I was not able to look at it from the very beginning: funnily enough, the schema of the base layer simply does not exist. Layers start at 2. The documentation tells us about a special TL feature:

If a client supports Layer 2, then the following constructor must be used:

invokeWithLayer2#289dd1f6 {X:Type} query:!X = X;

In practice, this means that before every API call, an int with the value 0x289dd1f6 must be added before the method number.
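As a sketch of what that prefixing looks like on the wire, assuming the usual TL convention that a 32-bit int is written little-endian (the function name is hypothetical, and `serialized_query` stands for an already serialized method call):

```python
import struct

INVOKE_WITH_LAYER2 = 0x289dd1f6  # constructor ID from the definition above


def wrap_in_layer2(serialized_query):
    """Prepend the invokeWithLayer2 constructor ID to a serialized method call."""
    return struct.pack('<I', INVOKE_WITH_LAYER2) + serialized_query
```

Four extra bytes per request is cheap, which is why the approach looks reasonable at this point in the story.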

Sounds fine. But what happened next? Then there appeared

invokeWithLayer3#b7475268 query:!X = X;

And then what? As you can easily guess,

invokeWithLayer4#dea0d430 query:!X = X;

Funny? No, it is too early to laugh. Think about the fact that every request from a different layer has to be wrapped in such a special type: if they are all different for you, how else would you tell them apart? And adding just 4 bytes in front is a fairly efficient method. So,

invokeWithLayer5#417a57ae query:!X = X;

But it is obvious that after a while this will turn into some kind of bacchanalia. And the solution arrived:

Update: Starting with Layer 9, helper methods invokeWithLayerN can be used only together with initConnection

Hooray! After 9 versions we have finally arrived at what Internet protocols were already doing back in the 80s: negotiating the version once, at the start of the connection!

So what next?..

invokeWithLayer10#39620c41 query:!X = X;
...
invokeWithLayer18#1c900537 query:!X = X;

And now you may laugh some more. Only after another 9 layers was a universal constructor with a version number finally added, one that has to be called just once at the start of the connection, and the meaning of layers seemed to disappear: now it is just a conventional version, like everywhere else. Problem solved.

Right?..

Vasily, [16.07.18 14:01] Back on Friday I was thinking:
The teleserver sends events without a request. Requests have to be wrapped in InvokeWithLayer. The server does not wrap updates; there is no structure for wrapping responses and updates.

I.e. the client cannot specify the layer in which it wants updates

Vadim Goncharov, [16.07.18 14:02] isn't InvokeWithLayer a crutch in principle?

Vasily, [16.07.18 14:02] It is the only way

Vadim Goncharov, [16.07.18 14:02] which essentially should mean agreeing on the layer at the start of the session.

By the way, it follows that a client downgrade is not provided for

Updates, i.e. the type Updates in the schema, are what the server sends to the client not in response to an API request but on its own, when an event occurs. This is a complex topic that will be discussed in another post, but for now it is important to know that the server accumulates Updates even while the client is offline.

Thus, once you give up wrapping every packet to indicate its version, the following possible problems logically arise:

  • the server sends updates to the client even before the client has announced which version it supports
  • what should be done after a client upgrade?
  • who guarantees that the server's opinion about the layer number will not change in the process?

Do you think this is purely theoretical speculation, and that in practice this cannot happen because the server is written correctly (or at least is tested well)? Ha! No such luck!

This is exactly what we ran into in August. On August 14 there were fleeting reports that something was being updated on the Telegram servers... and then in the logs:

2019-08-15 09:28:35.880640 MSK warn  main: ANON:87: unknown object type: 0x80d182d1 at TL/Object.pm line 213.
2019-08-15 09:28:35.751899 MSK warn  main: ANON:87: unknown object type: 0xb5223b0f at TL/Object.pm line 213.

and then several megabytes of stack traces (well, at least we fixed the logging along the way). After all, if something is not recognized in your TL, which is binary and keyed by signatures, then EVERYTHING further down the stream goes off the rails and decoding becomes impossible. What should you do in such a situation?

Well, the first thing that comes to anyone's mind is to disconnect and try again. It did not help. We google the CRC32 values: they turn out to be objects from schema 73, although we were working on 82. We look at the logs carefully: there are identifiers from two different schemas!

Maybe the problem is purely in our unofficial client? No, we launch Telegram Desktop 1.2.17 (the version shipped in a number of Linux distributions), and it writes to the log: Exception: MTP Unexpected type id #b5223b0f read in MTPMessageMedia…

Google showed that a similar problem had already happened to one of the unofficial clients, but back then the version numbers and, accordingly, the assumptions were different...

So what should we do? Vasily and I split up: he tried to update the schema to 91, and I decided to wait a few days and try 73. Both approaches worked, but since they are empirical, there is no understanding of how many versions up or down you need to jump, or of how long you need to wait.

Later I managed to reproduce the situation: we start the client, shut it down, recompile the schema for a different layer, restart, catch the problem again, go back to the previous layer, and oops: no amount of schema switching and client restarts over the next few minutes will help. You will keep receiving a mix of data structures from different layers.

The explanation? As you can guess from various indirect signs, the server consists of many processes of different types on different machines. Most likely, the server responsible for "buffering" put into its queue whatever the upstream ones handed to it, and they handed it over in the schema that was in effect at the time of generation. And until that queue "went stale", nothing could be done about it.

Unless... but that would be a terrible crutch?!.. No, before thinking about crazy ideas, let's look at the code of the official clients. In the Android version we do not find any TL parser, but we do find a hefty file (GitHub refuses to highlight it) with (de)serialization. Here are some code fragments:

public static class TL_message_layer68 extends TL_message {
    public static int constructor = 0xc09be45f;
//...
//a bunch more like these
//...
    public static class TL_message_layer47 extends TL_message {
        public static int constructor = 0xc992e15c;
        public static Message TLdeserialize(AbstractSerializedData stream, int constructor, boolean exception) {
            Message result = null;
            switch (constructor) {
                case 0x1d86f70e:
                    result = new TL_messageService_old2();
                    break;
                case 0xa7ab1991:
                    result = new TL_message_old3();
                    break;
                case 0xc3060325:
                    result = new TL_message_old4();
                    break;
                case 0x555555fa:
                    result = new TL_message_secret();
                    break;
                case 0x555555f9:
                    result = new TL_message_secret_layer72();
                    break;
                case 0x90dddc11:
                    result = new TL_message_layer72();
                    break;
                case 0xc09be45f:
                    result = new TL_message_layer68();
                    break;
                case 0xc992e15c:
                    result = new TL_message_layer47();
                    break;
                case 0x5ba66c13:
                    result = new TL_message_old7();
                    break;
                case 0xc06b9607:
                    result = new TL_messageService_layer48();
                    break;
                case 0x83e5de54:
                    result = new TL_messageEmpty();
                    break;
                case 0x2bebfa86:
                    result = new TL_message_old6();
                    break;
                case 0x44f9b43d:
                    result = new TL_message_layer104();
                    break;
                case 0x1c9b1027:
                    result = new TL_message_layer104_2();
                    break;
                case 0xa367e716:
                    result = new TL_messageForwarded_old2(); //custom
                    break;
                case 0x5f46804:
                    result = new TL_messageForwarded_old(); //custom
                    break;
                case 0x567699b3:
                    result = new TL_message_old2(); //custom
                    break;
                case 0x9f8d60bb:
                    result = new TL_messageService_old(); //custom
                    break;
                case 0x22eb6aba:
                    result = new TL_message_old(); //custom
                    break;
                case 0x555555F8:
                    result = new TL_message_secret_old(); //custom
                    break;
                case 0x9789dac4:
                    result = new TL_message_layer104_3();
                    break;

or

    boolean fixCaption = !TextUtils.isEmpty(message) &&
    (media instanceof TLRPC.TL_messageMediaPhoto_old ||
     media instanceof TLRPC.TL_messageMediaPhoto_layer68 ||
     media instanceof TLRPC.TL_messageMediaPhoto_layer74 ||
     media instanceof TLRPC.TL_messageMediaDocument_old ||
     media instanceof TLRPC.TL_messageMediaDocument_layer68 ||
     media instanceof TLRPC.TL_messageMediaDocument_layer74)
    && message.startsWith("-1");

Hmm... looks wild. But this is probably generated code, so fine?.. At least it certainly supports all versions! True, it is unclear why everything is lumped together, secret chats included, and all those _old7 somehow do not look like machine generation... However, what stunned me most of all was

TL_message_layer104
TL_message_layer104_2
TL_message_layer104_3

Guys, can't you make up your minds even within a single layer?! Well, okay, say "two" was released with a bug, it happens, but THREE?.. Straight onto the same rake again? What kind of pornography is this, excuse me?..

In the Telegram Desktop sources, by the way, the same thing happens: several commits in a row to the schema do not change its layer number but fix something. Given that there is no official source of data for the schema, where can you get it from, other than the source code of the official client? And if you take it from there, you cannot be sure that the schema is completely correct until you have tested all the methods.

How can this be tested at all? I hope fans of unit, functional and other tests will share in the comments.

Okay, let's look at another piece of code:

public static class TL_folders_deleteFolder extends TLObject {
    public static int constructor = 0x1c295881;

    public int folder_id;

    public TLObject deserializeResponse(AbstractSerializedData stream, int constructor, boolean exception) {
        return Updates.TLdeserialize(stream, constructor, exception);
    }

    public void serializeToStream(AbstractSerializedData stream) {
        stream.writeInt32(constructor);
        stream.writeInt32(folder_id);
    }
}

//manually created

//RichText start
public static abstract class RichText extends TLObject {
    public String url;
    public long webpage_id;
    public String email;
    public ArrayList<RichText> texts = new ArrayList<>();
    public RichText parentRichText;

    public static RichText TLdeserialize(AbstractSerializedData stream, int constructor, boolean exception) {
        RichText result = null;
        switch (constructor) {
            case 0x1ccb966a:
                result = new TL_textPhone();
                break;
            case 0xc7fb5e01:
                result = new TL_textSuperscript();
                break;

This "manually created" comment suggests that only part of this file was written by hand (can you imagine the whole maintenance nightmare?), and the rest was machine-generated after all. However, another question then arises: that the sources are not available in full (a la GPL blobs in the Linux kernel), but that is a topic for the second part.

But enough. Let's move on to the protocol on top of which all this serialization runs.

MTProto

So, let's open the general description and the detailed description of the protocol, and the first thing we stumble over is terminology. And an abundance of everything. In general, this seems to be a Telegram trademark: calling things differently in different places, or calling different things by one word, or vice versa (for example, in the high-level API, if you see a sticker pack, it is not what you think).

For example, "message" and "session" mean something different here than in the familiar interface of a Telegram client. Well, everything is clear with the message: it could be interpreted in OOP terms, or simply called a "packet". This is the low, transport level, and the messages here are not the same as in the interface; many of them are service messages. But the session... first things first.

Transport layer

The first thing is the transport. We are told about 5 options:

  • TCP
  • Websocket
  • Websocket akan HTTPS
  • HTTP
  • HTTPS

Vasily, [15.06.18 15:04] There is also a UDP transport, but it is not documented.

And TCP in three variants

The first one looks like UDP over TCP: each packet includes a sequence number and a crc
Why is reading the Telegram docs so painful?

Well, by now TCP already comes in 4 variants:

  • Abridged
  • Intermediate
  • Padded intermediate
  • Full

Well, ok, Padded intermediate is for MTProxy; it was added later because of well-known events. But why two more versions (three in total) when one would have been enough? All four essentially differ only in how the length and the payload of the main MTProto proper, which will be discussed further, are specified:

  • in Abridged it is 1 or 4 bytes of length, but not 0xef, then the body
  • in Intermediate it is 4 bytes of length and the body, and the first time the client must send 0xeeeeeeee to indicate that this is Intermediate
  • in Full, the most junkie-like one from a network engineer's point of view: length, sequence number (and NOT THE ONE used in the main MTProto), body, CRC32. Yes, all of this on top of TCP, which gives us a reliable transport in the form of a sequential byte stream; no sequence numbers are needed, let alone checksums. Okay, now someone will object that TCP has a 16-bit checksum, so data corruption does happen. Great, but we actually have a cryptographic protocol with hashes longer than 16 bytes, so all these errors, and even more, will be caught by an SHA mismatch one level up. There is NO point in CRC32 on top of this.
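The three framings can be sketched as follows. This is only an illustration of the list above, under two assumptions about details the text leaves out: Abridged counts the length in 4-byte words, switching to the 4-byte form (0x7f plus three little-endian bytes) once that count reaches 127, and the Full length field covers the whole frame including itself and the CRC32. The one-time markers sent at the start of a connection (such as 0xeeeeeeee for Intermediate) and Padded intermediate are not shown.

```python
import struct
import zlib


def frame_abridged(payload):
    """1-byte length in 4-byte words, or 0x7f followed by a 3-byte length."""
    if len(payload) % 4:
        raise ValueError('payload length must be a multiple of 4')
    words = len(payload) // 4
    if words < 0x7f:
        header = bytes([words])
    else:
        header = b'\x7f' + words.to_bytes(3, 'little')
    return header + payload


def frame_intermediate(payload):
    """4-byte little-endian length in bytes, then the body."""
    return struct.pack('<I', len(payload)) + payload


def frame_full(payload, seq_no):
    """Length, transport-level sequence number, body, CRC32 of all of that."""
    total_len = len(payload) + 12  # length field + seq_no + CRC32
    head = struct.pack('<II', total_len, seq_no) + payload
    return head + struct.pack('<I', zlib.crc32(head))
```

Seen side by side, the only thing that really varies is the header, which makes the existence of several variants look even stranger.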

Let's compare Abridged, in which a single byte of length is possible, with Intermediate, which is justified by "In case 4-byte data alignment is needed", which is utter nonsense. What, are Telegram programmers supposed to be so clumsy that they cannot read data from a socket into an aligned buffer? You have to do that anyway, because a read can return any number of bytes (and there are also proxy servers, for example...). Or, on the other hand, why bother with Abridged if on top of it we will still have hefty paddings of 16 bytes or more? To save 3 bytes once in a while?

One gets the impression that Nikolai Durov is very fond of reinventing wheels, network protocols included, without any real practical need.

We will not consider the other transport options now, including Web and MTProxy; maybe in another post, if there is demand. About that same MTProxy, let us only recall that shortly after its release in 2018, providers quickly learned to block precisely it, a tool intended for bypassing blocking, by packet size! And also the fact that the MTProxy server written in C (again by Waltman) was needlessly tied to Linux specifics, although this was not required at all (Phil Kulin will confirm), and that a similar server in either Go or Node.js would fit in fewer than a hundred lines.

But we will draw conclusions about the technical literacy of these people at the end of the section, after considering other issues. For now, let's move on to OSI layer 5, the session layer, on which they placed the MTProto session.

Keys, messages, sessions, Diffie-Hellman

They did not place it there quite correctly... A session is not the same session that is visible in the interface under Active sessions. But let's take it in order.

So we have received a byte string of known length from the transport layer. This is either an encrypted message or plaintext, if we are still at the key agreement stage and are actually doing it. Which of the pile of concepts called "key" are we talking about? Let's clarify this question on behalf of the Telegram team itself (I apologize for translating my own doc from English with a tired brain at 4 am; some phrases were easier to leave as they are):

There are two entities called session: one in the UI of the official clients under "current sessions", where each session corresponds to a whole device / OS.
The second is the MTProto session, which has a message sequence number (in the low-level sense) within it, and which may last across different TCP connections. Several MTProto sessions can be established at the same time, for example, to speed up file downloads.

Between these two kinds of sessions lies the concept of authorization. In the degenerate case, one can say that a UI session is the same thing as an authorization, but alas, it is complicated. Let's look:

  • The user on a new device first generates an auth_key and binds it to an account, for example via SMS, which is why it is an authorization
  • This happened inside the first MTProto session, which has a session_id inside it.
  • At this step, the combination of authorization and session_id could be called an instance; this word appears in the documentation and in the code of some clients
  • Then the client can open several MTProto sessions under the same auth_key, to the same DC.
  • Then, one day, the client will need to request a file from another DC, and for that DC a new auth_key will be generated!
  • To tell the system that this is not a new user registering but the same authorization (UI session), the client uses the API calls auth.exportAuthorization in the home DC and auth.importAuthorization in the new DC.
  • All the same, several MTProto sessions (each with its own session_id) can be opened to this new DC, under its auth_key.
  • Finally, the client may want Perfect Forward Secrecy. Every auth_key so far was a permanent key, per DC, and the client can call auth.bindTempAuthKey to use a temporary auth_key; again, there is only one temp_auth_key per DC, shared by all MTProto sessions to that DC.

Note that the salt (and the future salts) is also one per auth_key, i.e. shared between all MTProto sessions to the same DC.

What does "across different TCP connections" mean? It means this is something like an authorization cookie on a website: it persists across (survives) many TCP connections to a given server, but one day it goes stale. Unlike HTTP, though, in MTProto the messages inside a session are sequentially numbered and acknowledged; if you drove into a tunnel and the connection broke, then after a new connection is established the server will kindly send everything in this session that it did not deliver over the previous TCP connection.

However, the information above is a digest compiled after many months of investigation. Meanwhile, are we implementing our client from scratch? Let's go back to the beginning.

So, let's generate the auth_key using Telegram's version of Diffie-Hellman. Let's try to make sense of the documentation...

Vasily, [19.06.18 20:05] data_with_hash := SHA1(data) + data + (any random bytes); such that the length equals 255 bytes;
encrypted_data := RSA(data_with_hash, server_public_key); a 255-byte long number (big endian) is raised to the requisite power over the requisite modulus, and the result is stored as a 256-byte number.

They have some kind of junkie DH

Doesn't look like a healthy person's DH
There are no two public keys in the DH exchange

Well, in the end we sorted this out, but an aftertaste remained: the client does a proof of work by showing that it managed to factorize a number. A sort of protection against DoS attacks. And the RSA key is used only once, in one direction, essentially to encrypt new_nonce. But before this seemingly simple operation succeeds, what will you have to face?
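Read literally, the two quoted formulas amount to SHA1-prefixed padding followed by textbook RSA. A minimal sketch under that reading; the function names are hypothetical, and `modulus` and `exponent` stand for the server's public key, which is not reproduced here:

```python
import hashlib
import os


def build_data_with_hash(data):
    """SHA1(data) + data + random padding, 255 bytes in total."""
    digest = hashlib.sha1(data).digest()  # 20 bytes
    padding_len = 255 - len(digest) - len(data)
    if padding_len < 0:
        raise ValueError('data does not fit into 255 bytes')
    return digest + data + os.urandom(padding_len)


def rsa_encrypt_raw(data_with_hash, modulus, exponent):
    """Raise the big-endian number to the power modulo n, store it as 256 bytes."""
    number = int.from_bytes(data_with_hash, 'big')
    return pow(number, exponent, modulus).to_bytes(256, 'big')
```

Note that this is raw modular exponentiation with ad hoc padding, not a standard scheme such as OAEP, which is part of what makes it look unusual.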

Vasily, [20.06.18 00:26] I haven't got to the appid request yet.

I sent this request for DH

And the doc on the transport says it may answer with 4 bytes of an error code. And that's it

Well, it told me -404, so what?

So I tell it: "catch your rubbish encrypted with the server key with such-and-such fingerprint, I want DH", and it just dumbly answers 404

What would you think of such a server response? What to do? There is nobody to ask (but more on that in the second part).

Here the whole fun is doing it by the doc

I have nothing better to do, I was just dreaming of converting numbers back and forth

Two 32-bit numbers. I packed them like everything else

But no, precisely these two have to go into a string first, as BE

Vadim Goncharov, [20.06.18 15:49] and the 404 is because of this?

Vasily, [20.06.18 15:49] YES!

Vadim Goncharov, [20.06.18 15:50] so I don't understand what it could have "not found"

Vasily, [20.06.18 15:50] roughly

It couldn't find such a decomposition into prime factors %)

They didn't even manage error reporting

Vasily, [20.06.18 20:18] Oh, akwai kuma MD5. Tuni hashes daban-daban guda uku

Ana lissafta sawun maɓalli kamar haka:

digest = md5(key + iv)
fingerprint = substr(digest, 0, 4) XOR substr(digest, 4, 4)

SHA1 and sha2

So, let's say we have obtained a 2048-bit auth_key via Diffie-Hellman. What's next? Next we discover that the lower 1024 bits of this key are not used in any way... but let's not think about that for now. At this stage we have a shared secret with the server. An analogue of a TLS session has been established, which is a very expensive procedure. But the server still knows nothing about who we are! There has been no authorization yet, strictly speaking. That is, if you were thinking in terms of "login and password", as you once did in ICQ, or at least "login and key", as in SSH (for example, on some gitlab/github): what we got is anonymous. What if the server tells us "these phone numbers are served by another DC"? Or even "your phone number is banned"? The best we can do is keep the key in the hope that it will come in useful and will not have gone stale by then.

By the way, we "received" it with reservations. For example, do we trust the server? What if it is fake? Cryptographic checks would be needed:

Vasily, [21.06.18 17:53] They suggest that mobile clients check a 2kbit number for primality %)

But why the heck, it's not clear at all

Vasily, [21.06.18 18:02] The docs don't say what to do if it turns out not to be prime

They don't. Let's see what the official Android client does in this case. Here is what (and yes, the whole file is interesting); as they say, I'll just leave this here:

static const char *goodPrime = "c71caeb9c6b1c9048e6c522f70f13f73980d40238e3e21c14934d037563d930f48198a0aa7c14058229493d22530f4dbfa336f6e0ac925139543aed44cce7c3720fd51f69458705ac68cd4fe6b6b13abdc9746512969328454f18faf8c595f642477fe96bb2a941d5bcd1d4ac8cc49880708fa9b378e3c4f3a9060bee67cf9a4a4a695811051907e162753b56b0f6b410dba74d8a84b2a14b3144e0ef1284754fd17ed950d5965b4b9dd46582db1178d169c6bc465b0d6ff9ca3928fef5b9ae4e418fc15e83ebea0f87fa9ff5eed70050ded2849f47bf959d956850ce929851f0d8115f635b105ee2e4e15d04b2454bf6f4fadf034b10403119cd8e3b92fcc5b";
if (!strcasecmp(prime, goodPrime)) {

No, of course there are still some primality tests in there, but I personally no longer have sufficient knowledge of mathematics.

OK, we have the master key. To log in, i.e. to send requests, further encryption is needed, using AES.

The message key is defined as the 128 middle bits of the SHA256 of the message body (including session, message ID, etc.), including the padding bytes, prepended by 32 bytes taken from the authorization key.

Vasily, [22.06.18 14:08] Middle, dammit, bits

We got the auth_key. That's it. Beyond that... it's not clear from the docs. Feel free to study the open source code.

Note that MTProto 2.0 requires from 12 to 1024 bytes of padding, still subject to the condition that the resulting message length be divisible by 16 bytes.

So how much padding should you add?

And yes, you get a 404 here too in case of an error
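
A hedged sketch of the two rules quoted above. The padding function simply picks the smallest legal amount plus an optional number of extra 16-byte blocks; the documentation leaves the actual choice open, which is exactly Vasily's question. The offset `88 + x` into auth_key (with x = 0 for client-to-server messages) is my recollection of the MTProto 2.0 description and should be treated as an assumption, as should both helper names.

```python
import hashlib
import os


def padding_length(plain_len: int, extra_blocks: int = 0) -> int:
    """Smallest padding of 12..1024 bytes that makes the total a multiple of 16."""
    pad = 12 + (-(plain_len + 12)) % 16 + 16 * extra_blocks
    if not 12 <= pad <= 1024:
        raise ValueError("padding outside the 12..1024 byte range")
    return pad


def compute_msg_key(auth_key: bytes, padded_plaintext: bytes, x: int = 0) -> bytes:
    """Middle 128 bits of SHA256(32 bytes of auth_key + padded plaintext)."""
    digest = hashlib.sha256(auth_key[88 + x:88 + x + 32] + padded_plaintext).digest()
    return digest[8:24]  # bytes 8..23 are the "middle" 16 of 32


plaintext = b"salt, session_id, msg_id, seq_no, length, body"
padded = plaintext + os.urandom(padding_length(len(plaintext)))
msg_key = compute_msg_key(os.urandom(256), padded)
```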

Anyone who has carefully studied the diagram and the text of the documentation will have noticed that there is no MAC there. And that AES is used in a certain IGE mode that is not used anywhere else. They do, of course, write about this in their FAQ... Supposedly the message key itself is also the SHA hash of the decrypted data and is used to check integrity; and in case of a mismatch the documentation for some reason recommends silently ignoring the message (but what about security, what if someone is trying to break us?).

I am not a cryptographer; perhaps there is nothing wrong with this mode in this case from a theoretical point of view. But I can clearly name a practical problem, using Telegram Desktop as an example. It encrypts the local cache (all those D877F783D5D3EF8C) in the same way as messages in MTProto (only in this case version 1.0), i.e. first the message key, then the data itself (and somewhere off to the side the big 256-byte auth_key, without which msg_key is useless). The problem becomes noticeable on large files. Namely, you need to keep two copies of the data, encrypted and decrypted. And what if it is megabytes, or streaming video, for example?.. Classic schemes with a MAC after the ciphertext let you read it as a stream and pass it on immediately. But with MTProto you have to encrypt or decrypt the whole message first, and only then hand it to the network or the disk. That is why the latest versions of Telegram Desktop already use a different format for the cache in user_data, with AES in CTR mode.

Vasily, [21.06.18 01:27] Oh, I found out what IGE is: IGE was the first attempt at an "authenticating encryption mode," originally for Kerberos. It was a failed attempt (it does not provide integrity protection), and had to be removed. That was the beginning of a 20 year quest for an authenticating encryption mode that works, which recently culminated in modes like OCB and GCM.

And now the arguments from the Telegram side:

The team behind Telegram, led by Nikolai Durov, consists of six ACM champions, half of them Ph.Ds in math. It took them about two years to roll out the current version of MTProto.

That's funny. Two years on the lower level

Or they could have just taken TLS

Okay, let's say we have dealt with encryption and the other nuances. Can we finally send requests serialized in TL and deserialize the responses? So what should be sent, and how? Here, say, is the method initConnection, maybe this is it?

Vasily, [25.06.18 18:46] Initializes connection and saves information on the user's device and application.

It accepts app_id, device_model, system_version, app_version and lang_code.

And some query

Documentation as usual. Feel free to study the open source

If everything was more or less clear with invokeWithLayer, then what is going on here? It turns out that, let's say, the client already has something to ask the server about, there is a request that we wanted to send:

Vasily, [25.06.18 19:13] Judging by the code, the first call is wrapped in this crap, and the crap itself is wrapped in invokeWithLayer
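
The nesting Vasily describes can be pictured as follows. This is only an illustration of the wrapping order with plain dictionaries, not a TL serializer; the field names of initConnection are the ones listed above, while `wrap_first_call`, the layer number and all field values are placeholders.

```python
def wrap_first_call(query: dict, layer: int, app: dict) -> dict:
    """Wrap the first real request: invokeWithLayer(initConnection(query))."""
    init = {"_": "initConnection", **app, "query": query}
    return {"_": "invokeWithLayer", "layer": layer, "query": init}


APP_INFO = {                      # placeholder values (assumption)
    "app_id": 12345,
    "device_model": "generic",
    "system_version": "0.0",
    "app_version": "0.0.1",
    "lang_code": "en",
}

first_request = wrap_first_call({"_": "help.getConfig"}, layer=1, app=APP_INFO)
```

The actual request ends up two levels deep, which is why initConnection cannot be sent as a call of its own.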

Why couldn't initConnection be a separate call, why must it be a wrapper? Yes, as it turned out, it must be done every time at the start of each session, and not once, as with the master key. But! It cannot be called by an unauthorized user! Now we have reached the stage where this documentation page applies, and it tells us that...

Only a small portion of the API methods are available to unauthorized users:

  • auth.sendCode
  • auth.resendCode
  • account.getPassword
  • auth.checkPassword
  • auth.checkPhone
  • auth.signUp
  • auth.signIn
  • auth.importAuthorization
  • help.getConfig
  • help.getNearestDc
  • help.getAppUpdate
  • help.getCdnConfig
  • langpack.getLangPack
  • langpack.getStrings
  • langpack.getDifference
  • langpack.getLanguages
  • langpack.getLanguage

The very first of them, auth.sendCode, is that cherished first request in which we send api_id and api_hash, and after which we receive an SMS with a code. And if we have landed on the wrong DC (phone numbers of this country are served by another one, for example), we will receive an error with the number of the desired DC. To find out which IP address to connect to for a given DC number, there is help.getConfig. At one time there were only 5 entries, but after the well-known events of 2018 the number increased significantly.

Now let's recall that we reached this stage on the server anonymously. Isn't that too expensive just to get an IP address? Why not do this, and other operations, in the unencrypted part of MTProto? I hear the objection: "How can we make sure that it is not RKN responding with false addresses?" To this we recall that official clients generally have RSA keys embedded, i.e. this information can simply be signed. In fact, this is already done for the information on bypassing blocking that clients receive through other channels (logically, this cannot be done in MTProto itself; you first need to know where to connect).

OK. At this stage of client authorization we are not yet authorized and have not registered our application. For now we just want to see what the server answers to the methods available to an unauthorized user. And here…

Vasily, [10.07.18 14:45] https://core.telegram.org/method/help.getConfig

config#7dae33e0 [...] = Config;
help.getConfig#c4f9186b = Config;

https://core.telegram.org/api/datacenter

config#232d5905 [...] = Config;
help.getConfig#c4f9186b = Config;

The schema has the first one, but what arrives is the second

In the tdesktop schema the value is yet a third one

Yes, since then, of course, the documentation has been updated. Though it may soon become outdated again. How is a novice developer supposed to know? Maybe if you register your application, they will inform you? Vasily did this, but alas, they did not send him anything (again, we will talk about this in part two).

...Did you notice that we have somehow already moved on to the API, i.e. to the next level, and missed something in the MTProto topic? No wonder:

Vasily, [28.06.18 02:04] Mm, they're rummaging through some of the algorithms on e2e

Mtproto defines encryption algorithms and keys for both domains, as well as a bit of wrapper structure

But they constantly mix different levels of the stack, so it's not always clear where mtproto ended and the next level began

How do they mix? Well, take the same temporary key for PFS, for example (by the way, Telegram Desktop cannot do it). It is set up by the API request auth.bindTempAuthKey, i.e. from the top level. But at the same time it interferes with encryption at the lower level: after it, for example, you need to do initConnection again, and so on; this is not just a normal request. Another special thing is that you can have only ONE temporary key per DC, although the auth_key_id field in every message would allow changing the key at least with every message, and that the server has the right to "forget" the temporary key at any moment. The documentation does not say what to do in this case... and why couldn't there be several keys, as with the set of future salts?..

There are a few other things worth noting about the MTProto topic.

Messages about messages, msg_id, msg_seqno, acknowledgments, pings in the wrong direction and other idiosyncrasies

Why do you need to know about them? Because they "leak" to the higher level, and you need to know about them when working with the API. Let's assume we are not interested in msg_key; the lower level has decrypted everything for us. But inside the decrypted data we have the following fields (plus the length of the data, so we know where the padding is, but that is not important):

  • salt - int64
  • session_id - int64
  • message_id - int64
  • seq_no - int32

Let us remind you that there is only one salt for the entire DC. Why know about it? Not only because there is a request get_future_salts, which tells you which intervals will be valid, but also because if your salt has "gone rotten", the message (request) will simply be lost. The server will, of course, report the new salt by issuing new_session_created, but the old message you will have to resend yourself somehow. And this issue affects the application architecture.

The server is allowed to drop sessions altogether and respond this way for many reasons. Actually, what is an MTProto session from the client side? It is two numbers, session_id and the seq_no of messages within this session. Well, and the underlying TCP connection, of course. Let's say our client still cannot do many things, and it disconnected and reconnected. If this happened quickly, the old session continues in the new TCP connection, and seq_no keeps increasing. If it took a long time, the server may have deleted it, because on its side it is also a queue, as we found out.

What should seq_no be? Oh, that is a tricky question. Try to honestly understand what was meant:

Content-related Message

A message requiring an explicit acknowledgment. These include all the user and many service messages, virtually all with the exception of containers and acknowledgments.

Message Sequence Number (msg_seqno)

A 32-bit number equal to twice the number of "content-related" messages (those requiring acknowledgment, and in particular those that are not containers) created by the sender prior to this message and subsequently incremented by one if the current message is a content-related message. A container is always generated after its entire contents; therefore, its sequence number is greater than or equal to the sequence numbers of the messages contained in it.

What kind of circus is this, with an increment by 1 and then another by 2?.. I suspect that originally they meant "the least significant bit is for ACK, the rest is a number", but the result is not quite that: in particular, it turns out that several acknowledgments can be sent with the same seq_no! How? Well, for example, the server keeps sending us things, and we ourselves stay silent, responding only with service messages that acknowledge receipt of its messages. In this case our outgoing acknowledgments will all have the same outgoing number. If you are familiar with TCP and thought that this sounds a bit wild, but not very wild, because in TCP seq_no does not change either and the acknowledgment refers to the seq_no of the other side, then I hasten to upset you. Acknowledgments in MTProto go NOT by seq_no, as in TCP, but by msg_id!
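
To make the quoted rule concrete, here is a small sketch that follows the definition literally: twice the number of content-related messages sent so far, plus one if the current message is itself content-related. `SeqNoCounter` is a hypothetical name.

```python
class SeqNoCounter:
    """Tracks msg_seqno for the outgoing side of one session."""

    def __init__(self) -> None:
        self.content_related_sent = 0

    def next(self, content_related: bool) -> int:
        seq_no = 2 * self.content_related_sent
        if content_related:
            seq_no += 1
            self.content_related_sent += 1
        return seq_no


counter = SeqNoCounter()
numbers = [
    counter.next(True),    # an RPC query
    counter.next(True),    # another RPC query
    counter.next(False),   # a bare acknowledgment
    counter.next(False),   # a second acknowledgment: same seq_no again
    counter.next(True),    # the next RPC query
]
```

The two acknowledgments in the middle share one number, which is the oddity described above.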

What is this msg_id, the most important of these fields? A unique message identifier, as the name suggests. It is defined as a 64-bit number, the least significant bits of which again carry the "server or not server" magic, and the rest is a Unix timestamp, including the fractional part, shifted 32 bits to the left. That is, a timestamp per se (and messages whose time differs too much will be rejected by the server). From this it follows that, in general, this is an identifier that is global for the client. At the same time, recalling session_id, we are guaranteed: under no circumstances can a message meant for one session be sent into a different session. That is, it turns out there are already three levels: session, sequence number within the session, message id. Why such overcomplication is a great mystery.
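
A sketch of how a client might generate such identifiers. The layout (seconds in the upper 32 bits, fraction in the lower ones) follows the description above; the rule that client identifiers have their two lowest bits cleared is my recollection of the official description and is an assumption here, as is the function name.

```python
import time


def make_msg_id(now: float, last_msg_id: int = 0) -> int:
    """Build a client msg_id from a Unix time, strictly above last_msg_id."""
    seconds = int(now)
    fraction = int((now - seconds) * 2**32)
    # Low two bits cleared: the "client side" marker (assumption).
    msg_id = (seconds << 32) | (fraction & ~0b11)
    if msg_id <= last_msg_id:
        msg_id = last_msg_id + 4   # keep ids increasing within the session
    return msg_id


first = make_msg_id(time.time())
second = make_msg_id(time.time(), first)
```

Because the value is derived from the local clock, a client whose clock is far off gets its messages rejected, which matters for the queueing discussion later in the text.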

So, msg_id is needed for...

RPC: requests, responses, errors. Acknowledgments.

As you may have noticed, there is no special type or function "make an RPC request" anywhere in the schema, although there are responses. After all, we have content-related messages! That is, any message can be a request! Or not be one. After all, each has a msg_id. But there are responses:

rpc_result#f35c6d01 req_msg_id:long result:Object = RpcResult;

This is where it is indicated which message this is a response to. Therefore, at the top level of the API, you have to remember what the number of your request was. I think there is no need to explain that the work is asynchronous, that several requests can be in flight at the same time, and that the answers to them can come back in any order? In principle, from this and from error messages like "no workers", the architecture behind it can be traced: the server holding the TCP connection with you is a front-end balancer, it forwards requests to the backends and collects them back by message_id. Everything here seems clear, logical and good.

Yes?.. And if you think about it? After all, the RPC response itself also has a msg_id field! Do we need to yell at the server "you are not answering my answer!"? And yes, what was that about acknowledgments? The page about messages about messages tells us that there is

msgs_ack#62d6b459 msg_ids:Vector long = MsgsAck;

and that it must be done by each side. But not always! If you received an RpcResult, it itself serves as an acknowledgment. That is, the server may respond to your request with MsgsAck, as in "I received it". It may respond immediately with RpcResult. It may do both.

And yes, you still have to answer the answer! With an acknowledgment. Otherwise the server will consider it undelivered and throw it at you again. Even after a reconnection. But here, of course, the question of timeouts arises. Let's look at them a bit later.

In the meantime, let's look at the possible errors in query execution.

rpc_error#2144ca19 error_code:int error_message:string = RpcError;

Oh, someone will exclaim, here is a more humane format, there is a string! Take your time. Here is the list of errors, but of course it is not complete. From it we learn that the code is something like HTTP errors (well, of course, the semantics of the responses is not respected, in some places they are spread over the codes at random), and the string looks like CAPITAL_LETTERS_AND_NUMBERS. For example, PHONE_NUMBER_OCCUPIED or FILE_PART_X_MISSING. That is, you will still have to parse this string. For example, FLOOD_WAIT_3600 means that you have to wait an hour, and PHONE_MIGRATE_5 means that a phone number with this prefix must be registered in the 5th DC. We have a typed language, right? We don't need an argument separate from the string, regular expressions will do, sure.
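
Since the numeric argument is baked into the string, a client ends up with something like the following. This is a minimal sketch assuming that the argument is always a run of digits delimited by underscores, which is what the examples above suggest; `parse_rpc_error` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# A run of digits that is not glued to a letter or another digit.
_NUMBER = re.compile(r"(?<![A-Z0-9])(\d+)(?![A-Z0-9])")


def parse_rpc_error(message: str) -> Tuple[str, Optional[int]]:
    """Split an error_message into a template and its numeric argument."""
    match = _NUMBER.search(message)
    if match is None:
        return message, None
    template = message[:match.start(1)] + "X" + message[match.end(1):]
    return template, int(match.group(1))


flood = parse_rpc_error("FLOOD_WAIT_3600")
migrate = parse_rpc_error("PHONE_MIGRATE_5")
```

A caller would then sleep for `flood[1]` seconds or reconnect to DC number `migrate[1]`.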

Again, this is not on the service messages page, but, as is already customary with this project, the information can be found on another page of the documentation. Or it can raise suspicions. First, look, a typing/layering violation: RpcError can be nested in RpcResult. Why not outside? What have we not taken into account?.. Accordingly, where is the guarantee that RpcError may NOT come outside RpcResult, standing on its own or nested in another type?.. And if it cannot, why is it not at the top level, i.e. why does it lack req_msg_id?..

But let's continue with service messages. The client may decide that the server is thinking for too long and make this wonderful request:

rpc_drop_answer#58e4a740 req_msg_id:long = RpcDropAnswer;

There are three possible answers to it, again intersecting with the acknowledgment mechanism; trying to understand what they should be (and what the general list of types that do not require acknowledgment is) is left to the reader as homework (note: the information in the Telegram Desktop source code is incomplete).

Junkie stuff: message statuses

In general, many places in TL, MTProto and Telegram as a whole leave a feeling of stubbornness, but out of politeness, tact and other soft skills we politely kept silent about it, and censored the obscenities in the dialogs. However, this place, most of the page about messages about messages, is shocking even for me, who has worked with network protocols for a long time and has seen reinvented wheels of varying degrees of crookedness.

It starts innocuously, with acknowledgments. Next they tell us about

bad_msg_notification#a7eff811 bad_msg_id:long bad_msg_seqno:int error_code:int = BadMsgNotification;
bad_server_salt#edab447b bad_msg_id:long bad_msg_seqno:int error_code:int new_server_salt:long = BadMsgNotification;

Well, everyone who starts working with MTProto will have to deal with them; in the "fixed, recompiled, launched" cycle, getting an error about the number, or about a salt that went stale while you were editing, is a common thing. However, there are two points here:

  1. This means that the original message is lost. We need to build some queues, we will look at that later.
  2. What are these strange error numbers? 16, 17, 18, 19, 20, 32, 33, 34, 35, 48, 64... where are the rest of the numbers, Tommy?

The documentation states:

The intention is that error_code values are grouped (error_code >> 4): for example, the codes 0x40 - 0x4f correspond to errors in container decomposition.

but, firstly, the shift is in the other direction, and secondly, never mind that, where are the rest of the codes? In the author's head?.. However, these are trifles.
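
For reference, this is what the grouping from the quoted sentence gives when applied to the codes listed above. The sketch takes the documentation's `error_code >> 4` at face value and makes no claim about what the groups mean beyond the one example it gives.

```python
from collections import defaultdict

DOCUMENTED_CODES = [16, 17, 18, 19, 20, 32, 33, 34, 35, 48, 64]


def group_error_codes(codes):
    """Group error codes by their upper bits, i.e. by error_code >> 4."""
    groups = defaultdict(list)
    for code in codes:
        groups[code >> 4].append(code)
    return dict(groups)


groups = group_error_codes(DOCUMENTED_CODES)
```

Each group has room for sixteen codes, and at most five of them are listed.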

The junkie stuff starts in the messages about message statuses and message copies:

  • Request for Message Status Information
    If either party has not received information on the status of its outgoing messages for a while, it may explicitly request it from the other party:
    msgs_state_req#da69fb52 msg_ids:Vector long = MsgsStateReq;
  • Informational Message regarding Status of Messages
    msgs_state_info#04deb57d req_msg_id:long info:string = MsgsStateInfo;
    Here, info is a string that contains exactly one byte of message status for each message from the incoming msg_ids list:

    • 1 = nothing is known about the message (msg_id too low, the other party may have forgotten it)
    • 2 = message not received (msg_id falls within the range of stored identifiers; however, the other party has certainly not received a message like that)
    • 3 = message not received (msg_id too high; however, the other party has certainly not received it yet)
    • 4 = message received (note that this response is also at the same time a receipt acknowledgment)
    • +8 = message already acknowledged
    • +16 = message not requiring acknowledgment
    • +32 = RPC query contained in message being processed or processing already complete
    • +64 = content-related response to message already generated
    • +128 = other party knows for a fact that message is already received
      This response does not require an acknowledgment. It is an acknowledgment of the relevant msgs_state_req, in and of itself.
      Note that if it turns out suddenly that the other party does not have a message that looks like it has been sent to it, the message can simply be re-sent. Even if the other party should receive two copies of the message at the same time, the duplicate will be ignored. (If too much time has passed, and the original msg_id is no longer valid, the message is to be wrapped in msg_copy).
  • Voluntary Communication of Status of Messages
    Either party may voluntarily inform the other party of the status of the messages transmitted by the other party.
    msgs_all_info#8cc0d131 msg_ids:Vector long info:string = MsgsAllInfo
  • Extended Voluntary Communication of Status of One Message
    ...
    msg_detailed_info#276d3ec6 msg_id:long answer_msg_id:long bytes:int status:int = MsgDetailedInfo;
    msg_new_detailed_info#809db6df answer_msg_id:long bytes:int status:int = MsgDetailedInfo;
  • Explicit Request to Re-Send Messages
    msg_resend_req#7d861a08 msg_ids:Vector long = MsgResendReq;
    The remote party immediately responds by re-sending the requested messages […]
  • Explicit Request to Re-Send Answers
    msg_resend_ans_req#8610baeb msg_ids:Vector long = MsgResendReq;
    The remote party immediately responds by re-sending answers to the requested messages […]
  • Message Copies
    In some situations, an old message with a msg_id that is no longer valid needs to be re-sent. Then, it is wrapped in a copy container:
    msg_copy#e06046b2 orig_message:Message = MessageCopy;
    Once received, the message is processed as if the wrapper were not there. However, if it is known for certain that the message orig_message.msg_id was received, then the new message is not processed (while at the same time, it and orig_message.msg_id are acknowledged). The value of orig_message.msg_id must be lower than the container's msg_id.
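
The status byte of msgs_state_info quoted above mixes an enumeration (1 to 4) with additive flags (+8 to +128). A sketch of decoding it; note that the value 4 needs three bits, so the mask below is `0x07`, and the names in the tables are paraphrases of the quoted list rather than identifiers from any official source.

```python
STATES = {
    1: "unknown (msg_id too low)",
    2: "not received (within stored range)",
    3: "not received (msg_id too high)",
    4: "received",
}
FLAGS = {
    8: "already acknowledged",
    16: "acknowledgment not required",
    32: "RPC query processing or processed",
    64: "response already generated",
    128: "other party knows it was received",
}


def decode_status(byte: int):
    """Split one status byte into its enum part and its list of flags."""
    state = STATES.get(byte & 0x07, "undefined")
    flags = [name for bit, name in FLAGS.items() if byte & bit]
    return state, flags


def decode_info(info: bytes):
    """Decode the info string: one status byte per requested msg_id."""
    return [decode_status(b) for b in info]


example = decode_info(bytes([1, 4 + 8 + 128]))
```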

Let's even keep silent about the fact that in msgs_state_info the ears of the unfinished TL stick out again (what was needed is a vector of bytes, with an enum in the lower two bits and flags in the higher bits). The point is different. Does anyone understand why all of this is needed in practice, in a real client? Yet here the requests are described as going in both directions.

It follows that each party must not only encrypt and send messages, but also store data about them and about the responses to them, for an unknown amount of time. The documentation describes neither the timings nor the practical applicability of these features. At all. What is most amazing is that they are actually used in the code of the official clients! Apparently they were told something that did not make it into the public documentation. Understanding from the code why is no longer as simple as in the case of TL: this is not a (relatively) logically isolated part, but a piece tied to the application architecture, i.e. it will take significantly more time to dig into the application code.

Pings and timings. Queues.

From all of this, if we recall the guesses about the server architecture (distribution of requests across backends), a rather sad thing follows: despite all the delivery guarantees in TCP (either the data is delivered, or you are informed about the break, but whatever was sent before the problem occurred will be delivered), and despite the acknowledgments in MTProto itself, there are no guarantees. The server can easily lose or throw away your message, and nothing can be done about it except building crutches of various kinds.

And first of all, message queues. Well, one thing was obvious from the very beginning: an unacknowledged message must be stored and resent. After what time? Who knows. Perhaps those junkie service messages somehow solve this problem with crutches; in Telegram Desktop, say, there are about 4 queues corresponding to them (maybe more; as already mentioned, for this you need to dig into its code and architecture more seriously; at the same time, we know that it cannot be taken as a reference, since a certain number of types from the MTProto schema are not used in it).

Why does this happen? Probably the server programmers were unable to ensure reliability within the cluster, or at least buffering on the front balancer, and shifted this problem onto the client. Out of desperation, Vasily tried to implement an alternative option with only two queues, using algorithms from TCP: measuring the RTT to the server and adjusting the size of the "window" (in messages) depending on the number of unacknowledged requests. That is, a rough heuristic for estimating the server's load: how many of our requests it can chew at the same time without losing any.
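
The text does not show Vasily's implementation, so the following is only a sketch of the general idea under my own assumptions: a smoothed RTT estimate in the style of TCP's retransmission timer (gains 1/8 and 1/4, timeout = SRTT + 4 * RTTVAR) and an additive-increase, multiplicative-decrease window counted in messages. All class names and default values are hypothetical.

```python
class RttEstimator:
    """Smoothed round-trip time and a resend timeout derived from it."""

    def __init__(self) -> None:
        self.srtt = None
        self.rttvar = None

    def sample(self, rtt: float) -> float:
        """Feed one RTT measurement (seconds), return the resend timeout."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt
        return self.srtt + 4 * self.rttvar


class SendWindow:
    """How many unacknowledged requests we allow ourselves in flight."""

    def __init__(self, initial: int = 4, maximum: int = 64) -> None:
        self.size = initial
        self.maximum = maximum
        self.unacked = 0

    def can_send(self) -> bool:
        return self.unacked < self.size

    def on_send(self) -> None:
        self.unacked += 1

    def on_ack(self) -> None:
        self.unacked = max(0, self.unacked - 1)
        self.size = min(self.maximum, self.size + 1)   # grow slowly

    def on_timeout(self) -> None:
        self.size = max(1, self.size // 2)             # back off sharply


estimator = RttEstimator()
first_timeout = estimator.sample(0.1)
second_timeout = estimator.sample(0.2)
```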

Well, you do understand, right? If you have to implement TCP again on top of a protocol that runs over TCP, this indicates a very poorly designed protocol.

Oh yes, why do you need more than one queue, and what does this mean for a person working with the high-level API anyway? Look, you make a request and serialize it, but often you cannot send it immediately. Why? Because the response will be matched by msg_id, which is a timestamp, and it is better to postpone assigning it until as late as possible, in case the server rejects it because of a time mismatch between us and it (of course, we can make a crutch that shifts our time from the present to the server's by adding a delta calculated from the server's responses; the official clients do this, but it is crude and inaccurate because of buffering). Therefore, when you make a request with a local function call from the library, the message goes through the following stages:

  1. It lies in one queue and awaits encryption.
  2. A msg_id is assigned and the message moves to another queue, for possible resending; it is sent to the socket.
  3. a) The server responded with MsgsAck: the message was delivered, we remove it from the "other queue".
    b) Or, on the contrary, it did not like something and answered with badmsg: we resend from the "other queue".
    c) Nothing is known, the message has to be resent from the other queue, but it is not known exactly when.
  4. The server finally responded with RpcResult: the actual response (or error), meaning the message was not just delivered but also processed.
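
The stages above can be sketched as a small bookkeeping class. This is an illustration of the life cycle only, with hypothetical names and no timers or networking. One deviation is deliberate: after MsgsAck the request is kept in a separate map rather than deleted, because the later RpcResult still has to be matched to it by req_msg_id.

```python
from collections import deque


class Outbox:
    def __init__(self) -> None:
        self.waiting = deque()   # stage 1: serialized, no msg_id yet
        self.in_flight = {}      # stage 2: msg_id -> body, may need a resend
        self.delivered = {}      # stage 3a: acknowledged, result still pending

    def submit(self, body: bytes) -> None:
        self.waiting.append(body)

    def send_next(self, msg_id: int) -> bytes:
        """Assign a msg_id as late as possible and hand the body to the socket."""
        body = self.waiting.popleft()
        self.in_flight[msg_id] = body
        return body

    def on_msgs_ack(self, msg_id: int) -> None:
        if msg_id in self.in_flight:
            self.delivered[msg_id] = self.in_flight.pop(msg_id)

    def on_bad_msg(self, msg_id: int) -> None:
        """Stage 3b: put the body back so it gets a fresh msg_id."""
        body = self.in_flight.pop(msg_id, None)
        if body is not None:
            self.waiting.appendleft(body)

    def resend_candidates(self) -> list:
        """Stage 3c: everything sent but not yet acknowledged."""
        return list(self.in_flight)

    def on_rpc_result(self, req_msg_id: int):
        """Stage 4: the result also counts as an acknowledgment."""
        body = self.in_flight.pop(req_msg_id, None)
        if body is None:
            body = self.delivered.pop(req_msg_id, None)
        return body
```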

Perhaps the use of containers could partially solve the problem. That is when a bunch of messages is packed into one, and the server acknowledges all of them at once, with one msg_id. But if something goes wrong, it will also reject this pack as a whole.

And at this point non-technical considerations come into play. From experience, we have seen many crutches, and on top of that we are about to see more examples of bad advice and bad architecture; under such conditions, is it worth trusting them and making such decisions? The question is rhetorical (of course not).

What are we talking about? If on the topic of "junkie messages about messages" you can still speculate with objections like "you are stupid, you did not understand our brilliant plan!" (then write the documentation first, as normal people do, with rationale and examples of packet exchange, and then we will talk), then timings and timeouts are a purely practical and specific question, where everything has long been known. What does the documentation tell us about timeouts?

A server usually acknowledges the receipt of a message from a client (normally, an RPC query) using an RPC response. If a response is a long time coming, a server may first send a receipt acknowledgment, and somewhat later, the RPC response itself.

A client normally acknowledges the receipt of a message from a server (usually, an RPC response) by adding an acknowledgment to the next RPC query if it is not transmitted too late (if it is generated, say, 60-120 seconds following the receipt of a message from the server). However, if for a long period of time there is no reason to send messages to the server or if there is a large number of unacknowledged messages from the server (say, over 16), the client transmits a stand-alone acknowledgment.
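
Taken literally, the client-side rule quoted above amounts to the following policy. The thresholds (16 messages, 60 seconds) are the "say" values from the quote; the class and method names are hypothetical, and times are plain numbers in seconds so that the logic stays independent of any clock.

```python
class AckBatcher:
    """Collects msg_ids to acknowledge, piggybacked or stand-alone."""

    def __init__(self, max_pending: int = 16, max_delay: float = 60.0) -> None:
        self.max_pending = max_pending
        self.max_delay = max_delay
        self.pending = []
        self.oldest = None

    def on_message(self, msg_id: int, now: float) -> None:
        if not self.pending:
            self.oldest = now
        self.pending.append(msg_id)

    def take(self) -> list:
        """Return the ids for a msgs_ack (e.g. to attach to the next query)."""
        ids, self.pending, self.oldest = self.pending, [], None
        return ids

    def needs_standalone_ack(self, now: float) -> bool:
        if not self.pending:
            return False
        too_many = len(self.pending) > self.max_pending
        too_old = now - self.oldest >= self.max_delay
        return too_many or too_old
```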

... Let me translate: we ourselves do not know how much is needed and how, so let's just assume it will be something like this.

And about pings:

Ping Messages (PING/PONG)

ping#7abe77ec ping_id:long = Pong;

A response is usually returned to the same connection:

pong#347773c5 msg_id:long ping_id:long = Pong;

These messages do not require acknowledgments. A pong is transmitted only in response to a ping while a ping can be initiated by either side.

Deferred Connection Closure + PING

ping_delay_disconnect#f3427b8c ping_id:long disconnect_delay:int = Pong;

Yana aiki kamar ping. Bugu da kari, bayan an karɓi wannan, uwar garken yana fara mai ƙidayar lokaci wanda zai rufe haɗin haɗin da ke yanzu disconnect_delay bayan daƙiƙa sai dai idan ya karɓi sabon saƙo iri ɗaya wanda ke sake saita duk masu ƙidayar baya. Idan abokin ciniki ya aika waɗannan pings sau ɗaya kowane sakan 60, misali, yana iya saita disconnect_delay daidai da daƙiƙa 75.

Kuna hauka?! A cikin dakika 60, jirgin zai shiga tashar, ya sauka kuma ya ɗauki fasinjoji, kuma zai sake rasa hulɗa a cikin rami. A cikin dakika 120, yayin da kuka ji shi, zai zo a wani, kuma mai yiwuwa haɗin zai karye. Da kyau, ya bayyana a fili inda ƙafafu suke fitowa - "Na ji kararrawa, amma ban san inda yake ba", akwai algorithm Nagl da zaɓi na TCP_NODELAY, wanda aka yi niyya don aikin hulɗa. Amma, gafarce ni, riƙe ƙimar sa ta asali - 200 Milliseconds Idan da gaske kuna son siffanta wani abu makamancin haka kuma ku adana akan fakiti biyu masu yuwuwa, sannan a kashe shi na daƙiƙa 5, ko duk abin da “Mai amfani ke bugawa…” lokacin saƙo ya ƙare yanzu. Amma babu ƙari.

Kuma a ƙarshe, pings. Wato, duba rayuwar haɗin TCP. Abin ban dariya ne, amma kimanin shekaru 10 da suka gabata na rubuta wani rubutu mai mahimmanci game da manzo na dakin kwanan dalibai - marubutan can kuma sun yi amfani da sabar daga abokin ciniki, kuma ba akasin haka ba. Amma dalibai na shekara 3 abu daya ne, kuma ofishin kasa da kasa wani abu ne, daidai?..

Na farko, ɗan shirin ilimi. Haɗin TCP, in babu musayar fakiti, na iya rayuwa har tsawon makonni. Wannan abu ne mai kyau da mara kyau, ya danganta da manufar. Yana da kyau idan kuna da haɗin SSH da aka buɗe zuwa uwar garken, kun tashi daga kwamfutar, sake kunna na'ura mai ba da hanya tsakanin hanyoyin sadarwa, koma wurin ku - zaman ta wannan uwar garken bai tsage ba (ba ku buga komai ba, babu fakiti) , ya dace. Yana da kyau idan akwai dubban abokan ciniki akan sabar, kowannensu yana ɗaukar albarkatu (sannu, Postgres!), Kuma mai yiwuwa mai watsa shiri na abokin ciniki ya sake yin aiki da dadewa - amma ba za mu sani ba.

Tsarin Taɗi/IM ya faɗi cikin shari'a ta biyu don ƙarin dalili guda ɗaya - statuses na kan layi. Idan mai amfani ya “fadi”, kuna buƙatar sanar da masu shiga tsakani game da wannan. In ba haka ba, za ku ƙare tare da kuskuren da masu kirkiro Jabber suka yi (kuma sun gyara shekaru 20) - mai amfani ya katse, amma sun ci gaba da rubuta masa sakonni, suna imani cewa yana kan layi (wanda kuma ya ɓace gaba daya a cikin waɗannan. 'yan mintoci kaɗan kafin a gano haɗin). A'a, zaɓin TCP_KEEPALIVE, wanda mutane da yawa waɗanda ba su fahimci yadda masu ƙidayar TCP ke aiki ba suna jefawa ba tare da izini ba (ta saita dabi'un daji kamar dubun seconds), ba zai taimaka a nan ba - kuna buƙatar tabbatar da cewa ba kawai kwaya ta OS ba. na na'urar mai amfani yana da rai, amma kuma yana aiki akai-akai, yana iya ba da amsa, da kuma aikace-aikacen kanta (kuna tsammanin ba zai iya daskare ba? Telegram Desktop akan Ubuntu 18.04 ya daskare ni fiye da sau ɗaya).

That is why the server must ping the client, and not the other way round - if the client does the pinging, then when the connection breaks the ping is simply not delivered, and the goal is not achieved.
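A minimal sketch of what server-driven liveness checking could look like, in Python. The class name, method names and the two timing parameters are hypothetical; the point is only that the server owns both timers and decides when a silent client is dropped and its online status cleared.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientLiveness:
    """Server-side view of one client connection (hypothetical helper)."""
    ping_interval: float            # silence allowed before the server pings
    pong_timeout: float             # how long the server waits for the reply
    last_seen: float = 0.0          # time of the last traffic from the client
    ping_sent_at: Optional[float] = None

    def next_action(self, now: float) -> str:
        """Return 'ping', 'drop' or 'wait' for the given moment."""
        if self.ping_sent_at is not None:
            # A ping is outstanding: either keep waiting or give up.
            if now - self.ping_sent_at >= self.pong_timeout:
                return "drop"   # mark the user offline, free resources
            return "wait"
        if now - self.last_seen >= self.ping_interval:
            return "ping"
        return "wait"

    def on_ping_sent(self, now: float) -> None:
        self.ping_sent_at = now

    def on_pong(self, now: float) -> None:
        # Any reply produced by the application proves it is alive.
        self.last_seen = now
        self.ping_sent_at = None
```

Because both values live on the server, it can shorten or lengthen them depending on its load, which a client-chosen timeout cannot account for.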

And what do we see in Telegram? Exactly the opposite! Well, formally, of course, both sides can ping each other. In practice, clients use the ping_delay_disconnect crutch, which arms a timer on the server. Well, excuse me, it is not up to the client to decide how long it wants to live there without a ping. The server, based on its load, knows better. But, of course, if you do not care about resources, then you are your own evil Pinocchio, and a crutch will do...

How should it have been designed?

I believe the facts above clearly show that the Telegram/VKontakte team is not very competent at the transport (and lower) level of computer networks and has low qualifications in the relevant matters.

Why did it turn out so complicated, and how might Telegram's architects try to object? By pointing out that they tried to make a session that survives breaks of the TCP connection, i.e. what was not delivered now will be delivered later. They probably also tried to build a UDP transport, but ran into difficulties and abandoned it (which is why the documentation is empty - there was nothing to boast about). But because of a lack of understanding of how networks in general and TCP in particular work, where you can rely on it and where you need to do things yourself (and how), and an attempt to combine all this with cryptography, "killing two birds with one stone", this is the result.

How should it have been done? Given that msg_id is a timestamp needed from a cryptographic point of view to prevent replay attacks, it is a mistake to attach the function of a unique identifier to it. Therefore, without fundamentally changing the current architecture (how the Updates stream is formed is a high-level API topic for another part of this series of posts), one would need the following:

  1. The server holding the TCP connection to the client takes responsibility - if it has read something from the socket, it must acknowledge it, process it or return an error, with no losses. The acknowledgment is then not a vector of ids but simply "the last received seq_no" - just a number, as in TCP (two numbers - your own seq and the acknowledged one). We are always inside a session, aren't we?
  2. The timestamp that prevents replay attacks becomes a separate field, à la nonce. It is checked, but affects nothing else. A uint32 is enough - if our salt changes at least every half day, we can give 16 bits to the low-order bits of the integer part of the current time, and the rest to the fractional part of a second (as now).
  3. msg_id is removed entirely - to distinguish requests on the backends there is, first, the client id and, second, the session id; concatenate them. Accordingly, seq_no alone is enough as a request identifier.
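The three points above can be sketched as a fixed header; this is a minimal illustration in Python under stated assumptions, not a proposed wire format. The 32-bit widths of seq_no and of the acknowledged number, the little-endian layout and all function names are assumptions; only the split of the timestamp into 16 bits of seconds and 16 bits of fraction comes from the text. Sixteen bits of seconds wrap every 65536 s, about 18.2 hours, which is longer than the half-day salt lifetime, so the value stays unambiguous within one salt.

```python
import struct

# seq_no, last acknowledged seq_no, replay-protection timestamp.
# Field widths and byte order are assumptions made for this sketch.
HEADER = struct.Struct("<III")


def make_timestamp(now: float) -> int:
    """Pack the low 16 bits of the seconds and a 16-bit fraction into a uint32."""
    seconds = int(now)
    fraction = int((now - seconds) * 65536) & 0xFFFF
    return ((seconds & 0xFFFF) << 16) | fraction


def pack_header(seq_no: int, ack_seq_no: int, now: float) -> bytes:
    return HEADER.pack(seq_no, ack_seq_no, make_timestamp(now))


def unpack_header(data: bytes):
    """Return (seq_no, ack_seq_no, low 16 bits of seconds, 16-bit fraction)."""
    seq_no, ack_seq_no, stamp = HEADER.unpack_from(data)
    return seq_no, ack_seq_no, stamp >> 16, stamp & 0xFFFF
```

The receiver checks the timestamp against its own clock and otherwise ignores it; ordering and acknowledgment rely on the two sequence numbers alone.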

This is not the best option either; a fully random value could serve as the identifier - this is already done in the high-level API when sending a message, by the way. It would be better to rework the architecture entirely from relative to absolute, but that is a topic for another part, not this post.

API?

Ta-daam! So, having fought our way along a path full of pain and crutches, we are finally able to send any request to the server and receive any answer to it, as well as receive updates from the server (not in response to a request; the server sends them to us on its own, like PUSH, if that makes it clearer to anyone).

Attention, here comes the only Perl example in the article! (For those unfamiliar with the syntax: the first argument of bless is the object's data structure, the second is its class.)

2019.10.24 12:00:51 $1 = {
'cb' => 'TeleUpd::__ANON__',
'out' => bless( {
'filter' => bless( {}, 'Telegram::ChannelMessagesFilterEmpty' ),
'channel' => bless( {
'access_hash' => '-6698103710539760874',
'channel_id' => '1380524958'
}, 'Telegram::InputPeerChannel' ),
'pts' => '158503',
'flags' => 0,
'limit' => 0
}, 'Telegram::Updates::GetChannelDifference' ),
'req_id' => '6751291954012037292'
};
2019.10.24 12:00:51 $1 = {
'in' => bless( {
'req_msg_id' => '6751291954012037292',
'result' => bless( {
'pts' => 158508,
'flags' => 3,
'final' => 1,
'new_messages' => [],
'users' => [],
'chats' => [
bless( {
'title' => 'Хулиномика',
'username' => 'hoolinomics',
'flags' => 8288,
'id' => 1380524958,
'access_hash' => '-6698103710539760874',
'broadcast' => 1,
'version' => 0,
'photo' => bless( {
'photo_small' => bless( {
'volume_id' => 246933270,
'file_reference' => '
'secret' => '1854156056801727328',
'local_id' => 228648,
'dc_id' => 2
}, 'Telegram::FileLocation' ),
'photo_big' => bless( {
'dc_id' => 2,
'local_id' => 228650,
'file_reference' => '
'secret' => '1275570353387113110',
'volume_id' => 246933270
}, 'Telegram::FileLocation' )
}, 'Telegram::ChatPhoto' ),
'date' => 1531221081
}, 'Telegram::Channel' )
],
'timeout' => 300,
'other_updates' => [
bless( {
'pts_count' => 0,
'message' => bless( {
'post' => 1,
'id' => 852,
'flags' => 50368,
'views' => 8013,
'entities' => [
bless( {
'length' => 20,
'offset' => 0
}, 'Telegram::MessageEntityBold' ),
bless( {
'length' => 18,
'offset' => 480,
'url' => 'https://alexeymarkov.livejournal.com/[url_вырезан].html'
}, 'Telegram::MessageEntityTextUrl' )
],
'reply_markup' => bless( {
'rows' => [
bless( {
'buttons' => [
bless( {
'text' => '???? 165',
'data' => 'send_reaction_0'
}, 'Telegram::KeyboardButtonCallback' ),
bless( {
'data' => 'send_reaction_1',
'text' => '???? 9'
}, 'Telegram::KeyboardButtonCallback' )
]
}, 'Telegram::KeyboardButtonRow' )
]
}, 'Telegram::ReplyInlineMarkup' ),
'message' => 'А вот и новая книга! 
// [message text cut so as not to violate Habr's rules on advertising]
напечатаю.',
'to_id' => bless( {
'channel_id' => 1380524958
}, 'Telegram::PeerChannel' ),
'date' => 1571724559,
'edit_date' => 1571907562
}, 'Telegram::Message' ),
'pts' => 158508
}, 'Telegram::UpdateEditChannelMessage' ),
bless( {
'pts' => 158508,
'message' => bless( {
'edit_date' => 1571907589,
'to_id' => bless( {
'channel_id' => 1380524958
}, 'Telegram::PeerChannel' ),
'date' => 1571807301,
'message' => 'Почему Вы считаете Facebook плохой компанией? Можете прокомментировать? По-моему, это шикарная компания. Без долгов, с хорошей прибылью, а если решат дивы платить, то и еще могут нехило подорожать.
Для меня ответ совершенно очевиден: потому что Facebook делает ужасный по качеству продукт. Да, у него монопольное положение и да, им пользуется огромное количество людей. Но мир не стоит на месте. Когда-то владельцам Нокии было смешно от первого Айфона. Они думали, что лучше Нокии ничего быть не может и она навсегда останется самым удобным, красивым и твёрдым телефоном - и доля рынка это красноречиво демонстрировала. Теперь им не смешно.
Конечно, рептилоиды сопротивляются напору молодых гениев: так Цукербергом был пожран Whatsapp, потом Instagram. Но всё им не пожрать, Паша Дуров не продаётся!
Так будет и с Фейсбуком. Нельзя всё время делать говно. Кто-то когда-то сделает хороший продукт, куда всё и уйдут.
#соцсети #facebook #акции #рептилоиды',
'reply_markup' => bless( {
'rows' => [
bless( {
'buttons' => [
bless( {
'data' => 'send_reaction_0',
'text' => '???? 452'
}, 'Telegram::KeyboardButtonCallback' ),
bless( {
'text' => '???? 21',
'data' => 'send_reaction_1'
}, 'Telegram::KeyboardButtonCallback' )
]
}, 'Telegram::KeyboardButtonRow' )
]
}, 'Telegram::ReplyInlineMarkup' ),
'entities' => [
bless( {
'length' => 199,
'offset' => 0
}, 'Telegram::MessageEntityBold' ),
bless( {
'length' => 8,
'offset' => 919
}, 'Telegram::MessageEntityHashtag' ),
bless( {
'offset' => 928,
'length' => 9
}, 'Telegram::MessageEntityHashtag' ),
bless( {
'length' => 6,
'offset' => 938
}, 'Telegram::MessageEntityHashtag' ),
bless( {
'length' => 11,
'offset' => 945
}, 'Telegram::MessageEntityHashtag' )
],
'views' => 6964,
'flags' => 50368,
'id' => 854,
'post' => 1
}, 'Telegram::Message' ),
'pts_count' => 0
}, 'Telegram::UpdateEditChannelMessage' ),
bless( {
'message' => bless( {
'reply_markup' => bless( {
'rows' => [
bless( {
'buttons' => [
bless( {
'data' => 'send_reaction_0',
'text' => '???? 213'
}, 'Telegram::KeyboardButtonCallback' ),
bless( {
'data' => 'send_reaction_1',
'text' => '???? 8'
}, 'Telegram::KeyboardButtonCallback' )
]
}, 'Telegram::KeyboardButtonRow' )
]
}, 'Telegram::ReplyInlineMarkup' ),
'views' => 2940,
'entities' => [
bless( {
'length' => 609,
'offset' => 348
}, 'Telegram::MessageEntityItalic' )
],
'flags' => 50368,
'post' => 1,
'id' => 857,
'edit_date' => 1571907636,
'date' => 1571902479,
'to_id' => bless( {
'channel_id' => 1380524958
}, 'Telegram::PeerChannel' ),
'message' => 'Пост про 1С вызвал бурную полемику. Человек 10 (видимо, 1с-программистов) единодушно написали:
// [message text cut so as not to violate Habr's rules on advertising]
Я бы добавил, что блестящая у 1С дистрибуция, а маркетинг... ну, такое.'
}, 'Telegram::Message' ),
'pts_count' => 0,
'pts' => 158508
}, 'Telegram::UpdateEditChannelMessage' ),
bless( {
'pts' => 158508,
'pts_count' => 0,
'message' => bless( {
'message' => 'Здравствуйте, расскажите, пожалуйста, чем вредит экономике 1С?
// [message text cut so as not to violate Habr's rules on advertising]
#софт #it #экономика',
'edit_date' => 1571907650,
'date' => 1571893707,
'to_id' => bless( {
'channel_id' => 1380524958
}, 'Telegram::PeerChannel' ),
'flags' => 50368,
'post' => 1,
'id' => 856,
'reply_markup' => bless( {
'rows' => [
bless( {
'buttons' => [
bless( {
'data' => 'send_reaction_0',
'text' => '???? 360'
}, 'Telegram::KeyboardButtonCallback' ),
bless( {
'data' => 'send_reaction_1',
'text' => '???? 32'
}, 'Telegram::KeyboardButtonCallback' )
]
}, 'Telegram::KeyboardButtonRow' )
]
}, 'Telegram::ReplyInlineMarkup' ),
'views' => 4416,
'entities' => [
bless( {
'offset' => 0,
'length' => 64
}, 'Telegram::MessageEntityBold' ),
bless( {
'offset' => 1551,
'length' => 5
}, 'Telegram::MessageEntityHashtag' ),
bless( {
'length' => 3,
'offset' => 1557
}, 'Telegram::MessageEntityHashtag' ),
bless( {
'offset' => 1561,
'length' => 10
}, 'Telegram::MessageEntityHashtag' )
]
}, 'Telegram::Message' )
}, 'Telegram::UpdateEditChannelMessage' )
]
}, 'Telegram::Updates::ChannelDifference' )
}, 'MTProto::RpcResult' )
};
2019.10.24 12:00:51 $1 = {
'in' => bless( {
'update' => bless( {
'user_id' => 2507460,
'status' => bless( {
'was_online' => 1571907651
}, 'Telegram::UserStatusOffline' )
}, 'Telegram::UpdateUserStatus' ),
'date' => 1571907650
}, 'Telegram::UpdateShort' )
};
2019.10.24 12:05:46 $1 = {
'in' => bless( {
'chats' => [],
'date' => 1571907946,
'seq' => 0,
'updates' => [
bless( {
'max_id' => 141719,
'channel_id' => 1295963795
}, 'Telegram::UpdateReadChannelInbox' )
],
'users' => []
}, 'Telegram::Updates' )
};
2019.10.24 13:01:23 $1 = {
'in' => bless( {
'server_salt' => '4914425622822907323',
'unique_id' => '5297282355827493819',
'first_msg_id' => '6751307555044380692'
}, 'MTProto::NewSessionCreated' )
};
2019.10.24 13:24:21 $1 = {
'in' => bless( {
'chats' => [
bless( {
'username' => 'freebsd_ru',
'version' => 0,
'flags' => 5440,
'title' => 'freebsd_ru',
'min' => 1,
'photo' => bless( {
'photo_small' => bless( {
'local_id' => 328733,
'volume_id' => 235140688,
'dc_id' => 2,
'file_reference' => '
'secret' => '4426006807282303416'
}, 'Telegram::FileLocation' ),
'photo_big' => bless( {
'dc_id' => 2,
'file_reference' => '
'volume_id' => 235140688,
'local_id' => 328735,
'secret' => '71251192991540083'
}, 'Telegram::FileLocation' )
}, 'Telegram::ChatPhoto' ),
'date' => 1461248502,
'id' => 1038300508,
'democracy' => 1,
'megagroup' => 1
}, 'Telegram::Channel' )
],
'users' => [
bless( {
'last_name' => 'Panov',
'flags' => 1048646,
'min' => 1,
'id' => 82234609,
'status' => bless( {}, 'Telegram::UserStatusRecently' ),
'first_name' => 'Dima'
}, 'Telegram::User' )
],
'seq' => 0,
'date' => 1571912647,
'updates' => [
bless( {
'pts' => 137596,
'message' => bless( {
'flags' => 256,
'message' => 'Создать джейл с именем покороче ??',
'to_id' => bless( {
'channel_id' => 1038300508
}, 'Telegram::PeerChannel' ),
'id' => 119634,
'date' => 1571912647,
'from_id' => 82234609
}, 'Telegram::Message' ),
'pts_count' => 1
}, 'Telegram::UpdateNewChannelMessage' )
]
}, 'Telegram::Updates' )
};

Yes, it is deliberately not hidden under a spoiler - if you have not read it yet, go and do so!

Oh, wai~~... what does this look like? Something very familiar... could this be the data structure of a typical Web API in JSON, except that classes are also attached to the objects?..

So that is how it turns out... Comrades, what is this?.. All that effort, and we have only reached the point where Web programmers start?.. Wouldn't plain JSON over HTTPS have been simpler?! What did we get in exchange? Was the effort worth it?

Let us evaluate what TL+MTProto gave us and what alternatives are possible. Well, HTTP, which is built around the request-response model, is a poor fit, but what about at least something on top of TLS?

Compact serialization. Looking at this data structure, so similar to JSON, I recall that binary variants of JSON exist. Let us set MsgPack aside as insufficiently extensible, but there is, for example, CBOR - incidentally, a standard described in RFC 7049. It is notable for defining tags as an extension mechanism, and among those already standardized there are:

  • 25 + 256 - replacing repeated strings with a reference to the string's number, a cheap compression method
  • 26 - a serialized Perl object with class name and constructor arguments
  • 27 - a serialized language-independent object with type name and constructor arguments
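To make tag 27 concrete, here is a minimal sketch of a hand-written CBOR encoder in Python that wraps one of the objects from the dump above as a tagged array of type name plus constructor arguments. The Tagged class and the encode function are hypothetical helpers written for this illustration; only a handful of CBOR types are covered, and the string-reference tags 25 and 256 are not implemented.

```python
class Tagged:
    """A CBOR tag number together with the value it wraps (hypothetical helper)."""
    def __init__(self, tag, value):
        self.tag = tag
        self.value = value


def _head(major, value):
    """Encode the initial byte of a CBOR item plus its unsigned argument."""
    if value < 24:
        return bytes([(major << 5) | value])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([(major << 5) | extra]) + value.to_bytes(size, "big")
    raise ValueError("integer does not fit into a CBOR head")


def encode(obj):
    """Encode ints, text strings, lists, dicts and Tagged values as CBOR."""
    if isinstance(obj, Tagged):
        return _head(6, obj.tag) + encode(obj.value)
    if isinstance(obj, bool):
        raise TypeError("booleans are left out of this sketch")
    if isinstance(obj, int):
        return _head(0, obj) if obj >= 0 else _head(1, -1 - obj)
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(obj, (list, tuple)):
        return _head(4, len(obj)) + b"".join(encode(item) for item in obj)
    if isinstance(obj, dict):
        body = b"".join(encode(k) + encode(v) for k, v in obj.items())
        return _head(5, len(obj)) + body
    raise TypeError("unsupported type: %r" % type(obj))


# Tag 27 wraps an array: [type name, constructor arguments...]
peer = Tagged(27, ["Telegram::PeerChannel", {"channel_id": 1380524958}])
encoded_peer = encode(peer)
```

The class name travels as an ordinary string here; with string references enabled, repeated names such as Telegram::Message would shrink to short back-references, which is where most of the size saving comes from.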

Well, I tried serializing the same data in TL and in CBOR with string and object packing enabled. The result began to differ in CBOR's favor at around a megabyte:

cborlen=1039673 tl_len=1095092

Hence the conclusion: there are substantially simpler formats that do not suffer from the problems of synchronization failure or an unknown identifier, with comparable efficiency.

Fast connection establishment. This means zero RTT after a reconnect (when the key has already been generated once) - it applies from the very first MTProto message, but with some caveats: you have to hit the same salt, the session must not have gone stale, and so on. What does TLS offer us instead? A quote on the subject:

When PFS is used in TLS, TLS session tickets (RFC 5077) can be used to resume an encrypted session without renegotiating keys and without storing key information on the server. When the first connection is opened and the keys are created, the server encrypts the connection state and transmits it to the client (in the form of a session ticket). Accordingly, when the connection is resumed, the client sends the session ticket, which contains the session key among other things, back to the server. The ticket itself is encrypted with a temporary key (the session ticket key), which is stored on the server and must be distributed among all frontend servers that handle SSL in clustered solutions.[10] Thus, introducing session tickets may violate PFS if the temporary server keys are compromised, for example when they are stored for a long time (OpenSSL, nginx and Apache by default keep them for the entire lifetime of the process; popular sites use a key for several hours, up to days).

Here the RTT is not zero: you need to exchange at least ClientHello and ServerHello, after which the client can send data together with Finished. But we should remember that what we have is not the Web, with its heaps of freshly opened connections, but a messenger, whose connection is usually a single, more or less long-lived one, with relatively short requests compared to Web pages - everything is multiplexed inside it. That is, this is quite acceptable as long as we have not hit a really bad stretch of subway.

Did I forget anything else? Write in the comments.

To be continued!

In the second part of this series of posts we will look not at technical but at organizational issues - approaches, ideology, interface, attitude toward users, and so on. It will, however, build on the technical information presented here.

The third part will continue the analysis of the technical side and the development experience. In particular, you will learn:

  • how the pandemonium with the variety of TL types continues
  • little-known things about channels and supergroups
  • why dialogs are worse than a roster
  • about absolute vs relative message addressing
  • what the difference is between photo and image
  • how emoji interfere with italic text

and other crutches! Stay tuned!

source: www.habr.com
