How we moderate ads

Every service whose users can create their own content (UGC, user-generated content) has to do more than solve business problems: it also has to keep that UGC in order. Poor or low-quality content moderation can eventually make the service less attractive to users, or even bring it to an end.

Today we will tell you about the synergy between Yula and Odnoklassniki that helps us moderate ads on Yula effectively.

Synergy is a very useful thing in general, and in today's world, where technologies and trends change quickly, it can become a lifesaver. Why spend scarce resources and time inventing something that has already been invented and polished before you?

We thought the same when we faced the full-scale task of moderating user content: images, text, and links. Our users upload millions of pieces of content to Yula every day, and without automatic processing it is simply impossible to moderate all this data by hand.

So we took a ready-made moderation platform, which by that time our colleagues at Odnoklassniki had polished to a state of "almost perfect."

Why Odnoklassniki?

Every day, tens of millions of users come to the social network and publish billions of pieces of content, from photos to videos and texts. The Odnoklassniki moderation platform helps check very large volumes of data and counter spammers and bots.

The OK moderation team has accumulated a great deal of experience, having improved its tool for 12 years. Importantly, they can not only share their ready-made solutions but also customize the architecture of their platform to fit our specific tasks.

From here on, for brevity, we will simply call the OK moderation platform "the platform."

How it all works

Data exchange between Yula and Odnoklassniki is implemented through Apache Kafka.

Why we chose this tool:

  • On Yula, all ads are post-moderated, so a synchronous response was not needed at the outset.
  • If something goes badly wrong and Yula or Odnoklassniki becomes unavailable, including because of peak loads, the data in Kafka will not disappear and can be read later.
  • The platform was already integrated with Kafka, so most of the security issues had been solved.

For every ad that a user creates or edits on Yula, a JSON document with its data is generated and put into Kafka for subsequent moderation. From Kafka, the ads are loaded into the platform, where they are judged automatically or manually. Bad ads are blocked with a reason, and those in which the platform finds no violations are marked as "good." All decisions are then sent back to Yula and applied in the service.

In the end, for Yula it all comes down to simple actions: send an ad to the Odnoklassniki platform and get back a resolution of "ok", or the reason why it is not "ok".
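As a rough illustration of this exchange, the sketch below builds the kind of JSON message Yula might put into Kafka and parses a decision coming back. The field names (`ad_id`, `resolution`, `reason`, and so on) are assumptions made for the example, since the real schema is not described here, and the Kafka producer and consumer calls themselves are left out.

```python
import json
from typing import Optional, Tuple


def build_moderation_message(ad: dict) -> bytes:
    """Serialize an ad into a JSON payload (hypothetical schema)."""
    payload = {
        "ad_id": ad["id"],
        "title": ad["title"],
        "description": ad["description"],
        "photos": ad["photos"],
        "category": ad["category"],
        "price": ad["price"],
    }
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def parse_decision(raw: bytes) -> Tuple[str, bool, Optional[str]]:
    """Return (ad_id, is_ok, reason) from a decision message (hypothetical schema)."""
    decision = json.loads(raw.decode("utf-8"))
    is_ok = decision["resolution"] == "ok"
    # A reason is only expected when the ad was blocked.
    return decision["ad_id"], is_ok, decision.get("reason")
```

Because the exchange is asynchronous, the sender never waits for `parse_decision`: the decision simply arrives later as a separate message.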

Automatic processing

What happens to an ad after it reaches the platform? Each ad is split into several entities:

  • title,
  • description,
  • photos,
  • the category and subcategory chosen by the user,
  • price.

The platform then clusters each entity to find duplicates. Text and photos are clustered by different schemes.

Before clustering, texts are normalized to strip special characters, substituted letters, and other junk. The resulting data is split into N-grams, each of which is hashed. The result is a set of unique hashes. The similarity between two texts is the Jaccard index of their two hash sets. If the similarity exceeds a threshold, the texts are merged into one cluster. MinHash and locality-sensitive hashing are used to speed up the search for similar clusters.
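The text pipeline above can be sketched in a few lines. This is a minimal illustration, not the platform's code: the normalization rules, the character trigram size, the SHA-256 hash, and the 0.7 threshold are all assumptions, and the MinHash/LSH speed-up is omitted.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def ngram_hashes(text: str, n: int = 3) -> set:
    """Split normalized text into character n-grams and hash each one."""
    norm = normalize(text)
    grams = {norm[i:i + n] for i in range(max(len(norm) - n + 1, 1))}
    return {hashlib.sha256(g.encode("utf-8")).hexdigest() for g in grams}


def jaccard(a: set, b: set) -> float:
    """Jaccard index: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def same_cluster(t1: str, t2: str, threshold: float = 0.7) -> bool:
    """Texts go into one cluster when their similarity exceeds the threshold."""
    return jaccard(ngram_hashes(t1), ngram_hashes(t2)) > threshold
```

With these settings, "iPhone 11, new!!!" and "iphone 11 new" normalize to the same string and land in one cluster, which is exactly the kind of cosmetic variation spammers rely on.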

For photos, several clustering options have been devised, from comparing the pHash values of images to searching for duplicates with a neural network.

The latter method is the most "heavy-duty." To train the model, triplets of images (N, A, P) were selected in which N is not similar to A, and P is similar to A (a near-duplicate). The neural network then learned to make A and P as close as possible, and A and N as far apart as possible. This yields fewer false positives than simply taking embeddings from a pretrained network.
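A common way to express this training objective is a margin-based triplet loss. The sketch below assumes Euclidean distance and a margin of 0.2; the exact loss and distance used by the platform are not stated, so treat both as illustrative choices.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is at least `margin` farther from the anchor than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

The loss is zero for triplets the network already separates well, so training effort concentrates on the hard cases, such as the same product photographed from a different angle.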

When the neural network receives images as input, it produces an N-dimensional (N = 128) vector for each of them, and a query is made to estimate how close the images are. A threshold is then calculated below which close images are considered duplicates.
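A minimal sketch of such a proximity check follows. Cosine distance, the 0.1 threshold, and the brute-force scan over a dictionary are assumptions for illustration; a production system would use an approximate nearest-neighbor index, and the vectors would have 128 dimensions rather than the toy 2 used in the usage note.

```python
import math


def cosine_distance(u, v):
    """1 minus cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def find_duplicates(query, index, threshold=0.1):
    """Return ids of indexed embeddings within `threshold` of the query."""
    return [img_id for img_id, emb in index.items()
            if cosine_distance(query, emb) <= threshold]
```

For example, with `index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.99, 0.05]}`, querying `[1.0, 0.0]` matches "a" and "c" but not the orthogonal "b".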

The model is good at finding spammers who deliberately photograph the same product from different angles to get around the pHash comparison.

An example of spam photos grouped together by the neural network as duplicates.

At the final stage, duplicate ads are searched for by text and image simultaneously.

If two or more ads end up together in a cluster, the system starts automatic blocking, which uses certain algorithms to choose which duplicates to remove and which to keep. For example, if two users have identical photos in their ads, the system blocks the more recent ad.
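The "block the more recent ad" example can be sketched as follows. The `id` and `created_at` fields are hypothetical, and keeping only the earliest ad is just one possible policy; the algorithms the platform actually applies are not described in detail.

```python
def ads_to_block(cluster):
    """Given the ads in one duplicate cluster, keep the earliest and block the rest."""
    if len(cluster) < 2:
        return []  # a single ad is not a duplicate of anything
    ordered = sorted(cluster, key=lambda ad: ad["created_at"])
    return [ad["id"] for ad in ordered[1:]]
```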

Once created, all clusters pass through a series of automatic filters. Each filter assigns the cluster a score: how likely it is to contain the threat that this filter detects.

For example, the system analyzes the description in an ad and selects potential categories for it. It then takes the most probable one and compares it with the category specified by the ad's author. If they do not match, the ad is blocked for a wrong category. And since we are kind and honest, we tell the user directly which category to choose so that the ad passes moderation.

Notification of blocking for a wrong category.

Machine learning feels right at home in our platform. For example, we use it to search titles and descriptions for goods prohibited in the Russian Federation. And neural network models meticulously "examine" images to see whether they contain URLs, spam text, phone numbers, and similar "prohibited" information.

For cases where someone tries to sell a prohibited product disguised as something legal, and there is no text in either the title or the description, we use image tagging. Each image can receive any of up to 11 thousand different tags describing what is in it.

An attempt to sell a hookah disguised as a samovar.

Alongside the complex filters, simple ones are also at work, solving obvious text-related problems:

  • a profanity filter;
  • a URL and phone number detector;
  • mentions of instant messengers and other contact details;
  • an understated price;
  • ads in which nothing is being sold, and so on.
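Filters of this simple kind are often just regular expressions and keyword lists. The sketch below covers three of the items above; the patterns and the messenger list are deliberately loose assumptions for illustration and would need tuning against real data.

```python
import re

URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
# Loose pattern: optional +, then 10 to 15 digits with spaces, dashes or parentheses between them.
PHONE_RE = re.compile(r"\+?\d(?:[\s\-()]*\d){9,14}")
MESSENGERS = ("whatsapp", "telegram", "viber")


def simple_filters(text):
    """Return the names of the simple filters that fire on the text."""
    fired = []
    if URL_RE.search(text):
        fired.append("url")
    if PHONE_RE.search(text):
        fired.append("phone")
    lowered = text.lower()
    if any(name in lowered for name in MESSENGERS):
        fired.append("messenger")
    return fired
```

For instance, "Call +7 (912) 345-67-89 or write on WhatsApp" triggers the phone and messenger filters, while "Selling a bike, 5000 rubles" triggers none.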

Today, every ad passes through a fine sieve of more than 50 automatic filters that try to find something bad in it.

If none of the detectors fire, a response is sent to Yula saying that the ad is "most likely" in perfect order. We use this response ourselves, and users who have subscribed to the seller receive a notification that a new product is available.

Notification that a seller has a new product.

As a result, each ad becomes "overgrown" with metadata. Part of it is generated when the ad is created (the author's IP address, user agent, platform, geolocation, and so on), and the rest is the score produced by each filter.

Ad queues

When an ad reaches the platform, the system places it into one of the queues. Each queue is built with a mathematical formula that combines the ad's metadata in a way that catches particular bad patterns.

For example, you can create a queue of ads in the "Mobile phones" category from Yula users who are supposedly in St. Petersburg but whose IP addresses are from Moscow or other cities.

An example of ads posted by a single user in different cities.
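The St. Petersburg example above could be expressed as a simple predicate over ad metadata. The field names (`category`, `declared_city`, `ip_city`) are hypothetical, and the sketch assumes the city has already been resolved from the IP address upstream; real queue formulas are presumably richer than a single boolean condition.

```python
def geo_mismatch_queue(ads, category="Mobile phones", declared_city="St. Petersburg"):
    """Select ads in a category whose declared city differs from the city resolved from the IP."""
    return [
        ad["id"] for ad in ads
        if ad["category"] == category
        and ad["declared_city"] == declared_city
        and ad["ip_city"] != declared_city
    ]
```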

Or you can build queues based on the scores that a neural network assigns to ads, sorting them in descending order.

Each queue, according to its formula, assigns a final score to the ad. From there you can proceed in different ways:

  • specify a threshold at which an ad receives a certain type of block;
  • send all ads in the queue to moderators for manual review;
  • or combine the previous options: specify a threshold for automatic blocking and send moderators the ads that did not reach it.

Why are these queues needed? Suppose a user uploaded a photo of a firearm. The neural network gives it a score between 95 and 100 and is 99 percent accurate in determining that the picture shows a weapon. But when the score falls below 95, the model's accuracy starts to drop (this is a characteristic of neural network models).

As a result, a queue is built on the model's score, and ads that scored between 95 and 100 are automatically blocked as "Prohibited goods." Ads with a score below 95 are sent to moderators for manual processing.
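The combined strategy for this queue reduces to a single threshold check. A minimal sketch, assuming scores on a 0 to 100 scale and using made-up return labels:

```python
def route_by_score(score, auto_block_threshold=95.0):
    """Auto-block confident detections; send everything else to a moderator."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must be in [0, 100]")
    if score >= auto_block_threshold:
        return "blocked: prohibited goods"
    return "manual review"
```

Keeping the threshold as a parameter matters in practice: it is the knob that trades moderator workload against the risk of wrongly blocking an ad.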

A chocolate Beretta with cartridges. Manual moderation only! 🙂

Manual moderation

As of early 2019, about 94% of all Yula ads were moderated automatically.

If the platform cannot decide on some ads, it sends them for manual moderation. Odnoklassniki developed its own tool for this: a moderator's task immediately shows all the information needed to make a quick decision, namely whether the ad is fine or should be blocked, and for what reason.

And so that service quality does not suffer during manual moderation, people's work is constantly monitored. For example, a moderator's task stream includes "traps": ads for which ready-made decisions already exist. If the moderator's decision does not match the ready-made one, the moderator is charged with an error.

On average, a moderator spends 10 seconds checking one ad, and the error rate is no more than 0.5% of all checked ads.

Crowd moderation

Our colleagues at Odnoklassniki went further and enlisted the "help of the audience": they wrote a game app for the social network in which you can label large amounts of data by highlighting some bad feature. It is called Odnoklassniki Moderator (https://ok.ru/app/moderator). It is a good way to draw on the help of OK users who want to make the content more pleasant.

A game in which users mark photos that have a phone number on them.

Any ad queue in the platform can be redirected to the Odnoklassniki Moderator game. Everything that the game's users mark is then sent to in-house moderators for review. This scheme lets you block ads for which no filters exist yet, and at the same time build training samples.

Storing moderation results

We save all decisions made during moderation so that we do not re-process ads on which a decision has already been made.

Millions of clusters are created every day from ads. Over time, each cluster is labeled "good" or "bad." Every new ad, or new revision of an ad, that falls into a labeled cluster automatically receives the cluster's resolution. There are about 20 thousand such automatic resolutions per day.

If no new ads arrive in a cluster, it is removed from memory, and its hash and resolution are written to Apache Cassandra.

When the platform receives a new ad, it first tries to find a similar cluster among those already created and take the resolution from it. If there is no such cluster, the platform goes to Cassandra and looks there. Found one? Great: it applies the resolution to the cluster and sends it to Yula. There are on average 70 thousand such "repeated" resolutions every day, 8% of the total.
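The two-level lookup can be sketched as an in-memory map with a persistent fallback. This is a simplified model under stated assumptions: a plain dictionary stands in for Apache Cassandra, clusters are matched by exact hash rather than by similarity, and the class and method names are invented for the example.

```python
class ResolutionStore:
    """Live clusters in memory, with a fallback archive (a dict stands in for Cassandra)."""

    def __init__(self, archive=None):
        self.active = {}  # cluster hash -> resolution, for clusters still in memory
        self.archive = archive if archive is not None else {}

    def evict(self, cluster_hash):
        """Move an idle cluster's resolution from memory to the archive."""
        if cluster_hash in self.active:
            self.archive[cluster_hash] = self.active.pop(cluster_hash)

    def lookup(self, cluster_hash):
        """Check live clusters first, then the archive; None means no prior decision."""
        if cluster_hash in self.active:
            return self.active[cluster_hash]
        return self.archive.get(cluster_hash)
```

A `None` result is the case where the ad has to go through the full filter pipeline described earlier.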

Summing up

We have been using the Odnoklassniki moderation platform for two and a half years. We like the results:

  • We automatically moderate 94% of all ads per day.
  • The cost of moderating one ad has dropped from 2 rubles to 7 kopecks.
  • Thanks to the ready-made tool, we have forgotten about the problems of managing moderators.
  • We increased the number of manually processed ads 2.5 times with the same number of moderators and the same budget. The quality of manual checks has also risen thanks to automated control, and it hovers around 0.5% errors.
  • We quickly cover new types of spam with filters.
  • We quickly connect new divisions, the "Yula Verticals," to moderation. Since 2017, Yula has added the Real Estate, Vacancies, and Auto verticals.

Source: www.habr.com
