How we moderate ads

Every service whose users can create their own content (UGC, user-generated content) has to do more than solve business problems: it also has to keep that content in order. Poor or low-quality moderation can make the service less attractive to users, and in the worst case can end it altogether.

Today we will tell you about the synergy between Yula and Odnoklassniki, which helps us moderate ads in Yula effectively.

Synergy in general is a very useful thing, and in the modern world, where technologies and trends change very quickly, it can be a lifesaver. Why spend scarce resources and time inventing something that has already been invented and polished before you?

We thought the same when we faced the full-scale task of moderating user content: images, text, and links. Our users upload millions of pieces of content to Yula every day, and without automatic processing it is simply impossible to moderate all of this data by hand.

So we used a ready-made moderation platform, which by that time our colleagues from Odnoklassniki had polished to a state of "almost perfection."

Why Odnoklassniki?

Every day, millions of users come to the social network and publish billions of pieces of content, from photos to videos and texts. The Odnoklassniki moderation platform helps check very large volumes of data and counter spammers and bots.

The OK moderation team has accumulated a lot of experience, since it has been improving its tool for 12 years. Importantly, they were able not only to share their ready-made solutions but also to adapt the architecture of their platform to our specific tasks.

From here on, for brevity, we will simply call the OK moderation platform "the platform."

How it all works

Data exchange between Yula and Odnoklassniki goes through Apache Kafka.

Why we chose this tool:

  • In Yula, all ads are post-moderated, so a synchronous response was not required at the start.
  • If the worst happens and Yula or Odnoklassniki is unavailable, including because of peak loads, the data in Kafka will not disappear and can be read later.
  • The platform was already integrated with Kafka, so most security issues had already been resolved.

For each ad that a user creates or edits in Yula, a JSON document with the ad data is generated and put into Kafka for subsequent moderation. From Kafka, ads are loaded into the platform, where a verdict is reached automatically or manually. Bad ads are blocked with a reason, and ads in which the platform finds no violations are marked as "good." All decisions are then sent back to Yula and applied in the service.

In the end, for Yula it all comes down to simple actions: send an ad to the Odnoklassniki platform and get back either an "ok" verdict or the reason why it is not "ok."
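To make the exchange more concrete, here is a minimal sketch of the two messages involved. The article does not show the real schema, so every field name below (`ad_id`, `status`, `reason`, and so on) is a hypothetical placeholder, and the Kafka producer and consumer calls are left out entirely.

```python
import json

def build_moderation_message(ad: dict) -> str:
    """Serialize an ad into the JSON payload that would be put into Kafka.

    All field names are assumptions made for illustration.
    """
    payload = {
        "ad_id": ad["id"],
        "title": ad["title"],
        "description": ad["description"],
        "photos": ad.get("photos", []),
        "category": ad["category"],
        "price": ad["price"],
    }
    return json.dumps(payload, ensure_ascii=False)

def apply_verdict(raw_verdict: str) -> str:
    """Turn a verdict message from the platform into an action in the service."""
    verdict = json.loads(raw_verdict)
    if verdict["status"] == "ok":
        return f"publish ad {verdict['ad_id']}"
    # Blocked ads always carry a reason.
    return f"block ad {verdict['ad_id']}: {verdict['reason']}"
```

Because the exchange is asynchronous, the sender never waits for `apply_verdict`: the verdict arrives later as a separate message.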

Automatic processing

What happens to an ad once it reaches the platform? Each ad is split into several entities:

  • title,
  • description,
  • photos,
  • the category and subcategory chosen by the user,
  • price.

The platform then clusters each entity to find duplicates. Texts and photos are clustered according to different schemes.

Before clustering, texts are normalized to remove special characters, substituted letters, and other garbage. The resulting data is split into N-grams, each of which is hashed. The result is a set of unique hashes. The similarity between two texts is determined by the Jaccard measure between the two resulting sets. If the similarity is above a threshold, the texts are merged into one cluster. To speed up the search for similar clusters, MinHash and locality-sensitive hashing are used.
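The text pipeline above can be sketched in a few lines. This is an illustration under assumptions, not the platform's code: character trigrams, MD5-based hashing, 64 MinHash functions, and the normalization rules are all choices made for the example, and the LSH banding step that would bucket signatures is omitted.

```python
import hashlib
import random
import re

_PRIME = (1 << 61) - 1  # modulus for the MinHash hash family

def normalize(text: str) -> str:
    """Lowercase the text and replace punctuation and underscores with spaces."""
    text = re.sub(r"[^\w\s]|_", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def ngram_hashes(text: str, n: int = 3) -> set[int]:
    """Split normalized text into character n-grams and hash each one."""
    clean = normalize(text)
    grams = {clean[i:i + n] for i in range(max(len(clean) - n + 1, 1))}
    return {int(hashlib.md5(g.encode("utf-8")).hexdigest()[:8], 16) for g in grams}

def jaccard(a: set[int], b: set[int]) -> float:
    """Jaccard measure: size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def minhash_signature(hashes: set[int], num_perm: int = 64, seed: int = 42) -> list[int]:
    """Keep the minimum of each of num_perm random linear hash functions."""
    rng = random.Random(seed)
    coeffs = [(rng.randrange(1, _PRIME), rng.randrange(_PRIME)) for _ in range(num_perm)]
    return [min((a * h + b) % _PRIME for h in hashes) for a, b in coeffs]

def estimated_similarity(sig_a: list[int], sig_b: list[int]) -> float:
    """The share of matching signature positions approximates the Jaccard measure."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)
```

Comparing short fixed-length signatures instead of full hash sets is what makes the search cheap; LSH would then group signatures into buckets so that only likely matches are compared at all.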

For photos, several ways of gluing images together were developed, from comparing the pHash of pictures to searching for duplicates with a neural network.

The last method is the most "hardcore." To train the model, triplets of images (N, A, P) were selected in which N is not similar to A, and P is similar to A (it is a near-duplicate). The neural network then learned to place A and P as close together as possible, and A and N as far apart as possible. This produces fewer false positives than simply taking embeddings from a pre-trained network.
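The training objective described here matches what is commonly called a triplet loss. The sketch below shows one standard formulation on plain Python lists; the Euclidean distance and the `margin` value are assumptions, since the article does not say which distance or margin the model used.

```python
import math

def euclidean(u: list[float], v: list[float]) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin: float = 0.2) -> float:
    """max(d(A, P) - d(A, N) + margin, 0).

    The loss is zero once N is farther from A than P by at least the margin,
    so training pulls near-duplicates together and pushes unrelated images apart.
    """
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)
```

A well-separated triplet contributes no loss, while a triplet in which the unrelated image sits closer to the anchor than the near-duplicate is penalized in proportion to the gap.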

When the neural network receives images as input, it generates an N-dimensional (N = 128) vector for each of them, and a query is made to estimate how close the image is to others. A threshold is then calculated below which close images are considered duplicates.

The model is good at finding spammers who deliberately photograph the same product from different angles to get around the pHash comparison.
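A minimal sketch of the threshold check follows. It is a brute-force scan over a small in-memory index; the distance metric, the normalization step, and the `threshold` value are assumptions for illustration, and a production system would more likely query a nearest-neighbor index than loop over every stored vector.

```python
import math

def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length so that distances are comparable."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def distance(u: list[float], v: list[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def find_duplicates(query: list[float], index: dict, threshold: float = 0.3) -> list:
    """Return ids of indexed images whose embedding lies within the threshold."""
    q = l2_normalize(query)
    return [
        image_id
        for image_id, embedding in index.items()
        if distance(q, l2_normalize(embedding)) <= threshold
    ]
```

The example works with vectors of any length; the real embeddings would have 128 components.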

An example of spam photos glued together by the neural network as duplicates.

At the final stage, duplicate ads are searched for by text and by image at the same time.

If two or more ads end up glued together in a cluster, the system starts automatic blocking, which uses certain algorithms to choose which duplicates to remove and which to keep. For example, if two users have identical photos in their ads, the system blocks the more recent ad.
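The "block the more recent ad" rule can be sketched as follows. The `Ad` fields and the tie-break on `ad_id` are assumptions; the article only says that the newer duplicate is blocked, and the real selection algorithms are not described.

```python
from dataclasses import dataclass

@dataclass
class Ad:
    ad_id: int
    user_id: int
    created_at: int  # Unix timestamp in seconds (assumed representation)

def resolve_duplicates(cluster: list[Ad]) -> tuple[int, list[int]]:
    """Keep the earliest ad in a duplicate cluster and block the rest.

    Returns (kept_ad_id, blocked_ad_ids). Ties on creation time are broken
    by ad_id so that the result is deterministic.
    """
    ordered = sorted(cluster, key=lambda ad: (ad.created_at, ad.ad_id))
    return ordered[0].ad_id, [ad.ad_id for ad in ordered[1:]]
```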

Once created, all clusters pass through a series of automatic filters. Each filter assigns the cluster a score: how likely it is to contain the threat that the filter detects.

For example, the system analyzes the description in an ad and picks potential categories for it. It then takes the category with the highest probability and compares it with the one specified by the author of the ad. If they do not match, the ad is blocked for a wrong category. And since we are kind and honest, we tell the user directly which category to choose so that the ad passes moderation.
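The category filter boils down to an argmax and a comparison. In this sketch the classifier output is taken as given (a mapping from category name to probability); the result structure and its field names are hypothetical.

```python
def check_category(predicted: dict[str, float], declared: str) -> dict:
    """Compare the most probable predicted category with the author's choice.

    `predicted` is the assumed output of a text classifier.
    """
    best = max(predicted, key=predicted.get)
    if best == declared:
        return {"status": "ok"}
    # Include the suggestion so the user can fix the ad and resubmit it.
    return {"status": "blocked", "reason": "wrong category", "suggested_category": best}
```

Returning the suggested category along with the block is what lets the service tell the user which category to pick.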

Notification about a block for a wrong category.

Machine learning feels right at home on our platform. For example, we use it to search titles and descriptions for goods that are prohibited in the Russian Federation. Neural network models also meticulously "examine" images to see whether they contain URLs, spam text, phone numbers, or the same kind of "prohibited" information.

For cases where someone tries to sell a prohibited product disguised as something legal, with no text in either the title or the description, we use image tagging. For each image, up to 11 different tags can be added that describe what is in the picture.

An attempt to sell a hookah by passing it off as a samovar.

In parallel with the complex filters, simple ones are also at work, solving obvious text-related problems:

  • a profanity filter;
  • a URL and phone number detector;
  • mentions of instant messengers and other contact details;
  • an understated price;
  • ads in which nothing is actually for sale, and so on.
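A few of the simple filters from the list above can be approximated with regular expressions and a keyword list. The patterns below are deliberately rough and are assumptions for illustration: the phone pattern only covers Russian-style numbers starting with +7 or 8, the URL pattern knows a handful of domain zones, and the messenger list is made up.

```python
import re

# Matches explicit URLs and bare domains in a few common zones.
URL_RE = re.compile(r"(https?://|www\.)\S+|\b[\w-]+\.(ru|com|net|org)\b", re.IGNORECASE)
# Matches +7 or 8 followed by ten digits, with optional spaces, dashes, or brackets.
PHONE_RE = re.compile(r"(?:\+7|8)[\s\-()]*(?:\d[\s\-()]*){10}")
# Hypothetical keyword list for messenger mentions.
MESSENGERS = ("whatsapp", "telegram", "viber")

def simple_text_filters(text: str) -> list[str]:
    """Return the names of the simple detectors that fired on the text."""
    fired = []
    if URL_RE.search(text):
        fired.append("url")
    if PHONE_RE.search(text):
        fired.append("phone")
    lowered = text.lower()
    if any(name in lowered for name in MESSENGERS):
        fired.append("messenger")
    return fired
```

Filters like these are cheap to run on every ad, which is why they make sense alongside the heavier neural network models.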

Today, every ad passes through a fine sieve of more than 50 automatic filters that try to find something bad in it.

If none of the detectors fires, a response is sent to Yula saying that the ad is "most likely" in perfect order. We use this response ourselves, and users who are subscribed to the seller receive a notification that a new item is available.

Notification that the seller has a new item.

As a result, each ad becomes "overgrown" with metadata. Some of it is generated when the ad is created (the author's IP address, user agent, platform, geolocation, and so on), and the rest is the score produced by each filter.

Ad queues

When ads reach the platform, the system puts them into one of the queues. Each queue is defined by a mathematical formula that combines the ad's metadata in a way that detects some bad pattern.

For example, you can create a queue of ads in the "Mobile phones" category from Yula users who are supposedly in St. Petersburg but whose IP addresses are from Moscow or other cities.

An example of ads posted by the same user in different cities.

Or you can build queues from the scores that a neural network assigns to ads, sorting them in descending order.

Each queue, according to its own formula, assigns a final score to the ad. From there you can proceed in different ways:

  • specify a threshold at which an ad receives a certain type of block;
  • send all ads in the queue to moderators for manual review;
  • or combine the two options: specify an automatic blocking threshold and send to moderators the ads that did not reach it.
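The queue mechanics can be sketched with one predicate and one routing function. The metadata field names, the threshold values, and the "pass" outcome for ads under the threshold in threshold-only mode are assumptions; the article does not say what happens to such ads.

```python
from typing import Optional

def geo_mismatch_queue(ad: dict) -> bool:
    """Queue predicate: phone ads whose declared city differs from the IP city."""
    return ad["category"] == "Mobile phones" and ad["declared_city"] != ad["ip_city"]

def route(score: float, block_threshold: Optional[float] = None, manual_review: bool = True) -> str:
    """Decide what to do with an ad given its final score in a queue.

    - block_threshold set, manual_review False: threshold-only mode
    - block_threshold None, manual_review True: everything goes to moderators
    - both set: automatic blocks above the threshold, moderators for the rest
    """
    if block_threshold is not None and score >= block_threshold:
        return "auto_block"
    return "manual_review" if manual_review else "pass"
```

The combined mode keeps moderators focused on the uncertain cases while clear-cut ones are handled without a human.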

Why are these queues needed? Suppose a user uploads a photo of a firearm. The neural network assigns it a score between 95 and 100 and determines with 99 percent accuracy that there is a weapon in the picture. But if the score is below 95, the accuracy of the model starts to drop (this is a property of neural network models).

As a result, a queue is built on the model's score, and ads that received a score between 95 and 100 are automatically blocked as "Prohibited goods." Ads with a score below 95 are sent to moderators for manual processing.

A chocolate Beretta with cartridges. Manual moderation only! 🙂

Manual moderation

As of early 2019, about 94% of all ads in Yula are moderated automatically.

If the platform cannot reach a decision on some ads, it sends them for manual moderation. Odnoklassniki developed their own tool for this: a moderator's task immediately shows all the information needed to make a quick decision, either that the ad is fine or that it should be blocked, with the reason indicated.

To make sure the quality of the service does not suffer during manual moderation, people's work is monitored constantly. For example, the moderator's task stream includes "traps": ads for which a decision already exists. If the moderator's decision does not match the existing one, the moderator is charged with an error.

On average, a moderator spends 10 seconds checking one ad, and the number of errors does not exceed 0.5% of all checked ads.
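One way to turn the "traps" into a quality metric is to compare a moderator's decisions with the known answers. This is only a sketch: the data shapes are assumptions, and the article does not say whether the 0.5% figure is computed from traps alone or from other checks as well.

```python
def trap_error_rate(decisions: dict, traps: dict) -> float:
    """Share of trap ads on which the moderator disagreed with the known decision.

    `decisions` maps ad id to the moderator's verdict; `traps` maps ad id to
    the pre-existing verdict for ads that were inserted as traps.
    """
    checked = [ad_id for ad_id in decisions if ad_id in traps]
    if not checked:
        return 0.0
    errors = sum(decisions[ad_id] != traps[ad_id] for ad_id in checked)
    return errors / len(checked)
```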

Crowd moderation

Our colleagues from Odnoklassniki went even further and made use of "help from the audience": they wrote a game app for the social network in which you can quickly label large amounts of data by highlighting some negative sign. It is called Odnoklassniki Moderator (https://ok.ru/app/moderator). It is a good way to draw on the help of OK users who want to make the content more pleasant.

A game in which users tag photos that contain a phone number.

Any ad queue on the platform can be redirected to the Odnoklassniki Moderator game. Everything that the game's users mark is then sent to internal moderators for verification. This scheme makes it possible to block ads for which no filters exist yet, and to build training samples at the same time.

Storing moderation results

We save all decisions made during moderation so that we do not reprocess ads on which a decision has already been made.

Millions of clusters are created from ads every day. Over time, each cluster is labeled "good" or "bad." Every new ad, or a new revision of an ad, that falls into a labeled cluster automatically receives the cluster's verdict. There are about 20 thousand such automatic decisions per day.

If no new ads arrive in a cluster, it is removed from memory, and its hash and decision are written to Apache Cassandra.

When the platform receives a new ad, it first tries to find a similar cluster among those already created and take the decision from it. If there is no such cluster, the platform goes to Cassandra and looks there. Found it? Great: it applies the decision to the cluster and sends it to Yula. On average there are 70 such "repeated" decisions every day, 8% of the total.
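The two-level lookup can be sketched as a small class. A plain dictionary stands in for the Apache Cassandra table, and the lookup is an exact match on a cluster hash, which simplifies the similarity search the platform actually performs; the class and method names are hypothetical.

```python
from typing import Optional

class VerdictStore:
    """In-memory clusters backed by an archive of evicted cluster verdicts."""

    def __init__(self) -> None:
        self.active: dict = {}   # cluster hash -> verdict, clusters still in memory
        self.archive: dict = {}  # stand-in for the Apache Cassandra table

    def evict(self, cluster_hash: str) -> None:
        """Move an idle cluster's verdict out of memory into the archive."""
        if cluster_hash in self.active:
            self.archive[cluster_hash] = self.active.pop(cluster_hash)

    def lookup(self, cluster_hash: str) -> Optional[str]:
        """Check in-memory clusters first, then the archive.

        Returns None when no verdict is known and full moderation is needed.
        """
        if cluster_hash in self.active:
            return self.active[cluster_hash]
        verdict = self.archive.get(cluster_hash)
        if verdict is not None:
            # Re-apply the archived verdict to the live cluster.
            self.active[cluster_hash] = verdict
        return verdict
```

Keeping only active clusters in memory bounds memory use, while the archive preserves decisions for ads that reappear later.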

Summing up

We have been using the Odnoklassniki moderation platform for two and a half years. We like the results:

  • We automatically moderate 94% of all ads per day.
  • The cost of moderating one ad dropped from 2 rubles to 7 kopecks.
  • Thanks to the ready-made tool, we have forgotten about the problems of managing moderators.
  • We increased the number of manually processed ads by 2.5 times with the same number of moderators and the same budget. The quality of manual moderation has also improved thanks to automated control, and the error rate fluctuates around 0.5%.
  • We quickly cover new types of spam with filters.
  • We quickly connect new departments, the "Yula Verticals," to moderation. Since 2017, Yula has added the Real Estate, Vacancies, and Auto verticals.

Source: www.habr.com
