E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihona

Hoʻohana ʻia ka ʻimi ʻana i nā hilinaʻi hana i ka ʻikepili ma nā wahi like ʻole o ka ʻikepili: hoʻokele waiwai, hoʻomaʻemaʻe ʻikepili, ʻenekinia reverse database a me ka ʻimi ʻikepili. Ua paʻi mua mākou e pili ana i nā hilinaʻi iā lākou iho ʻO kahi mea kākau ʻO Anastasia Birillo lāua ʻo Nikita Bobrov. I kēia manawa, ʻo Anastasia, he haumāna puka o ka Computer Science Center i kēia makahiki, kaʻana like i ka hoʻomohala ʻana o kēia hana ma ke ʻano o ka hana noiʻi āna i pale ai ma ke kikowaena.

E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihona

Koho hana

I koʻu aʻo ʻana ma ke kikowaena CS, hoʻomaka wau e aʻo hohonu i ka ʻikepili, ʻo ia hoʻi, ka ʻimi ʻana i nā hilinaʻi hana a ʻokoʻa. Ua pili kēia kumuhana i ke kumuhana o kaʻu haʻawina ma ke kulanui, no laila, i koʻu hana ʻana i ka haʻawina, ua hoʻomaka wau e heluhelu i nā ʻatikala e pili ana i nā hilinaʻi like ʻole i nā waihona. Ua kākau wau i kahi loiloi o kēia wahi - kekahi o kaʻu mea mua nāʻatikala ma ka ʻōlelo Pelekania a waiho ʻia i ka hālāwai SEIM-2017. Ua hauʻoli loa au i koʻu ʻike ʻana ua ʻae ʻia ʻo ia ma hope o nā mea a pau, a ua hoʻoholo wau e komo hohonu i ke kumuhana. ʻAʻole hou ka manaʻo - ua hoʻomaka e hoʻohana ʻia i ka makahiki 90, akā i kēia manawa ua hoʻohana ʻia ma nā wahi he nui.

I kaʻu kau lua ma ke kikowaena, ua hoʻomaka wau i kahi papahana noiʻi e hoʻomaikaʻi i nā algorithms no ka ʻimi ʻana i nā hilinaʻi hana. Ua hana pū ʻo ia me St. Petersburg State University puka haumāna Nikita Bobrov ma JetBrains Research.

Ka paʻakikī o ka ʻimi ʻana i nā hilinaʻi hana

ʻO ka pilikia nui ka paʻakikī helu. ʻO ka helu o nā hilinaʻi liʻiliʻi liʻiliʻi a hiki ʻole ke kaupalena ʻia ma luna e ka waiwai E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihonakahi E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihona - ka helu o nā hiʻohiʻona papa. ʻAʻole pili ka manawa hana o nā algorithm i ka helu o nā ʻano, akā i ka helu o nā lālani. I nā makahiki 90, hiki i nā algorithm hulina kānāwai federal ma ka PC papapihi maʻamau ke hoʻoponopono i nā pūʻulu ʻikepili i loaʻa a hiki i 20 mau ʻano a me nā ʻumi kaukani o nā lālani a hiki i kekahi mau hola. ʻO nā algorithms hou e holo ana ma nā kaʻina hana multi-core ʻike i nā hilinaʻi no nā pūʻulu ʻikepili i loaʻa i nā haneli o nā ʻano (a hiki i 200) a me nā haneli haneli o nā lālani i ka manawa like. Eia naʻe, ʻaʻole lawa kēia: ʻaʻole ʻae ʻia kēlā manawa no ka hapa nui o nā noi honua maoli. No laila, ua hoʻomohala mākou i nā ala e hoʻolalelale i nā algorithm i loaʻa.

ʻO nā papahana hoʻokoe no nā ʻāpana ʻāpana

Ma ka hapa mua o ka hana, ua hoʻomohala mākou i nā papa hana caching no kahi papa o nā algorithms e hoʻohana ana i ke ʻano hoʻokaʻawale ʻāpana. ʻO kahi ʻāpana no kahi ʻano he papa inoa, kahi i loaʻa i kēlā me kēia papa inoa nā helu laina me nā waiwai like no kahi ʻano i hāʻawi ʻia. Kapa ʻia kēlā me kēia papa inoa he pūʻulu. Nui nā algorithms hou e hoʻohana i nā ʻāpana e hoʻoholo ai inā paʻa ka hilinaʻi a ʻaʻole paha, ʻo ia hoʻi, pili lākou i ka lemma: Dependency E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihona paa ina E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihona. Eia E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihona ua koho ʻia kahi ʻāpana a ua hoʻohana ʻia ka manaʻo o ka nui o ka pā - ka helu o nā puʻupuʻu i loko. Algorithms e hoʻohana i nā ʻāpana, i ka wā i uhaki ʻia ai ka hilinaʻi, e hoʻohui i nā ʻano ʻē aʻe i ka ʻaoʻao hema o ka hilinaʻi, a laila e helu hou, e hana ana i ka hana o ka intersection o nā pā. Kapa ʻia kēia hana ʻo specialization i nā ʻatikala. Akā ua ʻike mākou i hiki ke hoʻohana hou ʻia nā ʻāpana no nā hilinaʻi i mālama ʻia ma hope o kekahi mau pōʻai kūikawā, hiki ke hoʻemi nui i ka manawa holo o nā algorithms, no ka mea he pipiʻi ka hana intersection.

No laila, ua noi mākou i kahi heuristic e pili ana iā Shannon Entropy a me Ginny Uncertainty, a me kā mākou metric, a mākou i kapa ai ʻo Reverse Entropy. He hoʻololi liʻiliʻi ia o Shannon Entropy a piʻi aʻe i ka piʻi ʻana o ke ʻano o ka hoʻonohonoho ʻikepili. ʻO ka heuristic i manaʻo ʻia penei:

E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihona

he mea E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihona — degere o ka ʻokoʻa o ka ʻāpana i helu ʻia E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihonaa me ka E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihona ʻo ia ka median o nā degere o ka ʻokoʻa no kēlā me kēia ʻano. Ua hoʻāʻo ʻia nā metric ʻekolu i hōʻike ʻia ma luna nei ma ke ʻano he metric kūʻokoʻa. Hiki iā ʻoe ke ʻike he ʻelua mau mea hoʻololi i ka heuristic. Hōʻike ka mua i ka pili ʻana o ka ʻāpana o kēia manawa i ke kī mua a hiki iā ʻoe ke hūnā i ka nui o kēlā mau ʻāpana i mamao loa mai ke kī hiki. ʻO ka mea hoʻololi ʻelua e ʻae iā ʻoe e nānā i ka noho ʻana o ka cache a no laila e paipai i ka hoʻohui ʻana i nā ʻāpana ʻē aʻe i ka cache inā loaʻa kahi kaʻawale. ʻO ka hopena kūleʻa o kēia pilikia i ʻae iā mākou e wikiwiki i ka PYRO algorithm e 10-40%, ma muli o ka dataset. He mea pono e hoʻomaopopo i ka PYRO algorithm ka mea lanakila loa ma kēia wahi.

Ma ke kiʻi ma lalo nei hiki iā ʻoe ke ʻike i nā hopena o ka hoʻohana ʻana i ka heuristic i manaʻo ʻia e hoʻohālikelike ʻia me kahi ala hoʻokele coin-flip caching kumu. He logarithmic ka axis X.

E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihona

ʻO kahi ala ʻē aʻe e mālama i nā ʻāpana

A laila hāʻawi mākou i kahi ala ʻē aʻe e mālama i nā ʻāpana. ʻO nā ʻāpana he pūʻulu o nā puʻupuʻu, kēlā me kēia mea e mālama i nā helu o nā tuple me nā waiwai like no kekahi mau ʻano. Loaʻa paha i kēia mau pūʻulu nā kaʻina lōʻihi o nā helu tuple, no ka laʻana inā ua kauoha ʻia ka ʻikepili i loko o ka papaʻaina. No laila, ua manaʻo mākou i kahi hoʻolālā hoʻopaʻa no ka mālama ʻana i nā ʻāpana, ʻo ia hoʻi ka mālama ʻana i nā waiwai i nā puʻupuʻu o nā ʻāpana:

$$hōʻikeʻike$$pi(X) = {{ka hoʻopaʻa lima{1, 2, 3, 4, 5}_{Ka wā mua}, ka pūʻali lalo{7, 8}_{Ka wā ʻelua}, 10}}\ lalo{ Hoʻopiʻi} \ pi(X) = {{ka hoʻopaʻa lima ʻokoʻa{$, 1, 5}_{Mukahi~waena}, kaʻaoʻao lalo{7, 8}_{Kalua~waena}, 10}}$$hōʻike$$

Ua hiki i kēia ala ke hōʻemi i ka hoʻohana ʻana i ka hoʻomanaʻo i ka wā o ka hana o ka TANE algorithm mai 1 a 25%. ʻO ka TANE algorithm kahi algorithm maʻamau no ka ʻimi ʻana i nā kānāwai federal; hoʻohana ia i nā ʻāpana i kāna hana. Ma ke ʻano o ka hoʻomaʻamaʻa, ua koho ʻia ka TANE algorithm, no ka mea, ʻoi aku ka maʻalahi o ka hoʻokō ʻana i ka waiho ʻana i loko o ia mea ma mua, no ka laʻana, ma PYRO i mea e loiloi ai i ka holo ʻana o ke ala i manaʻo ʻia. Hōʻike ʻia nā hopena i loaʻa ma ka kiʻi ma lalo nei. He logarithmic ka axis X.

E ʻimi maikaʻi i nā hilinaʻi hana ma nā waihona

Hui ADBIS-2019

Ma muli o nā hopena o ka noiʻi, i Kepakemapa 2019 ua paʻi au i kahi ʻatikala Ka Hoʻokolo ʻana no ka ʻike ʻana i ka hilinaʻi pono kūpono ma ka 23rd European Conference on Advances in Databases and Information Systems (ADBIS-2019). I ka wā o ka hōʻike, ua ʻike ʻia ka hana e Bernhard Thalheim, he kanaka koʻikoʻi ma ke kahua o ka ʻikepili. Ua hoʻokumu nā hopena noiʻi i kumu o kaʻu dissertation ma ke kēkelē laepua ma ka makemakika a me ka mechanics ma St. Petersburg State University, kahi i hoʻokō ʻia ai nā ala ʻelua i manaʻo ʻia (caching and compression) ma nā algorithm ʻelua: TANE a me PYRO. Eia kekahi, ua hōʻike nā hopena i ke ao holoʻokoʻa nā ala i manaʻo ʻia, no ka mea ma nā algorithm ʻelua, me nā ala ʻelua, ua ʻike ʻia kahi hōʻemi nui o ka hoʻohana ʻana i ka hoʻomanaʻo, a me ka hoʻemi nui ʻana i ka manawa hana o nā algorithms.

Source: www.habr.com

Pākuʻi i ka manaʻo hoʻopuka