A Simple GPU Hash Table

I have published a new project on GitHub: A Simple GPU Hash Table.

It is a simple GPU hash table capable of hundreds of millions of insertions per second. On my laptop's NVIDIA GTX 1060, the code inserts 64 million randomly generated key/value pairs in about 210 ms and deletes 32 million of those pairs in about 64 ms.

That works out to roughly 300 million insertions per second and 500 million deletions per second on a laptop.
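The rates follow directly from the timings. Here is a minimal sketch of that arithmetic in C++; the helper name millions_per_second is made up for this illustration, and 64 and 32 million are treated as round decimal millions:

```cpp
#include <cassert>

// Hypothetical helper: throughput in millions of operations per second,
// given an operation count (in millions) and an elapsed time in milliseconds.
double millions_per_second(double millions_of_ops, double elapsed_ms)
{
    assert(elapsed_ms > 0.0);
    return millions_of_ops / (elapsed_ms / 1000.0);
}
```

With the figures above, millions_per_second(64, 210) comes to a little under 305 and millions_per_second(32, 64) to exactly 500, which matches the rounded rates quoted in the text.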

The table is written in CUDA, although the same technique could be applied in HLSL or GLSL. The implementation has several constraints that let it perform well on a GPU:

  • It handles only 32-bit keys and 32-bit values.
  • The hash table has a fixed size.
  • That size must be a power of two.

A simple sentinel value is reserved to mark empty keys and empty values (in the sample code it is 0xffffffff).

Lock-Free Hash Table

The hash table uses open addressing with linear probing: it is simply an array of key/value pairs stored in memory, which gives it excellent cache performance. The same cannot be said of chaining, which involves chasing pointers through a linked list. The hash table is a plain array of KeyValue items:

struct KeyValue
{
    uint32_t key;
    uint32_t value;
};

The table size is a power of two rather than a prime number, because wrapping an index with a power-of-two AND mask takes a single fast instruction, whereas the modulo operator is much slower. This matters for linear probing, because the slot index has to be wrapped at every slot visited during a linear search, so the cost of a modulo at each slot quickly adds up.
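To make the equivalence concrete, here is a small host-side C++ sketch of both ways to step to the next slot. The function names are invented for this example; only the mask expression itself appears in the sample code.

```cpp
#include <cassert>
#include <cstdint>

// Step to the next slot and wrap with a single AND.
// Only valid when capacity is a power of two.
uint32_t next_slot_mask(uint32_t slot, uint32_t capacity)
{
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    return (slot + 1) & (capacity - 1);
}

// The same step using the modulo operator. It works for any non-zero
// capacity, but integer division is typically much slower than AND.
uint32_t next_slot_mod(uint32_t slot, uint32_t capacity)
{
    assert(capacity != 0);
    return (slot + 1) % capacity;
}
```

For a capacity of 8, both functions map slot 7 back to slot 0 and agree on every other slot, so the mask is a drop-in replacement as long as the capacity stays a power of two.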

The table stores only the key and the value of each item, not the hash of the key. Because the table holds only 32-bit keys, the hash is quick to compute. The sample code uses a Murmur3 hash, which consists of just a few shifts, XORs, and multiplications.
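For reference, a hash of this kind can be sketched with the standard Murmur3 32-bit finalizer. This is a host-side C++ illustration, not necessarily the exact function in the repository; the name hash_key and the explicit capacity parameter are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Murmur3 32-bit finalizer: three shift-XOR steps and two multiplications
// spread the bits of the key across the whole word.
uint32_t murmur3_fmix32(uint32_t k)
{
    k ^= k >> 16;
    k *= 0x85ebca6bu;
    k ^= k >> 13;
    k *= 0xc2b2ae35u;
    k ^= k >> 16;
    return k;
}

// Map a key to its preferred slot by masking the mixed bits.
// capacity is assumed to be a power of two.
uint32_t hash_key(uint32_t key, uint32_t capacity)
{
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    return murmur3_fmix32(key) & (capacity - 1);
}
```

Because the mixing step is only five cheap integer operations, recomputing the hash on every lookup is affordable, which is why the table does not need to store it.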

The hash table uses a lock-free technique that does not depend on memory ordering. Even if some writes become visible out of order relative to other writes, the hash table remains in a valid state. More on this below. The technique works well on GPUs, which run thousands of threads concurrently.

Both the keys and the values in the hash table are initialized to empty.
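Since the empty sentinel is 0xffffffff, one way to initialize the table is to fill every byte with 0xff. The following host-side C++ sketch shows the idea; create_hashtable and is_power_of_two are hypothetical helpers, and on the GPU the same byte fill could be done with cudaMemset.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

struct KeyValue
{
    uint32_t key;
    uint32_t value;
};

// Sentinel reserved for "empty"; it can never be used as a real key or value.
const uint32_t kEmpty = 0xffffffff;

// True when n is a power of two (the table capacity must satisfy this).
bool is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

// Host-side stand-in for allocating the table: every byte is set to 0xff,
// so every 32-bit key and value reads back as kEmpty.
std::vector<KeyValue> create_hashtable(uint32_t capacity)
{
    assert(is_power_of_two(capacity));
    std::vector<KeyValue> table(capacity);
    std::memset(table.data(), 0xff, sizeof(KeyValue) * capacity);
    return table;
}
```

The byte fill only works because the sentinel consists of identical bytes; a different sentinel would need an explicit loop or an initialization kernel.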

The code could be modified to handle 64-bit keys and values as well. Keys require atomic read, write, and compare-and-swap operations, while values require atomic reads and writes. Fortunately, in CUDA, 32-bit and 64-bit reads and writes are atomic as long as they are naturally aligned, and modern GPUs support 64-bit atomic compare-and-swap. Moving to 64 bits would of course cost some performance.

Hash Table State

Each key/value pair in the hash table can be in one of four states:

  • Both the key and the value are empty. This is the state in which the hash table is initialized.
  • The key has been written, but the value has not been written yet. If another thread reads the slot at this point, it gets back empty. That is fine: the result is the same as if the other thread had run slightly earlier, and this is a concurrent data structure.
  • Both the key and the value have been written.
  • The value is visible to other threads, but the key is not yet. This can happen because the CUDA programming model assumes a weakly ordered memory model. That is fine too: the slot still counts as empty, because its key is still empty even though its value no longer is.

An important nuance: once a key has been written to a slot, it never moves, even when the key is deleted, as discussed below.

The hash table code works even under weakly ordered memory models, in which the order of memory reads and writes is unknown. As we go through insertion, lookup, and deletion, keep in mind that each key/value pair is in one of the four states described above.

Inserting into the Hash Table

The CUDA function that inserts a key/value pair into the hash table looks like this:

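__device__ void gpu_hashtable_insert(KeyValue* hashtable, uint32_t key, uint32_t value)
{
    uint32_t slot = hash(key);

    while (true)
    {
        // Claim the slot if it is empty; prev is the key that was there before.
        uint32_t prev = atomicCAS(&hashtable[slot].key, kEmpty, key);
        if (prev == kEmpty || prev == key)
        {
            hashtable[slot].value = value;
            break;
        }
        // The slot belongs to another key: probe the next one, wrapping around.
        slot = (slot + 1) & (kHashTableCapacity-1);
    }
}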

To insert a key, the code iterates over the hash table array starting at the hash of the key being inserted. At each slot it performs an atomic compare-and-swap that compares the key in that slot with empty. If they match, the slot's key is replaced with the inserted key; in either case the slot's original key is returned. If that original key was empty or equal to the inserted key, the code has found the right slot and writes the inserted value into it.

If a single gpu_hashtable_insert() kernel call contains several items with the same key, any of their values may end up in that key's slot. This is considered acceptable: one of the key/value writes made during the call will win, but because everything happens in parallel across many threads, we cannot predict which memory write will be the last.
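The same claim-by-CAS logic can be tried on the CPU. The sketch below is a host-side C++ analogue, not the CUDA code itself: it assumes std::atomic with compare_exchange_strong as a stand-in for atomicCAS, and the names AtomicSlot, hash_key, and cpu_hashtable_insert are invented for the illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

const uint32_t kEmpty = 0xffffffff;

// CPU stand-in for one slot; the key is atomic so it can be claimed with CAS.
struct AtomicSlot
{
    std::atomic<uint32_t> key{kEmpty};
    std::atomic<uint32_t> value{kEmpty};
};

// Murmur3 finalizer masked to the table size (capacity is a power of two).
uint32_t hash_key(uint32_t k, uint32_t capacity)
{
    k ^= k >> 16; k *= 0x85ebca6bu;
    k ^= k >> 13; k *= 0xc2b2ae35u;
    k ^= k >> 16;
    return k & (capacity - 1);
}

// Inserts key/value and returns the slot index the key ended up in.
// Like the original, it loops forever if the table is full and the key absent.
uint32_t cpu_hashtable_insert(std::vector<AtomicSlot>& table, uint32_t key, uint32_t value)
{
    const uint32_t capacity = static_cast<uint32_t>(table.size());
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    assert(key != kEmpty);
    uint32_t slot = hash_key(key, capacity);
    while (true)
    {
        // On failure, compare_exchange_strong stores the key currently in the
        // slot into prev, which mirrors the return value of atomicCAS.
        uint32_t prev = kEmpty;
        bool claimed = table[slot].key.compare_exchange_strong(prev, key);
        if (claimed || prev == key)
        {
            table[slot].value.store(value, std::memory_order_relaxed);
            return slot;
        }
        slot = (slot + 1) & (capacity - 1);
    }
}
```

Inserting the same key twice lands in the same slot and simply overwrites the value, which is the single-threaded version of the "last write wins" behavior described above.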

Hash Table Lookups

The code for looking up keys:

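__device__ uint32_t gpu_hashtable_lookup(KeyValue* hashtable, uint32_t key)
{
    uint32_t slot = hash(key);

    while (true)
    {
        // Found the key: return whatever value is currently stored with it.
        if (hashtable[slot].key == key)
        {
            return hashtable[slot].value;
        }
        // An empty key ends the probe sequence: the key is not in the table.
        if (hashtable[slot].key == kEmpty)
        {
            return kEmpty;
        }
        slot = (slot + 1) & (kHashTableCapacity - 1);
    }
}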

To find the value of a key stored in the table, we iterate over the array starting at the hash of the key we are looking for. At each slot we check whether the key is the one we want and, if so, return its value. We also check whether the key is empty and, if so, abort the search.

If the key cannot be found, the code returns the empty value.

All of these lookups can run concurrently with insertions and deletions. Each pair in the table will be in one of the four states described above.

Deleting from the Hash Table

The code for deleting keys:

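// The value parameter is unused; only the key is needed to find the slot.
__device__ void gpu_hashtable_delete(KeyValue* hashtable, uint32_t key, uint32_t value)
{
    uint32_t slot = hash(key);

    while (true)
    {
        if (hashtable[slot].key == key)
        {
            // Mark the item as deleted; the key itself stays in the slot.
            hashtable[slot].value = kEmpty;
            return;
        }
        // An empty key ends the probe sequence: nothing to delete.
        if (hashtable[slot].key == kEmpty)
        {
            return;
        }
        slot = (slot + 1) & (kHashTableCapacity - 1);
    }
}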

Deletion works in an unusual way: we leave the key in the table and mark its value (not the key itself) as empty. The code is very similar to lookup(), except that when it finds a matching key it sets that key's value to empty.

As mentioned above, once a key has been written to a slot, it never moves. Even when an item is deleted from the table, the key stays in place and only its value becomes empty. This means we do not need an atomic write for the slot's value: it does not matter whether the current value is empty or not, because it ends up empty either way.
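A small single-threaded C++ model makes the consequences of this scheme easier to see. Everything here is an illustration under simplifying assumptions: the model_* functions are hypothetical names, there are no atomics, and first_slot() uses the low bits of the key in place of the real hash so that collisions are easy to provoke.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct KeyValue
{
    uint32_t key;
    uint32_t value;
};

const uint32_t kEmpty = 0xffffffff;

// Simplified stand-in for hash(): the low bits of the key pick the first slot.
uint32_t first_slot(uint32_t key, uint32_t mask) { return key & mask; }

std::vector<KeyValue> make_table(uint32_t capacity)
{
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    return std::vector<KeyValue>(capacity, KeyValue{kEmpty, kEmpty});
}

void model_insert(std::vector<KeyValue>& t, uint32_t key, uint32_t value)
{
    const uint32_t mask = static_cast<uint32_t>(t.size()) - 1;
    uint32_t slot = first_slot(key, mask);
    // Probe until we find an empty slot or the slot already owned by this key.
    while (t[slot].key != kEmpty && t[slot].key != key)
        slot = (slot + 1) & mask;
    t[slot].key = key;
    t[slot].value = value;
}

uint32_t model_lookup(const std::vector<KeyValue>& t, uint32_t key)
{
    const uint32_t mask = static_cast<uint32_t>(t.size()) - 1;
    uint32_t slot = first_slot(key, mask);
    while (t[slot].key != key)
    {
        if (t[slot].key == kEmpty) return kEmpty;   // end of probe sequence
        slot = (slot + 1) & mask;
    }
    return t[slot].value;
}

void model_delete(std::vector<KeyValue>& t, uint32_t key)
{
    const uint32_t mask = static_cast<uint32_t>(t.size()) - 1;
    uint32_t slot = first_slot(key, mask);
    while (t[slot].key != key)
    {
        if (t[slot].key == kEmpty) return;          // key was never inserted
        slot = (slot + 1) & mask;
    }
    t[slot].value = kEmpty;                         // the key stays in the slot
}
```

In a table of capacity 8, keys 1 and 9 collide, so 9 is stored one slot after 1. Deleting key 1 leaves its key in place, which keeps the probe sequence intact: key 9 can still be found, and inserting key 1 again reuses its old slot.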

Resizing the Hash Table

You can resize the hash table by creating a larger table and inserting the non-empty items from the old table into it. I did not implement this, because I wanted to keep the sample code simple. Besides, in CUDA programs memory allocation is usually done in host code rather than in a CUDA kernel.

The article "A Lock-Free Wait-Free Hash Table" describes how to resize such a lock-free data structure.
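Since the sample code leaves resizing out, here is one way it could look on the host. This is a single-threaded C++ sketch under stated assumptions, not part of the project: rehash_into_larger and find_value are hypothetical names, the new table is simply twice the size of the old one, and no other thread touches either table while the copy runs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct KeyValue
{
    uint32_t key;
    uint32_t value;
};

const uint32_t kEmpty = 0xffffffff;

// Murmur3 finalizer masked to the table size (capacity is a power of two).
uint32_t hash_key(uint32_t k, uint32_t capacity)
{
    k ^= k >> 16; k *= 0x85ebca6bu;
    k ^= k >> 13; k *= 0xc2b2ae35u;
    k ^= k >> 16;
    return k & (capacity - 1);
}

// Linear-probing lookup, used here to check the rebuilt table.
uint32_t find_value(const std::vector<KeyValue>& table, uint32_t key)
{
    const uint32_t capacity = static_cast<uint32_t>(table.size());
    uint32_t slot = hash_key(key, capacity);
    while (table[slot].key != key)
    {
        if (table[slot].key == kEmpty) return kEmpty;
        slot = (slot + 1) & (capacity - 1);
    }
    return table[slot].value;
}

// Build a table twice as large and re-insert every live item.
std::vector<KeyValue> rehash_into_larger(const std::vector<KeyValue>& old_table)
{
    const uint32_t new_capacity = static_cast<uint32_t>(old_table.size()) * 2;
    assert(new_capacity != 0 && (new_capacity & (new_capacity - 1)) == 0);
    std::vector<KeyValue> bigger(new_capacity, KeyValue{kEmpty, kEmpty});
    for (const KeyValue& kv : old_table)
    {
        // Skip never-used slots and deleted items (key kept, value emptied).
        if (kv.key == kEmpty || kv.value == kEmpty) continue;
        uint32_t slot = hash_key(kv.key, new_capacity);
        // Keys in the old table are unique, so only empty slots need checking.
        while (bigger[slot].key != kEmpty)
            slot = (slot + 1) & (new_capacity - 1);
        bigger[slot] = kv;
    }
    return bigger;
}
```

Skipping slots whose value is empty drops deleted keys during the copy, so the new table starts without the clutter left behind by earlier deletions.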

Concurrency

In the code snippets above, gpu_hashtable_insert(), _lookup(), and _delete() process one key/value pair at a time. Below, the same operations are applied to an array of pairs in parallel, with each pair handled by its own GPU thread (the insert kernel is shown):

// CPU code to invoke the CUDA kernel on the GPU
uint32_t threadblocksize = 1024;
uint32_t gridsize = (numkvs + threadblocksize - 1) / threadblocksize;
gpu_hashtable_insert_kernel<<<gridsize, threadblocksize>>>(hashtable, kvs, numkvs);

// GPU code to process numkvs key/values in parallel
__global__ void gpu_hashtable_insert_kernel(KeyValue* hashtable, const KeyValue* kvs, unsigned int numkvs)
{
    unsigned int threadid = blockIdx.x*blockDim.x + threadIdx.x;
    // The last block may contain threads past the end of the input array.
    if (threadid < numkvs)
    {
        gpu_hashtable_insert(hashtable, kvs[threadid].key, kvs[threadid].value);
    }
}
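The grid size expression in the host code is a rounded-up division. A tiny C++ sketch shows what it computes; blocks_needed is a name invented for this example, and the sketch assumes num_items + block_size - 1 does not overflow 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Number of thread blocks needed so that blocks * block_size >= num_items.
uint32_t blocks_needed(uint32_t num_items, uint32_t block_size)
{
    assert(block_size > 0);
    return (num_items + block_size - 1) / block_size;
}
```

For 1025 items and 1024 threads per block this yields 2 blocks, that is 2048 threads, of which 1023 have nothing to do. That surplus is why the kernel checks the thread index against numkvs before touching the input array.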

The lock-free hash table supports concurrent insertions, lookups, and deletions. Because key/value pairs are always in one of the four states and keys never move, the table guarantees correctness even when different kinds of operations run at the same time.

However, if we process a batch of insertions and deletions in parallel and the input array contains duplicate keys, we cannot predict which pairs will "win", that is, be written to the hash table last. Suppose we call the insert code with the input array A/0 B/1 A/2 C/3 A/4. When the code finishes, the pairs B/1 and C/3 are guaranteed to be in the table, but the value stored for A may be that of any one of A/0, A/2, or A/4. This may or may not be a problem, depending on the application. You may know in advance that the input array contains no duplicate keys, or you may not care which value is written last.

If this is a problem for you, you need to split the duplicate pairs across separate CUDA kernel calls. In CUDA, a kernel launch always completes before the next kernel launch begins (at least within a single stream; kernels in different streams may run in parallel). In the example above, if you call one kernel with A/0 B/1 A/2 C/3 and then another with A/4, key A will end up with the value 4.
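One possible way to do the split on the host is sketched below in C++. It is only an illustration: split_duplicate_keys is a hypothetical helper, and it uses a stricter split than the two-kernel example above by putting the n-th occurrence of each key into batch n, so that no batch contains a key twice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct KeyValue
{
    uint32_t key;
    uint32_t value;
};

// Hypothetical host-side helper: the n-th occurrence of a key goes to batch n,
// so later duplicates always land in later batches.
std::vector<std::vector<KeyValue>> split_duplicate_keys(const std::vector<KeyValue>& kvs)
{
    std::vector<std::vector<KeyValue>> batches;
    std::unordered_map<uint32_t, std::size_t> seen;   // key -> occurrences so far
    for (const KeyValue& kv : kvs)
    {
        std::size_t batch = seen[kv.key]++;
        // Earlier occurrences of this key already created all earlier batches.
        assert(batch <= batches.size());
        if (batch == batches.size()) batches.emplace_back();
        batches[batch].push_back(kv);
    }
    return batches;
}
```

For the input A/0 B/1 A/2 C/3 A/4 this produces three batches: A/0 B/1 C/3, then A/2, then A/4. Launching one insert kernel per batch, in order and on the same stream, would leave key A with the value 4.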

Now let us consider whether the lookup() and delete() functions should use a plain or a volatile pointer to the array of pairs in the hash table. The CUDA documentation states:

The compiler is free to optimize reads and writes to global or shared memory ... These optimizations can be disabled using the volatile keyword: ... any reference to this variable compiles to an actual memory read or write instruction.

Correctness does not require volatile. If a thread uses a cached value from an earlier read, it works with slightly stale information. Even so, that information comes from a valid state of the hash table at some moment during the kernel call. If you need the most recent information, you can use a volatile pointer, but performance drops slightly: in my tests, when deleting 32 million items, throughput fell from 500 million deletions per second to 450 million deletions per second.

Performance

In a test that inserts 64 million items and then deletes 32 million of them, std::unordered_map and the GPU hash table compare as follows:

std::unordered_map took 70 ms to insert and delete the items and then free the unordered_map (freeing millions of items takes a long time, because unordered_map makes many memory allocations internally). To be fair, std::unordered_map operates under completely different constraints: it runs on a single CPU thread, supports keys and values of any size, performs well at high load factors, and shows stable performance after many deletions.

The GPU hash table, including communication with the host program, took 984 ms. This includes the time spent allocating the table in memory and freeing it (a one-time allocation of 1 GB of memory, which takes some time in CUDA), inserting and deleting items, and iterating over them. All copies to and from GPU memory are included as well.

The hash table itself took 271 ms. This covers the time the GPU spent inserting and deleting items and excludes the time spent copying memory and iterating over the resulting table. If the GPU table lives for a long time, or if the hash table is kept entirely in GPU memory (for example, a hash table that is built and used by other GPU code rather than by the CPU), then this is the relevant figure.

The GPU hash table is fast thanks to the GPU's high memory bandwidth and massive parallelism.

Shortcomings

The hash table design has a few issues to be aware of:

  • Linear probing suffers from clustering, which causes keys to be placed in less than ideal positions in the table.
  • Keys are never removed by the delete function, so over time they clutter the table.

As a result, the performance of the hash table can degrade gradually, especially if it lives for a long time and sees many insertions and deletions. One way to mitigate this is to rehash into a new table with a fairly low load factor and filter out the deleted keys during the rehash.

To illustrate these issues, I will use the code above to create a table with a capacity of 128 million items and insert 4 million items at a time until 124 million slots are filled (a load factor of about 0.96). Here is the table of results; each row is one CUDA kernel call that inserts 4 million new items into the same hash table:

Load factor    Time to insert 4 million items
0.00           11.608448 ms (361.314798 million keys/s)
0.03           11.751424 ms (356.918799 million keys/s)
0.06           11.942592 ms (351.205515 million keys/s)
0.09           12.081120 ms (347.178429 million keys/s)
0.12           12.242560 ms (342.600233 million keys/s)
0.16           12.396448 ms (338.347235 million keys/s)
0.19           12.533024 ms (334.660176 million keys/s)
0.22           12.703328 ms (330.173626 million keys/s)
0.25           12.884512 ms (325.530693 million keys/s)
0.28           13.033472 ms (321.810182 million keys/s)
0.31           13.239296 ms (316.807174 million keys/s)
0.34           13.392448 ms (313.184256 million keys/s)
0.37           13.624000 ms (307.861434 million keys/s)
0.41           13.875520 ms (302.280855 million keys/s)
0.44           14.126528 ms (296.909756 million keys/s)
0.47           14.399328 ms (291.284699 million keys/s)
0.50           14.690304 ms (285.515123 million keys/s)
0.53           15.039136 ms (278.892623 million keys/s)
0.56           15.478656 ms (270.973402 million keys/s)
0.59           15.985664 ms (262.379092 million keys/s)
0.62           16.668673 ms (251.627968 million keys/s)
0.66           17.587200 ms (238.486174 million keys/s)
0.69           18.690048 ms (224.413765 million keys/s)
0.72           20.278816 ms (206.831789 million keys/s)
0.75           22.545408 ms (186.038058 million keys/s)
0.78           26.053312 ms (160.989275 million keys/s)
0.81           31.895008 ms (131.503463 million keys/s)
0.84           42.103294 ms (99.619378 million keys/s)
0.87           61.849056 ms (67.815164 million keys/s)
0.90           105.695999 ms (39.682713 million keys/s)
0.94           240.204636 ms (17.461378 million keys/s)

As the load factor rises, performance falls. In most cases this is undesirable. If an application inserts items into a table and then discards it (for example, when counting the words in a book), this is not a problem. But if the application uses a long-lived hash table (for example, a graphics editor that stores the non-empty regions of images, where the user frequently triggers insertions and deletions), this behavior can become a problem.

I also measured the hash table's probe length after 64 million insertions (a load factor of 0.5). The average probe length was 0.4774, so most keys were either in their best possible slot or one slot away from it. The maximum probe length was 60.

I then measured the probe length for a table with 124 million insertions (a load factor of 0.97). The average probe length was already 10.1757, and the maximum was 6474 (!!). Linear probing performs very poorly at high load factors.

It is best to keep the load factor of this hash table low. But that buys performance at the cost of memory. Fortunately, with 32-bit keys and values this can be justified. In the example above, with a table of 128 million slots, keeping the load factor at 0.25 means we can store no more than 32 million items, and the remaining 96 million slots are wasted: at 8 bytes per pair, that is 768 MB of wasted memory.
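The 768 MB figure can be checked with a few lines of C++. The helper wasted_bytes is made up for this illustration, and the check assumes that "million" here means 2^20, which fits a power-of-two table capacity.

```cpp
#include <cassert>
#include <cstdint>

// Bytes occupied by slots that stay unused at a given load factor.
uint64_t wasted_bytes(uint64_t capacity_slots, double load_factor, uint64_t bytes_per_slot)
{
    assert(load_factor >= 0.0 && load_factor <= 1.0);
    uint64_t used_slots = static_cast<uint64_t>(capacity_slots * load_factor);
    return (capacity_slots - used_slots) * bytes_per_slot;
}
```

With 128 * 2^20 slots, a load factor of 0.25, and 8 bytes per slot, the result is 768 * 2^20 bytes. Halving the table size, or doubling the load factor, is therefore worth hundreds of megabytes of GPU memory.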

Note that this is wasted GPU memory, which is a more precious resource than system memory. Although most modern desktop GPUs that support CUDA have at least 4 GB of memory (at the time of writing, the NVIDIA 2080 Ti has 11 GB), wasting that much is still not the wisest choice.

Later I will write more about building GPU hash tables that do not suffer from long probe lengths, and about ways to reuse deleted slots.

Measuring Probe Length

To determine the probe length of a key, we can subtract the key's hash (its ideal table index) from its actual table index:

// get_key_index() -> index of key in hash table
uint32_t probelength = (get_key_index(key) - hash(key)) & (hashtablecapacity-1);

Thanks to the magic of two's complement binary numbers and the fact that the hash table capacity is a power of two, this works even when the key's index has wrapped around to the beginning of the table. Take a key that hashes to 1 but was inserted into slot 3. For a table with capacity 4 we get (3 - 1) & 3, which equals 2.
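The wraparound case is the interesting one, so here is a small C++ sketch that covers it. The function probe_length takes the two indices directly instead of calling get_key_index() and hash(); that signature is an assumption made to keep the example self-contained.

```cpp
#include <cassert>
#include <cstdint>

// Distance from a key's ideal slot to the slot it actually occupies.
// Unsigned subtraction wraps modulo 2^32, and the mask reduces the
// result modulo the power-of-two capacity.
uint32_t probe_length(uint32_t actual_slot, uint32_t ideal_slot, uint32_t capacity)
{
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    return (actual_slot - ideal_slot) & (capacity - 1);
}
```

A key that hashes to slot 3 of a 4-slot table but ends up in slot 0 has wrapped around once: 0 - 3 underflows to 0xfffffffd, and masking with 3 gives a probe length of 1, as expected.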

Conclusion

If you have questions or comments, please reach out to me on Twitter or open a new issue in the repository.

This code was inspired by several excellent articles.

In the future I will continue writing about hash table implementations for GPUs and analyzing their performance. My plans include chaining, Robin Hood hashing, and cuckoo hashing, using atomic operations in GPU-friendly data structures.

Source: www.habr.com
