Cassandra. How not to die if you only know Oracle

Hello, Habr.

My name is Misha Butrimov, and I'd like to tell you a little about Cassandra. My story will be useful to those who have never dealt with NoSQL databases: Cassandra has many implementation features and pitfalls you need to know about. And if you have never seen anything other than Oracle or some other relational database, these things will save your life.

What is good about Cassandra? It is a NoSQL database designed without a single point of failure, and it scales well. Need to add a couple of terabytes to some database? Just add nodes to the ring. Need to extend it to another data center? Add nodes to the cluster. Need to handle more RPS? Add nodes to the cluster. It works in the opposite direction too.

What else is it good at? Handling a lot of requests. But how many is a lot? 10, 20, 30, or 40 thousand requests per second is not a lot. 100 thousand write requests per second is not a lot either. There are companies that say they handle 2 million requests per second. We probably have to take their word for it.

And fundamentally, Cassandra is very different from relational databases: it is nothing like them. This is very important to remember.

Not everything that looks the same works the same

A colleague once came to me and asked: "Here is CQL, the Cassandra query language. It has a select statement, it has where, it has and. I type the letters and it doesn't work. Why?" Treating Cassandra like a relational database is the ideal way to do yourself in. And I'm not advocating that; it's banned in Russia. You will simply design something wrong.

For example, a customer comes to us and says: "Let's build a database for TV series, or a database for a recipe catalog. We'll have dishes in it, or a list of TV series and the actors in them." We happily say: "Let's go!" It's just two bytes to send, a couple of labels, and you're done; everything will work quickly and reliably. And everything is fine until the customers come back and say that housewives are also solving the inverse problem: they have a list of products, and they want to know which dish they can cook from it. You're dead.

This is because Cassandra is a hybrid database: it offers key-value access and at the same time stores data in wide columns. In Java or Kotlin, it could be described like this:

Map<RowKey, SortedMap<ColumnKey, ColumnValue>>

That is, a map that contains another, sorted map inside it. The first key of this map is the Row key, also called the Partition key. The second key, the key of the sorted map, is the Clustering key.
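To make the map-inside-a-map picture concrete, here is a minimal Python sketch of that structure. It is only a toy model, not how Cassandra stores data on disk; the class name `WideRowStore`, its methods, and the sample keys are hypothetical and exist only for this illustration.

```python
from bisect import insort


class WideRowStore:
    """Toy model of Map<RowKey, SortedMap<ColumnKey, ColumnValue>>."""

    def __init__(self):
        # partition (row) key -> (sorted list of clustering keys, values by key)
        self._rows = {}

    def put(self, row_key, column_key, value):
        keys, values = self._rows.setdefault(row_key, ([], {}))
        if column_key not in values:
            insort(keys, column_key)  # keep the inner keys in sorted order
        values[column_key] = value

    def row(self, row_key):
        """Return the whole inner map for one partition key, in key order."""
        keys, values = self._rows.get(row_key, ([], {}))
        return [(key, values[key]) for key in keys]


store = WideRowStore()
store.put("series:1", "s01e02", "Second episode")
store.put("series:1", "s01e01", "Pilot")
store.put("series:2", "s01e01", "Another pilot")

# Inner entries come back sorted by the second key, not in insertion order.
print(store.row("series:1"))
```

The outer lookup is a plain hash lookup, while the inner entries always come back in sorted order regardless of the order in which they were written.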

To illustrate the distributed nature of the database, let's draw three nodes. Now we need to understand how to spread the data across the nodes. If we cram everything into one node (by the way, there can be a thousand of them, two thousand, five thousand, as many as you like), that is not really distribution. So we need a mathematical function that returns a number. Just a number, a long int that falls into some range. One node will be responsible for one range, the second node for the second range, and the n-th node for the n-th range.

This number is obtained with a hash function applied to what we call the Partition key. This is the column named in the Primary key directive, and it is the column that becomes the first and most fundamental key of the map. It determines which node receives which data. A table is created in Cassandra with almost the same syntax as in SQL:

CREATE TABLE users (
	user_id uuid,
	name text,
	year int,
	salary float,
	PRIMARY KEY (user_id)
);

The Primary key in this case consists of a single column, which is also the partition key.

How will our users behave? Some will go to one node, some to another, and some to a third. The result is an ordinary hash table, also known as a map, also known as a dictionary in Python: a simple key-value structure from which we can read all the values, and which we can read and write by key.
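The placement described above, hashing the partition key into a number and letting each node own a range of those numbers, can be sketched in a few lines of Python. This is an illustration under stated assumptions: MD5 stands in for the hash function (Cassandra's default partitioner uses Murmur3), the ring is cut into three equal ranges, and the node names and helper functions are hypothetical.

```python
import hashlib
from bisect import bisect_right

RING_SIZE = 2 ** 64  # tokens live in [0, RING_SIZE)


def token(partition_key: str) -> int:
    """Hash the partition key into a number on the ring (MD5 as a stand-in)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


# Three hypothetical nodes; node i owns tokens below its upper bound
# and at or above the previous node's bound.
NODE_BOUNDS = [RING_SIZE // 3, 2 * RING_SIZE // 3, RING_SIZE]
NODE_NAMES = ["node-1", "node-2", "node-3"]


def node_for(partition_key: str) -> str:
    """Find the node whose token range contains the key's token."""
    return NODE_NAMES[bisect_right(NODE_BOUNDS, token(partition_key))]


for user_id in ["alice", "bob", "carol", "dave"]:
    print(user_id, token(user_id), node_for(user_id))
```

The same key always lands on the same node, which is what makes reads and writes by partition key cheap: the coordinator can compute where the data lives without asking every node.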

Select: when allow filtering turns into a full scan, or what not to do

Let's write a select statement: select * from users where user_id = ?. It turns out just like in Oracle: we write select, specify the conditions, everything works, and we get our users. But if you try to select, for example, the users with a certain year of birth, Cassandra complains that it cannot execute the query. It knows nothing about how our data is distributed by year of birth, because only one column was declared as the key. Then it says: "Okay, I can still execute this query. Add allow filtering." We add the directive and everything works. And at this moment something terrible happens.

When we run on test data, everything is fine. But when the query runs in production, where we have, say, 4 million records, things are not so good. Allow filtering is a directive that lets Cassandra collect all the data of this table from all nodes and all data centers (if the cluster has several), and only then filter it. This is the analogue of a Full Scan, and hardly anyone is happy with it.
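The difference between the two queries can be felt with a small simulation. The sketch below models the users table as a Python dictionary keyed by the partition key and counts how many rows each query has to touch. It does not talk to Cassandra; the function names and the generated sample data are assumptions made for the illustration.

```python
import uuid

# Simulated users table: partition key (user_id) -> row.
users = {}
for i in range(1000):
    uid = uuid.UUID(int=i)
    users[uid] = {"user_id": uid, "name": f"user{i}", "year": 1970 + i % 40}


def select_by_id(user_id):
    """WHERE user_id = ?: a single lookup by partition key."""
    row = users.get(user_id)
    rows_touched = 1
    return ([row] if row else []), rows_touched


def select_by_year_allow_filtering(year):
    """WHERE year = ? ALLOW FILTERING: read every row, then filter."""
    rows_touched = 0
    result = []
    for row in users.values():
        rows_touched += 1
        if row["year"] == year:
            result.append(row)
    return result, rows_touched


rows, touched = select_by_id(uuid.UUID(int=5))
print(len(rows), "row found,", touched, "row touched")

rows, touched = select_by_year_allow_filtering(1975)
print(len(rows), "rows found,", touched, "rows touched")
```

With a thousand rows nobody notices the difference. With millions of rows spread over many nodes and data centers, the second query has to pull all of them over the network before it can discard most of them.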

If we only ever needed users by ID, this would be enough. But sometimes we need to write other queries and put other restrictions on the selection. So we recall: all of this is a map that has a partition key, but inside it there is a sorted map.

And that inner map has a key too, called the Clustering Key. This key, which in turn consists of the columns we choose, helps Cassandra understand how its data is physically sorted and laid out on each node. That is, for a given Partition key, the Clustering key tells it exactly how to push data into this tree and which place the data will take there.

It really is a tree. A comparator is simply called there, to which we pass a certain set of columns as an object, and that set is also declared as a list of columns.

CREATE TABLE users_by_year_salary_id (
	user_id uuid,
	name text,
	year int,
	salary float,
	PRIMARY KEY ((year), salary, user_id)
);

Pay attention to the Primary key directive: its first argument (in our case, the year) is always the Partition key. It can consist of one column or several, it doesn't matter. If there are several columns, they must be wrapped in one more pair of parentheses so that the language preprocessor understands that this group is the Partition key and that all the columns after it are the Clustering key. The clustering columns are passed to the comparator in the order in which they are listed. That is, the first column is the most significant, the second is less significant, and so on. It is similar to how we write, for example, comparison for data classes: we list the fields and define for them which values are greater and which are smaller. In Cassandra these are, roughly speaking, the fields of a data class, to which the comparison written for it will be applied.
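The comparator idea maps naturally onto tuple comparison in Python, where the first element is the most significant. The sketch below groups a few hypothetical rows by the partition key (year) and sorts each group by the clustering key (salary, user_id), mirroring the PRIMARY KEY ((year), salary, user_id) declaration. Short strings are used as user IDs instead of UUIDs purely for readability.

```python
rows = [
    {"year": 1990, "salary": 3000.0, "user_id": "c"},
    {"year": 1990, "salary": 1500.0, "user_id": "b"},
    {"year": 1985, "salary": 3000.0, "user_id": "a"},
    {"year": 1990, "salary": 1500.0, "user_id": "a"},
]


def clustering_key(row):
    """Columns in declaration order: salary first, then user_id."""
    return (row["salary"], row["user_id"])


# Group rows by the partition key, then sort each partition by the clustering key.
partitions = {}
for row in rows:
    partitions.setdefault(row["year"], []).append(row)
for partition in partitions.values():
    partition.sort(key=clustering_key)

ordered_1990 = [clustering_key(row) for row in partitions[1990]]
print(ordered_1990)
```

Rows with equal salary are ordered by user_id, and user_id never outranks salary, which is exactly what "the first column is the most significant" means.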

Setting the sort order and imposing restrictions

Remember that the sort order (descending, ascending, whatever) is set at the moment the key is created and cannot be changed later. It physically determines how the data is sorted and how it is stored. If you need to change the Clustering key or the sort order, you have to create a new table and migrate the data into it. It cannot be done on an existing table.

We fill our table with users and see that they land in the ring first by year of birth, and then, within each node, by salary and user ID. Now we can select by imposing restrictions.

Our where and and clauses work again, we get our users, and everything is fine once more. But if we try to use only part of the Clustering key, and a less significant part at that, Cassandra immediately complains that it cannot find the place in our map where such an object lies: an object whose leading comparator fields are null and which has only the field we just set. It would again have to pull all the data from this node and filter it. This is the analogue of a Full Scan within a single node, and that is bad.
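Why the leading clustering column matters can be shown with a sorted list and binary search. In this sketch one partition is a list of (salary, user_id) tuples kept in clustering order. A restriction on salary can jump straight to the right place, while a restriction on user_id alone has nothing to search by and must walk the whole partition. The data and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# One partition (year = 1990), physically sorted by (salary, user_id).
partition = sorted([(1500.0, "a"), (1500.0, "b"), (2000.0, "d"), (3000.0, "c")])
salaries = [salary for salary, _ in partition]


def by_salary(salary):
    """Restriction on the leading clustering column: binary search."""
    lo = bisect_left(salaries, salary)
    hi = bisect_right(salaries, salary)
    return partition[lo:hi]


def by_user_id_only(user_id):
    """Restriction that skips salary: the sort order is no help, so scan."""
    return [row for row in partition if row[1] == user_id]


print(by_salary(1500.0))
print(by_user_id_only("c"))
```

The second function is the in-partition equivalent of allow filtering: it returns the right answer, but only after looking at every row.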

In any unclear situation, create a new table

If we want to be able to look up users by ID, or by age, or by salary, what should we do? Nothing special. Just use two tables. If you need to reach users in three different ways, there will be three tables. Gone are the days when we saved space on disk. Disk is the cheapest resource. It costs far less than response time, which can really hurt the user. It is much more pleasant for the user to get something in a second than in 10 minutes.
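The "one table per query" approach can be sketched as follows: every write goes into each query table, and every read uses the table whose key matches the question. The two dictionaries stand in for a users-by-id table and a users-by-year table; all names here are assumptions made for the illustration, and a real application would issue the writes through a Cassandra driver.

```python
# Two simulated query tables holding the same users under different keys.
users_by_id = {}    # partition key: user_id
users_by_year = {}  # partition key: year, clustering key: (salary, user_id)


def save_user(user):
    """Write the same row into every table that serves a query."""
    users_by_id[user["user_id"]] = user
    partition = users_by_year.setdefault(user["year"], {})
    partition[(user["salary"], user["user_id"])] = user


def find_by_id(user_id):
    return users_by_id.get(user_id)


def find_by_year(year):
    """Rows of one partition, returned in clustering order."""
    partition = users_by_year.get(year, {})
    return [partition[key] for key in sorted(partition)]


save_user({"user_id": "u1", "name": "Ann", "year": 1990, "salary": 2000.0})
save_user({"user_id": "u2", "name": "Bob", "year": 1990, "salary": 1500.0})
save_user({"user_id": "u3", "name": "Eve", "year": 1985, "salary": 1800.0})

print(find_by_id("u3")["name"])
print([user["name"] for user in find_by_year(1990)])
```

The data is stored twice, but neither query ever has to scan: each one goes straight to its own partition.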

We trade extra disk space and denormalized data for the ability to scale well and work reliably. After all, a cluster of three data centers with five nodes each, at an acceptable level of data safety (where nothing is lost), can survive the complete death of one data center. Plus two more nodes in each of the remaining two. Only after that do the problems begin. This is quite good redundancy, and it is worth a couple of extra SSD drives and processors. So, in order to use Cassandra, which is in no way SQL and has no relations or foreign keys, you need to know a few simple rules.

We design everything starting from the query. What matters most is not the data, but how the application is going to work with it. If the application needs to get different data in different ways, or the same data in different ways, we must lay the data out in a way that is convenient for the application. Otherwise we will fall into a Full Scan and Cassandra will give us no advantage.

Denormalizing data is the norm. We forget about normal forms; we no longer have a relational database. If we write something down 100 times, it will be stored 100 times. That is still cheaper than downtime.

We choose partition keys so that the data is distributed evenly. We don't want the hashes of our keys to fall into one narrow range. In that sense, the year of birth in the example above is a bad example. More precisely, it is fine if our users are spread evenly across years of birth, and bad if we are talking about 5th grade students: the partitioning there will not be very good.
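The fifth-graders problem can be illustrated by counting how many rows end up on each node for a narrow key and for a wide one. The sketch below is a simplification: it places a key on a node by taking an MD5-based hash modulo the node count rather than by token ranges, and the node count, the two birth years, and the helper names are all hypothetical.

```python
import hashlib
from collections import Counter

NODES = 3  # hypothetical cluster size


def node_index(partition_key) -> int:
    """Simplified placement: hash of the partition key modulo the node count."""
    digest = hashlib.md5(str(partition_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NODES


def rows_per_node(partition_keys):
    return Counter(node_index(key) for key in partition_keys)


# 9000 fifth-graders, all born in one of two years: only two distinct keys.
narrow = rows_per_node(2013 + i % 2 for i in range(9000))

# The same 9000 rows partitioned by a unique user id.
wide = rows_per_node(f"user-{i}" for i in range(9000))

print("by year of birth:", dict(narrow))
print("by user id:      ", dict(wide))
```

With only two distinct key values, at most two nodes can ever hold data, no matter how many nodes the cluster has, while unique IDs spread the rows across all of them.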

The sort order is chosen once, when the Clustering Key is created. If it needs to change, we will have to rebuild our table with a different key.

And most importantly: if we need to fetch the same data in 100 different ways, then we will have 100 different tables.

Source: www.habr.com
