If you have ever used web interfaces to view logs, you have probably noticed how cumbersome they are and how often they are neither particularly user-friendly nor responsive. Some you can get used to, others are simply terrible, but I believe the root of all these problems is that we approach log viewing the wrong way: we try to build a web interface where a CLI (command line interface) works better. Personally, I am very comfortable working with tail, grep, awk, and the like, so for me the ideal interface for working with logs would be something similar to tail and grep, but one that can also be used to read logs coming from many servers. That means, of course, reading them from ClickHouse!
* in the personal opinion of a Habr user
Meet logscli
I did not come up with a name for my interface, and to be honest it is more of a prototype, but if you want to see the source code right away, you are welcome: (350 lines of select Go code).
Features
My goal was to build an interface that feels familiar to anyone used to tail/grep, which means supporting the following:
- View all logs, without filtering.
- Keep only lines containing a fixed substring (the -F flag of grep).
- Keep only lines matching a regular expression (the -E flag of grep).
- By default, viewing goes in reverse chronological order, since the most recent logs are usually the ones of interest first.
- Show context around each line (the -A, -B, and -C options of grep, which print N lines after, before, and around each matching line, respectively).
- View incoming logs in real time, with or without filtering (essentially tail -f | grep).
- The interface must be compatible with less, head, tail, and others: by default, results are returned with no limit on their number; lines are printed as a stream for as long as the user is interested in receiving them; the SIGPIPE signal should silently interrupt log streaming, just as tail, grep, and other UNIX utilities do.
Implementation
I will assume that you already have a way to deliver logs to ClickHouse. If not, I recommend trying … and … .
First, you need to decide on the database schema. Since logs are usually sorted by time, it seems logical to store them that way. If you have many log categories and they all share the same structure, you can make the log category the first column of the primary key. This lets you keep a single table instead of several, which is a big advantage when inserting into ClickHouse (on servers with hard drives, it is recommended to insert data no more often than ~1 time per second for the whole server).
That is, we need roughly the following table schema:
CREATE TABLE logs(
category LowCardinality(String), -- log category (optional)
time DateTime, -- event time
millis UInt16, -- milliseconds (could also be microseconds, etc.): worth storing if there are many events, so they are easier to tell apart
..., -- your own fields, for example server name, log level, and so on
message String -- message text
) ENGINE=MergeTree()
ORDER BY (category, time, millis)
Unfortunately, I could not immediately find any open sources of realistic logs that I could download, so I used Amazon product reviews as an example instead. Of course, their structure is not quite the same as that of text logs, but for illustration purposes this does not matter.
Instructions for loading the Amazon reviews into ClickHouse
Let's create a table:
CREATE TABLE amazon(
review_date Date,
time DateTime DEFAULT toDateTime(toUInt32(review_date) * 86400 + rand() % 86400),
millis UInt16 DEFAULT rand() % 1000,
marketplace LowCardinality(String),
customer_id Int64,
review_id String,
product_id LowCardinality(String),
product_parent Int64,
product_title String,
product_category LowCardinality(String),
star_rating UInt8,
helpful_votes UInt32,
total_votes UInt32,
vine FixedString(1),
verified_purchase FixedString(1),
review_headline String,
review_body String
)
ENGINE=MergeTree()
ORDER BY (time, millis)
SETTINGS index_granularity=8192
The Amazon dataset contains the review date but not the exact time, so we fill in the time with random values.
You do not need to download all the TSV files; you can limit yourself to the first 10-20, which already gives a reasonably large dataset that does not fit in 16 GB of RAM. To load the TSV files, I used the following command:
for i in *.tsv; do
echo "$i";
tail -n +2 "$i" | pv |
clickhouse-client --input_format_allow_errors_ratio 0.5 --query='INSERT INTO amazon(marketplace,customer_id,review_id,product_id,product_parent,product_title,product_category,star_rating,helpful_votes,total_votes,vine,verified_purchase,review_headline,review_body,review_date) FORMAT TabSeparated'
done
On a standard Persistent Disk (which is an HDD) in Google Cloud with a size of 1000 GB (I chose this size mainly to get somewhat higher speed, although an SSD of the required capacity might have been cheaper), the load speed was roughly ~75 MB/sec on 4 cores.
- I should point out that I work at Google, but I used a personal account, and this article has nothing to do with my work at the company.
I will produce all illustrations using this dataset, since it is what I had at hand.
Showing data scan progress
Since we will be doing a full scan of the log table in ClickHouse, and this operation can take a considerable amount of time and may not return any results for a long while if few matches are found, it is advisable to show query progress until the first result rows arrive. For this purpose, the HTTP interface has a parameter that sends progress in HTTP headers: send_progress_in_http_headers=1. Unfortunately, the Go standard library cannot read headers as they arrive, but ClickHouse supports the HTTP 1.0 interface (not to be confused with 1.1!), so you can open a raw TCP connection to ClickHouse, send GET /?query=... HTTP/1.0\n\n there, and receive the response headers and body without any escaping or encryption. In this case you do not even need the standard library's HTTP client.
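The two pieces of that exchange can be sketched in Go as follows. This is a minimal illustration, not the code logscli uses: the helper names buildRequest and parseProgressHeader are hypothetical, and the header name X-ClickHouse-Progress with the JSON fields read_rows and total_rows_to_read reflects my understanding of what ClickHouse sends, so check it against your server version. In real use the request would be written to a connection obtained from net.Dial("tcp", ...) and the response read line by line.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strings"
)

// progress holds the fields of interest from a progress header.
// json.Number accepts both quoted and unquoted numeric values.
type progress struct {
	ReadRows  json.Number `json:"read_rows"`
	TotalRows json.Number `json:"total_rows_to_read"`
}

// buildRequest (hypothetical helper) returns the raw HTTP/1.0 request text
// that asks ClickHouse to report progress in response headers.
func buildRequest(query string) string {
	v := url.Values{}
	v.Set("query", query)
	v.Set("send_progress_in_http_headers", "1")
	return "GET /?" + v.Encode() + " HTTP/1.0\n\n"
}

// parseProgressHeader (hypothetical helper) extracts progress from a single
// response header line; ok is false for any other header or malformed JSON.
func parseProgressHeader(line string) (p progress, ok bool) {
	name, value, found := strings.Cut(strings.TrimRight(line, "\r\n"), ":")
	if !found || !strings.EqualFold(name, "X-ClickHouse-Progress") {
		return progress{}, false
	}
	if err := json.Unmarshal([]byte(strings.TrimSpace(value)), &p); err != nil {
		return progress{}, false
	}
	return p, true
}

func main() {
	fmt.Printf("%q\n", buildRequest("SELECT 1"))
	p, ok := parseProgressHeader(`X-ClickHouse-Progress: {"read_rows":"100","total_rows_to_read":"400"}`)
	fmt.Println(p.ReadRows, p.TotalRows, ok)
}
```

Header lines end at the first empty line; everything after it is the response body, which can be copied to stdout unchanged.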
Streaming logs from ClickHouse
ClickHouse has had an optimization for ORDER BY queries for quite a while now (since 2019?), so a query like
SELECT time, millis, message
FROM logs
WHERE message LIKE '%something%'
ORDER BY time DESC, millis DESC
will immediately start returning rows that contain the substring "something" in the message, without waiting for the scan to finish.
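As a rough sketch, such a query could be assembled from the filter options like this. The function names are hypothetical, and the choice of the ClickHouse functions position() for fixed substrings and match() for regular expressions is my assumption rather than something the article states; the query shown above uses LIKE instead.

```go
package main

import (
	"fmt"
	"strings"
)

// quote renders s as a ClickHouse string literal,
// escaping backslashes and single quotes with a backslash.
func quote(s string) string {
	return "'" + strings.NewReplacer(`\`, `\\`, `'`, `\'`).Replace(s) + "'"
}

// buildQuery (hypothetical helper) combines a fixed substring, a regular
// expression, and a raw SQL condition into one streaming-friendly query.
// Empty arguments are skipped. The raw condition is trusted as-is.
func buildQuery(fixed, regex, where string) string {
	var conds []string
	if fixed != "" {
		conds = append(conds, "position(message, "+quote(fixed)+") > 0")
	}
	if regex != "" {
		conds = append(conds, "match(message, "+quote(regex)+")")
	}
	if where != "" {
		conds = append(conds, "("+where+")")
	}
	q := "SELECT time, millis, message FROM logs"
	if len(conds) > 0 {
		q += " WHERE " + strings.Join(conds, " AND ")
	}
	// Sorting by the table's ORDER BY key lets ClickHouse stream rows early.
	return q + " ORDER BY time DESC, millis DESC"
}

func main() {
	fmt.Println(buildQuery("something", "", ""))
}
```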
It would also be very convenient if ClickHouse cancelled the query by itself when the connection to it is closed, but this is not the default behavior. Automatic query cancellation can be enabled with the option cancel_http_readonly_queries_on_client_close=1.
Handling SIGPIPE correctly in Go
When you run, say, the command some_cmd | head -n 10, how exactly does some_cmd stop executing once head has read 10 lines? The answer is simple: when head finishes, the pipe is closed, and the stdout of some_cmd starts pointing, so to speak, "nowhere". When some_cmd tries to write to the closed pipe, it receives a SIGPIPE signal, which by default silently terminates the program.
In Go this also happens by default, but the SIGPIPE signal handler additionally prints "signal: SIGPIPE" or a similar message at the end. To get rid of this message, you just need to handle SIGPIPE yourself the way you want, that is, exit silently:
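// Requires the "os", "os/signal", and "syscall" imports.
ch := make(chan os.Signal, 1) // buffered: signal.Notify does not block when sending
signal.Notify(ch, syscall.SIGPIPE)
go func() {
	<-ch       // wait for SIGPIPE
	os.Exit(0) // exit silently
}()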
Showing message context
You often want to see the context in which an error occurred (for example, which request caused a panic, or which related problems were visible before a crash). In grep, the -A, -B, and -C options serve this purpose: they show the specified number of lines after, before, and around the message, respectively.
Unfortunately, I have not found a simple way to do the same in ClickHouse, so to show context, an additional query is sent for each result row, something like this (the details depend on the sort order and on whether the context is shown before or after):
SELECT time,millis,review_body FROM amazon
WHERE (time = 'EVENT_TIME' AND millis < EVENT_MILLIS) OR (time < 'EVENT_TIME')
ORDER BY time DESC, millis DESC
LIMIT CONTEXT_LINE_COUNT
SETTINGS max_threads=1
Since the query is sent almost immediately after ClickHouse returns the corresponding row, the data is still in the cache, so the query generally runs quite fast and uses little CPU (a query typically takes ~6 ms on my virtual machine).
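To make the "before or after" detail concrete, here is a small sketch that builds the context query in either direction. The function name contextQuery and its parameters are hypothetical; the query for older rows follows the one shown above, while the variant for newer rows (flipped comparisons and ascending order) is my assumption about how the other direction would look.

```go
package main

import "fmt"

// contextQuery (hypothetical helper) returns a query for n rows next to the
// event at (eventTime, eventMillis). With newer=false it selects the rows
// just before the event; with newer=true, the rows just after it.
func contextQuery(eventTime string, eventMillis, n int, newer bool) string {
	cmp, dir := "<", "DESC"
	if newer {
		cmp, dir = ">", "ASC"
	}
	return fmt.Sprintf("SELECT time,millis,review_body FROM amazon "+
		"WHERE (time = '%[1]s' AND millis %[3]s %[2]d) OR (time %[3]s '%[1]s') "+
		"ORDER BY time %[4]s, millis %[4]s LIMIT %[5]d SETTINGS max_threads=1",
		eventTime, eventMillis, cmp, dir, n)
}

func main() {
	fmt.Println(contextQuery("2015-08-31 12:00:00", 250, 3, false))
	fmt.Println(contextQuery("2015-08-31 12:00:00", 250, 3, true))
}
```

Rows fetched in ascending order for the "after" direction would need to be reversed before printing if the main output is in reverse chronological order.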
Showing new messages in real time
To show incoming messages in (almost) real time, we simply run the query every few seconds, remembering the last timestamp we have already seen.
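A minimal sketch of such a polling loop is shown below. All names here (row, newerThanQuery, follow) are hypothetical, the fetch function stands in for whatever actually sends the query to ClickHouse, and the loop runs a fixed number of rounds only so that the example terminates; a real tool would loop until interrupted.

```go
package main

import (
	"fmt"
	"time"
)

// row is one log line; Time uses the "YYYY-MM-DD hh:mm:ss" text form.
type row struct {
	Time    string
	Millis  int
	Message string
}

// newerThanQuery selects rows strictly newer than the last row already shown,
// oldest first, so they can be printed in arrival order.
func newerThanQuery(lastTime string, lastMillis int) string {
	return fmt.Sprintf("SELECT time, millis, message FROM logs "+
		"WHERE (time = '%[1]s' AND millis > %[2]d) OR (time > '%[1]s') "+
		"ORDER BY time, millis", lastTime, lastMillis)
}

// follow polls `rounds` times, passing each new row to show and remembering
// the position of the last row seen. It returns that final position.
func follow(fetch func(query string) []row, show func(row),
	lastTime string, lastMillis int, interval time.Duration, rounds int) (string, int) {
	for i := 0; i < rounds; i++ {
		for _, r := range fetch(newerThanQuery(lastTime, lastMillis)) {
			show(r)
			lastTime, lastMillis = r.Time, r.Millis
		}
		time.Sleep(interval)
	}
	return lastTime, lastMillis
}

func main() {
	// A fake fetch that returns one row on the first poll and nothing afterwards.
	served := false
	fetch := func(query string) []row {
		if served {
			return nil
		}
		served = true
		return []row{{Time: "2015-08-31 12:00:01", Millis: 5, Message: "hello"}}
	}
	show := func(r row) { fmt.Println(r.Time, r.Millis, r.Message) }
	follow(fetch, show, "2015-08-31 12:00:00", 0, 10*time.Millisecond, 3)
}
```

One limitation of this scheme: rows that share exactly the same time and millis as the last row shown, but arrive later, are skipped, which is another reason to store sub-second precision.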
Command examples
What do typical logscli commands look like?
If you loaded the Amazon dataset that I mentioned at the beginning of the article, you can run the following commands:
# Show lines containing the word walmart
$ logscli -F 'walmart' | less
# Show the 10 most recent lines containing "terrible"
$ logscli -F terrible -limit 10
# The same without -limit:
$ logscli -F terrible | head -n 10
# Show all lines matching /times [0-9]/ that were written for vine and have a high rating
$ logscli -E 'times [0-9]' -where="vine='Y' AND star_rating>4" | less
# Show all lines with the word "panic" and 3 lines of context around each
$ logscli -F 'panic' -C 3 | less
# Continuously show new lines containing "5-star"
$ logscli -F '5-star' -tailf
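For readers curious how such a command line maps onto Go, here is a hedged sketch using the standard flag package. The flag names are the ones that appear in the examples and the feature list; the options struct, the parseArgs helper, the default values, and the help strings are hypothetical and not taken from the logscli source.

```go
package main

import (
	"flag"
	"fmt"
)

// options collects the command-line settings (hypothetical structure).
type options struct {
	Fixed, Regex, Where           string
	Limit, After, Before, Context int
	Follow                        bool
}

// parseArgs (hypothetical helper) parses grep-like single-dash flags.
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("logscli", flag.ContinueOnError)
	fs.StringVar(&o.Fixed, "F", "", "keep lines containing this fixed substring")
	fs.StringVar(&o.Regex, "E", "", "keep lines matching this regular expression")
	fs.StringVar(&o.Where, "where", "", "additional SQL condition")
	fs.IntVar(&o.Limit, "limit", 0, "maximum number of lines (0 means no limit)")
	fs.IntVar(&o.After, "A", 0, "lines of context after each match")
	fs.IntVar(&o.Before, "B", 0, "lines of context before each match")
	fs.IntVar(&o.Context, "C", 0, "lines of context around each match")
	fs.BoolVar(&o.Follow, "tailf", false, "keep showing new lines as they arrive")
	err := fs.Parse(args)
	return o, err
}

func main() {
	o, err := parseArgs([]string{"-F", "panic", "-C", "3"})
	fmt.Println(o.Fixed, o.Context, err)
}
```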
Links
The utility's code (without documentation) is available on GitHub. I would love to hear your thoughts on my idea of a console interface, based on ClickHouse, for viewing logs.
Source: www.habr.com
