Building the easiest interface in the world* for viewing logs

If you have ever used web interfaces for viewing logs, you have probably noticed that, as a rule, these interfaces are heavy and (often) not very convenient or responsive. Some you can get used to, some are downright terrible, but it seems to me that the root of all the problems is that we approach the task of viewing logs incorrectly: we try to build a web interface where a CLI (command line interface) works better. I am personally very comfortable working with tail, grep, awk and others, so for me the ideal interface for working with logs would be something similar to tail and grep, but which could also be used to read logs that came from many servers. That is, of course, read them from ClickHouse!

*in the opinion of one crazy Habr user

Meet logscli

I did not come up with a name for my interface, and, to be honest, it exists more as a prototype, but if you want to look at the source code right away, you are welcome: https://github.com/YuriyNasretdinov/logscli (350 lines of selected Go code).

Features

My goal was to make an interface that would feel familiar to those who are used to tail/grep, that is, to support the following:

  1. View all logs, without filtering.
  2. Keep only lines that contain a fixed substring (the -F flag in grep).
  3. Keep only lines that match a regular expression (the -E flag in grep).
  4. By default, output goes in reverse chronological order, since the most recent logs are usually of interest first.
  5. Show context around each line (the -A, -B and -C options in grep, which print N lines after, before, and around each matching line, respectively).
  6. View incoming logs in real time, with or without filtering (essentially tail -f | grep).
  7. The interface must be compatible with less, head, tail and others: by default, results should be returned without any limit on their number; lines are printed as a stream for as long as the user is interested in receiving them; the SIGPIPE signal should silently interrupt log streaming, just as tail, grep and other UNIX utilities do.

Implementation

I will assume that you already somehow know how to deliver logs to ClickHouse. If not, I recommend trying LSD and kittenhouse, as well as this article about log delivery.

First you need to decide on the base schema. Since you usually want to receive logs sorted by time, it seems logical to store them that way. If there are many log categories and they are all of the same type, you can make the log category the first column of the primary key. This lets you have one table instead of several, which is a big plus when inserting into ClickHouse (on servers with hard drives, it is recommended to insert data no more than about once per second for the entire server).

That is, we need roughly the following table schema:

CREATE TABLE logs(
    category LowCardinality(String), -- log category (optional)
    time DateTime, -- event time
    millis UInt16, -- milliseconds (could also be microseconds, etc.): recommended if there are many events, to make them easier to tell apart
    ..., -- your own fields, for example server name, log level, and so on
    message String -- message text
) ENGINE=MergeTree()
ORDER BY (category, time, millis)

Unfortunately, I could not find any open sources with realistic logs that I could simply download, so I took reviews of Amazon products from before 2015 as an example instead. Of course, their structure is not quite the same as that of text logs, but for illustration purposes this does not matter.

Instructions for loading Amazon reviews into ClickHouse

Let's create the table:

CREATE TABLE amazon(
   review_date Date,
   time DateTime DEFAULT toDateTime(toUInt32(review_date) * 86400 + rand() % 86400),
   millis UInt16 DEFAULT rand() % 1000,
   marketplace LowCardinality(String),
   customer_id Int64,
   review_id String,
   product_id LowCardinality(String),
   product_parent Int64,
   product_title String,
   product_category LowCardinality(String),
   star_rating UInt8,
   helpful_votes UInt32,
   total_votes UInt32,
   vine FixedString(1),
   verified_purchase FixedString(1),
   review_headline String,
   review_body String
)
ENGINE=MergeTree()
ORDER BY (time, millis)
SETTINGS index_granularity=8192

The Amazon dataset contains only the review date, with no exact time, so we fill in the time of day and the milliseconds with random values.
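The DEFAULT expression for the time column can be read as plain arithmetic: the number of days since the Unix epoch times 86400, plus a random second within that day. The Go sketch below mirrors that calculation for illustration only; the function name randomTimestamp and its parameters are hypothetical, and r stands in for the value ClickHouse's rand() would produce.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// randomTimestamp mirrors the DEFAULT expression of the time column:
// days since the Unix epoch times 86400, plus a random second of the day.
// r is a stand-in for the value returned by ClickHouse's rand().
func randomTimestamp(daysSinceEpoch uint32, r uint32) time.Time {
	seconds := int64(daysSinceEpoch)*86400 + int64(r%86400)
	return time.Unix(seconds, 0).UTC()
}

func main() {
	// 2015-08-31 is 16678 days after 1970-01-01.
	ts := randomTimestamp(16678, rand.Uint32())
	fmt.Println(ts.Format("2006-01-02 15:04:05"))
}
```

Whatever value r takes, the result stays within the review's own date, so sorting by time still groups reviews by day.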

You do not have to download all the tsv files; you can limit yourself to the first ~10-20 to get a reasonably large data set that does not fit into 16 GB of RAM. To load the TSV files I used the following command:

for i in *.tsv; do
    echo "$i"
    # Skip the header line of each file and show throughput with pv.
    tail -n +2 "$i" | pv |
    clickhouse-client --input_format_allow_errors_ratio 0.5 --query='INSERT INTO amazon(marketplace,customer_id,review_id,product_id,product_parent,product_title,product_category,star_rating,helpful_votes,total_votes,vine,verified_purchase,review_headline,review_body,review_date) FORMAT TabSeparated'
done

On a standard Persistent Disk (which is an HDD) in Google Cloud with a size of 1000 GB (I took this size mainly so that the speed would be a little higher, although an SSD of the required size might have been cheaper), the loading speed was about 75 MB/sec on 4 cores.

  • I should mention that I work at Google, but I used a personal account and this article has nothing to do with my work at the company.

I will produce all illustrations with this dataset, since it is all I had at hand.

Showing data scan progress

Since in ClickHouse we will be doing a full scan over the table with logs, and this operation can take a considerable amount of time and may not return any results for a long time if few matches are found, it is desirable to be able to show the progress of the query until the first rows with results have been received. For this, the HTTP interface has a parameter that lets you send progress in HTTP headers: send_progress_in_http_headers=1. Unfortunately, the standard Go library cannot read headers as they arrive, but ClickHouse supports the HTTP 1.0 interface (not to be confused with 1.1!), so you can open a raw TCP connection to ClickHouse, send GET /?query=... HTTP/1.0\n\n over it, and receive the response headers and body without any escaping or encryption, so in this case we do not even need the standard library.
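As a rough illustration of reading progress by hand, the sketch below builds the raw HTTP/1.0 request and decodes a single progress header line. It assumes the header is named X-ClickHouse-Progress and carries a JSON object whose counters are encoded as strings; the helper names buildRequest and parseProgressLine are hypothetical and are not taken from logscli.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// progress holds the counters from one progress header.
// The numbers are assumed to arrive as JSON strings, hence the ",string" tag.
type progress struct {
	ReadRows  uint64 `json:"read_rows,string"`
	ReadBytes uint64 `json:"read_bytes,string"`
	TotalRows uint64 `json:"total_rows_to_read,string"`
}

// buildRequest returns a raw HTTP/1.0 request for an already URL-encoded query.
func buildRequest(encodedQuery string) string {
	return "GET /?send_progress_in_http_headers=1&query=" + encodedQuery + " HTTP/1.0\n\n"
}

// parseProgressLine decodes one header line; ok is false for any other header.
func parseProgressLine(line string) (p progress, ok bool) {
	const prefix = "X-ClickHouse-Progress:"
	if !strings.HasPrefix(line, prefix) {
		return p, false
	}
	payload := strings.TrimSpace(strings.TrimPrefix(line, prefix))
	if err := json.Unmarshal([]byte(payload), &p); err != nil {
		return p, false
	}
	return p, true
}

func main() {
	line := `X-ClickHouse-Progress: {"read_rows":"2752512","read_bytes":"240570816","total_rows_to_read":"8880128"}`
	if p, ok := parseProgressLine(line); ok {
		fmt.Printf("%d of %d rows scanned\n", p.ReadRows, p.TotalRows)
	}
}
```

In a real client the request string would be written to a net.Conn and the header lines read one at a time with a bufio.Reader until the empty line that separates headers from the body.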

Streaming logs from ClickHouse

ClickHouse has had an optimization for queries with ORDER BY for quite some time (since 2019?), so a query like

SELECT time, millis, message
FROM logs
WHERE message LIKE '%something%'
ORDER BY time DESC, millis DESC

will immediately start returning rows that have the substring "something" in their message, without waiting for the scan to finish.

It would also be very convenient if ClickHouse itself cancelled the query when the connection to it was closed, but this is not the default behavior. Automatic query cancellation can be enabled with the option cancel_http_readonly_queries_on_client_close=1.
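One way to pass both settings is as ordinary URL parameters next to the query. The sketch below is a minimal example under that assumption; queryURL is a hypothetical helper, not a function from logscli.

```go
package main

import (
	"fmt"
	"net/url"
)

// queryURL builds the request path and query string for a streaming query.
func queryURL(sql string) string {
	params := url.Values{}
	params.Set("query", sql)
	// Ask the server to stop the query once the client disconnects.
	params.Set("cancel_http_readonly_queries_on_client_close", "1")
	// Ask the server to report scan progress in response headers.
	params.Set("send_progress_in_http_headers", "1")
	return "/?" + params.Encode()
}

func main() {
	fmt.Println(queryURL("SELECT time, millis, message FROM logs ORDER BY time DESC, millis DESC"))
}
```

url.Values.Encode escapes the SQL text and sorts the parameters by key, so the resulting string is stable between runs.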

Correct handling of SIGPIPE in Go

When you run, say, the command some_cmd | head -n 10, how exactly does some_cmd stop executing once head has consumed 10 lines? The answer is simple: when head exits, the pipe is closed, and the stdout of some_cmd starts pointing, so to speak, "nowhere". When some_cmd tries to write to the closed pipe, it receives a SIGPIPE signal, which by default silently terminates the program.

In Go this also happens by default, but the SIGPIPE signal handler additionally prints "signal: SIGPIPE" or a similar message at the end. To get rid of this message, we just need to handle SIGPIPE ourselves the way we want, that is, exit silently:

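// signal.Notify does not block when sending, so the channel needs a buffer.
ch := make(chan os.Signal, 1)
signal.Notify(ch, syscall.SIGPIPE)
go func() {
    <-ch
    os.Exit(0) // exit quietly instead of printing "signal: SIGPIPE"
}()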

Showing message context

Often you want to see the context in which an error occurred (for example, which request caused a panic, or what related problems were visible before the crash). In grep this is done with the -A, -B and -C options, which show the specified number of lines after, before, and around the message, respectively.

Unfortunately, I have not found an easy way to do the same in ClickHouse, so to show context, an additional query like the following is sent for every result row (the details depend on the sort order and on whether the context is shown before or after):

SELECT time,millis,review_body FROM amazon
WHERE (time = 'EVENT_TIME' AND millis < EVENT_MILLIS) OR (time < 'EVENT_TIME')
ORDER BY time DESC, millis DESC
LIMIT CONTEXT_LINE_COUNT
SETTINGS max_threads=1

Since the query is sent almost immediately after ClickHouse has returned the matching row, the data is already in the cache, so the query generally executes quickly and uses little CPU (it usually takes about 6 ms on my virtual machine).
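To make the dependence on direction concrete, here is a hedged sketch of how such a query could be generated for both the "before" and the "after" case. The function contextQuery is hypothetical, and the "after" variant (flipped comparisons and ascending order) is my reading of the description above rather than code taken from logscli.

```go
package main

import "fmt"

// contextQuery builds the query for n context rows next to one event.
// ts is the event time formatted as "YYYY-MM-DD hh:mm:ss".
// before=true selects older rows (like grep -B); false selects newer rows (like grep -A).
func contextQuery(ts string, millis uint16, n int, before bool) string {
	cmp, order := "<", "DESC"
	if !before {
		cmp, order = ">", "ASC"
	}
	return fmt.Sprintf(
		"SELECT time,millis,review_body FROM amazon "+
			"WHERE (time = '%[1]s' AND millis %[2]s %[3]d) OR (time %[2]s '%[1]s') "+
			"ORDER BY time %[4]s, millis %[4]s LIMIT %[5]d SETTINGS max_threads=1",
		ts, cmp, millis, order, n)
}

func main() {
	fmt.Println(contextQuery("2015-08-31 12:00:00", 250, 3, true))
	fmt.Println(contextQuery("2015-08-31 12:00:00", 250, 3, false))
}
```

Rows for the "after" case come back oldest first, so a caller that prints the main results newest first would need to reverse them before output.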

Showing new messages in real time

To show incoming messages in (almost) real time, we simply run a query once every few seconds, remembering the last timestamp we have seen so far.

Command examples

What do typical logscli commands look like in practice?

If you downloaded the Amazon dataset that I mentioned at the beginning of the article, you can run the following commands:
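The polling idea can be sketched as a cursor holding the newest (time, millis) pair already shown, plus a query that asks only for rows after it. The types and functions below (cursor, row, pollQuery, advance) are hypothetical names for illustration, and the query targets the generic logs table from the schema above.

```go
package main

import "fmt"

// cursor marks the newest row that has already been shown.
type cursor struct {
	Time   string // "YYYY-MM-DD hh:mm:ss"
	Millis uint16
}

// row is one log line returned by a poll; batches are ordered oldest first.
type row struct {
	Time    string
	Millis  uint16
	Message string
}

// pollQuery selects only rows strictly newer than the cursor.
func pollQuery(c cursor) string {
	return fmt.Sprintf(
		"SELECT time, millis, message FROM logs "+
			"WHERE (time = '%[1]s' AND millis > %[2]d) OR (time > '%[1]s') "+
			"ORDER BY time, millis",
		c.Time, c.Millis)
}

// advance moves the cursor to the last row of a batch; an empty batch leaves it unchanged.
func advance(c cursor, batch []row) cursor {
	if len(batch) == 0 {
		return c
	}
	last := batch[len(batch)-1]
	return cursor{Time: last.Time, Millis: last.Millis}
}

func main() {
	c := cursor{Time: "2015-08-31 12:00:00", Millis: 250}
	fmt.Println(pollQuery(c))
	// In a real loop the batch would come from ClickHouse, followed by a short sleep.
	c = advance(c, []row{{Time: "2015-08-31 12:00:01", Millis: 7, Message: "hello"}})
	fmt.Println(pollQuery(c))
}
```

One limitation of this sketch: a row that arrives later with exactly the same time and millis as the cursor would be skipped, which is one more reason to store sub-second precision.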

# Show lines that contain the word walmart
$ logscli -F 'walmart' | less

# Show the 10 most recent lines that contain "terrible"
$ logscli -F terrible -limit 10

# The same without -limit:
$ logscli -F terrible | head -n 10

# Show all lines matching /times [0-9]/ that were written for vine and have a high rating
$ logscli -E 'times [0-9]' -where="vine='Y' AND star_rating>4" | less

# Show all lines with the word "panic" and 3 lines of context around them
$ logscli -F 'panic' -C 3 | less

# Continuously show new lines with the word "5-star"
$ logscli -F '5-star' -tailf

Links

The utility's code (without documentation) is available on GitHub at https://github.com/YuriyNasretdinov/logscli. I would be glad to hear your thoughts on my idea of a console interface for viewing logs based on ClickHouse.

Source: www.habr.com
