We are building the most convenient interface in the world* for viewing logs

If you have ever used a web interface to view logs, you have probably noticed that these interfaces are, as a rule, cumbersome and (often) not very convenient or responsive. Some you can get used to, some are downright terrible, but it seems to me that the root of all the problems is that we approach the task of viewing logs the wrong way: we try to build a web interface where a CLI (command-line interface) works better. I am personally very comfortable working with tail, grep, awk and the like, so for me the ideal interface for working with logs would be something similar to tail and grep that can also read logs coming from many servers. That is, of course, read them from ClickHouse!

* in the personal opinion of Habr user youROCK

Meet logscli

I have not come up with a proper name for my interface and, to be honest, it exists more as a prototype, but if you want to look at the source code right away, you are welcome: https://github.com/YuriyNasretdinov/logscli (350 lines of select Go code).

Features

My goal was to make an interface that would feel familiar to anyone used to tail/grep, that is, to support the following:

  1. View all logs, without filtering.
  2. Keep only lines that contain a fixed substring (the -F flag of grep).
  3. Keep only lines that match a regular expression (the -E flag of grep).
  4. By default, viewing is in reverse chronological order, since the most recent logs are usually of interest first.
  5. Show context around each line (the -A, -B and -C options of grep, which print N lines after, before, and around each matching line, respectively).
  6. View incoming logs in real time, with or without filtering (essentially tail -f | grep).
  7. The interface must be compatible with less, head, tail and others: by default, results are returned without any limit on their number; lines are printed as a stream for as long as the user is interested in receiving them; the SIGPIPE signal must silently interrupt log streaming, just as it does for tail, grep and other UNIX utilities.

Implementation

I will assume that you already know how to deliver logs to ClickHouse. If not, I recommend trying lsd and kittenhouse, as well as this article about log delivery.

First you need to decide on the basic schema. Since you usually want to receive logs sorted by time, it seems logical to store them that way. If there are many log categories and they all have the same structure, you can make the log category the first column of the primary key. This lets you have one table instead of several, which is a big plus when inserting into ClickHouse (on servers with hard drives, it is recommended to insert data no more than about once per second for the whole server).

That is, we need approximately the following table schema:

CREATE TABLE logs(
    category LowCardinality(String), -- log category (optional)
    time DateTime, -- event time
    millis UInt16, -- milliseconds (could also be microseconds, etc.): recommended if there are many events, to make it easier to tell events apart
    ..., -- your own fields, for example server name, log level, and so on
    message String -- message text
) ENGINE=MergeTree()
ORDER BY (category, time, millis)

Unfortunately, I could not immediately find an open source of realistic logs that I could simply download, so I took Amazon product reviews up to 2015 as an example instead. Of course, their structure is not quite the same as that of text logs, but for illustration purposes this does not matter.

Instructions for uploading Amazon reviews to ClickHouse

Let's create a table:

CREATE TABLE amazon(
   review_date Date,
   time DateTime DEFAULT toDateTime(toUInt32(review_date) * 86400 + rand() % 86400),
   millis UInt16 DEFAULT rand() % 1000,
   marketplace LowCardinality(String),
   customer_id Int64,
   review_id String,
   product_id LowCardinality(String),
   product_parent Int64,
   product_title String,
   product_category LowCardinality(String),
   star_rating UInt8,
   helpful_votes UInt32,
   total_votes UInt32,
   vine FixedString(1),
   verified_purchase FixedString(1),
   review_headline String,
   review_body String
)
ENGINE=MergeTree()
ORDER BY (time, millis)
SETTINGS index_granularity=8192

The Amazon dataset contains only a date for each review, with no exact time, so we fill the time and millis columns with random values.

You do not have to download all the tsv files; the first ~10-20 are enough to get a reasonably large dataset that will not fit into 16 GB of RAM. To upload the TSV files I used the following command:

for i in *.tsv; do
    echo "$i";
    tail -n +2 "$i" | pv |
    clickhouse-client --input_format_allow_errors_ratio 0.5 --query='INSERT INTO amazon(marketplace,customer_id,review_id,product_id,product_parent,product_title,product_category,star_rating,helpful_votes,total_votes,vine,verified_purchase,review_headline,review_body,review_date) FORMAT TabSeparated'
done

On a standard Persistent Disk (which is an HDD) in Google Cloud with a size of 1000 GB (I chose this size mainly to get slightly higher speed, although an SSD of the required size might have been cheaper), the upload speed was approximately 75 MB/sec on 4 cores.

  • I should disclose that I work at Google, but I used a personal account and this article has nothing to do with my work at the company.

I will produce all the illustrations with this particular dataset, since it is all I had at hand.

Showing data scanning progress

Since in ClickHouse we will use a full scan over the table with logs, and this operation can take a significant amount of time and may produce no results for a long while if there are few matches, it is advisable to be able to show the progress of the query until the first result rows arrive. For this, the HTTP interface has a parameter that makes ClickHouse send progress in HTTP headers: send_progress_in_http_headers=1. Unfortunately, the standard Go library cannot read headers as they are received, but ClickHouse supports HTTP 1.0 (not to be confused with 1.1!), so you can open a raw TCP connection to ClickHouse, send GET /?query=... HTTP/1.0\n\n, and receive the response headers and body without any escaping or encryption. In this case we do not even need the standard library's HTTP client.
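As a rough illustration of reading progress from a raw connection, here is a minimal sketch and not the logscli implementation. It assumes that the progress arrives as repeated X-ClickHouse-Progress headers whose value is a JSON object with string-valued fields such as read_rows and total_rows_to_read; the function names and the canned response are made up for the example.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// progress mirrors the JSON payload assumed to be carried by each
// X-ClickHouse-Progress header; the counters arrive as quoted strings.
type progress struct {
	ReadRows        string `json:"read_rows"`
	ReadBytes       string `json:"read_bytes"`
	TotalRowsToRead string `json:"total_rows_to_read"`
}

// parseProgressLine reports whether line is a progress header and,
// if so, decodes its JSON payload.
func parseProgressLine(line string) (p progress, ok bool, err error) {
	const prefix = "x-clickhouse-progress:"
	if len(line) < len(prefix) || !strings.EqualFold(line[:len(prefix)], prefix) {
		return p, false, nil
	}
	err = json.Unmarshal([]byte(strings.TrimSpace(line[len(prefix):])), &p)
	return p, err == nil, err
}

// readHeaders consumes the status line and headers from r, calling
// onProgress for every progress header. It returns after the blank line
// that separates the headers from the body, leaving the body unread in r.
func readHeaders(r *bufio.Reader, onProgress func(progress)) error {
	for {
		line, err := r.ReadString('\n')
		if err != nil {
			return err
		}
		line = strings.TrimRight(line, "\r\n")
		if line == "" {
			return nil
		}
		p, ok, err := parseProgressLine(line)
		if err != nil {
			return err
		}
		if ok {
			onProgress(p)
		}
	}
}

func main() {
	// Canned response standing in for bytes read from a TCP connection.
	raw := "HTTP/1.0 200 OK\r\n" +
		"X-ClickHouse-Progress: {\"read_rows\":\"65536\",\"read_bytes\":\"1048576\",\"total_rows_to_read\":\"1000000\"}\r\n" +
		"\r\n" +
		"first result row\n"
	err := readHeaders(bufio.NewReader(strings.NewReader(raw)), func(p progress) {
		// Progress goes to stderr so that it never mixes with piped results.
		fmt.Fprintf(os.Stderr, "scanned %s of %s rows\n", p.ReadRows, p.TotalRowsToRead)
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
	}
}
```

With a real server, the same reader would wrap the TCP connection, and everything after the blank line would be the result rows.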

Streaming logs from ClickHouse

ClickHouse has had an optimization for queries with ORDER BY for quite some time (since 2019?), so a query like

SELECT time, millis, message
FROM logs
WHERE message LIKE '%something%'
ORDER BY time DESC, millis DESC

will immediately start returning rows whose message contains the substring "something", without waiting for the scan to finish.

It would also be very convenient if ClickHouse itself cancelled the query when the connection to it is closed, but this is not the default behavior. Automatic query cancellation can be enabled with the option cancel_http_readonly_queries_on_client_close=1.
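To make the pieces concrete, the following sketch assembles the raw HTTP/1.0 request text with both settings mentioned in this article passed as URL parameters. The function name buildRequest is hypothetical, and the exact request that logscli sends may differ.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildRequest returns the raw HTTP/1.0 request text for the given SQL.
// Both settings are passed as URL parameters next to the query itself.
func buildRequest(query string) string {
	params := url.Values{}
	params.Set("query", query)
	// Ask ClickHouse to report scan progress in response headers.
	params.Set("send_progress_in_http_headers", "1")
	// Ask ClickHouse to cancel the query when the client disconnects.
	params.Set("cancel_http_readonly_queries_on_client_close", "1")
	// Encode sorts parameters by key and escapes spaces as '+'.
	return "GET /?" + params.Encode() + " HTTP/1.0\n\n"
}

func main() {
	fmt.Print(buildRequest(
		"SELECT time, millis, message FROM logs " +
			"WHERE message LIKE '%something%' " +
			"ORDER BY time DESC, millis DESC"))
}
```

The resulting string can be written to a connection opened with net.Dial("tcp", "localhost:8123"), assuming ClickHouse listens on its default HTTP port on the local machine.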

Handling SIGPIPE correctly in Go

When you run, say, the command some_cmd | head -n 10, how exactly does some_cmd stop executing once head has read 10 lines? The answer is simple: when head exits, the pipe is closed, and the stdout of some_cmd starts pointing, so to speak, "nowhere". When some_cmd tries to write to the closed pipe, it receives the SIGPIPE signal, which silently terminates the program by default.

In Go this also happens by default, but the SIGPIPE handler additionally prints "signal: SIGPIPE" or a similar message at the end. To get rid of this message, we just need to handle SIGPIPE ourselves the way we want, that is, simply exit silently:

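// Requires the "os", "os/signal" and "syscall" packages.
// signal.Notify never blocks when sending, so the channel must be buffered.
ch := make(chan os.Signal, 1)
signal.Notify(ch, syscall.SIGPIPE)
go func() {
    <-ch
    os.Exit(0) // exit quietly instead of printing a message
}()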

Showing message context

Often you want to see the context in which an error occurred (for example, which request caused a panic, or what related problems were visible before the crash). In grep this is done with the options -A, -B and -C, which show the specified number of lines after, before, and around the message, respectively.

Unfortunately, I have not found a simple way to do the same in ClickHouse, so to display the context, an additional query like the one below is sent for each result row (the details depend on the sort order and on whether the context before or after the row is being shown):

SELECT time,millis,review_body FROM amazon
WHERE (time = 'EVENT_TIME' AND millis < EVENT_MILLIS) OR (time < 'EVENT_TIME')
ORDER BY time DESC, millis DESC
LIMIT CONTEXT_LINES
SETTINGS max_threads=1

Since this query is sent almost immediately after ClickHouse has returned the matching row, the relevant data is already in the cache, so the query generally executes quickly and uses little CPU (it usually takes about 6 ms on my virtual machine).
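The remark that the query depends on the direction can be sketched as follows. This is an illustrative guess at how the two variants might be generated, not code from logscli: the function contextQuery and its parameters are hypothetical, and the "after" variant simply mirrors the comparison and the sort order of the "before" query shown above.

```go
package main

import "fmt"

// contextQuery builds a query that fetches up to n rows adjacent to the
// event at (eventTime, millis). With before=true it selects earlier rows,
// newest first; with before=false it selects later rows, oldest first.
// eventTime is interpolated as-is, so it must come from a trusted source
// such as a previous ClickHouse result.
func contextQuery(eventTime string, millis, n int, before bool) string {
	cmp, dir := "<", "DESC"
	if !before {
		cmp, dir = ">", "ASC"
	}
	return fmt.Sprintf(
		"SELECT time,millis,review_body FROM amazon "+
			"WHERE (time = '%[1]s' AND millis %[2]s %[3]d) OR (time %[2]s '%[1]s') "+
			"ORDER BY time %[4]s, millis %[4]s LIMIT %[5]d SETTINGS max_threads=1",
		eventTime, cmp, millis, dir, n)
}

func main() {
	// Three rows of context on each side of one matching row.
	fmt.Println(contextQuery("2015-08-31 12:00:00", 123, 3, true))
	fmt.Println(contextQuery("2015-08-31 12:00:00", 123, 3, false))
}
```

Rows fetched for the "before" direction come back newest first, so whether they need to be reversed before printing depends on the overall sort order of the output.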

Showing new messages in real time

To show incoming messages in (almost) real time, we simply run the query once every few seconds, remembering the last timestamp we have seen so far.
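The polling approach can be sketched like this. All names here (row, cursor, follow, the fetch callback) are hypothetical, and the sketch assumes that fetch returns rows in ascending (time, millis) order; the actual query against ClickHouse is left out.

```go
package main

import (
	"fmt"
	"time"
)

// row is a hypothetical decoded log line; Unix is the event time in seconds.
type row struct {
	Unix    int64
	Millis  uint16
	Message string
}

// cursor remembers the newest (time, millis) pair printed so far.
type cursor struct {
	Unix   int64
	Millis uint16
}

// newer reports whether r is strictly after the cursor position.
func (c cursor) newer(r row) bool {
	if r.Unix == c.Unix {
		return r.Millis > c.Millis
	}
	return r.Unix > c.Unix
}

// follow calls fetch once per interval for the given number of rounds,
// emits only rows that are newer than everything emitted before, and
// returns the final cursor. fetch stands in for a ClickHouse query and is
// assumed to return rows in ascending (time, millis) order.
func follow(fetch func(cursor) []row, emit func(row), interval time.Duration, rounds int) cursor {
	var c cursor
	for i := 0; i < rounds; i++ {
		for _, r := range fetch(c) {
			if !c.newer(r) {
				continue // already shown in an earlier round
			}
			emit(r)
			c = cursor{r.Unix, r.Millis}
		}
		time.Sleep(interval)
	}
	return c
}

func main() {
	// Fake data source: each poll returns the rows known so far.
	data := []row{{100, 5, "first"}, {100, 7, "second"}, {101, 0, "third"}}
	polls := 0
	fetch := func(cursor) []row {
		polls++
		if polls == 1 {
			return data[:2]
		}
		return data
	}
	follow(fetch, func(r row) { fmt.Println(r.Message) }, 10*time.Millisecond, 2)
}
```

One limitation of keeping only a (time, millis) cursor is that several rows sharing exactly the same timestamp cannot be told apart, so a row with the same timestamp that arrives after the cursor has moved would be skipped.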

Command examples

What do typical logscli commands look like in practice?

If you have loaded the Amazon dataset that I mentioned earlier in the article, you can run the following commands:

# Show lines that contain the word walmart
$ logscli -F 'walmart' | less

# Show the 10 most recent lines that contain "terrible"
$ logscli -F terrible -limit 10

# The same without -limit:
$ logscli -F terrible | head -n 10

# Show all lines that match /times [0-9]/, written for vine and with a high rating
$ logscli -E 'times [0-9]' -where="vine='Y' AND star_rating>4" | less

# Show all lines with the word "panic" and 3 lines of context around them
$ logscli -F 'panic' -C 3 | less

# Continuously show new lines with the word "5-star"
$ logscli -F '5-star' -tailf

References

The utility code (without documentation) is available on github at https://github.com/YuriyNasretdinov/logscli. I would be glad to hear your thoughts on my idea of a console interface for viewing logs based on ClickHouse.

Source: www.hab.com