Commercial products or ready-made open-source alternatives, such as Prometheus + Grafana, are commonly used to monitor and analyze Nginx performance. This is a good option for monitoring or real-time analytics, but it is not well suited to historical analysis. On any popular resource, the volume of data from nginx logs grows quickly, and to analyze a large amount of data it makes sense to use something more specialized.
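In this article I will show how you can use Athena to analyze logs, taking Nginx as an example, and how to assemble an analytical dashboard from that data with Cube.js.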
TL;DR
To collect the logs we use Fluentd, to process them Kinesis Firehose and AWS Glue, and to store them S3.
Collecting Nginx logs
By default, Nginx logs look like this:
1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-up HTTP/2.0" 200 9168 "https://example.com/sign-in" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-in HTTP/2.0" 200 9168 "https://example.com/sign-up" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
They can be parsed, but it is much easier to adjust the Nginx configuration so that it produces logs in JSON:
log_format json_combined escape=json '{ "created_at": "$msec", '
'"remote_addr": "$remote_addr", '
'"remote_user": "$remote_user", '
'"request": "$request", '
'"status": $status, '
'"bytes_sent": $bytes_sent, '
'"request_length": $request_length, '
'"request_time": $request_time, '
'"http_referrer": "$http_referer", '
'"http_x_forwarded_for": "$http_x_forwarded_for", '
'"http_user_agent": "$http_user_agent" }';
access_log /var/log/nginx/access.log json_combined;
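To make the shape of these records concrete, here is a minimal JavaScript sketch that parses one line in the json_combined format. The sample line and the parseLogLine helper are hypothetical and only illustrate the fields defined above. Note that created_at is quoted in the format, so it arrives as a string, while status and request_time are plain numbers.

```javascript
// Hypothetical sample line in the json_combined format defined above.
const line = '{ "created_at": "1554803897.123", "remote_addr": "1.1.1.1", ' +
  '"remote_user": "", "request": "GET /sign-up HTTP/2.0", "status": 200, ' +
  '"bytes_sent": 9168, "request_length": 120, "request_time": 0.004, ' +
  '"http_referrer": "https://example.com/sign-in", ' +
  '"http_x_forwarded_for": "", "http_user_agent": "Mozilla/5.0" }';

// Illustrative helper (not part of any library): parse a line and add a few
// derived fields that are convenient for analysis.
function parseLogLine(raw) {
  const record = JSON.parse(raw);
  const [method, path, protocol] = record.request.split(' ');
  return {
    ...record,
    // $msec is seconds with millisecond resolution; convert to a Date.
    createdAt: new Date(Math.round(Number(record.created_at) * 1000)),
    method,
    path,
    protocol,
  };
}

const parsed = parseLogLine(line);
console.log(parsed.status, parsed.method, parsed.path, parsed.createdAt.getUTCFullYear());
```

Because the line is already valid JSON, no custom regular expression is needed, which is the main benefit of changing the log format.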
S3 for storage
To store the logs we will use S3. This lets you store and analyze logs in one place, since Athena can work with data in S3 directly. Later in the article I will explain how to add and process logs correctly, but first we need a clean bucket in S3 in which nothing else will be stored. It is worth deciding in advance which region you will create the bucket in, because Athena is not available in every region.
Creating a schema in the Athena console
Let's create a table in Athena for the logs. It is needed both for writing and for reading if you plan to use Kinesis Firehose. Open the Athena console and create a table:
SQL for creating the table
CREATE EXTERNAL TABLE `kinesis_logs_nginx`(
`created_at` double,
`remote_addr` string,
`remote_user` string,
`request` string,
`status` int,
`bytes_sent` int,
`request_length` int,
`request_time` double,
`http_referrer` string,
`http_x_forwarded_for` string,
`http_user_agent` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
's3://<YOUR-S3-BUCKET>'
TBLPROPERTIES ('has_encrypted_data'='false');
Creating a Kinesis Firehose stream
Kinesis Firehose will write the data received from Nginx to S3 in the selected format, splitting it into directories in the format YYYY/MM/DD/HH. This will come in handy when reading the data. You could, of course, write directly to S3 from Fluentd, but in that case you would have to write JSON, which is inefficient because of the large file size. In addition, when using PrestoDB or Athena, JSON is the slowest data format. So open the Kinesis Firehose console, click "Create delivery stream", and select "Direct PUT" in the "delivery" field:
On the next tab, set "Record format conversion" to "Enabled" and choose "Apache ORC" as the record format. According to some research
We select S3 as the storage and the bucket we created at the beginning. AWS Glue Crawler, which I will talk about a little later, cannot work with prefixes in an S3 bucket, so it is important to leave the prefix empty.
The remaining options can be changed depending on your load; I usually use the defaults. Note that S3 compression is not available, but ORC uses native compression by default.
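As a small illustration of the YYYY/MM/DD/HH layout mentioned above, the following JavaScript sketch computes the directory prefix for a given moment. The hourlyPrefix helper is hypothetical, and it assumes the directories are derived from UTC time.

```javascript
// Illustrative helper: the YYYY/MM/DD/HH/ prefix for a timestamp.
// Assumption: the hour-level directories are based on UTC.
function hourlyPrefix(date) {
  const pad = (n) => String(n).padStart(2, '0');
  return [
    date.getUTCFullYear(),
    pad(date.getUTCMonth() + 1), // getUTCMonth() is zero-based
    pad(date.getUTCDate()),
    pad(date.getUTCHours()),
  ].join('/') + '/';
}

const prefix = hourlyPrefix(new Date('2019-04-08T06:30:00Z'));
console.log(prefix); // 2019/04/08/06/
```

Every record delivered within the same hour ends up under the same prefix, which is what later makes hour-level filtering possible.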
Fluentd
Now that we have set up storing and receiving the logs, we need to set up sending them. We will use Fluentd.
First, we need a fluent.conf configuration file. Create it and add a source:
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
Now you can start the Fluentd server. If you need more advanced configuration, go to
$ docker run \
  -d \
  -p 24224:24224 \
  -p 24224:24224/udp \
  -v /data:/fluentd/log \
  -v <PATH-TO-FLUENT-CONF>:/fluentd/etc \
  fluent/fluentd:stable \
  fluentd -c /fluentd/etc/fluent.conf
This configuration uses the path /fluentd/log to cache logs before sending them. You can do without it, but then a restart may cost you everything that was cached. You can also use any port; 24224 is the default Fluentd port.
Now that we have Fluentd running, we can send Nginx logs to it. We usually run Nginx in a Docker container, in which case Docker has a native logging driver for Fluentd:
$ docker run \
  --log-driver=fluentd \
  --log-opt fluentd-address=<FLUENTD-SERVER-ADDRESS> \
  --log-opt tag="{{.Name}}" \
  -v /some/content:/usr/share/nginx/html:ro \
  -d \
  nginx
If you run Nginx differently, you can use log files; Fluentd has a tail input plugin for that.
Let's add parsing of the log format configured above to the Fluentd configuration:
<filter YOUR-NGINX-TAG.*>
@type parser
key_name log
emit_invalid_record_to_error false
<parse>
@type json
</parse>
</filter>
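The effect of this filter can be pictured with a short JavaScript sketch. It is only a rough model of how I understand the parser filter to behave with these settings: the JSON string in the log key replaces the original record, and records that fail to parse are dropped without being routed to an error stream. The parserFilter function and the sample events are hypothetical.

```javascript
// Rough model of the parser filter above, not Fluentd's actual code.
function parserFilter(record, keyName = 'log') {
  try {
    const parsed = JSON.parse(record[keyName]);
    // Only JSON objects are treated as valid records in this sketch.
    const isObject = parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed);
    return isObject ? parsed : null;
  } catch (err) {
    // Models emit_invalid_record_to_error false: drop silently.
    return null;
  }
}

// Hypothetical events as the Docker logging driver might deliver them.
const events = [
  { container_name: '/nginx', source: 'stdout', log: '{"status": 200, "request": "GET / HTTP/1.1"}' },
  { container_name: '/nginx', source: 'stderr', log: '2019/04/09 09:58:17 [notice] start worker processes' },
];

const records = events.map((e) => parserFilter(e)).filter((r) => r !== null);
console.log(records); // only the JSON access-log record remains
```

In this model the plain-text Nginx startup message is discarded, so only structured access-log records continue down the pipeline.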
And to send the logs to Kinesis, use the kinesis_firehose output plugin:
<match YOUR-NGINX-TAG.*>
@type kinesis_firehose
region <YOUR-AWS-REGION>
delivery_stream_name <YOUR-KINESIS-STREAM-NAME>
aws_key_id <YOUR-AWS-KEY-ID>
aws_sec_key <YOUR_AWS-SEC_KEY>
</match>
Athena
If you have configured everything correctly, then after a while (by default, Kinesis writes the received data once every 10 minutes) you should see log files in S3. In the "Monitoring" menu of Kinesis Firehose you can see how much data has been written to S3, as well as any errors. Don't forget to grant the Kinesis role write access to the S3 bucket. If Kinesis cannot parse something, it will add the errors to the same bucket.
Now you can view the data in Athena. Let's find the latest requests for which we returned errors:
SELECT * FROM "db_name"."table_name" WHERE status > 499 ORDER BY created_at DESC limit 10;
Scanning all records on every query
Now our logs have been processed and stored in S3 in ORC, compressed and ready for analysis. Kinesis Firehose has even arranged them into directories for each hour. However, as long as the table is not partitioned, Athena will load the data for the entire period on every query, with rare exceptions. This is a big problem for two reasons:
- The volume of data grows constantly, slowing down queries;
- Athena is billed based on the volume of data scanned, with a minimum of 10 MB per query.
To fix this, we use AWS Glue Crawler, which will crawl the data in S3 and write the partition information to the Glue Metastore. This lets us use partitions as a filter when querying Athena, so that it scans only the directories specified in the query.
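To illustrate how partitions act as a filter, here is a JavaScript sketch that builds a WHERE condition for a single hour. It assumes the crawler names the four directory levels partition_0 through partition_3 (year, month, day, hour) and that the directories are based on UTC; the partitionFilterForHour helper is hypothetical.

```javascript
// Illustrative helper: a partition condition for one hour of data.
// Assumptions: columns are partition_0..partition_3 and hours are in UTC.
function partitionFilterForHour(date) {
  const pad = (n) => String(n).padStart(2, '0');
  const parts = [
    String(date.getUTCFullYear()),
    pad(date.getUTCMonth() + 1),
    pad(date.getUTCDate()),
    pad(date.getUTCHours()),
  ];
  return parts.map((value, i) => `partition_${i} = '${value}'`).join(' AND ');
}

const where = partitionFilterForHour(new Date('2019-04-08T06:15:00Z'));
console.log(where);
// partition_0 = '2019' AND partition_1 = '04' AND partition_2 = '08' AND partition_3 = '06'
```

A condition like this restricts the scan to one hourly directory instead of the whole bucket.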
Setting up Amazon Glue Crawler
Amazon Glue Crawler scans all the data in the S3 bucket and creates tables with partitions. Create a Glue Crawler from the AWS Glue console and add the bucket where you store the data. You can use one crawler for several buckets, in which case it will create tables in the specified database with names matching the bucket names. If you plan to use this data regularly, be sure to set the Crawler's run schedule to suit your needs. We use one Crawler for all tables, and it runs every hour.
Partitioned tables
After the crawler's first run, a table for each scanned bucket should appear in the database specified in the settings. Open the Athena console and find the table with the Nginx logs. Let's try to read something:
SELECT * FROM "default"."part_demo_kinesis_bucket"
WHERE(
partition_0 = '2019' AND
partition_1 = '04' AND
partition_2 = '08' AND
partition_3 = '06'
);
This query will select all records received between 6 a.m. and 7 a.m. on April 8, 2019. But how much more efficient is this than simply reading from a non-partitioned table? Let's find out by selecting the same records, filtering them by timestamp:
3.59 seconds and 244.34 megabytes of data on a dataset with only a week of logs. Let's try the partition filter:
Slightly faster, but most importantly, only 1.23 megabytes of data! It would be much cheaper still if not for the 10-megabyte minimum per query in the pricing. But it is still much better, and on large datasets the difference will be far more impressive.
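The effect of the 10 MB minimum on these two measurements can be worked out with a few lines of JavaScript. This is a sketch only: the price per terabyte is an assumed placeholder rather than a figure from this article, and any rounding applied by the billing is ignored.

```javascript
const MIN_BILLED_MB = 10; // per-query minimum mentioned above
const MB_PER_TB = 1024 * 1024;
const PRICE_PER_TB_USD = 5; // assumption for illustration; check current pricing

// Megabytes that are actually billed for one query.
function billedMegabytes(scannedMb) {
  return Math.max(scannedMb, MIN_BILLED_MB);
}

// Cost of one query under the assumed price.
function queryCostUsd(scannedMb, pricePerTbUsd = PRICE_PER_TB_USD) {
  return (billedMegabytes(scannedMb) / MB_PER_TB) * pricePerTbUsd;
}

const fullScan = queryCostUsd(244.34);  // filter by timestamp
const partitioned = queryCostUsd(1.23); // filter by partition
const ratio = fullScan / partitioned;
console.log(`billed: ${billedMegabytes(244.34)} MB vs ${billedMegabytes(1.23)} MB`);
console.log(`the full scan costs about ${ratio.toFixed(1)}x more`);
```

Although roughly 200 times less data is scanned, the billed difference is capped at about 24x because the partitioned query is charged for the 10 MB minimum.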
Building a dashboard with Cube.js
To assemble the dashboard, we use the Cube.js analytical framework. It has quite a lot of features, but we are interested in two: the ability to apply partition filters automatically and the pre-aggregation of data. It uses a data schema, written in JavaScript, to generate SQL and run the database queries.
Let's create a new Cube.js application. Since we already use the AWS stack, it makes sense to use Lambda for deployment. You can use the express template for generation if you plan to host the Cube.js backend on Heroku or in Docker. The documentation describes other hosting options.
$ npm install -g cubejs-cli
$ cubejs create nginx-log-analytics -t serverless -d athena
Environment variables are used to configure database access in Cube.js. The generator will create an .env file in which you can specify your keys.
Now we need a data schema.
In the schema directory, create a file named Logs.js. Here is an example data model for nginx:
Model code
const partitionFilter = (from, to) => `
  date(from_iso8601_timestamp(${from})) <= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d') AND
  date(from_iso8601_timestamp(${to})) >= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')
`

cube(`Logs`, {
  sql: `
    select * from part_demo_kinesis_bucket
    WHERE ${FILTER_PARAMS.Logs.createdAt.filter(partitionFilter)}
  `,

  measures: {
    count: {
      type: `count`,
    },
    errorCount: {
      type: `count`,
      filters: [
        { sql: `${CUBE.isError} = 'Yes'` }
      ]
    },
    errorRate: {
      type: `number`,
      sql: `100.0 * ${errorCount} / ${count}`,
      format: `percent`
    }
  },

  dimensions: {
    status: {
      sql: `status`,
      type: `number`
    },
    isError: {
      type: `string`,
      case: {
        when: [{
          sql: `${CUBE}.status >= 400`, label: `Yes`
        }],
        else: { label: `No` }
      }
    },
    createdAt: {
      sql: `from_unixtime(created_at)`,
      type: `time`
    }
  }
});
Here we use the FILTER_PARAMS variable to generate an SQL query with a partition filter.
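To see what the partition filter expands to, the partitionFilter function from the model can be called on its own. In this sketch the two arguments are quoted ISO timestamps chosen for illustration; inside Cube.js the framework supplies the values for the requested date range, possibly as query parameters rather than literals.

```javascript
// Same function as in the model above.
const partitionFilter = (from, to) => `
  date(from_iso8601_timestamp(${from})) <= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d') AND
  date(from_iso8601_timestamp(${to})) >= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')
`;

// Hypothetical date range, passed as SQL string literals for readability.
const sql = partitionFilter(
  "'2019-04-01T00:00:00.000Z'",
  "'2019-04-07T23:59:59.999Z'"
);
console.log(sql.trim());
```

The generated condition compares the start and end of the range with a date assembled from the year, month, and day partition columns, so only directories inside the range are scanned.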
We also define the metrics and parameters that we want to display on the dashboard and specify pre-aggregations. Cube.js will create additional tables with pre-aggregated data and will update them automatically as new data arrives. This not only speeds up queries, but also reduces the cost of using Athena.
Let's add this information to the data schema file:
preAggregations: {
  main: {
    type: `rollup`,
    measureReferences: [count, errorCount],
    dimensionReferences: [isError, status],
    timeDimensionReference: createdAt,
    granularity: `day`,
    partitionGranularity: `month`,
    refreshKey: {
      sql: FILTER_PARAMS.Logs.createdAt.filter((from, to) =>
        `select
           CASE WHEN from_iso8601_timestamp(${to}) + interval '3' day > now()
           THEN date_trunc('hour', now()) END`
      )
    }
  }
}
In this model we specify that the data must be pre-aggregated for all the metrics used, and that the pre-aggregation is partitioned by month.
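Conceptually, a day-level rollup collapses raw rows into one row per day and dimension combination. The JavaScript below is a simplified model of that idea, not how Cube.js implements it; the sample rows and the rollupByDay and errorRate helpers are hypothetical.

```javascript
// Hypothetical raw rows with the two columns the rollup depends on.
const rows = [
  { created_at: 1554710400, status: 200 }, // 2019-04-08T08:00:00Z
  { created_at: 1554714000, status: 502 }, // 2019-04-08T09:00:00Z
  { created_at: 1554717600, status: 200 }, // 2019-04-08T10:00:00Z
  { created_at: 1554796800, status: 404 }, // 2019-04-09T08:00:00Z
];

// Group by day, isError and status; keep the count and errorCount measures.
function rollupByDay(input) {
  const buckets = new Map();
  for (const row of input) {
    const day = new Date(row.created_at * 1000).toISOString().slice(0, 10);
    const isError = row.status >= 400 ? 'Yes' : 'No';
    const key = `${day}|${isError}|${row.status}`;
    const bucket = buckets.get(key) ||
      { day, isError, status: row.status, count: 0, errorCount: 0 };
    bucket.count += 1;
    if (isError === 'Yes') bucket.errorCount += 1;
    buckets.set(key, bucket);
  }
  return [...buckets.values()];
}

// errorRate is derived from the two additive measures, as in the model.
function errorRate(buckets) {
  const count = buckets.reduce((sum, b) => sum + b.count, 0);
  const errors = buckets.reduce((sum, b) => sum + b.errorCount, 0);
  return count === 0 ? 0 : (100 * errors) / count;
}

const rollup = rollupByDay(rows);
console.log(rollup);
console.log(errorRate(rollup)); // 50
```

Only the additive measures are stored in the rollup, and a ratio such as errorRate is computed from them at query time, which is why the pre-aggregation lists count and errorCount but not errorRate.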
Now we can assemble the dashboard!
The Cube.js backend provides an API together with client libraries.
The Cube.js server accepts a query in JSON format that describes the required metrics, for example:
{
  "measures": ["Logs.errorCount"],
  "timeDimensions": [
    {
      "dimension": "Logs.createdAt",
      "dateRange": ["2019-01-01", "2019-01-07"],
      "granularity": "day"
    }
  ]
}
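For a sense of how such a query travels over HTTP, the sketch below encodes it into a request URL. It assumes the REST endpoint is /load under the API base URL, with the query passed as a URL-encoded JSON parameter, and that the API token is sent in the Authorization header; the buildLoadUrl helper is hypothetical and the base URL is a local development address.

```javascript
const query = {
  measures: ['Logs.errorCount'],
  timeDimensions: [
    {
      dimension: 'Logs.createdAt',
      dateRange: ['2019-01-01', '2019-01-07'],
      granularity: 'day',
    },
  ],
};

// Illustrative helper: build the request URL without sending anything.
function buildLoadUrl(apiUrl, q) {
  return `${apiUrl}/load?query=${encodeURIComponent(JSON.stringify(q))}`;
}

const url = buildLoadUrl('http://localhost:4000/cubejs-api/v1', query);
console.log(url);
```

In practice the client library builds and sends this request for you, so a helper like this is mostly useful for debugging with a tool such as curl.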
Let's install the Cube.js client and the React component library via NPM:
$ npm i --save @cubejs-client/core @cubejs-client/react
We import the cubejs and QueryRenderer components to fetch the data, and assemble the dashboard:
Dashboard code
import React from 'react';
import { LineChart, Line, XAxis, YAxis } from 'recharts';
import cubejs from '@cubejs-client/core';
import { QueryRenderer } from '@cubejs-client/react';

const cubejsApi = cubejs(
  'YOUR-CUBEJS-API-TOKEN',
  { apiUrl: 'http://localhost:4000/cubejs-api/v1' },
);

export default () => {
  return (
    <QueryRenderer
      query={{
        measures: ['Logs.errorCount'],
        timeDimensions: [{
          dimension: 'Logs.createdAt',
          dateRange: ['2019-01-01', '2019-01-07'],
          granularity: 'day'
        }]
      }}
      cubejsApi={cubejsApi}
      render={({ resultSet }) => {
        if (!resultSet) {
          return 'Loading...';
        }

        return (
          <LineChart data={resultSet.rawData()}>
            <XAxis dataKey="Logs.createdAt"/>
            <YAxis/>
            <Line type="monotone" dataKey="Logs.errorCount" stroke="#8884d8"/>
          </LineChart>
        );
      }}
    />
  )
}
The dashboard source code is available at
Source: www.habr.com