Usually, commercial products or ready-made open-source solutions, such as Prometheus + Grafana, are used to monitor and analyze the performance of Nginx. They are a good option for monitoring or real-time analytics, but not well suited for historical analysis. On any popular resource, the volume of data coming from nginx logs grows quickly, and to analyze a large amount of data it makes sense to use something specialized.
In this article I will show you how to use AWS Athena to analyze Nginx logs and assemble an analytical dashboard on top of this data with the Cube.js framework.
TL;DR:
Collecting the data we will use
Collecting Nginx logs
By default, Nginx logs look like this:
1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-up HTTP/2.0" 200 9168 "https://example.com/sign-in" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-in HTTP/2.0" 200 9168 "https://example.com/sign-up" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
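For reference, a line in this default format can be pulled apart with a regular expression; this is only a sketch with field names of my own choosing, not an official parser:

```python
import re

# Hypothetical regex for the default "combined"-style Nginx access log line.
LOG_RE = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes_sent>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

line = ('1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] '
        '"GET /sign-up HTTP/2.0" 200 9168 '
        '"https://example.com/sign-in" "Mozilla/5.0" "-"')

match = LOG_RE.match(line)
# All captured values are strings and still need type conversion.
print(match.group('status'), match.group('request'))
```

This works, but every consumer has to agree on the regex and cast the types by hand, which is exactly why the JSON format below is more convenient.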
They can be parsed, but it is much easier to adjust the Nginx configuration so that it emits logs in JSON:
log_format json_combined escape=json '{ "created_at": "$msec", '
'"remote_addr": "$remote_addr", '
'"remote_user": "$remote_user", '
'"request": "$request", '
'"status": $status, '
'"bytes_sent": $bytes_sent, '
'"request_length": $request_length, '
'"request_time": $request_time, '
'"http_referrer": "$http_referer", '
'"http_x_forwarded_for": "$http_x_forwarded_for", '
'"http_user_agent": "$http_user_agent" }';
access_log /var/log/nginx/access.log json_combined;
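With this format, each access-log line is a single JSON object, which is trivial to consume downstream. A minimal sketch; the sample line below is made up to follow the format above, not taken from a real server:

```python
import json

# Hypothetical sample line as the json_combined format above would emit it.
line = ('{"created_at": "1554803897.234", "remote_addr": "1.1.1.1", '
        '"remote_user": "", "request": "GET /sign-up HTTP/2.0", '
        '"status": 200, "bytes_sent": 9168, "request_length": 112, '
        '"request_time": 0.002, "http_referrer": "https://example.com/sign-in", '
        '"http_x_forwarded_for": "", "http_user_agent": "Mozilla/5.0"}')

entry = json.loads(line)
# status and bytes_sent are unquoted in the log_format, so they parse as numbers.
print(entry['status'], entry['request'])
```

Note that `$msec` is quoted in the log_format, so `created_at` arrives as a string of Unix seconds with millisecond precision, which is why the Athena table below declares it as `double`.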
S3 for storage
We will use S3 to store the logs. This lets you store and analyze the logs in one place, since Athena can work with data in S3 directly. Later in the article I will explain how to properly add and partition the logs, but first we need a clean bucket in S3 in which nothing else will be stored. It is worth considering in advance which region you will create your bucket in, because Athena is not available in every region.
Creating a table in the Athena console
Let's create a table in Athena for the logs. It is needed for both writing and reading if you plan to use Kinesis Firehose. Open the Athena console and create a table:
SQL for creating the table
CREATE EXTERNAL TABLE `kinesis_logs_nginx`(
`created_at` double,
`remote_addr` string,
`remote_user` string,
`request` string,
`status` int,
`bytes_sent` int,
`request_length` int,
`request_time` double,
`http_referrer` string,
`http_x_forwarded_for` string,
`http_user_agent` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
's3://<YOUR-S3-BUCKET>'
TBLPROPERTIES ('has_encrypted_data'='false');
Creating a Kinesis Firehose Stream
Kinesis Firehose will write the data received from Nginx to S3 in the chosen format, partitioning it into directories in the YYYY/MM/DD/HH format. This will come in handy when reading the data. You could, of course, write directly to S3 from fluentd, but then you would have to write JSON, and that is inefficient because of the large file sizes. Besides, when using PrestoDB or Athena, JSON is the slowest data format. So open the Kinesis Firehose console, click "Create delivery stream", and select "Direct PUT" in the "source" field:
In the next tab, select "Record format conversion" - "Enabled" and choose "Apache ORC" as the record format. According to some benchmarks, it is the optimal format for PrestoDB and Athena.
As the storage we select S3 and the bucket we created earlier. AWS Glue Crawler, which I will talk about a bit later, cannot work with prefixes in an S3 bucket, so it is important to leave the prefix empty.
The remaining options can be changed depending on your load; I usually use the defaults. Note that S3 compression is not available, but ORC uses its own compression by default.
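With the hourly partitioning described above, the objects Firehose writes end up under prefixes roughly like these (the bucket and stream names are placeholders, and the exact object-name suffix depends on the Firehose version):

```
s3://<YOUR-S3-BUCKET>/2019/04/09/09/<YOUR-KINESIS-STREAM-NAME>-1-2019-04-09-09-58-17-<uuid>
s3://<YOUR-S3-BUCKET>/2019/04/09/10/<YOUR-KINESIS-STREAM-NAME>-1-2019-04-09-10-02-33-<uuid>
```

It is these `YYYY/MM/DD/HH` prefixes that the Glue Crawler later turns into table partitions.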
Fluentd
Now that we have configured storing and retrieving the logs, we need to configure sending them. We will use Fluentd.
First, we need the fluent.conf configuration file. Create it and add a source:
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
Now you can start the Fluentd server. If you need a more advanced configuration, see the official Fluentd documentation.
$ docker run \
  -d \
  -p 24224:24224 \
  -p 24224:24224/udp \
  -v /data:/fluentd/log \
  -v <PATH-TO-FLUENT-CONF>:/fluentd/etc \
  fluent/fluentd:stable \
  -c /fluentd/etc/fluent.conf
This configuration uses the /fluentd/log path to buffer logs before sending them. You can do without this, but then, after a restart, you may lose everything in the buffer. You can also use any port; 24224 is the default Fluentd port.
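For reference, the buffering can be made explicit with a file buffer section inside an output block. This is only a sketch assuming a Fluentd v1-style configuration; the exact keys depend on your plugin versions:

```
<buffer>
  @type file
  path /fluentd/log/kinesis-buffer
  flush_interval 10s
</buffer>
```

A file buffer survives container restarts as long as the mounted /fluentd/log volume does, which is exactly why the volume is mounted in the docker run command above.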
Now that we have a running Fluentd, we can send the Nginx logs there. We usually run Nginx in a Docker container, and Docker has a native logging driver for Fluentd:
$ docker run \
  --log-driver=fluentd \
  --log-opt fluentd-address=<FLUENTD-SERVER-ADDRESS> \
  --log-opt tag="{{.Name}}" \
  -v /some/content:/usr/share/nginx/html:ro \
  -d \
  nginx
If you run Nginx differently, you can use log files; Fluentd has a file tail plugin.
Let's add the log parsing configured above to the Fluent configuration:
<filter YOUR-NGINX-TAG.*>
  @type parser
  key_name log
  emit_invalid_record_to_error false
  <parse>
    @type json
  </parse>
</filter>
And send the logs to Kinesis using the kinesis firehose plugin:
<match YOUR-NGINX-TAG.*>
  @type kinesis_firehose
  region <YOUR-AWS-REGION>
  delivery_stream_name <YOUR-KINESIS-STREAM-NAME>
  aws_key_id <YOUR-AWS-KEY-ID>
  aws_sec_key <YOUR_AWS-SEC_KEY>
</match>
Athena
If you configured everything correctly, then after a while (by default, Kinesis writes the received data once every 10 minutes) you should see log files in S3. In the "monitoring" menu of Kinesis Firehose you can see how much data is being written to S3, as well as errors. Don't forget to grant write access to the S3 bucket to the Kinesis role. If Kinesis fails to parse something, it will add the errors to the same bucket.
Now you can look at the data in Athena. Let's find the latest requests for which we returned errors:
SELECT * FROM "db_name"."table_name" WHERE status > 499 ORDER BY created_at DESC limit 10;
Scanning all records on every request
Now our logs are processed and stored in S3 in ORC, compressed and ready for analysis. Kinesis Firehose even organized them into directories by hour. However, as long as the table is not partitioned, Athena will load all-time data on every request, with rare exceptions. This is a big problem for two reasons:
- The volume of data is constantly growing, slowing down queries;
- Athena is billed based on the volume of data scanned, with a minimum of 10 MB per request.
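To illustrate the pricing point, a back-of-the-envelope estimate. It assumes the commonly quoted rate of $5 per terabyte scanned and the 10 MB per-query minimum; check current Athena pricing for your region before relying on these numbers:

```python
PRICE_PER_TB = 5.0    # USD per terabyte scanned (assumed rate, verify for your region)
MIN_SCANNED_MB = 10   # Athena bills at least 10 MB per query

def query_cost_usd(scanned_mb: float) -> float:
    """Estimated cost of a single Athena query in USD."""
    billed_mb = max(scanned_mb, MIN_SCANNED_MB)
    return billed_mb / (1024 * 1024) * PRICE_PER_TB  # MB -> TB, then price

# A full scan of ~244 MB vs. a partition-pruned scan of ~1.23 MB
# (figures from the comparison later in the article):
print(query_cost_usd(244.34))  # full scan
print(query_cost_usd(1.23))    # pruned scan, still billed as 10 MB
```

A single query is cheap either way, but dashboards issue many queries, and the scanned volume grows with the dataset, so partition pruning pays off quickly.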
To fix this, we use AWS Glue Crawler, which will crawl the data in S3 and write the partition information to the Glue Metastore. This will allow us to use the partitions as a filter when querying Athena, and it will scan only the directories specified in the query.
Setting up Amazon Glue Crawler
Amazon Glue Crawler scans all the data in the S3 bucket and creates tables with partitions. Create a Glue Crawler from the AWS Glue console and add the bucket where you store the data. You can use one crawler for several buckets, in which case it will create tables in the specified database with names that match the bucket names. If you plan to use this data regularly, be sure to configure the Crawler's launch schedule to suit your needs. We use one Crawler for all tables, and it runs every hour.
Partitioned tables
After the first launch of the crawler, tables for each scanned bucket should appear in the database specified in the settings. Open the Athena console and find the table with the Nginx logs. Let's try to read something:
SELECT * FROM "default"."part_demo_kinesis_bucket"
WHERE(
partition_0 = '2019' AND
partition_1 = '04' AND
partition_2 = '08' AND
partition_3 = '06'
);
This query will select all records received between 6 and 7 am on April 8, 2019. But how much more efficient is this than just reading from a non-partitioned table? Let's find out and select the same records, filtering them by timestamp:
3.59 seconds and 244.34 megabytes of data on a dataset with only a week of logs. Let's try a filter by partition:
A bit faster, but most importantly, only 1.23 megabytes of data! It would be much cheaper if not for the minimum of 10 megabytes per request in the pricing. But it is still much better, and on large datasets the difference will be far more impressive.
Building a dashboard with Cube.js
To assemble the dashboard, we use the Cube.js analytical framework. It has quite a few features, but we are interested in two: the ability to automatically use partition filters and data pre-aggregation. It uses a data schema, written in JavaScript, to generate SQL and execute a query in the database.
Let's create a new Cube.js application. Since we are already using the AWS stack, it makes sense to use Lambda for deployment. You can use the express template for generation if you plan to host the Cube.js backend on Heroku or in Docker. The docs describe other hosting methods.
$ npm install -g cubejs-cli
$ cubejs create nginx-log-analytics -t serverless -d athena
Environment variables are used to configure database access in cube.js. The generator creates a .env file where you can specify your keys for Athena.
Now we need the data schema. In the schema folder, create a Logs.js file. Here is an example data model for the nginx logs:
Model code
const partitionFilter = (from, to) => `
date(from_iso8601_timestamp(${from})) <= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d') AND
date(from_iso8601_timestamp(${to})) >= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')
`
cube(`Logs`, {
sql: `
select * from part_demo_kinesis_bucket
WHERE ${FILTER_PARAMS.Logs.createdAt.filter(partitionFilter)}
`,
measures: {
count: {
type: `count`,
},
errorCount: {
type: `count`,
filters: [
{ sql: `${CUBE.isError} = 'Yes'` }
]
},
errorRate: {
type: `number`,
sql: `100.0 * ${errorCount} / ${count}`,
format: `percent`
}
},
dimensions: {
status: {
sql: `status`,
type: `number`
},
isError: {
type: `string`,
case: {
when: [{
sql: `${CUBE}.status >= 400`, label: `Yes`
}],
else: { label: `No` }
}
},
createdAt: {
sql: `from_unixtime(created_at)`,
type: `time`
}
}
});
Here we use the FILTER_PARAMS variable to generate a SQL query with a partition filter.
We also set the metrics and parameters we want to display on the dashboard and specify pre-aggregations. Cube.js will create additional tables with pre-aggregated data and will automatically update the data as it arrives. This not only speeds up queries, but also reduces the cost of using Athena.
Let's add this information to the data schema file:
preAggregations: {
main: {
type: `rollup`,
measureReferences: [count, errorCount],
dimensionReferences: [isError, status],
timeDimensionReference: createdAt,
granularity: `day`,
partitionGranularity: `month`,
refreshKey: {
sql: FILTER_PARAMS.Logs.createdAt.filter((from, to) =>
`select
CASE WHEN from_iso8601_timestamp(${to}) + interval '3' day > now()
THEN date_trunc('hour', now()) END`
)
}
}
}
In this model we specify that it is necessary to pre-aggregate the data for all the metrics used, and to partition by month.
Now we can build the dashboard!
The Cube.js backend provides a REST API. The Cube.js server accepts requests in the following format:
{
"measures": ["Logs.errorCount"],
"timeDimensions": [
{
"dimension": "Logs.createdAt",
"dateRange": ["2019-01-01", "2019-01-07"],
"granularity": "day"
}
]
}
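The same query can be sent to the backend directly over HTTP. A sketch using the REST load endpoint, with a placeholder host and token (adjust both to your deployment):

```
curl -G 'http://localhost:4000/cubejs-api/v1/load' \
  -H 'Authorization: YOUR-CUBEJS-API-TOKEN' \
  --data-urlencode 'query={"measures":["Logs.errorCount"],"timeDimensions":[{"dimension":"Logs.createdAt","dateRange":["2019-01-01","2019-01-07"],"granularity":"day"}]}'
```

The React client below wraps exactly this kind of request.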
Let's install the Cube.js client and the React component library via NPM:
$ npm i --save @cubejs-client/core @cubejs-client/react
We import the cubejs client and the QueryRenderer component to fetch the data, and assemble the dashboard:
Dashboard code
import React from 'react';
import { LineChart, Line, XAxis, YAxis } from 'recharts';
import cubejs from '@cubejs-client/core';
import { QueryRenderer } from '@cubejs-client/react';
const cubejsApi = cubejs(
'YOUR-CUBEJS-API-TOKEN',
{ apiUrl: 'http://localhost:4000/cubejs-api/v1' },
);
export default () => {
return (
<QueryRenderer
query={{
measures: ['Logs.errorCount'],
timeDimensions: [{
dimension: 'Logs.createdAt',
dateRange: ['2019-01-01', '2019-01-07'],
granularity: 'day'
}]
}}
cubejsApi={cubejsApi}
render={({ resultSet }) => {
if (!resultSet) {
return 'Loading...';
}
return (
<LineChart data={resultSet.rawData()}>
<XAxis dataKey="Logs.createdAt"/>
<YAxis/>
<Line type="monotone" dataKey="Logs.errorCount" stroke="#8884d8"/>
</LineChart>
);
}}
/>
)
}
The dashboard source code is available online.
Source: www.habr.com