Commercial products or ready-made open-source alternatives, such as Prometheus + Grafana, are commonly used to monitor and analyze Nginx. They are a good choice for monitoring or real-time analytics, but not very convenient for historical analysis. On any popular resource, the volume of data from Nginx logs grows quickly, and for analyzing a large amount of data it makes sense to use something more specialized.
In this article, I will show how you can use Athena to analyze Nginx logs and how to build an analytics dashboard from that data with Cube.js.
TL;DR
To collect the logs we use Fluentd; Kinesis Firehose and AWS Glue process them, S3 stores them, Athena queries them, and Cube.js builds the dashboard.
Collecting Nginx logs
By default, Nginx logs look like this:
4/9/2019 12:58:17 PM1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-up HTTP/2.0" 200 9168 "https://example.com/sign-in" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
4/9/2019 12:58:17 PM1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-in HTTP/2.0" 200 9168 "https://example.com/sign-up" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
They can be parsed, but it is much easier to change the Nginx configuration so that it writes logs as JSON:
log_format json_combined escape=json '{ "created_at": "$msec", '
'"remote_addr": "$remote_addr", '
'"remote_user": "$remote_user", '
'"request": "$request", '
'"status": $status, '
'"bytes_sent": $bytes_sent, '
'"request_length": $request_length, '
'"request_time": $request_time, '
'"http_referrer": "$http_referer", '
'"http_x_forwarded_for": "$http_x_forwarded_for", '
'"http_user_agent": "$http_user_agent" }';
access_log /var/log/nginx/access.log json_combined;
S3 for storage
To store the logs, we will use S3. This lets you store and analyze logs in one place, because Athena can work with data in S3 directly. Later in the article I will explain how to lay out and process the logs correctly, but first we need a clean S3 bucket in which nothing else will be stored. It is worth deciding in advance which region you will create the bucket in, because Athena is not available in every region.
Creating a schema in the Athena console
Let's create a table in Athena for the logs. If you plan to use Kinesis Firehose, the table is needed for both writing and reading. Open the Athena console and create the table:
Table creation SQL
CREATE EXTERNAL TABLE `kinesis_logs_nginx`(
`created_at` double,
`remote_addr` string,
`remote_user` string,
`request` string,
`status` int,
`bytes_sent` int,
`request_length` int,
`request_time` double,
`http_referrer` string,
`http_x_forwarded_for` string,
`http_user_agent` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
's3://<YOUR-S3-BUCKET>'
TBLPROPERTIES ('has_encrypted_data'='false');
Creating a Kinesis Firehose Stream
Kinesis Firehose will write the data received from Nginx to S3 in the selected format, splitting it into directories in the YYYY/MM/DD/HH layout. This will come in handy when reading the data. You can, of course, write to S3 directly from Fluentd, but then you would have to write JSON, which is inefficient because of the large file size. In addition, with PrestoDB or Athena, JSON is the slowest data format. So open the Kinesis Firehose console, click "Create delivery stream", and select "Direct PUT" in the "delivery" field:
On the next tab, select "Record format conversion" - "Enabled" and choose "Apache ORC" as the record format. According to some studies
We select S3 as the storage and the bucket we created earlier. AWS Glue Crawler, which I will discuss a little later, cannot work with prefixes in an S3 bucket, so it is important to leave the prefix empty.
The remaining options can be changed depending on your load; I usually use the defaults. Note that S3 compression is not available, but ORC uses native compression by default.
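To make the YYYY/MM/DD/HH directory layout described in this section concrete, here is a minimal sketch that computes the key prefix for a given moment. The function name `hourlyPrefix` is hypothetical, and the sketch assumes the hour boundaries are taken in UTC; check your own bucket before relying on that.

```javascript
// Hypothetical helper: builds a YYYY/MM/DD/HH/ prefix for a timestamp.
// Assumption: directories are named after the UTC time of delivery.
function hourlyPrefix(date) {
  const pad = (n) => String(n).padStart(2, "0");
  return (
    [
      date.getUTCFullYear(),
      pad(date.getUTCMonth() + 1), // getUTCMonth() is zero-based
      pad(date.getUTCDate()),
      pad(date.getUTCHours()),
    ].join("/") + "/"
  );
}

console.log(hourlyPrefix(new Date("2019-04-08T06:15:00Z"))); // 2019/04/08/06/
```

Each of the four path segments later becomes one partition column, which is why the layout matters when reading the data.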
Fluentd
Now that we have configured storing and receiving logs, we need to configure sending them. We will use Fluentd.
First, we need a fluent.conf configuration file. Create it and add a source:
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
Now you can start the Fluentd server. If you need a more advanced setup, consult the documentation for the fluent/fluentd image.
$ docker run \
  -d \
  -p 24224:24224 \
  -p 24224:24224/udp \
  -v /data:/fluentd/log \
  -v <PATH-TO-FLUENT-CONF>:/fluentd/etc \
  fluent/fluentd:stable \
  fluentd -c /fluentd/etc/fluent.conf
This configuration uses the path /fluentd/log to cache logs before sending them. You can do without it, but then on restart you may lose everything that was cached. You can also use any port; 24224 is the default Fluentd port.
Now that Fluentd is running, we can send Nginx logs to it. We usually run Nginx in a Docker container, in which case Docker has a native logging driver for Fluentd:
$ docker run \
  --log-driver=fluentd \
  --log-opt fluentd-address=<FLUENTD-SERVER-ADDRESS> \
  --log-opt tag="{{.Name}}" \
  -v /some/content:/usr/share/nginx/html:ro \
  -d \
  nginx
If you run Nginx differently, you can use log files; Fluentd has a file tail input plugin (in_tail).
Let's add parsing of the log format configured above to the Fluentd configuration:
<filter YOUR-NGINX-TAG.*>
@type parser
key_name log
emit_invalid_record_to_error false
<parse>
@type json
</parse>
</filter>
And send the logs to Kinesis using the kinesis_firehose output plugin:
<match YOUR-NGINX-TAG.*>
@type kinesis_firehose
region region
delivery_stream_name <YOUR-KINESIS-STREAM-NAME>
aws_key_id <YOUR-AWS-KEY-ID>
aws_sec_key <YOUR_AWS-SEC_KEY>
</match>
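To illustrate what the parser filter above does to each event, here is a small sketch in plain JavaScript. It is a simulation, not Fluentd code: the function name `parseLogField` and the sample events are hypothetical, and it assumes that with `emit_invalid_record_to_error false` a record whose `log` field is not valid JSON is simply dropped.

```javascript
// Simulates the parser filter: replace the record with the JSON found in
// its "log" field; return null for records that cannot be parsed.
function parseLogField(record) {
  try {
    const parsed = JSON.parse(record.log);
    // Only JSON objects are useful as log records.
    return parsed !== null && typeof parsed === "object" ? parsed : null;
  } catch (err) {
    return null; // assumption: invalid records are silently discarded
  }
}

// Hypothetical events as the Docker logging driver might deliver them.
const events = [
  { log: '{"status": 200, "request": "GET /sign-up HTTP/2.0"}' },
  { log: "this line is not JSON" },
];

const parsedEvents = events.map(parseLogField).filter((r) => r !== null);
console.log(parsedEvents);
```

Only the parsed fields move on to the match block, so what reaches Kinesis Firehose already has the column names used in the Athena table.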
Athena
If you have configured everything correctly, then after a while (by default, Kinesis writes the received data once every 10 minutes) you should see log files in S3. In the "Monitoring" menu of Kinesis Firehose you can see how much data has been written to S3, as well as any errors. Do not forget to grant the Kinesis role write access to the S3 bucket. If Kinesis could not parse something, it will add the errors to the same bucket.
Now you can look at the data in Athena. Let's find the most recent requests for which we returned errors:
SELECT * FROM "db_name"."table_name" WHERE status > 499 ORDER BY created_at DESC limit 10;
Scanning all logs on every query
Our logs are now processed and stored in S3 in ORC, compressed and ready for analysis. Kinesis Firehose has even organized them into directories for each hour. However, as long as the table is not partitioned, Athena will load the data for the entire period on every query, with no restriction. This is a big problem for two reasons:
- The volume of data keeps growing, which slows down queries;
- Athena is billed by the volume of data scanned, with a minimum of 10 MB per query.
To fix this, we use AWS Glue Crawler, which crawls the data in S3 and writes the partition information to the Glue Metastore. This lets us use partitions as a filter when querying Athena, so that it scans only the directories specified in the query.
Setting up AWS Glue Crawler
AWS Glue Crawler scans all the data in an S3 bucket and creates tables with partitions. Create a Glue Crawler from the AWS Glue console and add the bucket where you store the data. You can use one crawler for several buckets, in which case it will create tables in the specified database with names that match the bucket names. If you plan to use this data regularly, be sure to set the Crawler's run schedule to suit your needs. We use one Crawler for all tables, and it runs every hour.
Partitioned tables
After the first run of the crawler, a table for each scanned bucket should appear in the database specified in the settings. Open the Athena console and find the table with the Nginx logs. Let's try to read something:
SELECT * FROM "default"."part_demo_kinesis_bucket"
WHERE(
partition_0 = '2019' AND
partition_1 = '04' AND
partition_2 = '08' AND
partition_3 = '06'
);
This query selects all records received between 6 a.m. and 7 a.m. on April 8, 2019. But how much more efficient is this than simply reading from a non-partitioned table? Let's find out by selecting the same records, filtering them by timestamp:
3.59 seconds and 244.34 megabytes of data on a dataset with only one week of logs. Let's try filtering by partition:
A little faster, but most importantly, only 1.23 megabytes of data! It would be much cheaper still if not for the 10 megabyte minimum per query in the pricing. But it is still much better, and on large datasets the difference will be far more impressive.
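The effect of the 10 MB minimum on the two measurements above can be worked out with a short sketch. The price per terabyte and the rounding up to whole megabytes are assumptions made for illustration (check the current Athena pricing); only the 10 MB minimum and the two scan sizes come from the text.

```javascript
const MIN_BILLED_MB = 10;     // per-query minimum mentioned in the article
const ASSUMED_USD_PER_TB = 5; // assumption: illustrative price, not authoritative
const MB_PER_TB = 1024 * 1024;

// Assumption: scanned data is rounded up to a whole megabyte before billing.
function billedMegabytes(scannedMb) {
  return Math.max(MIN_BILLED_MB, Math.ceil(scannedMb));
}

function estimateCostUsd(scannedMb) {
  return (billedMegabytes(scannedMb) / MB_PER_TB) * ASSUMED_USD_PER_TB;
}

const fullScan = billedMegabytes(244.34);     // query filtered by timestamp
const partitionScan = billedMegabytes(1.23);  // query filtered by partition
console.log(fullScan, partitionScan);         // 245 10
console.log(estimateCostUsd(244.34) / estimateCostUsd(1.23));
```

Under these assumptions the partitioned query is billed for 10 MB rather than 1.23 MB, so the saving on this small dataset is roughly 24 times rather than the nearly 200 times that the raw scan sizes suggest.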
Building a dashboard with Cube.js
To assemble the dashboard, we use the Cube.js analytics framework. It has quite a few features, but we are interested in two: the ability to apply partition filters automatically and data pre-aggregation. It uses a data schema, written in JavaScript, to generate SQL and run the database query.
Let's create a new Cube.js application. Since we are already using the AWS stack, it makes sense to use Lambda for deployment. You can use the express template for generation if you plan to host the Cube.js backend on Heroku or in Docker. The documentation describes other hosting options.
$ npm install -g cubejs-cli
$ cubejs create nginx-log-analytics -t serverless -d athena
Environment variables are used to configure database access in Cube.js. The generator creates a .env file in which you can specify your keys.
Now we need a data schema. In the schema directory, create a file named Logs.js. Here is an example data model for Nginx:
Model code
const partitionFilter = (from, to) => `
  date(from_iso8601_timestamp(${from})) <= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d') AND
  date(from_iso8601_timestamp(${to})) >= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')
`

cube(`Logs`, {
  sql: `
    select * from part_demo_kinesis_bucket
    WHERE ${FILTER_PARAMS.Logs.createdAt.filter(partitionFilter)}
  `,

  measures: {
    count: {
      type: `count`,
    },

    errorCount: {
      type: `count`,
      filters: [
        { sql: `${CUBE.isError} = 'Yes'` }
      ]
    },

    errorRate: {
      type: `number`,
      sql: `100.0 * ${errorCount} / ${count}`,
      format: `percent`
    }
  },

  dimensions: {
    status: {
      sql: `status`,
      type: `number`
    },

    isError: {
      type: `string`,
      case: {
        when: [{
          sql: `${CUBE}.status >= 400`, label: `Yes`
        }],
        else: { label: `No` }
      }
    },

    createdAt: {
      sql: `from_unixtime(created_at)`,
      type: `time`
    }
  }
});
Here we use the FILTER_PARAMS variable to generate a SQL query with a partition filter.
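To see what the partition filter in the schema expands to, the `partitionFilter` function from the model can be called on its own. This is only a sketch: in a real query Cube.js supplies the two arguments itself, and the quoted date literals below are hypothetical values chosen for illustration.

```javascript
// Same function as in the Logs.js model above.
const partitionFilter = (from, to) => `
  date(from_iso8601_timestamp(${from})) <= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d') AND
  date(from_iso8601_timestamp(${to})) >= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')
`;

// Hypothetical date range standing in for the values Cube.js would pass.
const filterSql = partitionFilter(
  "'2019-04-01T00:00:00.000Z'",
  "'2019-04-07T23:59:59.999Z'"
);

console.log(filterSql.trim());
```

The resulting condition compares the requested range with a date assembled from the year, month and day partition columns, which is what allows Athena to skip directories outside the range.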
We also define the metrics and parameters we want to display on the dashboard and specify pre-aggregations. Cube.js will create additional tables with pre-aggregated data and will update them automatically as new data arrives. This not only speeds up queries but also reduces the cost of using Athena.
Let's add this information to the data schema file:
preAggregations: {
  main: {
    type: `rollup`,
    measureReferences: [count, errorCount],
    dimensionReferences: [isError, status],
    timeDimensionReference: createdAt,
    granularity: `day`,
    partitionGranularity: `month`,
    refreshKey: {
      sql: FILTER_PARAMS.Logs.createdAt.filter((from, to) =>
        `select
           CASE WHEN from_iso8601_timestamp(${to}) + interval '3' day > now()
           THEN date_trunc('hour', now()) END`
      )
    }
  }
}
In this model we specify that the data must be pre-aggregated for all the metrics used, and that the pre-aggregation is partitioned by month.
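The refreshKey query in the pre-aggregation can be read as a small decision rule, sketched here in plain JavaScript. The function name `refreshKeyFor` is hypothetical, and the interpretation assumes that Cube.js rebuilds a partition whenever the value returned by its refreshKey changes.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Mirrors the CASE expression: while the partition ended less than three
// days ago, the key is the current hour (so it changes hourly); afterwards
// it is null and stays constant.
function refreshKeyFor(partitionEnd, now) {
  const recentlyActive = partitionEnd.getTime() + 3 * DAY_MS > now.getTime();
  if (!recentlyActive) {
    return null;
  }
  const hour = new Date(now.getTime());
  hour.setUTCMinutes(0, 0, 0); // truncate to the start of the hour
  return hour.toISOString();
}

const now = new Date("2019-04-08T06:45:10Z");
console.log(refreshKeyFor(new Date("2019-04-30T23:59:59.999Z"), now)); // 2019-04-08T06:00:00.000Z
console.log(refreshKeyFor(new Date("2019-03-31T23:59:59.999Z"), now)); // null
```

Read this way, only the current month (and a month that ended within the last three days) is refreshed every hour, while older monthly partitions are left alone.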
Now we can assemble the dashboard!
The Cube.js backend provides a REST API and client libraries for front-end frameworks.
The Cube.js server accepts queries in JSON format:
{
"measures": ["Logs.errorCount"],
"timeDimensions": [
{
"dimension": "Logs.createdAt",
"dateRange": ["2019-01-01", "2019-01-07"],
"granularity": "day"
}
]
}
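As a rough illustration of how such a query could be sent without a client library, the sketch below builds a request URL. It assumes the REST endpoint is `/load` under the API path and that the query travels as a URL-encoded `query` parameter; the helper name `buildLoadUrl` is hypothetical, and the base URL is the local one used in the dashboard code later in the article.

```javascript
// Hypothetical helper: encodes a Cube.js query into a GET URL.
// Assumption: the server exposes <apiUrl>/load?query=<JSON>.
function buildLoadUrl(apiUrl, query) {
  const params = new URLSearchParams({ query: JSON.stringify(query) });
  return `${apiUrl}/load?${params.toString()}`;
}

const errorsPerDay = {
  measures: ["Logs.errorCount"],
  timeDimensions: [
    {
      dimension: "Logs.createdAt",
      dateRange: ["2019-01-01", "2019-01-07"],
      granularity: "day",
    },
  ],
};

const loadUrl = buildLoadUrl("http://localhost:4000/cubejs-api/v1", errorsPerDay);
console.log(loadUrl);
```

A real request would also need to carry the API token, which is one reason the client library used in the following steps is more convenient.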
Let's install the Cube.js client and the React component library via NPM:
$ npm i --save @cubejs-client/core @cubejs-client/react
We import the cubejs and QueryRenderer components to load the data, and assemble the dashboard:
Dashboard code
import React from 'react';
import { LineChart, Line, XAxis, YAxis } from 'recharts';
import cubejs from '@cubejs-client/core';
import { QueryRenderer } from '@cubejs-client/react';

const cubejsApi = cubejs(
  'YOUR-CUBEJS-API-TOKEN',
  { apiUrl: 'http://localhost:4000/cubejs-api/v1' },
);

export default () => {
  return (
    <QueryRenderer
      query={{
        measures: ['Logs.errorCount'],
        timeDimensions: [{
          dimension: 'Logs.createdAt',
          dateRange: ['2019-01-01', '2019-01-07'],
          granularity: 'day'
        }]
      }}
      cubejsApi={cubejsApi}
      render={({ resultSet }) => {
        if (!resultSet) {
          return 'Loading...';
        }

        return (
          <LineChart data={resultSet.rawData()}>
            <XAxis dataKey="Logs.createdAt"/>
            <YAxis/>
            <Line type="monotone" dataKey="Logs.errorCount" stroke="#8884d8"/>
          </LineChart>
        );
      }}
    />
  )
}
The dashboard source code is available at
Source: www.habr.com