Typically, commercial products or ready-made open-source tools such as Prometheus + Grafana are used to monitor and analyze the health of Nginx. They are a good fit for monitoring and real-time analytics, but not very convenient for historical analysis. On any popular resource, the volume of data coming from nginx logs grows quickly, and for analyzing large amounts of data it makes sense to use something more specialized.
In this article, I'll show you how to analyze Nginx logs with Amazon Athena and build an analytical dashboard on top of them with Cube.js.
TL;DR
To collect the data we will use Fluentd; to process it, Kinesis Data Firehose and AWS Glue; to store it, S3; and to analyze it, AWS Athena and Cube.js.
Collecting Nginx logs
By default, Nginx logs look like this:
1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-up HTTP/2.0" 200 9168 "https://example.com/sign-in" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-in HTTP/2.0" 200 9168 "https://example.com/sign-up" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
They can be parsed, but it is much easier to adjust the Nginx configuration so that it produces logs in JSON:
log_format json_combined escape=json '{ "created_at": "$msec", '
'"remote_addr": "$remote_addr", '
'"remote_user": "$remote_user", '
'"request": "$request", '
'"status": $status, '
'"bytes_sent": $bytes_sent, '
'"request_length": $request_length, '
'"request_time": $request_time, '
'"http_referrer": "$http_referer", '
'"http_x_forwarded_for": "$http_x_forwarded_for", '
'"http_user_agent": "$http_user_agent" }';
access_log /var/log/nginx/access.log json_combined;
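With this format, the access-log entry shown earlier would come out roughly like this (a hand-made illustration, reformatted onto multiple lines for readability; the request_length and request_time values are made up):

```
{
  "created_at": "1554803897.000",
  "remote_addr": "1.1.1.1",
  "remote_user": "",
  "request": "GET /sign-up HTTP/2.0",
  "status": 200,
  "bytes_sent": 9168,
  "request_length": 123,
  "request_time": 0.002,
  "http_referrer": "https://example.com/sign-in",
  "http_x_forwarded_for": "",
  "http_user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) ..."
}
```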
S3 for storage
We will use S3 to store the logs. This lets you store and analyze the logs in one place, since Athena can work with data in S3 directly. Later in the article I'll show how to properly add and process the logs, but first we need a clean bucket in S3 in which nothing else is stored. It is worth thinking in advance about which region you will create your bucket in, because Athena is not available in all regions.
Creating a table in the Athena console
Let's create a table in Athena for the logs. It is needed for both writing and reading if you plan to use Kinesis Firehose. Open the Athena console and create a table:
SQL for creating the table
CREATE EXTERNAL TABLE `kinesis_logs_nginx`(
`created_at` double,
`remote_addr` string,
`remote_user` string,
`request` string,
`status` int,
`bytes_sent` int,
`request_length` int,
`request_time` double,
`http_referrer` string,
`http_x_forwarded_for` string,
`http_user_agent` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
's3://<YOUR-S3-BUCKET>'
TBLPROPERTIES ('has_encrypted_data'='false');
Creating a Kinesis Firehose stream
Kinesis Firehose will write the data received from Nginx to S3 in the chosen format, partitioning it into directories in the YYYY/MM/DD/HH format. This will come in handy later when reading the data. You could, of course, write directly to S3 from fluentd, but in that case you would have to write JSON, which is inefficient because of the large file sizes. Besides, when using PrestoDB or Athena, JSON is the slowest data format. So open the Kinesis Firehose console, click "Create delivery stream", and select "Direct PUT" as the source:
On the next page, set "Record format conversion" to "Enabled" and choose "Apache ORC" as the format. According to some research, it is the optimal format for PrestoDB and Athena.
For storage we select S3 and the bucket we created earlier. AWS Glue Crawler, which I'll cover a bit later, cannot work with prefixes inside an S3 bucket, so it is important to leave the prefix empty.
The remaining options can be changed depending on your load; I usually use the defaults. Note that S3 compression is not available, but ORC uses native compression by default.
Fluentd
Now that we have configured storing and receiving the logs, we need to configure sending them. We will use Fluentd.
First of all, we need the fluent.conf configuration file. Create it and add a source:
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
Now you can start the Fluentd server. If you need a more advanced configuration, go to Docker Hub, where there is a detailed manual, including how to build your own image.
$ docker run \
  -d \
  -p 24224:24224 \
  -p 24224:24224/udp \
  -v /data:/fluentd/log \
  -v <PATH-TO-FLUENT-CONF>:/fluentd/etc \
  fluent/fluentd:stable \
  fluentd -c /fluentd/etc/fluent.conf
This configuration uses the /fluentd/log path to buffer records before sending. You can do without it, but then on restart you may lose everything that was buffered. You can also use any port; 24224 is the default Fluentd port.
Now that we have Fluentd running, we can send the Nginx logs there. We usually run Nginx in a Docker container, in which case Docker has a native logging driver for Fluentd:
$ docker run \
  --log-driver=fluentd \
  --log-opt fluentd-address=<FLUENTD-SERVER-ADDRESS> \
  --log-opt tag="{{.Name}}" \
  -v /some/content:/usr/share/nginx/html:ro \
  -d \
  nginx
If you run Nginx differently, you can use the log files; Fluentd has a file tail plugin.
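As an alternative to the Docker logging driver, a minimal sketch of such a tail source might look like this (the paths and tag here are assumptions; since the log file already contains JSON, it can be parsed directly at the source, making the parser filter below unnecessary in this setup):

```
<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/fluentd/nginx-access.pos
  tag nginx.access
  <parse>
    @type json
  </parse>
</source>
```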
Add parsing of the logs configured above to the Fluent configuration:
<filter YOUR-NGINX-TAG.*>
  @type parser
  key_name log
  emit_invalid_record_to_error false
  <parse>
    @type json
  </parse>
</filter>
And send the logs to Kinesis using the kinesis_firehose output plugin:
<match YOUR-NGINX-TAG.*>
  @type kinesis_firehose
  region <YOUR-AWS-REGION>
  delivery_stream_name <YOUR-KINESIS-STREAM-NAME>
  aws_key_id <YOUR-AWS-KEY-ID>
  aws_sec_key <YOUR-AWS-SEC-KEY>
</match>
Athena
If you have configured everything correctly, then after a while (by default, Kinesis writes the accumulated data once every 10 minutes) you should see log files in S3. In the "Monitoring" tab of Kinesis Firehose you can see how much data is being written to S3, as well as any errors. Don't forget to grant the Kinesis role write access to the S3 bucket. If Kinesis cannot parse something, it will add the errors to the same bucket.
Now you can query the data in Athena. Let's find the most recent requests for which we returned errors:
SELECT * FROM "db_name"."table_name" WHERE status > 499 ORDER BY created_at DESC limit 10;
Scanning all records on every query
Now our logs are processed and stored in S3 in ORC format, compressed and ready for analysis. Kinesis Firehose has even organized them into directories for each hour. However, as long as the table is not partitioned, Athena will load all-time data on every query, with rare exceptions. This is a big problem for two reasons:
- The volume of data is constantly growing, slowing down queries;
- Athena is billed based on the volume of data scanned, with a minimum of 10 MB per query.
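To put the second point in numbers, here is a small sketch of Athena's billing model: you pay per data scanned, with a 10 MB minimum per query. The $5-per-TB on-demand rate is an assumption here; check the current pricing for your region.

```python
def athena_query_cost(bytes_scanned, price_per_tb=5.0,
                      min_billed_bytes=10 * 1024**2):
    """Estimated cost of a single Athena query in USD."""
    # Queries scanning less than the minimum are billed as the minimum.
    billed = max(bytes_scanned, min_billed_bytes)
    return billed / 1024**4 * price_per_tb

# A 1.23 MB scan is billed the same as a 10 MB scan because of the minimum:
assert athena_query_cost(1.23 * 1024**2) == athena_query_cost(10 * 1024**2)
```

This is why partitioning pays off twice: queries get faster, and the billed scan volume drops toward the minimum.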
To fix this, we use AWS Glue Crawler, which will crawl the data in S3 and write the partition information to the Glue Metastore. This will let us use partitions as a filter when querying Athena, so that it scans only the directories specified in the query.
Setting up Amazon Glue Crawler
Amazon Glue Crawler scans all the data in the S3 bucket and creates tables with partitions. Create a Glue Crawler from the AWS Glue console and add the bucket where you store the data. You can use one crawler for several buckets; it will then create tables in the specified database with names matching the bucket names. If you plan to use this data regularly, be sure to configure the crawler's launch schedule to suit your needs. We use one crawler for all tables, and it runs every hour.
Partitioned tables
After the crawler's first run, tables for each scanned bucket should appear in the database specified in the settings. Open the Athena console and find the table with the Nginx logs. Let's try to read something:
SELECT * FROM "default"."part_demo_kinesis_bucket"
WHERE(
partition_0 = '2019' AND
partition_1 = '04' AND
partition_2 = '08' AND
partition_3 = '06'
);
This query will select all records received between 6 a.m. and 7 a.m. on April 8, 2019. But how much more efficient is that than just reading from a non-partitioned table? Let's find out by selecting the same records, filtering them by timestamp:
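Such a timestamp filter might look like this (a sketch; the table and column names match the schema above, and `created_at` holds the epoch-seconds value of `$msec`):

```
SELECT * FROM "default"."part_demo_kinesis_bucket"
WHERE created_at BETWEEN to_unixtime(timestamp '2019-04-08 06:00:00')
                     AND to_unixtime(timestamp '2019-04-08 07:00:00');
```

Since `created_at` is not a partition column, Athena has to scan every file to evaluate this predicate.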
3.59 seconds and 244.34 megabytes of data, on a dataset with just one week of logs. Let's try filtering by partition:
A bit faster, but most importantly, only 1.23 megabytes of data! It would be much cheaper if not for the minimum of 10 megabytes per query in the pricing. But it is still much better, and on large datasets the difference will be far more impressive.
Building a dashboard with Cube.js
To build the dashboard, we will use the Cube.js analytical framework. It has quite a few features, but we are interested in two: the ability to use partition filters automatically, and data pre-aggregation. It uses a data schema, written in JavaScript, to generate SQL and execute the database queries. We only need to specify how to apply the partition filter in the data schema.
Let's create a new Cube.js application. Since we are already using the AWS stack, it makes sense to use Lambda for deployment. You can use the express template for generation if you plan to host the Cube.js backend on Heroku or in Docker. The documentation describes other hosting options.
$ npm install -g cubejs-cli
$ cubejs create nginx-log-analytics -t serverless -d athena
Database access in cube.js is configured with environment variables. The generator will create a .env file where you can specify your keys for Athena.
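For the Athena driver, the .env file might look roughly like this (all values are placeholders; the variable names follow the Cube.js Athena driver configuration):

```
CUBEJS_DB_TYPE=athena
CUBEJS_AWS_KEY=<YOUR-AWS-KEY-ID>
CUBEJS_AWS_SECRET=<YOUR-AWS-SECRET-KEY>
CUBEJS_AWS_REGION=<YOUR-AWS-REGION>
CUBEJS_AWS_S3_OUTPUT_LOCATION=s3://<YOUR-ATHENA-OUTPUT-BUCKET>
```

The S3 output location is the bucket where Athena writes its query results; it should be separate from the log bucket.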
Now we need the data schema, which defines exactly how our logs are stored and how to compute metrics from them. In the schema directory, create a file Logs.js. Here is an example data model for nginx:
Model code
const partitionFilter = (from, to) => `
date(from_iso8601_timestamp(${from})) <= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d') AND
date(from_iso8601_timestamp(${to})) >= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')
`
cube(`Logs`, {
sql: `
select * from part_demo_kinesis_bucket
WHERE ${FILTER_PARAMS.Logs.createdAt.filter(partitionFilter)}
`,
measures: {
count: {
type: `count`,
},
errorCount: {
type: `count`,
filters: [
{ sql: `${CUBE.isError} = 'Yes'` }
]
},
errorRate: {
type: `number`,
sql: `100.0 * ${errorCount} / ${count}`,
format: `percent`
}
},
dimensions: {
status: {
sql: `status`,
type: `number`
},
isError: {
type: `string`,
case: {
when: [{
sql: `${CUBE}.status >= 400`, label: `Yes`
}],
else: { label: `No` }
}
},
createdAt: {
sql: `from_unixtime(created_at)`,
type: `time`
}
}
});
Here we are using the FILTER_PARAMS variable to generate a SQL query with a partition filter.
We also set the measures and dimensions we want to display on the dashboard and specify pre-aggregations. Cube.js will create additional tables with pre-aggregated data and will automatically update them as new data arrives. This not only speeds up queries but also reduces the cost of using Athena.
Let's add this information to the data schema file:
preAggregations: {
main: {
type: `rollup`,
measureReferences: [count, errorCount],
dimensionReferences: [isError, status],
timeDimensionReference: createdAt,
granularity: `day`,
partitionGranularity: `month`,
refreshKey: {
sql: FILTER_PARAMS.Logs.createdAt.filter((from, to) =>
`select
CASE WHEN from_iso8601_timestamp(${to}) + interval '3' day > now()
THEN date_trunc('hour', now()) END`
)
}
}
}
In this model we specify that the data should be pre-aggregated for all of the measures used, and partitioned by month.
Now we can build the dashboard!
The Cube.js backend provides a REST API for querying the data. The server accepts queries in the following format:
{
"measures": ["Logs.errorCount"],
"timeDimensions": [
{
"dimension": "Logs.createdAt",
"dateRange": ["2019-01-01", "2019-01-07"],
"granularity": "day"
}
]
}
Install the Cube.js client and the React component library via NPM:
$ npm i --save @cubejs-client/core @cubejs-client/react
We import the cubejs client and the QueryRenderer component to fetch the data, and assemble the dashboard:
Dashboard code
import React from 'react';
import { LineChart, Line, XAxis, YAxis } from 'recharts';
import cubejs from '@cubejs-client/core';
import { QueryRenderer } from '@cubejs-client/react';
const cubejsApi = cubejs(
'YOUR-CUBEJS-API-TOKEN',
{ apiUrl: 'http://localhost:4000/cubejs-api/v1' },
);
export default () => {
return (
<QueryRenderer
query={{
measures: ['Logs.errorCount'],
timeDimensions: [{
dimension: 'Logs.createdAt',
dateRange: ['2019-01-01', '2019-01-07'],
granularity: 'day'
}]
}}
cubejsApi={cubejsApi}
render={({ resultSet }) => {
if (!resultSet) {
return 'Loading...';
}
return (
<LineChart data={resultSet.rawData()}>
<XAxis dataKey="Logs.createdAt"/>
<YAxis/>
<Line type="monotone" dataKey="Logs.errorCount" stroke="#8884d8"/>
</LineChart>
);
}}
/>
)
}
The dashboard source code is available online.
Source: www.habr.com