Typically, commercial products or ready-made open-source alternatives, such as Prometheus + Grafana, are used to monitor and analyze Nginx performance. This is a good option for monitoring or real-time analytics, but it is not very convenient for historical analysis. On any popular resource, the volume of data from nginx logs grows quickly, and to analyze a large amount of data it makes sense to use something more specialized.
In this article I will tell you how to use
TL;DR
To collect the information we use
Collecting Nginx logs
By default, Nginx logs look something like this:
4/9/2019 12:58:17 PM1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-up HTTP/2.0" 200 9168 "https://example.com/sign-in" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
4/9/2019 12:58:17 PM1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-in HTTP/2.0" 200 9168 "https://example.com/sign-up" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
They can be parsed, but it is much easier to change the Nginx configuration so that it emits logs in JSON:
log_format json_combined escape=json '{ "created_at": "$msec", '
'"remote_addr": "$remote_addr", '
'"remote_user": "$remote_user", '
'"request": "$request", '
'"status": $status, '
'"bytes_sent": $bytes_sent, '
'"request_length": $request_length, '
'"request_time": $request_time, '
'"http_referrer": "$http_referer", '
'"http_x_forwarded_for": "$http_x_forwarded_for", '
'"http_user_agent": "$http_user_agent" }';
access_log /var/log/nginx/access.log json_combined;
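As a rough illustration of why the JSON format is convenient, the sketch below reads one such line with nothing but the standard library. The sample line and the helper name `parse_access_line` are hypothetical; only the field names come from the `log_format` above.

```python
import json
from datetime import datetime, timezone

# Hypothetical sample line in the json_combined format; the values are made up.
SAMPLE_LINE = (
    '{ "created_at": "1554803897.123", "remote_addr": "1.1.1.1", '
    '"remote_user": "", "request": "GET /sign-up HTTP/2.0", '
    '"status": 200, "bytes_sent": 9168, "request_length": 52, '
    '"request_time": 0.004, "http_referrer": "https://example.com/sign-in", '
    '"http_x_forwarded_for": "", "http_user_agent": "Mozilla/5.0" }'
)


def parse_access_line(line: str) -> dict:
    """Parse one JSON access-log line and add a timezone-aware timestamp."""
    record = json.loads(line)
    # $msec is quoted in the log_format, so it arrives as a string.
    record["created_at"] = float(record["created_at"])
    record["timestamp"] = datetime.fromtimestamp(record["created_at"], tz=timezone.utc)
    return record


record = parse_access_line(SAMPLE_LINE)
print(record["status"], record["request"], record["timestamp"].isoformat())
```

No regular expression is needed, and numeric fields such as `status` are already numbers.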
S3 for storage
To store the logs, we will use S3. This lets you store and analyze logs in one place, since Athena can work with data in S3 directly. Later in the article I will tell you how to add and process logs correctly, but first we need a clean bucket in S3 in which nothing else will be stored. It is worth deciding in advance which region you will create the bucket in, because Athena is not available in all regions.
Creating a schema in the Athena console
Let's create a table in Athena for the logs. It is needed for both writing and reading if you plan to use Kinesis Firehose. Open the Athena console and create a table:
Table creation SQL
CREATE EXTERNAL TABLE `kinesis_logs_nginx`(
`created_at` double,
`remote_addr` string,
`remote_user` string,
`request` string,
`status` int,
`bytes_sent` int,
`request_length` int,
`request_time` double,
`http_referrer` string,
`http_x_forwarded_for` string,
`http_user_agent` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
's3://<YOUR-S3-BUCKET>'
TBLPROPERTIES ('has_encrypted_data'='false');
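To make the column types concrete, here is a minimal sketch that casts a parsed log record to the types declared in the table. It only illustrates the schema and is not how Kinesis Firehose performs the conversion; the names `SCHEMA` and `coerce_record` are hypothetical.

```python
# Column types copied from the CREATE TABLE statement (double -> float, int -> int).
SCHEMA = {
    "created_at": float,
    "remote_addr": str,
    "remote_user": str,
    "request": str,
    "status": int,
    "bytes_sent": int,
    "request_length": int,
    "request_time": float,
    "http_referrer": str,
    "http_x_forwarded_for": str,
    "http_user_agent": str,
}


def coerce_record(raw: dict) -> dict:
    """Return a record with exactly the table's columns, cast to their types.

    Missing columns become None; unknown keys are dropped.
    Raises ValueError if a value cannot be cast.
    """
    row = {}
    for column, cast in SCHEMA.items():
        value = raw.get(column)
        row[column] = None if value is None else cast(value)
    return row


row = coerce_record({"created_at": "1554803897.123", "status": "200", "extra": 1})
print(row["created_at"], row["status"], row["remote_addr"])
```

A check like this can help catch a log line whose fields no longer match the table before it reaches the delivery stream.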
Creating a Kinesis Firehose stream
Kinesis Firehose will write the data received from Nginx to S3 in the selected format, split into directories in the YYYY/MM/DD/HH format. This will come in handy when reading the data. You can, of course, write directly to S3 from fluentd, but in that case you will have to write JSON, which is inefficient because of the large size of the files. In addition, when using PrestoDB or Athena, JSON is the slowest data format. So open the Kinesis Firehose console, click "Create delivery stream", and select "Direct PUT" in the "delivery" field:
On the next tab, select "Record format conversion" - "Enabled" and choose "Apache ORC" as the record format. According to some research
We select S3 for storage and the bucket created earlier. AWS Glue Crawler, which I will talk about a little later, cannot work with prefixes in an S3 bucket, so it is important to leave the prefix empty.
The remaining options can be changed depending on your load; I usually use the defaults. Note that S3 compression is not available, but ORC uses native compression by default.
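For reference, the YYYY/MM/DD/HH directory layout can be sketched as a small function. This assumes the prefix is derived from the record's arrival time in UTC; the function name `hourly_prefix` is hypothetical.

```python
from datetime import datetime, timezone


def hourly_prefix(arrival: datetime) -> str:
    """Return the YYYY/MM/DD/HH/ prefix for a record's arrival time, in UTC."""
    utc = arrival.astimezone(timezone.utc)
    return utc.strftime("%Y/%m/%d/%H/")


print(hourly_prefix(datetime(2019, 4, 8, 6, 30, tzinfo=timezone.utc)))
```

Every object written during the same hour shares one prefix, which is what later makes hourly partitions possible.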
Fluentd
Now that we have configured storing and receiving the logs, we need to configure sending. We will use
First, we need the fluent.conf configuration file. Create it and add a source:
port 24224
bind 0.0.0.0
Now you can start the Fluentd server. If you need a more advanced configuration, go to
$ docker run \
  -d \
  -p 24224:24224 \
  -p 24224:24224/udp \
  -v /data:/fluentd/log \
  -v <PATH-TO-FLUENT-CONF>:/fluentd/etc \
  fluent/fluentd:stable \
  fluentd -c /fluentd/etc/fluent.conf
This configuration uses the path /fluentd/log to cache logs before sending them. You can do without it, but then on a restart you may lose everything that was cached through back-breaking labor. You can also use any port; 24224 is the default Fluentd port.
Now that Fluentd is running, we can send Nginx logs to it. We usually run Nginx in a Docker container, in which case Docker has a native logging driver for Fluentd:
$ docker run \
  --log-driver=fluentd \
  --log-opt fluentd-address=<FLUENTD-SERVER-ADDRESS> \
  --log-opt tag="{{.Name}}" \
  -v /some/content:/usr/share/nginx/html:ro \
  -d \
  nginx
If you run Nginx differently, you can use log files; Fluentd has
Let's add parsing of the log format configured above to the Fluent configuration:
<filter YOUR-NGINX-TAG.*>
@type parser
key_name log
emit_invalid_record_to_error false
<parse>
@type json
</parse>
</filter>
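What this filter does to each event can be approximated in a few lines. This is a simplified sketch of my reading of the configuration above, not Fluentd's implementation: the record is replaced by the JSON parsed from its `log` field, and records that fail to parse are skipped. The function name and the sample event are hypothetical.

```python
import json
from typing import Optional


def parse_log_field(record: dict, key_name: str = "log") -> Optional[dict]:
    """Mimic the parser filter: replace the record with the parsed JSON in `key_name`.

    Returns None when the field is missing or is not a JSON object, which stands
    in for the record being skipped.
    """
    raw = record.get(key_name)
    if raw is None:
        return None
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        return None
    return parsed if isinstance(parsed, dict) else None


# Hypothetical event as the Docker logging driver might deliver it.
event = {
    "container_name": "/nginx",
    "source": "stdout",
    "log": '{"status": 502, "request": "GET / HTTP/1.1"}',
}
print(parse_log_field(event))
print(parse_log_field({"log": "not json"}))
```

After this step only the fields defined in the Nginx `log_format` remain in the record, which is what the table schema expects.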
And sending the logs to Kinesis using
<match YOUR-NGINX-TAG.*>
@type kinesis_firehose
region <YOUR-AWS-REGION>
delivery_stream_name <YOUR-KINESIS-STREAM-NAME>
aws_key_id <YOUR-AWS-KEY-ID>
aws_sec_key <YOUR_AWS-SEC_KEY>
</match>
Athena
If you have configured everything correctly, then after a while (by default, Kinesis writes the received data once every 10 minutes) you should see log files in S3. In the "monitoring" menu of Kinesis Firehose you can see how much data has been written to S3, as well as errors. Do not forget to give the Kinesis role write access to the S3 bucket. If Kinesis could not parse something, it will add the errors to the same bucket.
Now you can view the data in Athena. Let's find the latest requests for which we returned errors:
SELECT * FROM "db_name"."table_name" WHERE status > 499 ORDER BY created_at DESC limit 10;
Scanning all records for every query
Now our logs are processed and stored in S3 in ORC, compressed and ready for analysis. Kinesis Firehose even organized them into directories for each hour. However, as long as the table is not partitioned, Athena will load all-time data on every query, with rare exceptions. This is a big problem for two reasons:
- The volume of data grows constantly, slowing down queries;
- Athena is billed based on the volume of data scanned, with a minimum of 10 MB per query.
To fix this, we use AWS Glue Crawler, which crawls the data in S3 and writes partition information to the Glue Metastore. This lets us use partitions as a filter when querying Athena, so that it scans only the directories specified in the query.
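A partition filter for a single hour can be generated from a timestamp. The sketch below assumes the crawler names the four directory levels `partition_0` to `partition_3` (year, month, day, hour); the helper name `partition_predicate` is hypothetical.

```python
from datetime import datetime


def partition_predicate(hour: datetime) -> str:
    """Build a WHERE fragment selecting the single hourly partition for `hour`."""
    parts = hour.strftime("%Y %m %d %H").split()
    return " AND ".join(f"partition_{i} = '{value}'" for i, value in enumerate(parts))


print(partition_predicate(datetime(2019, 4, 8, 6)))
```

The partition values are strings, so the zero padding from the directory names has to be preserved.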
Setting up AWS Glue Crawler
AWS Glue Crawler scans all the data in an S3 bucket and creates tables with partitions. Create a Glue Crawler from the AWS Glue console and add the bucket where you store the data. You can use one crawler for several buckets, in which case it will create tables in the specified database with names matching the bucket names. If you plan to use this data regularly, be sure to configure the Crawler's launch schedule to suit your needs. We use one Crawler for all tables, and it runs every hour.
Partitioned tables
After the first run of the crawler, tables for each scanned bucket should appear in the database specified in the settings. Open the Athena console and find the table with the Nginx logs. Let's try to read something:
SELECT * FROM "default"."part_demo_kinesis_bucket"
WHERE(
partition_0 = '2019' AND
partition_1 = '04' AND
partition_2 = '08' AND
partition_3 = '06'
);
This query will select all records received between 6 a.m. and 7 a.m. on April 8, 2019. But how much more efficient is this than simply reading from a non-partitioned table? Let's find out and select the same records, filtering them by timestamp:
3.59 seconds and 244.34 megabytes of data on a dataset with only a week of logs. Let's try filtering by partition:
Slightly faster, but most importantly - only 1.23 megabytes of data! It would be much cheaper if not for the minimum of 10 megabytes per query in the pricing. But it is still much better, and on large datasets the difference will be far more impressive.
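The effect of the 10 MB minimum on these two measurements can be worked out directly. This is a small arithmetic sketch using only the figures quoted above; the function name `billed_mb` is hypothetical, and no price per unit of data is assumed.

```python
MIN_BILLED_MB = 10.0  # per-query minimum stated in the article


def billed_mb(scanned_mb: float) -> float:
    """Return the data volume a query is billed for, applying the minimum."""
    return max(scanned_mb, MIN_BILLED_MB)


full_scan = billed_mb(244.34)   # filter by timestamp
partitioned = billed_mb(1.23)   # filter by partition
print(f"billed: {full_scan} MB vs {partitioned} MB")
print(f"scanned ratio: {244.34 / 1.23:.0f}x, billed ratio: {full_scan / partitioned:.1f}x")
```

The partitioned query scans far less data than it is billed for, so on this small dataset the saving is capped by the minimum.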
Building a dashboard using Cube.js
To build the dashboard, we use the Cube.js analytical framework. It has quite a lot of features, but we are interested in two: the ability to apply partition filters automatically and data pre-aggregation. It uses a data schema
Let's create a new Cube.js application. Since we are already using the AWS stack, it makes sense to use Lambda for deployment. You can use the express template for generation if you plan to host the Cube.js backend on Heroku or in Docker. The documentation describes other
$ npm install -g cubejs-cli
$ cubejs create nginx-log-analytics -t serverless -d athena
Environment variables are used to configure database access in cube.js. The generator will create an .env file in which you can specify your keys for
Now we need
In the schema directory, create a file Logs.js. Here is an example data model for nginx:
Model code
const partitionFilter = (from, to) => `
date(from_iso8601_timestamp(${from})) <= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d') AND
date(from_iso8601_timestamp(${to})) >= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')
`
cube(`Logs`, {
sql: `
select * from part_demo_kinesis_bucket
WHERE ${FILTER_PARAMS.Logs.createdAt.filter(partitionFilter)}
`,
measures: {
count: {
type: `count`,
},
errorCount: {
type: `count`,
filters: [
{ sql: `${CUBE.isError} = 'Yes'` }
]
},
errorRate: {
type: `number`,
sql: `100.0 * ${errorCount} / ${count}`,
format: `percent`
}
},
dimensions: {
status: {
sql: `status`,
type: `number`
},
isError: {
type: `string`,
case: {
when: [{
sql: `${CUBE}.status >= 400`, label: `Yes`
}],
else: { label: `No` }
}
},
createdAt: {
sql: `from_unixtime(created_at)`,
type: `time`
}
}
});
Here we use the variable
We also define the metrics and parameters that we want to display on the dashboard and specify pre-aggregations. Cube.js will create additional tables with pre-aggregated data and will update them automatically as data arrives. This not only speeds up queries but also reduces the cost of using Athena.
Let's add this information to the data schema file:
preAggregations: {
main: {
type: `rollup`,
measureReferences: [count, errorCount],
dimensionReferences: [isError, status],
timeDimensionReference: createdAt,
granularity: `day`,
partitionGranularity: `month`,
refreshKey: {
sql: FILTER_PARAMS.Logs.createdAt.filter((from, to) =>
`select
CASE WHEN from_iso8601_timestamp(${to}) + interval '3' day > now()
THEN date_trunc('hour', now()) END`
)
}
}
}
In this model we specify that data must be pre-aggregated for all the metrics used, with partitioning by month.
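The CASE expression in refreshKey can be read as follows. This is my interpretation rather than documented behavior: a partition whose range ended less than three days ago yields a key that changes every hour, while an older partition yields NULL, a constant, so it is presumably not rebuilt. The function name and dates are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def refresh_key(partition_end: datetime, now: datetime) -> Optional[datetime]:
    """Mirror the CASE expression in refreshKey for one partition.

    Returns the current hour while the partition ended less than 3 days ago
    (or is still open), otherwise None, i.e. a value that never changes.
    """
    if partition_end + timedelta(days=3) > now:
        return now.replace(minute=0, second=0, microsecond=0)
    return None


now = datetime(2019, 4, 9, 12, 58, 17)
print(refresh_key(datetime(2019, 4, 30, 23, 59, 59), now))  # current month
print(refresh_key(datetime(2019, 3, 31, 23, 59, 59), now))  # ended 9 days ago
```

Under this reading, only the most recent monthly partitions trigger new Athena queries, which keeps the cost of old data close to zero.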
Now we can build the dashboard!
Cube.js backend provides
The Cube.js server accepts the query in
{
"measures": ["Logs.errorCount"],
"timeDimensions": [
{
"dimension": "Logs.createdAt",
"dateRange": ["2019-01-01", "2019-01-07"],
"granularity": "day"
}
]
}
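For experimenting outside of React, the same query can be encoded into a request URL. This is a sketch under assumptions: it targets the `load` endpoint under the base URL used in the article's dashboard code, and it leaves out authentication with the API token. The helper name `load_url` is hypothetical.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

API_URL = "http://localhost:4000/cubejs-api/v1"  # base URL from the dashboard code

query = {
    "measures": ["Logs.errorCount"],
    "timeDimensions": [
        {
            "dimension": "Logs.createdAt",
            "dateRange": ["2019-01-01", "2019-01-07"],
            "granularity": "day",
        }
    ],
}


def load_url(api_url: str, query: dict) -> str:
    """Build a GET URL for the `load` endpoint with the query JSON-encoded."""
    return f"{api_url}/load?{urlencode({'query': json.dumps(query)})}"


url = load_url(API_URL, query)
print(url)

# Round trip: decoding the URL should give back the original query.
decoded = json.loads(parse_qs(urlsplit(url).query)["query"][0])
print(decoded == query)
```

The URL can then be requested with any HTTP client once the token is supplied.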
Let's install the Cube.js client and the React component library via NPM:
$ npm i --save @cubejs-client/core @cubejs-client/react
We import the cubejs and QueryRenderer components to load the data, and build the dashboard:
Dashboard code
import React from 'react';
import { LineChart, Line, XAxis, YAxis } from 'recharts';
import cubejs from '@cubejs-client/core';
import { QueryRenderer } from '@cubejs-client/react';
const cubejsApi = cubejs(
'YOUR-CUBEJS-API-TOKEN',
{ apiUrl: 'http://localhost:4000/cubejs-api/v1' },
);
export default () => {
return (
<QueryRenderer
query={{
measures: ['Logs.errorCount'],
timeDimensions: [{
dimension: 'Logs.createdAt',
dateRange: ['2019-01-01', '2019-01-07'],
granularity: 'day'
}]
}}
cubejsApi={cubejsApi}
render={({ resultSet }) => {
if (!resultSet) {
return 'Loading...';
}
return (
<LineChart data={resultSet.rawData()}>
<XAxis dataKey="Logs.createdAt"/>
<YAxis/>
<Line type="monotone" dataKey="Logs.errorCount" stroke="#8884d8"/>
</LineChart>
);
}}
/>
)
}
The dashboard sources are available at
Source: www.habr.com