Typically, commercial products or ready-made open-source alternatives, such as Prometheus + Grafana, are used to monitor and analyze the operation of Nginx. This is a good option for monitoring or real-time analytics, but not very convenient for historical analysis. On any popular resource, the volume of data coming from nginx logs grows quickly, and to analyze a large amount of data it makes sense to use something more specialized.
In this article, I will show you how to use Athena to analyze Nginx logs, and how to assemble an analytical dashboard from this data using the open-source Cube.js framework.
TL;DR:
To collect, deliver, and store the logs, we use Fluentd, Kinesis Data Firehose, and S3; Athena to query them; and Cube.js to build the dashboard.
Collecting Nginx logs
By default, Nginx logs look something like this:
1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-up HTTP/2.0" 200 9168 "https://example.com/sign-in" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-in HTTP/2.0" 200 9168 "https://example.com/sign-up" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
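A line in this combined format can be pulled apart with a regular expression; a minimal Python sketch (the pattern covers only the fields shown above and is deliberately simplistic):

```python
import re

# Naive pattern for the default combined log format shown above.
LOG_RE = re.compile(
    r'(?P<addr>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d+) (?P<bytes>\d+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

line = ('1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-up HTTP/2.0" '
        '200 9168 "https://example.com/sign-in" "Mozilla/5.0" "-"')

m = LOG_RE.match(line)
print(m.group("status"), m.group("request"))  # 200 GET /sign-up HTTP/2.0
```

Regexes like this break as soon as the format changes, which is one more argument for structured logging.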
They can be parsed, but it is much easier to adjust the Nginx configuration so that it outputs logs in JSON:
log_format json_combined escape=json '{ "created_at": "$msec", '
'"remote_addr": "$remote_addr", '
'"remote_user": "$remote_user", '
'"request": "$request", '
'"status": $status, '
'"bytes_sent": $bytes_sent, '
'"request_length": $request_length, '
'"request_time": $request_time, '
'"http_referrer": "$http_referer", '
'"http_x_forwarded_for": "$http_x_forwarded_for", '
'"http_user_agent": "$http_user_agent" }';
access_log /var/log/nginx/access.log json_combined;
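With this format, every log line is a self-describing JSON object that any consumer can decode with a standard parser. A quick Python sketch (the sample line values are illustrative):

```python
import json

# A hypothetical access-log line produced by the json_combined format above.
line = ('{ "created_at": "1554803897.000", "remote_addr": "1.1.1.1", '
        '"remote_user": "", "request": "GET /sign-up HTTP/2.0", '
        '"status": 200, "bytes_sent": 9168, "request_length": 521, '
        '"request_time": 0.03, "http_referrer": "https://example.com/sign-in", '
        '"http_x_forwarded_for": "", "http_user_agent": "Mozilla/5.0" }')

record = json.loads(line)
print(record["status"], record["request"])  # 200 GET /sign-up HTTP/2.0
```

Note that `$status`, `$bytes_sent`, and the other numeric variables are emitted unquoted, so they arrive as JSON numbers rather than strings.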
S3 for storage
To store the logs, we will use S3. This lets us store and analyze the logs in one place, since Athena can work with data in S3 directly. Later in this article I will explain how to ship and process the logs properly, but first we need a clean S3 bucket in which nothing else will be stored. It is worth thinking ahead about the region in which you will create your bucket, because Athena is not available in every region.
Creating a table in the Athena console
Let's create an Athena table for the logs. It is needed for both writing and reading if you plan to use Kinesis Firehose. Open the Athena console and create the table:
Table SQL
CREATE EXTERNAL TABLE `kinesis_logs_nginx`(
`created_at` double,
`remote_addr` string,
`remote_user` string,
`request` string,
`status` int,
`bytes_sent` int,
`request_length` int,
`request_time` double,
`http_referrer` string,
`http_x_forwarded_for` string,
`http_user_agent` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
's3://<YOUR-S3-BUCKET>'
TBLPROPERTIES ('has_encrypted_data'='false');
Creating a Kinesis Firehose stream
Kinesis Firehose will write the data received from Nginx to S3 in the chosen format, splitting it into directories in the YYYY/MM/DD/HH format. This will come in handy later when reading the data. You could, of course, write directly to S3 from Fluentd, but in that case you would have to write JSON, which is inefficient because of the large file size. Besides, when using PrestoDB or Athena, JSON is the slowest data format. So open the Kinesis Firehose console, click "Create delivery stream", and select "Direct PUT" in the "source" field.
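To make the YYYY/MM/DD/HH layout concrete, here is a sketch of how an object prefix is derived from a record's arrival time (UTC assumed; the helper name is ours, not a Firehose API):

```python
from datetime import datetime, timezone

def s3_prefix(ts: float) -> str:
    """Build the YYYY/MM/DD/HH directory prefix that Firehose uses in S3."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return dt.strftime("%Y/%m/%d/%H")

# A request logged at 09:58:17 UTC on April 9, 2019 lands under:
print(s3_prefix(1554803897.0))  # 2019/04/09/09
```

Because the prefix encodes the hour, a reader that knows the time range of interest only needs to touch a handful of directories.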
In the next step, set "Record format conversion" to "Enabled" and pick "Apache ORC" as the output format. According to some published benchmarks, it is the most efficient format for PrestoDB and Athena.
For storage we select S3 and the bucket we created earlier. AWS Glue Crawler, which I will talk about a little later, cannot work with prefixes in an S3 bucket, so it is important to leave the prefix empty.
The remaining options can be changed depending on your load; I usually use the defaults. Note that S3 compression is not available, but ORC uses native compression by default.
Fluentd
Now that we have configured storing and receiving the logs, we need to configure shipping them. We will use Fluentd, because I know it well.
First, we need the fluent.conf configuration file. Create it and add a source:
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
Now you can start the Fluentd server. If you need a more advanced configuration, refer to the Fluentd documentation.
$ docker run \
    -d \
    -p 24224:24224 \
    -p 24224:24224/udp \
    -v /data:/fluentd/log \
    -v <PATH-TO-FLUENT-CONF>:/fluentd/etc \
    fluent/fluentd:stable \
    fluentd -c /fluentd/etc/fluent.conf
This configuration uses the /fluentd/log path to buffer the logs before sending them. You can do without it, but then on restart you may lose everything that was buffered. You can also use any port; 24224 is Fluentd's default.
Now that Fluentd is running, we can send the Nginx logs there. We usually run Nginx in a Docker container, and Docker has a native logging driver for Fluentd:
$ docker run \
    --log-driver=fluentd \
    --log-opt fluentd-address=<FLUENTD-SERVER-ADDRESS> \
    --log-opt tag="{{.Name}}" \
    -v /some/content:/usr/share/nginx/html:ro \
    -d \
    nginx
If you run Nginx differently, you can use log files; Fluentd has a file tail plugin (in_tail) for that.
Let's add parsing of the logs configured above to the Fluentd configuration:
<filter YOUR-NGINX-TAG.*>
@type parser
key_name log
emit_invalid_record_to_error false
<parse>
@type json
</parse>
</filter>
And send the logs to Kinesis using the kinesis_firehose plugin:
<match YOUR-NGINX-TAG.*>
@type kinesis_firehose
region <YOUR-AWS-REGION>
delivery_stream_name <YOUR-KINESIS-STREAM-NAME>
aws_key_id <YOUR-AWS-KEY-ID>
aws_sec_key <YOUR-AWS-SEC-KEY>
</match>
Athena
If you set everything up correctly, then after a while (by default, Kinesis writes the received data once every 10 minutes) you should see log files in S3. In the "Monitoring" tab of Kinesis Firehose you can see how much data is being written to S3, as well as any errors. Do not forget to give the Kinesis role write access to the S3 bucket. If Kinesis cannot parse something, it will add the errors to the same bucket.
Now you can see the data in Athena. Let's find the latest requests for which we returned errors:
SELECT * FROM "db_name"."table_name" WHERE status > 499 ORDER BY created_at DESC limit 10;
Scanning all records for every query
Now our logs are processed and stored in S3 in ORC, compressed and ready for analysis. Kinesis Firehose has even organized them into directories by hour. However, as long as the table is not partitioned, Athena will load all-time data on every query, with rare exceptions. This is a big problem for two reasons:
- The volume of data keeps growing, slowing down queries;
- Athena is billed based on the volume of data scanned, with a minimum of 10 MB per query.
To fix this, we use AWS Glue Crawler, which will crawl the data in S3 and write the partition information to the Glue Metastore. This lets us use partitions as a filter when querying with Athena, so it scans only the directories specified in the query.
Setting up Amazon Glue Crawler
Amazon Glue Crawler scans all the data in the S3 bucket and creates tables with partitions. Create a Glue Crawler from the AWS console and add the bucket where you store the data. You can use one crawler for several buckets; in that case it will create tables in the specified database with names matching the bucket names. If you plan to use this data regularly, be sure to configure the Crawler's launch schedule to suit your needs. We use one Crawler for all tables, and it runs every hour.
Partitioned tables
After the first run of the crawler, tables for each scanned bucket should appear in the database specified in the settings. Open the Athena console and find the table with the Nginx logs. Let's try to read something:
SELECT * FROM "default"."part_demo_kinesis_bucket"
WHERE(
partition_0 = '2019' AND
partition_1 = '04' AND
partition_2 = '08' AND
partition_3 = '06'
);
This query will select all records received between 6 AM and 7 AM on April 8, 2019. But how much more efficient is this than just reading from an unpartitioned table? Let's find out by selecting the same records, filtering them by timestamp:
3.59 seconds and 244.34 megabytes of data on a dataset with only one week of logs. Let's try filtering by partition instead:
A bit faster but, most importantly, only 1.23 megabytes of data! It would be much cheaper if not for the minimum of 10 megabytes per query in the pricing. But it is still much better, and on large datasets the difference will be far more impressive.
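A back-of-the-envelope sketch of what the 10 MB billing minimum means for the two queries above (the per-TB price is an assumption here; check current AWS pricing for your region):

```python
PRICE_PER_TB = 5.0   # USD per TB scanned -- assumed, verify against AWS pricing
MIN_SCAN_MB = 10.0   # minimum billed volume per query

def query_cost(scanned_mb: float) -> float:
    """Cost of a single Athena query, honoring the 10 MB billing minimum."""
    billed_mb = max(scanned_mb, MIN_SCAN_MB)
    return billed_mb / (1024 * 1024) * PRICE_PER_TB

full_scan = query_cost(244.34)  # timestamp filter: scans the whole dataset
pruned = query_cost(1.23)       # partition filter: billed as the 10 MB minimum
print(f"partition pruning is {full_scan / pruned:.1f}x cheaper")
```

Even capped at the minimum, the pruned query is roughly 24x cheaper, and the gap widens as the dataset grows while the pruned scan stays small.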
Building a dashboard with Cube.js
To assemble the dashboard, we use the Cube.js analytical framework. It has quite a few features, but we are interested in two: the ability to automatically use partition filters, and data pre-aggregation. It uses a data schema, written in JavaScript, to generate SQL and execute the database queries.
Let's create a new Cube.js application. Since we are already using the AWS stack, it makes sense to use Lambda for deployment. You can use the express template for generation if you plan to host the Cube.js backend on Heroku or in Docker. The documentation describes other hosting methods as well.
$ npm install -g cubejs-cli
$ cubejs create nginx-log-analytics -t serverless -d athena
Environment variables are used to configure database access in cube.js. The generator will create a .env file in which you can specify your keys for Athena.
Now we need a data schema. In the schema folder, create a Logs.js file. Here is an example data model for the nginx logs:
Schema code
const partitionFilter = (from, to) => `
date(from_iso8601_timestamp(${from})) <= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d') AND
date(from_iso8601_timestamp(${to})) >= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')
`
cube(`Logs`, {
sql: `
select * from part_demo_kinesis_bucket
WHERE ${FILTER_PARAMS.Logs.createdAt.filter(partitionFilter)}
`,
measures: {
count: {
type: `count`,
},
errorCount: {
type: `count`,
filters: [
{ sql: `${CUBE.isError} = 'Yes'` }
]
},
errorRate: {
type: `number`,
sql: `100.0 * ${errorCount} / ${count}`,
format: `percent`
}
},
dimensions: {
status: {
sql: `status`,
type: `number`
},
isError: {
type: `string`,
case: {
when: [{
sql: `${CUBE}.status >= 400`, label: `Yes`
}],
else: { label: `No` }
}
},
createdAt: {
sql: `from_unixtime(created_at)`,
type: `time`
}
}
});
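The partitionFilter above compares dates by concatenating the Glue partition columns (partition_0 || partition_1 || partition_2, e.g. '2019' || '04' || '08') and parsing the result with %Y%m%d. A quick Python sketch of the equivalent logic, for a sanity check (function names are ours):

```python
from datetime import date, datetime

def partition_date(p0: str, p1: str, p2: str) -> date:
    """Mirror date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')."""
    return datetime.strptime(p0 + p1 + p2, "%Y%m%d").date()

def in_range(parts: tuple, start: date, end: date) -> bool:
    """True when a partition's day falls inside [start, end], like the SQL filter."""
    return start <= partition_date(*parts) <= end

print(partition_date("2019", "04", "08"))                                   # 2019-04-08
print(in_range(("2019", "04", "08"), date(2019, 4, 1), date(2019, 4, 30)))  # True
```

Athena can evaluate this predicate using only the partition columns, so directories outside the requested date range are never read.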
Here we use the FILTER_PARAMS variable to generate a SQL query with a partition filter.
We also define the measures and dimensions that we want to display on the dashboard, and specify pre-aggregations. Cube.js will create an additional table with pre-aggregated data and will automatically update it as new data arrives. This not only speeds up queries but also reduces the cost of using Athena.
Let's add this information to the data schema file:
preAggregations: {
main: {
type: `rollup`,
measureReferences: [count, errorCount],
dimensionReferences: [isError, status],
timeDimensionReference: createdAt,
granularity: `day`,
partitionGranularity: `month`,
refreshKey: {
sql: FILTER_PARAMS.Logs.createdAt.filter((from, to) =>
`select
CASE WHEN from_iso8601_timestamp(${to}) + interval '3' day > now()
THEN date_trunc('hour', now()) END`
)
}
}
}
In this schema we specify that the data should be pre-aggregated for all the metrics used, with monthly partitioning.
Now we can assemble the dashboard!
The Cube.js backend provides a REST API and client libraries for popular front-end frameworks. We will use the React version of the client to build the dashboard. The Cube.js service accepts queries in the following format:
{
"measures": ["Logs.errorCount"],
"timeDimensions": [
{
"dimension": "Logs.createdAt",
"dateRange": ["2019-01-01", "2019-01-07"],
"granularity": "day"
}
]
}
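Before wiring up React, you can sanity-check the backend with any HTTP client. A hedged Python sketch (the URL and token are placeholders for your own deployment; the query is passed JSON-encoded to the REST API's /load endpoint):

```python
import json
import urllib.parse
import urllib.request

# The same query as above, prepared for the Cube.js REST API.
query = {
    "measures": ["Logs.errorCount"],
    "timeDimensions": [{
        "dimension": "Logs.createdAt",
        "dateRange": ["2019-01-01", "2019-01-07"],
        "granularity": "day",
    }],
}

# Placeholder host and token -- substitute your own values.
url = ("http://localhost:4000/cubejs-api/v1/load?query=" +
       urllib.parse.quote(json.dumps(query)))
req = urllib.request.Request(url, headers={"Authorization": "YOUR-CUBEJS-API-TOKEN"})
# response = urllib.request.urlopen(req)  # uncomment with a running backend
```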
Let's install the Cube.js client and the React component library via NPM:
$ npm i --save @cubejs-client/core @cubejs-client/react
We import the cubejs client and the QueryRenderer component to fetch the data and assemble the dashboard:
Dashboard code
import React from 'react';
import { LineChart, Line, XAxis, YAxis } from 'recharts';
import cubejs from '@cubejs-client/core';
import { QueryRenderer } from '@cubejs-client/react';
const cubejsApi = cubejs(
'YOUR-CUBEJS-API-TOKEN',
{ apiUrl: 'http://localhost:4000/cubejs-api/v1' },
);
export default () => {
return (
<QueryRenderer
query={{
measures: ['Logs.errorCount'],
timeDimensions: [{
dimension: 'Logs.createdAt',
dateRange: ['2019-01-01', '2019-01-07'],
granularity: 'day'
}]
}}
cubejsApi={cubejsApi}
render={({ resultSet }) => {
if (!resultSet) {
return 'Loading...';
}
return (
<LineChart data={resultSet.rawData()}>
<XAxis dataKey="Logs.createdAt"/>
<YAxis/>
<Line type="monotone" dataKey="Logs.errorCount" stroke="#8884d8"/>
</LineChart>
);
}}
/>
)
}
The dashboard source code is available online.
Source: www.habr.com