Nginx log analytics with Amazon Athena and Cube.js

Nginx is usually monitored and analyzed with commercial products or ready-made open-source alternatives such as Prometheus + Grafana. That is a good option for monitoring or real-time analytics, but it is not very convenient for historical analysis. On any popular resource the volume of data in nginx logs grows quickly, and for analyzing a large volume of data it makes sense to use something more specialized.

In this article I will explain how to use Athena to analyze logs, taking Nginx as the example, and show how to assemble an analytical dashboard from that data with the open-source Cube.js framework. Here is the full architecture of the solution:

[Figure: architecture of the solution]

TL;DR: link to the finished dashboard.

To collect the data we use Fluentd; for processing, AWS Kinesis Data Firehose and AWS Glue; for storage, AWS S3. With this bundle you can store not only nginx logs but also other events and the logs of other services. You can replace some parts with equivalents from your own stack: for example, you can write logs to Kinesis directly from nginx, bypassing Fluentd, or use Logstash instead.

Collecting Nginx logs

By default, Nginx logs look something like this:

1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-up HTTP/2.0" 200 9168 "https://example.com/sign-in" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-in HTTP/2.0" 200 9168 "https://example.com/sign-up" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"

They can be parsed, but it is much easier to change the Nginx configuration so that it emits logs as JSON:

log_format json_combined escape=json '{ "created_at": "$msec", '
            '"remote_addr": "$remote_addr", '
            '"remote_user": "$remote_user", '
            '"request": "$request", '
            '"status": $status, '
            '"bytes_sent": $bytes_sent, '
            '"request_length": $request_length, '
            '"request_time": $request_time, '
            '"http_referrer": "$http_referer", '
            '"http_x_forwarded_for": "$http_x_forwarded_for", '
            '"http_user_agent": "$http_user_agent" }';

access_log  /var/log/nginx/access.log  json_combined;
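To show what a consumer of these JSON lines has to deal with, here is a minimal JavaScript sketch that parses one line in the json_combined format. The sample line and the parseNginxLogLine helper are hypothetical, made up for illustration. The one point worth noticing is that $msec is logged as a string holding epoch seconds with millisecond precision, so it has to be converted before it can be used as a timestamp.

```javascript
// Parse one line produced by the json_combined log_format.
// parseNginxLogLine is a hypothetical helper, not part of any library.
function parseNginxLogLine(line) {
  const record = JSON.parse(line);
  return {
    ...record,
    // $msec is epoch seconds with millisecond precision, logged as a string.
    createdAt: new Date(Math.round(parseFloat(record.created_at) * 1000)),
    // Same threshold the data model later in the article uses for errors.
    isError: record.status >= 400,
  };
}

// Hypothetical sample line, invented for this example.
const sampleLine =
  '{ "created_at": "1554803897.123", "remote_addr": "1.1.1.1", ' +
  '"remote_user": "", "request": "GET /sign-up HTTP/2.0", "status": 200, ' +
  '"bytes_sent": 9168, "request_length": 120, "request_time": 0.004, ' +
  '"http_referrer": "https://example.com/sign-in", ' +
  '"http_x_forwarded_for": "", "http_user_agent": "Mozilla/5.0" }';

const parsed = parseNginxLogLine(sampleLine);
console.log(parsed.createdAt.toISOString(), parsed.status, parsed.isError);
```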

S3 for storage

We will use S3 to store the logs. This lets us store and analyze logs in one place, because Athena can work with data in S3 directly. Later in the article I will explain how to lay out and process the logs correctly, but first we need a clean S3 bucket in which nothing else will be stored. It is worth deciding in advance which region to create the bucket in, because Athena is not available in every region.

Creating a schema in the Athena console

Let's create a table in Athena for the logs. If you plan to use Kinesis Firehose, the table is needed for both writing and reading. Open the Athena console and create the table:

Table creation SQL

CREATE EXTERNAL TABLE `kinesis_logs_nginx`(
  `created_at` double, 
  `remote_addr` string, 
  `remote_user` string, 
  `request` string, 
  `status` int, 
  `bytes_sent` int, 
  `request_length` int, 
  `request_time` double, 
  `http_referrer` string, 
  `http_x_forwarded_for` string, 
  `http_user_agent` string)
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  's3://<YOUR-S3-BUCKET>'
TBLPROPERTIES ('has_encrypted_data'='false');

Creating a Kinesis Firehose stream

Kinesis Firehose will write the data received from Nginx to S3 in the selected format, split into directories in YYYY/MM/DD/HH format. This will come in handy when reading the data. You can of course write to S3 directly from Fluentd, but then you would have to write JSON, which is inefficient because of the large file size. In addition, with PrestoDB or Athena, JSON is the slowest data format. So, open the Kinesis Firehose console, click "Create delivery stream", and select "Direct PUT" in the "delivery" field.
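To make the YYYY/MM/DD/HH layout concrete, here is a small JavaScript sketch that computes the directory an object would land in for a given moment. The firehosePrefix helper is hypothetical, and it assumes the directories are derived from UTC time, which is worth verifying against your own bucket.

```javascript
// Build the hourly S3 key prefix in the YYYY/MM/DD/HH layout.
// Assumption: the directory names are based on UTC time.
function firehosePrefix(date) {
  const pad = (n) => String(n).padStart(2, '0');
  return [
    date.getUTCFullYear(),
    pad(date.getUTCMonth() + 1), // getUTCMonth() is zero-based
    pad(date.getUTCDate()),
    pad(date.getUTCHours()),
  ].join('/') + '/';
}

const prefix = firehosePrefix(new Date('2019-04-08T06:15:00Z'));
console.log(prefix); // 2019/04/08/06/
```

Knowing the prefix for a given hour makes it easy to inspect a single directory in the bucket when checking that data arrives.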

On the next tab, set "Record format conversion" to "Enabled" and choose "Apache ORC" as the output format. According to research by Owen O'Malley, this is the optimal format for PrestoDB and Athena. As the schema, specify the table we created above. Note that you can specify any S3 location in Kinesis; only the schema is taken from the table. But if you specify a different S3 location, you will not be able to read those records from this table.

Select S3 as the storage and the bucket we created earlier. AWS Glue Crawler, which I will talk about a little later, cannot work with prefixes in an S3 bucket, so it is important to leave the prefix empty.

The remaining options can be changed depending on your load; I usually use the defaults. Note that S3 compression is not available, but ORC uses native compression by default.

Fluentd

Now that we have configured storing and receiving logs, we need to configure sending them. We will use Fluentd because I love Ruby, but you can use Logstash or send logs to Kinesis directly. A Fluentd server can be launched in several ways; I will describe Docker because it is simple and convenient.

First, we need the fluent.conf configuration file. Create it and add a source:

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

Now you can start the Fluentd server. If you need a more advanced configuration, Docker Hub has a detailed guide, including how to build your own image.

$ docker run \
  -d \
  -p 24224:24224 \
  -p 24224:24224/udp \
  -v /data:/fluentd/log \
  -v <PATH-TO-FLUENT-CONF>:/fluentd/etc \
  fluent/fluentd:stable \
  fluentd -c /fluentd/etc/fluent.conf

This configuration uses the path /fluentd/log to buffer logs before sending them. You can do without it, but then a restart can lose everything that was buffered. You can also use any port; 24224 is the default Fluentd port.

Now that Fluentd is running, we can send Nginx logs to it. We usually run Nginx in a Docker container, in which case Docker has a native logging driver for Fluentd:

$ docker run \
  --log-driver=fluentd \
  --log-opt fluentd-address=<FLUENTD-SERVER-ADDRESS> \
  --log-opt tag="{{.Name}}" \
  -v /some/content:/usr/share/nginx/html:ro \
  -d \
  nginx

If you run Nginx differently, you can use log files; Fluentd has a file tail plugin.

Let's add the parsing of the logs configured above to the Fluentd configuration:

<filter YOUR-NGINX-TAG.*>
  @type parser
  key_name log
  emit_invalid_record_to_error false
  <parse>
    @type json
  </parse>
</filter>

And send the logs to Kinesis using the kinesis firehose plugin:

<match YOUR-NGINX-TAG.*>
    @type kinesis_firehose
    region region
    delivery_stream_name <YOUR-KINESIS-STREAM-NAME>
    aws_key_id <YOUR-AWS-KEY-ID>
    aws_sec_key <YOUR_AWS-SEC_KEY>
</match>

Athena

If you have configured everything correctly, then after a while (by default, Kinesis writes the received data once every 10 minutes) you should see log files in S3. In the "Monitoring" menu of Kinesis Firehose you can see how much data has been written to S3, as well as errors. Do not forget to grant the Kinesis role write access to the S3 bucket. If Kinesis could not parse something, it puts the errors in the same bucket.

Now you can look at the data in Athena. Let's find the most recent requests for which we returned errors:

SELECT * FROM "db_name"."table_name" WHERE status > 499 ORDER BY created_at DESC limit 10;

Scanning all records on every query

Our logs are now processed and stored in S3 as ORC, compressed and ready for analysis. Kinesis Firehose has even arranged them into a directory for each hour. However, as long as the table is not partitioned, Athena will load the data for the entire time range on every query, with rare exceptions. This is a big problem for two reasons:

  • The volume of data keeps growing, slowing down queries;
  • Athena is billed by the volume of data scanned, with a minimum of 10 MB per query.

To fix this, we use AWS Glue Crawler, which scans the data in S3 and writes the partition information to the Glue Metastore. This lets us use partitions as a filter when querying Athena, so that it scans only the directories specified in the query.
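As a rough mental model of what partition pruning buys us, the following JavaScript sketch filters a list of object keys by their hourly directory. The keys and the prune helper are hypothetical; Athena does this internally from the partition metadata, so this is only an illustration of the idea, not of its implementation.

```javascript
// Hypothetical object keys, laid out one directory per hour.
const keys = [
  '2019/04/08/05/part-a.orc',
  '2019/04/08/06/part-b.orc',
  '2019/04/08/06/part-c.orc',
  '2019/04/09/06/part-d.orc',
];

// Keep only the objects whose directory matches every given partition value.
// A missing value means "no restriction on this level".
function prune(allKeys, partition) {
  const wanted = [partition.year, partition.month, partition.day, partition.hour];
  return allKeys.filter((key) => {
    const parts = key.split('/');
    return wanted.every((value, i) => value === undefined || parts[i] === value);
  });
}

const scanned = prune(keys, { year: '2019', month: '04', day: '08', hour: '06' });
console.log(scanned);
// Without a partition filter every object has to be read.
console.log(prune(keys, {}).length);
```

With the filter, only two of the four objects are touched; without it, all four are read no matter what the WHERE clause says about timestamps.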

Setting up Amazon Glue Crawler

Amazon Glue Crawler scans all the data in the S3 bucket and creates tables with partitions. Create a Glue Crawler from the AWS Glue console and add the bucket in which you store the data. You can use one crawler for several buckets, in which case it creates tables in the specified database with names that match the bucket names. If you plan to use this data regularly, be sure to configure the Crawler launch schedule to suit your needs. We use a single Crawler for all tables, which runs every hour.

Partitioned tables

After the first crawler run, a table for each scanned bucket should appear in the database specified in the settings. Open the Athena console and find the table with the Nginx logs. Let's try to read something:

SELECT * FROM "default"."part_demo_kinesis_bucket"
WHERE(
  partition_0 = '2019' AND
  partition_1 = '04' AND
  partition_2 = '08' AND
  partition_3 = '06'
  );

This query selects all records received between 6 and 7 a.m. on April 8, 2019. But how much more efficient is it than simply reading from a non-partitioned table? Let's find out by selecting the same records, filtering them by timestamp:

[Screenshot: Athena results for the query filtered by timestamp]

3.59 seconds and 244.34 megabytes of data scanned, on a dataset that holds only a week of logs. Now let's try the filter by partitions:

[Screenshot: Athena results for the query filtered by partitions]

A little faster, but most importantly, only 1.23 megabytes of data! It would be much cheaper if not for the 10-megabyte per-query minimum in the pricing. It is still far better, and on large datasets the difference will be much more impressive.
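The effect of the per-query minimum can be worked through with the 244.34 MB and 1.23 MB figures above. The sketch below is only an illustration: the price per terabyte and the decimal unit conversion are assumptions of mine, so check the current Athena pricing for your region before relying on the absolute numbers.

```javascript
const MIN_BILLED_MB = 10;      // per-query minimum mentioned in the article
const PRICE_PER_TB_USD = 5;    // assumption: verify against current pricing
const MB_PER_TB = 1000 * 1000; // assumption: decimal units

// A query is billed for at least the minimum, however little it scans.
function billedMegabytes(scannedMb) {
  return Math.max(scannedMb, MIN_BILLED_MB);
}

function queryCostUsd(scannedMb) {
  return (billedMegabytes(scannedMb) / MB_PER_TB) * PRICE_PER_TB_USD;
}

const fullScanCost = queryCostUsd(244.34);
const partitionedCost = queryCostUsd(1.23);

console.log(billedMegabytes(1.23));                    // billed as 10 MB
console.log((fullScanCost / partitionedCost).toFixed(1)); // cost ratio
```

Under these assumptions the partitioned query is about 24 times cheaper rather than the roughly 199 times that the raw scanned volumes would suggest, because 1.23 MB is billed as 10 MB.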

Building a dashboard with Cube.js

To assemble the dashboard, we use the Cube.js analytical framework. It has quite a lot of features, but we are interested in two: the ability to use partition filters automatically, and data pre-aggregation. It uses a data schema, written in JavaScript, to generate SQL and execute a database query. All we have to do is specify how to use the partition filter in the data schema.

Let's create a new Cube.js application. Since we are already using the AWS stack, it makes sense to use Lambda for deployment. You can use the express template for generation if you plan to host the Cube.js backend on Heroku or Docker. The documentation describes other hosting methods.

$ npm install -g cubejs-cli
$ cubejs create nginx-log-analytics -t serverless -d athena

Environment variables are used to configure database access in Cube.js. The generator creates a .env file in which you can specify your keys for Athena.

Now we need a data schema, in which we specify exactly how our logs are stored. It is also where you specify how to calculate the metrics for the dashboards.

In the schema directory, create a file Logs.js. Here is an example data model for nginx:

Model code

const partitionFilter = (from, to) => `
    date(from_iso8601_timestamp(${from})) <= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d') AND
    date(from_iso8601_timestamp(${to})) >= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')
    `

cube(`Logs`, {
  sql: `
  select * from part_demo_kinesis_bucket
  WHERE ${FILTER_PARAMS.Logs.createdAt.filter(partitionFilter)}
  `,

  measures: {
    count: {
      type: `count`,
    },

    errorCount: {
      type: `count`,
      filters: [
        { sql: `${CUBE.isError} = 'Yes'` }
      ]
    },

    errorRate: {
      type: `number`,
      sql: `100.0 * ${errorCount} / ${count}`,
      format: `percent`
    }
  },

  dimensions: {
    status: {
      sql: `status`,
      type: `number`
    },

    isError: {
      type: `string`,
      case: {
        when: [{
          sql: `${CUBE}.status >= 400`, label: `Yes`
        }],
        else: { label: `No` }
      }
    },

    createdAt: {
      sql: `from_unixtime(created_at)`,
      type: `time`
    }
  }
});

Here we use the FILTER_PARAMS variable to generate an SQL query with a partition filter.
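To see what the partition filter expands to, the partitionFilter function from the model can be called by hand. In the sketch below the two quoted timestamps are hypothetical stand-ins that I chose for illustration; in a real run Cube.js supplies its own expressions for the boundaries of the requested date range.

```javascript
// Copied from the model above.
const partitionFilter = (from, to) => `
    date(from_iso8601_timestamp(${from})) <= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d') AND
    date(from_iso8601_timestamp(${to})) >= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')
    `;

// Hypothetical boundary values; Cube.js passes its own expressions here.
const filterSql = partitionFilter(
  "'2019-04-08T00:00:00Z'",
  "'2019-04-14T23:59:59Z'"
);

console.log(filterSql.trim());
```

The generated condition compares the date built from the year, month and day partition columns with the requested range, so only the matching daily directories are scanned.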

We also define the metrics and dimensions we want to show on the dashboard, and specify pre-aggregations. Cube.js creates additional tables with pre-aggregated data and automatically refreshes the data as it arrives. This not only speeds up queries but also reduces the cost of using Athena.

Let's add this information to the data schema file:

preAggregations: {
  main: {
    type: `rollup`,
    measureReferences: [count, errorCount],
    dimensionReferences: [isError, status],
    timeDimensionReference: createdAt,
    granularity: `day`,
    partitionGranularity: `month`,
    refreshKey: {
      sql: FILTER_PARAMS.Logs.createdAt.filter((from, to) => 
        `select
           CASE WHEN from_iso8601_timestamp(${to}) + interval '3' day > now()
           THEN date_trunc('hour', now()) END`
      )
    }
  }
}

In this model we specify that data must be pre-aggregated for all the metrics in use, and that the pre-aggregation is partitioned by month. Partitioning pre-aggregations can significantly speed up collecting and refreshing the data.
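The refreshKey expression is the least obvious part of this block, so here is how I read it, re-expressed as a JavaScript sketch. The refreshKey function below is a hypothetical re-implementation of the CASE expression for illustration only: a partition whose end lies less than three days in the past gets a key that changes every hour, while older partitions get a constant null.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Mirrors the CASE expression: the hour-truncated current time for recent
// partitions, null for partitions that ended more than three days ago.
function refreshKey(partitionEnd, now) {
  if (partitionEnd.getTime() + 3 * DAY_MS > now.getTime()) {
    const truncated = new Date(now.getTime());
    truncated.setUTCMinutes(0, 0, 0); // same idea as date_trunc('hour', now())
    return truncated.toISOString();
  }
  return null;
}

// Hypothetical "current time" and partition boundaries.
const now = new Date('2019-04-10T12:34:56Z');
const currentMonthKey = refreshKey(new Date('2019-04-30T23:59:59Z'), now);
const previousMonthKey = refreshKey(new Date('2019-03-31T23:59:59Z'), now);

console.log(currentMonthKey);  // changes once per hour
console.log(previousMonthKey); // stays null, so nothing triggers a rebuild
```

If this reading is right, only the current monthly partition is rebuilt as new logs arrive, and closed months are left alone, which is what keeps the Athena bill down.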

Now we can assemble the dashboard!

The Cube.js backend provides a REST API and a set of client libraries for popular front-end frameworks. We will use the React version of the client to build the dashboard. Cube.js provides only the data, so we need a visualization library; I like recharts, but you can use any.

The Cube.js server accepts a request in JSON format that specifies the required metrics. For example, to calculate how many errors Nginx returned per day, you need to send the following request:

{
  "measures": ["Logs.errorCount"],
  "timeDimensions": [
    {
      "dimension": "Logs.createdAt",
      "dateRange": ["2019-01-01", "2019-01-07"],
      "granularity": "day"
    }
  ]
}
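For a quick check without any client library, the same query can be sent over the REST API. The sketch below only builds the request URL; it assumes the load endpoint accepts the JSON query in a query parameter, which matches my understanding of the Cube.js REST API but should be checked against the documentation for your version. The buildLoadUrl helper is hypothetical.

```javascript
// Hypothetical helper: encode a Cube.js query into a request URL.
// Assumption: GET {apiUrl}/load?query=<url-encoded JSON>.
function buildLoadUrl(apiUrl, query) {
  return `${apiUrl}/load?query=${encodeURIComponent(JSON.stringify(query))}`;
}

const errorsPerDay = {
  measures: ['Logs.errorCount'],
  timeDimensions: [
    {
      dimension: 'Logs.createdAt',
      dateRange: ['2019-01-01', '2019-01-07'],
      granularity: 'day',
    },
  ],
};

const loadUrl = buildLoadUrl('http://localhost:4000/cubejs-api/v1', errorsPerDay);
console.log(loadUrl);

// The request itself would then look roughly like this (not executed here):
// fetch(loadUrl, { headers: { Authorization: 'YOUR-CUBEJS-API-TOKEN' } })
```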

Let's install the Cube.js client and the React component library via NPM:

$ npm i --save @cubejs-client/core @cubejs-client/react

We import the cubejs and QueryRenderer components to fetch the data, and assemble the dashboard:

Dashboard code

import React from 'react';
import { LineChart, Line, XAxis, YAxis } from 'recharts';
import cubejs from '@cubejs-client/core';
import { QueryRenderer } from '@cubejs-client/react';

const cubejsApi = cubejs(
  'YOUR-CUBEJS-API-TOKEN',
  { apiUrl: 'http://localhost:4000/cubejs-api/v1' },
);

export default () => {
  return (
    <QueryRenderer
      query={{
        measures: ['Logs.errorCount'],
        timeDimensions: [{
            dimension: 'Logs.createdAt',
            dateRange: ['2019-01-01', '2019-01-07'],
            granularity: 'day'
        }]
      }}
      cubejsApi={cubejsApi}
      render={({ resultSet }) => {
        if (!resultSet) {
          return 'Loading...';
        }

        return (
          <LineChart data={resultSet.rawData()} width={600} height={300}>
            <XAxis dataKey="Logs.createdAt"/>
            <YAxis/>
            <Line type="monotone" dataKey="Logs.errorCount" stroke="#8884d8"/>
          </LineChart>
        );
      }}
    />
  )
}

The dashboard source code is available on CodeSandbox.

Source: www.habr.com
