Nginx log analytics with Amazon Athena and Cube.js

Commercial products or ready-made open-source alternatives, such as Prometheus + Grafana, are usually used to monitor and analyze how Nginx is running. This is a good option for monitoring or real-time analytics, but not very convenient for historical analysis. On any popular resource the volume of data from nginx logs grows quickly, and to analyze a large amount of data it makes sense to use something more specialized.

In this article I will show how you can use Athena to analyze logs, taking Nginx as an example, and how to assemble an analytical dashboard from this data using the open-source cube.js framework. Here is the complete solution architecture:

[Figure: solution architecture diagram]

TL;DR:
Link to the finished dashboard.

To collect the data we use Fluentd; for processing, AWS Kinesis Data Firehose and AWS Glue; for storage, AWS S3. With this bundle you can store not only nginx logs but also other events, as well as logs from other services. You can replace some parts with similar ones for your stack: for example, you can write logs to kinesis directly from nginx, bypassing fluentd, or use logstash for this.

Collecting Nginx logs

By default, Nginx logs look something like this:

1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-up HTTP/2.0" 200 9168 "https://example.com/sign-in" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"
1.1.1.1 - - [09/Apr/2019:09:58:17 +0000] "GET /sign-in HTTP/2.0" 200 9168 "https://example.com/sign-up" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" "-"

They can be parsed, but it is much easier to change the Nginx configuration so that it writes logs in JSON:

log_format json_combined escape=json '{ "created_at": "$msec", '
            '"remote_addr": "$remote_addr", '
            '"remote_user": "$remote_user", '
            '"request": "$request", '
            '"status": $status, '
            '"bytes_sent": $bytes_sent, '
            '"request_length": $request_length, '
            '"request_time": $request_time, '
            '"http_referrer": "$http_referer", '
            '"http_x_forwarded_for": "$http_x_forwarded_for", '
            '"http_user_agent": "$http_user_agent" }';

access_log  /var/log/nginx/access.log  json_combined;
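As a quick sanity check, a line in this format can be parsed with a few lines of JavaScript. This is only a sketch: the sample line and the `parseLogLine` helper are made up for illustration. Note that `$msec` is written as a quoted string of seconds with millisecond precision, so it needs an explicit numeric conversion.

```javascript
// Hypothetical sample line in the json_combined format (values are invented).
const line = '{ "created_at": "1554803897.123", "remote_addr": "1.1.1.1", ' +
  '"remote_user": "", "request": "GET /sign-up HTTP/2.0", "status": 200, ' +
  '"bytes_sent": 9168, "request_length": 52, "request_time": 0.004, ' +
  '"http_referrer": "https://example.com/sign-in", ' +
  '"http_x_forwarded_for": "", "http_user_agent": "Mozilla/5.0" }';

// Parse one log line and add a Date built from the $msec value.
function parseLogLine(text) {
  const record = JSON.parse(text);
  // created_at is seconds since the epoch, quoted as a string.
  const createdAtMs = Math.round(Number(record.created_at) * 1000);
  return { ...record, createdAt: new Date(createdAtMs) };
}

const parsed = parseLogLine(line);
console.log(parsed.status, parsed.request, parsed.createdAt.toISOString());
```

If `JSON.parse` throws on real log lines, the `log_format` definition is the first place to look.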

S3 for storage

To store logs we will use S3. This lets you store and analyze logs in one place, since Athena can work with data in S3 directly. Later in the article I will explain how to add and process logs correctly, but first we need a clean S3 bucket in which nothing else will be stored. Think ahead about which region you create the bucket in, because Athena is not available in all regions.

Creating a schema in the Athena console

Let's create a table in Athena for the logs. It is needed for both writing and reading if you plan to use Kinesis Firehose. Open the Athena console and create a table:

SQL for creating the table

CREATE EXTERNAL TABLE `kinesis_logs_nginx`(
  `created_at` double, 
  `remote_addr` string, 
  `remote_user` string, 
  `request` string, 
  `status` int, 
  `bytes_sent` int, 
  `request_length` int, 
  `request_time` double, 
  `http_referrer` string, 
  `http_x_forwarded_for` string, 
  `http_user_agent` string)
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  's3://<YOUR-S3-BUCKET>'
TBLPROPERTIES ('has_encrypted_data'='false');

Creating a Kinesis Firehose Stream

Kinesis Firehose will write the data received from Nginx to S3 in the chosen format, splitting it into directories in the YYYY/MM/DD/HH format. This will be useful when reading the data. You could of course write directly to S3 from fluentd, but then you would have to write JSON, which is inefficient because of the large file size. In addition, JSON is the slowest data format when using PrestoDB or Athena. So open the Kinesis Firehose console, click "Create delivery stream", and choose "Direct PUT" as the source:

[Screenshot: Kinesis Firehose console, choosing the source for the delivery stream]

On the next tab, set "Record format conversion" to "Enabled" and choose "Apache ORC" as the record format. According to some research by Owen O'Malley, this is the optimal format for PrestoDB and Athena. We use the table created above as the schema. Note that you can specify any S3 location in kinesis; only the schema is taken from the table. But if you specify a different S3 location, you will not be able to read these records from this table.


We choose S3 for storage and the bucket we created earlier. AWS Glue Crawler, which I will talk about a little later, cannot work with prefixes in an S3 bucket, so it is important to leave the prefix empty.


The remaining options can be changed depending on your load; I usually use the defaults. Note that S3 compression is not available, but ORC uses native compression by default.

Fluentd

Now that we have configured storing and receiving logs, we need to configure sending. We will use Fluentd because I love Ruby, but you can use Logstash or send logs to kinesis directly. The Fluentd server can be started in several ways; I will cover docker because it is simple and convenient.

First, we need the fluent.conf configuration file. Create it and add a source:

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

Now you can start the Fluentd server. If you need a more advanced configuration, Docker Hub has a detailed guide, including how to build your own image.

$ docker run \
  -d \
  -p 24224:24224 \
  -p 24224:24224/udp \
  -v /data:/fluentd/log \
  -v <PATH-TO-FLUENT-CONF>:/fluentd/etc \
  fluent/fluentd:stable \
  -c /fluentd/etc/fluent.conf

This configuration uses the /fluentd/log path to cache logs before sending. You can do without it, but then on restart you may lose everything that was in the cache. You can also use any port; 24224 is the default Fluentd port.

Now that Fluentd is running, we can send Nginx logs to it. We usually run Nginx in a Docker container, and in that case Docker has a native logging driver for Fluentd:

$ docker run \
  --log-driver=fluentd \
  --log-opt fluentd-address=<FLUENTD-SERVER-ADDRESS> \
  --log-opt tag="{{.Name}}" \
  -v /some/content:/usr/share/nginx/html:ro \
  -d \
  nginx

If you run Nginx differently, you can use log files; Fluentd has a file tail plugin for that.

Let's add parsing of the log format configured above to the Fluentd configuration:

<filter YOUR-NGINX-TAG.*>
  @type parser
  key_name log
  emit_invalid_record_to_error false
  <parse>
    @type json
  </parse>
</filter>

And sending logs to Kinesis using the kinesis firehose plugin:

<match YOUR-NGINX-TAG.*>
    @type kinesis_firehose
    region <YOUR-AWS-REGION>
    delivery_stream_name <YOUR-KINESIS-STREAM-NAME>
    aws_key_id <YOUR-AWS-KEY-ID>
    aws_sec_key <YOUR-AWS-SEC-KEY>
</match>

Athena

If you have configured everything correctly, after a while (by default, Kinesis writes received data once every 10 minutes) you should see log files in S3. In the "Monitoring" menu of Kinesis Firehose you can see how much data is written to S3, as well as any errors. Do not forget to give the Kinesis role write access to the S3 bucket. If Kinesis could not parse something, it will write the errors to the same bucket.

Now you can view the data in Athena. Let's find the latest requests for which we returned errors:

SELECT * FROM "db_name"."table_name" WHERE status > 499 ORDER BY created_at DESC limit 10;

Scanning all records for every query

Our logs are now processed and stored in S3 in ORC, compressed and ready for analysis. Kinesis Firehose has even organized them into directories for each hour. However, as long as the table is not partitioned, Athena will load data for the entire time range on every query, with rare exceptions. This is a big problem for two reasons:

  • The volume of data keeps growing, which slows down queries;
  • Athena is billed based on the volume of data scanned, with a minimum of 10 MB per query.

To fix this, we use AWS Glue Crawler, which will scan the data in S3 and write the partition information to the Glue Metastore. This lets us use partitions as a filter when querying Athena, so that it scans only the directories specified in the query.
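To make the link between the hourly directories and the partition filter concrete, here is a small sketch that turns a timestamp into the YYYY/MM/DD/HH prefix and into a matching WHERE clause. It assumes that the directories are laid out in UTC and that the partition columns are named partition_0 to partition_3; both helper functions are hypothetical.

```javascript
// Zero-pad a number to two digits.
const pad = (n) => String(n).padStart(2, '0');

// Split a timestamp into the YYYY/MM/DD/HH parts of the S3 prefix (UTC assumed).
function partitionParts(date) {
  return [
    String(date.getUTCFullYear()),
    pad(date.getUTCMonth() + 1),
    pad(date.getUTCDate()),
    pad(date.getUTCHours()),
  ];
}

// Build a WHERE clause over partition_0..partition_3 (assumed column names).
function partitionWhere(date) {
  return partitionParts(date)
    .map((value, i) => `partition_${i} = '${value}'`)
    .join(' AND ');
}

const hour = new Date(Date.UTC(2019, 3, 8, 6)); // 8 April 2019, 06:00 UTC
console.log(partitionParts(hour).join('/'));
console.log(partitionWhere(hour));
```

A query restricted this way only has to read the objects under that one hourly directory.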

Setting up Amazon Glue Crawler

Amazon Glue Crawler scans all the data in the S3 bucket and creates tables with partitions. Create a Glue Crawler from the AWS Glue console and add the bucket where you store the data. You can use one crawler for several buckets, in which case it will create tables in the specified database with names that match the bucket names. If you plan to use this data regularly, be sure to set the Crawler's launch schedule to suit your needs. We use one Crawler for all tables, and it runs every hour.

Partitioned tables

After the first run of the crawler, tables for each scanned bucket should appear in the database specified in the settings. Open the Athena console and find the table with the Nginx logs. Let's try to read something:

SELECT * FROM "default"."part_demo_kinesis_bucket"
WHERE(
  partition_0 = '2019' AND
  partition_1 = '04' AND
  partition_2 = '08' AND
  partition_3 = '06'
  );

This query selects all records received between 6 a.m. and 7 a.m. on April 8, 2019. But how much more efficient is this than simply reading from a non-partitioned table? Let's find out by selecting the same records, filtering them by timestamp:

[Screenshot: Athena query filtered by timestamp, with run time and data scanned]

3.59 seconds and 244.34 megabytes of data on a dataset with only a week of logs. Let's try filtering by partition:

[Screenshot: Athena query filtered by partition, with run time and data scanned]

A little faster, but most importantly, only 1.23 megabytes of data! It would be much cheaper still if not for the 10 megabyte minimum per query in the pricing. Even so, it is much better, and on large datasets the difference will be far more impressive.
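The effect of the per-query minimum on these two measurements can be worked through in a few lines. This is a rough sketch rather than a billing calculator: rounding up to whole megabytes is an assumption, and the `pricePerTb` default is a placeholder to be replaced with the current Athena price for your region.

```javascript
const MIN_BILLED_MB = 10; // per-query minimum, as stated in the text

// Apply the per-query minimum; rounding up to whole megabytes is an assumption.
function billedMegabytes(scannedMb) {
  return Math.max(MIN_BILLED_MB, Math.ceil(scannedMb));
}

// pricePerTb is a placeholder value, not an official figure.
function queryCost(scannedMb, pricePerTb = 5) {
  const mbPerTb = 1024 * 1024;
  return (billedMegabytes(scannedMb) / mbPerTb) * pricePerTb;
}

const full = billedMegabytes(244.34);       // filter by timestamp
const partitioned = billedMegabytes(1.23);  // filter by partition
console.log(full, partitioned, full / partitioned);
console.log(queryCost(244.34), queryCost(1.23));
```

The scanned volume drops by a factor of roughly 200, but because of the minimum the billed volume drops only from 245 MB to 10 MB, which is the caveat made above.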

Building a dashboard with Cube.js

To assemble the dashboard, we use the Cube.js analytical framework. It has quite a lot of features, but we are interested in two: the ability to use partition filters automatically, and data pre-aggregation. It uses a data schema, written in Javascript, to generate SQL and execute a database query. All we need to do is specify in the data schema how to use the partition filter.

Let's create a new Cube.js application. Since we are already using the AWS stack, it makes sense to use Lambda for deployment. You can use the express template for generation if you plan to host the Cube.js backend in Heroku or Docker. The documentation describes other hosting methods.

$ npm install -g cubejs-cli
$ cubejs create nginx-log-analytics -t serverless -d athena

Environment variables are used to configure database access in cube.js. The generator will create a .env file in which you can specify your keys for Athena.

Now we need a data schema, in which we specify exactly how our logs are stored. There you can also specify how to calculate the metrics for dashboards.

In the schema directory, create a file named Logs.js. Here is a sample data model for nginx:

Model code

const partitionFilter = (from, to) => `
    date(from_iso8601_timestamp(${from})) <= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d') AND
    date(from_iso8601_timestamp(${to})) >= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')
    `

cube(`Logs`, {
  sql: `
  select * from part_demo_kinesis_bucket
  WHERE ${FILTER_PARAMS.Logs.createdAt.filter(partitionFilter)}
  `,

  measures: {
    count: {
      type: `count`,
    },

    errorCount: {
      type: `count`,
      filters: [
        { sql: `${CUBE.isError} = 'Yes'` }
      ]
    },

    errorRate: {
      type: `number`,
      sql: `100.0 * ${errorCount} / ${count}`,
      format: `percent`
    }
  },

  dimensions: {
    status: {
      sql: `status`,
      type: `number`
    },

    isError: {
      type: `string`,
      case: {
        when: [{
          sql: `${CUBE}.status >= 400`, label: `Yes`
        }],
        else: { label: `No` }
      }
    },

    createdAt: {
      sql: `from_unixtime(created_at)`,
      type: `time`
    }
  }
});

Here we use the FILTER_PARAMS variable to generate an SQL query with a partition filter.
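To see what the partition filter looks like once it is rendered into SQL, the helper can be called by hand. This is only an illustration: inside Cube.js the `from` and `to` arguments are supplied by the framework from the query's date range, whereas here they are hand-written quoted literals.

```javascript
// Same helper as in the schema.
const partitionFilter = (from, to) => `
    date(from_iso8601_timestamp(${from})) <= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d') AND
    date(from_iso8601_timestamp(${to})) >= date_parse(partition_0 || partition_1 || partition_2, '%Y%m%d')
    `;

// Hand-written stand-ins for the values Cube.js would supply.
const from = `'2019-01-01T00:00:00.000Z'`;
const to = `'2019-01-07T23:59:59.999Z'`;

const sql = `select * from part_demo_kinesis_bucket WHERE ${partitionFilter(from, to).trim()}`;
console.log(sql);
```

The resulting condition compares the requested date range with the date assembled from the year, month and day partition columns, so Athena can skip directories outside that range.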

We also define the metrics and parameters that we want to show on the dashboard, and specify pre-aggregations. Cube.js will create additional tables with pre-aggregated data and will update the data automatically as it arrives. This not only speeds up queries but also reduces the cost of using Athena.

Let's add this information to the data schema file:

preAggregations: {
  main: {
    type: `rollup`,
    measureReferences: [count, errorCount],
    dimensionReferences: [isError, status],
    timeDimensionReference: createdAt,
    granularity: `day`,
    partitionGranularity: `month`,
    refreshKey: {
      sql: FILTER_PARAMS.Logs.createdAt.filter((from, to) => 
        `select
           CASE WHEN from_iso8601_timestamp(${to}) + interval '3' day > now()
           THEN date_trunc('hour', now()) END`
      )
    }
  }
}

In this model we specify that data must be pre-aggregated for all the metrics used, and that the pre-aggregation should be partitioned by month. Partitioning pre-aggregations can significantly speed up collecting and updating the data.

Now we can assemble the dashboard!

The Cube.js backend provides a REST API and a set of client libraries for popular front-end frameworks. We will use the React version of the client to build the dashboard. Cube.js only provides data, so we will need a visualization library. I like recharts, but you can use any.

The Cube.js server accepts a request in JSON format that specifies the required metrics. For example, to calculate how many errors Nginx returned per day, you need to send the following request:

{
  "measures": ["Logs.errorCount"],
  "timeDimensions": [
    {
      "dimension": "Logs.createdAt",
      "dateRange": ["2019-01-01", "2019-01-07"],
      "granularity": "day"
    }
  ]
}
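As a sketch of how such a request might be sent over the REST API without the client library, the query can be serialized into the `query` parameter of the `load` endpoint. The base URL below assumes a local development server, and the token is a placeholder; no network call is made here.

```javascript
const query = {
  measures: ['Logs.errorCount'],
  timeDimensions: [
    {
      dimension: 'Logs.createdAt',
      dateRange: ['2019-01-01', '2019-01-07'],
      granularity: 'day',
    },
  ],
};

// Assumed base URL of a locally running Cube.js backend.
const apiUrl = 'http://localhost:4000/cubejs-api/v1';

// The JSON query is passed URL-encoded in the "query" parameter.
const url = `${apiUrl}/load?query=${encodeURIComponent(JSON.stringify(query))}`;

// Placeholder token, sent in the Authorization header.
const headers = { Authorization: 'YOUR-CUBEJS-API-TOKEN' };

console.log(url);
console.log(headers);
```

In practice the client library builds and sends this request for you, so the sketch is mainly useful for debugging with curl or a browser.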

Let's install the Cube.js client and the React component library via NPM:

$ npm i --save @cubejs-client/core @cubejs-client/react

We import cubejs and the QueryRenderer component to fetch the data, and assemble the dashboard:

Dashboard code

import React from 'react';
import { LineChart, Line, XAxis, YAxis } from 'recharts';
import cubejs from '@cubejs-client/core';
import { QueryRenderer } from '@cubejs-client/react';

const cubejsApi = cubejs(
  'YOUR-CUBEJS-API-TOKEN',
  { apiUrl: 'http://localhost:4000/cubejs-api/v1' },
);

export default () => {
  return (
    <QueryRenderer
      query={{
        measures: ['Logs.errorCount'],
        timeDimensions: [{
            dimension: 'Logs.createdAt',
            dateRange: ['2019-01-01', '2019-01-07'],
            granularity: 'day'
        }]
      }}
      cubejsApi={cubejsApi}
      render={({ resultSet }) => {
        if (!resultSet) {
          return 'Loading...';
        }

        return (
          <LineChart data={resultSet.rawData()}>
            <XAxis dataKey="Logs.createdAt"/>
            <YAxis/>
            <Line type="monotone" dataKey="Logs.errorCount" stroke="#8884d8"/>
          </LineChart>
        );
      }}
    />
  )
}

The dashboard sources are available in the CodeSandbox.

Source: www.habr.com
