Webalizer and Google Analytics have helped me gain insight into what is happening on websites for many years. Now I understand that they provide very little useful information. With access to your access.log file, it is very easy to work out the statistics yourself and to implement them with quite basic tools such as sqlite, html, the sql language and any scripting programming language.
The data source for Webalizer is the server's access.log file. This is what its bars and numbers look like, from which only the total volume of traffic is clear:
Tools such as Google Analytics collect the data themselves from the loaded page. They show us a couple of diagrams and lines, from which it is often difficult to draw correct conclusions. Maybe more effort should have been made?
So, what do you want to see in website visitor statistics?
User and bot traffic
Site traffic is often limited, and it is necessary to see how much of it is useful traffic. For example, like this:
SQL report query
SELECT
1 as 'StackedArea: Traffic generated by Users and Bots',
strftime('%d.%m', datetime(FCT.EVENT_DT, 'unixepoch')) AS 'Day',
SUM(CASE WHEN USG.AGENT_BOT!='n.a.' THEN FCT.BYTES ELSE 0 END)/1000 AS 'Bots, KB',
SUM(CASE WHEN USG.AGENT_BOT='n.a.' THEN FCT.BYTES ELSE 0 END)/1000 AS 'Users, KB'
FROM
FCT_ACCESS_USER_AGENT_DD FCT,
DIM_USER_AGENT USG
WHERE FCT.DIM_USER_AGENT_ID=USG.DIM_USER_AGENT_ID
AND datetime(FCT.EVENT_DT, 'unixepoch') >= date('now', '-14 day')
GROUP BY strftime('%d.%m', datetime(FCT.EVENT_DT, 'unixepoch'))
ORDER BY FCT.EVENT_DT
The graph shows the constant activity of bots. It would be interesting to study the most active representatives in detail.
Annoying bots
We classify bots based on user agent information. Additional statistics on daily traffic and on the number of successful and unsuccessful requests give a good idea of bot activity.
SQL report query
SELECT
1 AS 'Table: Annoying Bots',
MAX(USG.AGENT_BOT) AS 'Bot',
ROUND(SUM(FCT.BYTES)/1000 / 14.0, 1) AS 'KB per Day',
ROUND(SUM(FCT.IP_CNT) / 14.0, 1) AS 'IPs per Day',
ROUND(SUM(CASE WHEN STS.STATUS_GROUP IN ('Client Error', 'Server Error') THEN FCT.REQUEST_CNT / 14.0 ELSE 0 END), 1) AS 'Error Requests per Day',
ROUND(SUM(CASE WHEN STS.STATUS_GROUP IN ('Successful', 'Redirection') THEN FCT.REQUEST_CNT / 14.0 ELSE 0 END), 1) AS 'Success Requests per Day',
USG.USER_AGENT_NK AS 'Agent'
FROM FCT_ACCESS_USER_AGENT_DD FCT,
DIM_USER_AGENT USG,
DIM_HTTP_STATUS STS
WHERE FCT.DIM_USER_AGENT_ID = USG.DIM_USER_AGENT_ID
AND FCT.DIM_HTTP_STATUS_ID = STS.DIM_HTTP_STATUS_ID
AND USG.AGENT_BOT != 'n.a.'
AND datetime(FCT.EVENT_DT, 'unixepoch') >= date('now', '-14 day')
GROUP BY USG.USER_AGENT_NK
ORDER BY 3 DESC
LIMIT 10
In this case, the result of the analysis was the decision to restrict access to the site by adding these rules to the robots.txt file:
User-agent: AhrefsBot
Disallow: /
User-agent: dotbot
Disallow: /
User-agent: bingbot
Crawl-delay: 5
The first two bots disappeared from the table, and the MS robots moved down from the top rows.
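As a quick sanity check of rules like the ones above, here is a minimal sketch in Python (an assumption; the article itself uses PHP) that feeds the same robots.txt text to the standard urllib.robotparser module. It only shows how a well-behaved parser reads the rules; whether a given crawler actually honors them is up to that crawler. The request path /any/page is invented for the example.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt fragment above.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Disallow: /
User-agent: dotbot
Disallow: /
User-agent: bingbot
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Blocked bots may not fetch anything; bingbot may, but with a delay.
ahrefs_allowed = parser.can_fetch("AhrefsBot", "/any/page")
dotbot_allowed = parser.can_fetch("dotbot", "/any/page")
bing_allowed = parser.can_fetch("bingbot", "/any/page")
bing_delay = parser.crawl_delay("bingbot")
# An agent that is not listed falls through to "allowed".
other_allowed = parser.can_fetch("Mozilla/5.0", "/any/page")

print(ahrefs_allowed, dotbot_allowed, bing_allowed, bing_delay, other_allowed)
```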
Day and time of greatest activity
Upswings are visible in the traffic. To study them in detail, it is necessary to pinpoint when they occurred; it is not necessary to display every hour and day of the measurement period. This makes it easier to find the individual requests in the log file if a detailed analysis is needed.
SQL report query
SELECT
1 AS 'Line: Day and Hour of Hits from Users and Bots',
strftime('%d.%m-%H', datetime(EVENT_DT, 'unixepoch')) AS 'Date Time',
HIB AS 'Bots, Hits',
HIU AS 'Users, Hits'
FROM (
SELECT
EVENT_DT,
SUM(CASE WHEN AGENT_BOT!='n.a.' THEN LINE_CNT ELSE 0 END) AS HIB,
SUM(CASE WHEN AGENT_BOT='n.a.' THEN LINE_CNT ELSE 0 END) AS HIU
FROM FCT_ACCESS_REQUEST_REF_HH
WHERE datetime(EVENT_DT, 'unixepoch') >= date('now', '-14 day')
GROUP BY EVENT_DT
ORDER BY SUM(LINE_CNT) DESC
LIMIT 10
) ORDER BY EVENT_DT
On the chart we see the most active hours, 11, 14 and 20, of the first day. But the next day at 13:00 the bots were active.
Average daily user activity by week
We have sorted things out a little with activity and traffic. The next question was the activity of the users themselves. For such statistics, long aggregation periods, such as a week, are desirable.
SQL report query
SELECT
1 as 'Line: Average Daily User Activity by Week',
strftime('%W week', datetime(FCT.EVENT_DT, 'unixepoch')) AS 'Week',
ROUND(1.0*SUM(FCT.PAGE_CNT)/SUM(FCT.IP_CNT),1) AS 'Pages per IP per Day',
ROUND(1.0*SUM(FCT.FILE_CNT)/SUM(FCT.IP_CNT),1) AS 'Files per IP per Day'
FROM
FCT_ACCESS_USER_AGENT_DD FCT,
DIM_USER_AGENT USG,
DIM_HTTP_STATUS HST
WHERE FCT.DIM_USER_AGENT_ID=USG.DIM_USER_AGENT_ID
AND FCT.DIM_HTTP_STATUS_ID = HST.DIM_HTTP_STATUS_ID
AND USG.AGENT_BOT='n.a.' /* users only */
AND HST.STATUS_GROUP IN ('Successful') /* good pages */
AND datetime(FCT.EVENT_DT, 'unixepoch') > date('now', '-3 month')
GROUP BY strftime('%W week', datetime(FCT.EVENT_DT, 'unixepoch'))
ORDER BY FCT.EVENT_DT
The weekly statistics show that, on average, one user opens 1.6 pages per day. The number of files requested per user in this case depends on new files being added to the site.
All requests and their statuses
Webalizer always showed specific page codes, whereas I always wanted to see just the number of successful requests and errors.
SQL report query
SELECT
1 as 'Line: All Requests by Status',
strftime('%d.%m', datetime(FCT.EVENT_DT, 'unixepoch')) AS 'Day',
SUM(CASE WHEN STS.STATUS_GROUP='Successful' THEN FCT.REQUEST_CNT ELSE 0 END) AS 'Success',
SUM(CASE WHEN STS.STATUS_GROUP='Redirection' THEN FCT.REQUEST_CNT ELSE 0 END) AS 'Redirect',
SUM(CASE WHEN STS.STATUS_GROUP='Client Error' THEN FCT.REQUEST_CNT ELSE 0 END) AS 'Customer Error',
SUM(CASE WHEN STS.STATUS_GROUP='Server Error' THEN FCT.REQUEST_CNT ELSE 0 END) AS 'Server Error'
FROM
FCT_ACCESS_USER_AGENT_DD FCT,
DIM_HTTP_STATUS STS
WHERE FCT.DIM_HTTP_STATUS_ID=STS.DIM_HTTP_STATUS_ID
AND datetime(FCT.EVENT_DT, 'unixepoch') >= date('now', '-14 day')
GROUP BY strftime('%d.%m', datetime(FCT.EVENT_DT, 'unixepoch'))
ORDER BY FCT.EVENT_DT
The report displays requests, not clicks (hits): unlike LINE_CNT, the REQUEST_CNT metric is calculated as COUNT(DISTINCT STG.REQUEST_NK). The goal is to show effective events. For example, MS bots poll the robots.txt file hundreds of times a day, and in this case such polls are counted once. This smooths out the jumps in the graph.
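To make the difference between the two counters concrete, here is a small sketch in Python with the standard sqlite3 module (an assumption; the article itself uses PHP). The table is a cut-down stand-in for the stage table with only the two columns needed, and the sample rows are invented: three hits on /robots.txt and one on /index.

```python
import sqlite3

# In-memory stand-in for the stage table; column types are assumed.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE STG_ACCESS_LOG (LINE_NK INTEGER, REQUEST_NK TEXT)")
con.executemany(
    "INSERT INTO STG_ACCESS_LOG VALUES (?, ?)",
    [(1, "/robots.txt"), (2, "/robots.txt"), (3, "/robots.txt"), (4, "/index")],
)

# LINE_CNT counts every log line, REQUEST_CNT counts each distinct request once.
line_cnt, request_cnt = con.execute(
    "SELECT COUNT(DISTINCT LINE_NK), COUNT(DISTINCT REQUEST_NK) FROM STG_ACCESS_LOG"
).fetchone()
print(line_cnt, request_cnt)
```

The repeated robots.txt polls inflate the hit counter but contribute a single request, which is the smoothing effect described above.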
The graph shows many errors: these are non-existent pages. The result of the analysis was the addition of redirects from the removed pages.
Bad requests
To examine the requests in detail, you can display detailed statistics.
SQL report query
SELECT
1 AS 'Table: Top Error Requests',
REQ.REQUEST_NK AS 'Request',
'Error' AS 'Request Status',
ROUND(SUM(FCT.LINE_CNT) / 14.0, 1) AS 'Hits per Day',
ROUND(SUM(FCT.IP_CNT) / 14.0, 1) AS 'IPs per Day',
ROUND(SUM(FCT.BYTES)/1000 / 14.0, 1) AS 'KB per Day'
FROM
FCT_ACCESS_REQUEST_REF_HH FCT,
DIM_REQUEST_V_ACT REQ
WHERE FCT.DIM_REQUEST_ID = REQ.DIM_REQUEST_ID
AND FCT.STATUS_GROUP IN ('Client Error', 'Server Error')
AND datetime(FCT.EVENT_DT, 'unixepoch') >= date('now', '-14 day')
GROUP BY REQ.REQUEST_NK
ORDER BY 4 DESC
LIMIT 20
This list will also contain all the probing calls, for example a request to /wp-login.php. By adjusting the server's request rewriting rules, you can change how the server reacts to such requests and send them to the start page.
So, a few simple reports based on the server log file give a fairly complete picture of what is happening on the site.
How to get the information?
A sqlite database is sufficient. Let's create the tables:
an auxiliary table for logging the ETL processes;
a stage table, where we will write the log files using PHP;
two aggregate tables: a daily table with statistics on user agents and request statuses, and an hourly table with statistics on requests, status groups and agents;
four tables of the corresponding dimensions.
The result is the following relational model:
Data model
Script to create an object in the sqlite database:
DDL object creation
DROP TABLE IF EXISTS DIM_USER_AGENT;
CREATE TABLE DIM_USER_AGENT (
DIM_USER_AGENT_ID INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
USER_AGENT_NK TEXT NOT NULL DEFAULT 'n.a.',
AGENT_OS TEXT NOT NULL DEFAULT 'n.a.',
AGENT_ENGINE TEXT NOT NULL DEFAULT 'n.a.',
AGENT_DEVICE TEXT NOT NULL DEFAULT 'n.a.',
AGENT_BOT TEXT NOT NULL DEFAULT 'n.a.',
UPDATE_DT INTEGER NOT NULL DEFAULT 0,
UNIQUE (USER_AGENT_NK)
);
INSERT INTO DIM_USER_AGENT (DIM_USER_AGENT_ID) VALUES (-1);
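The last statement of the DDL inserts a row with the ID -1 and leaves every other column to its default. A minimal sketch in Python with the standard sqlite3 module (an assumption; the article itself uses PHP) runs the same script and prints that row. Reading it as a placeholder "unknown agent" member is my interpretation; the article does not spell out its purpose.

```python
import sqlite3

# The DDL from the article, unchanged apart from layout.
DDL = """
DROP TABLE IF EXISTS DIM_USER_AGENT;
CREATE TABLE DIM_USER_AGENT (
  DIM_USER_AGENT_ID INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
  USER_AGENT_NK     TEXT NOT NULL DEFAULT 'n.a.',
  AGENT_OS          TEXT NOT NULL DEFAULT 'n.a.',
  AGENT_ENGINE      TEXT NOT NULL DEFAULT 'n.a.',
  AGENT_DEVICE      TEXT NOT NULL DEFAULT 'n.a.',
  AGENT_BOT         TEXT NOT NULL DEFAULT 'n.a.',
  UPDATE_DT         INTEGER NOT NULL DEFAULT 0,
  UNIQUE (USER_AGENT_NK)
);
INSERT INTO DIM_USER_AGENT (DIM_USER_AGENT_ID) VALUES (-1);
"""

con = sqlite3.connect(":memory:")
con.executescript(DDL)

# Every text column of the -1 row falls back to 'n.a.', UPDATE_DT to 0.
default_row = con.execute("SELECT * FROM DIM_USER_AGENT").fetchone()
print(default_row)
```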
Stage
For the access.log file, it is necessary to read, parse and write all requests to the database. This can be done either directly with a scripting language or with sqlite tools.
Log file format:
//67.221.59.195 - - [28/Dec/2012:01:47:47 +0100] "GET /files/default.css HTTP/1.1" 200 1512 "https://project.edu/" "Mozilla/4.0"
//host ident auth time method request_nk protocol status bytes ref browser
$log_pattern = '/^([^ ]+) ([^ ]+) ([^ ]+) (\[[^\]]+\]) "(.*) (.*) (.*)" ([0-9-]+) ([0-9-]+) "(.*)" "(.*)"$/';
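The pattern can be tried out on the sample line from the comment. The following is a sketch in Python (an assumption; the article itself uses PHP) with the bracket escapes of the time group written out. The field names are taken from the comment line above the pattern, while the function name parse_line is made up for the example.

```python
import re

# Python version of the log pattern; the time group matches "[...]" literally.
LOG_PATTERN = re.compile(
    r'^([^ ]+) ([^ ]+) ([^ ]+) (\[[^\]]+\]) "(.*) (.*) (.*)" '
    r'([0-9-]+) ([0-9-]+) "(.*)" "(.*)"$'
)
FIELDS = ("host", "ident", "auth", "time", "method", "request_nk",
          "protocol", "status", "bytes", "ref", "browser")


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.rstrip("\n"))
    return dict(zip(FIELDS, match.groups())) if match else None


sample = ('67.221.59.195 - - [28/Dec/2012:01:47:47 +0100] '
          '"GET /files/default.css HTTP/1.1" 200 1512 '
          '"https://project.edu/" "Mozilla/4.0"')
row = parse_line(sample)
print(row)
```

Lines that do not fit the pattern return None, so they can be skipped or logged instead of stopping the load.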
Key propagation
When the raw data is in the database, the keys that are not yet in the dimension tables need to be written there. After that, references to the dimensions can be built. For example, in the DIM_REFERRER table the key is a combination of three fields.
SQL key propagation query
/* Propagate the referrer from access log */
INSERT INTO DIM_REFERRER (HOST_NK, PATH_NK, QUERY_NK, UPDATE_DT)
SELECT
CLS.HOST_NK,
CLS.PATH_NK,
CLS.QUERY_NK,
STRFTIME('%s','now') AS UPDATE_DT
FROM (
SELECT DISTINCT
REFERRER_HOST AS HOST_NK,
REFERRER_PATH AS PATH_NK,
CASE WHEN INSTR(REFERRER_QUERY,'&sid')>0 THEN SUBSTR(REFERRER_QUERY, 1, INSTR(REFERRER_QUERY,'&sid')-1) /* cut off the sid parameter (site-specific) */
ELSE REFERRER_QUERY END AS QUERY_NK
FROM STG_ACCESS_LOG
) CLS
LEFT OUTER JOIN DIM_REFERRER TRG
ON (CLS.HOST_NK = TRG.HOST_NK AND CLS.PATH_NK = TRG.PATH_NK AND CLS.QUERY_NK = TRG.QUERY_NK)
WHERE TRG.DIM_REFERRER_ID IS NULL
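The LEFT OUTER JOIN with the IS NULL filter makes the propagation safe to repeat: only key combinations missing from the dimension are inserted. Here is a minimal sketch in Python with the standard sqlite3 module (an assumption; the article itself uses PHP). Both table definitions are reduced, assumed versions with just the columns the query touches, the sid trimming is left out, and the sample referrers are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE STG_ACCESS_LOG (REFERRER_HOST TEXT, REFERRER_PATH TEXT, REFERRER_QUERY TEXT);
CREATE TABLE DIM_REFERRER (
  DIM_REFERRER_ID INTEGER PRIMARY KEY AUTOINCREMENT,
  HOST_NK TEXT, PATH_NK TEXT, QUERY_NK TEXT, UPDATE_DT INTEGER
);
""")

# Same shape as the article's query, without the sid handling.
PROPAGATE = """
INSERT INTO DIM_REFERRER (HOST_NK, PATH_NK, QUERY_NK, UPDATE_DT)
SELECT CLS.HOST_NK, CLS.PATH_NK, CLS.QUERY_NK, STRFTIME('%s','now')
FROM (SELECT DISTINCT REFERRER_HOST AS HOST_NK, REFERRER_PATH AS PATH_NK,
             REFERRER_QUERY AS QUERY_NK
      FROM STG_ACCESS_LOG) CLS
LEFT OUTER JOIN DIM_REFERRER TRG
  ON (CLS.HOST_NK = TRG.HOST_NK AND CLS.PATH_NK = TRG.PATH_NK AND CLS.QUERY_NK = TRG.QUERY_NK)
WHERE TRG.DIM_REFERRER_ID IS NULL
"""

con.executemany("INSERT INTO STG_ACCESS_LOG VALUES (?, ?, ?)", [
    ("project.edu", "/", ""),
    ("project.edu", "/", ""),            # duplicate of the first row
    ("example.org", "/links", "page=2"),
])

con.execute(PROPAGATE)
first = con.execute("SELECT COUNT(*) FROM DIM_REFERRER").fetchone()[0]
con.execute(PROPAGATE)                   # second run finds nothing new
second = con.execute("SELECT COUNT(*) FROM DIM_REFERRER").fetchone()[0]
print(first, second)
```

One caveat of this pattern: NULL never compares equal to NULL, so a referrer part stored as NULL would be inserted again on every run. The sketch stores empty strings instead.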
Propagation to the user agent table may contain bot logic, for example this sql snippet:
CASE
WHEN INSTR(LOWER(CLS.BROWSER),'yandex.com')>0
THEN 'yandex'
WHEN INSTR(LOWER(CLS.BROWSER),'googlebot')>0
THEN 'google'
WHEN INSTR(LOWER(CLS.BROWSER),'bingbot')>0
THEN 'microsoft'
WHEN INSTR(LOWER(CLS.BROWSER),'ahrefsbot')>0
THEN 'ahrefs'
WHEN INSTR(LOWER(CLS.BROWSER),'mj12bot')>0
THEN 'majestic-12'
WHEN INSTR(LOWER(CLS.BROWSER),'compatible')>0 OR INSTR(LOWER(CLS.BROWSER),'http')>0
OR INSTR(LOWER(CLS.BROWSER),'libwww')>0 OR INSTR(LOWER(CLS.BROWSER),'spider')>0
OR INSTR(LOWER(CLS.BROWSER),'java')>0 OR INSTR(LOWER(CLS.BROWSER),'python')>0
OR INSTR(LOWER(CLS.BROWSER),'robot')>0 OR INSTR(LOWER(CLS.BROWSER),'curl')>0
OR INSTR(LOWER(CLS.BROWSER),'wget')>0
THEN 'other'
ELSE 'n.a.' END AS AGENT_BOT
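The same rules can be tried outside the database. Below is a sketch in Python (an assumption; the article itself uses PHP) that mirrors the CASE expression branch by branch. The function name classify_agent and the sample user agent strings are invented for the example.

```python
# Named bots are checked first, in the same order as the CASE branches.
NAMED_BOTS = (
    ("yandex.com", "yandex"),
    ("googlebot", "google"),
    ("bingbot", "microsoft"),
    ("ahrefsbot", "ahrefs"),
    ("mj12bot", "majestic-12"),
)
# Generic markers that put an agent into the 'other' bot group.
GENERIC_MARKERS = ("compatible", "http", "libwww", "spider",
                   "java", "python", "robot", "curl", "wget")


def classify_agent(browser):
    """Return the bot label for a user agent string, or 'n.a.' for users."""
    agent = browser.lower()
    for marker, label in NAMED_BOTS:
        if marker in agent:
            return label
    if any(marker in agent for marker in GENERIC_MARKERS):
        return "other"
    return "n.a."


print(classify_agent("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
print(classify_agent("curl/7.68.0"))
print(classify_agent("Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/115.0"))
```

Note that the 'compatible' marker also matches old Internet Explorer user agents, which contain that word, so such visitors would land in the 'other' group under these rules.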
Aggregate tables
Last but not least, we load the aggregate tables. For example, the daily table can be loaded as follows:
SQL query for loading the aggregate
/* Load fact from access log */
INSERT INTO FCT_ACCESS_USER_AGENT_DD (EVENT_DT, DIM_USER_AGENT_ID, DIM_HTTP_STATUS_ID, PAGE_CNT, FILE_CNT, REQUEST_CNT, LINE_CNT, IP_CNT, BYTES)
WITH STG AS (
SELECT
STRFTIME( '%s', SUBSTR(TIME_NK,9,4) || '-' ||
CASE SUBSTR(TIME_NK,5,3)
WHEN 'Jan' THEN '01' WHEN 'Feb' THEN '02' WHEN 'Mar' THEN '03' WHEN 'Apr' THEN '04' WHEN 'May' THEN '05' WHEN 'Jun' THEN '06'
WHEN 'Jul' THEN '07' WHEN 'Aug' THEN '08' WHEN 'Sep' THEN '09' WHEN 'Oct' THEN '10' WHEN 'Nov' THEN '11'
ELSE '12' END || '-' || SUBSTR(TIME_NK,2,2) || ' 00:00:00' ) AS EVENT_DT,
BROWSER AS USER_AGENT_NK,
REQUEST_NK,
IP_NR,
STATUS,
LINE_NK,
BYTES
FROM STG_ACCESS_LOG
)
SELECT
CAST(STG.EVENT_DT AS INTEGER) AS EVENT_DT,
USG.DIM_USER_AGENT_ID,
HST.DIM_HTTP_STATUS_ID,
COUNT(DISTINCT (CASE WHEN INSTR(STG.REQUEST_NK,'.')=0 THEN STG.REQUEST_NK END) ) AS PAGE_CNT,
COUNT(DISTINCT (CASE WHEN INSTR(STG.REQUEST_NK,'.')>0 THEN STG.REQUEST_NK END) ) AS FILE_CNT,
COUNT(DISTINCT STG.REQUEST_NK) AS REQUEST_CNT,
COUNT(DISTINCT STG.LINE_NK) AS LINE_CNT,
COUNT(DISTINCT STG.IP_NR) AS IP_CNT,
SUM(BYTES) AS BYTES
FROM STG,
DIM_HTTP_STATUS HST,
DIM_USER_AGENT USG
WHERE STG.STATUS = HST.STATUS_NK
AND STG.USER_AGENT_NK = USG.USER_AGENT_NK
AND CAST(STG.EVENT_DT AS INTEGER) > $param_epoch_from /* load epoch date */
AND CAST(STG.EVENT_DT AS INTEGER) < strftime('%s', date('now', 'start of day'))
GROUP BY STG.EVENT_DT, HST.DIM_HTTP_STATUS_ID, USG.DIM_USER_AGENT_ID
The sqlite database allows complex queries to be written. The WITH clause contains the preparation of the data and keys. The main query collects all the references to the dimensions.
This condition prevents the history from being loaded again: CAST(STG.EVENT_DT AS INTEGER) > $param_epoch_from, where the parameter is the result of the query
'SELECT COALESCE(MAX(EVENT_DT), '3600') AS LAST_EVENT_EPOCH FROM FCT_ACCESS_USER_AGENT_DD'
This condition loads only full days: CAST(STG.EVENT_DT AS INTEGER) < strftime('%s', date('now', 'start of day'))
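The two conditions form a load window: everything after the newest day already in the fact table and before the start of the current day. A minimal sketch in Python with the standard sqlite3 module (an assumption; the article itself uses PHP) shows how the lower bound behaves on an empty and on a filled table. The fact table is reduced to its EVENT_DT column, and the function name last_event_epoch is made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Reduced fact table: only the column the watermark query reads.
con.execute("CREATE TABLE FCT_ACCESS_USER_AGENT_DD (EVENT_DT INTEGER)")


def last_event_epoch(con):
    """Newest loaded day as an epoch, or 3600 when nothing is loaded yet."""
    row = con.execute(
        "SELECT COALESCE(MAX(EVENT_DT), '3600') AS LAST_EVENT_EPOCH "
        "FROM FCT_ACCESS_USER_AGENT_DD"
    ).fetchone()
    return int(row[0])


empty_from = last_event_epoch(con)          # first run: load everything
con.execute("INSERT INTO FCT_ACCESS_USER_AGENT_DD VALUES (1356652800)")  # 2012-12-28 00:00:00 UTC
loaded_from = last_event_epoch(con)         # later runs: continue after this day

# Upper bound: start of the current day (UTC), so a partial day is never loaded.
today_start = int(con.execute(
    "SELECT strftime('%s', date('now', 'start of day'))"
).fetchone()[0])
print(empty_from, loaded_from, today_start)
```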
Pages and files are counted in a primitive way, by searching for a dot in the request.
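The dot rule is easy to reproduce, including its blind spots. Here is a sketch in Python (an assumption; the article itself uses PHP) that mirrors INSTR(REQUEST_NK, '.') on a few invented requests; the helper name is_file is made up.

```python
def is_file(request_nk):
    """Mirror of INSTR(REQUEST_NK, '.') > 0: any dot marks the request as a file."""
    return "." in request_nk


requests = ["/files/default.css", "/about", "/wp-login.php", "/search?q=v1.2"]

# Distinct requests on each side, as COUNT(DISTINCT ...) does in the load query.
page_cnt = len({r for r in requests if not is_file(r)})
file_cnt = len({r for r in requests if is_file(r)})
print(page_cnt, file_cnt)
```

A page whose query string contains a dot, like the last sample, is counted as a file, which is the price of keeping the rule this simple.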
Reports
In complex visualization systems, it is possible to create a meta-model based on database objects and to manage filters and aggregation rules dynamically. Ultimately, all decent tools generate an SQL query.
In this example, we create ready-made SQL queries and save them as views in the database. These are the reports.
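Saving a report as a view can be sketched as follows, in Python with the standard sqlite3 module (an assumption; the article itself uses PHP). The view name RPT_ACCESS_REQUEST_STATUS comes from the report list in the article, but its body here is a shortened, assumed version of the status report: the tables carry only the columns used, the 14-day filter is dropped so that the invented sample row from 2012 stays visible, and the column aliases are simplified.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE DIM_HTTP_STATUS (DIM_HTTP_STATUS_ID INTEGER PRIMARY KEY, STATUS_GROUP TEXT);
CREATE TABLE FCT_ACCESS_USER_AGENT_DD (EVENT_DT INTEGER, DIM_HTTP_STATUS_ID INTEGER, REQUEST_CNT INTEGER);

-- The report is nothing more than a stored SELECT.
CREATE VIEW RPT_ACCESS_REQUEST_STATUS AS
SELECT strftime('%d.%m', datetime(FCT.EVENT_DT, 'unixepoch')) AS Day,
       SUM(CASE WHEN STS.STATUS_GROUP = 'Successful'   THEN FCT.REQUEST_CNT ELSE 0 END) AS Success,
       SUM(CASE WHEN STS.STATUS_GROUP = 'Client Error' THEN FCT.REQUEST_CNT ELSE 0 END) AS ClientError
FROM FCT_ACCESS_USER_AGENT_DD FCT
JOIN DIM_HTTP_STATUS STS ON FCT.DIM_HTTP_STATUS_ID = STS.DIM_HTTP_STATUS_ID
GROUP BY Day;

INSERT INTO DIM_HTTP_STATUS VALUES (1, 'Successful'), (2, 'Client Error');
INSERT INTO FCT_ACCESS_USER_AGENT_DD VALUES (1356652800, 1, 40), (1356652800, 2, 3);
""")

# Reading the report is a plain SELECT against the view.
report = con.execute("SELECT * FROM RPT_ACCESS_REQUEST_STATUS").fetchall()
print(report)
```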
Visualization
Bluff: Beautiful graphs in JavaScript was used as the visualization tool.
To do this, it was necessary to go through all the reports with PHP and generate an html file with tables.
$sqls = array(
'SELECT * FROM RPT_ACCESS_USER_VS_BOT',
'SELECT * FROM RPT_ACCESS_ANNOYING_BOT',
'SELECT * FROM RPT_ACCESS_TOP_HOUR_HIT',
'SELECT * FROM RPT_ACCESS_USER_ACTIVE',
'SELECT * FROM RPT_ACCESS_REQUEST_STATUS',
'SELECT * FROM RPT_ACCESS_TOP_REQUEST_PAGE',
'SELECT * FROM RPT_ACCESS_TOP_REQUEST_REFERRER',
'SELECT * FROM RPT_ACCESS_NEW_REQUEST',
'SELECT * FROM RPT_ACCESS_TOP_REQUEST_SUCCESS',
'SELECT * FROM RPT_ACCESS_TOP_REQUEST_ERROR'
);
The tool simply visualizes the result tables.
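The loop over the reports could look roughly like this. It is a sketch in Python with the standard sqlite3 and html modules, not the author's PHP script; the function name render_report and the demo view RPT_ACCESS_DEMO are invented. How Bluff then picks up the tables is not shown in the article, so the sketch stops at the HTML.

```python
import sqlite3
from html import escape


def render_report(con, sql):
    """Run one report query and return the result as an HTML table."""
    cur = con.execute(sql)
    head = "".join(f"<th>{escape(col[0])}</th>" for col in cur.description)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(value))}</td>" for value in row) + "</tr>"
        for row in cur
    )
    return f"<table><tr>{head}</tr>{body}</table>"


# Tiny demo database with a single report view.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE T (Day TEXT, Hits INTEGER);
INSERT INTO T VALUES ('28.12', 40);
CREATE VIEW RPT_ACCESS_DEMO AS SELECT * FROM T;
""")

sqls = ["SELECT * FROM RPT_ACCESS_DEMO"]
page = "<html><body>" + "".join(render_report(con, sql) for sql in sqls) + "</body></html>"
print(page)
```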
Conclusion
Using web analytics as an example, the article describes the mechanisms needed to build data warehouses. As the results show, the simplest tools are sufficient for deep analysis and visualization of data.
In the future, using this warehouse as an example, we will try to implement structures such as slowly changing dimensions, metadata, aggregation levels and the integration of data from different sources.
We will also take a closer look at the simplest tool for managing ETL processes, based on a single table.
Let's return to the topic of measuring data quality and automating this process.
We will study the problems of the technical environment and of maintaining data warehouses, for which we will implement a warehouse server with minimal resources, for example based on a Raspberry Pi.
source: www.habr.com