Site statistics and your own small storage

Webalizer and Google Analytics have helped me understand what happens on websites for many years. I now realize that they provide very little useful information. With access to your own access.log file, it is easy to make sense of the statistics and to build quite basic tooling from sqlite, html, the sql language and any scripting language.

Webalizer's data source is the server's access.log file. Its bar charts and figures make little clear beyond the total volume of traffic.

Tools such as Google Analytics collect data themselves from the loaded page. They show us a couple of diagrams and lines, from which it is often hard to draw correct conclusions. Perhaps more effort was needed? I don't know.

So what do I want to see in the website's visitor statistics?

User and bot traffic

Site traffic is often limited, so it is necessary to see how much of it is useful traffic. For example:

SQL report query

SELECT
1 as 'StackedArea: Traffic generated by Users and Bots',
strftime('%d.%m', datetime(FCT.EVENT_DT, 'unixepoch')) AS 'Day',
SUM(CASE WHEN USG.AGENT_BOT!='n.a.' THEN FCT.BYTES ELSE 0 END)/1000 AS 'Bots, KB',
SUM(CASE WHEN USG.AGENT_BOT='n.a.' THEN FCT.BYTES ELSE 0 END)/1000 AS 'Users, KB'
FROM
  FCT_ACCESS_USER_AGENT_DD FCT,
  DIM_USER_AGENT USG
WHERE FCT.DIM_USER_AGENT_ID=USG.DIM_USER_AGENT_ID
  AND datetime(FCT.EVENT_DT, 'unixepoch') >= date('now', '-14 day')
GROUP BY strftime('%d.%m', datetime(FCT.EVENT_DT, 'unixepoch'))
ORDER BY FCT.EVENT_DT

The chart shows constant bot activity. It would be interesting to study the most active representatives in detail.

Annoying bots

We classify bots based on the user agent information. Additional statistics on daily traffic and on the number of successful and failed requests give a good picture of bot activity.

SQL report query

SELECT 
1 AS 'Table: Annoying Bots',
MAX(USG.AGENT_BOT) AS 'Bot',
ROUND(SUM(FCT.BYTES)/1000 / 14.0, 1) AS 'KB per Day',
ROUND(SUM(FCT.IP_CNT) / 14.0, 1) AS 'IPs per Day',
ROUND(SUM(CASE WHEN STS.STATUS_GROUP IN ('Client Error', 'Server Error') THEN FCT.REQUEST_CNT / 14.0 ELSE 0 END), 1) AS 'Error Requests per Day',
ROUND(SUM(CASE WHEN STS.STATUS_GROUP IN ('Successful', 'Redirection') THEN FCT.REQUEST_CNT / 14.0 ELSE 0 END), 1) AS 'Success Requests per Day',
USG.USER_AGENT_NK AS 'Agent'
FROM FCT_ACCESS_USER_AGENT_DD FCT,
     DIM_USER_AGENT USG,
     DIM_HTTP_STATUS STS
WHERE FCT.DIM_USER_AGENT_ID = USG.DIM_USER_AGENT_ID
  AND FCT.DIM_HTTP_STATUS_ID = STS.DIM_HTTP_STATUS_ID
  AND USG.AGENT_BOT != 'n.a.'
  AND datetime(FCT.EVENT_DT, 'unixepoch') >= date('now', '-14 day')
GROUP BY USG.USER_AGENT_NK
ORDER BY 3 DESC
LIMIT 10

In this case, the analysis led to the decision to restrict access to the site by adding the following to the robots.txt file:

User-agent: AhrefsBot
Disallow: /
User-agent: dotbot
Disallow: /
User-agent: bingbot
Crawl-delay: 5

The first two bots disappeared from the table, and the MS robots moved down from the top rows.
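The effect of these rules can be checked offline before deploying them. The following is a minimal sketch using Python's standard `urllib.robotparser`; the path `/any/page` is a made-up example. Keep in mind that robots.txt is advisory, and `Crawl-delay` is a non-standard directive that only some crawlers honor.

```python
from urllib.robotparser import RobotFileParser

# The rules from the article, fed to the parser as lines of text.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Disallow: /
User-agent: dotbot
Disallow: /
User-agent: bingbot
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# /any/page is a hypothetical path; any path behaves the same under "Disallow: /".
ahrefs_allowed = parser.can_fetch("AhrefsBot", "/any/page")
dotbot_allowed = parser.can_fetch("dotbot", "/any/page")
bing_allowed = parser.can_fetch("bingbot", "/any/page")
bing_delay = parser.crawl_delay("bingbot")

print(ahrefs_allowed, dotbot_allowed, bing_allowed, bing_delay)
```

The two blocked crawlers are denied everywhere, while bingbot stays allowed but is asked to wait between requests.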

Day and time of peak activity

Spikes are visible in the traffic. To study them in detail, you need to pin down when they occurred; there is no need to display every day and hour of the measured period. This makes it easier to find the individual requests in the log file if a detailed analysis is needed.

SQL report query

SELECT
1 AS 'Line: Day and Hour of Hits from Users and Bots',
strftime('%d.%m-%H', datetime(EVENT_DT, 'unixepoch')) AS 'Date Time',
HIB AS 'Bots, Hits',
HIU AS 'Users, Hits'
FROM (
	SELECT
	EVENT_DT,
	SUM(CASE WHEN AGENT_BOT!='n.a.' THEN LINE_CNT ELSE 0 END) AS HIB,
	SUM(CASE WHEN AGENT_BOT='n.a.' THEN LINE_CNT ELSE 0 END) AS HIU
	FROM FCT_ACCESS_REQUEST_REF_HH
	WHERE datetime(EVENT_DT, 'unixepoch') >= date('now', '-14 day')
	GROUP BY EVENT_DT
	ORDER BY SUM(LINE_CNT) DESC
	LIMIT 10
) ORDER BY EVENT_DT

On the chart we see the busiest hours, 11, 14 and 20, on the first day. The next day, however, the bots were active at 11:14.

Average daily user activity by week

We have sorted out activity and traffic a little. The next question was the activity of the users themselves. For such statistics, long aggregation periods, such as a week, are preferable.

SQL report query

SELECT
1 as 'Line: Average Daily User Activity by Week',
strftime('%W week', datetime(FCT.EVENT_DT, 'unixepoch')) AS 'Week',
ROUND(1.0*SUM(FCT.PAGE_CNT)/SUM(FCT.IP_CNT),1) AS 'Pages per IP per Day',
ROUND(1.0*SUM(FCT.FILE_CNT)/SUM(FCT.IP_CNT),1) AS 'Files per IP per Day'
FROM
  FCT_ACCESS_USER_AGENT_DD FCT,
  DIM_USER_AGENT USG,
  DIM_HTTP_STATUS HST
WHERE FCT.DIM_USER_AGENT_ID=USG.DIM_USER_AGENT_ID
  AND FCT.DIM_HTTP_STATUS_ID = HST.DIM_HTTP_STATUS_ID
  AND USG.AGENT_BOT='n.a.' /* users only */
  AND HST.STATUS_GROUP IN ('Successful') /* good pages */
  AND datetime(FCT.EVENT_DT, 'unixepoch') > date('now', '-3 month')
GROUP BY strftime('%W week', datetime(FCT.EVENT_DT, 'unixepoch'))
ORDER BY FCT.EVENT_DT

The weekly statistics show that on average one user opens 1.6 pages per day. The number of files requested per user in this case depends on new files being added to the site.

All requests and their statuses

Webalizer always showed specific status codes, whereas I always wanted to see just the number of successful requests and errors.

SQL report query

SELECT
1 as 'Line: All Requests by Status',
strftime('%d.%m', datetime(FCT.EVENT_DT, 'unixepoch')) AS 'Day',
SUM(CASE WHEN STS.STATUS_GROUP='Successful' THEN FCT.REQUEST_CNT ELSE 0 END) AS 'Success',
SUM(CASE WHEN STS.STATUS_GROUP='Redirection' THEN FCT.REQUEST_CNT ELSE 0 END) AS 'Redirect',
SUM(CASE WHEN STS.STATUS_GROUP='Client Error' THEN FCT.REQUEST_CNT ELSE 0 END) AS 'Client Error',
SUM(CASE WHEN STS.STATUS_GROUP='Server Error' THEN FCT.REQUEST_CNT ELSE 0 END) AS 'Server Error'
FROM
  FCT_ACCESS_USER_AGENT_DD FCT,
  DIM_HTTP_STATUS STS
WHERE FCT.DIM_HTTP_STATUS_ID=STS.DIM_HTTP_STATUS_ID
  AND datetime(FCT.EVENT_DT, 'unixepoch') >= date('now', '-14 day')
GROUP BY strftime('%d.%m', datetime(FCT.EVENT_DT, 'unixepoch'))
ORDER BY FCT.EVENT_DT

The report shows requests, not hits: unlike LINE_CNT, the REQUEST_CNT metric is calculated as COUNT(DISTINCT STG.REQUEST_NK). The goal is to show effective events. For example, MS bots poll the robots.txt file hundreds of times a day, and in this case all such polls are counted once. This smooths out the jumps in the graph.
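The difference between the two metrics can be seen on a toy stage table. This is a sketch with made-up rows and a reduced, hypothetical column set (only LINE_NK and REQUEST_NK): a bot polls /robots.txt 100 times, and two ordinary pages are requested once each.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Reduced stage table: only the two columns needed for the comparison.
con.execute("CREATE TABLE STG_ACCESS_LOG (LINE_NK INTEGER, REQUEST_NK TEXT)")

# 100 polls of robots.txt plus two regular page requests (made-up data).
rows = [(i, "/robots.txt") for i in range(1, 101)]
rows += [(101, "/index"), (102, "/about")]
con.executemany("INSERT INTO STG_ACCESS_LOG VALUES (?, ?)", rows)

line_cnt, request_cnt = con.execute(
    "SELECT COUNT(DISTINCT LINE_NK), COUNT(DISTINCT REQUEST_NK) "
    "FROM STG_ACCESS_LOG"
).fetchone()

print("hits:", line_cnt, "distinct requests:", request_cnt)
```

The hit count is dominated by the repeated polls, while the distinct request count treats them as a single event.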

The graph shows many errors - these are pages that do not exist. As a result of the analysis, redirects were added from the removed pages.

Bad requests

To examine the requests in detail, you can display detailed statistics.

SQL report query

SELECT
  1 AS 'Table: Top Error Requests',
  REQ.REQUEST_NK AS 'Request',
  'Error' AS 'Request Status',
  ROUND(SUM(FCT.LINE_CNT) / 14.0, 1) AS 'Hits per Day',
  ROUND(SUM(FCT.IP_CNT) / 14.0, 1) AS 'IPs per Day',
  ROUND(SUM(FCT.BYTES)/1000 / 14.0, 1) AS 'KB per Day'
FROM
  FCT_ACCESS_REQUEST_REF_HH FCT,
  DIM_REQUEST_V_ACT REQ
WHERE FCT.DIM_REQUEST_ID = REQ.DIM_REQUEST_ID
  AND FCT.STATUS_GROUP IN ('Client Error', 'Server Error')
  AND datetime(FCT.EVENT_DT, 'unixepoch') >= date('now', '-14 day')
GROUP BY REQ.REQUEST_NK
ORDER BY 4 DESC
LIMIT 20

This list will also include all probing calls, for example requests to /wp-login.php. By adjusting the server's request rewrite rules, you can change how the server responds to such requests and send them to the start page.

So a few simple reports based on the server log file give a fairly complete picture of what is happening on the site.

How do we get the data?

An sqlite database is enough. Let's create the tables: an auxiliary one for logging ETL processes.

A stage table, into which we will write the log files using PHP. Two aggregate tables: a daily one with statistics by user agent and request status, and an hourly one with statistics by request, status group and agent. Four tables for the corresponding dimensions.

The result is the following relational model:

[Figure: data model]

A script to create an object in the sqlite database:

DDL for creating the object

DROP TABLE IF EXISTS DIM_USER_AGENT;
CREATE TABLE DIM_USER_AGENT (
  DIM_USER_AGENT_ID INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
  USER_AGENT_NK     TEXT NOT NULL DEFAULT 'n.a.',
  AGENT_OS          TEXT NOT NULL DEFAULT 'n.a.',
  AGENT_ENGINE      TEXT NOT NULL DEFAULT 'n.a.',
  AGENT_DEVICE      TEXT NOT NULL DEFAULT 'n.a.',
  AGENT_BOT         TEXT NOT NULL DEFAULT 'n.a.',
  UPDATE_DT         INTEGER NOT NULL DEFAULT 0,
  UNIQUE (USER_AGENT_NK)
);
INSERT INTO DIM_USER_AGENT (DIM_USER_AGENT_ID) VALUES (-1);
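To see what this DDL produces, it can be run against an in-memory database. The sketch below is illustrative only: the user agent value is made up, and the source does not state the purpose of the row with ID -1 (it looks like a default "unknown" member, but that is an assumption).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE DIM_USER_AGENT (
  DIM_USER_AGENT_ID INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
  USER_AGENT_NK     TEXT NOT NULL DEFAULT 'n.a.',
  AGENT_OS          TEXT NOT NULL DEFAULT 'n.a.',
  AGENT_ENGINE      TEXT NOT NULL DEFAULT 'n.a.',
  AGENT_DEVICE      TEXT NOT NULL DEFAULT 'n.a.',
  AGENT_BOT         TEXT NOT NULL DEFAULT 'n.a.',
  UPDATE_DT         INTEGER NOT NULL DEFAULT 0,
  UNIQUE (USER_AGENT_NK)
);
INSERT INTO DIM_USER_AGENT (DIM_USER_AGENT_ID) VALUES (-1);
""")

# The -1 row is filled entirely from the column defaults.
default_row = con.execute(
    "SELECT * FROM DIM_USER_AGENT WHERE DIM_USER_AGENT_ID = -1"
).fetchone()

# A real agent (made-up value) gets a generated surrogate key.
cur = con.execute(
    "INSERT INTO DIM_USER_AGENT (USER_AGENT_NK) VALUES (?)", ("Mozilla/4.0",)
)
new_id = cur.lastrowid

# The UNIQUE constraint on the natural key rejects a second copy.
try:
    con.execute(
        "INSERT INTO DIM_USER_AGENT (USER_AGENT_NK) VALUES (?)", ("Mozilla/4.0",)
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(default_row, new_id, duplicate_rejected)
```

The unique natural key is what lets later load steps look up the surrogate ID by the raw user agent string.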

Stage

For the access.log file, all requests must be read, parsed and written to the database. This can be done either directly with a scripting language or with sqlite tools.

Log file format:

//67.221.59.195 - - [28/Dec/2012:01:47:47 +0100] "GET /files/default.css HTTP/1.1" 200 1512 "https://project.edu/" "Mozilla/4.0"
//host ident auth time method request_nk protocol status bytes ref browser
$log_pattern = '/^([^ ]+) ([^ ]+) ([^ ]+) (\[[^\]]+\]) "(.*) (.*) (.*)" ([0-9\-]+) ([0-9\-]+) "(.*)" "(.*)"$/';
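Since any scripting language will do for this step, here is a rough Python port of the same idea, with the square brackets around the timestamp escaped. The field names follow the comment line above; the function name `parse_line` is hypothetical and not part of the original loader.

```python
import re

# Same structure as the PHP pattern: eleven capture groups, one per field.
LOG_PATTERN = re.compile(
    r'^([^ ]+) ([^ ]+) ([^ ]+) (\[[^\]]+\]) "(.*) (.*) (.*)" '
    r'([0-9\-]+) ([0-9\-]+) "(.*)" "(.*)"$'
)
FIELDS = ("host", "ident", "auth", "time", "method", "request_nk",
          "protocol", "status", "bytes", "ref", "browser")

def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.rstrip("\n"))
    if match is None:
        return None
    return dict(zip(FIELDS, match.groups()))

sample = ('67.221.59.195 - - [28/Dec/2012:01:47:47 +0100] '
          '"GET /files/default.css HTTP/1.1" 200 1512 '
          '"https://project.edu/" "Mozilla/4.0"')
record = parse_line(sample)
print(record)
```

Lines that do not fit the format return None, so a loader can count or log them instead of failing.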

Key propagation

Once the raw data is in the database, you need to write the keys that are still missing into the dimension tables. Only then can the facts reference the dimensions. For example, in the DIM_REFERRER table the key is a combination of three fields.

SQL key propagation query

/* Propagate the referrer from access log */
INSERT INTO DIM_REFERRER (HOST_NK, PATH_NK, QUERY_NK, UPDATE_DT)
SELECT
	CLS.HOST_NK,
	CLS.PATH_NK,
	CLS.QUERY_NK,
	STRFTIME('%s','now') AS UPDATE_DT
FROM (
	SELECT DISTINCT
	REFERRER_HOST AS HOST_NK,
	REFERRER_PATH AS PATH_NK,
	CASE WHEN INSTR(REFERRER_QUERY,'&sid')>0 THEN SUBSTR(REFERRER_QUERY, 1, INSTR(REFERRER_QUERY,'&sid')-1) /* cut off the sid, a CMS-specific parameter */
	ELSE REFERRER_QUERY END AS QUERY_NK
	FROM STG_ACCESS_LOG
) CLS
LEFT OUTER JOIN DIM_REFERRER TRG
ON (CLS.HOST_NK = TRG.HOST_NK AND CLS.PATH_NK = TRG.PATH_NK AND CLS.QUERY_NK = TRG.QUERY_NK)
WHERE TRG.DIM_REFERRER_ID IS NULL
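The anti-join (LEFT OUTER JOIN plus IS NULL) makes the propagation safe to repeat: keys that already exist are skipped. The sketch below runs the query twice on made-up data. The DDL of both tables is an assumption, since the article does not show it, and the sample rows use empty strings rather than NULL because a NULL never compares equal in the join condition.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Hypothetical, reduced table definitions for illustration only.
CREATE TABLE STG_ACCESS_LOG (
  REFERRER_HOST TEXT, REFERRER_PATH TEXT, REFERRER_QUERY TEXT
);
CREATE TABLE DIM_REFERRER (
  DIM_REFERRER_ID INTEGER PRIMARY KEY AUTOINCREMENT,
  HOST_NK TEXT NOT NULL, PATH_NK TEXT NOT NULL, QUERY_NK TEXT NOT NULL,
  UPDATE_DT INTEGER NOT NULL DEFAULT 0,
  UNIQUE (HOST_NK, PATH_NK, QUERY_NK)
);
INSERT INTO STG_ACCESS_LOG VALUES
  ('project.edu', '/', 'p=1&sid=abc'),
  ('project.edu', '/', 'p=1&sid=def'),
  ('example.org', '/blog', '');
""")

PROPAGATE = """
INSERT INTO DIM_REFERRER (HOST_NK, PATH_NK, QUERY_NK, UPDATE_DT)
SELECT CLS.HOST_NK, CLS.PATH_NK, CLS.QUERY_NK, STRFTIME('%s', 'now')
FROM (
  SELECT DISTINCT
    REFERRER_HOST AS HOST_NK,
    REFERRER_PATH AS PATH_NK,
    CASE WHEN INSTR(REFERRER_QUERY, '&sid') > 0
         THEN SUBSTR(REFERRER_QUERY, 1, INSTR(REFERRER_QUERY, '&sid') - 1)
         ELSE REFERRER_QUERY END AS QUERY_NK
  FROM STG_ACCESS_LOG
) CLS
LEFT OUTER JOIN DIM_REFERRER TRG
  ON (CLS.HOST_NK = TRG.HOST_NK AND CLS.PATH_NK = TRG.PATH_NK
      AND CLS.QUERY_NK = TRG.QUERY_NK)
WHERE TRG.DIM_REFERRER_ID IS NULL
"""

def propagate_and_count(con):
    """Run the propagation once and return the dimension row count."""
    con.execute(PROPAGATE)
    return con.execute("SELECT COUNT(*) FROM DIM_REFERRER").fetchone()[0]

count_first = propagate_and_count(con)   # inserts the new keys
count_second = propagate_and_count(con)  # finds nothing new to insert
keys = con.execute(
    "SELECT HOST_NK, PATH_NK, QUERY_NK FROM DIM_REFERRER ORDER BY HOST_NK"
).fetchall()
print(count_first, count_second, keys)
```

The two project.edu rows differ only in their session ID, so after the sid is cut off they collapse into one key.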

The propagation into the user agent table may contain the bot detection logic, for example this sql snippet:


CASE
WHEN INSTR(LOWER(CLS.BROWSER),'yandex.com')>0
	THEN 'yandex'
WHEN INSTR(LOWER(CLS.BROWSER),'googlebot')>0
	THEN 'google'
WHEN INSTR(LOWER(CLS.BROWSER),'bingbot')>0
	THEN 'microsoft'
WHEN INSTR(LOWER(CLS.BROWSER),'ahrefsbot')>0
	THEN 'ahrefs'
WHEN INSTR(LOWER(CLS.BROWSER),'mj12bot')>0
	THEN 'majestic-12'
WHEN INSTR(LOWER(CLS.BROWSER),'compatible')>0 OR INSTR(LOWER(CLS.BROWSER),'http')>0
	OR INSTR(LOWER(CLS.BROWSER),'libwww')>0 OR INSTR(LOWER(CLS.BROWSER),'spider')>0
	OR INSTR(LOWER(CLS.BROWSER),'java')>0 OR INSTR(LOWER(CLS.BROWSER),'python')>0
	OR INSTR(LOWER(CLS.BROWSER),'robot')>0 OR INSTR(LOWER(CLS.BROWSER),'curl')>0
	OR INSTR(LOWER(CLS.BROWSER),'wget')>0
	THEN 'other'
ELSE 'n.a.' END AS AGENT_BOT
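The same rules can be expressed in a scripting language, which makes them easy to try against sample user agents. This is a sketch that mirrors the CASE expression above; the function name `classify_bot` and the test strings are illustrative assumptions.

```python
# Named crawlers are checked first, in the same order as the CASE branches.
NAMED_BOTS = (
    ("yandex.com", "yandex"),
    ("googlebot", "google"),
    ("bingbot", "microsoft"),
    ("ahrefsbot", "ahrefs"),
    ("mj12bot", "majestic-12"),
)
# Generic markers that put an agent into the catch-all 'other' group.
GENERIC_MARKERS = ("compatible", "http", "libwww", "spider", "java",
                   "python", "robot", "curl", "wget")

def classify_bot(user_agent):
    """Return the bot name, 'other' for generic bots, or 'n.a.' for users."""
    ua = user_agent.lower()
    for marker, name in NAMED_BOTS:
        if marker in ua:
            return name
    if any(marker in ua for marker in GENERIC_MARKERS):
        return "other"
    return "n.a."

google = classify_bot(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
tool = classify_bot("curl/7.68.0")
user = classify_bot("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36")
old_ie = classify_bot("Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)")
print(google, tool, user, old_ie)
```

Order matters: Googlebot also contains "compatible" and "http", so the named checks must come first. The last sample shows a side effect of the substring rules, since older Internet Explorer agents contain "compatible" and land in the "other" group.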

Aggregate tables

Finally, let's load the aggregate tables. For example, the daily table can be loaded as follows:

SQL query for loading the aggregate

/* Load fact from access log */
INSERT INTO FCT_ACCESS_USER_AGENT_DD (EVENT_DT, DIM_USER_AGENT_ID, DIM_HTTP_STATUS_ID, PAGE_CNT, FILE_CNT, REQUEST_CNT, LINE_CNT, IP_CNT, BYTES)
WITH STG AS (
SELECT
	STRFTIME( '%s', SUBSTR(TIME_NK,9,4) || '-' ||
	CASE SUBSTR(TIME_NK,5,3)
	WHEN 'Jan' THEN '01' WHEN 'Feb' THEN '02' WHEN 'Mar' THEN '03' WHEN 'Apr' THEN '04' WHEN 'May' THEN '05' WHEN 'Jun' THEN '06'
	WHEN 'Jul' THEN '07' WHEN 'Aug' THEN '08' WHEN 'Sep' THEN '09' WHEN 'Oct' THEN '10' WHEN 'Nov' THEN '11'
	ELSE '12' END || '-' || SUBSTR(TIME_NK,2,2) || ' 00:00:00' ) AS EVENT_DT,
	BROWSER AS USER_AGENT_NK,
	REQUEST_NK,
	IP_NR,
	STATUS,
	LINE_NK,
	BYTES
FROM STG_ACCESS_LOG
)
SELECT
	CAST(STG.EVENT_DT AS INTEGER) AS EVENT_DT,
	USG.DIM_USER_AGENT_ID,
	HST.DIM_HTTP_STATUS_ID,
	COUNT(DISTINCT (CASE WHEN INSTR(STG.REQUEST_NK,'.')=0 THEN STG.REQUEST_NK END) ) AS PAGE_CNT,
	COUNT(DISTINCT (CASE WHEN INSTR(STG.REQUEST_NK,'.')>0 THEN STG.REQUEST_NK END) ) AS FILE_CNT,
	COUNT(DISTINCT STG.REQUEST_NK) AS REQUEST_CNT,
	COUNT(DISTINCT STG.LINE_NK) AS LINE_CNT,
	COUNT(DISTINCT STG.IP_NR) AS IP_CNT,
	SUM(BYTES) AS BYTES
FROM STG,
	DIM_HTTP_STATUS HST,
	DIM_USER_AGENT USG
WHERE STG.STATUS = HST.STATUS_NK
  AND STG.USER_AGENT_NK = USG.USER_AGENT_NK
  AND CAST(STG.EVENT_DT AS INTEGER) > $param_epoch_from /* load epoch date */
  AND CAST(STG.EVENT_DT AS INTEGER) < strftime('%s', date('now', 'start of day'))
GROUP BY STG.EVENT_DT, HST.DIM_HTTP_STATUS_ID, USG.DIM_USER_AGENT_ID

The sqlite database lets you write complex queries. The WITH clause contains the preparation of data and keys. The main query collects all references to the dimensions.
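The date handling inside the WITH clause is the least obvious part: it cuts the day, month name and year out of the bracketed log timestamp and turns them into the epoch of that day's midnight. The sketch below reproduces this in Python and compares it with the SQL expression; the helper `day_epoch` is hypothetical. Note that the timezone offset in the log is ignored and the date is treated as UTC, as sqlite does for time strings without a zone.

```python
import sqlite3
from datetime import datetime, timezone

MONTHS = {name: number for number, name in enumerate(
    ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"), start=1)}

def day_epoch(time_nk):
    """Epoch seconds of midnight (UTC) for the day in a log timestamp."""
    day = int(time_nk[1:3])          # SUBSTR(TIME_NK, 2, 2)
    month = MONTHS[time_nk[4:7]]     # SUBSTR(TIME_NK, 5, 3)
    year = int(time_nk[8:12])        # SUBSTR(TIME_NK, 9, 4)
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())

# The same expression as in the load query, with the timestamp as a parameter.
SQL_EXPR = """
SELECT CAST(STRFTIME('%s', SUBSTR(:t, 9, 4) || '-' ||
  CASE SUBSTR(:t, 5, 3)
    WHEN 'Jan' THEN '01' WHEN 'Feb' THEN '02' WHEN 'Mar' THEN '03'
    WHEN 'Apr' THEN '04' WHEN 'May' THEN '05' WHEN 'Jun' THEN '06'
    WHEN 'Jul' THEN '07' WHEN 'Aug' THEN '08' WHEN 'Sep' THEN '09'
    WHEN 'Oct' THEN '10' WHEN 'Nov' THEN '11' ELSE '12' END
  || '-' || SUBSTR(:t, 2, 2) || ' 00:00:00') AS INTEGER)
"""

time_nk = "[28/Dec/2012:01:47:47 +0100]"
epoch_py = day_epoch(time_nk)
epoch_sql = sqlite3.connect(":memory:").execute(
    SQL_EXPR, {"t": time_nk}).fetchone()[0]
print(epoch_py, epoch_sql)
```

Because every request of a day maps to the same epoch value, EVENT_DT works directly as the grouping key of the daily aggregate.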

This condition prevents history from being loaded again: CAST(STG.EVENT_DT AS INTEGER) > $param_epoch_from, where the parameter is the result of the query
'SELECT COALESCE(MAX(EVENT_DT), '3600') AS LAST_EVENT_EPOCH FROM FCT_ACCESS_USER_AGENT_DD'

This condition loads only complete days: CAST(STG.EVENT_DT AS INTEGER) < strftime('%s', date('now', 'start of day'))
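Taken together, the two conditions define an open window: later than the last loaded day and earlier than today's midnight. The sketch below computes that window on a reduced fact table (only EVENT_DT, an assumption for brevity); the helpers `load_window` and `is_loadable` are hypothetical names.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE FCT_ACCESS_USER_AGENT_DD (EVENT_DT INTEGER NOT NULL)")

def load_window(con):
    """Return (epoch_from, epoch_to): last loaded day and today's midnight."""
    epoch_from = int(con.execute(
        "SELECT COALESCE(MAX(EVENT_DT), '3600') AS LAST_EVENT_EPOCH "
        "FROM FCT_ACCESS_USER_AGENT_DD").fetchone()[0])
    epoch_to = int(con.execute(
        "SELECT strftime('%s', date('now', 'start of day'))").fetchone()[0])
    return epoch_from, epoch_to

def is_loadable(event_dt, window):
    """Mirror the two WHERE conditions of the load query."""
    epoch_from, epoch_to = window
    return epoch_from < event_dt < epoch_to

# Empty fact table: everything up to yesterday may be loaded.
first_window = load_window(con)
today = first_window[1]
yesterday = today - 86400

# After yesterday has been loaded, it falls out of the window.
con.execute("INSERT INTO FCT_ACCESS_USER_AGENT_DD VALUES (?)", (yesterday,))
second_window = load_window(con)

print(first_window, second_window)
```

With both bounds strict, rerunning the load on the same day adds nothing, and the current, still incomplete day is never loaded.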

Pages and files are told apart in a primitive way, by searching for a dot in the request.

Reports

Complex visualization systems can build a meta-model on top of the database objects and manage filters and aggregation rules dynamically. In the end, every decent tool generates an SQL query.

In this example, we will write ready-made SQL queries and save them as views in the database - these are the reports.
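A short sketch shows both how the dot rule works and where it misclassifies. The function `is_file` mirrors the INSTR(REQUEST_NK, '.') > 0 test from the load query; the sample requests are made up.

```python
def is_file(request_nk):
    """Mirror INSTR(REQUEST_NK, '.') > 0: any dot marks the request as a file."""
    return "." in request_nk

samples = [
    "/files/default.css",   # a real file
    "/news/2012",           # a page without an extension
    "/index.php?id=5",      # a page served by a script, counted as a file
    "/search?q=v1.2",       # a dot in the query string, counted as a file
]
kinds = {request: ("file" if is_file(request) else "page") for request in samples}
print(kinds)
```

On a site with extensionless page URLs and no dots in query strings the rule is good enough; otherwise the last two cases would need a stricter test.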
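Saving a report as a view might look like the sketch below. The view name RPT_ACCESS_REQUEST_STATUS appears later in the article, but its body here is an assumption: a shortened variant of the status report without the date filter, over reduced tables with made-up rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Reduced, hypothetical versions of the dimension and fact tables.
CREATE TABLE DIM_HTTP_STATUS (
  DIM_HTTP_STATUS_ID INTEGER PRIMARY KEY,
  STATUS_NK TEXT, STATUS_GROUP TEXT
);
CREATE TABLE FCT_ACCESS_USER_AGENT_DD (
  EVENT_DT INTEGER, DIM_HTTP_STATUS_ID INTEGER, REQUEST_CNT INTEGER
);
INSERT INTO DIM_HTTP_STATUS VALUES
  (1, '200', 'Successful'), (2, '301', 'Redirection'), (3, '404', 'Client Error');
INSERT INTO FCT_ACCESS_USER_AGENT_DD VALUES
  (1356652800, 1, 40), (1356652800, 3, 5),   -- 28 Dec 2012
  (1356739200, 1, 30), (1356739200, 2, 2);   -- 29 Dec 2012

-- The report is stored once and later read with a plain SELECT *.
CREATE VIEW RPT_ACCESS_REQUEST_STATUS AS
SELECT
  strftime('%d.%m', datetime(FCT.EVENT_DT, 'unixepoch')) AS "Day",
  SUM(CASE WHEN STS.STATUS_GROUP = 'Successful'
           THEN FCT.REQUEST_CNT ELSE 0 END) AS "Success",
  SUM(CASE WHEN STS.STATUS_GROUP = 'Client Error'
           THEN FCT.REQUEST_CNT ELSE 0 END) AS "Client Error"
FROM FCT_ACCESS_USER_AGENT_DD FCT
JOIN DIM_HTTP_STATUS STS ON FCT.DIM_HTTP_STATUS_ID = STS.DIM_HTTP_STATUS_ID
GROUP BY strftime('%d.%m', datetime(FCT.EVENT_DT, 'unixepoch'))
ORDER BY MIN(FCT.EVENT_DT);
""")

rows = con.execute("SELECT * FROM RPT_ACCESS_REQUEST_STATUS").fetchall()
print(rows)
```

Keeping the logic in views means the code that renders reports only needs to know the view names.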

Visualization

Bluff: Beautiful graphs in JavaScript was used as the visualization tool.

To do this, it was necessary to run through all the reports with PHP and generate an HTML file with tables.

$sqls = array(
'SELECT * FROM RPT_ACCESS_USER_VS_BOT',
'SELECT * FROM RPT_ACCESS_ANNOYING_BOT',
'SELECT * FROM RPT_ACCESS_TOP_HOUR_HIT',
'SELECT * FROM RPT_ACCESS_USER_ACTIVE',
'SELECT * FROM RPT_ACCESS_REQUEST_STATUS',
'SELECT * FROM RPT_ACCESS_TOP_REQUEST_PAGE',
'SELECT * FROM RPT_ACCESS_TOP_REQUEST_REFERRER',
'SELECT * FROM RPT_ACCESS_NEW_REQUEST',
'SELECT * FROM RPT_ACCESS_TOP_REQUEST_SUCCESS',
'SELECT * FROM RPT_ACCESS_TOP_REQUEST_ERROR'
);
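The article does this step in PHP; the sketch below shows the same idea in Python as a stand-in, not the author's script. It turns the result of one report query into an HTML table, taking the headers from the column names. The function `report_to_html` is hypothetical, and the view is filled with literal made-up values so the example runs on its own.

```python
import sqlite3
from html import escape

def report_to_html(con, sql):
    """Run a report query and render its result as a plain HTML table."""
    cur = con.execute(sql)
    headers = [column[0] for column in cur.description]
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in cur
    )
    return f"<table><tr>{head}</tr>{body}</table>"

con = sqlite3.connect(":memory:")
# Stand-in view with literal values instead of the real report query.
con.execute("""
CREATE VIEW RPT_ACCESS_USER_VS_BOT AS
SELECT '28.12' AS "Day", 120 AS "Bots, KB", 340 AS "Users, KB"
""")

sqls = ["SELECT * FROM RPT_ACCESS_USER_VS_BOT"]
html = "\n".join(report_to_html(con, sql) for sql in sqls)
print(html)
```

Escaping every cell matters here, because request paths and user agents come straight from the access log and may contain markup.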

The tool simply visualizes the result tables.

Conclusion

Using web analytics as an example, the article describes the mechanisms needed to build a data warehouse. As the results show, the simplest tools are sufficient for in-depth analysis and visualization of data.

In the future, using this warehouse as an example, we will try to implement structures such as slowly changing dimensions, metadata, aggregation levels and integration of data from different sources.

We will also take a closer look at the simplest tool for managing ETL processes, based on a single table.

We will return to the topic of measuring data quality and automating that process.

We will also study the problems of the technical environment and of maintaining data warehouses, for which we will set up a storage server with minimal resources, for example one based on a Raspberry Pi.

Source: www.habr.com