Monitoring ETL processes in a small data warehouse

Many people use specialized tools to build routines that extract, transform, and load data into relational databases. These tools log their own work and record any errors.

When an error occurs, the log states that the tool could not complete the task and which modules (often Java) stopped, and where. The last lines may contain a database error, such as a violation of a table's unique key.

To answer the question of what role ETL error information plays, I classified every problem that occurred over the past two years in a fairly large warehouse.

Database errors include: not enough space, lost connection, hung session, and so on.

Logical errors include violated table keys, invalid objects, missing access to objects, and so on.
The scheduler may fail to start on time, may hang, and so on.

Simple errors do not take long to fix. A good ETL tool can handle most of them on its own.

Complex errors require opening and checking the data processing procedures and examining the data sources. They often lead to changes that have to be tested and deployed.

So, half of all problems are related to the database. 48% of all errors are simple errors.
A third of all problems are related to changes in the warehouse logic or data model; more than half of these errors are complex.

And less than a quarter of all problems are related to the task scheduler; 18% of those are simple errors.

Overall, 22% of all errors are complex and take the most attention and time to fix. They occur about once a week, whereas simple errors occur almost every day.

Clearly, monitoring of ETL processes is effective when the log pinpoints the location of the error as precisely as possible and finding the source of the problem takes minimal time.

Effective monitoring

What did I want to see in ETL process monitoring?

Start at - when the job started running,
Source - the data source,
Layer - which warehouse layer is being loaded,
ETL Job Name - the load procedure, which consists of many small steps,
Step Number - the number of the step currently being executed,
Affected Rows - how much data has already been processed,
Duration sec - how long the run has taken,
Status - whether everything is fine or not: OK, ERROR, RUNNING, HANGS
Message - the last successful message or the error description.

Depending on the status of the records, you can send an email to the other participants. If there are no errors, no email is needed.
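
As a minimal sketch of that rule, the function below decides whether a notification is needed and builds its text. The row layout (`job_name`, `status`, `message`), the function name, and the choice to alert on HANGS as well as ERROR are assumptions made for illustration; actually sending the mail is left out.

```python
from typing import Iterable, Optional

# Assumption: these two statuses are the ones worth a notification.
ALERT_STATUSES = {"ERROR", "HANGS"}


def build_alert(rows: Iterable[dict]) -> Optional[str]:
    """Return a mail body listing problem jobs, or None when no mail is needed."""
    problems = [r for r in rows if r["status"] in ALERT_STATUSES]
    if not problems:
        return None  # nothing went wrong, so no mail is sent
    lines = [f'{r["job_name"]}: {r["status"]} - {r["message"]}' for r in problems]
    return "\n".join(lines)
```

The rows could come straight from the monitoring view, so the mail shows the same job name and message that the report does.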

This way, when an error occurs, the place where it happened is clearly indicated.

Sometimes the monitoring tool itself is not working. In that case you can query the view in the database directly; the report is built on top of that view.

ETL monitoring table

To implement monitoring of ETL processes, one table and one view are enough.

To do this, you can go back to your own small warehouse and create a prototype in an SQLite database.

Table DDL

CREATE TABLE UTL_JOB_STATUS (
/* Table for logging of job execution log. Important that the job has the steps ETL_START and ETL_END or ETL_ERROR */
  UTL_JOB_STATUS_ID INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
  SID               INTEGER NOT NULL DEFAULT -1, /* Session Identificator. Unique for every Run of job */
  LOG_DT            INTEGER NOT NULL DEFAULT 0,  /* Date time */
  LOG_D             INTEGER NOT NULL DEFAULT 0,  /* Date */
  JOB_NAME          TEXT NOT NULL DEFAULT 'N/A', /* Job name like JOB_STG2DM_GEO */
  STEP_NAME         TEXT NOT NULL DEFAULT 'N/A', /* ETL_START, ... , ETL_END/ETL_ERROR */
  STEP_DESCR        TEXT,                        /* Description of task or error message */
  UNIQUE (SID, JOB_NAME, STEP_NAME)
);
INSERT INTO UTL_JOB_STATUS (UTL_JOB_STATUS_ID) VALUES (-1);

View/report DDL

CREATE VIEW IF NOT EXISTS UTL_JOB_STATUS_V
AS /* Content: Package Execution Log for last 3 Months. */
WITH SRC AS (
  SELECT LOG_D,
    LOG_DT,
    UTL_JOB_STATUS_ID,
    SID,
    CASE WHEN INSTR(JOB_NAME, 'FTP') THEN 'TRANSFER' /* file transfer */
         WHEN INSTR(JOB_NAME, 'STG') THEN 'STAGE' /* stage */
         WHEN INSTR(JOB_NAME, 'CLS') THEN 'CLEANSING' /* cleansing */
         WHEN INSTR(JOB_NAME, 'DIM') THEN 'DIMENSION' /* dimension */
         WHEN INSTR(JOB_NAME, 'FCT') THEN 'FACT' /* fact */
         WHEN INSTR(JOB_NAME, 'ETL') THEN 'STAGE-MART' /* data mart */
         WHEN INSTR(JOB_NAME, 'RPT') THEN 'REPORT' /* report */
         ELSE 'N/A' END AS LAYER,
    CASE WHEN INSTR(JOB_NAME, 'ACCESS') THEN 'ACCESS LOG' /* source */
         WHEN INSTR(JOB_NAME, 'MASTER') THEN 'MASTER DATA' /* source */
         WHEN INSTR(JOB_NAME, 'AD-HOC') THEN 'AD-HOC' /* source */
         ELSE 'N/A' END AS SOURCE,
    JOB_NAME,
    STEP_NAME,
    CASE WHEN STEP_NAME='ETL_START' THEN 1 ELSE 0 END AS START_FLAG,
    CASE WHEN STEP_NAME='ETL_END' THEN 1 ELSE 0 END AS END_FLAG,
    CASE WHEN STEP_NAME='ETL_ERROR' THEN 1 ELSE 0 END AS ERROR_FLAG,
    STEP_NAME || ' : ' || STEP_DESCR AS STEP_LOG,
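    /* Row count: the text between the first pair of '***' markers in STEP_DESCR; the offsets expect one space on each side of the number */
    SUBSTR( SUBSTR(STEP_DESCR, INSTR(STEP_DESCR, '***')+4), 1, INSTR(SUBSTR(STEP_DESCR, INSTR(STEP_DESCR, '***')+4), '***')-2 ) AS AFFECTED_ROWS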
  FROM UTL_JOB_STATUS
  WHERE datetime(LOG_D, 'unixepoch') >= date('now', 'start of month', '-3 month')
)
SELECT JB.SID,
  JB.MIN_LOG_DT AS START_DT,
  strftime('%d.%m.%Y %H:%M', datetime(JB.MIN_LOG_DT, 'unixepoch')) AS LOG_DT,
  JB.SOURCE,
  JB.LAYER,
  JB.JOB_NAME,
  CASE
  WHEN JB.ERROR_FLAG = 1 THEN 'ERROR'
  WHEN JB.ERROR_FLAG = 0 AND JB.END_FLAG = 0 AND strftime('%s','now') - JB.MIN_LOG_DT > 0.5*60*60 THEN 'HANGS' /* half an hour */
  WHEN JB.ERROR_FLAG = 0 AND JB.END_FLAG = 0 THEN 'RUNNING'
  ELSE 'OK'
  END AS STATUS,
  ERR.STEP_LOG     AS STEP_LOG,
  JB.CNT           AS STEP_CNT,
  JB.AFFECTED_ROWS AS AFFECTED_ROWS,
  strftime('%d.%m.%Y %H:%M', datetime(JB.MIN_LOG_DT, 'unixepoch')) AS JOB_START_DT,
  strftime('%d.%m.%Y %H:%M', datetime(JB.MAX_LOG_DT, 'unixepoch')) AS JOB_END_DT,
  JB.MAX_LOG_DT - JB.MIN_LOG_DT AS JOB_DURATION_SEC
FROM
  ( SELECT SID, SOURCE, LAYER, JOB_NAME,
           MAX(UTL_JOB_STATUS_ID) AS UTL_JOB_STATUS_ID,
           MAX(START_FLAG)       AS START_FLAG,
           MAX(END_FLAG)         AS END_FLAG,
           MAX(ERROR_FLAG)       AS ERROR_FLAG,
           MIN(LOG_DT)           AS MIN_LOG_DT,
           MAX(LOG_DT)           AS MAX_LOG_DT,
           SUM(1)                AS CNT,
           SUM(IFNULL(AFFECTED_ROWS, 0)) AS AFFECTED_ROWS
    FROM SRC
    GROUP BY SID, SOURCE, LAYER, JOB_NAME
  ) JB,
  ( SELECT UTL_JOB_STATUS_ID, SID, JOB_NAME, STEP_LOG
    FROM SRC
    WHERE 1 = 1
  ) ERR
WHERE 1 = 1
  AND JB.SID = ERR.SID
  AND JB.JOB_NAME = ERR.JOB_NAME
  AND JB.UTL_JOB_STATUS_ID = ERR.UTL_JOB_STATUS_ID
ORDER BY JB.MIN_LOG_DT DESC, JB.SID DESC, JB.SOURCE;
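
The STATUS rule in the view can be easier to reason about, and to unit-test, outside SQL. The sketch below restates the same CASE logic in Python; the function name and argument layout are illustrative only and are not part of the original prototype.

```python
HANG_LIMIT_SEC = 30 * 60  # same half-hour threshold as in the view


def job_status(step_names, started_at, now):
    """Derive OK / ERROR / RUNNING / HANGS for one run of one job.

    step_names: step names logged for a single session (SID) of the job
    started_at, now: Unix timestamps in seconds
    """
    if "ETL_ERROR" in step_names:
        return "ERROR"
    if "ETL_END" not in step_names:
        # No end record yet: still running, or hung if it has taken too long.
        return "HANGS" if now - started_at > HANG_LIMIT_SEC else "RUNNING"
    return "OK"
```

As in the view, an error record takes precedence over everything else, and a run counts as hung only when it has neither an end nor an error record.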

SQL: checking whether a new session number can be issued

SELECT SUM(
  CASE WHEN start_job.JOB_NAME IS NOT NULL AND end_job.JOB_NAME IS NULL /* a started run has no end record yet */
        AND NOT ( 'y' = 'n' ) /* force restart PARAMETER */
       THEN 1 ELSE 0
  END ) AS IS_RUNNING
  FROM
    ( SELECT 1 AS dummy FROM UTL_JOB_STATUS WHERE sid = -1) d_job
  LEFT OUTER JOIN
    ( SELECT JOB_NAME, SID, 1 AS dummy
      FROM UTL_JOB_STATUS
      WHERE JOB_NAME = 'RPT_ACCESS_LOG' /* job name PARAMETER */
        AND STEP_NAME = 'ETL_START'
      GROUP BY JOB_NAME, SID
    ) start_job /* starts */
  ON d_job.dummy = start_job.dummy
  LEFT OUTER JOIN
    ( SELECT JOB_NAME, SID
      FROM UTL_JOB_STATUS
      WHERE JOB_NAME = 'RPT_ACCESS_LOG'  /* job name PARAMETER */
        AND STEP_NAME in ('ETL_END', 'ETL_ERROR') /* stop status */
      GROUP BY JOB_NAME, SID
    ) end_job /* ends */
  ON start_job.JOB_NAME = end_job.JOB_NAME
     AND start_job.SID = end_job.SID;
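
In the query above the job name and the force-restart flag are marked as parameters. One way to bind them from an external program is sketched below with Python's standard `sqlite3` module. The function name is hypothetical, and the SQL is a simplified equivalent (a NOT EXISTS check) rather than the original statement.

```python
import sqlite3


def is_running(conn: sqlite3.Connection, job_name: str, force_restart: bool = False) -> bool:
    """True when some run of job_name has ETL_START but no ETL_END/ETL_ERROR."""
    if force_restart:
        # force_restart = y: the check is skipped and a new run is always allowed.
        return False
    row = conn.execute(
        """
        SELECT COUNT(*)
        FROM UTL_JOB_STATUS s
        WHERE s.JOB_NAME = ?
          AND s.STEP_NAME = 'ETL_START'
          AND NOT EXISTS (
                SELECT 1
                FROM UTL_JOB_STATUS e
                WHERE e.JOB_NAME = s.JOB_NAME
                  AND e.SID = s.SID
                  AND e.STEP_NAME IN ('ETL_END', 'ETL_ERROR'))
        """,
        (job_name,),
    ).fetchone()
    return row[0] > 0
```

Binding the job name with `?` avoids building the SQL by string concatenation.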

Table features:

  • the start and end of a data processing procedure must be marked by the steps ETL_START and ETL_END
  • in case of an error, an ETL_ERROR step must be created together with its description
  • the amount of data processed should be marked up, for example with asterisks
  • the same procedure can be started concurrently with the parameter force_restart = y; without it, a session number is issued only once the previous run has completed
  • in normal mode, the same data processing procedure cannot be run in parallel
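
To illustrate the asterisk convention, here is a small sketch that pulls the row count out of a step description. It assumes the count is logged as `*** <digits> ***`; the sample message text and the function name are made up for the example.

```python
import re
from typing import Optional

# Assumption: the count is logged as "*** <digits> ***", e.g. "Loaded *** 1250 *** rows".
_ROWS = re.compile(r"\*\*\*\s*(\d+)\s*\*\*\*")


def affected_rows(step_descr: Optional[str]) -> int:
    """Return the row count marked with asterisks, or 0 when none is present."""
    if not step_descr:
        return 0
    match = _ROWS.search(step_descr)
    return int(match.group(1)) if match else 0
```

Unlike a fixed-offset substring, the regular expression tolerates extra or missing spaces around the number.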

The operations needed to work with the table are the following:

  • getting the session number for the ETL procedure being launched
  • inserting a log record into the table
  • getting the last successful record of an ETL procedure

In databases such as Oracle or Postgres, these operations can be implemented as built-in functions. SQLite needs an external mechanism, which in this case is prototyped in PHP.
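
For readers who want to experiment without PHP, the three operations might look roughly like this in Python with the standard `sqlite3` module. This is a sketch under assumptions, not the prototype's code: the function names, the reduced table (LOG_D is omitted), and the "maximum SID plus one" numbering rule are all hypothetical.

```python
import sqlite3
import time

# Reduced copy of UTL_JOB_STATUS for the demo; LOG_D is omitted here.
DDL = """
CREATE TABLE IF NOT EXISTS UTL_JOB_STATUS (
  UTL_JOB_STATUS_ID INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
  SID        INTEGER NOT NULL DEFAULT -1,
  LOG_DT     INTEGER NOT NULL DEFAULT 0,
  JOB_NAME   TEXT NOT NULL DEFAULT 'N/A',
  STEP_NAME  TEXT NOT NULL DEFAULT 'N/A',
  STEP_DESCR TEXT,
  UNIQUE (SID, JOB_NAME, STEP_NAME)
)
"""


def new_session_id(conn):
    """Hypothetical rule: the next SID is the current maximum plus one, starting at 1."""
    (max_sid,) = conn.execute("SELECT IFNULL(MAX(SID), 0) FROM UTL_JOB_STATUS").fetchone()
    return max(max_sid, 0) + 1


def log_step(conn, sid, job_name, step_name, descr=None):
    """Insert one log record stamped with the current Unix time."""
    conn.execute(
        "INSERT INTO UTL_JOB_STATUS (SID, LOG_DT, JOB_NAME, STEP_NAME, STEP_DESCR) "
        "VALUES (?, ?, ?, ?, ?)",
        (sid, int(time.time()), job_name, step_name, descr),
    )
    conn.commit()


def last_successful_run(conn, job_name):
    """Return (SID, LOG_DT) of the latest ETL_END record for job_name, or None."""
    return conn.execute(
        "SELECT SID, LOG_DT FROM UTL_JOB_STATUS "
        "WHERE JOB_NAME = ? AND STEP_NAME = 'ETL_END' "
        "ORDER BY UTL_JOB_STATUS_ID DESC LIMIT 1",
        (job_name,),
    ).fetchone()
```

A job would call `new_session_id` once, then `log_step` with ETL_START, its intermediate steps, and finally ETL_END or ETL_ERROR.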

Conclusion

So, error messages in data processing tools play a very important role. But they can hardly be called optimal for quickly finding the cause of a problem. When the number of procedures approaches a hundred, process monitoring turns into a complex project in its own right.

This article gives an example of one possible solution to the problem, in the form of a prototype. The complete small-warehouse prototype is available on gitlab: SQLite PHP ETL Utilities.

Source: www.habr.com
