Monitoring ETL processes in a small data warehouse

Many teams use specialized tools to build procedures that extract, transform, and load data into relational databases. These tools log their own work and record any errors.

When an error occurs, the log states that the tool failed to complete the task and which modules (often Java) stopped, and where. In the last lines you can find the database error itself, for example a unique key violation on a table.

To answer the question of what role ETL error information plays, I classified all the problems that occurred over two years in a fairly large warehouse.


Database errors include insufficient space, a lost connection, a hung session, and so on.

Logical errors include violated table keys, invalid objects, missing access to objects, and so on.

Scheduler errors include a job that does not start on time, a job that hangs, and so on.

Simple errors do not take long to fix. A good ETL can handle most of them on its own.

Complex errors require you to find and test the procedures that work with the data and to investigate the data sources. They often mean that a change has to be tested and deployed.

So half of all problems are related to the database. 48% of all errors are simple errors.

A third of all problems are related to changes in the logic or the model of the warehouse, and more than half of these errors are complex.

Less than a quarter of all problems are related to the task scheduler, and 18% of these are simple errors.

Overall, 22% of all errors that occurred are complex, and fixing them takes the most attention and time. They happen about once a week, whereas simple errors happen almost every day.

Clearly, monitoring of ETL processes is effective when the log pinpoints the location of the error as precisely as possible and the source of the problem can be found in minimal time.

Effective monitoring

What would you want to see in ETL process monitoring?

Start at - when the job started;
Source - where the data comes from;
Layer - which level of the warehouse is being loaded;
ETL Job Name - the load procedure, which consists of many small steps;
Step Number - the number of the step being executed;
Affected Rows - how much data has already been processed;
Duration sec - how long it has been running;
Status - whether everything is fine or not: OK, ERROR, RUNNING, HANGS;
Message - the last successful message or the error description.

Based on the status of the records, you can send an email to the other team members. If there are no errors, no email is needed.
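As a rough illustration of that rule, the Python sketch below builds a message only when some job needs attention. The sample rows, the addresses, and the choice to alert on both ERROR and HANGS are assumptions made for this example, not part of the article's prototype.

```python
from email.message import EmailMessage

# Hypothetical rows as they might come back from the monitoring report:
# (job_name, status, message). The values are made up for illustration.
rows = [
    ("JOB_STG2DM_GEO", "OK", "ETL_END : done"),
    ("RPT_ACCESS_LOG", "ERROR", "ETL_ERROR : unique key violation"),
    ("JOB_FTP_MASTER", "HANGS", "ETL_START : started"),
]

ALERT_STATUSES = {"ERROR", "HANGS"}  # assumption: OK and RUNNING need no mail


def build_alert(rows, sender, recipients):
    """Return an EmailMessage for problem jobs, or None if all is well."""
    problems = [r for r in rows if r[1] in ALERT_STATUSES]
    if not problems:
        return None  # no errors, no email
    msg = EmailMessage()
    msg["Subject"] = f"ETL monitoring: {len(problems)} job(s) need attention"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(
        "\n".join(f"{job}: {status} - {text}" for job, status, text in problems)
    )
    return msg


alert = build_alert(rows, "etl@example.com", ["team@example.com"])
```

The returned message could then be handed to `smtplib.SMTP(...).send_message(alert)`; the mail server settings are left out because they depend on the environment.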

This way, when an error occurs, the location of the incident is clearly indicated.

Sometimes the monitoring tool itself does not work. In that case you can query the database view that the report is built on directly.

ETL monitoring table

One table and one view are enough to implement monitoring of ETL processes.

To try this out, you can go back to your small warehouse and create a prototype in an sqlite database.

Table DDL

CREATE TABLE UTL_JOB_STATUS (
/* Job execution log. Every job must log the step ETL_START and then either ETL_END or ETL_ERROR */
  UTL_JOB_STATUS_ID INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
  SID               INTEGER NOT NULL DEFAULT -1, /* Session identifier, unique for every run of a job */
  LOG_DT            INTEGER NOT NULL DEFAULT 0,  /* Date and time as a Unix timestamp (seconds) */
  LOG_D             INTEGER NOT NULL DEFAULT 0,  /* Date as a Unix timestamp (seconds) */
  JOB_NAME          TEXT NOT NULL DEFAULT 'N/A', /* Job name like JOB_STG2DM_GEO */
  STEP_NAME         TEXT NOT NULL DEFAULT 'N/A', /* ETL_START, ... , ETL_END/ETL_ERROR */
  STEP_DESCR        TEXT,                        /* Description of the task or the error message */
  UNIQUE (SID, JOB_NAME, STEP_NAME)
);
INSERT INTO UTL_JOB_STATUS (UTL_JOB_STATUS_ID) VALUES (-1);

View / report DDL

CREATE VIEW IF NOT EXISTS UTL_JOB_STATUS_V
AS /* Content: Package Execution Log for last 3 Months. */
WITH SRC AS (
  SELECT LOG_D,
    LOG_DT,
    UTL_JOB_STATUS_ID,
    SID,
    CASE WHEN INSTR(JOB_NAME, 'FTP') THEN 'TRANSFER' /* file transfer */
         WHEN INSTR(JOB_NAME, 'STG') THEN 'STAGE' /* stage */
         WHEN INSTR(JOB_NAME, 'CLS') THEN 'CLEANSING' /* cleansing */
         WHEN INSTR(JOB_NAME, 'DIM') THEN 'DIMENSION' /* dimension */
         WHEN INSTR(JOB_NAME, 'FCT') THEN 'FACT' /* fact */
         WHEN INSTR(JOB_NAME, 'ETL') THEN 'STAGE-MART' /* data mart */
         WHEN INSTR(JOB_NAME, 'RPT') THEN 'REPORT' /* report */
         ELSE 'N/A' END AS LAYER,
    CASE WHEN INSTR(JOB_NAME, 'ACCESS') THEN 'ACCESS LOG' /* source */
         WHEN INSTR(JOB_NAME, 'MASTER') THEN 'MASTER DATA' /* source */
         WHEN INSTR(JOB_NAME, 'AD-HOC') THEN 'AD-HOC' /* source */
         ELSE 'N/A' END AS SOURCE,
    JOB_NAME,
    STEP_NAME,
    CASE WHEN STEP_NAME='ETL_START' THEN 1 ELSE 0 END AS START_FLAG,
    CASE WHEN STEP_NAME='ETL_END' THEN 1 ELSE 0 END AS END_FLAG,
    CASE WHEN STEP_NAME='ETL_ERROR' THEN 1 ELSE 0 END AS ERROR_FLAG,
    STEP_NAME || ' : ' || STEP_DESCR AS STEP_LOG,
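    /* row count is the text between '*** ' and ' ***' in STEP_DESCR, e.g. 'loaded *** 1500 *** rows' */
    SUBSTR( SUBSTR(STEP_DESCR, INSTR(STEP_DESCR, '***')+4), 1, INSTR(SUBSTR(STEP_DESCR, INSTR(STEP_DESCR, '***')+4), '***')-2 ) AS AFFECTED_ROWS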
  FROM UTL_JOB_STATUS
  WHERE datetime(LOG_D, 'unixepoch') >= date('now', 'start of month', '-3 month')
)
SELECT JB.SID,
  JB.MIN_LOG_DT AS START_DT,
  strftime('%d.%m.%Y %H:%M', datetime(JB.MIN_LOG_DT, 'unixepoch')) AS LOG_DT,
  JB.SOURCE,
  JB.LAYER,
  JB.JOB_NAME,
  CASE
  WHEN JB.ERROR_FLAG = 1 THEN 'ERROR'
  WHEN JB.ERROR_FLAG = 0 AND JB.END_FLAG = 0 AND strftime('%s','now') - JB.MIN_LOG_DT > 0.5*60*60 THEN 'HANGS' /* half an hour */
  WHEN JB.ERROR_FLAG = 0 AND JB.END_FLAG = 0 THEN 'RUNNING'
  ELSE 'OK'
  END AS STATUS,
  ERR.STEP_LOG     AS STEP_LOG,
  JB.CNT           AS STEP_CNT,
  JB.AFFECTED_ROWS AS AFFECTED_ROWS,
  strftime('%d.%m.%Y %H:%M', datetime(JB.MIN_LOG_DT, 'unixepoch')) AS JOB_START_DT,
  strftime('%d.%m.%Y %H:%M', datetime(JB.MAX_LOG_DT, 'unixepoch')) AS JOB_END_DT,
  JB.MAX_LOG_DT - JB.MIN_LOG_DT AS JOB_DURATION_SEC
FROM
  ( SELECT SID, SOURCE, LAYER, JOB_NAME,
           MAX(UTL_JOB_STATUS_ID) AS UTL_JOB_STATUS_ID,
           MAX(START_FLAG)       AS START_FLAG,
           MAX(END_FLAG)         AS END_FLAG,
           MAX(ERROR_FLAG)       AS ERROR_FLAG,
           MIN(LOG_DT)           AS MIN_LOG_DT,
           MAX(LOG_DT)           AS MAX_LOG_DT,
           SUM(1)                AS CNT,
           SUM(IFNULL(AFFECTED_ROWS, 0)) AS AFFECTED_ROWS
    FROM SRC
    GROUP BY SID, SOURCE, LAYER, JOB_NAME
  ) JB,
  ( SELECT UTL_JOB_STATUS_ID, SID, JOB_NAME, STEP_LOG
    FROM SRC
    WHERE 1 = 1
  ) ERR
WHERE 1 = 1
  AND JB.SID = ERR.SID
  AND JB.JOB_NAME = ERR.JOB_NAME
  AND JB.UTL_JOB_STATUS_ID = ERR.UTL_JOB_STATUS_ID
ORDER BY JB.MIN_LOG_DT DESC, JB.SID DESC, JB.SOURCE;
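The STATUS column is the heart of the view, so it may help to see its CASE expression restated outside SQL. The Python function below is only a mirror of that expression for one job run; the function and constant names are made up for this sketch.

```python
HANG_AFTER_SEC = 0.5 * 60 * 60  # half an hour, the same threshold as in the view


def job_status(error_flag, end_flag, min_log_dt, now):
    """Mirror the CASE expression of UTL_JOB_STATUS_V for one job run.

    min_log_dt (first log record of the run) and now are Unix timestamps
    in seconds; the flags are 0 or 1 as computed by the view.
    """
    if error_flag == 1:
        return "ERROR"
    if end_flag == 0 and now - min_log_dt > HANG_AFTER_SEC:
        return "HANGS"
    if end_flag == 0:
        return "RUNNING"
    return "OK"


print(job_status(0, 0, min_log_dt=0, now=60))    # RUNNING
print(job_status(0, 0, min_log_dt=0, now=3600))  # HANGS
```

Note that the age is measured from the first record of the run, not from the latest step, so a healthy job that simply runs longer than half an hour is also reported as HANGS.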

SQL check whether a new session number can be issued

SELECT SUM (
  CASE WHEN start_job.JOB_NAME IS NOT NULL AND end_job.JOB_NAME IS NULL /* a started run has not finished yet */
        AND NOT ( 'y' = 'n' ) /* force restart PARAMETER; with 'y' = 'y' unfinished runs are ignored */
       THEN 1 ELSE 0
  END ) AS IS_RUNNING
  FROM
    ( SELECT 1 AS dummy FROM UTL_JOB_STATUS WHERE sid = -1) d_job /* the dummy row guarantees one result row */
  LEFT OUTER JOIN
    ( SELECT JOB_NAME, SID, 1 AS dummy
      FROM UTL_JOB_STATUS
      WHERE JOB_NAME = 'RPT_ACCESS_LOG' /* job name PARAMETER */
        AND STEP_NAME = 'ETL_START'
      GROUP BY JOB_NAME, SID
    ) start_job /* starts */
  ON d_job.dummy = start_job.dummy
  LEFT OUTER JOIN
    ( SELECT JOB_NAME, SID
      FROM UTL_JOB_STATUS
      WHERE JOB_NAME = 'RPT_ACCESS_LOG'  /* job name PARAMETER */
        AND STEP_NAME IN ('ETL_END', 'ETL_ERROR') /* stop status */
      GROUP BY JOB_NAME, SID
    ) end_job /* ends */
  ON start_job.JOB_NAME = end_job.JOB_NAME
     AND start_job.SID = end_job.SID;
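The two PARAMETER placeholders in this query suggest it is meant to be filled in by calling code. The Python sketch below shows one way that could look with bound parameters. It uses a shorter NOT EXISTS formulation instead of the article's outer joins, and a reduced table with only the three columns the check needs; both are simplifications made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced version of UTL_JOB_STATUS: only the columns the check needs.
conn.execute(
    "CREATE TABLE UTL_JOB_STATUS (SID INTEGER, JOB_NAME TEXT, STEP_NAME TEXT)"
)

# Counts runs of :job that have an ETL_START but no ETL_END/ETL_ERROR.
# With :force = 'y' the condition is switched off and the count is 0.
CHECK_SQL = """
SELECT COUNT(*)
FROM UTL_JOB_STATUS s
WHERE s.JOB_NAME = :job
  AND s.STEP_NAME = 'ETL_START'
  AND :force <> 'y'
  AND NOT EXISTS (
        SELECT 1 FROM UTL_JOB_STATUS e
        WHERE e.JOB_NAME = s.JOB_NAME
          AND e.SID = s.SID
          AND e.STEP_NAME IN ('ETL_END', 'ETL_ERROR'))
"""


def is_running(conn, job_name, force_restart="n"):
    """True if some run of job_name has started but neither ended nor failed."""
    params = {"job": job_name, "force": force_restart}
    (count,) = conn.execute(CHECK_SQL, params).fetchone()
    return count > 0


conn.execute("INSERT INTO UTL_JOB_STATUS VALUES (1, 'RPT_ACCESS_LOG', 'ETL_START')")
print(is_running(conn, "RPT_ACCESS_LOG"))       # True: run 1 is still open
print(is_running(conn, "RPT_ACCESS_LOG", "y"))  # False: forced restart ignores it
conn.execute("INSERT INTO UTL_JOB_STATUS VALUES (1, 'RPT_ACCESS_LOG', 'ETL_END')")
print(is_running(conn, "RPT_ACCESS_LOG"))       # False: run 1 has finished
```

Because COUNT(*) always returns a row, this variant does not need the dummy record with SID -1 that the original query joins against.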

Features of the table:

  • the start and the end of a data processing procedure must be logged with the steps ETL_START and ETL_END
  • in case of an error, an ETL_ERROR step with the error description must be created
  • the amount of processed data should be highlighted, for example with asterisks
  • the same procedure can be started a second time while it is running with the parameter force_restart=y; without it, a session number is issued only if the previous run of the procedure has completed
  • in normal mode, the same data processing procedure cannot be run in parallel
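To make the asterisk convention concrete, the Python sketch below runs the same SUBSTR/INSTR expression that the view uses for AFFECTED_ROWS against a sample description. The sample text and the helper name are made up for this example.

```python
import sqlite3

# The same SUBSTR/INSTR expression as in the view, with the description
# passed in as a named parameter instead of the STEP_DESCR column.
PARSE_SQL = """
SELECT SUBSTR(
         SUBSTR(:d, INSTR(:d, '***') + 4),
         1,
         INSTR(SUBSTR(:d, INSTR(:d, '***') + 4), '***') - 2)
"""


def affected_rows(step_descr):
    """Return the text between '*** ' and ' ***' as SQLite extracts it."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(PARSE_SQL, {"d": step_descr}).fetchone()[0]
    finally:
        conn.close()


print(affected_rows("Loaded *** 1500 *** rows"))  # prints 1500
```

The offsets +4 and -2 mean the expression expects exactly one space between each group of asterisks and the number; the value comes back as text and is only turned into a number when the view sums it.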

The operations needed to work with the table are:

  • getting the session number for an ETL procedure that is about to run
  • inserting a log record into the table
  • getting the last successful record of an ETL procedure

In databases such as Oracle or Postgres, these operations can be implemented as built-in functions. sqlite needs an external mechanism, and in this case it was prototyped in PHP.
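The three operations can be sketched in a few lines of Python with the standard sqlite3 module. This is not the article's PHP prototype: the function names, the "highest SID plus one" numbering scheme, and reading "last successful record" as the latest ETL_END row are all assumptions made for illustration, and the table is a reduced version of UTL_JOB_STATUS.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
# Reduced version of the article's table, enough for the three operations.
conn.execute("""
    CREATE TABLE UTL_JOB_STATUS (
      UTL_JOB_STATUS_ID INTEGER PRIMARY KEY AUTOINCREMENT,
      SID        INTEGER NOT NULL,
      LOG_DT     INTEGER NOT NULL,
      JOB_NAME   TEXT NOT NULL,
      STEP_NAME  TEXT NOT NULL,
      STEP_DESCR TEXT,
      UNIQUE (SID, JOB_NAME, STEP_NAME))
""")


def next_sid(conn):
    """Operation 1 (assumed scheme): next session number = highest SID + 1."""
    sql = "SELECT IFNULL(MAX(SID), 0) + 1 FROM UTL_JOB_STATUS"
    (sid,) = conn.execute(sql).fetchone()
    return sid


def log_step(conn, sid, job_name, step_name, descr=None):
    """Operation 2: insert one log record for a step of a run."""
    conn.execute(
        "INSERT INTO UTL_JOB_STATUS (SID, LOG_DT, JOB_NAME, STEP_NAME, STEP_DESCR)"
        " VALUES (?, ?, ?, ?, ?)",
        (sid, int(time.time()), job_name, step_name, descr),
    )
    conn.commit()


def last_success(conn, job_name):
    """Operation 3: latest ETL_END record of a job as a tuple, or None."""
    return conn.execute(
        "SELECT SID, LOG_DT, STEP_DESCR FROM UTL_JOB_STATUS"
        " WHERE JOB_NAME = ? AND STEP_NAME = 'ETL_END'"
        " ORDER BY UTL_JOB_STATUS_ID DESC LIMIT 1",
        (job_name,),
    ).fetchone()


sid = next_sid(conn)
log_step(conn, sid, "JOB_STG2DM_GEO", "ETL_START", "started")
log_step(conn, sid, "JOB_STG2DM_GEO", "LOAD_GEO", "loaded *** 42 *** rows")
log_step(conn, sid, "JOB_STG2DM_GEO", "ETL_END", "finished")
```

The UNIQUE constraint on (SID, JOB_NAME, STEP_NAME) means a step name can be logged only once per run, so a repeated `log_step` call with the same arguments raises `sqlite3.IntegrityError`.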

Conclusion

Error messages in data processing tools therefore play a hugely important role. Yet it is hard to call them optimal for finding the cause of a problem quickly. Once the number of procedures approaches a hundred, process monitoring turns into a complex project of its own.

This article gives an example of a possible solution to the problem in the form of a prototype. The complete prototype of the small warehouse is available on gitlab: SQLite PHP ETL Utilities.

Source: www.habr.com