Yuav Ua Li Cas Peb Siv Lazy Replication rau Kev Puas Tsuaj Rov Qab Nrog PostgreSQL

Yuav Ua Li Cas Peb Siv Lazy Replication rau Kev Puas Tsuaj Rov Qab Nrog PostgreSQL
Replication tsis yog thaub qab. Los tsis? Nov yog yuav ua li cas peb tau siv ncua sijhawm rov qab los ntawm kev rho tawm yuam kev.

Infrastructure kws tshwj xeeb GitLab yog lub luag haujlwm rau kev ua haujlwm GitLab.com - qhov loj tshaj plaws GitLab piv txwv hauv qhov xwm txheej. Nrog 3 lab cov neeg siv thiab ze li ntawm 7 lab cov haujlwm, nws yog ib qho ntawm qhov loj tshaj plaws qhib qhov chaw SaaS nrog rau kev ua haujlwm siab. Yog tsis muaj PostgreSQL database system, GitLab.com infrastructure yuav tsis mus deb, thiab peb tab tom ua dab tsi los xyuas kom meej qhov ua txhaum cai thaum muaj kev ua tsis tiav thaum cov ntaub ntawv tuaj yeem ploj. Nws tsis zoo li qhov xwm txheej zoo li no yuav tshwm sim, tab sis peb tau npaj zoo thiab khaws cia nrog ntau yam kev thaub qab thiab rov ua dua.

Replication tsis yog ib txoj hauv kev thaub qab databases (saib hauv qab no). Tab sis tam sim no peb yuav pom yuav ua li cas rov qab tau sai sai deleted cov ntaub ntawv siv tub nkeeg replication: on GitLab.com tus neeg siv deleted qhov shortcut rau qhov project gitlab-ce thiab poob kev sib txuas nrog kev sib koom ua ke thov thiab ua haujlwm.

Nrog rau ncua sijhawm replica, peb rov qab tau cov ntaub ntawv hauv tsuas yog 1,5 teev. Saib seb nws tshwm sim li cas.

Taw tes rau lub sijhawm rov qab nrog PostgreSQL

PostgreSQL muaj ib qho kev ua haujlwm uas ua rau lub xeev ntawm cov ntaub ntawv rov qab mus rau qhov tshwj xeeb hauv lub sijhawm. Nws yog hu ua Point-in-Time Recovery (PITR) thiab siv tib lub tswv yim uas ua kom cov replica mus txog rau hnub tim: pib nrog kev txhim khu kev qha snapshot ntawm tag nrho cov ntaub ntawv hauv pawg (piv txwv li thaub qab), peb siv cov kev hloov pauv hauv lub xeev mus txog qee lub sijhawm.

Txhawm rau siv qhov tshwj xeeb no rau kev thaub qab txias, peb niaj hnub ua cov ntaub ntawv yooj yim thaub qab thiab khaws cia rau hauv ib qho archive (GitLab archives nyob hauv Google huab cia). Peb kuj saib xyuas kev hloov pauv hauv lub xeev ntawm cov ntaub ntawv los ntawm kev khaws cov ntawv sau ua ntej (sau ntawv ua ntej, AW). Thiab nrog rau tag nrho cov no nyob rau hauv qhov chaw, peb tuaj yeem ua PITR rau kev puas tsuaj rov qab: pib nrog lub snapshot coj ua ntej qhov ua tsis tiav, thiab siv cov kev hloov pauv los ntawm WAL archive mus txog qhov ua tsis tiav.

Dab tsi yog ncua sij hawm replication?

Lazy replication yog daim ntawv thov kev hloov pauv ntawm WAL nrog ncua sijhawm. Ntawd yog, kev hloov pauv tau tshwm sim hauv ib teev X, tab sis nws yuav tshwm sim nyob rau hauv lub replica nrog ib tug ncua d hauv ib teev X + d.

PostgreSQL muaj 2 txoj hauv kev los teeb tsa lub cev database replica: thaub qab rov qab thiab streaming replication. Rov qab los ntawm ib qho archive, qhov tseem ceeb ua haujlwm zoo li PITR, tab sis tsis tu ncua: peb niaj hnub khaws cov kev hloov pauv los ntawm WAL archive thiab siv lawv rau qhov hloov pauv. A streaming replication ncaj qha retrieves WAL kwj los ntawm upstream database host. Peb nyiam archive rov qab - nws yooj yim dua los tswj thiab muaj kev ua haujlwm ib txwm ua nrog cov pawg tsim khoom.

Yuav ua li cas teeb tsa ncua kev rov qab los ntawm ib qho archive

Kev xaiv rov qab piav nyob rau hauv cov ntaub ntawv recovery.conf... Piv txwv:

standby_mode = 'on'
restore_command = '/usr/bin/envdir /etc/wal-e.d/env /opt/wal-e/bin/wal-e wal-fetch -p 4 "%f" "%p"'
recovery_min_apply_delay = '8h'
recovery_target_timeline = 'latest'

Nrog rau cov kev txwv no, peb teeb tsa ib qho kev ncua sij hawm nrog kev rov qab los. Ntawm no nws yog siv wal-e rho tawm WAL ntu (restore_command) los ntawm cov ntaub ntawv khaws tseg, thiab cov kev hloov pauv yuav raug siv tom qab yim teev (recovery_min_apply_delay). Lub replica yuav saib rau ncua sij hawm hloov nyob rau hauv lub archive, piv txwv li vim ib pawg tsis ua hauj lwm (recovery_target_timeline).

Π‘ recovery_min_apply_delay Koj tuaj yeem teeb tsa streaming replication nrog ncua sij hawm, tab sis muaj ob peb qhov pitfalls ntawm no uas muaj feem xyuam rau replication slots, kub standby tawm tswv yim, thiab hais txog. WAL archive tso cai rau koj zam lawv.

Parameter recovery_min_apply_delay tsuas yog tshwm sim hauv PostgreSQL 9.3. Nyob rau hauv lub dhau los versions, rau ncua replication koj yuav tsum tau configure lub ua ke rov qab tswj kev ua haujlwm (pg_xlog_replay_pause(), pg_xlog_replay_resume()) lossis tuav WAL ntu hauv archive rau lub sijhawm ncua.

PostgreSQL ua qhov no li cas?

Nws yog qhov nthuav kom pom yuav ua li cas PostgreSQL siv tub nkeeg rov qab. Cia peb saib recoveryApplyDelay(XlogReaderState). Nws yog hu los ntawm lub ntsiab rov voj rau txhua qhov nkag los ntawm WAL.

static bool
recoveryApplyDelay(XLogReaderState *record)
{
    uint8       xact_info;
    TimestampTz xtime;
    long        secs;
    int         microsecs;

    /* nothing to do if no delay configured */
    if (recovery_min_apply_delay <= 0)
        return false;

    /* no delay is applied on a database not yet consistent */
    if (!reachedConsistency)
        return false;

    /*
     * Is it a COMMIT record?
     *
     * We deliberately choose not to delay aborts since they have no effect on
     * MVCC. We already allow replay of records that don't have a timestamp,
     * so there is already opportunity for issues caused by early conflicts on
     * standbys.
     */
    if (XLogRecGetRmid(record) != RM_XACT_ID)
        return false;

    xact_info = XLogRecGetInfo(record) & XLOG_XACT_OPMASK;

    if (xact_info != XLOG_XACT_COMMIT &&
        xact_info != XLOG_XACT_COMMIT_PREPARED)
        return false;

    if (!getRecordTimestamp(record, &xtime))
        return false;

    recoveryDelayUntilTime =
        TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);

    /*
     * Exit without arming the latch if it's already past time to apply this
     * record
     */
    TimestampDifference(GetCurrentTimestamp(), recoveryDelayUntilTime,
                        &secs, &microsecs);
    if (secs <= 0 && microsecs <= 0)
        return false;

    while (true)
    {
        // Shortened:
        // Use WaitLatch until we reached recoveryDelayUntilTime
        // and then
        break;
    }
    return true;
}

Cov kab hauv qab yog tias qhov ncua sij hawm yog raws li lub sijhawm lub cev tau sau tseg hauv kev sib pauv pauv sijhawm teev tseg (xtime). Raws li koj tuaj yeem pom, qhov ncua sij hawm tsuas yog siv rau kev cog lus thiab tsis cuam tshuam rau lwm yam kev nkag - tag nrho cov kev hloov pauv tau siv ncaj qha, thiab kev cog lus qeeb, yog li peb tsuas pom cov kev hloov pauv tom qab kev teeb tsa qeeb.

Yuav ua li cas siv ncua sij hawm replica los kho cov ntaub ntawv

Cia peb hais tias peb muaj ib pawg database thiab ib qho replica nrog rau yim teev qeeb hauv kev tsim khoom. Cia peb saib yuav ua li cas rov qab tau cov ntaub ntawv siv ib qho piv txwv txhob txwm rho tawm shortcuts.

Thaum peb kawm txog qhov teeb meem, peb archive restoration tau nres rau ib tug ncua sij hawm replica:

SELECT pg_xlog_replay_pause();

Nrog rau ib ntus, peb tsis muaj kev pheej hmoo tias tus qauv yuav rov ua dua qhov kev thov DELETE. Ib qho tseem ceeb yog tias koj xav tau sijhawm los txiav txim txhua yam.

Lub ntsiab lus yog tias qhov kev hloov pauv hloov pauv yuav tsum mus txog lub sijhawm ua ntej qhov kev thov DELETE. Peb kwv yees paub lub sij hawm tshem tawm lub cev. Peb tau deleted recovery_min_apply_delay thiab ntxiv recovery_target_time Π² recovery.conf. Qhov no yog yuav ua li cas lub replica mus txog lub sijhawm zoo yam tsis muaj ncua sijhawm:

recovery_target_time = '2018-10-12 09:25:00+00'

Nrog rau lub sij hawm stamps, nws yog qhov zoo dua los txo qhov ntau dhau kom tsis txhob nco. Muaj tseeb, qhov tsawg dua, cov ntaub ntawv ntau dua peb poob. Ntxiv dua thiab, yog tias peb nco qhov kev thov DELETE, txhua yam yuav raug muab tshem tawm dua thiab koj yuav tau pib dua (lossis txawm siv lub thaub qab txias rau PITR).

Peb rov pib dua qhov piv txwv Postgres ncua sijhawm thiab ntu WAL tau rov ua dua kom txog thaum lub sijhawm teev tseg. Koj tuaj yeem taug qab kev nce qib ntawm theem no los ntawm kev nug:

SELECT
  -- current location in WAL
  pg_last_xlog_replay_location(),
  -- current transaction timestamp (state of the replica)
  pg_last_xact_replay_timestamp(),
  -- current physical time
  now(),
  -- the amount of time still to be applied until recovery_target_time has been reached
  '2018-10-12 09:25:00+00'::timestamptz - pg_last_xact_replay_timestamp() as delay;

Yog tias lub sij hawm tsis hloov lawm, qhov rov qab ua tiav. Kev txiav txim tuaj yeem kho tau recovery_target_actionkaw, txhawb, lossis ncua qhov piv txwv tom qab rov ua dua (nws raug ncua los ntawm lub neej ntawd).

Cov ntaub ntawv rov qab mus rau nws lub xeev ua ntej qhov kev thov tsis muaj hmoo. Tam sim no koj muaj peev xwm, piv txwv li, export cov ntaub ntawv. Peb xa tawm cov ntaub ntawv deleted daim ntawv lo thiab tag nrho cov kev sib txuas rau cov teeb meem thiab sib koom ua ke thov thiab txav mus rau hauv cov ntaub ntawv tsim khoom. Yog tias qhov poob yog qhov loj, koj tuaj yeem txhawb nqa lub replica thiab siv nws ua qhov tseem ceeb. Tab sis tom qab ntawd tag nrho cov kev hloov pauv tom qab lub ntsiab lus uas peb tau rov qab los yuav ploj mus.

Hloov cov ntawv teev sijhawm, nws yog qhov zoo dua los siv cov ID hloov pauv. Nws yog qhov tsim nyog los sau cov ID no, piv txwv li, rau DDL nqe lus (xws li DROP TABLE), los ntawm kev siv log_statements = 'ddl'. Yog tias peb muaj tus lej hloov pauv, peb yuav coj recovery_target_xid thiab khiav txhua yam mus rau qhov kev pauv ua ntej qhov kev thov DELETE.

Rov qab mus ua haujlwm yog qhov yooj yim heev: tshem tawm tag nrho cov kev hloov pauv ntawm recovery.conf thiab rov pib Postgres. Lub replica yuav sai sai no muaj yim-teev ncua ntxiv, thiab peb tau npaj rau yav tom ntej teeb meem.

Cov txiaj ntsig rov qab

Nrog rau ncua sij hawm replica es tsis txhob txias thaub qab, koj tsis tas yuav siv sij hawm los kho tag nrho cov duab los ntawm cov archive. Piv txwv li, nws yuav siv sij hawm peb tsib teev kom tau txais tag nrho 2 TB thaub qab. Thiab tom qab ntawd koj tseem yuav tsum tau siv tag nrho WAL txhua hnub kom rov qab mus rau lub xeev xav tau (qhov phem tshaj plaws).

Ib tug ncua sij hawm replica yog zoo dua li ib tug txias backup nyob rau hauv ob txoj kev:

  1. Tsis tas yuav tshem tawm tag nrho cov thaub qab yooj yim ntawm cov ntaub ntawv.
  2. Muaj ib lub qhov rais XNUMX teev ntawm WAL ntu uas yuav tsum tau rov ua dua.

Peb tseem niaj hnub tshawb xyuas seb puas muaj peev xwm ua tau PITR los ntawm WAL, thiab peb yuav pom sai sai pom kev noj nyiaj txiag lossis lwm yam teeb meem nrog WAL archive los ntawm kev saib xyuas cov lag luam ntawm qhov kev ncua sij hawm.

Hauv qhov piv txwv no, nws tau siv sijhawm 50 feeb los kho peb, txhais tau tias qhov ceev yog 110 GB ntawm WAL cov ntaub ntawv ib teev (cov ntaub ntawv tseem nyob rau. AWS S3). Nyob rau hauv tag nrho, peb daws qhov teeb meem thiab rov qab tau cov ntaub ntawv nyob rau hauv 1,5 teev.

Cov txiaj ntsig: qhov twg qhov kev hloov pauv hloov pauv tau muaj txiaj ntsig (thiab qhov twg nws tsis yog)

Siv ncua sij hawm replication ua thawj pab yog tias koj yuam kev poob cov ntaub ntawv thiab pom qhov teeb meem no nyob rau hauv lub configured ncua.

Tab sis nco ntsoov: replication tsis yog thaub qab.

Backup thiab replication muaj ntau lub hom phiaj. Cov thaub qab txias yuav tuaj yeem ua ke yog tias koj ua yuam kev DELETE los yog DROP TABLE. Peb ua ib qho thaub qab los ntawm kev cia txias thiab rov qab lub xeev dhau los ntawm lub rooj lossis tag nrho cov ntaub ntawv. Tab sis tib lub sij hawm thov DROP TABLE yuav luag tam sim rov tsim dua nyob rau hauv tag nrho cov replicas ntawm cov ua hauj lwm pawg, yog li zoo tib yam replication yuav tsis pab ntawm no. Replication nws tus kheej khaws cov ntaub ntawv muaj nyob rau thaum ib tus neeg servers raug xauj tawm thiab faib cov khoom thauj.

Txawm hais tias muaj qhov hloov pauv hloov pauv, qee zaum peb xav tau qhov txias txias hauv qhov chaw nyab xeeb yog tias qhov chaw khaws ntaub ntawv tsis ua haujlwm, zais kev puas tsuaj, lossis lwm yam xwm txheej uas tsis tshwm sim tam sim ntawd. Replication ib leeg yog tsis siv ntawm no.

ΠŸΡ€ΠΈΠΌΠ΅Ρ‡Π°Π½ΠΈΠ΅Cov. Nyob rau GitLab.com Peb tam sim no tsuas yog tiv thaiv cov ntaub ntawv poob ntawm qhov system theem thiab tsis rov qab cov ntaub ntawv ntawm tus neeg siv qib.

Tau qhov twg los: www.hab.com

Ntxiv ib saib