Last December I received a dreaded bug report from the VWO support team. The loading time of one of the analytics reports for a large enterprise customer looked prohibitive. Since this is my area of responsibility, I immediately focused on solving the problem.
Background
To make it clear what I am talking about, I will tell you a little about VWO. It is a platform on which you can run various targeted campaigns on your website: run A/B experiments, track visitors and conversions, analyze the sales funnel, display heat maps and play back visit recordings.
But the most important part of the platform is reporting. All of the features above are interconnected, and for enterprise customers a huge volume of data would simply be useless without a powerful platform that presents it in analytical form.
Using the platform, you can run arbitrary queries against a large data set. Here is a simple example:
Show all clicks on page "abc.com" FROM <date d1> TO <date d2> for people who used Chrome OR (were located in Europe AND used an iPhone)
Pay attention to the Boolean operators. They are available to customers in the query interface, so that they can build arbitrarily complex queries to get the samples they need.
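To make the role of the Boolean operators concrete, here is a minimal sketch of how such a request might map onto a SQL WHERE clause. The table clicks_demo, its columns and the sample rows are hypothetical and exist only for this illustration; they are not VWO's actual schema.

```sql
-- Hypothetical table; column names are illustrative only.
CREATE TEMP TABLE clicks_demo (
    page       text,
    clicked_at date,
    browser    text,
    region     text,
    device     text
);

INSERT INTO clicks_demo VALUES
    ('abc.com', DATE '2018-12-01', 'Chrome', 'Asia',   'Desktop'),  -- matches: Chrome
    ('abc.com', DATE '2018-12-02', 'Safari', 'Europe', 'iPhone'),   -- matches: Europe AND iPhone
    ('abc.com', DATE '2018-12-03', 'Safari', 'Europe', 'Desktop'),  -- no match
    ('abc.com', DATE '2019-03-01', 'Chrome', 'Europe', 'iPhone'),   -- outside the date range
    ('xyz.com', DATE '2018-12-04', 'Chrome', 'Europe', 'iPhone');   -- different page

-- "Chrome OR (Europe AND iPhone)" within a date range on one page.
SELECT count(*) AS matching_clicks
FROM clicks_demo
WHERE page = 'abc.com'
  AND clicked_at BETWEEN DATE '2018-12-01' AND DATE '2018-12-31'
  AND (browser = 'Chrome' OR (region = 'Europe' AND device = 'iPhone'));
```

For the five sample rows above, the query counts 2 clicks. The parentheses matter: without them, AND would bind tighter than OR anyway, but spelling the grouping out keeps the intent visible.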
The slow query
The customer in question was trying to do something that intuitively should run quickly: show the session recordings of users who visited any page whose URL contains "/jobs".
This site had a ton of traffic, and we stored more than a million unique URLs for it alone. The customer wanted to find a fairly simple URL pattern that related to their business model.
Preliminary investigation
Let's see what was happening in the database. Below is the original slow SQL query:
SELECT
    count(*)
FROM
    acc_{account_id}.urls as recordings_urls,
    acc_{account_id}.recording_data as recording_data,
    acc_{account_id}.sessions as sessions
WHERE
    recording_data.usp_id = sessions.usp_id
    AND sessions.referrer_id = recordings_urls.id
    AND ( urls && array(select id from acc_{account_id}.urls where url ILIKE '%enterprise_customer.com/jobs%')::text[] )
    AND r_time > to_timestamp(1542585600)
    AND r_time < to_timestamp(1545177599)
    AND recording_data.duration >= 5
    AND recording_data.num_of_pages > 0;
And here are the timings:
Planning time: 1.480 ms
Execution time: 1431924.650 ms
A note on urls: to avoid duplicating very long URLs, we store them in a separate table.
Also note that all of our tables are already partitioned by account_id. This rules out the situation where one especially large account causes problems for the others.
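For orientation, the layout implied by the queries in this article can be sketched roughly as follows. The table and column names are taken from those queries; the schema name acc_demo, the column types, the keys and the foreign keys are assumptions made for illustration and may differ from the real schema.

```sql
-- Hypothetical DDL: names come from the queries in this article,
-- types and constraints are guesses made for illustration.
CREATE SCHEMA IF NOT EXISTS acc_demo;   -- stands in for acc_{account_id}

CREATE TABLE acc_demo.urls (
    id  bigint PRIMARY KEY,
    url text NOT NULL                   -- the long URL is stored once, here
);

CREATE TABLE acc_demo.sessions (
    usp_id      bigint PRIMARY KEY,
    referrer_id bigint REFERENCES acc_demo.urls (id)
);

CREATE TABLE acc_demo.recording_data (
    usp_id       bigint REFERENCES acc_demo.sessions (usp_id),
    urls         text[],                -- IDs of the visited URLs, kept as text
    r_time       timestamptz,
    duration     double precision,
    num_of_pages integer
);
```

The point to take away is that recording_data.urls holds only URL IDs, so any search by URL pattern has to go through the urls table first.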
urls && array(
    select id from acc_{account_id}.urls
    where url ILIKE '%enterprise_customer.com/jobs%'
)::text[]
The first thought was that performance might be suffering because of ILIKE over all those long URLs (we have more than 1.4 million unique URLs collected for this account).
SELECT id FROM urls WHERE url ILIKE '%enterprise_customer.com/jobs%';
id
--------
...
(198661 rows)
Time: 5231.765 ms
The pattern search itself takes only 5 seconds. Searching for a pattern across more than a million unique URLs is clearly not the problem.
The next suspect on the list is the handful of JOINs. Perhaps their overuse caused the slowdown? JOINs are usually the most obvious candidates for performance problems, but I did not believe that ours was the typical case.
analytics_db=# SELECT
    count(*)
FROM
    acc_{account_id}.urls as recordings_urls,
    acc_{account_id}.recording_data_0 as recording_data,
    acc_{account_id}.sessions_0 as sessions
WHERE
    recording_data.usp_id = sessions.usp_id
    AND sessions.referrer_id = recordings_urls.id
    AND r_time > to_timestamp(1542585600)
    AND r_time < to_timestamp(1545177599)
    AND recording_data.duration >= 5
    AND recording_data.num_of_pages > 0;
 count
-------
  8086
(1 row)
Time: 147.851 ms
And that was not our case either. The JOINs turned out to be quite fast.
Narrowing down the circle of suspects
I was ready to start changing the query to get whatever performance improvement was possible. My team and I came up with 2 main ideas:
Use EXISTS for the URL subquery: we wanted to double-check whether there was any problem with the subquery on URLs. One way to do that is simply to use EXISTS. EXISTS can greatly improve performance, since it terminates as soon as it finds a single row that matches the condition.
SELECT
    count(*)
FROM
    acc_{account_id}.urls as recordings_urls,
    acc_{account_id}.recording_data as recording_data,
    acc_{account_id}.sessions as sessions
WHERE
    recording_data.usp_id = sessions.usp_id
    AND ( 1 = 1 )
    AND sessions.referrer_id = recordings_urls.id
    AND (exists(select id from acc_{account_id}.urls where url ILIKE '%enterprise_customer.com/jobs%'))
    AND r_time > to_timestamp(1547585600)
    AND r_time < to_timestamp(1549177599)
    AND recording_data.duration >= 5
    AND recording_data.num_of_pages > 0;
count
32519
(1 row)
Time: 1636.637 ms
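The early-termination behaviour of EXISTS mentioned above can be tried on a toy table. This is only a sketch: urls_exists_demo and its generated rows are made up for the illustration, and actual timings will depend on the machine and the PostgreSQL version.

```sql
-- Hypothetical data: 200,000 generated URLs that all match the pattern.
CREATE TEMP TABLE urls_exists_demo AS
SELECT g AS id, 'https://example.com/jobs/' || g AS url
FROM generate_series(1, 200000) AS g;

-- Has to evaluate ILIKE on every row before comparing the count with zero.
SELECT count(*) > 0 AS has_match
FROM urls_exists_demo
WHERE url ILIKE '%/jobs/%';

-- Is allowed to stop at the first row that satisfies the condition.
SELECT EXISTS (
    SELECT 1 FROM urls_exists_demo WHERE url ILIKE '%/jobs/%'
) AS has_match;
```

Both statements return true here, but they do different amounts of work. Note also that an uncorrelated EXISTS like this one only answers whether any matching URL exists at all; unlike &&, it does not test each recording against the matching IDs.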
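Well, yes. The subquery, once wrapped in EXISTS, makes everything super fast. The next logical question is why the query with the JOINs and the subquery itself are each fast on their own, yet terribly slow together.
The second idea was to move the URL subquery into a CTE (a WITH clause) and feed its result to the main query: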
WITH matching_urls AS (
    select id::text from acc_{account_id}.urls where url ILIKE '%enterprise_customer.com/jobs%'
)
SELECT
    count(*)
FROM
    acc_{account_id}.urls as recordings_urls,
    acc_{account_id}.recording_data as recording_data,
    acc_{account_id}.sessions as sessions,
    matching_urls
WHERE
    recording_data.usp_id = sessions.usp_id
    AND ( 1 = 1 )
    AND sessions.referrer_id = recordings_urls.id
    AND (urls && array(SELECT id from matching_urls)::text[])
    AND r_time > to_timestamp(1542585600)
    AND r_time < to_timestamp(1545107599)
    AND recording_data.duration >= 5
    AND recording_data.num_of_pages > 0;
But it was still very slow.
Finding the culprit
All this time, one small detail kept flashing before my eyes, and I kept brushing it aside. But since there was nothing else left, I decided to look at it too. I am talking about the && operator. While EXISTS merely improved performance, && was the only remaining common factor in all versions of the slow query.
Filter: ((urls && ($0)::text[]) AND (r_time > '2018-12-17 12:17:23+00'::timestamp with time zone) AND (r_time < '2018-12-18 23:59:59+00'::timestamp with time zone) AND (duration >= '5'::double precision) AND (num_of_pages > 0))
Rows Removed by Filter: 52710
The plan contained several filter lines that came from && alone. This means that the operation is not only expensive, but is also executed many times.
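Filter lines like the ones above come from EXPLAIN ANALYZE output. As a rough sketch of how to reproduce that kind of plan on toy data (rec_plan_demo and its generated arrays are hypothetical, and the exact plan text varies between PostgreSQL versions):

```sql
-- Hypothetical data: 50,000 recordings, each with two URL IDs stored as text.
CREATE TEMP TABLE rec_plan_demo AS
SELECT g AS rec_id,
       ARRAY[(g % 5000)::text, ((g * 7) % 5000)::text] AS urls
FROM generate_series(1, 50000) AS g;

-- EXPLAIN (ANALYZE) runs the query and reports the real time per plan node.
-- Look for a "Filter: (urls && ...)" line together with
-- "Rows Removed by Filter" on the scan of rec_plan_demo.
EXPLAIN (ANALYZE)
SELECT count(*)
FROM rec_plan_demo
WHERE urls && ARRAY(
    SELECT i::text FROM generate_series(1, 2000) AS i
);
```

The array on the right-hand side is built once, but the overlap test itself is applied to every row that reaches the filter, which is why its cost multiplies with the number of recordings.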
I tested this by isolating the condition:
SELECT 1
FROM
    acc_{account_id}.urls as recordings_urls,
    acc_{account_id}.recording_data_30 as recording_data_30,
    acc_{account_id}.sessions_30 as sessions_30
WHERE
    urls && array(select id from acc_{account_id}.urls where url ILIKE '%enterprise_customer.com/jobs%')::text[]
This query was slow. Since the JOINs are fast and the subqueries are fast, the only thing left was the && operator.
And this is simply a key operation for us. We always need to search the whole underlying urls table for the pattern, and we always need to find intersections. We cannot search the URL entries of a recording directly, because they are just IDs that refer to urls.
On the way to a solution
&& is slow because both of its operands are huge. The operation would be fairly fast if I replaced urls with { "http://google.com/", "http://wingify.com/" }.
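As a reminder of what the operator does: && is PostgreSQL's array overlap operator, which is true when the two arrays share at least one element. A minimal illustration with made-up literal arrays:

```sql
-- true: '3' appears in both arrays
SELECT ARRAY['1', '2', '3'] && ARRAY['3', '4'] AS overlaps;

-- false: the arrays have no element in common
SELECT ARRAY['1', '2', '3'] && ARRAY['4', '5'] AS overlaps;
```

With two small arrays this is trivial work. In the slow query, one side is built from roughly 200 thousand matching URL IDs, and the test is repeated for each candidate recording.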
In the end, we decided to solve the problem in isolation: give me all the urls rows for which the URL matches the pattern. Without the additional conditions it would be:
SELECT urls.url
FROM
    acc_{account_id}.urls as urls,
    (SELECT unnest(recording_data.urls) AS id) AS unrolled_urls
WHERE
    urls.id = unrolled_urls.id AND
    urls.url ILIKE '%jobs%'
The key point here is that && is used to check whether a given recording contains a matching URL. If you squint a little, you can see this operation walking through the elements of an array (or the rows of a table) and stopping as soon as a condition (a match) is met. Doesn't that remind you of something? Right, EXISTS.
Since recording_data.urls can be referenced from outside the subquery context, we can fall back on our old friend EXISTS and wrap the subquery in it.
Putting everything together, we get the final optimized query:
SELECT
    count(*)
FROM
    acc_{account_id}.urls as recordings_urls,
    acc_{account_id}.recording_data as recording_data,
    acc_{account_id}.sessions as sessions
WHERE
    recording_data.usp_id = sessions.usp_id
    AND ( 1 = 1 )
    AND sessions.referrer_id = recordings_urls.id
    AND r_time > to_timestamp(1542585600)
    AND r_time < to_timestamp(1545177599)
    AND recording_data.duration >= 5
    AND recording_data.num_of_pages > 0
    AND EXISTS(
        SELECT urls.url
        FROM
            acc_{account_id}.urls as urls,
            -- unroll the urls array of the current outer recording_data row
            (SELECT unnest(recording_data.urls) AS rec_url_id)
            AS unrolled_urls
        WHERE
            urls.id = unrolled_urls.rec_url_id AND
            urls.url ILIKE '%enterprise_customer.com/jobs%'
    );
And the final execution time:
Time: 1898.717 ms
Time to celebrate?!?
Not so fast! First you have to check correctness. I was extremely suspicious of the EXISTS optimization, since it changes the logic to terminate earlier. We had to be sure that we had not introduced a non-obvious error into the query.
A simple test was to run count(*) on both the slow and the fast query for a large number of different data sets. Then, for a small subset of the data, I manually verified that all the results were correct.
All the checks gave consistently positive results. We fixed it!
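The count comparison described above can be sketched on a tiny hand-checkable data set. The tables urls_check and rec_check, their rows and the bigint type of the ID are assumptions made for this illustration; the idea is only to show the && form and the correlated EXISTS form side by side.

```sql
-- Hypothetical miniature data set that can be verified by eye.
CREATE TEMP TABLE urls_check (id bigint, url text);
CREATE TEMP TABLE rec_check  (rec_id int, urls text[]);

INSERT INTO urls_check VALUES
    (1, 'https://example.com/jobs/1'),
    (2, 'https://example.com/about'),
    (3, 'https://example.com/jobs/2');

INSERT INTO rec_check VALUES
    (10, ARRAY['1', '2']),    -- visited a /jobs page
    (11, ARRAY['2']),         -- did not
    (12, ARRAY['3']),         -- visited a /jobs page
    (13, ARRAY[]::text[]);    -- empty recording

SELECT
    -- slow form: array overlap against all matching URL IDs
    (SELECT count(*)
     FROM rec_check r
     WHERE r.urls && ARRAY(
         SELECT id FROM urls_check WHERE url ILIKE '%/jobs%'
     )::text[]) AS overlap_count,
    -- fast form: correlated EXISTS over the unrolled array of each row
    (SELECT count(*)
     FROM rec_check r
     WHERE EXISTS (
         SELECT 1
         FROM urls_check u,
              (SELECT unnest(r.urls) AS rec_url_id) AS unrolled
         WHERE u.id::text = unrolled.rec_url_id
           AND u.url ILIKE '%/jobs%'
     )) AS exists_count;
```

On this data both columns come out as 2 (recordings 10 and 12). Agreement on a toy set proves nothing about production data by itself, which is why the same comparison was repeated on many real data sets.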
Lessons learned
There are many lessons to take away from this story:
Query plans do not tell the whole story, but they can give you hints
Not all optimizations are reductive in nature
Using EXISTS, where possible, can lead to a dramatic increase in performance
Conclusion
We went from a query time of ~24 minutes to 2 seconds, a very significant performance gain! Although this article came out long, all the experiments we ran happened within a single day, and by our estimate the optimization and testing took between 1.5 and 2 hours.
SQL is a wonderful language if you are not afraid of it and instead try to learn and use it. With a good understanding of how SQL queries are executed, how the database generates query plans, how indexes work, and simply how much data you are dealing with, you can become very good at optimizing queries. It is equally important, however, to keep trying different approaches and to break the problem down slowly, looking for the bottlenecks.
The best part of achieving results like this is the noticeable, visible speed improvement: a report that previously would not even load now loads almost instantly.
Special thanks to my teammates Aditya Mishra, Aditya Gauru and Varun Malhotra for the brainstorming, and to Dinkar Pandir for finding an important error in our final query before we finally said goodbye to it!