Ko te korero mo tetahi rangahau SQL

I te Hakihea kua pahure ake nei i whiwhi ahau i tetahi purongo pepeha whakamere mai i te roopu tautoko VWO. Ko te wa uta mo tetahi o nga purongo tātaritanga mo tetahi kaihoko umanga nui he ahua aukati. Na i te mea koinei taku waahi kawenga, ka aro tonu ahau ki te whakaoti rapanga.

prehistory

Kia marama ai taku korero, ka korero au ki a koe mo te VWO. He turanga tenei e taea ai e koe te whakarewa i nga momo kaupapa whakatairanga i runga i o paetukutuku: whakahaere i nga whakamatautau A/B, whai i nga manuhiri me nga huringa, tātarihia te kohanga hoko, whakaatu mapi wera me te purei i nga rekoata haerenga.

Engari ko te mea nui mo te papaaho ko te purongo. Ko nga mahi katoa i runga ake nei he hononga hono. A, mo nga kaihoko umanga, he nui nga korero ka kore noa e whai hua ki te kore he papaa kaha e whakaatu ana i roto i te ahua tātaritanga.

Ma te whakamahi i te papaahi, ka taea e koe te hanga patai matapōkere i runga i te huinga raraunga nui. Anei he tauira ngawari:

Whakaatuhia nga patene katoa ki te wharangi "abc.com" MAI i <te ra d1> ki te <te ra d2> mo nga tangata i whakamahi Chrome OR (kei Uropi me te iPhone)

Kia tupato ki nga kaiwhakahaere Boolean. E waatea ana ki nga kiritaki i roto i te atanga patai ki te hanga patai uaua ki te tiki tauira.

tono puhoi

Kei te ngana te kaihoko ki te mahi i tetahi mea kia tere te mahi:

Whakaatuhia nga rekoata hui katoa mo nga kaiwhakamahi i toro ki tetahi wharangi me te URL kei roto "/mahi"

He maha nga waka o tenei papaanga, a, neke atu i te kotahi miriona nga URL ahurei e rongoa ana matou mo tera. A i hiahia ratou ki te rapu tauira URL ngawari e pa ana ki a raatau tauira pakihi.

Te tūhuratanga tuatahi

Kia titiro tatou ki nga mahi kei roto i te paataka raraunga. Kei raro ko te patai SQL puhoi taketake:

SELECT 
    count(*) 
FROM 
    acc_{account_id}.urls as recordings_urls, 
    acc_{account_id}.recording_data as recording_data, 
    acc_{account_id}.sessions as sessions 
WHERE 
    recording_data.usp_id = sessions.usp_id 
    AND sessions.referrer_id = recordings_urls.id 
    AND  (  urls &&  array(select id from acc_{account_id}.urls where url  ILIKE  '%enterprise_customer.com/jobs%')::text[]   ) 
    AND r_time > to_timestamp(1542585600) 
    AND r_time < to_timestamp(1545177599) 
    AND recording_data.duration >=5 
    AND recording_data.num_of_pages > 0 ;

Anei nga wa:

Wā kua whakamaheretia: 1.480 ms Te wa mahi: 1431924.650 ms

150 mano nga rarangi ngoki te patai. I whakaatuhia e te kaiwhakatakoto patai etahi korero whakamere, engari karekau he maatanga pounamu.

Me ako ano te tono. Mai ta outou e ite ra, te na reira ra oia JOIN e toru nga tepu:

  1. : ki te whakaatu i nga korero mo te huihuinga: kaitirotiro, kaihoko kaiwhakamahi, whenua, me era atu.
  2. record_data: nga URL kua tuhia, nga wharangi, te roa o nga haerenga
  3. urino: Hei karo i te tāruarua i nga URL tino nui, ka penapenahia e matou ki tetahi ripanga motuhake.

Kia mahara ano kua wehewehea katoa a maatau teepu e account_id. Ma tenei ara, ka whakakorehia te ahuatanga e raru ai tetahi kaute nui mo etahi atu.

Kei te rapu tohu

Ki te ata tirotirohia, ka kite tatou kei te he tetahi tono. He mea tika kia ata tirohia tenei rarangi:

urls && array(
	select id from acc_{account_id}.urls 
	where url  ILIKE  '%enterprise_customer.com/jobs%'
)::text[]

Ko te whakaaro tuatahi na te mea pea ILIKE i runga i enei URL roa katoa (neke atu i te 1,4 miriona ahurei Ko nga URL kua kohia mo tenei kaute) ka raru pea te mahi.

Engari kaore, ehara i te mea!

SELECT id FROM urls WHERE url ILIKE '%enterprise_customer.com/jobs%';
  id
--------
 ...
(198661 rows)

Time: 5231.765 ms

E 5 hēkona anake te tono rapu tauira. Ko te rapu tauira i roto i te miriona URL ahurei kaore he raru.

Ko te whakapae i muri mai i te rarangi he maha JOIN. Tena pea na to ratou whakamahinga nui i paheke ai? Ko te tikanga JOINKo nga kaitono e tino kitea ana mo nga raruraru mahi, engari kaore au i whakapono he ahua noa to taatau keehi.

analytics_db=# SELECT
    count(*)
FROM
    acc_{account_id}.urls as recordings_urls,
    acc_{account_id}.recording_data_0 as recording_data,
    acc_{account_id}.sessions_0 as sessions
WHERE
    recording_data.usp_id = sessions.usp_id
    AND sessions.referrer_id = recordings_urls.id
    AND r_time > to_timestamp(1542585600)
    AND r_time < to_timestamp(1545177599)
    AND recording_data.duration >=5
    AND recording_data.num_of_pages > 0 ;
 count
-------
  8086
(1 row)

Time: 147.851 ms

A ehara hoki tenei i a matou. JOINKua tino tere.

Te whakaiti i te porowhita o te hunga whakapae

Kua reri ahau ki te huri i te patai kia tutuki ai nga whakapainga mahi ka taea. I whakawhanakehia e maua ko taku roopu e rua nga whakaaro matua:

  • Whakamahia te EXISTS mo te URL patai: I hiahia matou ki te tirotiro ano mena he raru kei te patai mo nga URL. Ko tetahi huarahi ki te whakatutuki i tenei ko te whakamahi noa EXISTS. EXISTS taea ka tino pai ake te mahi na te mea ka mutu tonu ina kitea te aho anake e rite ana ki te ahuatanga.

SELECT
	count(*) 
FROM 
    acc_{account_id}.urls as recordings_urls,
    acc_{account_id}.recording_data as recording_data,
    acc_{account_id}.sessions as sessions
WHERE
    recording_data.usp_id = sessions.usp_id
    AND  (  1 = 1  )
    AND sessions.referrer_id = recordings_urls.id
    AND  (exists(select id from acc_{account_id}.urls where url  ILIKE '%enterprise_customer.com/jobs%'))
    AND r_time > to_timestamp(1547585600)
    AND r_time < to_timestamp(1549177599)
    AND recording_data.duration >=5
    AND recording_data.num_of_pages > 0 ;
 count
 32519
(1 row)
Time: 1636.637 ms

Ae, ae. Uiui ka takaia ki roto EXISTS, ka tere nga mea katoa. Ko te patai arorau e whai ake nei he aha te tono me JOIN-ami me te uiuinga ake he tere takitahi, engari he tino puhoi tahi?

  • Te neke i te patai ki te CTE : Mēnā he tere te uiui i a ia ano, ka taea e taatau te tatau i te hua tere i te tuatahi ka tukuna ki te patai matua

WITH matching_urls AS (
    select id::text from acc_{account_id}.urls where url  ILIKE  '%enterprise_customer.com/jobs%'
)

SELECT 
    count(*) FROM acc_{account_id}.urls as recordings_urls, 
    acc_{account_id}.recording_data as recording_data, 
    acc_{account_id}.sessions as sessions,
    matching_urls
WHERE 
    recording_data.usp_id = sessions.usp_id 
    AND  (  1 = 1  )  
    AND sessions.referrer_id = recordings_urls.id
    AND (urls && array(SELECT id from matching_urls)::text[])
    AND r_time > to_timestamp(1542585600) 
    AND r_time < to_timestamp(1545107599)
    AND recording_data.duration >=5 
    AND recording_data.num_of_pages > 0;

Engari i tino puhoi tonu.

Te kimi i te tangata hara

I enei wa katoa, kotahi te mea iti i whiti mai i mua i oku kanohi, ka pehia e au ki te taha. Engari i te mea kaore he mea ke atu, ka whakatau ahau ki te titiro ano ki a ia. Kei te korero ahau && kaiwhakahaere. Kia ora EXISTS kua pai ake te mahi && ko te mea noa e toe ana puta noa i nga putanga katoa o te patai puhoi.

Ka titiro ki tuhinga, ka kite tatou i tera && whakamahia ina hiahia koe ki te kimi huānga noa i waenga i nga huānga e rua.

I roto i te tono taketake ko:

AND  (  urls &&  array(select id from acc_{account_id}.urls where url  ILIKE  '%enterprise_customer.com/jobs%')::text[]   )

Ko te tikanga ka mahi maatau i te rapu tauira i runga i o maatau URL, katahi ka kimihia te whakawhitinga me nga URL katoa me nga pou noa. He ahua rangirua tenei na te mea kaore nga "urls" i konei e tohu ana ki te ripanga kei roto nga URL katoa, engari ki te "urls" tīwae i te ripanga recording_data.

Me te tipu haere o nga whakapae mo &&, I ngana ahau ki te kimi whakau mo ratou i roto i te mahere patai i hangaia EXPLAIN ANALYZE (Kei a au he mahere kua tiakina, engari he pai ake ahau ki te whakamatau i te SQL i te ngana ki te mohio ki te opacity o nga kaiwhakatakoto uiui).

Filter: ((urls && ($0)::text[]) AND (r_time > '2018-12-17 12:17:23+00'::timestamp with time zone) AND (r_time < '2018-12-18 23:59:59+00'::timestamp with time zone) AND (duration >= '5'::double precision) AND (num_of_pages > 0))
                           Rows Removed by Filter: 52710

He maha nga rarangi whiriwhiringa mai i &&. Ko te tikanga ko tenei mahi ehara i te utu noa, engari he maha nga wa i mahia.

I whakamatauria e au tenei ma te wehe i te ahuatanga

SELECT 1
FROM 
    acc_{account_id}.urls as recordings_urls, 
    acc_{account_id}.recording_data_30 as recording_data_30, 
    acc_{account_id}.sessions_30 as sessions_30 
WHERE 
	urls &&  array(select id from acc_{account_id}.urls where url  ILIKE  '%enterprise_customer.com/jobs%')::text[]

He puhoi tenei patai. No te mea ko te JOIN-he tere, he tere hoki nga paatai, ko te mea anake i toe && kaiwhakahaere.

He mahi matua noa tenei. Me rapu i nga wa katoa te ripanga o raro o nga URL ki te rapu tauira, me rapu tonu tatou i nga waahi. Kaore e taea te rapu tika ma nga rekoata URL, na te mea he ID noa enei urls.

I te huarahi ki te otinga

&& puhoi no te mea he nui nga huinga e rua. Ka tere te mahi ki te whakakapi ahau urls i runga i { "http://google.com/", "http://wingify.com/" }.

I tiimata ahau ki te rapu huarahi ki te whakarite i te whakawhitinga ki Postgres me te kore whakamahi &&, engari kaore i tino angitu.

I te mutunga, i whakatau matou ki te whakaoti noa i te raru: homai nga mea katoa ki ahau urls raina e rite ana te URL ki te tauira. Ki te kore he tikanga taapiri ka - 

SELECT urls.url
FROM 
	acc_{account_id}.urls as urls,
	(SELECT unnest(recording_data.urls) AS id) AS unrolled_urls
WHERE
	urls.id = unrolled_urls.id AND
	urls.url  ILIKE  '%jobs%'

Tuhinga o mua JOIN syntax I whakamahia noa e ahau he patai me te whakawhānui ake recording_data.urls huānga kia taea e koe te tono tika i te ahuatanga ki roto WHERE.

Ko te mea nui i konei ko tera && whakamahia ki te taki mēnā kei roto i tētahi tāurunga he URL ōrite. Mēnā he paku titiro koe, ka kite koe e neke haere ana tēnei mahi i roto i ngā huānga o te huānga (ranei rarangi o te tepu) ka mutu ina tutuki tetahi tikanga (taurite). Kaore koe e whakamahara ki tetahi mea? Ae, EXISTS.

Mai i muri mai recording_data.urls Ka taea te tohutoro mai i waho o te horopaki patai, ka pa ana tenei ka hoki ano tatou ki to tatou hoa tawhito EXISTS ka takai i te patai ki a ia.

Ma te whakakotahi i nga mea katoa, ka whiwhi tatou i te patai kua arotau whakamutunga:

SELECT 
    count(*) 
FROM 
    acc_{account_id}.urls as recordings_urls, 
    acc_{account_id}.recording_data as recording_data, 
    acc_{account_id}.sessions as sessions 
WHERE 
    recording_data.usp_id = sessions.usp_id 
    AND  (  1 = 1  )  
    AND sessions.referrer_id = recordings_urls.id 
    AND r_time > to_timestamp(1542585600) 
    AND r_time < to_timestamp(1545177599) 
    AND recording_data.duration >=5 
    AND recording_data.num_of_pages > 0
    AND EXISTS(
        SELECT urls.url
        FROM 
            acc_{account_id}.urls as urls,
            (SELECT unnest(urls) AS rec_url_id FROM acc_{account_id}.recording_data) 
            AS unrolled_urls
        WHERE
            urls.id = unrolled_urls.rec_url_id AND
            urls.url  ILIKE  '%enterprise_customer.com/jobs%'
    );

Na te wa arahi whakamutunga Time: 1898.717 ms Te wa ki te whakanui?!?

Ehara i te mea tere! Tuatahi me tirotiro koe i te tika. I tino whakapae ahau EXISTS arotautanga i te mea ka huri te arorau ki te whakamutu i mua. Me tino mohio tatou kaore ano kia taapirihia he hapa kore-kitea ki te tono.

He whakamatautau ngawari ko te whakahaere count(*) i runga i nga patai puhoi me te tere mo te maha o nga huinga raraunga rereke. Na, mo tetahi waahanga iti o nga raraunga, i whakamanahia e au he tika nga hua katoa.

Ko nga whakamatautau katoa i puta nga hua pai. I whakatikahia e matou nga mea katoa!

Nga Akoranga i Akona

He maha nga akoranga hei ako mai i tenei korero:

  1. Karekau nga mahere patai e whakaatu i te katoa o nga korero, engari ka taea e raatau te whakaatu tohu
  2. Ko nga whakapae matua ehara i nga wa katoa ko te tino hara
  3. Ka taea te wawahi i nga patai puhoi ki te wehe i nga kohungahunga
  4. Ehara i te mea he whakahekeheke nga arotautanga katoa
  5. Whakamahia EXIST, i te mea ka taea, ka nui haere te hua

mutunga

I haere matou mai i te wa uiui o ~24 meneti ki te 2 hēkona - he tino pikinga mahi! Ahakoa i puta nui tenei tuhinga, ko nga whakamatautau katoa i mahia e matou i roto i te ra kotahi, a ko te whakaaro he 1,5 ki te 2 haora te roa mo te arotautanga me te whakamatautau.

He reo whakamiharo a SQL ki te kore koe e mataku, engari me ngana ki te ako me te whakamahi. Ma te mohio pai ki te pehea e mahia ai nga patai SQL, me pehea te mahi a te papaaarangi i nga mahere uiui, me pehea te mahi o nga tohu, me te rahi noa o nga raraunga e pa ana ki a koe, ka tino angitu koe ki te arotau i nga patai. He mea nui ano, engari, ki te haere tonu ki te whakamatau i nga huarahi rereke me te ata wawahi i te raru, ki te kimi i nga kohanga.

Ko te mea pai mo te whakatutuki i nga hua penei ko te whakapainga tere e kitea ana - kei reira ka utaina te purongo i mua karekau e utaina inaianei.

Nga mihi nui ki a oku hoa i te whakahau a Aditya MishraAditya Gauru и Varun Malhotra mo te whakaaro whakaaro me te Dinkar Pandir mo te kimi hapa nui i roto i ta matou tono whakamutunga i mua i to matou poroporoaki!

Source: will.com

Tāpiri i te kōrero