Desember kepungkur aku nampa laporan bug sing menarik saka tim dhukungan VWO. Wektu loading kanggo salah sawijining laporan analytics kanggo klien perusahaan gedhe katon nglarang. Lan amarga iki minangka tanggung jawabku, aku langsung fokus kanggo ngrampungake masalah kasebut.
prasejarah
Kanggo nggawe cetha apa aku ngomong bab, Aku bakal pitutur marang kowe sethitik babagan VWO. Iki minangka platform sing bisa digunakake kanggo ngluncurake macem-macem kampanye sing ditargetake ing situs web sampeyan: nindakake eksperimen A/B, nglacak pengunjung lan konversi, nganalisa corong penjualan, nampilake peta panas lan muter rekaman kunjungan.
Nanging sing paling penting babagan platform yaiku nglaporake. Kabeh fungsi ing ndhuwur saling gegandhengan. Lan kanggo klien perusahaan, jumlah informasi sing akeh bakal dadi ora ana gunane tanpa platform kuat sing nyedhiyakake ing wangun analytics.
Nggunakake platform, sampeyan bisa nggawe pitakon acak ing set data gedhe. Punika conto prasaja:
Tampilake kabeh klik ing kaca "abc.com" Saka <tanggal d1> nganti <tanggal d2> kanggo wong sing nggunakake Chrome UTAWA (dumunung ing Eropa LAN nggunakake iPhone)
Klien sing dimaksud nyoba nindakake apa wae sing kudu ditindakake kanthi intuisi kanthi cepet:
Tampilake kabeh rekaman sesi kanggo pangguna sing ngunjungi kaca apa wae kanthi URL sing ngemot "/jobs"
Situs iki duwe ton lalu lintas lan kita nyimpen luwih saka yuta URL unik mung kanggo iku. Lan dheweke pengin golek template URL sing cukup prasaja sing ana hubungane karo model bisnis.
Penyelidikan awal
Ayo ndeleng apa sing kedadeyan ing basis data. Ing ngisor iki query SQL alon asli:
SELECT
count(*)
FROM
acc_{account_id}.urls as recordings_urls,
acc_{account_id}.recording_data as recording_data,
acc_{account_id}.sessions as sessions
WHERE
recording_data.usp_id = sessions.usp_id
AND sessions.referrer_id = recordings_urls.id
AND ( urls && array(select id from acc_{account_id}.urls where url ILIKE '%enterprise_customer.com/jobs%')::text[] )
AND r_time > to_timestamp(1542585600)
AND r_time < to_timestamp(1545177599)
AND recording_data.duration >=5
AND recording_data.num_of_pages > 0 ;
Lan iki wektu:
Wektu sing direncanakake: 1.480 ms Wektu eksekusi: 1431924.650 ms
Pitakonan nyusup 150 ewu larik. Perencana pitakon nuduhake sawetara rincian sing menarik, nanging ora ana kemacetan sing jelas.
Ayo sinau panjaluk kasebut luwih lanjut. Kaya sing sampeyan ngerteni, dheweke nindakake JOIN telung tabel:
mau: kanggo nampilake informasi sesi: browser, agen pangguna, negara, lan liya-liyane.
rekaman_data: URL sing direkam, kaca, durasi kunjungan
URL: Kanggo ngindhari duplikasi URL sing gedhe banget, kita simpen ing tabel sing kapisah.
Uga elinga yen kabeh tabel kita wis dipisahake dening account_id. Kanthi cara iki, kahanan ing ngendi siji akun gedhe banget nyebabake masalah kanggo wong liya ora kalebu.
Nggolek pitunjuk
Sawise dipriksa kanthi tliti, kita weruh ana sing salah karo panjaluk tartamtu. Iku worth njupuk dipikir nyedhaki baris iki:
urls && array(
select id from acc_{account_id}.urls
where url ILIKE '%enterprise_customer.com/jobs%'
)::text[]
Pikiran pisanan sing mbok menawa amarga ILIKE ing kabeh URL sing dawa iki (kita duwe luwih saka 1,4 yuta unik URL sing diklumpukake kanggo akun iki) kinerja bisa nandhang sangsara.
Nanging ora, iku dudu titik!
SELECT id FROM urls WHERE url ILIKE '%enterprise_customer.com/jobs%';
id
--------
...
(198661 rows)
Time: 5231.765 ms
Panjaluk telusuran cithakan mung butuh 5 detik. Nggoleki pola ing yuta URL unik jelas ora dadi masalah.
analytics_db=# SELECT
count(*)
FROM
acc_{account_id}.urls as recordings_urls,
acc_{account_id}.recording_data_0 as recording_data,
acc_{account_id}.sessions_0 as sessions
WHERE
recording_data.usp_id = sessions.usp_id
AND sessions.referrer_id = recordings_urls.id
AND r_time > to_timestamp(1542585600)
AND r_time < to_timestamp(1545177599)
AND recording_data.duration >=5
AND recording_data.num_of_pages > 0 ;
count
-------
8086
(1 row)
Time: 147.851 ms
Lan iki uga dudu kasus kita. JOIN's ternyata cukup cepet.
Narrowing mudhun bunder saka tersangka
Aku wis siyap miwiti ngganti pitakon kanggo entuk perbaikan kinerja sing bisa ditindakake. Aku lan tim ngembangake 2 gagasan utama:
Gunakake EXISTS kanggo URL subquery: Kita pengin mriksa maneh yen ana masalah karo subquery kanggo URL. Salah siji cara kanggo entuk iki mung nggunakake EXISTS. EXISTSbisa nemen nambah kinerja wiwit ends sanalika ketemu mung senar sing cocog kondisi.
SELECT
count(*)
FROM
acc_{account_id}.urls as recordings_urls,
acc_{account_id}.recording_data as recording_data,
acc_{account_id}.sessions as sessions
WHERE
recording_data.usp_id = sessions.usp_id
AND ( 1 = 1 )
AND sessions.referrer_id = recordings_urls.id
AND (exists(select id from acc_{account_id}.urls where url ILIKE '%enterprise_customer.com/jobs%'))
AND r_time > to_timestamp(1547585600)
AND r_time < to_timestamp(1549177599)
AND recording_data.duration >=5
AND recording_data.num_of_pages > 0 ;
count
32519
(1 row)
Time: 1636.637 ms
Pindhah subquery menyang CTE : Yen pitakon cepet dhewe, kita mung bisa ngetung asil cepet dhisik banjur menehi menyang pitakon utama
WITH matching_urls AS (
select id::text from acc_{account_id}.urls where url ILIKE '%enterprise_customer.com/jobs%'
)
SELECT
count(*) FROM acc_{account_id}.urls as recordings_urls,
acc_{account_id}.recording_data as recording_data,
acc_{account_id}.sessions as sessions,
matching_urls
WHERE
recording_data.usp_id = sessions.usp_id
AND ( 1 = 1 )
AND sessions.referrer_id = recordings_urls.id
AND (urls && array(SELECT id from matching_urls)::text[])
AND r_time > to_timestamp(1542585600)
AND r_time < to_timestamp(1545107599)
AND recording_data.duration >=5
AND recording_data.num_of_pages > 0;
Nanging isih alon banget.
Nggoleki pelakune
Kabeh wektu iki, siji bab cilik flashed sadurunge mripatku, kang terus-terusan disikat aside. Nanging amarga ora ana apa-apa liyane, aku mutusake kanggo ndeleng dheweke uga. Aku ngomong babagan && operator. Bye EXISTS mung apik kinerja && mung siji-sijine faktor umum sing isih ana ing kabeh versi pitakon alon.
nyawang dokumentasi, kita weruh sing && digunakake nalika sampeyan kudu nemokake unsur umum antarane rong susunan.
Ing panjalukan asli iki:
AND ( urls && array(select id from acc_{account_id}.urls where url ILIKE '%enterprise_customer.com/jobs%')::text[] )
Tegese kita nindakake telusuran pola ing URL kita, banjur temokake persimpangan karo kabeh URL kanthi kiriman umum. Iki rada mbingungake amarga "url" ing kene ora nuduhake tabel sing ngemot kabeh URL, nanging menyang kolom "url" ing tabel. recording_data.
Kanthi akeh anggepan babagan &&, Aku nyoba golek konfirmasi kanggo wong-wong mau ing rencana query kui EXPLAIN ANALYZE (Aku wis duwe rencana sing disimpen, nanging aku biasane luwih seneng eksperimen ing SQL tinimbang nyoba ngerti opacity saka perancang pitakon).
Filter: ((urls && ($0)::text[]) AND (r_time > '2018-12-17 12:17:23+00'::timestamp with time zone) AND (r_time < '2018-12-18 23:59:59+00'::timestamp with time zone) AND (duration >= '5'::double precision) AND (num_of_pages > 0))
Rows Removed by Filter: 52710
Ana sawetara baris saringan mung saka &&. Sing tegese operasi iki ora mung larang, nanging uga ditindakake kaping pirang-pirang.
Aku nyoba iki kanthi ngisolasi kondisi kasebut
SELECT 1
FROM
acc_{account_id}.urls as recordings_urls,
acc_{account_id}.recording_data_30 as recording_data_30,
acc_{account_id}.sessions_30 as sessions_30
WHERE
urls && array(select id from acc_{account_id}.urls where url ILIKE '%enterprise_customer.com/jobs%')::text[]
Pitakon iki alon. Amarga ing JOIN-s cepet lan subqueries cepet, mung bab kiwa ana && operator.
Iki mung operasi kunci. Kita mesthi kudu nggoleki kabeh tabel dhasar URL kanggo nggoleki pola, lan kita kudu tansah golek persimpangan. Kita ora bisa nelusuri kanthi cathetan URL langsung, amarga iki mung ID sing dirujuk urls.
Ing dalan kanggo solusi
&& alon amarga loro mranata ageng. Operasi bakal relatif cepet yen aku ngganti urls ing { "http://google.com/", "http://wingify.com/" }.
Aku miwiti nggoleki cara kanggo nggawe persimpangan ing Postgres tanpa nggunakake &&, nanging tanpa akeh sukses.
Pungkasane, kita mutusake kanggo ngrampungake masalah kasebut kanthi mandiri: wenehi kabeh urls baris sing URL cocog karo pola. Tanpa syarat tambahan iku bakal -
SELECT urls.url
FROM
acc_{account_id}.urls as urls,
(SELECT unnest(recording_data.urls) AS id) AS unrolled_urls
WHERE
urls.id = unrolled_urls.id AND
urls.url ILIKE '%jobs%'
Tinimbang JOIN sintaks aku mung nggunakake subquery lan ditambahi recording_data.urls array supaya sampeyan bisa langsung ngetrapake kondisi kasebut ing WHERE.
Sing paling penting ing kene yaiku && digunakake kanggo mriksa apa entri diwenehi ngemot URL sing cocog. Yen sampeyan squint sethitik, sampeyan bisa ndeleng operasi iki gerakane liwat unsur Uploaded (utawa larik saka meja) lan mandheg nalika kondisi (cocog) ketemu. Apa ora ngelingake sampeyan apa-apa? ya, EXISTS.
Wiwit ing recording_data.urls bisa dirujuk saka njaba konteks subquery, nalika kedadeyan kasebut, kita bisa bali menyang kanca lawas EXISTS lan mbungkus subquery karo.
Nggabungake kabeh, kita entuk pitakon sing dioptimalake pungkasan:
SELECT
count(*)
FROM
acc_{account_id}.urls as recordings_urls,
acc_{account_id}.recording_data as recording_data,
acc_{account_id}.sessions as sessions
WHERE
recording_data.usp_id = sessions.usp_id
AND ( 1 = 1 )
AND sessions.referrer_id = recordings_urls.id
AND r_time > to_timestamp(1542585600)
AND r_time < to_timestamp(1545177599)
AND recording_data.duration >=5
AND recording_data.num_of_pages > 0
AND EXISTS(
SELECT urls.url
FROM
acc_{account_id}.urls as urls,
(SELECT unnest(urls) AS rec_url_id FROM acc_{account_id}.recording_data)
AS unrolled_urls
WHERE
urls.id = unrolled_urls.rec_url_id AND
urls.url ILIKE '%enterprise_customer.com/jobs%'
);
Lan wektu timbal pungkasan Time: 1898.717 ms Wektu kanggo ngrayakake?!?
A test prasaja kanggo mbukak count(*) ing pitakon alon lan cepet kanggo pirang-pirang set data sing beda. Banjur, kanggo subset cilik data, aku verifikasi kanthi manual yen kabeh asil bener.
Kabeh tes menehi asil positif kanthi konsisten. Kita ndandani kabeh!
Piwulang sing Disinaoni
Kathah piwulang ingkang saged dipunpendhet saking cariyos punika:
Rencana pitakon ora nyritakake kabeh, nanging bisa menehi pitunjuk
Tersangka utama ora mesthi dadi pelaku nyata
Pitakonan alon bisa dipecah kanggo ngisolasi bottlenecks
Ora kabeh optimasi sifate reduktif
Gunakake EXIST, yen bisa, bisa nyebabake peningkatan produktivitas sing dramatis
kesimpulan
Kita pindhah saka wektu pitakon ~ 24 menit dadi 2 detik - peningkatan kinerja sing cukup signifikan! Sanajan artikel iki metu gedhe, kabeh eksperimen sing ditindakake sajrone sedina, lan kira-kira butuh 1,5 nganti 2 jam kanggo optimasi lan tes.
SQL minangka basa sing apik yen sampeyan ora wedi, nanging coba sinau lan gunakake. Kanthi duwe pangerten sing apik babagan carane query SQL dieksekusi, carane database nggawe rencana query, carane indeks bisa digunakake, lan mung ukuran data sing sampeyan tindakake, sampeyan bisa sukses banget ing ngoptimalake pitakon. Nanging, penting banget kanggo terus nyoba pendekatan sing beda-beda lan alon-alon ngilangi masalah kasebut, nemokake kemacetan.
Sisih paling apik babagan entuk asil kaya iki yaiku paningkatan kacepetan sing katon lan katon - ing ngendi laporan sing sadurunge ora bakal dimuat saiki meh langsung dimuat.
Matur nuwun khusus kanggo kancaku ing dhawuhe Aditya Mishra, Aditya Gauru ΠΈ Varun Malhotra kanggo brainstorming lan Dinkar Pandir kanggo nemokake kesalahan penting ing panjalukan pungkasan kita sadurunge pungkasanipun pamit!