๋ฐ์ดํ„ฐ ๊ตฌ์ถ• ๋„๊ตฌ ๋˜๋Š” ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค์™€ ์Šค๋ฌด๋””์˜ ๊ณตํ†ต์ 

๋ฐ์ดํ„ฐ ๊ตฌ์ถ• ๋„๊ตฌ ๋˜๋Š” ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค์™€ ์Šค๋ฌด๋””์˜ ๊ณตํ†ต์ 
์ด์ƒ์ ์ธ ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค๋Š” ์–ด๋–ค ์›์น™์— ๋”ฐ๋ผ ๊ตฌ์ถ•๋˜๋‚˜์š”?

์ƒ์šฉ๊ตฌ ์ฝ”๋“œ๊ฐ€ ์—†๋Š” ๊ฒฝ์šฐ ๋น„์ฆˆ๋‹ˆ์Šค ๊ฐ€์น˜์™€ ๋ถ„์„์— ์ง‘์ค‘ํ•˜์„ธ์š”. DWH๋ฅผ ์ฝ”๋“œ๋ฒ ์ด์Šค๋กœ ๊ด€๋ฆฌ: ๋ฒ„์ „ ๊ด€๋ฆฌ, ๊ฒ€ํ† , ์ž๋™ํ™”๋œ ํ…Œ์ŠคํŠธ ๋ฐ CI. ๋ชจ๋“ˆ์‹, ํ™•์žฅ ๊ฐ€๋Šฅํ•œ ์˜คํ”ˆ ์†Œ์Šค ๋ฐ ์ปค๋ฎค๋‹ˆํ‹ฐ์ž…๋‹ˆ๋‹ค. ์‚ฌ์šฉ์ž ์นœํ™”์ ์ธ ๋ฌธ์„œํ™” ๋ฐ ์ข…์†์„ฑ ์‹œ๊ฐํ™”(Data Lineage).

์ด์— ๋Œ€ํ•œ ์ž์„ธํ•œ ๋‚ด์šฉ๊ณผ ๋น… ๋ฐ์ดํ„ฐ ๋ฐ ๋ถ„์„ ์ƒํƒœ๊ณ„์—์„œ DBT์˜ ์—ญํ• ์— ๋Œ€ํ•ด ์ž์„ธํžˆ ์•Œ์•„๋ณด์„ธ์š”. cat์— ์˜ค์‹  ๊ฒƒ์„ ํ™˜์˜ํ•ฉ๋‹ˆ๋‹ค.

์•ˆ๋…•ํ•˜์„ธ์š”.

Artemy Kozyr๋‹˜์ด ์—ฐ๋ฝ๋“œ๋ฆฝ๋‹ˆ๋‹ค. ์ €๋Š” 5๋…„ ๋„˜๊ฒŒ ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค ์ž‘์—…, ETL/ELT ๊ตฌ์ถ•, ๋ฐ์ดํ„ฐ ๋ถ„์„ ๋ฐ ์‹œ๊ฐํ™” ์ž‘์—…์„ ํ•ด์™”์Šต๋‹ˆ๋‹ค. ๋‚˜๋Š” ํ˜„์žฌ ์ผํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค Wheely, ์ €๋Š” OTUS์—์„œ ๊ฐ•์ขŒ๋ฅผ ๊ฐ€๋ฅด์น˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค ๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด, ๊ทธ๋ฆฌ๊ณ  ์˜ค๋Š˜์€ ์‹œ์ž‘์„ ๊ธฐ๋Œ€ํ•˜๋ฉฐ ์ œ๊ฐ€ ์“ด ๊ธ€์„ ์—ฌ๋Ÿฌ๋ถ„๊ณผ ๊ณต์œ ํ•˜๊ณ  ์‹ถ์Šต๋‹ˆ๋‹ค. ๊ณผ์ •์— ๋Œ€ํ•œ ์‹ ๊ทœ ๋“ฑ๋ก.

๊ฐœ์š”

DBT ํ”„๋ ˆ์ž„์›Œํฌ๋Š” ELT(Extract - Transform - Load) ์•ฝ์–ด์˜ T์— ๊ด€ํ•œ ๊ฒƒ์ž…๋‹ˆ๋‹ค.

BigQuery, Redshift, Snowflake์™€ ๊ฐ™์€ ์ƒ์‚ฐ์ ์ด๊ณ  ํ™•์žฅ ๊ฐ€๋Šฅํ•œ ๋ถ„์„ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค์˜ ์ถœํ˜„์œผ๋กœ ์ธํ•ด ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค ์™ธ๋ถ€์—์„œ ๋ณ€ํ™˜์„ ์ˆ˜ํ–‰ํ•  ํ•„์š”๊ฐ€ ์—†๊ฒŒ ๋˜์—ˆ์Šต๋‹ˆ๋‹ค. 

DBT๋Š” ์†Œ์Šค์—์„œ ๋ฐ์ดํ„ฐ๋ฅผ ๋‹ค์šด๋กœ๋“œํ•˜์ง€ ์•Š์ง€๋งŒ ์ €์žฅ์†Œ(๋‚ด๋ถ€ ๋˜๋Š” ์™ธ๋ถ€ ์ €์žฅ์†Œ)์— ์ด๋ฏธ ๋กœ๋“œ๋œ ๋ฐ์ดํ„ฐ๋กœ ์ž‘์—…ํ•  ์ˆ˜ ์žˆ๋Š” ์ข‹์€ ๊ธฐํšŒ๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค.

๋ฐ์ดํ„ฐ ๊ตฌ์ถ• ๋„๊ตฌ ๋˜๋Š” ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค์™€ ์Šค๋ฌด๋””์˜ ๊ณตํ†ต์ 
DBT์˜ ์ฃผ์š” ๋ชฉ์ ์€ ์ฝ”๋“œ๋ฅผ ๊ฐ€์ ธ์™€ SQL๋กœ ์ปดํŒŒ์ผํ•˜๊ณ  ๋ฆฌํฌ์ง€ํ† ๋ฆฌ์—์„œ ์˜ฌ๋ฐ”๋ฅธ ์ˆœ์„œ๋กœ ๋ช…๋ น์„ ์‹คํ–‰ํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.

DBT ํ”„๋กœ์ ํŠธ ๊ตฌ์กฐ

ํ”„๋กœ์ ํŠธ๋Š” ๋‹ค์Œ ๋‘ ๊ฐ€์ง€ ์œ ํ˜•์˜ ๋””๋ ‰ํ„ฐ๋ฆฌ์™€ ํŒŒ์ผ๋กœ๋งŒ ๊ตฌ์„ฑ๋ฉ๋‹ˆ๋‹ค.

  • ๋ชจ๋ธ(.sql) - SELECT ์ฟผ๋ฆฌ๋กœ ํ‘œํ˜„๋˜๋Š” ๋ณ€ํ™˜ ๋‹จ์œ„
  • ๊ตฌ์„ฑ ํŒŒ์ผ(.yml) - ๋งค๊ฐœ๋ณ€์ˆ˜, ์„ค์ •, ํ…Œ์ŠคํŠธ, ๋ฌธ์„œ

๊ธฐ๋ณธ์ ์œผ๋กœ ์ž‘์—…์€ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๊ตฌ์„ฑ๋ฉ๋‹ˆ๋‹ค.

  • ์‚ฌ์šฉ์ž๋Š” ํŽธ๋ฆฌํ•œ IDE์—์„œ ๋ชจ๋ธ ์ฝ”๋“œ๋ฅผ ์ค€๋น„ํ•ฉ๋‹ˆ๋‹ค.
  • CLI๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ์ด ์‹œ์ž‘๋˜๊ณ  DBT๊ฐ€ ๋ชจ๋ธ ์ฝ”๋“œ๋ฅผ SQL๋กœ ์ปดํŒŒ์ผํ•ฉ๋‹ˆ๋‹ค.
  • ์ปดํŒŒ์ผ๋œ SQL ์ฝ”๋“œ๋Š” ์ง€์ •๋œ ์ˆœ์„œ(๊ทธ๋ž˜ํ”„)๋กœ Storage์—์„œ ์‹คํ–‰๋ฉ๋‹ˆ๋‹ค.

CLI์—์„œ ์‹คํ–‰ํ•˜๋Š” ๋ชจ์Šต์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

๋ฐ์ดํ„ฐ ๊ตฌ์ถ• ๋„๊ตฌ ๋˜๋Š” ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค์™€ ์Šค๋ฌด๋””์˜ ๊ณตํ†ต์ 

๋ชจ๋“  ๊ฒƒ์ด SELECT์ž…๋‹ˆ๋‹ค

์ด๋Š” Data Build Tool ํ”„๋ ˆ์ž„์›Œํฌ์˜ ํ•ต์‹ฌ ๊ธฐ๋Šฅ์ž…๋‹ˆ๋‹ค. ์ฆ‰, DBT๋Š” ์ฟผ๋ฆฌ๋ฅผ ์ €์žฅ์†Œ๋กœ ๊ตฌ์ฒดํ™”ํ•˜๋Š” ๊ฒƒ๊ณผ ๊ด€๋ จ๋œ ๋ชจ๋“  ์ฝ”๋“œ(CREATE, INSERT, UPDATE, DELETE ALTER, GRANT, ... ๋ช…๋ น์˜ ๋ณ€ํ˜•)๋ฅผ ์ถ”์ƒํ™”ํ•ฉ๋‹ˆ๋‹ค.

๋ชจ๋“  ๋ชจ๋ธ์—๋Š” ๊ฒฐ๊ณผ ๋ฐ์ดํ„ฐ ์„ธํŠธ๋ฅผ ์ •์˜ํ•˜๋Š” ํ•˜๋‚˜์˜ SELECT ์ฟผ๋ฆฌ ์ž‘์„ฑ์ด ํฌํ•จ๋ฉ๋‹ˆ๋‹ค.

์ด ๊ฒฝ์šฐ ๋ณ€ํ™˜ ๋…ผ๋ฆฌ๋Š” ๋‹ค์ค‘ ๋ ˆ๋ฒจ์ผ ์ˆ˜ ์žˆ์œผ๋ฉฐ ์—ฌ๋Ÿฌ ๋‹ค๋ฅธ ๋ชจ๋ธ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ํ†ตํ•ฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ฃผ๋ฌธ ์‡ผ์ผ€์ด์Šค(f_orders)๋ฅผ ๊ตฌ์ถ•ํ•  ๋ชจ๋ธ์˜ ์˜ˆ:

{% set payment_methods = ['credit_card', 'coupon', 'bank_transfer', 'gift_card'] %}
 
with orders as (
 
   select * from {{ ref('stg_orders') }}
 
),
 
order_payments as (
 
   select * from {{ ref('order_payments') }}
 
),
 
final as (
 
   select
       orders.order_id,
       orders.customer_id,
       orders.order_date,
       orders.status,
       {% for payment_method in payment_methods -%}
       order_payments.{{payment_method}}_amount,
       {% endfor -%}
       order_payments.total_amount as amount
   from orders
       left join order_payments using (order_id)
 
)
 
select * from final

์—ฌ๊ธฐ์„œ ์–ด๋–ค ํฅ๋ฏธ๋กœ์šด ์ ์„ ๋ณผ ์ˆ˜ ์žˆ๋‚˜์š”?

์ฒซ ๋ฒˆ์งธ: CTE(Common Table Expressions) ์‚ฌ์šฉ - ๋งŽ์€ ๋ณ€ํ™˜๊ณผ ๋น„์ฆˆ๋‹ˆ์Šค ๋…ผ๋ฆฌ๊ฐ€ ํฌํ•จ๋œ ์ฝ”๋“œ๋ฅผ ๊ตฌ์„ฑํ•˜๊ณ  ์ดํ•ดํ•ฉ๋‹ˆ๋‹ค.

๋‘˜์งธ: ๋ชจ๋ธ ์ฝ”๋“œ๋Š” SQL๊ณผ ์–ธ์–ด์˜ ํ˜ผํ•ฉ์ž…๋‹ˆ๋‹ค. ์ง„์ž (ํ…œํ”Œ๋ฆฟ ์–ธ์–ด).

์˜ˆ์ œ์—์„œ๋Š” ๋ฃจํ”„๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. for ํ‘œํ˜„์‹์— ์ง€์ •๋œ ๊ฐ ๊ฒฐ์ œ ๋ฐฉ๋ฒ•์— ๋Œ€ํ•œ ๊ธˆ์•ก์„ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. ์„ธํŠธ. ๊ธฐ๋Šฅ๋„ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค ์‹ฌํŒ โ€” ์ฝ”๋“œ ๋‚ด์—์„œ ๋‹ค๋ฅธ ๋ชจ๋ธ์„ ์ฐธ์กฐํ•˜๋Š” ๊ธฐ๋Šฅ:

  • ์ปดํŒŒ์ผ ์ค‘ ์‹ฌํŒ Storage์˜ ํ…Œ์ด๋ธ”์ด๋‚˜ ๋ทฐ์— ๋Œ€ํ•œ ๋Œ€์ƒ ํฌ์ธํ„ฐ๋กœ ๋ณ€ํ™˜๋ฉ๋‹ˆ๋‹ค.
  • ์‹ฌํŒ ๋ชจ๋ธ ์ข…์†์„ฑ ๊ทธ๋ž˜ํ”„๋ฅผ ์ž‘์„ฑํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๊ทธ๊ฒƒ์€ ์ง„์ž DBT์— ๊ฑฐ์˜ ๋ฌดํ•œํ•œ ๊ฐ€๋Šฅ์„ฑ์„ ์ถ”๊ฐ€ํ•ฉ๋‹ˆ๋‹ค. ๊ฐ€์žฅ ์ผ๋ฐ˜์ ์œผ๋กœ ์‚ฌ์šฉ๋˜๋Š” ๊ฒƒ์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

  • If / else ๋ฌธ - ๋ถ„๊ธฐ ๋ฌธ
  • For ๋ฃจํ”„ - ์‚ฌ์ดํด
  • ๋ณ€์ˆ˜
  • ๋งคํฌ๋กœ - ๋งคํฌ๋กœ ์ƒ์„ฑ

๊ตฌ์ฒดํ™”: ํ…Œ์ด๋ธ”, ๋ทฐ, ์ฆ๋ถ„

๊ตฌ์ฒดํ™” ์ „๋žต์€ ๊ฒฐ๊ณผ ๋ชจ๋ธ ๋ฐ์ดํ„ฐ ์„ธํŠธ๊ฐ€ ์ €์žฅ์†Œ์— ์ €์žฅ๋˜๋Š” ์ ‘๊ทผ ๋ฐฉ์‹์ž…๋‹ˆ๋‹ค.

๊ธฐ๋ณธ์ ์œผ๋กœ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

  • ํ…Œ์ด๋ธ” - ์ €์žฅ์†Œ์˜ ๋ฌผ๋ฆฌ์  ํ…Œ์ด๋ธ”
  • ๋ณด๊ธฐ - ์ €์žฅ์†Œ์˜ ๋ณด๊ธฐ, ๊ฐ€์ƒ ํ…Œ์ด๋ธ”

๋” ๋ณต์žกํ•œ ๊ตฌ์ฒดํ™” ์ „๋žต๋„ ์žˆ์Šต๋‹ˆ๋‹ค.

  • ์ฆ๋ถ„ - ์ฆ๋ถ„ ๋กœ๋“œ(๋Œ€ํ˜• ํŒฉํŠธ ํ…Œ์ด๋ธ”) ์ƒˆ ์ค„์ด ์ถ”๊ฐ€๋˜๊ณ , ๋ณ€๊ฒฝ๋œ ์ค„์ด ์—…๋ฐ์ดํŠธ๋˜๊ณ , ์‚ญ์ œ๋œ ์ค„์ด ์ง€์›Œ์ง‘๋‹ˆ๋‹ค. 
  • ์ž„์‹œ - ๋ชจ๋ธ์ด ์ง์ ‘์ ์œผ๋กœ ๊ตฌ์ฒดํ™”๋˜์ง€๋Š” ์•Š์ง€๋งŒ ๋‹ค๋ฅธ ๋ชจ๋ธ์— CTE๋กœ ์ฐธ์—ฌํ•ฉ๋‹ˆ๋‹ค.
  • ์ง์ ‘ ์ถ”๊ฐ€ํ•  ์ˆ˜ ์žˆ๋Š” ๋‹ค๋ฅธ ์ „๋žต

๊ตฌ์ฒดํ™” ์ „๋žต ์™ธ์—๋„ ํŠน์ • ์ €์žฅ์†Œ์— ๋Œ€ํ•œ ์ตœ์ ํ™” ๊ธฐํšŒ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

  • ๋ˆˆ์†ก์ด: ์ž„์‹œ ํ…Œ์ด๋ธ”, ๋ณ‘ํ•ฉ ๋™์ž‘, ํ…Œ์ด๋ธ” ํด๋Ÿฌ์Šคํ„ฐ๋ง, ๊ถŒํ•œ ๋ณต์‚ฌ, ๋ณด์•ˆ ๋ทฐ
  • ์ ์ƒ‰ ํŽธ์ด: Distkey, Sortkey(์ธํ„ฐ๋ฆฌ๋ธŒ, ๋ณตํ•ฉ), ํ›„๊ธฐ ๋ฐ”์ธ๋”ฉ ๋ทฐ
  • BigQuery: ํ…Œ์ด๋ธ” ํŒŒํ‹ฐ์…”๋‹ ๋ฐ ํด๋Ÿฌ์Šคํ„ฐ๋ง, ๋ณ‘ํ•ฉ ๋™์ž‘, KMS ์•”ํ˜ธํ™”, ๋ ˆ์ด๋ธ” ๋ฐ ํƒœ๊ทธ
  • ๋ถˆ๊ฝƒ: ํŒŒ์ผ ํ˜•์‹(parquet, csv, json, orc, delta), partition_by, Clustered_by, buckets, Incremental_strategy

ํ˜„์žฌ ์ง€์›๋˜๋Š” ์ €์žฅ์†Œ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

  • ํฌ์ŠคํŠธ๊ทธ๋ ˆ์Šค
  • ์ ์ƒ‰ ํŽธ์ด
  • BigQuery
  • ๋ˆˆ์†ก์ด
  • ํ”„๋ ˆ์Šคํ† (์ผ๋ถ€)
  • ์ŠคํŒŒํฌ(์ผ๋ถ€)
  • Microsoft SQL Server(์ปค๋ฎค๋‹ˆํ‹ฐ ์–ด๋Œ‘ํ„ฐ)

๋ชจ๋ธ์„ ๊ฐœ์„ ํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

  • ์ฑ„์šฐ๊ธฐ๋ฅผ ์ฆ๋ถ„์ ์œผ๋กœ ๋งŒ๋“ค์–ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. (์ฆ๋ถ„)
  • Redshift์— ๋Œ€ํ•œ ๋ถ„ํ•  ๋ฐ ์ •๋ ฌ ํ‚ค๋ฅผ ์ถ”๊ฐ€ํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

-- ะšะพะฝั„ะธะณัƒั€ะฐั†ะธั ะผะพะดะตะปะธ: 
-- ะ˜ะฝะบั€ะตะผะตะฝั‚ะฐะปัŒะฝะพะต ะฝะฐะฟะพะปะฝะตะฝะธะต, ัƒะฝะธะบะฐะปัŒะฝั‹ะน ะบะปัŽั‡ ะดะปั ะพะฑะฝะพะฒะปะตะฝะธั ะทะฐะฟะธัะตะน (unique_key)
-- ะšะปัŽั‡ ัะตะณะผะตะฝั‚ะฐั†ะธะธ (dist), ะบะปัŽั‡ ัะพั€ั‚ะธั€ะพะฒะบะธ (sort)
{{
  config(
       materialized='incremental',
       unique_key='order_id',
       dist="customer_id",
       sort="order_date"
   )
}}
 
{% set payment_methods = ['credit_card', 'coupon', 'bank_transfer', 'gift_card'] %}
 
with orders as (
 
   select * from {{ ref('stg_orders') }}
   where 1=1
   {% if is_incremental() -%}
       -- ะญั‚ะพั‚ ั„ะธะปัŒั‚ั€ ะฑัƒะดะตั‚ ะฟั€ะธะผะตะฝะตะฝ ั‚ะพะปัŒะบะพ ะดะปั ะธะฝะบั€ะตะผะตะฝั‚ะฐะปัŒะฝะพะณะพ ะทะฐะฟัƒัะบะฐ
       and order_date >= (select max(order_date) from {{ this }})
   {%- endif %} 
 
),
 
order_payments as (
 
   select * from {{ ref('order_payments') }}
 
),
 
final as (
 
   select
       orders.order_id,
       orders.customer_id,
       orders.order_date,
       orders.status,
       {% for payment_method in payment_methods -%}
       order_payments.{{payment_method}}_amount,
       {% endfor -%}
       order_payments.total_amount as amount
   from orders
       left join order_payments using (order_id)
 
)
 
select * from final

๋ชจ๋ธ ์ข…์†์„ฑ ๊ทธ๋ž˜ํ”„

๋˜ํ•œ ์ข…์†์„ฑ ํŠธ๋ฆฌ์ด๊ธฐ๋„ ํ•ฉ๋‹ˆ๋‹ค. DAG(๋ฐฉํ–ฅ์„ฑ ๋น„์ˆœํ™˜ ๊ทธ๋ž˜ํ”„)๋ผ๊ณ ๋„ ํ•ฉ๋‹ˆ๋‹ค.

DBT๋Š” ๋ชจ๋“  ํ”„๋กœ์ ํŠธ ๋ชจ๋ธ์˜ ๊ตฌ์„ฑ์„ ๊ธฐ๋ฐ˜์œผ๋กœ ๊ทธ๋ž˜ํ”„๋ฅผ ์ž‘์„ฑํ•˜๊ฑฐ๋‚˜ ์˜คํžˆ๋ ค ๋ชจ๋ธ ๋‚ด์˜ ref() ๋งํฌ๋ฅผ ๋‹ค๋ฅธ ๋ชจ๋ธ์— ์—ฐ๊ฒฐํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ž˜ํ”„๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด ๋‹ค์Œ ์ž‘์—…์„ ์ˆ˜ํ–‰ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

  • ์˜ฌ๋ฐ”๋ฅธ ์ˆœ์„œ๋กœ ๋ชจ๋ธ ์‹คํ–‰
  • ์ ํฌ ํ˜•์„ฑ์˜ ๋ณ‘๋ ฌํ™”
  • ์ž„์˜์˜ ํ•˜์œ„ ๊ทธ๋ž˜ํ”„ ์‹คํ–‰ 

๊ทธ๋ž˜ํ”„ ์‹œ๊ฐํ™”์˜ ์˜ˆ:

๋ฐ์ดํ„ฐ ๊ตฌ์ถ• ๋„๊ตฌ ๋˜๋Š” ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค์™€ ์Šค๋ฌด๋””์˜ ๊ณตํ†ต์ 
๊ทธ๋ž˜ํ”„์˜ ๊ฐ ๋…ธ๋“œ๋Š” ๋ชจ๋ธ์ด๋ฉฐ, ๊ทธ๋ž˜ํ”„์˜ ๊ฐ€์žฅ์ž๋ฆฌ๋Š” ref ํ‘œํ˜„์‹์œผ๋กœ ์ง€์ •๋ฉ๋‹ˆ๋‹ค.

๋ฐ์ดํ„ฐ ํ’ˆ์งˆ ๋ฐ ๋ฌธ์„œํ™”

๋ชจ๋ธ ์ž์ฒด๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๊ฒƒ ์™ธ์—๋„ DBT๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด ๊ฒฐ๊ณผ ๋ฐ์ดํ„ฐ ์„ธํŠธ์— ๋Œ€ํ•ด ๋‹ค์Œ๊ณผ ๊ฐ™์€ ๋‹ค์–‘ํ•œ ๊ฐ€์ •์„ ํ…Œ์ŠคํŠธํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

  • Null ์•„๋‹˜
  • ์œ ์ผํ•œ
  • ์ฐธ์กฐ ๋ฌด๊ฒฐ์„ฑ - ์ฐธ์กฐ ๋ฌด๊ฒฐ์„ฑ(์˜ˆ: ์ฃผ๋ฌธ ํ…Œ์ด๋ธ”์˜ customer_id๋Š” ๊ณ ๊ฐ ํ…Œ์ด๋ธ”์˜ id์— ํ•ด๋‹นํ•จ)
  • ํ—ˆ์šฉ๋˜๋Š” ๊ฐ’ ๋ชฉ๋ก ์ผ์น˜

์˜ˆ๋ฅผ ๋“ค์–ด ํ•˜๋ฃจ, ์ผ์ฃผ์ผ, ํ•œ ๋‹ฌ ์ „ ์ง€ํ‘œ๋ฅผ ์‚ฌ์šฉํ•œ ์ˆ˜์ต % ํŽธ์ฐจ์™€ ๊ฐ™์€ ์ž์ฒด ํ…Œ์ŠคํŠธ(๋งž์ถคํ˜• ๋ฐ์ดํ„ฐ ํ…Œ์ŠคํŠธ)๋ฅผ ์ถ”๊ฐ€ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. SQL ์ฟผ๋ฆฌ๋กœ ๊ณต์‹ํ™”๋œ ๋ชจ๋“  ๊ฐ€์ •์€ ํ…Œ์ŠคํŠธ๊ฐ€ ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์ด๋Ÿฌํ•œ ๋ฐฉ์‹์œผ๋กœ ์ฐฝ๊ณ  ์ฐฝ์˜ ๋ฐ์ดํ„ฐ์—์„œ ์›์น˜ ์•Š๋Š” ํŽธ์ฐจ์™€ ์˜ค๋ฅ˜๋ฅผ ํฌ์ฐฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๋ฌธ์„œํ™” ์ธก๋ฉด์—์„œ DBT๋Š” ๋ชจ๋ธ ๋ฐ ์†์„ฑ ์ˆ˜์ค€์—์„œ ๋ฉ”ํƒ€๋ฐ์ดํ„ฐ์™€ ์„ค๋ช…์„ ์ถ”๊ฐ€, ๋ฒ„์ „ ๊ด€๋ฆฌ ๋ฐ ๋ฐฐํฌํ•˜๊ธฐ ์œ„ํ•œ ๋ฉ”์ปค๋‹ˆ์ฆ˜์„ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. 

๊ตฌ์„ฑ ํŒŒ์ผ ์ˆ˜์ค€์—์„œ ํ…Œ์ŠคํŠธ ๋ฐ ๋ฌธ์„œ๋ฅผ ์ถ”๊ฐ€ํ•˜๋Š” ๋ฐฉ๋ฒ•์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

 - name: fct_orders
   description: This table has basic information about orders, as well as some derived facts based on payments
   columns:
     - name: order_id
       tests:
         - unique # ะฟั€ะพะฒะตั€ะบะฐ ะฝะฐ ัƒะฝะธะบะฐะปัŒะฝะพัั‚ัŒ ะทะฝะฐั‡ะตะฝะธะน
         - not_null # ะฟั€ะพะฒะตั€ะบะฐ ะฝะฐ ะฝะฐะปะธั‡ะธะต null
       description: This is a unique identifier for an order
     - name: customer_id
       description: Foreign key to the customers table
       tests:
         - not_null
         - relationships: # ะฟั€ะพะฒะตั€ะบะฐ ััั‹ะปะพั‡ะฝะพะน ั†ะตะปะพัั‚ะฝะพัั‚ะธ
             to: ref('dim_customers')
             field: customer_id
     - name: order_date
       description: Date (UTC) that the order was placed
     - name: status
       description: '{{ doc("orders_status") }}'
       tests:
         - accepted_values: # ะฟั€ะพะฒะตั€ะบะฐ ะฝะฐ ะดะพะฟัƒัั‚ะธะผั‹ะต ะทะฝะฐั‡ะตะฝะธั
             values: ['placed', 'shipped', 'completed', 'return_pending', 'returned']

์ƒ์„ฑ๋œ ์›น์‚ฌ์ดํŠธ์—์„œ ์ด ๋ฌธ์„œ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

๋ฐ์ดํ„ฐ ๊ตฌ์ถ• ๋„๊ตฌ ๋˜๋Š” ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค์™€ ์Šค๋ฌด๋””์˜ ๊ณตํ†ต์ 

๋งคํฌ๋กœ ๋ฐ ๋ชจ๋“ˆ

DBT์˜ ๋ชฉ์ ์€ SQL ์Šคํฌ๋ฆฝํŠธ ์„ธํŠธ๊ฐ€ ๋˜๋Š” ๊ฒƒ์ด ์•„๋‹ˆ๋ผ ์‚ฌ์šฉ์ž์—๊ฒŒ ์ž์ฒด ๋ณ€ํ™˜์„ ๊ตฌ์ถ•ํ•˜๊ณ  ์ด๋Ÿฌํ•œ ๋ชจ๋“ˆ์„ ๋ฐฐํฌํ•  ์ˆ˜ ์žˆ๋Š” ๊ฐ•๋ ฅํ•˜๊ณ  ๊ธฐ๋Šฅ์ด ํ’๋ถ€ํ•œ ์ˆ˜๋‹จ์„ ์ œ๊ณตํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.

๋งคํฌ๋กœ๋Š” ๋ชจ๋ธ ๋‚ด์—์„œ ํ•จ์ˆ˜๋กœ ํ˜ธ์ถœํ•  ์ˆ˜ ์žˆ๋Š” ๊ตฌ๋ฌธ ๋ฐ ํ‘œํ˜„์‹ ์„ธํŠธ์ž…๋‹ˆ๋‹ค. ๋งคํฌ๋กœ๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด DRY(Don't Repeat Yourself) ์—”์ง€๋‹ˆ์–ด๋ง ์›์น™์— ๋”ฐ๋ผ ๋ชจ๋ธ๊ณผ ํ”„๋กœ์ ํŠธ ๊ฐ„์— SQL์„ ์žฌ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๋งคํฌ๋กœ ์˜ˆ:

{% macro rename_category(column_name) %}
case
 when {{ column_name }} ilike  '%osx%' then 'osx'
 when {{ column_name }} ilike  '%android%' then 'android'
 when {{ column_name }} ilike  '%ios%' then 'ios'
 else 'other'
end as renamed_product
{% endmacro %}

์‚ฌ์šฉ๋ฒ•์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

{% set column_name = 'product' %}
select
 product,
 {{ rename_category(column_name) }} -- ะฒั‹ะทะพะฒ ะผะฐะบั€ะพัะฐ
from my_table

DBT์—๋Š” ์‚ฌ์šฉ์ž๊ฐ€ ๊ฐœ๋ณ„ ๋ชจ๋“ˆ๊ณผ ๋งคํฌ๋กœ๋ฅผ ๊ฒŒ์‹œํ•˜๊ณ  ์žฌ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ํŒจํ‚ค์ง€ ๊ด€๋ฆฌ์ž๊ฐ€ ํ•จ๊ป˜ ์ œ๊ณต๋ฉ๋‹ˆ๋‹ค.

์ด๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์€ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ๋กœ๋“œํ•˜๊ณ  ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Œ์„ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค.

  • dbt_utils: ๋‚ ์งœ/์‹œ๊ฐ„, ๋Œ€๋ฆฌ ํ‚ค, ์Šคํ‚ค๋งˆ ํ…Œ์ŠคํŠธ, ํ”ผ๋ฒ—/ํ”ผ๋ฒ— ํ•ด์ œ ๋ฐ ๊ธฐํƒ€ ์ž‘์—…
  • ๋‹ค์Œ๊ณผ ๊ฐ™์€ ์„œ๋น„์Šค๋ฅผ ์œ„ํ•œ ๊ธฐ์„ฑํ’ˆ ์‡ผ์ผ€์ด์Šค ํ…œํ”Œ๋ฆฟ ์ œ์„ค๊ธฐ ะธ ์ŠคํŠธ๋ผ์ดํ”„ 
  • ํŠน์ • ๋ฐ์ดํ„ฐ ์ €์žฅ์†Œ๋ฅผ ์œ„ํ•œ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ(์˜ˆ: ์ ์ƒ‰ ํŽธ์ด 
  • ๋กœ๊น… โ€” DBT ์ž‘์—…์„ ๋กœ๊น…ํ•˜๋Š” ๋ชจ๋“ˆ

์ „์ฒด ํŒจํ‚ค์ง€ ๋ชฉ๋ก์€ ๋‹ค์Œ์—์„œ ํ™•์ธํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. DBT ํ—ˆ๋ธŒ.

๋” ๋งŽ์€ ๊ธฐ๋Šฅ

์—ฌ๊ธฐ์—์„œ๋Š” ํŒ€๊ณผ ์ œ๊ฐ€ ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค๋ฅผ ๊ตฌ์ถ•ํ•˜๋Š” ๋ฐ ์‚ฌ์šฉํ•˜๋Š” ๋ช‡ ๊ฐ€์ง€ ๋‹ค๋ฅธ ํฅ๋ฏธ๋กœ์šด ๊ธฐ๋Šฅ๊ณผ ๊ตฌํ˜„์— ๋Œ€ํ•ด ์„ค๋ช…ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค. Wheely.

๋Ÿฐํƒ€์ž„ ํ™˜๊ฒฝ ๋ถ„๋ฆฌ DEV - TEST - PROD

๋™์ผํ•œ DWH ํด๋Ÿฌ์Šคํ„ฐ ๋‚ด์—์„œ๋„(๋‹ค๋ฅธ ๊ตฌ์„ฑํ‘œ ๋‚ด์—์„œ). ์˜ˆ๋ฅผ ๋“ค์–ด ๋‹ค์Œ ํ‘œํ˜„์‹์„ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

with source as (
 
   select * from {{ source('salesforce', 'users') }}
   where 1=1
   {%- if target.name in ['dev', 'test', 'ci'] -%}           
       where timestamp >= dateadd(day, -3, current_date)   
   {%- endif -%}
 
)

์ด ์ฝ”๋“œ๋Š” ๋ฌธ์ž ๊ทธ๋Œ€๋กœ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๋งํ•ฉ๋‹ˆ๋‹ค. ๊ฐœ๋ฐœ, ํ…Œ์ŠคํŠธ, CI ์ง€๋‚œ 3์ผ ๋™์•ˆ์˜ ๋ฐ์ดํ„ฐ๋งŒ ๊ฐ€์ ธ์˜ค๊ณ  ๊ทธ ์ด์ƒ์€ ๊ฐ€์ ธ์˜ค์ง€ ์•Š์Šต๋‹ˆ๋‹ค. ์ฆ‰, ์ด๋Ÿฌํ•œ ํ™˜๊ฒฝ์—์„œ ์‹คํ–‰ํ•˜๋ฉด ํ›จ์”ฌ ๋” ๋น ๋ฅด๊ณ  ๋” ์ ์€ ๋ฆฌ์†Œ์Šค๊ฐ€ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. ํ™˜๊ฒฝ์—์„œ ์‹คํ–‰ํ•  ๋•Œ ์ฐŒ๋ฅด๋‹ค ํ•„ํ„ฐ ์กฐ๊ฑด์€ ๋ฌด์‹œ๋ฉ๋‹ˆ๋‹ค.

๋Œ€์ฒด ์—ด ์ธ์ฝ”๋”ฉ์„ ํ†ตํ•œ ๊ตฌ์ฒดํ™”

Redshift๋Š” ๊ฐ ๊ฐœ๋ณ„ ์—ด์— ๋Œ€ํ•œ ๋ฐ์ดํ„ฐ ์••์ถ• ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์„ค์ •ํ•  ์ˆ˜ ์žˆ๋Š” ์—ด ๊ธฐ๋ฐ˜ DBMS์ž…๋‹ˆ๋‹ค. ์ตœ์ ์˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์„ ํƒํ•˜๋ฉด ๋””์Šคํฌ ๊ณต๊ฐ„์„ 20~50%๊นŒ์ง€ ์ค„์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๋งคํฌ๋กœ redshift.compress_table ANALYZE COMPRESSION ๋ช…๋ น์„ ์‹คํ–‰ํ•˜๊ณ , ๊ถŒ์žฅ ์—ด ์ธ์ฝ”๋”ฉ ์•Œ๊ณ ๋ฆฌ์ฆ˜, ์ง€์ •๋œ ๋ถ„ํ•  ํ‚ค(dist_key) ๋ฐ ์ •๋ ฌ ํ‚ค(sort_key)๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ƒˆ ํ…Œ์ด๋ธ”์„ ์ƒ์„ฑํ•˜๊ณ , ๋ฐ์ดํ„ฐ๋ฅผ ํ•ด๋‹น ํ…Œ์ด๋ธ”๋กœ ์ „์†กํ•˜๊ณ , ํ•„์š”ํ•œ ๊ฒฝ์šฐ ์ด์ „ ๋ณต์‚ฌ๋ณธ์„ ์‚ญ์ œํ•ฉ๋‹ˆ๋‹ค.

๋งคํฌ๋กœ ์„œ๋ช…:

{{ compress_table(schema, table,
                 drop_backup=False,
                 comprows=none|Integer,
                 sort_style=none|compound|interleaved,
                 sort_keys=none|List<String>,
                 dist_style=none|all|even,
                 dist_key=none|String) }}

๋กœ๊น… ๋ชจ๋ธ ์‹คํ–‰

๋ชจ๋ธ์˜ ๊ฐ ์‹คํ–‰์— ํ›„ํฌ๋ฅผ ์—ฐ๊ฒฐํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์ด๋Š” ์ถœ์‹œ ์ „์ด๋‚˜ ๋ชจ๋ธ ์ƒ์„ฑ์ด ์™„๋ฃŒ๋œ ์งํ›„์— ์‹คํ–‰๋ฉ๋‹ˆ๋‹ค.

   pre-hook: "{{ logging.log_model_start_event() }}"
   post-hook: "{{ logging.log_model_end_event() }}"

๋กœ๊น… ๋ชจ๋“ˆ์„ ์‚ฌ์šฉํ•˜๋ฉด ํ•„์š”ํ•œ ๋ชจ๋“  ๋ฉ”ํƒ€๋ฐ์ดํ„ฐ๋ฅผ ๋ณ„๋„์˜ ํ…Œ์ด๋ธ”์— ๊ธฐ๋กํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์ดํ›„์— ๋ณ‘๋ชฉ ํ˜„์ƒ์„ ๊ฐ์‚ฌํ•˜๊ณ  ๋ถ„์„ํ•˜๋Š” ๋ฐ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Looker์˜ ๋กœ๊น… ๋ฐ์ดํ„ฐ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•œ ๋Œ€์‹œ๋ณด๋“œ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

๋ฐ์ดํ„ฐ ๊ตฌ์ถ• ๋„๊ตฌ ๋˜๋Š” ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค์™€ ์Šค๋ฌด๋””์˜ ๊ณตํ†ต์ 

์Šคํ† ๋ฆฌ์ง€ ์œ ์ง€ ๊ด€๋ฆฌ ์ž๋™ํ™”

UDF(์‚ฌ์šฉ์ž ์ •์˜ ํ•จ์ˆ˜)์™€ ๊ฐ™์€ ์‚ฌ์šฉ๋œ ๋ฆฌํฌ์ง€ํ† ๋ฆฌ ๊ธฐ๋Šฅ์˜ ์ผ๋ถ€ ํ™•์žฅ์„ ์‚ฌ์šฉํ•˜๋Š” ๊ฒฝ์šฐ ์ด๋Ÿฌํ•œ ๊ธฐ๋Šฅ์˜ ๋ฒ„์ „ ๊ด€๋ฆฌ, ์•ก์„ธ์Šค ์ œ์–ด ๋ฐ ์ƒˆ ๋ฆด๋ฆฌ์Šค์˜ ์ž๋™ ๋กค์•„์›ƒ์ด DBT์—์„œ ๋งค์šฐ ํŽธ๋ฆฌํ•ฉ๋‹ˆ๋‹ค.

์šฐ๋ฆฌ๋Š” Python์—์„œ UDF๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ•ด์‹œ, ์ด๋ฉ”์ผ ๋„๋ฉ”์ธ ๋ฐ ๋น„ํŠธ๋งˆ์Šคํฌ ๋””์ฝ”๋”ฉ์„ ๊ณ„์‚ฐํ•ฉ๋‹ˆ๋‹ค.

๋ชจ๋“  ์‹คํ–‰ ํ™˜๊ฒฝ(dev, test, prod)์—์„œ UDF๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๋งคํฌ๋กœ์˜ ์˜ˆ:

{% macro create_udf() -%}
 
 {% set sql %}
       CREATE OR REPLACE FUNCTION {{ target.schema }}.f_sha256(mes "varchar")
           RETURNS varchar
           LANGUAGE plpythonu
           STABLE
       AS $$  
           import hashlib
           return hashlib.sha256(mes).hexdigest()
       $$
       ;
 {% endset %}
  
 {% set table = run_query(sql) %}
 
{%- endmacro %}

Wheely์—์„œ๋Š” PostgreSQL์„ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•˜๋Š” Amazon Redshift๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. Redshift์˜ ๊ฒฝ์šฐ ์ •๊ธฐ์ ์œผ๋กœ ํ…Œ์ด๋ธ”์— ๋Œ€ํ•œ ํ†ต๊ณ„๋ฅผ ์ˆ˜์ง‘ํ•˜๊ณ  ๋””์Šคํฌ ๊ณต๊ฐ„์„ ํ™•๋ณดํ•˜๋Š” ๊ฒƒ์ด ์ค‘์š”ํ•ฉ๋‹ˆ๋‹ค(๊ฐ๊ฐ ANALYZE ๋ฐ VACUUM ๋ช…๋ น).

์ด๋ฅผ ์œ„ํ•ด redshift_maintenance ๋งคํฌ๋กœ์˜ ๋ช…๋ น์ด ๋งค์ผ ๋ฐค ์‹คํ–‰๋ฉ๋‹ˆ๋‹ค.

{% macro redshift_maintenance() %}
 
   {% set vacuumable_tables=run_query(vacuumable_tables_sql) %}
 
   {% for row in vacuumable_tables %}
       {% set message_prefix=loop.index ~ " of " ~ loop.length %}
 
       {%- set relation_to_vacuum = adapter.get_relation(
                                               database=row['table_database'],
                                               schema=row['table_schema'],
                                               identifier=row['table_name']
                                   ) -%}
       {% do run_query("commit") %}
 
       {% if relation_to_vacuum %}
           {% set start=modules.datetime.datetime.now() %}
           {{ dbt_utils.log_info(message_prefix ~ " Vacuuming " ~ relation_to_vacuum) }}
           {% do run_query("VACUUM " ~ relation_to_vacuum ~ " BOOST") %}
           {{ dbt_utils.log_info(message_prefix ~ " Analyzing " ~ relation_to_vacuum) }}
           {% do run_query("ANALYZE " ~ relation_to_vacuum) %}
           {% set end=modules.datetime.datetime.now() %}
           {% set total_seconds = (end - start).total_seconds() | round(2)  %}
           {{ dbt_utils.log_info(message_prefix ~ " Finished " ~ relation_to_vacuum ~ " in " ~ total_seconds ~ "s") }}
       {% else %}
           {{ dbt_utils.log_info(message_prefix ~ ' Skipping relation "' ~ row.values() | join ('"."') ~ '" as it does not exist') }}
       {% endif %}
 
   {% endfor %}
 
{% endmacro %}

DBT ํด๋ผ์šฐ๋“œ

DBT๋ฅผ ์„œ๋น„์Šค(Managed Service)๋กœ ํ™œ์šฉํ•˜๋Š” ๊ฒƒ์ด ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค. ํฌํ•จ:

  • ํ”„๋กœ์ ํŠธ ๋ฐ ๋ชจ๋ธ ๊ฐœ๋ฐœ์„ ์œ„ํ•œ Web IDE
  • ์ž‘์—… ๊ตฌ์„ฑ ๋ฐ ์˜ˆ์•ฝ
  • ๊ฐ„๋‹จํ•˜๊ณ  ํŽธ๋ฆฌํ•œ ๋กœ๊ทธ ์•ก์„ธ์Šค
  • ํ”„๋กœ์ ํŠธ ๋ฌธ์„œ๊ฐ€ ํฌํ•จ๋œ ์›น์‚ฌ์ดํŠธ
  • CI ์—ฐ๊ฒฐ(์ง€์†์  ํ†ตํ•ฉ)

๋ฐ์ดํ„ฐ ๊ตฌ์ถ• ๋„๊ตฌ ๋˜๋Š” ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค์™€ ์Šค๋ฌด๋””์˜ ๊ณตํ†ต์ 

๊ฒฐ๋ก 

DWH๋ฅผ ์ค€๋น„ํ•˜๊ณ  ์„ญ์ทจํ•˜๋Š” ๊ฒƒ์€ ์Šค๋ฌด๋””๋ฅผ ๋งˆ์‹œ๋Š” ๊ฒƒ๋งŒํผ ์ฆ๊ฒ๊ณ  ์œ ์ตํ•ฉ๋‹ˆ๋‹ค. DBT๋Š” Jinja, ์‚ฌ์šฉ์ž ํ™•์žฅ(๋ชจ๋“ˆ), ์ปดํŒŒ์ผ๋Ÿฌ, ์‹คํ–‰๊ธฐ, ํŒจํ‚ค์ง€ ๊ด€๋ฆฌ์ž๋กœ ๊ตฌ์„ฑ๋ฉ๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ์š”์†Œ๋ฅผ ๊ฒฐํ•ฉํ•˜๋ฉด ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค๋ฅผ ์œ„ํ•œ ์™„๋ฒฝํ•œ ์ž‘์—… ํ™˜๊ฒฝ์„ ์–ป์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์˜ค๋Š˜๋‚  DWH ๋‚ด์—์„œ ํ˜์‹ ์„ ๊ด€๋ฆฌํ•˜๋Š” ๋ฐ ์ด๋ณด๋‹ค ๋” ์ข‹์€ ๋ฐฉ๋ฒ•์€ ์—†์Šต๋‹ˆ๋‹ค.

๋ฐ์ดํ„ฐ ๊ตฌ์ถ• ๋„๊ตฌ ๋˜๋Š” ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์Šค์™€ ์Šค๋ฌด๋””์˜ ๊ณตํ†ต์ 

DBT ๊ฐœ๋ฐœ์ž๊ฐ€ ๋”ฐ๋ฅด๋Š” ์‹ ๋…์€ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๊ณต์‹ํ™”๋ฉ๋‹ˆ๋‹ค.

  • GUI๊ฐ€ ์•„๋‹Œ ์ฝ”๋“œ๋Š” ๋ณต์žกํ•œ ๋ถ„์„ ๋…ผ๋ฆฌ๋ฅผ ํ‘œํ˜„ํ•˜๋Š” ๋ฐ ๊ฐ€์žฅ ์ ํ•ฉํ•œ ์ถ”์ƒํ™”์ž…๋‹ˆ๋‹ค.
  • ๋ฐ์ดํ„ฐ ์ž‘์—…์€ ์†Œํ”„ํŠธ์›จ์–ด ์—”์ง€๋‹ˆ์–ด๋ง(์†Œํ”„ํŠธ์›จ์–ด ์—”์ง€๋‹ˆ์–ด๋ง)์˜ ๋ชจ๋ฒ” ์‚ฌ๋ก€๋ฅผ ์ ์šฉํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

  • ์ค‘์š”ํ•œ ๋ฐ์ดํ„ฐ ์ธํ”„๋ผ๋Š” ์˜คํ”ˆ ์†Œ์Šค ์†Œํ”„ํŠธ์›จ์–ด๋กœ์„œ ์‚ฌ์šฉ์ž ์ปค๋ฎค๋‹ˆํ‹ฐ์— ์˜ํ•ด ์ œ์–ด๋˜์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
  • ๋ถ„์„ ๋„๊ตฌ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ์ฝ”๋“œ๋„ ์ ์ฐจ ์˜คํ”ˆ ์†Œ์Šค ์ปค๋ฎค๋‹ˆํ‹ฐ์˜ ์ž์‚ฐ์ด ๋  ๊ฒƒ์ž…๋‹ˆ๋‹ค.

์ด๋Ÿฌํ•œ ํ•ต์‹ฌ ์‹ ๋…์€ ์˜ค๋Š˜๋‚  850๊ฐœ ์ด์ƒ์˜ ํšŒ์‚ฌ์—์„œ ์‚ฌ์šฉ๋˜๋Š” ์ œํ’ˆ์„ ํƒ„์ƒ์‹œ์ผฐ์œผ๋ฉฐ, ๋ฏธ๋ž˜์— ๋งŒ๋“ค์–ด์งˆ ๋งŽ์€ ํฅ๋ฏธ๋กœ์šด ํ™•์žฅ์˜ ๊ธฐ์ดˆ๋ฅผ ํ˜•์„ฑํ•ฉ๋‹ˆ๋‹ค.

๊ด€์‹ฌ ์žˆ๋Š” ๋ถ„๋“ค์„ ์œ„ํ•ด ์ œ๊ฐ€ ๋ช‡ ๋‹ฌ ์ „ OTUS์—์„œ ์—ด๋ฆฐ ๊ณต๊ฐœ ๋ ˆ์Šจ์˜ ์ผํ™˜์œผ๋กœ ์ง„ํ–‰ํ•œ ๊ณต๊ฐœ ๋ ˆ์Šจ ์˜์ƒ์ด ์žˆ์Šต๋‹ˆ๋‹ค. Amazon Redshift ์Šคํ† ๋ฆฌ์ง€์šฉ ๋ฐ์ดํ„ฐ ๊ตฌ์ถ• ๋„๊ตฌ.

DBT ๋ฐ ๋ฐ์ดํ„ฐ ์›จ์–ดํ•˜์šฐ์ง• ์™ธ์—๋„ OTUS ํ”Œ๋žซํผ์˜ ๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด ๊ณผ์ •์˜ ์ผํ™˜์œผ๋กœ ๋™๋ฃŒ์™€ ์ €๋Š” ๊ธฐํƒ€ ๊ด€๋ จ ์žˆ๊ณ  ํ˜„๋Œ€์ ์ธ ์ฃผ์ œ์— ๋Œ€ํ•ด ๊ฐ•์˜ํ•ฉ๋‹ˆ๋‹ค.

  • ๋น… ๋ฐ์ดํ„ฐ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์„ ์œ„ํ•œ ์•„ํ‚คํ…์ฒ˜ ๊ฐœ๋…
  • Spark ๋ฐ Spark Streaming ์‹ค์Šต
  • ๋ฐ์ดํ„ฐ ์†Œ์Šค ๋กœ๋“œ ๋ฐฉ๋ฒ• ๋ฐ ๋„๊ตฌ ํƒ์ƒ‰
  • DWH์—์„œ ๋ถ„์„ ์‡ผ์ผ€์ด์Šค ๊ตฌ์ถ•
  • NoSQL ๊ฐœ๋…: HBase, Cassandra, ElasticSearch
  • ๋ชจ๋‹ˆํ„ฐ๋ง ๋ฐ ์กฐ์ •์˜ ์›์น™ 
  • ์ตœ์ข… ํ”„๋กœ์ ํŠธ: ๋ฉ˜ํ† ๋ง ์ง€์›์„ ํ†ตํ•ด ๋ชจ๋“  ๊ธฐ์ˆ ์„ ํ•˜๋‚˜๋กœ ๋ชจ์œผ๊ธฐ

๋งํฌ :

  1. DBT ๋ฌธ์„œ - ์†Œ๊ฐœ โ€” ๊ณต์‹ ๋ฌธ์„œ
  2. ์ •ํ™•ํžˆ dbt๋ž€ ๋ฌด์—‡์ธ๊ฐ€์š”? โ€” DBT ์ €์ž ์ค‘ ํ•œ ์‚ฌ๋žŒ์˜ ๋ฆฌ๋ทฐ ๊ธฐ์‚ฌ 
  3. Amazon Redshift ์Šคํ† ๋ฆฌ์ง€์šฉ ๋ฐ์ดํ„ฐ ๊ตฌ์ถ• ๋„๊ตฌ โ€” YouTube, OTUS ์˜คํ”ˆ ๋ ˆ์Šจ ๋…นํ™”
  4. ๊ทธ๋ฆฐํ”Œ๋Ÿผ ์•Œ์•„๋ณด๊ธฐ โ€” ๋‹ค์Œ ๊ณต๊ฐœ ๊ฐ•์˜๋Š” 15๋…„ 2020์›” XNUMX์ผ์ž…๋‹ˆ๋‹ค.
  5. ๋ฐ์ดํ„ฐ ์—”์ง€๋‹ˆ์–ด๋ง ์ฝ”์Šค โ€”OTUS
  6. ์„ฑ์ˆ™ํ•œ ๋ถ„์„ ์›Œํฌํ”Œ๋กœ ๊ตฌ์ถ• โ€” ๋ฐ์ดํ„ฐ ๋ฐ ๋ถ„์„์˜ ๋ฏธ๋ž˜ ์‚ดํŽด๋ณด๊ธฐ
  7. ์ด์ œ ์˜คํ”ˆ ์†Œ์Šค ๋ถ„์„์ด ํ•„์š”ํ•œ ๋•Œ์ž…๋‹ˆ๋‹ค. โ€” ๋ถ„์„์˜ ์ง„ํ™”์™€ ์˜คํ”ˆ ์†Œ์Šค์˜ ์˜ํ–ฅ
  8. dbtCloud๋ฅผ ์‚ฌ์šฉํ•œ ์ง€์†์ ์ธ ํ†ตํ•ฉ ๋ฐ ์ž๋™ํ™”๋œ ๋นŒ๋“œ ํ…Œ์ŠคํŠธ โ€” DBT๋ฅผ ํ™œ์šฉํ•œ CI ๊ตฌ์ถ• ์›์น™
  9. DBT ํŠœํ† ๋ฆฌ์–ผ ์‹œ์ž‘ํ•˜๊ธฐ โ€” ์—ฐ์Šต, ๋…๋ฆฝ์ ์ธ ์ž‘์—…์„ ์œ„ํ•œ ๋‹จ๊ณ„๋ณ„ ์ง€์นจ
  10. Jaffle ์ƒ์  โ€” Github DBT Tutorial โ€” Github, ๊ต์œก ํ”„๋กœ์ ํŠธ ์ฝ”๋“œ

๊ฐ•์ขŒ์— ๋Œ€ํ•ด ์ž์„ธํžˆ ์•Œ์•„๋ณด์„ธ์š”.

์ถœ์ฒ˜ : habr.com

์ฝ”๋ฉ˜ํŠธ๋ฅผ ์ถ”๊ฐ€