What principles is an ideal data warehouse built on?
A focus on business value and analytics, with no boilerplate code. Managing the DWH as a codebase: versioning, review, automated testing, and CI. Modularity, extensibility, open source, and community. User-friendly documentation and dependency visualization (data lineage).
Read on for more about all of this and about the role of DBT in the Big Data & Analytics ecosystem.
Hello everyone
This is Artemy Kozyr. For more than 5 years I have been working with data warehouses, building ETL/ELT pipelines, and doing data analytics and visualization. I currently work at
Brief overview
The DBT framework is all about the T in the ELT (Extract - Load - Transform) acronym.
With the arrival of performant, scalable analytical databases such as BigQuery, Redshift, and Snowflake, there is no longer any point in doing transformations outside the data warehouse.
DBT does not extract data from sources, but it offers great capabilities for working with data that has already been loaded into the warehouse (into internal or external storage).
The main purpose of DBT is to take code, compile it into SQL, and execute the commands in the warehouse in the correct order.
DBT project structure
A project consists of directories and files of just 2 types:
- Model (.sql) - a unit of transformation expressed as a SELECT query
- Configuration file (.yml) - parameters, settings, tests, documentation
At a basic level, the work is organized as follows:
- The user writes model code in any convenient IDE
- Models are run from the CLI; DBT compiles the model code into SQL
- The compiled SQL is executed in the warehouse in a defined order (a graph)
[Figure: a model run launched from the CLI]
Everything is a SELECT
This is the killer feature of the Data Build Tool framework. In other words, DBT abstracts away all the code needed to materialize your queries in the warehouse (variations of the CREATE, INSERT, UPDATE, DELETE, ALTER, GRANT, ... commands).
Every model is written as a single SELECT query that defines the resulting dataset.
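To make the idea concrete, here is a minimal sketch in Python of wrapping a bare SELECT in the DDL that persists it. It is not how DBT is implemented internally; the standard-library sqlite3 module stands in for the warehouse, and the materialize helper and the placed_orders name are hypothetical.

```python
import sqlite3

def materialize(conn, name, select_sql, materialized="view"):
    """Wrap a bare SELECT in the DDL that persists it (hypothetical helper)."""
    if materialized not in ("table", "view"):
        raise ValueError(f"unsupported materialization: {materialized}")
    kind = materialized.upper()
    # Drop any previous version, then recreate the relation from the SELECT.
    conn.execute(f"DROP {kind} IF EXISTS {name}")
    conn.execute(f"CREATE {kind} {name} AS {select_sql}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_orders (order_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO stg_orders VALUES (?, ?)",
    [(1, "placed"), (2, "returned"), (3, "placed")],
)

# The "model" is nothing more than a SELECT statement.
model_sql = "SELECT order_id FROM stg_orders WHERE status = 'placed'"
materialize(conn, "placed_orders", model_sql, materialized="table")

rows = conn.execute("SELECT order_id FROM placed_orders ORDER BY order_id").fetchall()
print(rows)
```

The model author only writes model_sql; switching the materialized argument from "table" to "view" changes how the result is stored without touching the query.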
The transformation logic can still be multi-level and consolidate data from several other models. An example of a model that builds an orders mart (f_orders):
{% set payment_methods = ['credit_card', 'coupon', 'bank_transfer', 'gift_card'] %}
with orders as (
    select * from {{ ref('stg_orders') }}
),
order_payments as (
    select * from {{ ref('order_payments') }}
),
final as (
    select
        orders.order_id,
        orders.customer_id,
        orders.order_date,
        orders.status,
        {% for payment_method in payment_methods -%}
        order_payments.{{payment_method}}_amount,
        {% endfor -%}
        order_payments.total_amount as amount
    from orders
    left join order_payments using (order_id)
)
select * from final
What is interesting here?
First: CTEs (Common Table Expressions) are used to organize and make sense of code that contains a lot of transformations and business logic.
Second: the model code is a mix of SQL and the Jinja templating language.
The example uses a for loop to generate an amount column for each payment method listed in the set expression. It also uses the ref function, which lets model code refer to other models:
- At compile time, ref is resolved to the target table or view in the warehouse
- ref makes it possible to build the model dependency graph
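The two roles of ref can be illustrated with a short Python sketch. This is a simplification under stated assumptions, not DBT's compiler: a regular expression replaces each ref call with a schema-qualified name and records the referenced model. The compile_refs function and the analytics schema are hypothetical.

```python
import re

# Matches {{ ref('model_name') }} with optional whitespace.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*'([^']+)'\s*\)\s*\}\}")

def compile_refs(model_sql, schema):
    """Replace each ref with schema.name and collect the referenced names."""
    dependencies = []

    def resolve(match):
        name = match.group(1)
        if name not in dependencies:
            dependencies.append(name)
        return f"{schema}.{name}"

    compiled = REF_PATTERN.sub(resolve, model_sql)
    return compiled, dependencies

model_sql = """
select * from {{ ref('stg_orders') }}
left join {{ ref('order_payments') }} using (order_id)
"""
compiled_sql, deps = compile_refs(model_sql, schema="analytics")
print(compiled_sql)
print(deps)
```

The compiled text is what would be sent to the warehouse, while the collected names are the edges that feed the dependency graph.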
Jinja constructs available in model code include:
- If / else statements - branching
- For loops
- Variables
- Macros - creating macros
Materialization: Table, View, Incremental
A materialization strategy is the approach by which a model's resulting dataset is stored in the warehouse.
The basic options are:
- Table - a physical table in the warehouse
- View - a view, that is, a virtual table in the warehouse
There are also more complex materialization strategies:
- Incremental - incremental loading (of large fact tables); new rows are added, changed rows are updated, deleted rows are removed
- Ephemeral - the model is not materialized directly, but is included as a CTE in other models
- Any other strategies you add yourself
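As a rough illustration of the incremental strategy, the sketch below upserts a new batch into an existing table by a unique key: rows whose key already exists are replaced, new rows are appended. It is one common way to do this (delete, then insert), written in Python against sqlite3 as a stand-in warehouse; the incremental_load helper and the new_batch table are hypothetical, and the handling of deleted source rows is left out.

```python
import sqlite3

def incremental_load(conn, target, source, unique_key):
    """Upsert rows from source into target (sketch of an incremental strategy)."""
    # Remove target rows that are about to be replaced by fresher versions...
    conn.execute(
        f"DELETE FROM {target} "
        f"WHERE {unique_key} IN (SELECT {unique_key} FROM {source})"
    )
    # ...then append both the changed and the brand-new rows.
    conn.execute(f"INSERT INTO {target} SELECT * FROM {source}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fct_orders (order_id INTEGER, status TEXT)")
conn.execute("CREATE TABLE new_batch (order_id INTEGER, status TEXT)")
conn.executemany("INSERT INTO fct_orders VALUES (?, ?)", [(1, "placed"), (2, "placed")])
conn.executemany("INSERT INTO new_batch VALUES (?, ?)", [(2, "shipped"), (3, "placed")])

incremental_load(conn, "fct_orders", "new_batch", unique_key="order_id")
result = conn.execute(
    "SELECT order_id, status FROM fct_orders ORDER BY order_id"
).fetchall()
print(result)
```

Order 1 is untouched, order 2 is updated to its new status, and order 3 is added, so only the new batch has to be processed on each run.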
In addition to materialization strategies, there are optimization options for specific warehouses, for example:
- Snowflake: transient tables, merge behavior, table clustering, copying grants, secure views
- Redshift: distkey, sortkey (interleaved, compound), late binding views
- BigQuery: table partitioning and clustering, merge behavior, KMS encryption, labels and tags
- Spark: file format (parquet, csv, json, orc, delta), partition_by, clustered_by, buckets, incremental_strategy
The following warehouses are currently supported:
- Postgres
- Redshift
- BigQuery
- Snowflake
- Presto (partially)
- Spark (partially)
- Microsoft SQL Server (community adapter)
Let's improve our model:
- Make its loading incremental (Incremental)
- Add distribution and sort keys for Redshift
-- Model configuration:
-- Incremental load, unique key for updating records (unique_key)
-- Distribution key (dist), sort key (sort)
{{
  config(
    materialized='incremental',
    unique_key='order_id',
    dist="customer_id",
    sort="order_date"
  )
}}
{% set payment_methods = ['credit_card', 'coupon', 'bank_transfer', 'gift_card'] %}
with orders as (
    select * from {{ ref('stg_orders') }}
    where 1=1
    {% if is_incremental() -%}
    -- This filter is applied only on an incremental run
    and order_date >= (select max(order_date) from {{ this }})
    {%- endif %}
),
order_payments as (
    select * from {{ ref('order_payments') }}
),
final as (
    select
        orders.order_id,
        orders.customer_id,
        orders.order_date,
        orders.status,
        {% for payment_method in payment_methods -%}
        order_payments.{{payment_method}}_amount,
        {% endfor -%}
        order_payments.total_amount as amount
    from orders
    left join order_payments using (order_id)
)
select * from final
Model dependency graph
Also known as the dependency tree, or the DAG (Directed Acyclic Graph).
DBT builds the graph from the configuration of all the project's models, or more precisely from the ref() links inside models to other models. Having a graph makes the following possible:
- Running models in the correct order
- Parallelizing the building of data marts
- Running an arbitrary subgraph
[Figure: example visualization of the dependency graph]
Each node of the graph is a model; the edges are defined by ref expressions.
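A small Python sketch shows how such a graph yields a run order and a subgraph selection. It uses graphlib from the standard library (Python 3.9+), not DBT's own scheduler; the model names other than those in the examples above, and the upstream_subgraph helper, are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical project: each model maps to the set of models it refers to.
dependencies = {
    "stg_orders": set(),
    "stg_payments": set(),
    "order_payments": {"stg_payments"},
    "fct_orders": {"stg_orders", "order_payments"},
}

def upstream_subgraph(graph, model):
    """Return the model plus everything it depends on, directly or indirectly."""
    selected, stack = set(), [model]
    while stack:
        node = stack.pop()
        if node not in selected:
            selected.add(node)
            stack.extend(graph[node])
    return {name: graph[name] for name in selected}

# A valid run order: every model comes after all of its dependencies.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)

# Running an arbitrary subgraph: only order_payments and its ancestors.
subgraph = upstream_subgraph(dependencies, "order_payments")
subgraph_order = list(TopologicalSorter(subgraph).static_order())
print(subgraph_order)
```

Models with no path between them, such as stg_orders and stg_payments here, have no ordering constraint, which is what makes parallel execution possible.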
Data quality and documentation
Besides building the models themselves, DBT lets you test a number of assumptions (assertions) about the resulting dataset, such as:
- Not null
- Unique
- Referential integrity (for example, customer_id in the orders table corresponds to id in the customers table)
- Matching a list of accepted values
You can also add your own tests (custom data tests), for example the % deviation of revenue from the figures of a day, a week, or a month ago. Any assumption formulated as an SQL query can become a test.
This way you can catch unwanted deviations and data errors in the warehouse marts.
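The idea that any SQL query can become a test is easy to sketch: each test selects the rows that violate an assumption and passes when nothing comes back. The Python example below runs four such queries against sqlite3 as a stand-in warehouse; the tables and sample rows are made up for illustration and the queries are my own, not the ones DBT generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, status TEXT);
    INSERT INTO customers VALUES (10), (20);
    INSERT INTO orders VALUES (1, 10, 'placed'), (2, 20, 'shipped'),
                              (2, 99, 'lost'), (NULL, 10, 'placed');
""")

# Each test is a query that selects the rows violating an assumption.
tests = {
    "not_null": "SELECT * FROM orders WHERE order_id IS NULL",
    "unique": """SELECT order_id FROM orders WHERE order_id IS NOT NULL
                 GROUP BY order_id HAVING COUNT(*) > 1""",
    "relationships": """SELECT o.* FROM orders o
                        LEFT JOIN customers c ON c.customer_id = o.customer_id
                        WHERE c.customer_id IS NULL""",
    "accepted_values": """SELECT * FROM orders
                          WHERE status NOT IN ('placed', 'shipped', 'completed')""",
}

# A test passes when its query returns no rows.
failures = {name: len(conn.execute(sql).fetchall()) for name, sql in tests.items()}
print(failures)
```

With the deliberately dirty sample data every test reports one failing row, which is the kind of signal that would stop a bad load from reaching the marts.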
As for documentation, DBT provides mechanisms for adding, versioning, and distributing metadata and comments at the model level and even the attribute level.
Here is how adding tests and documentation looks in a configuration file:
- name: fct_orders
  description: This table has basic information about orders, as well as some derived facts based on payments
  columns:
    - name: order_id
      tests:
        - unique # checks that values are unique
        - not_null # checks that no value is null
      description: This is a unique identifier for an order
    - name: customer_id
      description: Foreign key to the customers table
      tests:
        - not_null
        - relationships: # referential integrity check
            to: ref('dim_customers')
            field: customer_id
    - name: order_date
      description: Date (UTC) that the order was placed
    - name: status
      description: '{{ doc("orders_status") }}'
      tests:
        - accepted_values: # checks against the list of accepted values
            values: ['placed', 'shipped', 'completed', 'return_pending', 'returned']
[Figure: this documentation as it appears on the generated website]
Macros and modules
The purpose of DBT is not so much to be a set of SQL scripts as to give users powerful, feature-rich tools for building their own transformations and distributing them as modules.
Macros are sets of constructs and expressions that can be called like functions inside models. Macros let you reuse SQL between models and projects, in line with the DRY (Don't Repeat Yourself) engineering principle.
Macro example:
{% macro rename_category(column_name) %}
case
    when {{ column_name }} ilike '%osx%' then 'osx'
    when {{ column_name }} ilike '%android%' then 'android'
    when {{ column_name }} ilike '%ios%' then 'ios'
    else 'other'
end as renamed_product
{% endmacro %}
And its usage:
{% set column_name = 'product' %}
select
    product,
    {{ rename_category(column_name) }} -- macro call
from my_table
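To see what the macro call expands to, here is a sketch in Python that builds the same CASE expression as a string and runs it. It only imitates the macro: sqlite3 stands in for the warehouse, LIKE replaces ILIKE (SQLite has no ILIKE, and its LIKE is case-insensitive for ASCII text), and the sample product values are invented.

```python
import sqlite3

def rename_category(column_name):
    """Return the SQL fragment that the macro would expand to (sketch)."""
    return (
        "case"
        f" when {column_name} like '%osx%' then 'osx'"
        f" when {column_name} like '%android%' then 'android'"
        f" when {column_name} like '%ios%' then 'ios'"
        " else 'other'"
        " end as renamed_product"
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (product TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?)",
    [("Mac OSX",), ("Android 10",), ("iOS 13",), ("Windows",)],
)

# The column name is substituted into the template before the query runs.
query = f"select product, {rename_category('product')} from my_table order by rowid"
renamed = [row[1] for row in conn.execute(query)]
print(renamed)
```

Because the column name is a parameter, the same fragment can be reused in any model that needs this categorization.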
DBT comes with a package manager that lets users publish and reuse individual modules and macros.
This means you can download and use libraries such as:
- dbt_utils: working with date/time, surrogate keys, schema tests, pivot/unpivot, and more
- Ready-made mart templates for services such as Snowplow and Stripe
- Libraries for specific data warehouses, e.g. Redshift
- Logging - a module for logging DBT runs
A full list of packages can be found at
Even more features
Here I will describe a few other interesting features and implementations that my team and I use to build a data warehouse at
Separating the DEV - TEST - PROD runtime environments
Even within a single DWH cluster (across different schemas). For example, using the following expression:
with source as (
    select * from {{ source('salesforce', 'users') }}
    where 1=1
    {% if target.name in ['dev', 'test', 'ci'] %}
        and timestamp >= dateadd(day, -3, current_date)
    {% endif %}
)
This code literally says: for the dev, test, and ci environments, take only the last 3 days of data and no more. Runs in these environments are therefore much faster and need fewer resources. When running in the prod environment, the filter condition is ignored.
Materialization with alternative column encoding
Redshift is a columnar DBMS that lets you set a compression algorithm for each individual column. Choosing optimal algorithms can reduce disk usage by 20-50%.
Signature of the compress_table macro:
{{ compress_table(schema, table,
drop_backup=False,
comprows=none|Integer,
sort_style=none|compound|interleaved,
sort_keys=none|List<String>,
dist_style=none|all|even,
dist_key=none|String) }}
Logging model runs
You can attach hooks to each model execution; they run before the model starts or right after it has been built:
pre-hook: "{{ logging.log_model_start_event() }}"
post-hook: "{{ logging.log_model_end_event() }}"
The logging module lets you record all the necessary metadata in a separate table, which can later be used to audit and analyze bottlenecks.
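The hook mechanism can be pictured with a short Python sketch: one event is written to an audit table before the model is built and one after. This is an illustration only, with sqlite3 as a stand-in warehouse; the audit_log table, the log_event and run_model helpers, and the event names are hypothetical and are not the logging package's actual schema.

```python
import sqlite3
from datetime import datetime, timezone

def log_event(conn, model, event):
    """Append one audit record (hypothetical stand-in for the logging macros)."""
    conn.execute(
        "INSERT INTO audit_log (model, event, logged_at) VALUES (?, ?, ?)",
        (model, event, datetime.now(timezone.utc).isoformat()),
    )

def run_model(conn, name, select_sql):
    """Build a model, wrapped in a pre-hook and a post-hook."""
    log_event(conn, name, "model_start")  # pre-hook
    conn.execute(f"CREATE TABLE {name} AS {select_sql}")
    log_event(conn, name, "model_end")    # post-hook

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audit_log (model TEXT, event TEXT, logged_at TEXT)")
run_model(conn, "fct_orders", "SELECT 1 AS order_id")

events = conn.execute("SELECT model, event FROM audit_log ORDER BY rowid").fetchall()
print(events)
```

Pairing the start and end timestamps per model gives run durations, which is the raw material for finding slow models.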
[Figure: a Looker dashboard built on the logging data]
Automating warehouse maintenance
If you use extensions of the warehouse's functionality, such as UDFs (User Defined Functions), then DBT is a very convenient place to version these functions, manage access to them, and automate the rollout of new releases.
We use Python UDFs to compute hashes, extract email domains, and decode bitmasks.
An example of a macro that creates a UDF in any runtime environment (dev, test, prod):
{% macro create_udf() -%}
    {% set sql %}
        CREATE OR REPLACE FUNCTION {{ target.schema }}.f_sha256(mes "varchar")
            RETURNS varchar
            LANGUAGE plpythonu
            STABLE
        AS $$
            import hashlib
            return hashlib.sha256(mes).hexdigest()
        $$
        ;
    {% endset %}
    {% set table = run_query(sql) %}
{%- endmacro %}
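The body of this UDF can be tried locally. The sketch below is a Python 3 counterpart, offered as an assumption about how one might test the logic outside the warehouse: as far as I know, Redshift's plpythonu runs Python 2, where a string can be passed to hashlib directly, while Python 3 requires encoding the text to bytes first.

```python
import hashlib

def f_sha256(mes: str) -> str:
    """Python 3 counterpart of the UDF body: hex-encoded SHA-256 of a string."""
    # hashlib works on bytes in Python 3, so the text is encoded first.
    return hashlib.sha256(mes.encode("utf-8")).hexdigest()

digest = f_sha256("user@example.com")
print(len(digest))  # 64 hex characters
```

The function is deterministic, so the same input always produces the same digest, which is what makes it usable for hashing identifiers consistently across environments.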
At Wheely we use Amazon Redshift, which is based on PostgreSQL. For Redshift it is important to regularly collect table statistics and free up disk space, using the ANALYZE and VACUUM commands respectively.
To do this, the commands from the redshift_maintenance macro are executed every night:
{% macro redshift_maintenance() %}
    {# vacuumable_tables_sql is not defined in this excerpt; it is assumed to hold a query listing the tables to vacuum #}
    {% set vacuumable_tables=run_query(vacuumable_tables_sql) %}
    {% for row in vacuumable_tables %}
        {% set message_prefix=loop.index ~ " of " ~ loop.length %}
        {%- set relation_to_vacuum = adapter.get_relation(
            database=row['table_database'],
            schema=row['table_schema'],
            identifier=row['table_name']
        ) -%}
        {% do run_query("commit") %}
        {% if relation_to_vacuum %}
            {% set start=modules.datetime.datetime.now() %}
            {{ dbt_utils.log_info(message_prefix ~ " Vacuuming " ~ relation_to_vacuum) }}
            {% do run_query("VACUUM " ~ relation_to_vacuum ~ " BOOST") %}
            {{ dbt_utils.log_info(message_prefix ~ " Analyzing " ~ relation_to_vacuum) }}
            {% do run_query("ANALYZE " ~ relation_to_vacuum) %}
            {% set end=modules.datetime.datetime.now() %}
            {% set total_seconds = (end - start).total_seconds() | round(2) %}
            {{ dbt_utils.log_info(message_prefix ~ " Finished " ~ relation_to_vacuum ~ " in " ~ total_seconds ~ "s") }}
        {% else %}
            {{ dbt_utils.log_info(message_prefix ~ ' Skipping relation "' ~ row.values() | join ('"."') ~ '" as it does not exist') }}
        {% endif %}
    {% endfor %}
{% endmacro %}
DBT Cloud
DBT can also be used as a service (Managed Service). It includes:
- A web IDE for developing projects and models
- Job configuration and scheduling
- Simple and convenient access to logs
- A website with your project's documentation
- CI (Continuous Integration) hookup
Conclusion
Preparing and consuming a DWH becomes as pleasant and beneficial as drinking a smoothie. DBT consists of Jinja, user extensions (modules), a compiler, an executor, and a package manager. Putting these elements together gives you a complete working environment for your data warehouse. There is hardly a better way to manage transformations inside a DWH today.
The beliefs the DBT developers followed are formulated as follows:
- Code, not a GUI, is the best abstraction for expressing complex analytical logic
- Working with data should adopt the best practices of software engineering
- Critical data infrastructure should be controlled by the user community as open source software
- Not only analytics tools but also code will increasingly become the property of the open source community
These core beliefs have produced a product that is used by more than 850 companies today, and they form the basis of many interesting extensions that will be created in the future.
For those interested, there is a video recording of an open lesson I gave at OTUS a few months ago: Data Build Tool for the Amazon Redshift warehouse.
Besides DBT and data warehousing, as part of the Data Engineer course on the OTUS platform, my colleagues and I teach classes on a number of other relevant and modern topics:
- Architectural concepts for big data applications
- Practice with Spark and Spark Streaming
- Exploring methods and tools for loading data sources
- Building analytical marts in a DWH
- NoSQL concepts: HBase, Cassandra, ElasticSearch
- Principles of monitoring and orchestration
- Final project: bringing all the skills together with mentor support
Links:
- DBT documentation - Introduction - official documentation
- What, exactly, is dbt? - overview article by one of the authors of DBT
- Data Build Tool for the Amazon Redshift warehouse - YouTube, recording of an OTUS open lesson
- Getting to know Greenplum - the next open lesson, May 15, 2020
- Data Engineering course - OTUS
- Building a Mature Analytics Workflow - a look at the future of working with data and analytics
- It's time for open source analytics - the evolution of analytics and the influence of open source
- Continuous Integration and Automated Build Testing with dbtCloud - principles of building CI using DBT
- Getting started with DBT tutorial - hands-on practice, step-by-step instructions for self-study
- Jaffle shop - Github DBT Tutorial - Github, code of the tutorial project
Source: www.habr.com