Bugun men Habr o'quvchilariga PostgreSQL DBMSda jadval bog'liqliklari bilan ishlash uchun Python tilida yozilgan yordamchi dasturni taqdim etmoqchiman.
Yordamchi dastur API'si oddiy va uchta usuldan iborat:
- arxiv_jadval - belgilangan birlamchi kalitlar bilan rekursiv arxivlash/yo'q qilish
- get_table_references - jadvalga bog'liqliklarni qidirish (ko'rsatilgan va unga havola qilingan jadvallarni ko'rsatadi)
- get_rows_references - kerakli jadvaldagi belgilangan qatorlarga havola qiladigan boshqa jadvallardagi qatorlarni qidirish
Sana oldin
Mening ismim Oleg Borzov, men Domklikdagi ipoteka krediti bo'yicha menejerlar uchun CRM jamoasida ishlab chiquvchiman.
Bizning CRM tizimimizning asosiy ma'lumotlar bazasi kompaniyadagi hajmi bo'yicha eng yiriklaridan biridir. Bu, shuningdek, eng qadimiylaridan biri: u loyihaning boshlanishida paydo bo'ldi, daraxtlar katta bo'lganida, Domklik startap edi va moda Python asinxron ramkasida mikroservis o'rniga PHPda ulkan monolit paydo bo'ldi.
PHP-dan Python-ga o'tish juda uzoq davom etdi va ikkala tizimni bir vaqtning o'zida qo'llab-quvvatlashni talab qildi, bu ma'lumotlar bazasi dizayniga ta'sir qildi.
Natijada, bizda turli xil so'rovlar uchun ko'plab indekslarga ega bo'lgan juda ko'p sonli bog'langan va ulkan jadvallardan iborat ma'lumotlar bazasi mavjud. Bularning barchasi ma'lumotlar bazasining ishlashiga salbiy ta'sir qiladi: katta jadvallar va ular orasidagi bog'lanishlar to'plami tufayli so'rovlarning murakkabligi doimiy ravishda oshib bormoqda, bu ayniqsa eng ko'p yuklangan jadvallar uchun juda muhimdir.
Ma'lumotlar bazasiga yukni kamaytirish uchun biz eski yozuvlarni eng katta hajmli va yuklangan jadvallardan arxivlanganlarga o'tkazadigan skript yozishga qaror qildik (masalan, task
в task_archive
).
Bu vazifa jadvallar orasidagi munosabatlarning ko'pligi bilan murakkablashadi: shunchaki qatorlarni ko'chiring task
в task_archive
etarli emas, bundan oldin barcha havolalar bilan xuddi shunday qilish kerak task
jadvallar.
Men misol bilan ko'rsataman
Aytaylik, jadvaldagi yozuvlarni o'chirishimiz kerak Flights
. Postgres bizga buni xuddi shunday qilishimizga ruxsat bermaydi: biz birinchi navbatda barcha havolalar jadvallaridan yozuvlarni o'chirishimiz kerak va hokazo, hech kim tomonidan havola qilinmagan jadvallargacha rekursiv.
Bizning misolimizda Flights
ishora qiladi Ticket_flights
, va uning ustida - Boarding_passes
.
Shuning uchun uni quyidagi tartibda o'chirishingiz kerak:
- Biz satrlarning asosiy kalitlari (PK) qiymatlarini olamiz
Ticket_flights
, bu o'chiriladigan qatorlarga ishora qiladiFlights
. - Biz PK qatorlarini olamiz
Boarding_passes
, qaysilarga tegishliTicket_flights
. - Jadvaldagi 2-bosqichdan boshlab PK bo'yicha qatorlarni o'chirib tashlaymiz
Boarding_passes
. - 1-bosqichdan boshlab PK bo'yicha qatorlarni o'chiring
Ticket_flights
. - dan qatorlarni olib tashlash
Flights
.
Natijada PgGraph deb nomlangan yordamchi dastur paydo bo'ldi, biz uni ochiq manba qilishga qaror qildik.
Qanday foydalanish kerak
Yordamchi dastur ikkita foydalanish rejimini qo'llab-quvvatlaydi:
- Buyruqlar qatoridan qo'ng'iroq qilish (
pggraph …
). - Python kodida foydalanish (sinf
PgGraphApi
).
O'rnatish va sozlash
Avval Pypi omboridan yordam dasturini o'rnatishingiz kerak:
pip3 install pggraph
Keyin mahalliy kompyuterda ma'lumotlar bazasi konfiguratsiyasi va arxivlash skripti bilan config.ini faylini yarating:
[db]
host = localhost
port = 5432
user = postgres
password = postgres
dbname = postgres
schema = public ; Необязательный параметр, указано значение по умолчанию
[archive] ; Данный раздел заполнять необязательно, ниже указаны значения по умолчанию
is_debug = false
chunk_size = 1000
max_depth = 20
to_archive = true
archive_suffix = 'archive'
Konsoldan ishga tushirish
parametrlar
$ pggraph -h
usage: pggraph action [-h] --table TABLE [--ids IDS] [--config_path CONFIG_PATH]
positional arguments:
action required action: archive_table, get_table_references, get_rows_references
optional arguments:
-h, --help show this help message and exit
--table TABLE table name
--ids IDS primary key ids, separated by comma, e.g. 1,2,3
--config_path CONFIG_PATH path to config.ini
--log_path LOG_PATH path to log dir
--log_level LOG_LEVEL log level (debug, info, error)
Pozitsion argumentlar:
action
- talab qilinadigan harakat:archive_table
,get_table_references
yokiget_rows_references
.
Nomlangan argumentlar:
--config_path
— konfiguratsiya fayliga yo'l;--table
— biror amalni bajarishingiz kerak bo‘lgan jadval;--ids
- vergul bilan ajratilgan id ro'yxati, masalan,1,2,3
(ixtiyoriy parametr);--log_path
— jurnallar uchun papkaga yo'l (ixtiyoriy parametr, sukut bo'yicha - uy papkasi);--log_level
— roʻyxatga olish darajasi (ixtiyoriy parametr, standart INFO).
Buyruqlar misollari
Jadvalni arxivlash
Yordamchi dasturning asosiy vazifasi ma'lumotlarni arxivlashdir, ya'ni. satrlarni asosiy jadvaldan arxiv jadvaliga o'tkazish (masalan, jadvaldan kitoblar в kitoblar_arxivi).
Arxivlashsiz o'chirish ham qo'llab-quvvatlanadi: buning uchun config.ini-da parametrni o'rnatishingiz kerak to_archive = noto'g'ri).
Kerakli parametrlar - config_path, jadval va identifikatorlar.
Ishga tushgandan so'ng, yozuvlar rekursiv ravishda o'chiriladi ids
jadvalda table
va unga tegishli barcha jadvallarda.
$ pggraph archive_table --config_path config.hw.local.ini --table flights --ids 1,2,3
2020-06-20 19:27:44 INFO: flights - START
2020-06-20 19:27:44 INFO: flights - start archive_recursive 3 rows (depth=0)
2020-06-20 19:27:44 INFO: START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 19:27:44 INFO: START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO: START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO: boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO: START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO: boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO: START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO: boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO: START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO: END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: ticket_flights - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO: END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: flights - archive_by_ids 3 rows by id
2020-06-20 19:27:44 INFO: flights - END
Belgilangan jadval uchun bog'liqliklarni toping
Belgilangan jadvalning bog'liqliklarini topish funktsiyasi table
. Kerakli parametrlar - config_path
и table
.
Ishga tushgandan so'ng, ekranda lug'at ko'rsatiladi, bu erda:
in_refs
— berilganga havola qiluvchi jadvallar lugʻati, bunda kalit jadval nomi, qiymat esa xorijiy kalit obʼyektlar roʻyxati (pk_main
- asosiy jadvaldagi asosiy kalit;pk_ref
- havolalar jadvalidagi asosiy kalit;fk_ref
— manba jadvalining tashqi kaliti bo'lgan ustun nomi);out_refs
- bu tegishli jadvallar lug'ati.
$ pggraph get_table_references --config_path config.hw.local.ini --table flights
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='ticket_no, flight_id', fk_ref='flight_id')]},
'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}
Belgilangan Asosiy kalit bilan satrlarga havolalarni topish
Chet el kaliti orqali boshqa jadvallardagi qatorlarni qidirish funktsiyasi ids
jadvallar table
. Kerakli parametrlar - config_path
, table
и ids
.
Ishga tushgandan so'ng, ekranda quyidagi tuzilishga ega lug'at ko'rsatiladi:
{
pk_id_1: {
reffering_table_name_1: {
foreign_key_1: [
{row_pk_1: value, row_pk_2: value},
...
],
...
},
...
},
pk_id_2: {...},
...
}
Misol chaqiruvi:
$ pggraph get_rows_references --config_path config.hw.local.ini --table flights --ids 1,2,3
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
'ticket_no': '0005432816945'},
{'flight_id': 1,
'ticket_no': '0005432816941'}]}},
2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
'ticket_no': '0005433101832'},
{'flight_id': 2,
'ticket_no': '0005433101864'},
{'flight_id': 2,
'ticket_no': '0005432919715'}]}},
3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
'ticket_no': '0005432817560'},
{'flight_id': 3,
'ticket_no': '0005432817568'},
{'flight_id': 3,
'ticket_no': '0005432817559'}]}}}
Kodda foydalanish
Kutubxonani konsolda ishga tushirishdan tashqari Python kodida ham foydalanish mumkin. iPython interaktiv muhitidagi qo'ng'iroqlar misollari quyida ko'rsatilgan.
Jadvalni arxivlash
>>> from pg_graph.main import setup_logging
>>> setup_logging(log_level='DEBUG')
>>> from pg_graph.api import PgGraphApi
>>> api = PgGraphApi('config.hw.local.ini')
>>> api.archive_table('flights', [4,5])
2020-06-20 23:12:08 INFO: flights - START
2020-06-20 23:12:08 INFO: flights - start archive_recursive 2 rows (depth=0)
2020-06-20 23:12:08 INFO: START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: ticket_flights - ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')
2020-06-20 23:12:08 DEBUG: SQL('SELECT flight_id, ticket_no FROM bookings.ticket_flights WHERE (flight_id) IN (%s, %s)')
2020-06-20 23:12:08 INFO: ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 23:12:08 INFO: START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: boarding_passes - archive_by_fk 3 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: DELETE FROM boarding_passes by FK flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 INFO: END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: ticket_flights - archive_by_ids 3 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: DELETE FROM ticket_flights by flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 DEBUG: INSERT INTO ticket_flights_archive - 3 rows
2020-06-20 23:12:08 INFO: END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: flights - archive_by_ids 2 rows by flight_id
2020-06-20 23:12:09 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.flights_archive (LIKE bookings.flights)')
2020-06-20 23:12:09 DEBUG: DELETE FROM flights by flight_id - 2 rows
2020-06-20 23:12:09 DEBUG: INSERT INTO flights_archive - 2 rows
2020-06-20 23:12:09 INFO: flights - END
Belgilangan jadval uchun bog'liqliklarni toping
>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> res = api.get_table_references('flights')
>>> pprint(res)
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')]},
'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}
Belgilangan Asosiy kalit bilan satrlarga havolalarni topish
>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> rows = api.get_rows_references('flights', [1,2,3])
>>> pprint(rows)
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
'ticket_no': '0005432816945'},
{'flight_id': 1,
'ticket_no': '0005432816941'}]}},
2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
'ticket_no': '0005433101832'},
{'flight_id': 2,
'ticket_no': '0005433101864'},
{'flight_id': 2,
'ticket_no': '0005432919715'}]}},
3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
'ticket_no': '0005432817560'},
{'flight_id': 3,
'ticket_no': '0005432817568'},
{'flight_id': 3,
'ticket_no': '0005432817559'}]}}}
Kutubxonaning manba kodi quyidagi manzilda mavjud
Men sharhlar, majburiyatlar va takliflardan mamnun bo'laman.
Men bu yerda va omborda qo'limdan kelgancha savollarga javob berishga harakat qilaman.
Manba: www.habr.com