PgGraph - PostgreSQL-da jadval bog'liqliklarini arxivlash va topish uchun yordamchi dastur

PgGraph - PostgreSQL-da jadval bog'liqliklarini arxivlash va topish uchun yordamchi dastur
Bugun men Habr o'quvchilariga PostgreSQL DBMSda jadval bog'liqliklari bilan ishlash uchun Python tilida yozilgan yordamchi dasturni taqdim etmoqchiman.

Yordamchi dastur API'si oddiy va uchta usuldan iborat:

  • arxiv_jadval - belgilangan birlamchi kalitlar bilan rekursiv arxivlash/yo'q qilish
  • get_table_references - jadvalga bog'liqliklarni qidirish (ko'rsatilgan va unga havola qilingan jadvallarni ko'rsatadi)
  • get_rows_references - kerakli jadvaldagi belgilangan qatorlarga havola qiladigan boshqa jadvallardagi qatorlarni qidirish

Sana oldin

Mening ismim Oleg Borzov, men Domklikdagi ipoteka krediti bo'yicha menejerlar uchun CRM jamoasida ishlab chiquvchiman.

Bizning CRM tizimimizning asosiy ma'lumotlar bazasi kompaniyadagi hajmi bo'yicha eng yiriklaridan biridir. Bu, shuningdek, eng qadimiylaridan biri: u loyihaning boshlanishida paydo bo'ldi, daraxtlar katta bo'lganida, Domklik startap edi va moda Python asinxron ramkasida mikroservis o'rniga PHPda ulkan monolit paydo bo'ldi.

PHP-dan Python-ga o'tish juda uzoq davom etdi va ikkala tizimni bir vaqtning o'zida qo'llab-quvvatlashni talab qildi, bu ma'lumotlar bazasi dizayniga ta'sir qildi.

Natijada, bizda turli xil so'rovlar uchun ko'plab indekslarga ega bo'lgan juda ko'p sonli bog'langan va ulkan jadvallardan iborat ma'lumotlar bazasi mavjud. Bularning barchasi ma'lumotlar bazasining ishlashiga salbiy ta'sir qiladi: katta jadvallar va ular orasidagi bog'lanishlar to'plami tufayli so'rovlarning murakkabligi doimiy ravishda oshib bormoqda, bu ayniqsa eng ko'p yuklangan jadvallar uchun juda muhimdir.

Ma'lumotlar bazasiga yukni kamaytirish uchun biz eski yozuvlarni eng katta hajmli va yuklangan jadvallardan arxivlanganlarga o'tkazadigan skript yozishga qaror qildik (masalan, task в task_archive).

Bu vazifa jadvallar orasidagi munosabatlarning ko'pligi bilan murakkablashadi: shunchaki qatorlarni ko'chiring task в task_archive etarli emas, bundan oldin barcha havolalar bilan xuddi shunday qilish kerak task jadvallar.

Men misol bilan ko'rsataman postgrespro.ru saytidan demo ma'lumotlar bazasi:

PgGraph - PostgreSQL-da jadval bog'liqliklarini arxivlash va topish uchun yordamchi dastur
Aytaylik, jadvaldagi yozuvlarni o'chirishimiz kerak Flights. Postgres bizga buni xuddi shunday qilishimizga ruxsat bermaydi: biz birinchi navbatda barcha havolalar jadvallaridan yozuvlarni o'chirishimiz kerak va hokazo, hech kim tomonidan havola qilinmagan jadvallargacha rekursiv.

Bizning misolimizda Flights ishora qiladi Ticket_flights, va uning ustida - Boarding_passes.

Shuning uchun uni quyidagi tartibda o'chirishingiz kerak:

  1. Biz satrlarning asosiy kalitlari (PK) qiymatlarini olamiz Ticket_flights, bu o'chiriladigan qatorlarga ishora qiladi Flights.
  2. Biz PK qatorlarini olamiz Boarding_passes, qaysilarga tegishli Ticket_flights.
  3. Jadvaldagi 2-bosqichdan boshlab PK bo'yicha qatorlarni o'chirib tashlaymiz Boarding_passes.
  4. 1-bosqichdan boshlab PK bo'yicha qatorlarni o'chiring Ticket_flights.
  5. dan qatorlarni olib tashlash Flights.

Natijada PgGraph deb nomlangan yordamchi dastur paydo bo'ldi, biz uni ochiq manba qilishga qaror qildik.

Qanday foydalanish kerak

Yordamchi dastur ikkita foydalanish rejimini qo'llab-quvvatlaydi:

  • Buyruqlar qatoridan qo'ng'iroq qilish (pggraph …).
  • Python kodida foydalanish (sinf PgGraphApi).

O'rnatish va sozlash

Avval Pypi omboridan yordam dasturini o'rnatishingiz kerak:

pip3 install pggraph

Keyin mahalliy kompyuterda ma'lumotlar bazasi konfiguratsiyasi va arxivlash skripti bilan config.ini faylini yarating:

[db]
host = localhost
port = 5432
user = postgres
password = postgres
dbname = postgres
schema = public ; Необязательный параметр, указано значение по умолчанию

[archive]  ; Данный раздел заполнять необязательно, ниже указаны значения по умолчанию
is_debug = false
chunk_size = 1000
max_depth = 20
to_archive = true
archive_suffix = 'archive'

Konsoldan ishga tushirish

parametrlar

$ pggraph -h
usage: pggraph action [-h] --table TABLE [--ids IDS] [--config_path CONFIG_PATH]
positional arguments:
  action        required action: archive_table, get_table_references, get_rows_references

optional arguments:
  -h, --help                    show this help message and exit
  --table TABLE                 table name
  --ids IDS                     primary key ids, separated by comma, e.g. 1,2,3
  --config_path CONFIG_PATH     path to config.ini
  --log_path LOG_PATH           path to log dir
  --log_level LOG_LEVEL         log level (debug, info, error)

Pozitsion argumentlar:

  • action - talab qilinadigan harakat: archive_table, get_table_references yoki get_rows_references.

Nomlangan argumentlar:

  • --config_path — konfiguratsiya fayliga yo'l;
  • --table — biror amalni bajarishingiz kerak bo‘lgan jadval;
  • --ids - vergul bilan ajratilgan id ro'yxati, masalan, 1,2,3 (ixtiyoriy parametr);
  • --log_path — jurnallar uchun papkaga yo'l (ixtiyoriy parametr, sukut bo'yicha - uy papkasi);
  • --log_level — roʻyxatga olish darajasi (ixtiyoriy parametr, standart INFO).

Buyruqlar misollari

Jadvalni arxivlash

Yordamchi dasturning asosiy vazifasi ma'lumotlarni arxivlashdir, ya'ni. satrlarni asosiy jadvaldan arxiv jadvaliga o'tkazish (masalan, jadvaldan kitoblar в kitoblar_arxivi).

Arxivlashsiz o'chirish ham qo'llab-quvvatlanadi: buning uchun config.ini-da parametrni o'rnatishingiz kerak to_archive = noto'g'ri).

Kerakli parametrlar - config_path, jadval va identifikatorlar.

Ishga tushgandan so'ng, yozuvlar rekursiv ravishda o'chiriladi ids jadvalda table va unga tegishli barcha jadvallarda.

$ pggraph archive_table --config_path config.hw.local.ini --table flights --ids 1,2,3
2020-06-20 19:27:44 INFO: flights - START
2020-06-20 19:27:44 INFO: flights - start archive_recursive 3 rows (depth=0)
2020-06-20 19:27:44 INFO:       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 19:27:44 INFO:               START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: flights - archive_by_ids 3 rows by id
2020-06-20 19:27:44 INFO: flights - END

Belgilangan jadval uchun bog'liqliklarni toping

Belgilangan jadvalning bog'liqliklarini topish funktsiyasi table. Kerakli parametrlar - config_path и table.

Ishga tushgandan so'ng, ekranda lug'at ko'rsatiladi, bu erda:

  • in_refs — berilganga havola qiluvchi jadvallar lugʻati, bunda kalit jadval nomi, qiymat esa xorijiy kalit obʼyektlar roʻyxati (pk_main - asosiy jadvaldagi asosiy kalit; pk_ref - havolalar jadvalidagi asosiy kalit; fk_ref — manba jadvalining tashqi kaliti bo'lgan ustun nomi);
  • out_refs - bu tegishli jadvallar lug'ati.

$ pggraph get_table_references --config_path config.hw.local.ini --table flights
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='ticket_no, flight_id', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}

Belgilangan Asosiy kalit bilan satrlarga havolalarni topish

Chet el kaliti orqali boshqa jadvallardagi qatorlarni qidirish funktsiyasi ids jadvallar table. Kerakli parametrlar - config_path, table и ids.

Ishga tushgandan so'ng, ekranda quyidagi tuzilishga ega lug'at ko'rsatiladi:

{
	pk_id_1: {
		reffering_table_name_1: {
			foreign_key_1: [
				{row_pk_1: value, row_pk_2: value},
				...
			], 
			...
		},
		...
	},
	pk_id_2: {...},
	...
}

Misol chaqiruvi:

$ pggraph get_rows_references --config_path config.hw.local.ini --table flights --ids 1,2,3
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}

Kodda foydalanish

Kutubxonani konsolda ishga tushirishdan tashqari Python kodida ham foydalanish mumkin. iPython interaktiv muhitidagi qo'ng'iroqlar misollari quyida ko'rsatilgan.

Jadvalni arxivlash

>>> from pg_graph.main import setup_logging
>>> setup_logging(log_level='DEBUG')
>>> from pg_graph.api import PgGraphApi
>>> api = PgGraphApi('config.hw.local.ini')
>>> api.archive_table('flights', [4,5])
2020-06-20 23:12:08 INFO: flights - START
2020-06-20 23:12:08 INFO: flights - start archive_recursive 2 rows (depth=0)
2020-06-20 23:12:08 INFO: 	START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 	ticket_flights - ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')
2020-06-20 23:12:08 DEBUG: 	SQL('SELECT flight_id, ticket_no FROM bookings.ticket_flights WHERE (flight_id) IN (%s, %s)')
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 3 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 3 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 3 rows
2020-06-20 23:12:08 INFO: 	END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: flights - archive_by_ids 2 rows by flight_id
2020-06-20 23:12:09 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.flights_archive (LIKE bookings.flights)')
2020-06-20 23:12:09 DEBUG: DELETE FROM flights by flight_id - 2 rows
2020-06-20 23:12:09 DEBUG: INSERT INTO flights_archive - 2 rows
2020-06-20 23:12:09 INFO: flights - END

Belgilangan jadval uchun bog'liqliklarni toping

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> res = api.get_table_references('flights')
>>> pprint(res)
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}

Belgilangan Asosiy kalit bilan satrlarga havolalarni topish

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> rows = api.get_rows_references('flights', [1,2,3])
>>> pprint(rows)
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}

Kutubxonaning manba kodi quyidagi manzilda mavjud GitHub MIT litsenziyasi ostida, shuningdek, omborda PyPI.

Men sharhlar, majburiyatlar va takliflardan mamnun bo'laman.

Men bu yerda va omborda qo'limdan kelgancha savollarga javob berishga harakat qilaman.

Manba: www.habr.com

a Izoh qo'shish