PgGraph: a utility for archiving and finding table dependencies in PostgreSQL

Today I would like to introduce Habr readers to a utility written in Python for working with table dependencies in the PostgreSQL DBMS.

The utility's API is simple and consists of three methods:

  • archive_table - recursively archives or deletes the rows with the specified primary keys
  • get_table_references - finds the dependencies of a table (shows both the tables it references and the tables that reference it)
  • get_rows_references - finds rows in other tables that reference the specified rows of the given table

Background

My name is Oleg Borzov. I am a developer on the team building the CRM for mortgage lending managers at Domklik.

The main database of our CRM system is one of the largest in the company by volume. It is also one of the oldest: it appeared at the very launch of the project, back when the trees were tall, Domklik was a startup, and instead of microservices on a fashionable asynchronous Python framework there was a huge PHP monolith.

The transition from PHP to Python took a very long time and required supporting both systems simultaneously, which affected the database design.

As a result, we ended up with a database containing a large number of big, heavily interconnected tables, each with a pile of indexes for different kinds of queries. All of this hurts database performance: because of the large tables and the many relationships between them, query complexity keeps growing, which is especially critical for the most heavily loaded tables.

To reduce the load on the database, we decided to write a script that would move old records from the largest and most heavily loaded tables into archive tables (for example, from task to task_archive).

The task is complicated by the large number of relationships between tables: simply moving rows from task to task_archive is not enough. Before that, the same thing has to be done recursively for every table that references task.

I will demonstrate this using the demo database from postgrespro.ru.

Suppose we need to delete records from the Flights table. Postgres will not let us do this directly: first we have to delete the records from all referencing tables, and so on recursively down to the tables that nothing references.

In our example, Flights is referenced by Ticket_flights, which in turn is referenced by Boarding_passes.

So the deletion has to proceed in this order:

  1. Get the primary key (PK) values of the Ticket_flights rows that reference the Flights rows to be deleted.
  2. Get the PKs of the Boarding_passes rows that reference those Ticket_flights rows.
  3. Delete the Boarding_passes rows by the PKs from step 2.
  4. Delete the Ticket_flights rows by the PKs from step 1.
  5. Delete the rows from Flights.
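
The ordering in the steps above amounts to a depth-first walk over the "is referenced by" graph, visiting referencing tables before the table itself. Below is a minimal sketch of that idea only, not PgGraph's actual code: the REFERRERS map and the deletion_order function are hypothetical, and the max_depth guard merely mirrors the config option of the same name.

```python
from typing import Dict, List

# Hypothetical map for the demo schema: table -> tables that reference it.
REFERRERS: Dict[str, List[str]] = {
    "flights": ["ticket_flights"],
    "ticket_flights": ["boarding_passes"],
    "boarding_passes": [],
}


def deletion_order(table: str, referrers: Dict[str, List[str]],
                   depth: int = 0, max_depth: int = 20) -> List[str]:
    """Return tables in the order their rows must be deleted: referrers first."""
    if depth > max_depth:
        # Protects against reference cycles and very deep chains.
        raise RecursionError(f"dependency chain deeper than {max_depth}")
    order: List[str] = []
    for child in referrers.get(table, []):
        for name in deletion_order(child, referrers, depth + 1, max_depth):
            if name not in order:
                order.append(name)
    order.append(table)
    return order


print(deletion_order("flights", REFERRERS))
# ['boarding_passes', 'ticket_flights', 'flights']
```

The real utility works with rows rather than whole tables, so at each level it also has to collect the primary keys of the referencing rows (steps 1 and 2) before deleting anything.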

The result is a utility called PgGraph, which we decided to release as open source.

How to use it

The utility supports two modes of use:

  • Calling it from the command line (pggraph …).
  • Using it in Python code (the PgGraphApi class).

Installation and configuration

First, install the utility from the PyPI repository:

pip3 install pggraph

Then create a config.ini file on your local machine with the database and archiving settings:

[db]
host = localhost
port = 5432
user = postgres
password = postgres
dbname = postgres
schema = public ; optional parameter, the default value is shown

[archive]  ; this section is optional, the default values are shown below
is_debug = false
chunk_size = 1000
max_depth = 20
to_archive = true
archive_suffix = 'archive'
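
To make the role of the defaults concrete, here is a small sketch of how a file in this format could be read with Python's standard configparser. It is an illustration under assumptions, not the way PgGraph itself loads its config: load_settings, SAMPLE and ARCHIVE_DEFAULTS are hypothetical names, and the default values are simply copied from the listing above.

```python
import configparser

# A shortened config in the same format; the [archive] section is partly omitted on purpose.
SAMPLE = """
[db]
host = localhost
port = 5432
user = postgres
password = postgres
dbname = postgres
schema = public ; optional

[archive]
chunk_size = 500
"""

# Defaults copied from the sample config.ini shown in the article.
ARCHIVE_DEFAULTS = {
    "is_debug": "false",
    "chunk_size": "1000",
    "max_depth": "20",
    "to_archive": "true",
}


def load_settings(text: str) -> dict:
    # inline_comment_prefixes allows "; ..." comments after a value on the same line.
    parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
    parser.read_dict({"archive": ARCHIVE_DEFAULTS})  # defaults first
    parser.read_string(text)                         # the file overrides them
    return {
        "host": parser.get("db", "host"),
        "port": parser.getint("db", "port"),
        "schema": parser.get("db", "schema", fallback="public"),
        "chunk_size": parser.getint("archive", "chunk_size"),
        "max_depth": parser.getint("archive", "max_depth"),
        "to_archive": parser.getboolean("archive", "to_archive"),
    }


settings = load_settings(SAMPLE)
print(settings["chunk_size"], settings["max_depth"], settings["to_archive"])
```

Here chunk_size comes from the file, while max_depth and to_archive fall back to the defaults because the sample leaves them out.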

Running from the console

Parameters

$ pggraph -h
usage: pggraph action [-h] --table TABLE [--ids IDS] [--config_path CONFIG_PATH]
positional arguments:
  action        required action: archive_table, get_table_references, get_rows_references

optional arguments:
  -h, --help                    show this help message and exit
  --table TABLE                 table name
  --ids IDS                     primary key ids, separated by comma, e.g. 1,2,3
  --config_path CONFIG_PATH     path to config.ini
  --log_path LOG_PATH           path to log dir
  --log_level LOG_LEVEL         log level (debug, info, error)

Positional arguments:

  • action - the action to perform: archive_table, get_table_references or get_rows_references.

Named arguments:

  • --config_path - path to the config file;
  • --table - the table on which to perform the action;
  • --ids - a comma-separated list of ids, for example 1,2,3 (optional parameter);
  • --log_path - path to the directory for logs (optional parameter, defaults to the home directory);
  • --log_level - logging level (optional parameter, defaults to INFO).
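
For readers who want to wrap similar functionality in their own script, the argument layout above can be approximated with argparse. This is only a sketch that mirrors the help text: it is not the source of the pggraph command, the program name pggraph-sketch and the parse_ids helper are made up, and treating ids as integers is an assumption that will not hold for tables with non-numeric primary keys.

```python
import argparse
from typing import List

ACTIONS = ("archive_table", "get_table_references", "get_rows_references")


def parse_ids(raw: str) -> List[int]:
    """Turn '1,2,3' into [1, 2, 3], skipping empty items."""
    return [int(part) for part in raw.split(",") if part.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pggraph-sketch")
    parser.add_argument("action", choices=ACTIONS, help="required action")
    parser.add_argument("--table", required=True, help="table name")
    parser.add_argument("--ids", type=parse_ids, default=[],
                        help="primary key ids, separated by comma, e.g. 1,2,3")
    parser.add_argument("--config_path", help="path to config.ini")
    parser.add_argument("--log_level", default="INFO", help="log level")
    return parser


args = build_parser().parse_args(
    ["archive_table", "--table", "flights", "--ids", "1,2,3"]
)
print(args.action, args.table, args.ids)
```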

Example commands

Archiving a table

The utility's main function is archiving data, i.e. moving rows from the main table to an archive table (for example, from the books table to books_archive).

Deleting without archiving is also supported: to do this, set to_archive = false in config.ini.

Required parameters: config_path, table and ids.

After launch, the rows with the given ids are recursively deleted from the table and from every table that references it.

$ pggraph archive_table --config_path config.hw.local.ini --table flights --ids 1,2,3
2020-06-20 19:27:44 INFO: flights - START
2020-06-20 19:27:44 INFO: flights - start archive_recursive 3 rows (depth=0)
2020-06-20 19:27:44 INFO:       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 19:27:44 INFO:               START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: flights - archive_by_ids 3 rows by id
2020-06-20 19:27:44 INFO: flights - END

Finding the dependencies of a specific table

This function finds the dependencies of the specified table. Required parameters: config_path and table.

After launch, a dictionary is printed to the screen, where:

  • in_refs - a dictionary of the tables that reference the given one, where the key is the table name and the value is a list of ForeignKey objects (pk_main is the primary key of the main table, pk_ref is the primary key of the referencing table, fk_ref is the name of the column that is the foreign key to the source table);
  • out_refs - a dictionary of the tables that the given table references.

$ pggraph get_table_references --config_path config.hw.local.ini --table flights
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='ticket_no, flight_id', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}
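
One way to read this dictionary is to turn every ForeignKey into a "column -> column" arrow. The sketch below does that for the output above. The ForeignKey namedtuple is a stand-in defined here for illustration (the library has its own class), describe is a hypothetical helper, and the reading of out_refs entries (fk_ref as a column of the given table, pk_main as the key of the referenced table) is inferred from the sample output rather than stated by the author.

```python
from collections import namedtuple

# Stand-in with the same field names as the objects in the output above.
ForeignKey = namedtuple("ForeignKey", ["pk_main", "pk_ref", "fk_ref"])

refs = {
    "in_refs": {
        "ticket_flights": [ForeignKey("flight_id", "ticket_no, flight_id", "flight_id")],
    },
    "out_refs": {
        "aircrafts": [ForeignKey("aircraft_code", "flight_id", "aircraft_code")],
        "airports": [
            ForeignKey("airport_code", "flight_id", "arrival_airport"),
            ForeignKey("airport_code", "flight_id", "departure_airport"),
        ],
    },
}


def describe(table, refs):
    """List every foreign key as 'referencing.column -> referenced.column'."""
    lines = []
    for ref_table, fks in refs["in_refs"].items():
        for fk in fks:
            lines.append(f"{ref_table}.{fk.fk_ref} -> {table}.{fk.pk_main}")
    for ref_table, fks in refs["out_refs"].items():
        for fk in fks:
            lines.append(f"{table}.{fk.fk_ref} -> {ref_table}.{fk.pk_main}")
    return lines


for line in describe("flights", refs):
    print(line)
```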

Finding references to rows with a given primary key

This function finds rows in other tables that reference, via a foreign key, the rows of the table with the given ids. Required parameters: config_path, table and ids.

After launch, a dictionary with the following structure is printed to the screen:

{
	pk_id_1: {
		referring_table_name_1: {
			foreign_key_1: [
				{row_pk_1: value, row_pk_2: value},
				...
			], 
			...
		},
		...
	},
	pk_id_2: {...},
	...
}

Example call:

$ pggraph get_rows_references --config_path config.hw.local.ini --table flights --ids 1,2,3
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}
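
Because the result is a plain nested dictionary, it is easy to post-process. As a small illustration, the sketch below counts how many rows reference each primary key, using the first two entries of the output above as input. The count_references helper is hypothetical and not part of the library.

```python
# The first two entries of the example output, copied as a literal.
rows = {
    1: {"ticket_flights": {"flight_id": [
        {"flight_id": 1, "ticket_no": "0005432816945"},
        {"flight_id": 1, "ticket_no": "0005432816941"},
    ]}},
    2: {"ticket_flights": {"flight_id": [
        {"flight_id": 2, "ticket_no": "0005433101832"},
        {"flight_id": 2, "ticket_no": "0005433101864"},
        {"flight_id": 2, "ticket_no": "0005432919715"},
    ]}},
}


def count_references(rows):
    """Map each primary key to {referring_table: number of referencing rows}."""
    summary = {}
    for pk, tables in rows.items():
        summary[pk] = {
            table: sum(len(found) for found in fks.values())
            for table, fks in tables.items()
        }
    return summary


print(count_references(rows))
# {1: {'ticket_flights': 2}, 2: {'ticket_flights': 3}}
```

A check like this can be useful before archiving, to estimate how many dependent rows a call to archive_table would touch.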

Usage in code

Besides running it from the console, the library can be used in Python code. Example calls in the IPython interactive environment are shown below.

Archiving a table

>>> from pg_graph.main import setup_logging
>>> setup_logging(log_level='DEBUG')
>>> from pg_graph.api import PgGraphApi
>>> api = PgGraphApi('config.hw.local.ini')
>>> api.archive_table('flights', [4,5])
2020-06-20 23:12:08 INFO: flights - START
2020-06-20 23:12:08 INFO: flights - start archive_recursive 2 rows (depth=0)
2020-06-20 23:12:08 INFO: 	START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 	ticket_flights - ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')
2020-06-20 23:12:08 DEBUG: 	SQL('SELECT flight_id, ticket_no FROM bookings.ticket_flights WHERE (flight_id) IN (%s, %s)')
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 3 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 3 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 3 rows
2020-06-20 23:12:08 INFO: 	END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: flights - archive_by_ids 2 rows by flight_id
2020-06-20 23:12:09 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.flights_archive (LIKE bookings.flights)')
2020-06-20 23:12:09 DEBUG: DELETE FROM flights by flight_id - 2 rows
2020-06-20 23:12:09 DEBUG: INSERT INTO flights_archive - 2 rows
2020-06-20 23:12:09 INFO: flights - END
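
The DEBUG lines in the log above show three statements per table: CREATE TABLE IF NOT EXISTS ... (LIKE ...), DELETE and INSERT. The sketch below imitates that move-to-archive step for a single table so it can be tried without a PostgreSQL server. Everything in it is an assumption made for illustration: it uses the standard sqlite3 module instead of PostgreSQL, archive_rows is a hypothetical function, the way chunk_size is applied is a guess, and it inserts before deleting, whereas the log shows DELETE first.

```python
import sqlite3


def archive_rows(conn, table, pk, ids, suffix="archive", chunk_size=1000):
    """Move rows whose primary key is in `ids` from `table` to `<table>_<suffix>`."""
    archive = f"{table}_{suffix}"
    # SQLite has no CREATE TABLE ... (LIKE ...); an empty SELECT is a rough substitute.
    # Table and column names are interpolated as-is, so they must come from trusted code.
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {archive} AS SELECT * FROM {table} WHERE 0"
    )
    moved = 0
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        marks = ", ".join("?" for _ in chunk)
        with conn:  # one transaction per chunk: copy, then delete
            conn.execute(
                f"INSERT INTO {archive} SELECT * FROM {table} WHERE {pk} IN ({marks})",
                chunk,
            )
            cur = conn.execute(
                f"DELETE FROM {table} WHERE {pk} IN ({marks})", chunk
            )
            moved += cur.rowcount
    return moved


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (flight_id INTEGER PRIMARY KEY, flight_no TEXT)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)", [(i, f"PG{i:04d}") for i in range(1, 6)]
)
conn.commit()

moved = archive_rows(conn, "flights", "flight_id", [4, 5], chunk_size=1)
print(moved)
```

Processing ids in chunks keeps each transaction short, which matters on heavily loaded tables like the ones described in the background section.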

Finding the dependencies of a specific table

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> res = api.get_table_references('flights')
>>> pprint(res)
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}

Finding references to rows with a given primary key

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> rows = api.get_rows_references('flights', [1,2,3])
>>> pprint(rows)
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}

The library's source code is available on GitHub under the MIT license, and the package is published on PyPI.

I would be glad to receive comments, commits and suggestions.

I will do my best to answer questions here and in the repository.

Source: www.habr.com
