PgGraph se yon sèvis piblik pou achiv ak jwenn depandans tab nan PostgreSQL

PgGraph se yon sèvis piblik pou achiv ak jwenn depandans tab nan PostgreSQL
Jodi a mwen vle prezante lektè Habr ak yon sèvis piblik ekri nan Python pou travay ak depandans tab nan DBMS PostgreSQL la.

API sèvis piblik la senp epi li gen twa metòd:

  • archive_table - rekursif achiv / efase ranje ak kle prensipal espesifye
  • get_table_references - rechèch pou depandans pou yon tab (ap montre tab ki refere yo pa youn nan espesifye ak sa yo ki fè referans li)
  • get_rows_references - chèche ranje nan lòt tab ki fè referans a ranje espesifye nan tablo a vle

pre-istwa

Non mwen se Oleg Borzov, mwen se yon pwomotè nan ekip CRM pou manadjè prete ipotèk nan Domklik.

Baz done prensipal sistèm CRM nou an se youn nan pi gwo an tèm de volim nan konpayi an. Li se tou youn nan pi ansyen an: li te parèt nan lansman an menm nan pwojè a, lè pye bwa yo te gwo, Domklik te yon demaraj, ak olye pou yo yon mikwosèvis sou yon kad alamòd Python asynchrone te gen yon monolit gwo nan PHP.

Tranzisyon an soti nan PHP a Python te trè long epi li te mande sipò similtane nan tou de sistèm, ki afekte konsepsyon baz done a.

Kòm yon rezilta, nou gen yon baz done ak yon gwo kantite trè konekte ak tab gwo ak yon pakèt moun sou endèks pou diferan kalite demann. Tout bagay sa yo afekte pèfòmans baz done a yon fason negatif: akòz tab gwo ak yon pakèt relasyon ant yo, konpleksite demann yo toujou ap ogmante, ki se sitou kritik pou tab ki pi chaje yo.

Pou diminye chaj la sou baz done a, nou deside ekri yon script ki ta transfere ansyen dosye ki soti nan tab ki pi volumineuz ak chaje nan achiv yo (pa egzanp, soti nan task в task_archive).

Travay sa a konplike pa gwo kantite relasyon ant tab yo: tou senpleman deplase ranje soti nan task в task_archive se pa ase, anvan sa ou bezwen fè menm bagay la tou recursively ak tout moun sa yo referans task tab.

Mwen pral demontre ak yon egzanp baz done Demo soti nan sit la postgrespro.ru:

PgGraph se yon sèvis piblik pou achiv ak jwenn depandans tab nan PostgreSQL
Ann di nou bezwen efase dosye nan yon tab Flights. Postgres pa pral pèmèt nou fè sa jis konsa: nou premye bezwen efase dosye ki soti nan tout tab referans yo, ak sou sa recursively desann nan tab ki pa fè referans pa nenpòt moun.

Nan egzanp nou an nan Flights refere Ticket_flights, ak sou li - Boarding_passes.

Se poutèt sa, ou bezwen efase li nan lòd sa a:

  1. Nou jwenn kle prensipal yo (PK) valè ranje nan Ticket_flights, ki refere a ranje yo dwe efase nan Flights.
  2. Nou jwenn ranje PK Boarding_passes, ki refere a Ticket_flights.
  3. Nou efase ranje pa PK soti nan etap 2 nan tablo a Boarding_passes.
  4. Efase liy pa PK soti nan etap 1 nan Ticket_flights.
  5. Retire liy yo nan Flights.

Rezilta a se te yon sèvis piblik ki rele PgGraph, ke nou deside fè sous louvri.

Kouman pou itilize

Sèvis piblik la sipòte de mòd itilizasyon:

  • Rele soti nan liy lòd la (pggraph …).
  • Itilizasyon nan kòd Python (klas PgGraphApi).

Enstalasyon ak konfigirasyon

Premyèman, ou bezwen enstale sèvis piblik la nan repozitwa Pypi la:

pip3 install pggraph

Lè sa a, kreye yon fichye config.ini sou machin lokal la ak konfigirasyon baz done a ak script achiv la:

[db]
host = localhost
port = 5432
user = postgres
password = postgres
dbname = postgres
schema = public ; Необязательный параметр, указано значение по умолчанию

[archive]  ; Данный раздел заполнять необязательно, ниже указаны значения по умолчанию
is_debug = false
chunk_size = 1000
max_depth = 20
to_archive = true
archive_suffix = 'archive'

Kouri soti nan konsole

Paramèt

$ pggraph -h
usage: pggraph action [-h] --table TABLE [--ids IDS] [--config_path CONFIG_PATH]
positional arguments:
  action        required action: archive_table, get_table_references, get_rows_references

optional arguments:
  -h, --help                    show this help message and exit
  --table TABLE                 table name
  --ids IDS                     primary key ids, separated by comma, e.g. 1,2,3
  --config_path CONFIG_PATH     path to config.ini
  --log_path LOG_PATH           path to log dir
  --log_level LOG_LEVEL         log level (debug, info, error)

Agiman pozisyon:

  • action - aksyon ki nesesè: archive_table, get_table_references oswa get_rows_references.

Agiman yo rele:

  • --config_path - chemen nan dosye a konfigirasyon;
  • --table - yon tab ak ki ou bezwen fè yon aksyon;
  • --ids - lis id separe pa vigil, pou egzanp, 1,2,3 (paramèt si ou vle);
  • --log_path - chemen nan katab la pou mòso bwa (paramèt si ou vle, pa default - katab lakay);
  • --log_level — nivo anrejistreman (paramèt si ou vle, default se ENFO).

Egzanp kòmand

Achiv yon tab

Fonksyon prensipal sèvis piblik la se achiv done, i.e. transfere ranje soti nan tab prensipal la nan tab la achiv (pa egzanp, soti nan tab la liv в liv_achiv).

Sipresyon san yo pa achiv tou sipòte: pou sa ou bezwen mete paramèt la nan config.ini to_archive = fo).

Paramèt obligatwa - config_path, tab ak id.

Apre lansman, dosye yo pral efase repetitivman ids nan tablo a table ak nan tout tablo ki fè referans a li.

$ pggraph archive_table --config_path config.hw.local.ini --table flights --ids 1,2,3
2020-06-20 19:27:44 INFO: flights - START
2020-06-20 19:27:44 INFO: flights - start archive_recursive 3 rows (depth=0)
2020-06-20 19:27:44 INFO:       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 19:27:44 INFO:               START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: flights - archive_by_ids 3 rows by id
2020-06-20 19:27:44 INFO: flights - END

Jwenn depandans pou yon tab espesifye

Fonksyon pou jwenn depandans nan yon tab espesifye table. Paramèt obligatwa - config_path и table.

Apre lansman, yon diksyonè ap parèt sou ekran an, kote:

  • in_refs — yon diksyonè nan tab ki fè referans ak yon sèl, kote kle a se non tab la, valè a se yon lis objè kle etranje (pk_main - kle prensipal nan tablo prensipal la, pk_ref - kle prensipal nan tablo referans lan, fk_ref — non kolòn ki se kle etranje tab sous la);
  • out_refs — yon diksyonè nan tab sa a refere a.

$ pggraph get_table_references --config_path config.hw.local.ini --table flights
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='ticket_no, flight_id', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}

Jwenn referans a fisèl ak kle prensipal la espesifye

Fonksyon pou chèche ranje nan lòt tablo ki fè referans a ranje atravè kle etranje ids tab table. Paramèt obligatwa - config_path, table и ids.

Apre lansman, yon diksyonè ak estrikti sa a pral parèt sou ekran an:

{
	pk_id_1: {
		reffering_table_name_1: {
			foreign_key_1: [
				{row_pk_1: value, row_pk_2: value},
				...
			], 
			...
		},
		...
	},
	pk_id_2: {...},
	...
}

Egzanp apèl:

$ pggraph get_rows_references --config_path config.hw.local.ini --table flights --ids 1,2,3
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}

Itilizasyon nan kòd

Anplis de sa nan kouri li nan konsole a, bibliyotèk la ka itilize nan kòd Python. Egzanp apèl nan anviwònman entèaktif iPython yo montre anba a.

Achiv yon tab

>>> from pg_graph.main import setup_logging
>>> setup_logging(log_level='DEBUG')
>>> from pg_graph.api import PgGraphApi
>>> api = PgGraphApi('config.hw.local.ini')
>>> api.archive_table('flights', [4,5])
2020-06-20 23:12:08 INFO: flights - START
2020-06-20 23:12:08 INFO: flights - start archive_recursive 2 rows (depth=0)
2020-06-20 23:12:08 INFO: 	START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 	ticket_flights - ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')
2020-06-20 23:12:08 DEBUG: 	SQL('SELECT flight_id, ticket_no FROM bookings.ticket_flights WHERE (flight_id) IN (%s, %s)')
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 3 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 3 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 3 rows
2020-06-20 23:12:08 INFO: 	END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: flights - archive_by_ids 2 rows by flight_id
2020-06-20 23:12:09 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.flights_archive (LIKE bookings.flights)')
2020-06-20 23:12:09 DEBUG: DELETE FROM flights by flight_id - 2 rows
2020-06-20 23:12:09 DEBUG: INSERT INTO flights_archive - 2 rows
2020-06-20 23:12:09 INFO: flights - END

Jwenn depandans pou yon tab espesifye

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> res = api.get_table_references('flights')
>>> pprint(res)
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}

Jwenn referans a fisèl ak kle prensipal la espesifye

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> rows = api.get_rows_references('flights', [1,2,3])
>>> pprint(rows)
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}

Kòd sous bibliyotèk la disponib nan GitHub anba lisans MIT, osi byen ke nan depo a P&PI.

Mwen pral kontan fè kòmantè, komèt ak sijesyon.

Mwen pral eseye reponn kesyon yo pi byen nan kapasite mwen isit la ak nan depo a.

Sous: www.habr.com

Add nouvo kòmantè