PgGraph: a utility for archiving and finding table dependencies in PostgreSQL

Today I want to present to Habr readers a utility written in Python for working with table dependencies in the PostgreSQL DBMS.

The utility's API is simple and consists of three methods:

  • archive_table - recursively archives or deletes rows with the specified primary keys
  • get_table_references - finds the dependencies of a table (shows both the tables that refer to it and the tables it refers to)
  • get_rows_references - finds rows in other tables that refer to the given rows in the target table

Background

My name is Oleg Borzov, and I am a developer on the CRM team for loan managers at Domklik.

The main database of our CRM system is one of the largest in the company by volume. It is also one of the oldest: it appeared at the very start of the project, when the trees were tall, Domklik was a startup, and instead of a microservice on a fashionable asynchronous Python framework there was a huge PHP monolith.

The migration from PHP to Python took a long time and required supporting both systems simultaneously, which affected the design of the database.

As a result, we have a database with many relations and huge tables carrying a pile of indexes for different kinds of queries. All of this hurts database performance: because of the large tables and the many relations between them, query complexity keeps growing, which is especially critical for the most heavily loaded tables.

To reduce the load on the database, we decided to write a script that would move old records from the largest and most heavily loaded tables into archive tables (for example, from task to task_archive).

The task is complicated by the many relations between tables: simply moving rows from task to task_archive is not enough, because first the same has to be done recursively for every table that references task.

I will demonstrate using the demo database from postgrespro.ru:

Suppose we need to delete records from the Flights table. Postgres will not let us do this directly: we first have to delete the records from all tables that reference it, and so on recursively, down to tables that nothing references.

In our example, Flights is referenced by Ticket_flights, which in turn is referenced by Boarding_passes.

So the deletion has to happen in this order:

  1. Get the primary key (PK) values of the rows in Ticket_flights that refer to the rows to be deleted in Flights.
  2. Get the PKs of the rows in Boarding_passes that refer to those Ticket_flights rows.
  3. Delete the rows in Boarding_passes by the PKs from step 2.
  4. Delete the rows in Ticket_flights by the PKs from step 1.
  5. Delete the rows from Flights.
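
The ordering in the steps above is a post-order walk of the reference graph: children first, the target table last. Below is a minimal sketch of that idea. It is not PgGraph's implementation; the function name deletion_order and the REFERRERS map are made up for illustration, and the map is written by hand instead of being read from the database.

```python
def deletion_order(table, referrers, max_depth=20):
    """Return tables in the order their rows must be deleted, children first."""
    order = []

    def visit(name, depth):
        # Guard against reference cycles and runaway chains.
        if depth > max_depth:
            raise RecursionError(f"dependency chain deeper than {max_depth}")
        # Every table that references `name` has to be emptied before it.
        for child in referrers.get(name, []):
            visit(child, depth + 1)
        if name not in order:
            order.append(name)

    visit(table, 0)
    return order


# Hypothetical hand-written map for the demo schema:
# table -> tables that reference it.
REFERRERS = {
    "flights": ["ticket_flights"],
    "ticket_flights": ["boarding_passes"],
    "boarding_passes": [],
}

print(deletion_order("flights", REFERRERS))
# ['boarding_passes', 'ticket_flights', 'flights']
```

The depth guard here is only an illustration of why a limit on recursion depth is useful when the schema may contain cycles.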

The result is a utility called PgGraph, which we decided to release as open source.

How to use it

The utility supports two modes of use:

  • Calling it from the command line (pggraph …).
  • Using it from Python code (the PgGraphApi class).

Installation and configuration

First, install the utility from the PyPI repository:

pip3 install pggraph

Then create a config.ini file on the local machine with the database and archiving settings:

[db]
host = localhost
port = 5432
user = postgres
password = postgres
dbname = postgres
schema = public ; Optional parameter, the default value is shown

[archive]  ; This section is optional, the default values are shown below
is_debug = false
chunk_size = 1000
max_depth = 20
to_archive = true
archive_suffix = 'archive'
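
For readers who want to check such a file before running the utility, here is a small sketch that reads it with Python's standard configparser. It is not how PgGraph itself loads the config (that is not shown in this article); load_settings is a hypothetical helper, and the fallbacks simply repeat the defaults listed above. Note that inline ; comments and the quotes around 'archive' need explicit handling in configparser.

```python
import configparser

SAMPLE = """
[db]
host = localhost
port = 5432
user = postgres
password = postgres
dbname = postgres
schema = public ; optional

[archive]
is_debug = false
chunk_size = 1000
to_archive = true
archive_suffix = 'archive'
"""


def load_settings(text):
    # inline_comment_prefixes lets "value ; comment" parse as "value";
    # interpolation=None keeps "%" in passwords from being interpreted.
    parser = configparser.ConfigParser(
        inline_comment_prefixes=(";",), interpolation=None
    )
    parser.read_string(text)

    db = dict(parser["db"])
    db["port"] = parser.getint("db", "port")
    db.setdefault("schema", "public")

    # Fallbacks mirror the defaults listed in the article.
    archive = {
        "is_debug": parser.getboolean("archive", "is_debug", fallback=False),
        "chunk_size": parser.getint("archive", "chunk_size", fallback=1000),
        "max_depth": parser.getint("archive", "max_depth", fallback=20),
        "to_archive": parser.getboolean("archive", "to_archive", fallback=True),
        "archive_suffix": parser.get(
            "archive", "archive_suffix", fallback="archive"
        ).strip("'\""),
    }
    return db, archive


db, archive = load_settings(SAMPLE)
print(db["schema"], archive["max_depth"], archive["archive_suffix"])
```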

Running from the console

$ pggraph -h
usage: pggraph action [-h] --table TABLE [--ids IDS] [--config_path CONFIG_PATH]
positional arguments:
  action        required action: archive_table, get_table_references, get_rows_references

optional arguments:
  -h, --help                    show this help message and exit
  --table TABLE                 table name
  --ids IDS                     primary key ids, separated by comma, e.g. 1,2,3
  --config_path CONFIG_PATH     path to config.ini
  --log_path LOG_PATH           path to log dir
  --log_level LOG_LEVEL         log level (debug, info, error)

Positional arguments:

  • action - the action to perform: archive_table, get_table_references or get_rows_references.

Named arguments:

  • --config_path - path to the config file;
  • --table - the table on which to perform the action;
  • --ids - a comma-separated list of ids, for example 1,2,3 (optional parameter);
  • --log_path - path to the directory for logs (optional parameter, defaults to the home folder);
  • --log_level - logging level (optional parameter, defaults to INFO).
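
To make the argument list above concrete, here is a stand-in parser built with argparse that accepts the same documented arguments. It is only an illustration and not the utility's own parser; parse_ids is a hypothetical helper, and it assumes integer primary keys.

```python
import argparse

ACTIONS = ("archive_table", "get_table_references", "get_rows_references")


def parse_ids(raw):
    """Turn '1,2,3' into [1, 2, 3]; assumes integer primary keys."""
    return [int(part) for part in raw.split(",") if part.strip()]


def build_parser():
    # Mirrors the documented arguments; not PgGraph's actual code.
    parser = argparse.ArgumentParser(prog="pggraph")
    parser.add_argument("action", choices=ACTIONS)
    parser.add_argument("--table", required=True)
    parser.add_argument("--ids", type=parse_ids, default=None)
    parser.add_argument("--config_path")
    parser.add_argument("--log_path")
    parser.add_argument("--log_level", default="INFO")
    return parser


args = build_parser().parse_args(
    ["get_rows_references", "--table", "flights", "--ids", "1,2,3"]
)
print(args.action, args.table, args.ids)
# get_rows_references flights [1, 2, 3]
```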

Command examples

Archiving a table

The utility's main function is archiving data, i.e. moving rows from the main table to an archive table (for example, from the books table to books_archive).

Deletion without archiving is also supported: to do this, set to_archive = false in config.ini.

Required parameters: config_path, table and ids.

After launch, the rows with the specified ids are recursively deleted from the table passed as table and from all tables that refer to it.

$ pggraph archive_table --config_path config.hw.local.ini --table flights --ids 1,2,3
2020-06-20 19:27:44 INFO: flights - START
2020-06-20 19:27:44 INFO: flights - start archive_recursive 3 rows (depth=0)
2020-06-20 19:27:44 INFO:       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 19:27:44 INFO:               START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: flights - archive_by_ids 3 rows by id
2020-06-20 19:27:44 INFO: flights - END

Finding the dependencies of a given table

This function finds the dependencies of the table passed as table. Required parameters: config_path and table.

After launch, a dictionary is printed to the screen, where:

  • in_refs - a dictionary of the tables that refer to the given one, where the key is the table name and the value is a list of foreign key objects (pk_main - the primary key in the main table, pk_ref - the primary key in the referencing table, fk_ref - the name of the column that is the foreign key to the main table);
  • out_refs - a dictionary of the tables that the given table refers to.

$ pggraph get_table_references --config_path config.hw.local.ini --table flights
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='ticket_no, flight_id', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}
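
The dictionary above can be flattened into readable "column -> column" lines. The sketch below is illustrative only: ForeignKey is a namedtuple stand-in for the objects printed in the output (whether the library's class is actually a namedtuple is an assumption), describe is a hypothetical helper, and the direction of each reference is inferred from the sample output.

```python
from collections import namedtuple

# Stand-in for the ForeignKey objects in the output above; an assumption,
# the real class belongs to the library.
ForeignKey = namedtuple("ForeignKey", ["pk_main", "pk_ref", "fk_ref"])

refs = {
    "in_refs": {
        "ticket_flights": [
            ForeignKey("flight_id", "ticket_no, flight_id", "flight_id")
        ]
    },
    "out_refs": {
        "aircrafts": [ForeignKey("aircraft_code", "flight_id", "aircraft_code")],
        "airports": [
            ForeignKey("airport_code", "flight_id", "arrival_airport"),
            ForeignKey("airport_code", "flight_id", "departure_airport"),
        ],
    },
}


def describe(table, refs):
    """List every reference as 'referencing.column -> referenced.column'."""
    lines = []
    # in_refs: other tables point at `table`.
    for other, keys in sorted(refs["in_refs"].items()):
        for fk in keys:
            lines.append(f"{other}.{fk.fk_ref} -> {table}.{fk.pk_main}")
    # out_refs: `table` points at other tables.
    for other, keys in sorted(refs["out_refs"].items()):
        for fk in keys:
            lines.append(f"{table}.{fk.fk_ref} -> {other}.{fk.pk_main}")
    return lines


for line in describe("flights", refs):
    print(line)
```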

Finding references to rows with a given primary key

This function finds rows in other tables that refer, via a foreign key, to the rows with the given ids in the table passed as table. Required parameters: config_path, table and ids.

After launch, a dictionary with the following structure is printed to the screen:

{
	pk_id_1: {
		referring_table_name_1: {
			foreign_key_1: [
				{row_pk_1: value, row_pk_2: value},
				...
			], 
			...
		},
		...
	},
	pk_id_2: {...},
	...
}

Example call:

$ pggraph get_rows_references --config_path config.hw.local.ini --table flights --ids 1,2,3
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}
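
A result of this shape is easy to post-process, for example to see how many rows would be affected before archiving anything. The sketch below counts the referencing rows per primary key. The helper count_referencing_rows is hypothetical, and the sample is a shortened copy of the output above.

```python
def count_referencing_rows(rows_refs):
    """Map each primary key to the number of rows that reference it."""
    totals = {}
    for pk, tables in rows_refs.items():
        # tables: {referring_table: {foreign_key: [row_pk_dict, ...]}}
        totals[pk] = sum(
            len(rows) for fks in tables.values() for rows in fks.values()
        )
    return totals


sample = {
    1: {"ticket_flights": {"flight_id": [
        {"flight_id": 1, "ticket_no": "0005432816945"},
        {"flight_id": 1, "ticket_no": "0005432816941"},
    ]}},
    2: {"ticket_flights": {"flight_id": [
        {"flight_id": 2, "ticket_no": "0005433101832"},
        {"flight_id": 2, "ticket_no": "0005433101864"},
        {"flight_id": 2, "ticket_no": "0005432919715"},
    ]}},
}

print(count_referencing_rows(sample))
# {1: 2, 2: 3}
```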

Use in code

Besides running it in the console, the library can be used from Python code. Examples of calls in the interactive IPython environment are shown below.

Archiving a table

>>> from pg_graph.main import setup_logging
>>> setup_logging(log_level='DEBUG')
>>> from pg_graph.api import PgGraphApi
>>> api = PgGraphApi('config.hw.local.ini')
>>> api.archive_table('flights', [4,5])
2020-06-20 23:12:08 INFO: flights - START
2020-06-20 23:12:08 INFO: flights - start archive_recursive 2 rows (depth=0)
2020-06-20 23:12:08 INFO: 	START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 	ticket_flights - ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')
2020-06-20 23:12:08 DEBUG: 	SQL('SELECT flight_id, ticket_no FROM bookings.ticket_flights WHERE (flight_id) IN (%s, %s)')
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 3 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 3 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 3 rows
2020-06-20 23:12:08 INFO: 	END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: flights - archive_by_ids 2 rows by flight_id
2020-06-20 23:12:09 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.flights_archive (LIKE bookings.flights)')
2020-06-20 23:12:09 DEBUG: DELETE FROM flights by flight_id - 2 rows
2020-06-20 23:12:09 DEBUG: INSERT INTO flights_archive - 2 rows
2020-06-20 23:12:09 INFO: flights - END
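
The debug log above shows three kinds of statements per table: CREATE TABLE IF NOT EXISTS … (LIKE …), a DELETE and an INSERT. The sketch below builds comparable statements in chunks. It is not PgGraph's code: it swaps the separate DELETE and INSERT for a single data-modifying CTE (DELETE … RETURNING feeding an INSERT), the helper names are made up, and table and column names are interpolated directly, so they are assumed to be trusted. With a real driver, identifiers should be quoted properly, for example with psycopg2.sql.Identifier.

```python
def chunked(ids, size):
    """Yield consecutive slices of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def archive_statements(schema, table, pk, ids, chunk_size=1000, suffix="archive"):
    """Yield (sql, params) pairs that move rows to <table>_<suffix>."""
    source = f"{schema}.{table}"
    target = f"{schema}.{table}_{suffix}"
    # Same statement shape as in the debug log above.
    yield f"CREATE TABLE IF NOT EXISTS {target} (LIKE {source})", ()
    for chunk in chunked(list(ids), chunk_size):
        placeholders = ", ".join(["%s"] * len(chunk))
        # Delete and insert in one statement so a chunk moves atomically.
        sql = (
            f"WITH moved AS (DELETE FROM {source} "
            f"WHERE {pk} IN ({placeholders}) RETURNING *) "
            f"INSERT INTO {target} SELECT * FROM moved"
        )
        yield sql, tuple(chunk)


for sql, params in archive_statements(
    "bookings", "flights", "flight_id", [4, 5, 6], chunk_size=2
):
    print(sql, params)
```

Each pair can be passed to a DB-API cursor as cursor.execute(sql, params). Referencing tables still have to be processed first, as in the log.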

Finding the dependencies of a given table

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> res = api.get_table_references('flights')
>>> pprint(res)
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}

Finding references to rows with a given primary key

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> rows = api.get_rows_references('flights', [1,2,3])
>>> pprint(rows)
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}

The library's source code is available on GitHub under the MIT license, and the package is published in the PyPI repository.

I would be glad to receive comments, commits and suggestions.

I will try to answer questions to the best of my ability, both here and in the repository.

Source: www.hab.com
