Tha PgGraph na ghoireas airson tasglann agus lorg eisimeileachd bùird ann am PostgreSQL

Tha PgGraph na ghoireas airson tasglann agus lorg eisimeileachd bùird ann am PostgreSQL
An-diugh tha mi airson goireas sgrìobhte ann am Python a thaisbeanadh do luchd-leughaidh Habr airson a bhith ag obair le eisimeileachd bùird anns an PostgreSQL DBMS.

Tha API an goireas sìmplidh agus tha trì dòighean ann:

  • tasglann_clàr - tasglann ath-chuairteach / cuir às do shreathan le prìomh iuchraichean ainmichte
  • faigh_clàr_iomraidhean - lorg eisimeileachd airson clàr (seallaidh e clàran air an tug an tè ainmichte agus an fheadhainn a tha a’ toirt iomradh air)
  • faigh_rows_references - lorg sreathan ann an clàran eile a bheir iomradh air sreathan ainmichte sa chlàr a tha thu ag iarraidh

ro-eachdraidheil

Is e m ’ainm Oleg Borzov, tha mi nam leasaiche anns an sgioba CRM airson manaidsearan iasad morgaids ann an Domklik.

Is e prìomh stòr-dàta an t-siostam CRM againn aon den fheadhainn as motha a thaobh meud sa chompanaidh. Tha e cuideachd mar aon den fheadhainn as sine: nochd e aig fìor fhoillseachadh a 'phròiseict, nuair a bha na craobhan mòr, bha Domklik na thoiseach tòiseachaidh, agus an àite microservice air frèam fasanta Python asyncronach bha monolith mòr ann am PHP.

Bha an gluasad bho PHP gu Python glè fhada agus bha feum air taic aig an aon àm bhon dà shiostam, a thug buaidh air dealbhadh an stòr-dàta.

Mar thoradh air an sin, tha stòr-dàta againn le àireamh mhòr de chlàran làn-cheangailte agus mòr le dòrlach de chlàran-amais airson diofar sheòrsaichean cheistean. Tha seo uile a’ toirt droch bhuaidh air coileanadh an stòr-dàta: mar thoradh air clàran mòra agus dòrlach de dhàimhean eatorra, tha iom-fhillteachd cheistean a’ sìor dhol am meud, a tha gu sònraichte deatamach airson na clàran as luchdaichte.

Gus an luchd air an stòr-dàta a lughdachadh, chuir sinn romhainn sgriobt a sgrìobhadh a ghluaiseadh seann chlàran bho na clàran as tomadaiche agus as luchdaichte gu na clàran tasglainn (mar eisimpleir, bho task в task_archive).

Tha an obair seo iom-fhillte leis an àireamh mhòr de dhàimhean eadar bùird: dìreach gluais sreathan bho task в task_archive Chan eil sin gu leòr, ron sin feumaidh tu an aon rud a dhèanamh gu ath-chùrsach leis na h-iomraidhean sin uile task bùird.

Seallaidh mi le eisimpleir stòr-dàta demo bhon làrach postgrespro.ru:

Tha PgGraph na ghoireas airson tasglann agus lorg eisimeileachd bùird ann am PostgreSQL
Canaidh sinn gum feum sinn clàran a dhubhadh às bho bhòrd Flights. Cha leig Postgres leinn seo a dhèanamh dìreach mar sin: feumaidh sinn an-toiseach clàran a sguabadh às a h-uile clàr iomraidh, agus mar sin air adhart gu ath-chuairteach sìos gu clàran air nach eil duine a 'toirt iomradh.

Anns an eisimpleir againn aig Flights a ’toirt iomradh Ticket_flights, agus oirre- Boarding_passes.

Mar sin, feumaidh tu a sguabadh às san òrdugh seo:

  1. Gheibh sinn na prìomh iuchraichean (PK) luachan sreathan a-steach Ticket_flights, a tha a 'toirt iomradh air na sreathan a thèid a sguabadh às Flights.
  2. Gheibh sinn sreathan PK Boarding_passes, a tha toirt iomradh air Ticket_flights.
  3. Sguabaidh sinn às sreathan le PK bho cheum 2 sa chlàr Boarding_passes.
  4. Sguab às loidhnichean le PK bho cheum 1 a-steach Ticket_flights.
  5. A 'toirt air falbh sreathan bho Flights.

B’ e an toradh seo goireas ris an canar PgGraph, a chuir sinn romhainn stòr fosgailte a dhèanamh.

Mar a chleachdar tu

Bidh an goireas a’ toirt taic do dhà dhòigh cleachdaidh:

  • Cuir fòn bhon loidhne-àithne (pggraph …).
  • Cleachdadh ann an còd Python (clas PgGraphApi).

Stàladh agus rèiteachadh

An toiseach feumaidh tu an goireas a stàladh bho stòr Pypi:

pip3 install pggraph

An uairsin cruthaich faidhle config.ini air an inneal ionadail le rèiteachadh an stòr-dàta agus an sgriobt tasglann:

[db]
host = localhost
port = 5432
user = postgres
password = postgres
dbname = postgres
schema = public ; Необязательный параметр, указано значение по умолчанию

[archive]  ; Данный раздел заполнять необязательно, ниже указаны значения по умолчанию
is_debug = false
chunk_size = 1000
max_depth = 20
to_archive = true
archive_suffix = 'archive'

Ruith bhon chonsail

paramadairean

$ pggraph -h
usage: pggraph action [-h] --table TABLE [--ids IDS] [--config_path CONFIG_PATH]
positional arguments:
  action        required action: archive_table, get_table_references, get_rows_references

optional arguments:
  -h, --help                    show this help message and exit
  --table TABLE                 table name
  --ids IDS                     primary key ids, separated by comma, e.g. 1,2,3
  --config_path CONFIG_PATH     path to config.ini
  --log_path LOG_PATH           path to log dir
  --log_level LOG_LEVEL         log level (debug, info, error)

Argamaidean suidheachaidh:

  • action - gnìomh riatanach: archive_table, get_table_references no get_rows_references.

Argamaidean ainmichte:

  • --config_path - slighe chun fhaidhle config;
  • --table - clàr leis am feum thu gnìomh a dhèanamh;
  • --ids - liosta de id air a sgaradh le cromagan, mar eisimpleir, 1,2,3 (paramadair roghainneil);
  • --log_path - slighe chun phasgan airson logaichean (paramadair roghainneil, gu bunaiteach - pasgan dachaigh);
  • --log_level - ìre logaidh (paramadair roghainneil, is e INFO an àbhaist).

Eisimpleirean àithne

A’ tasgadh clàr

Is e prìomh obair a’ ghoireas tasglann dàta, i.e. gluasad sreathan bhon phrìomh chlàr gu clàr an tasglann (mar eisimpleir, bhon chlàr leabhraichean в leabhraichean_tasglann).

Thathas cuideachd a’ toirt taic do sguabadh às gun tasglann: airson seo feumaidh tu am paramadair a shuidheachadh ann an config.ini to_archive = meallta).

Paramadairean riatanach - config_path, clàr agus ids.

Às deidh an cur air bhog, thèid clàran a dhubhadh às a-rithist ids sa chlàr table agus anns a h-uile clàr a tha a 'toirt iomradh air.

$ pggraph archive_table --config_path config.hw.local.ini --table flights --ids 1,2,3
2020-06-20 19:27:44 INFO: flights - START
2020-06-20 19:27:44 INFO: flights - start archive_recursive 3 rows (depth=0)
2020-06-20 19:27:44 INFO:       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 19:27:44 INFO:               START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: flights - archive_by_ids 3 rows by id
2020-06-20 19:27:44 INFO: flights - END

Lorg eisimeileachd airson clàr sònraichte

Gnìomh gus eisimeileachd clàr sònraichte a lorg table. Paramadairean riatanach - config_path и table.

Às deidh a chuir air bhog, thèid faclair a thaisbeanadh air an sgrion, far a bheil:

  • in_refs - faclair de chlàran a’ toirt iomradh air fear a chaidh a thoirt seachad, far a bheil an iuchair ainm a’ chlàir, is e an luach liosta de phrìomh nithean cèin (pk_main - prìomh iuchair sa phrìomh chlàr, pk_ref - prìomh iuchair sa chlàr iomraidh, fk_ref - ainm a’ cholbh a tha na iuchair chèin don chlàr stòr);
  • out_refs — faclair de bhùird air a bheil am fear seo a’ toirt iomradh.

$ pggraph get_table_references --config_path config.hw.local.ini --table flights
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='ticket_no, flight_id', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}

Lorg iomraidhean air teudan leis a’ phrìomh iuchair ainmichte

Gnìomh gus sreathan a lorg ann an clàran eile a tha a’ toirt iomradh air sreathan tro Iuchrach Cèin ids bùird table. Paramadairean riatanach - config_path, table и ids.

Às deidh a chuir air bhog, thèid faclair leis an structar a leanas a thaisbeanadh air an sgrion:

{
	pk_id_1: {
		reffering_table_name_1: {
			foreign_key_1: [
				{row_pk_1: value, row_pk_2: value},
				...
			], 
			...
		},
		...
	},
	pk_id_2: {...},
	...
}

Call eisimpleir:

$ pggraph get_rows_references --config_path config.hw.local.ini --table flights --ids 1,2,3
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}

Cleachdadh ann an còd

A bharrachd air a bhith ga ruith sa chonsail, faodar an leabharlann a chleachdadh ann an còd Python. Tha eisimpleirean de ghairmean ann an àrainneachd eadar-ghnìomhach iPython air an sealltainn gu h-ìosal.

A’ tasgadh clàr

>>> from pg_graph.main import setup_logging
>>> setup_logging(log_level='DEBUG')
>>> from pg_graph.api import PgGraphApi
>>> api = PgGraphApi('config.hw.local.ini')
>>> api.archive_table('flights', [4,5])
2020-06-20 23:12:08 INFO: flights - START
2020-06-20 23:12:08 INFO: flights - start archive_recursive 2 rows (depth=0)
2020-06-20 23:12:08 INFO: 	START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 	ticket_flights - ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')
2020-06-20 23:12:08 DEBUG: 	SQL('SELECT flight_id, ticket_no FROM bookings.ticket_flights WHERE (flight_id) IN (%s, %s)')
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 3 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 3 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 3 rows
2020-06-20 23:12:08 INFO: 	END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: flights - archive_by_ids 2 rows by flight_id
2020-06-20 23:12:09 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.flights_archive (LIKE bookings.flights)')
2020-06-20 23:12:09 DEBUG: DELETE FROM flights by flight_id - 2 rows
2020-06-20 23:12:09 DEBUG: INSERT INTO flights_archive - 2 rows
2020-06-20 23:12:09 INFO: flights - END

Lorg eisimeileachd airson clàr sònraichte

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> res = api.get_table_references('flights')
>>> pprint(res)
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}

Lorg iomraidhean air teudan leis a’ phrìomh iuchair ainmichte

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> rows = api.get_rows_references('flights', [1,2,3])
>>> pprint(rows)
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}

Tha còd stòr an leabharlainn ri fhaighinn aig GitHub fo chead MIT, a bharrachd air san stòr PyPI.

Bidh mi toilichte beachdan, gealltainn agus molaidhean.

Feuchaidh mi ri ceistean a fhreagairt cho math 's as urrainn dhomh an seo agus anns an ionad-tasgaidh.

Source: www.habr.com

Cuir beachd ann