Is áis é PgGraph chun spleáchais tábla a chartlannú agus a aimsiú in PostgreSQL

Is áis é PgGraph chun spleáchais tábla a chartlannú agus a aimsiú in PostgreSQL
Sa lá atá inniu ba mhaith liom fóntais scríofa i Python a chur i láthair do léitheoirí Habr chun oibriú le spleáchais boird sa DBMS PostgreSQL.

Tá API an áirgiúlachta simplí agus tá trí mhodh ann:

  • cartlann_ tábla - cartlannú athchúrsach / sraitheanna a scriosadh le Príomheochracha sonraithe
  • get_table_tagairtí — spleáchais a chuardach le haghaidh tábla (taispeánfar táblaí dá dtagraítear sa cheann sonraithe agus iad siúd a thagraíonn dó)
  • get_rows_references - cuardaigh sraitheanna i dtáblaí eile a thagraíonn do rónna sonraithe sa tábla atá ag teastáil

réamhstair

Oleg Borzov is ainm dom, is forbróir mé san fhoireann CRM do bhainisteoirí iasachtaí morgáiste i Domklik.

Tá príomhbhunachar sonraí ár gcóras CRM ar cheann de na cinn is mó sa chuideachta i dtéarmaí toirte. Tá sé ar cheann de na cinn is sine freisin: bhí sé le feiceáil ag an seoladh an-an tionscadail, nuair a bhí na crainn mór, bhí Domklik tosaithe, agus in ionad microservice ar chreat faiseanta Python asincrónach bhí monolith ollmhór i PHP.

Bhí an t-aistriú ó PHP go Python an-fhada agus bhí tacaíocht chomhuaineach ag teastáil ón dá chóras, rud a chuir isteach ar dhearadh an bhunachair sonraí.

Mar thoradh air sin, tá bunachar sonraí againn le líon mór táblaí an-nasctha agus ollmhór le bunch innéacsanna le haghaidh cineálacha éagsúla fiosrúchán. Bíonn tionchar diúltach ag seo go léir ar fheidhmíocht an bhunachair sonraí: mar gheall ar tháblaí móra agus go leor caidrimh eatarthu, tá castacht na bhfiosruithe ag méadú i gcónaí, rud atá ríthábhachtach go háirithe do na táblaí is mó ualach.

Chun an t-ualach ar an mbunachar sonraí a laghdú, bheartaíomar script a scríobh a d'aistreofaí seantaifid ó na táblaí is toirtiúla agus is lódáilte go dtí na cinn chartlainne (mar shampla, ó task в task_archive).

Tá an tasc seo casta ag an líon mór caidreamh idir táblaí: bog na sraitheanna ó task в task_archive Ní leor, roimhe sin ní mór duit an rud céanna a dhéanamh go hathchúrsach leis na tagairtí sin go léir task táblaí.

Léireoidh mé le sampla bunachar sonraí taispeána ón suíomh postgrespro.ru:

Is áis é PgGraph chun spleáchais tábla a chartlannú agus a aimsiú in PostgreSQL
Ligean le rá go gcaithfimid taifid a scriosadh as tábla Flights. Ní ligfidh Postgres dúinn é seo a dhéanamh díreach mar sin: ní mór dúinn ar dtús taifid a scriosadh as gach tábla tagartha, agus mar sin de réir a chéile síos go dtí táblaí nach ndéanann duine ar bith tagairt dóibh.

Inár sampla ag Flights tagraíonn Ticket_flights, agus uirthi - Boarding_passes.

Mar sin, ní mór duit é a scriosadh san ord seo:

  1. Faighimid na heochracha bunscoile (PK) luachanna na sraitheanna i Ticket_flights, a thagraíonn do na sraitheanna atá le scriosadh i Flights.
  2. Faighimid sraitheanna PK Boarding_passes, a thagraíonn do Ticket_flights.
  3. Scriosaimid sraitheanna de réir PK ó chéim 2 sa tábla Boarding_passes.
  4. Scrios línte ag PK ó chéim 1 isteach Ticket_flights.
  5. Línte a bhaint as Flights.

Ba é an toradh a bhí air ná fóntais ar a dtugtar PgGraph, a shocraigh muid foinse oscailte a dhéanamh.

Conas a úsáid

Tacaíonn an fóntais le dhá mhodh úsáide:

  • Glaoigh ón líne ordaithe (pggraph …).
  • Úsáid i gcód Python (rang PgGraphApi).

Suiteáil agus cumraíocht

Ar dtús is gá duit an fóntais a shuiteáil ó stór Pypi:

pip3 install pggraph

Ansin cruthaigh comhad config.ini ar an meaisín áitiúil le cumraíocht an bhunachair shonraí agus an script cartlainne:

[db]
host = localhost
port = 5432
user = postgres
password = postgres
dbname = postgres
schema = public ; Необязательный параметр, указано значение по умолчанию

[archive]  ; Данный раздел заполнять необязательно, ниже указаны значения по умолчанию
is_debug = false
chunk_size = 1000
max_depth = 20
to_archive = true
archive_suffix = 'archive'

Rith ón consól

Paraiméadair

$ pggraph -h
usage: pggraph action [-h] --table TABLE [--ids IDS] [--config_path CONFIG_PATH]
positional arguments:
  action        required action: archive_table, get_table_references, get_rows_references

optional arguments:
  -h, --help                    show this help message and exit
  --table TABLE                 table name
  --ids IDS                     primary key ids, separated by comma, e.g. 1,2,3
  --config_path CONFIG_PATH     path to config.ini
  --log_path LOG_PATH           path to log dir
  --log_level LOG_LEVEL         log level (debug, info, error)

Argóintí seasaimh:

  • action - gníomh riachtanach: archive_table, get_table_referencesget_rows_references.

Argóintí ainmnithe:

  • --config_path - cosán go dtí an comhad cumraíochta;
  • --table — tábla a gcaithfidh tú gníomh a dhéanamh leis;
  • --ids — liosta aitheantais scartha le camóga, mar shampla, 1,2,3 (paraiméadar roghnach);
  • --log_path - cosán go dtí an fillteán le haghaidh logs (paraiméadar roghnach, de réir réamhshocraithe - fillteán baile);
  • --log_level — leibhéal logála (paraiméadar roghnach, is é INFO an réamhshocrú).

Samplaí ordú

Tábla á chur i gcartlann

Is í príomhfheidhm an áirgiúlachta ná cartlannú sonraí, i.e. sraitheanna a aistriú ón bpríomhthábla go dtí an tábla cartlainne (mar shampla, ón tábla leabhair в leabhair_cartlann).

Tacaítear freisin le scriosadh gan cartlannú: chuige seo ní mór duit an paraiméadar a shocrú i config.ini to_archive = bréagach).

Paraiméadair riachtanacha - config_path, tábla agus aitheantais.

Tar éis iad a sheoladh, scriosfar taifid go hathchúrsach ids sa tábla table agus i ngach tábla a thagraíonn dó.

$ pggraph archive_table --config_path config.hw.local.ini --table flights --ids 1,2,3
2020-06-20 19:27:44 INFO: flights - START
2020-06-20 19:27:44 INFO: flights - start archive_recursive 3 rows (depth=0)
2020-06-20 19:27:44 INFO:       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 19:27:44 INFO:               START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: flights - archive_by_ids 3 rows by id
2020-06-20 19:27:44 INFO: flights - END

Faigh spleáchais le haghaidh tábla sonraithe

Feidhm chun spleáchais tábla sonraithe a fháil table. Paraiméadair riachtanacha - config_path и table.

Tar éis é a sheoladh, taispeánfar foclóir ar an scáileán, áit:

  • in_refs — foclóir táblaí a thagraíonn do cheann tugtha, áit arb í an eochair ainm an tábla, is é an luach ná liosta de na cuspóirí Eochracha Coigríche (pk_main - príomheochair sa phríomhthábla, pk_ref - príomheochair sa tábla tagartha, fk_ref — ainm an cholúin arb é an eochair eachtrach don tábla foinse é);
  • out_refs — foclóir táblaí a dtagraíonn an ceann seo dó.

$ pggraph get_table_references --config_path config.hw.local.ini --table flights
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='ticket_no, flight_id', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}

Tagairtí a aimsiú do teaghráin leis an bPríomheochair sonraithe

Feidhm chun sraitheanna a chuardach i dtáblaí eile a thagraíonn do rónna trí Eochair Choigríche ids táblaí table. Paraiméadair riachtanacha - config_path, table и ids.

Tar éis é a sheoladh, taispeánfar foclóir leis an struchtúr seo a leanas ar an scáileán:

{
	pk_id_1: {
		reffering_table_name_1: {
			foreign_key_1: [
				{row_pk_1: value, row_pk_2: value},
				...
			], 
			...
		},
		...
	},
	pk_id_2: {...},
	...
}

Glao samplach:

$ pggraph get_rows_references --config_path config.hw.local.ini --table flights --ids 1,2,3
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}

Úsáid i gcód

Chomh maith lena rith sa chonsól, is féidir an leabharlann a úsáid i gcód Python. Taispeántar samplaí de ghlaonna i dtimpeallacht idirghníomhach iPython thíos.

Tábla á chur i gcartlann

>>> from pg_graph.main import setup_logging
>>> setup_logging(log_level='DEBUG')
>>> from pg_graph.api import PgGraphApi
>>> api = PgGraphApi('config.hw.local.ini')
>>> api.archive_table('flights', [4,5])
2020-06-20 23:12:08 INFO: flights - START
2020-06-20 23:12:08 INFO: flights - start archive_recursive 2 rows (depth=0)
2020-06-20 23:12:08 INFO: 	START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 	ticket_flights - ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')
2020-06-20 23:12:08 DEBUG: 	SQL('SELECT flight_id, ticket_no FROM bookings.ticket_flights WHERE (flight_id) IN (%s, %s)')
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 3 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 3 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 3 rows
2020-06-20 23:12:08 INFO: 	END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: flights - archive_by_ids 2 rows by flight_id
2020-06-20 23:12:09 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.flights_archive (LIKE bookings.flights)')
2020-06-20 23:12:09 DEBUG: DELETE FROM flights by flight_id - 2 rows
2020-06-20 23:12:09 DEBUG: INSERT INTO flights_archive - 2 rows
2020-06-20 23:12:09 INFO: flights - END

Faigh spleáchais le haghaidh tábla sonraithe

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> res = api.get_table_references('flights')
>>> pprint(res)
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}

Tagairtí a aimsiú do teaghráin leis an bPríomheochair sonraithe

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> rows = api.get_rows_references('flights', [1,2,3])
>>> pprint(rows)
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}

Tá cód foinse na leabharlainne ar fáil ag GitHub faoi ​​cheadúnas MIT, chomh maith leis an stór PyPI.

Beidh áthas orm tuairimí, gealltanais agus moltaí a thabhairt.

Déanfaidh mé iarracht ceisteanna a fhreagairt chomh fada agus is féidir liom anseo agus sa stór.

Foinse: will.com

Add a comment