PgGraph ni matumizi ya kuhifadhi na kutafuta vitegemezi vya jedwali katika PostgreSQL

PgGraph ni matumizi ya kuhifadhi na kutafuta vitegemezi vya jedwali katika PostgreSQL
Leo nataka kuwasilisha wasomaji wa Habr na matumizi yaliyoandikwa katika Python ya kufanya kazi na utegemezi wa meza katika DBMS ya PostgreSQL.

API ya matumizi ni rahisi na ina njia tatu:

  • archive_meza - kuhifadhi/kufuta safu mlalo kwa kujirudia kwa Vifunguo vya Msingi vilivyobainishwa
  • pata_marejeleo_ya_meza - tafuta utegemezi wa jedwali (itaonyesha jedwali zilizorejelewa na iliyoainishwa na wale wanaoirejelea)
  • pata_safu_marejeleo - tafuta safu mlalo katika jedwali zingine zinazorejelea safu mlalo zilizobainishwa kwenye jedwali unalotaka

kabla ya historia

Jina langu ni Oleg Borzov, mimi ni msanidi programu katika timu ya CRM kwa wasimamizi wa mikopo ya nyumba huko Domklik.

Hifadhidata kuu ya mfumo wetu wa CRM ni mojawapo ya kubwa zaidi katika suala la kiasi katika kampuni. Pia ni moja ya kongwe zaidi: ilionekana wakati wa uzinduzi wa mradi huo, wakati miti ilikuwa kubwa, Domklik ilikuwa mwanzo, na badala ya huduma ndogo kwenye mfumo wa asynchronous wa Python kulikuwa na monolith kubwa katika PHP.

Mpito kutoka PHP hadi Python ulikuwa mrefu sana na ulihitaji usaidizi wa wakati mmoja wa mifumo yote miwili, ambayo iliathiri muundo wa hifadhidata.

Kwa hivyo, tuna hifadhidata iliyo na idadi kubwa ya jedwali zilizounganishwa sana na kubwa zilizo na rundo la faharasa za aina tofauti za maswali. Yote hii inathiri vibaya utendaji wa hifadhidata: kwa sababu ya meza kubwa na rundo la uhusiano kati yao, ugumu wa maswali unaongezeka kila wakati, ambayo ni muhimu sana kwa meza zilizopakiwa zaidi.

Ili kupunguza mzigo kwenye hifadhidata, tuliamua kuandika hati ambayo ingehamisha rekodi za zamani kutoka kwa meza zenye nguvu na zilizopakiwa hadi zilizohifadhiwa (kwa mfano, kutoka task Π² task_archive).

Kazi hii ni ngumu na idadi kubwa ya uhusiano kati ya jedwali: songa tu safu kutoka task Π² task_archive haitoshi, kabla ya hapo unahitaji kufanya vivyo hivyo kwa kujirudia na marejeleo hayo yote task meza.

Nitaonyesha kwa mfano hifadhidata ya demo kutoka kwa tovuti postgrespro.ru:

PgGraph ni matumizi ya kuhifadhi na kutafuta vitegemezi vya jedwali katika PostgreSQL
Wacha tuseme tunahitaji kufuta rekodi kutoka kwa jedwali Flights. Postgres haitaturuhusu kufanya hivi hivi: kwanza tunahitaji kufuta rekodi kutoka kwa majedwali yote ya marejeleo, na kadhalika kwa kurudia hadi majedwali ambayo hayarejelewi na mtu yeyote.

Katika mfano wetu katika Flights inahusu Ticket_flights, na juu yake - Boarding_passes.

Kwa hivyo, unahitaji kuifuta kwa mpangilio huu:

  1. Tunapata maadili ya funguo za msingi (PK) za safu Ticket_flights, ambayo inarejelea safu mlalo zinazopaswa kufutwa ndani Flights.
  2. Tunapata safu za PK Boarding_passes, ambayo inarejelea Ticket_flights.
  3. Tunafuta safu na PK kutoka hatua ya 2 kwenye jedwali Boarding_passes.
  4. Futa mistari kulingana na PK kutoka hatua ya 1 ndani Ticket_flights.
  5. Kuondoa mistari kutoka Flights.

Matokeo yake yalikuwa matumizi yanayoitwa PgGraph, ambayo tuliamua kutengeneza chanzo wazi.

Jinsi ya kutumia

Huduma inasaidia njia mbili za matumizi:

  • Piga simu kutoka kwa mstari wa amri (pggraph …).
  • Matumizi katika nambari ya Python (darasa PgGraphApi).

Ufungaji na usanidi

Kwanza unahitaji kusanikisha matumizi kutoka kwa hazina ya Pypi:

pip3 install pggraph

Kisha unda faili ya config.ini kwenye mashine ya ndani na usanidi wa hifadhidata na hati ya kuhifadhi kumbukumbu:

[db]
host = localhost
port = 5432
user = postgres
password = postgres
dbname = postgres
schema = public ; ΠΠ΅ΠΎΠ±ΡΠ·Π°Ρ‚Π΅Π»ΡŒΠ½Ρ‹ΠΉ ΠΏΠ°Ρ€Π°ΠΌΠ΅Ρ‚Ρ€, ΡƒΠΊΠ°Π·Π°Π½ΠΎ Π·Π½Π°Ρ‡Π΅Π½ΠΈΠ΅ ΠΏΠΎ ΡƒΠΌΠΎΠ»Ρ‡Π°Π½ΠΈΡŽ

[archive]  ; Π”Π°Π½Π½Ρ‹ΠΉ Ρ€Π°Π·Π΄Π΅Π» Π·Π°ΠΏΠΎΠ»Π½ΡΡ‚ΡŒ Π½Π΅ΠΎΠ±ΡΠ·Π°Ρ‚Π΅Π»ΡŒΠ½ΠΎ, Π½ΠΈΠΆΠ΅ ΡƒΠΊΠ°Π·Π°Π½Ρ‹ значСния ΠΏΠΎ ΡƒΠΌΠΎΠ»Ρ‡Π°Π½ΠΈΡŽ
is_debug = false
chunk_size = 1000
max_depth = 20
to_archive = true
archive_suffix = 'archive'

Endesha kutoka kwa koni

Vigezo

$ pggraph -h
usage: pggraph action [-h] --table TABLE [--ids IDS] [--config_path CONFIG_PATH]
positional arguments:
  action        required action: archive_table, get_table_references, get_rows_references

optional arguments:
  -h, --help                    show this help message and exit
  --table TABLE                 table name
  --ids IDS                     primary key ids, separated by comma, e.g. 1,2,3
  --config_path CONFIG_PATH     path to config.ini
  --log_path LOG_PATH           path to log dir
  --log_level LOG_LEVEL         log level (debug, info, error)

Hoja za msimamo:

  • action - hatua inayohitajika: archive_table, get_table_references au get_rows_references.

Hoja zilizopewa jina:

  • --config_path - njia ya faili ya usanidi;
  • --table - meza ambayo unahitaji kufanya kitendo;
  • --ids - orodha ya kitambulisho ikitenganishwa na koma, kwa mfano, 1,2,3 (kigezo cha hiari);
  • --log_path - njia ya folda kwa magogo (parameter ya hiari, kwa default - folda ya nyumbani);
  • --log_level - kiwango cha ukataji miti (kigezo cha hiari, chaguo-msingi ni INFO).

Mifano ya amri

Kuhifadhi meza

Kazi kuu ya shirika ni kuhifadhi data, i.e. kuhamisha safu kutoka kwa jedwali kuu hadi kwenye jedwali la kumbukumbu (kwa mfano, kutoka kwa jedwali vitabu Π² hifadhi_ya_vitabu).

Kufuta bila kuhifadhi pia kunasaidiwa: kwa hili unahitaji kuweka parameter katika config.ini to_archive = uongo).

Vigezo vinavyohitajika - config_path, jedwali na vitambulisho.

Baada ya uzinduzi, rekodi zitafutwa kwa kujirudia ids mezani table na katika majedwali yote yanayorejelea.

$ pggraph archive_table --config_path config.hw.local.ini --table flights --ids 1,2,3
2020-06-20 19:27:44 INFO: flights - START
2020-06-20 19:27:44 INFO: flights - start archive_recursive 3 rows (depth=0)
2020-06-20 19:27:44 INFO:       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 19:27:44 INFO:               START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               boarding_passes - start archive_recursive 3 rows (depth=2)
2020-06-20 19:27:44 INFO:                       START ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:                       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:               boarding_passes - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:               END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO:       ticket_flights - archive_by_ids 3 rows by ticket_no, flight_id
2020-06-20 19:27:44 INFO:       END ARCHIVE REFERRING TABLES
2020-06-20 19:27:44 INFO: flights - archive_by_ids 3 rows by id
2020-06-20 19:27:44 INFO: flights - END

Tafuta vitegemezi vya jedwali maalum

Kazi ya kupata utegemezi wa jedwali maalum table. Vigezo vinavyohitajika - config_path ΠΈ table.

Baada ya uzinduzi, kamusi itaonyeshwa kwenye skrini, ambapo:

  • in_refs - Kamusi ya majedwali inayorejelea ile iliyotolewa, ambapo ufunguo ni jina la jedwali, thamani ni orodha ya vitu muhimu vya Kigeni (pk_main - ufunguo wa msingi kwenye jedwali kuu, pk_ref - ufunguo wa msingi katika jedwali la marejeleo, fk_ref - jina la safu ambayo ni ufunguo wa kigeni kwenye jedwali la chanzo);
  • out_refs - kamusi ya majedwali ambayo hii inarejelea.

$ pggraph get_table_references --config_path config.hw.local.ini --table flights
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='ticket_no, flight_id', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}

Inatafuta marejeleo ya mifuatano yenye Ufunguo Msingi uliobainishwa

Kazi ya kutafuta safu mlalo katika jedwali zingine zinazorejelea safu mlalo kupitia Ufunguo wa Kigeni ids meza table. Vigezo vinavyohitajika - config_path, table ΠΈ ids.

Baada ya uzinduzi, kamusi iliyo na muundo ufuatao itaonyeshwa kwenye skrini:

{
	pk_id_1: {
		reffering_table_name_1: {
			foreign_key_1: [
				{row_pk_1: value, row_pk_2: value},
				...
			], 
			...
		},
		...
	},
	pk_id_2: {...},
	...
}

Simu ya mfano:

$ pggraph get_rows_references --config_path config.hw.local.ini --table flights --ids 1,2,3
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}

Matumizi katika kanuni

Mbali na kuiendesha kwenye koni, maktaba inaweza kutumika katika msimbo wa Python. Mifano ya simu katika mazingira ya mwingiliano ya iPython imeonyeshwa hapa chini.

Kuhifadhi meza

>>> from pg_graph.main import setup_logging
>>> setup_logging(log_level='DEBUG')
>>> from pg_graph.api import PgGraphApi
>>> api = PgGraphApi('config.hw.local.ini')
>>> api.archive_table('flights', [4,5])
2020-06-20 23:12:08 INFO: flights - START
2020-06-20 23:12:08 INFO: flights - start archive_recursive 2 rows (depth=0)
2020-06-20 23:12:08 INFO: 	START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 	ticket_flights - ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')
2020-06-20 23:12:08 DEBUG: 	SQL('SELECT flight_id, ticket_no FROM bookings.ticket_flights WHERE (flight_id) IN (%s, %s)')
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 30 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 30 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 30 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 30 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 30 rows
2020-06-20 23:12:08 INFO: 	ticket_flights - start archive_recursive 3 rows (depth=1)
2020-06-20 23:12:08 INFO: 		START ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 DEBUG: 		boarding_passes - ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 INFO: 		boarding_passes - archive_by_fk 3 rows by ForeignKey(pk_main='flight_id, ticket_no', pk_ref='flight_id, ticket_no', fk_ref='flight_id, ticket_no')
2020-06-20 23:12:08 DEBUG: 		SQL('CREATE TABLE IF NOT EXISTS bookings.boarding_passes_archive (LIKE bookings.boarding_passes)')
2020-06-20 23:12:08 DEBUG: 		DELETE FROM boarding_passes by FK flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 INFO: 		END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: 	ticket_flights - archive_by_ids 3 rows by flight_id, ticket_no
2020-06-20 23:12:08 DEBUG: 	SQL('CREATE TABLE IF NOT EXISTS bookings.ticket_flights_archive (LIKE bookings.ticket_flights)')
2020-06-20 23:12:08 DEBUG: 	DELETE FROM ticket_flights by flight_id, ticket_no - 3 rows
2020-06-20 23:12:08 DEBUG: 	INSERT INTO ticket_flights_archive - 3 rows
2020-06-20 23:12:08 INFO: 	END ARCHIVE REFERRING TABLES
2020-06-20 23:12:08 INFO: flights - archive_by_ids 2 rows by flight_id
2020-06-20 23:12:09 DEBUG: SQL('CREATE TABLE IF NOT EXISTS bookings.flights_archive (LIKE bookings.flights)')
2020-06-20 23:12:09 DEBUG: DELETE FROM flights by flight_id - 2 rows
2020-06-20 23:12:09 DEBUG: INSERT INTO flights_archive - 2 rows
2020-06-20 23:12:09 INFO: flights - END

Tafuta vitegemezi vya jedwali maalum

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> res = api.get_table_references('flights')
>>> pprint(res)
{'in_refs': {'ticket_flights': [ForeignKey(pk_main='flight_id', pk_ref='flight_id, ticket_no', fk_ref='flight_id')]},
 'out_refs': {'aircrafts': [ForeignKey(pk_main='aircraft_code', pk_ref='flight_id', fk_ref='aircraft_code')],
              'airports': [ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='arrival_airport'),
                           ForeignKey(pk_main='airport_code', pk_ref='flight_id', fk_ref='departure_airport')]}}

Inatafuta marejeleo ya mifuatano yenye Ufunguo Msingi uliobainishwa

>>> from pg_graph.api import PgGraphApi
>>> from pprint import pprint
>>> api = PgGraphApi('config.hw.local.ini')
>>> rows = api.get_rows_references('flights', [1,2,3])
>>> pprint(rows)
{1: {'ticket_flights': {'flight_id': [{'flight_id': 1,
                                       'ticket_no': '0005432816945'},
                                      {'flight_id': 1,
                                       'ticket_no': '0005432816941'}]}},
 2: {'ticket_flights': {'flight_id': [{'flight_id': 2,
                                       'ticket_no': '0005433101832'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005433101864'},
                                      {'flight_id': 2,
                                       'ticket_no': '0005432919715'}]}},
 3: {'ticket_flights': {'flight_id': [{'flight_id': 3,
                                       'ticket_no': '0005432817560'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817568'},
                                      {'flight_id': 3,
                                       'ticket_no': '0005432817559'}]}}}

Msimbo wa chanzo wa maktaba unapatikana GitHub chini ya leseni ya MIT, na vile vile kwenye hazina PyPI.

Nitafurahi kwa maoni, ahadi na maoni.

Nitajaribu kujibu maswali kwa kadri ya uwezo wangu hapa na kwenye hazina.

Chanzo: mapenzi.com

Kuongeza maoni