Business logic in the database with SchemaKeeper

The purpose of this article is to use the schema-keeper library as an example to show tools that can significantly simplify database development in PHP projects that use the PostgreSQL DBMS.

The information in this article will primarily be useful to developers who want to use PostgreSQL capabilities to the fullest but face the problems of maintaining business logic placed in the database.

The article will not discuss the advantages or disadvantages of storing business logic in the database. It is assumed that the reader has already made that choice.

The following questions will be considered:

  1. In what form to store a dump of the database structure in the version control system (hereinafter referred to as VCS)
  2. How to track changes in the database structure after saving the dump
  3. How to transfer changes in the database structure to other environments without conflicts and giant migration files
  4. How to organize parallel work on a project by several developers
  5. How to safely deploy changes in the database structure to the production environment

    SchemaKeeper is tailored to work with stored procedures written in PL/pgSQL. It has not been tested with other languages, so using it with them may be less effective or impossible.

How to store a database structure dump in VCS

The schema-keeper library provides a saveDump function, which saves the structure of all objects in the database as separate text files. The output is a directory containing the database structure, divided into grouped files that are easy to add to the VCS.
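A minimal sketch of calling saveDump from PHP might look like this (the PgSQLParams and Keeper class names and the constructor signature are assumptions for illustration; check the library's documentation for the exact API):

```php
<?php
use SchemaKeeper\Provider\PgSQL\PgSQLParams;
use SchemaKeeper\Keeper;

// Connection parameters for the target PostgreSQL database
// (parameter names and order are assumptions, check the actual signature)
$params = new PgSQLParams('localhost', '5432', 'my_db', 'user', 'password');

$keeper = new Keeper($params);

// Save the structure of all database objects as separate text files
// under the given directory, ready to be committed to the VCS
$keeper->saveDump('/path/to/dump');
```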

Consider how database objects are converted to files, using several examples:

Object type      | Schema  | Name              | Relative file path
-----------------|---------|-------------------|----------------------------------
Table            | public  | accounts          | ./public/tables/accounts.txt
Stored procedure | public  | auth(hash bigint) | ./public/functions/auth(int8).sql
View             | booking | tariffs           | ./booking/views/tariffs.txt

The content of each file is a textual representation of the structure of a particular database object. For example, for stored procedures, the content of the file is the full definition of the stored procedure, starting with the CREATE OR REPLACE FUNCTION block.
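For illustration, the file ./public/functions/auth(int8).sql from the table above might contain a definition along these lines (the function body is a hypothetical example, not the library's output for any real database):

```sql
-- Full definition of the stored procedure, as stored in the dump file
CREATE OR REPLACE FUNCTION public.auth(hash bigint)
 RETURNS boolean
 LANGUAGE plpgsql
AS $function$
BEGIN
    -- Hypothetical body: check whether a session with this hash exists
    RETURN EXISTS (SELECT 1 FROM sessions WHERE session_hash = hash);
END;
$function$
```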

As can be seen from the table above, the path to the file contains information about the type, schema, and name of the object. This approach makes it easier to navigate the dump and to review changes to the database.

The .sql extension for files with stored procedure source code is chosen so that the IDE automatically provides tools for interacting with the database when the file is opened.

How to track changes in the database structure after saving the dump

By saving a dump of the current database structure in the VCS, we gain the ability to check whether changes were made to the database structure after the dump was created. To detect such changes, the schema-keeper library provides a verifyDump function, which returns information about the differences without side effects.

An alternative way to check is to call the saveDump function again, specifying the same directory, and check the VCS for changes. Since all objects from the database are stored in separate files, the VCS will show only the changed objects.
The main disadvantage of this method is that the files must be overwritten in order to see the changes.

How to transfer changes in the database structure to other environments without conflicts and giant migration files

Thanks to the deployDump function, stored procedure source code can be edited in exactly the same way as regular application source code. You can add or remove lines in the stored procedure code and immediately commit the changes to the version control system, or create or delete stored procedures by creating or deleting the corresponding files in the dump directory.

For example, to create a new stored procedure in the public schema, just create a new file with the .sql extension in the public/functions directory, put the source code of the stored procedure into it, including the CREATE OR REPLACE FUNCTION block, and then call the deployDump function. Modifying and removing a stored procedure works the same way. Thus, the code enters both the VCS and the database at the same time.

If there is an error in the source code of any stored procedure, or the file name does not match the stored procedure name, deployDump fails and displays the error text. A mismatch between the stored procedures in the dump and those in the current database is not possible when using deployDump.

When creating a new stored procedure, there is no need to manually enter the correct file name. It is enough for the file to have the .sql extension. After calling deployDump, the error text will contain the correct name, which can be used to rename the file.

deployDump allows you to change the parameters or return type of a function without additional steps, whereas with the classical approach you would have to execute DROP FUNCTION first, and only then CREATE OR REPLACE FUNCTION.
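The classical approach mentioned above, for example changing a function's return type, would look like this in plain SQL (the function name and body are hypothetical examples):

```sql
-- Classical approach: PostgreSQL refuses to change the return type
-- with CREATE OR REPLACE alone, so the function must be dropped first
DROP FUNCTION public.auth(bigint);

CREATE OR REPLACE FUNCTION public.auth(hash bigint)
 RETURNS integer  -- the new return type
 LANGUAGE plpgsql
AS $function$
BEGIN
    RETURN 1;  -- hypothetical body
END;
$function$
```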

Unfortunately, there are some situations where deployDump cannot apply the changes automatically. For example, when a trigger function that is used by at least one trigger is deleted. Such situations are resolved manually using migration files.

While schema-keeper itself is responsible for transferring changes to stored procedures, migration files must be used to transfer the rest of the structural changes. For example, doctrine/migrations is a good library for working with migrations.

Migrations must be applied before running deployDump. This makes it possible to apply all structural changes and resolve problematic situations, so that the changes to stored procedures are then transferred without problems.

Working with migrations will be described in more detail in the following sections.

How to organize parallel work on a project by several developers

It is necessary to create a complete database initialization script that each developer runs on their machine, bringing the local database structure in line with the dump saved in the VCS. The easiest way is to divide initialization of the local database into 3 steps:

  1. Importing a file with the basic structure, which will be called, for example, base.sql
  2. Applying migrations
  3. Calling deployDump

base.sql is the starting point, on top of which migrations and deployDump are applied; that is, base.sql + migrations + deployDump = the current database structure. Such a file can be generated with the pg_dump utility. base.sql is used only when initializing the database from scratch.
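The three initialization steps might be combined into a refresh.sh sketch like the following (the database name, the file paths, and the PHP wrapper script that calls deployDump are placeholders; the actual migration command depends on your tooling):

```shell
#!/bin/sh
set -e

DB_NAME="my_db"        # placeholder database name
DUMP_DIR="./db/dump"   # directory with the dump saved by saveDump

# 1. Recreate the local database and import the basic structure
dropdb --if-exists "$DB_NAME"
createdb "$DB_NAME"
psql -d "$DB_NAME" -f ./db/base.sql

# 2. Apply migrations (here via doctrine/migrations)
./vendor/bin/doctrine-migrations migrations:migrate --no-interaction

# 3. Call deployDump through a small PHP wrapper script (placeholder)
php ./bin/deploy_dump.php "$DUMP_DIR"
```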

Let's call the full database initialization script refresh.sh. The workflow might look like this:

  1. The developer runs refresh.sh in his environment and gets the current database structure
  2. The developer begins work on the task, modifying the local database to meet the needs of the new functionality (ALTER TABLE ... ADD COLUMN, etc.)
  3. After the task is completed, the developer calls the saveDump function to commit the changes made in the database to the VCS
  4. The developer re-runs refresh.sh, then verifyDump, which now shows the list of changes to include in the migration
  5. The developer transfers all structural changes to the migration file, runs refresh.sh and verifyDump again, and, if the migration was composed correctly, verifyDump will show no differences between the local database and the saved dump

The process described above is compatible with the principles of gitflow. Each branch in the VCS will contain its own version of the dump, and when branches are merged, dumps will be merged. In most cases, no further action is required after the merge, but if changes were made in different branches, for example, to the same table, a conflict may arise.

Consider a conflict situation using an example: there is a branch develop, from which two branches, feature1 and feature2, branch off; they have no conflicts with develop but conflict with each other. The task is to merge both branches into develop. In such a case, it is recommended to first merge one of the branches into develop, then merge develop into the remaining branch and resolve the conflicts there, and then merge that last branch into develop. During the conflict resolution step, you may need to fix the migration file in the last branch so that it matches the final dump that includes the merges.

How to safely deploy changes in the database structure to the production environment

Because a dump of the current database structure is kept in the VCS, it is possible to check the production database for exact compliance with the required structure. This ensures that all the changes the developers intended were successfully transferred to the production database.

Since DDL in PostgreSQL is transactional, it is recommended to adhere to the following deployment order so that, in case of an unexpected error, a ROLLBACK can be executed painlessly:

  1. Start transaction
  2. Run all migrations in a transaction
  3. In the same transaction, execute deployDump
  4. Without completing the transaction, execute verifyDump. If there are no errors, run COMMIT. If there are errors, run ROLLBACK
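The four steps above can be sketched as a single SQL session (the migration statement is a hypothetical example; deployDump and verifyDump are called from application code on the same connection, shown here as comments):

```sql
BEGIN;

-- 1-2. Apply all migrations within the transaction
ALTER TABLE accounts ADD COLUMN created_at timestamptz;  -- example migration

-- 3. Execute deployDump on the same connection (called from PHP)

-- 4. Execute verifyDump on the same connection;
--    if it reports no differences:
COMMIT;
--    otherwise:
-- ROLLBACK;
```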

These steps are fairly easy to integrate into existing approaches to application deployment, including zero-downtime.

Conclusion

Thanks to the methods described above, you can get the most performance out of "PHP + PostgreSQL" projects, while sacrificing relatively little development convenience compared to implementing all the business logic in the main application code. Moreover, data processing in PL/pgSQL often looks more transparent and requires less code than the same functionality written in PHP.

Source: habr.com
