How to make friends with the Progress OpenEdge banking system and Oracle DBMS

Since 1999, our bank has been using the integrated banking system BISKVIT based on the Progress OpenEdge platform to serve the back office, which is widely used all over the world, including in the financial sector. The performance of this DBMS allows you to read up to a million or more records per second in one database (DB). Progress OpenEdge serves about 1,5 million individual deposits and about 22,2 million contracts for active products (car loans and mortgages), and is also responsible for all settlements with the regulator (CB) and SWIFT.

How to make friends with the Progress OpenEdge banking system and Oracle DBMS

Using Progress OpenEdge, we are faced with the fact that we need to make friends with the Oracle DBMS. Initially, this bundle was the "bottleneck" of our infrastructure - until we installed and configured Pro2 CDC - a Progress product that allows you to send data from the Progress DBMS to the Oracle DBMS directly, in online mode. In this post, we will tell you in detail, with all the pitfalls, how to effectively make OpenEdge and Oracle friends.

How it was: uploading data to QCD via file exchange

First, some facts about our infrastructure. The number of active users of the database is approximately 15. The volume of all productive databases, including replica and standby, is 600 TB, the largest database is 16,5 TB. At the same time, the databases are constantly replenished: in the last year alone, about 120 TB of productive data have been added. The system is supported by 150 x86 front servers. The databases are hosted on 21 IBM platform servers.

How to make friends with the Progress OpenEdge banking system and Oracle DBMS
Front systems, various ABS and banking services are integrated with OpenEdge Progress (IBS BISKVIT) via the Sonic ESB bus. Data is uploaded to QCD through file exchange. Until a certain point in time, such a solution had two big problems at once - low performance of uploading information to a corporate data warehouse (CWD) and a long time to perform data reconciliation (reconciliation) with other systems.
How to make friends with the Progress OpenEdge banking system and Oracle DBMS
Therefore, we began to look for a tool that could speed up these processes. The solution to both problems was the new Progress OpenEdge product - Pro2 CDC (Change Data Capture). So, let's begin.

Installing Progress OpenEdge and Pro2Oracle

Progress Oracle OpenEdge Developer Kit Classroom Edition, which can be download for free. Default OpenEdge installation directories:

DLC: C:ProgressOpenEdge
WRK: C:OpenEdgeWRK

ETL processes require Progress OpenEdge version 11.7+ licenses - namely OE DataServer for Oracle and 4GL Development System. These licenses are included with the Pro2. For full-fledged operation of DataServer for Oracle with a remote Oracle database, the Full Oracle Client is installed.

On the Oracle server, you need to install the version of Oracle Database 12+, create an empty database and add a user (let's call it cDC).

To install Pro2Oracle, download a fresh distribution from the download center progress software. Unpack the archive into a directory C:Pro2 (to configure Pro2 on Unix, the same distribution is used and the same principles of configuration apply).

Creating a cdc replication database

Replication database cdc (repl) used by Pro2 to store configuration information, including the replication map, the names of the replicated databases and their tables. It also contains a replication queue consisting of notes about the fact that a table row has changed in the source database. The replication queue data is used by the ETL processes to identify the rows that need to be copied to Oracle from the source database.

We create a separate database cdc.

Procedure for creating a base

  1. On the database server, we create a directory for the cdc database - for example, on the server /database/cdc/.
  2. Create a dummy for the cdc database: procopy $DLC/empty cdc
  3. Enable support for large files: proutil cdc -C EnableLargeFiles
  4. We prepare the script for starting the cdc database. The start parameters must be similar to the start parameters of the replicated database.
  5. We start the cdc database.
  6. Connect to cdc database and load Pro2 schema from file cdc.df, which is included with the Pro2.
  7. Create the following users in the cdc database:

pro2adm - to connect from the Pro2 administrative panel;
pro2etl - for connecting ETL processes (ReplBatch);
pro2cdc - for connecting CDC processes (CDCBatch);

Activating OpenEdge Change Data Capture

Now let's turn on the CDC mechanism itself, with the help of which data will be replicated to an additional technological area. In each source Progress OpenEdge database, you need to add separate storage areas to which the source data will be duplicated, and activate the mechanism itself using the command proutil.

Example procedure for the bisquit database

  1. Copy from directory C:Pro2db file cdcadd.st to the original bisquit database directory.
  2. We describe in cdcadd.st fixed size extents for areas "ReplCDCArea" ΠΈ "ReplCDCArea_IDX". You can add new storage areas online: prostrct addonline bisquit cdcadd.st
  3. Activate OpenEdge CDC:
    proutil bisquit -C enablecdc area "ReplCDCArea" indexarea "ReplCDCArea_IDX"
  4. The following users must be created in the source database to identify running processes:
    a. pro2adm - to connect from the Pro2 administrative panel.
    b. pro2etl - for connecting ETL processes (ReplBatch).
    c. pro2cdc - for connecting CDC processes (CDCBatch).

Creating Schema Holder for DataServer for Oracle

Next, we need to create a Schema Holder database on the server where data from the Progress DBMS to the Oracle DBMS will be replicated. The DataServer Schema Holder is an empty Progress OpenEdge database with no users or application data, containing a mapping between source tables and external, Oracle tables.

The Schema Holder for Progress OpenEdge DataServer for Oracle for Pro2 must be located on the ETL process server and is created separately for each branch.

How to create a Schema Holder

  1. Unpack the Pro2 distribution into a directory /pro2
  2. Create and change directory /pro2/dbsh
  3. Create the Schema Holder database using the command copy $DLC/empty bisquitsh
  4. Performing the conversion bisquitsh into the required encoding - for example, in UTF-8 if Oracle databases have UTF-8 encoding: proutil bisquitsh -C convchar convert UTF-8
  5. After creating an empty database bisquitsh connect to it in single user mode: pro bisquitsh
  6. Go to the Data Dictionary: Tools -> Data Dictionary -> DataServer -> ORACLE Utilities -> Create DataServer Schema
  7. Launch Schema Holder
  8. Set up the Oracle DataServer broker:
    a. Start AdminServer.
    proadsv -start
    b. Starting Oracle DataServer Broker
    oraman -name orabroker1 -start

Setting up the administrative panel and replication scheme

The Pro2 administrative panel configures Pro2 settings, including replication schema setup and ETL process program generation (Processor Library), primary synchronization programs (Bulk-Copy Processor), replication triggers, and OpenEdge CDC policies. There are also primary tools for monitoring and managing ETL and CDC processes. First of all, we set up the parameter files.

How to set up parameter files

  1. Go to catalog C:Pro2bpreplScripts
  2. Opening the file for editing replProc.pf
  3. Add connection parameters to the cdc replication database:
    # Replication Database
    -db cdc -ld repl -H <main db hostname> -S <cdc db broker port>
    -U pro2admin -P <password>
  4. Let's add to replProc.pf connection settings to source databases and Schema Holder as settings files. The name of the parameter file must match the name of the source database being connected.
    # Connect to all replicated sources BISQUIT
    -pf bpreplscriptsbisquit.pf
  5. Let's add to replProc.pf connection parameters to Schema Holder.
    #Target Pro DB Schema Holder
    -db bisquitsh -ld bisquitsh
    -H <hostname of ETL processes>
    -S <biskuitsh broker port>
    -db bisquitsql
    -ld bisquitsql
    -dt ORACLE
    -S 5162 -H <Oracle broker hostname>
    -DataService orabroker1
  6. Saving the settings file replProc.pf
  7. Next, you need to create and open for editing parameter files for each pluggable source database in the directory C:Pro2bpreplScripts: bisquit.pf. Each pf file specifies the parameters for connecting to the corresponding database, for example:
    -db bisquit -ld bisquit -H <hostname> -S <broker port>
    -U pro2admin -P <password>

To configure Windows shortcuts, go to the directory C:Pro2bpreplScripts and edit the label "Pro2 - Administration". To do this, open the properties of the shortcut and in the line start in specify the Pro2 installation directory. A similar operation must be done for the shortcuts "Pro2 - Editor" and "RunBulkLoader".

Pro2 Administration Setup: Load Primary Configuration

We start the console.

How to make friends with the Progress OpenEdge banking system and Oracle DBMS

Go to "DB Map".

How to make friends with the Progress OpenEdge banking system and Oracle DBMS

To link databases in Pro2 - Administration, go to the tab db map. Adding source database mapping - Schema Holder - Oracle.

How to make friends with the Progress OpenEdge banking system and Oracle DBMS

Go to the tab Mapping. In the list Source Database by default, the first connected source database is selected. To the right of the list should be the inscription All Databases Connected β€” the selected bases are connected. Below, on the left, you should see a list of Progress tables from bisquit. On the right is a list of tables from the Oracle database.

Creating SQL Schemas and Databases in Oracle

To create a replication map, you must first generate SQL schema in Oracle. In Pro2 Administration, execute the menu item Tools -> Generate Code -> Target Schema, then in the dialog box Select Database select one or more source databases and move them to the right.

How to make friends with the Progress OpenEdge banking system and Oracle DBMS

Click OK and select a directory to save the SQL schemas.

Next, we create the base. This can be done, for example, through Oracle SQL Developer. To do this, we connect to the Oracle database and load the schema for adding tables. After changing the composition of the Oracle tables, you need to update the SQL schemas in the Schema Holder.

How to make friends with the Progress OpenEdge banking system and Oracle DBMS

After the download is successfully completed, exit the bisquitsh database and open the Pro2 administrative panel. Tables from the Oracle database should appear on the Mapping tab on the right.

Table mapping

To create a replication map in the Pro2 administrative panel, go to the Mapping tab, select the source database. We click on Map Tables, select on the left Select Changes the tables that should be replicated in Oracle, transfer them to the right and confirm the selection. For the selected tables, a map will be created automatically. We repeat the operation to create a replication map for other source databases.

How to make friends with the Progress OpenEdge banking system and Oracle DBMS

Generate Pro2 Replication Processor Library and Bulk-Copy Processor Programs

The Processor Library is for dedicated replication processes (ETL) that process the Pro2 replication queue and push changes to the Oracle database. Replication processor library programs after generation are automatically saved to the directory bprepl/repl_proc (PROC_DIRECTORY parameter). To generate the replication processor library, go to Tools -> Generate Code -> Processor Library. After the generation is completed, the programs will appear in the catalog bprepl/repl_proc.

Bulk loader programs are used to synchronize source Progress databases with the target Oracle database based on the Progress ABL (4GL) programming language. To generate them, go to the menu item Tools -> Generate Code -> Bulk-Copy Processor. In the Select Database dialog box, select the source databases, transfer them to the right side of the window and click OK. After the generation is completed, the programs will appear in the catalog bpreplrepl_mproc.

Setting up replication processes in Pro2

Dividing tables into sets served by a separate replication thread improves the performance and efficiency of Pro2 Oracle. By default, all links created in the replication map for new replication tables are bound to stream number 1. It is recommended to split tables into different streams.

Information about the status of replication streams is displayed on the Pro2 Administration screen in the Monitor tab in the Replication Status section. A detailed description of the parameter values ​​can be found in the Pro2 documentation (C:Pro2Docs directory).

Creating and activating CDC policies

Policies are a set of rules for the OpenEdge CDC engine that track changes to tables. At the time of writing, Pro2 only supports CDC policies with level 0, meaning only the fact is tracked record changes.

To create a CDC policy in the administrative panel, go to the Mapping tab, select the source database, and click the Add/Remove Policies button. In the Select Changes window that opens, select on the left side and transfer to the right tables for which you need to create or delete a CDC policy.

To activate again, open the Mapping tab, select the source database and click on the button (In)Activate Policies. Select and move to the right side of the table whose policies need to be activated, click OK. After that, they are marked in green. By using (In)Activate Policies you can also disable CDC policies. All transactions are performed online.

How to make friends with the Progress OpenEdge banking system and Oracle DBMS

After the CDC policy is activated, notes about the modified records are stored in the storage area "ReplCDCArea" according to the original database. These notes will be handled by a special process CDCBatch, which, based on them, will create notes in the Pro2 replication queue in the database cdc (repl).

Thus, we have two queues for replication. The first queue is CDCBatch: from the source database, the data first enters the intermediate CDC database. The second stage is when data is transferred from the CDC database to Oracle. This is a feature of the current architecture and the product itself - until the developers were able to establish direct replication.

Primary sync

After enabling the CDC mechanism and configuring the Pro2 replication server, we need to start the initial synchronization. Primary sync start command:

/pro2/bprepl/Script/replLoad.sh bisquit table-name

After the initial synchronization is completed, you can start replication processes.

Start of replication processes

To start replication processes, you need to execute the script replbatch.sh. Before starting, make sure that replbatch scripts are available for all threads - replbatch1, replbatch2, etc. If everything is in place, open the command line (for example, proenv), go to the directory /bprepl/scripts and run the script. In the administrative panel, we check that the corresponding process has received the RUNNING status.

How to make friends with the Progress OpenEdge banking system and Oracle DBMS

The results

How to make friends with the Progress OpenEdge banking system and Oracle DBMS
After the implementation, we greatly accelerated the uploading of information to the corporate data warehouse. The data independently gets to Oracle online. There is no need to waste time on running some long-running queries to collect data from different systems. In addition, in this solution, the replication process can compress data, which also has a positive effect on speed. Now the daily reconciliation of the BISKVIT system with other systems began to take 15-20 minutes instead of 2-2,5 hours, and a full reconciliation took several hours instead of two days.

Source: habr.com

Add a comment