Which is better - Oracle or Redis or How to justify the choice of platform

β€œIt’s necessary,” she said loudly to no one. - It's necessary! So it is directly written - the main task of the company is to extract profit in the interests of shareholders. Well, you think! Don't be afraid of anything!

Julius Dubov, "The Lesser Evil"

Seeing such a headline, you probably already decided that the article is either stupidity or a provocation. But do not rush to conclusions: employees of large corporations, especially state-owned corporations, quite often have to compare different platforms, including completely different ones - for example, those in the title.

Which is better - Oracle or Redis or How to justify the choice of platform

Of course, no one compares DBMS in this way, because their strengths and weaknesses are well known. As a rule, platforms that solve some applied problem are subject to comparison. In the article, I will show the methodology that is used in this case, using the example of databases as a subject familiar to Habr readers firsthand. So,

Motivation

When you start a study project or a hobby project, the motivation for choosing a platform can be very diverse: β€œI know this platform best”, β€œI am interested in learning about this one”, β€œhere is the best documentation” ... In the case of a commercial company, the selection criterion is one: how much will I have to pay and what will I get for this money.

Naturally, you want to pay less and get more. However, it is necessary to decide what is more important - to pay less or get more, and assign a weight to each node. Let's assume that a high-quality solution is more important to us than a cheap one, and we assign a weight of 40% to the "Cost" node, and 60% to the "Features" node.

Which is better - Oracle or Redis or How to justify the choice of platform

In large corporations, usually the opposite is true - the cost weight does not fall below 50%, and maybe more than 60%. In the model example, it is only important that the total weight of the child nodes of any parent node must be 100%.

Cut-off conditions

Site db-engines.com About 500 database management systems are known. Naturally, if you choose a target platform from so many options, you might end up with a review article, but not a commercial project. In order to reduce the choice space, cut-off criteria are formulated, and if the platform does not meet these criteria, then it is not considered.

Cut-off criteria may refer to technological features, for example:

  • ACID guarantees;
  • relational data model;
  • support for the SQL language (note that this is not the same as "relational model");
  • possibility of horizontal scaling.

There may be general criteria:

  • availability of commercial support in Russia;
  • open source;
  • presence of a platform in the Register of the Ministry of Telecom and Mass Communications;
  • presence of the platform in some rating (for example, in the first hundred of the db-engines.com rating);
  • availability of experts in the market (for example, based on the results of searching for the name of the platform in the summary on the hh.ru website).

After all, there may be enterprise-specific criteria:

  • availability of specialists in the state;
  • compatibility with monitoring system X or backup system Y, on which all maintenance is tied ...

The most important thing is that there should be a list of cut-off criteria. Otherwise, there will definitely be some expert (or "expert"), who enjoys the special trust of management, who will say, "why didn't you choose the Z platform, I know it is the best."

Cost estimate

The cost of the solution obviously consists of the cost of licenses, the cost of maintenance and the cost of equipment.

If the systems are approximately the same class (for example, Microsoft SQL Server and PostgreSQL), then for simplicity we can assume that the amount of equipment for both solutions will be approximately the same. This will allow you not to evaluate the equipment, thereby saving a lot of time and effort. If you have to compare completely different systems (say, Oracle vs. Redis), then it is obvious that for a correct assessment it is necessary to do sizing (calculation of the amount of equipment). Sizing a non-existent system is a very thankless task, so they still try not to allow such a comparison. It is easy to do this: zero data loss and a relational model are written in cut-off conditions, or vice versa - a load of 50 thousand transactions per second.

To evaluate licenses, it is enough to ask the vendor or its partners for the cost of a license for a fixed number of cores and support for a fixed period. As a rule, companies already have strong relationships with software vendors, and if the database operations department cannot answer the question about the cost on its own, then one letter is enough to obtain this information.

Different vendors may have different licensing metrics: by the number of cores, the amount of data, or the number of nodes. The standby base can be free, or it can be licensed in the same way as the main one. If only some differences in the metrics are found, you will have to describe the model stand in detail and calculate the cost of licenses for the stand.

An important point for a correct comparison is the same support conditions. Let's say Oracle support costs 22% of the license price per year, but you can not pay for PostgreSQL support. Is it correct to compare like this? No, because an error that cannot be fixed on its own has completely different consequences: in the first case, support specialists will quickly help you fix it, and in the second case, there is a risk of delaying the project or downtime of the finished system for an indefinite period.

You can equalize the calculation conditions in three ways:

  1. Use Oracle without support (in reality, this does not happen).
  2. Buy support on PostgreSQL - for example, from Postgres Professional.
  3. Take into account the risks associated with the lack of support.

For example, a risk calculation might look like this: in the event of a fatal database failure, a simple system will be 1 business day. The planned profit from the use of the system is MNT 40 billion per year, the accident rate is estimated at 1/400, so the risk of unaccompanying is estimated at about MNT 100 million per year. It is obvious that "planned profit" and "estimated frequency of accidents" are virtual values, but it is much better to have such a model than not to have any.

In reality, the system may be too important, and the reputational loss from a long downtime will be unacceptable, so support is required. If downtime is allowed, then dropping support can sometimes be a good way to save money.

Assume that after all calculations, the cost of operating platform A for 5 years turned out to be 800 million MNT, the cost of operating platform B was 650 million MNT, and the cost of operating platform C was 600 million MNT. Platform C, as the winner, gets a full point for the cost, and platforms A and B get a little less, in proportion to how many times they are more expensive. In this case, 0.75 and 0.92 points, respectively.

Assessment of opportunities

Opportunity assessment is divided into many groups, the number of which is limited only by the imagination of the one who makes the assessment. The best option seems to be the division of opportunities by teams that will use these opportunities; in our example, these are developers, administrators, and information security officers. Let's say that the weights of these functions are distributed as 40:40:20.

Development features include:

  • ease of data manipulation;
  • scaling;
  • presence of secondary indexes.

The list of criteria, like their weights, is very subjective. Even when solving the same problem, these lists, item weights, and answers will vary significantly depending on the composition of your team. For example, Facebook uses MySQL for data storage, and Instagram is built on Cassandra. It is unlikely that the developers of these applications filled out such tables. One can only guess that Mark Zuckerberg chose a full-fledged relational model, paying for it with the need for applied sharding, while Kevin Systrom laid down scaling using platform tools, sacrificing ease of access to data.

Administration functions include:

  • backup system capabilities;
  • ease of monitoring;
  • ease of capacity management - disks and nodes;
  • data replication capabilities.

Note that the wording of the questions must be quantifiable. You can even agree on how to evaluate a particular function. Let's, for example, try to rate backup tools using the example of tools supplied with Oracle DBMS:

Tool
Comment
Evaluation

imp/exp
Data upload and download
0.1

begin/end backup
Copying files
0.3

RMAN
Possibility of incremental copying
0.7

ZDLRA
Incremental backup only, fastest point restore
1.0

If there are no clear evaluation criteria, it makes sense to ask several experts to give marks, and then average them.

Finally, let's just list the information security features:

  • the presence of password management policies;
  • the ability to connect external authentication tools (LDAP, Kerberos);
  • role model of access;
  • audit capabilities;
  • disk encryption;
  • encryption in transit over the network (TLS);
  • data protection from the administrator.

Performance testing

Separately, I would like to warn against using the results of any load tests that were not made by you as arguments.

First, the data structure and load profile of the applications under test may differ significantly from the task that you are going to solve. 10-15 years ago, database vendors liked to flaunt their TPC scores, but now no one seems to take those scores seriously.

Secondly, the performance of the system depends quite strongly on the platform for which the code was originally written and on what equipment the test was carried out. I have seen a lot of benchmarks comparing Oracle with PostgreSQL. The results range from the unconditional superiority of one system to the equally unconditional superiority of another.

And finally, thirdly, you do not know anything about who conducted the test. Both qualifications, which affect the quality of OS and platform settings, are important, as well as motivation, which affects test results more than all other factors combined.

If performance is a critical factor, perform the test yourself, preferably with the experts who will tune and maintain the production system.

Experience the Power of Effective Results

Finally, the result of all the work done should be a spreadsheet, where all the scores are summarized, multiplied and summed up:

Which is better - Oracle or Redis or How to justify the choice of platform

As you understand, by changing the weights and adjusting the estimates, you can achieve any desired result, but that's a completely different story ...

Source: habr.com

Add a comment