AWR: How Exaggerated is the Database?

With this short post, I would like to dispel one misunderstanding related to the analysis of AWR databases running on Oracle Exadata. For almost 10 years I have been constantly faced with the question: what is the contribution of Exadata Software to performance? Or with the use of newly formed words: to what extent does the work of a particular database “exdate”?

AWR: How Exaggerated is the Database?

Often this correct question, in my opinion, is given the wrong answer with reference to AWR statistics. It presents a system wait method that treats the response time as the sum of the processor time (DB CPU) and the wait time of various classes.

With the advent of Exadata in the AWR statistics, specific system expectations associated with the work of Exadata Software appeared. As a rule, the names of such waits begin with the word “cell” (the cell is called the Exadata Storage server), of which the most common are waits with the telling names “cell smart table scan”, “cell multiblock physical read” and “cell single block physical read”.

In most cases, the share of such Exadata waits in the total response time is small, and therefore they do not even fall into the Top10 Foreground Events by Total Wait Time section (in this case, they should be looked for in the Foreground Wait Events section). With great difficulty, we found an example of a daily AWR among our customers, in which Exadata-expectations fell into the Top10 section and amounted to about 5% in total:

Events

Waits

Total Wait Time (sec)

Avg Wait

%DB time

Wait Class

DB CPU

115.2K

70.4

SQL*Net more data from dblink

670,196

5471.5

8.16ms

3.3

Network

cell single block physical read

5,661,452

3827.6

676.07us

2.3

User I/O

Sync ASM balance

4,350,012

3481.3

800.30us

2.1

Other

cell multiblock physical read

759,885

2252

2.96ms

1.4

User I/O

direct path read

374,368

1811.3

4.84ms

1.1

User I/O

SQL*Net message from dblink

7,983

1725

216.08ms

1.1

Network

cell smart table scan

1,007,520

1260.7

1.25ms

0.8

User I/O

direct path read temp

520,211

808.4

1.55ms

0.5

User I/O

enq:TM-contention

652

795.8

1220.55ms

0.5

Application

From such AWR statistics, the following conclusions are often drawn:

1. The contribution of Exadata magic to database performance is not high - it does not exceed 5%, and the database "exadata" is bad.

2. If such a database is transferred from Exadata to the classic “server + array” architecture, then the performance will not change much. Because even if this array turns out to be three times slower than the Exadata storage system (which is hardly possible for modern All Flash arrays), then by multiplying 5% by three we will get an increase in the proportion of I / O waits up to 15% - the database will surely survive this!

Both of these conclusions are inaccurate, moreover, they distort the understanding of the idea embodied in Exadata Software. Exadata not only provides fast I/O, it works in a fundamentally different way compared to the classic “server + array” architecture. If the work of the database is really “exaggerated”, then the SQL logic is transferred to the storage system. Storage servers, thanks to a number of special mechanisms (primarily Exadata Storage Indexes, but not only), find the necessary data themselves and send DB to the servers. They do this quite efficiently, so the share of typical Exadata waits in the total response time is small. 

How will this share change outside of Exadata? How will this affect the overall performance of the database? Testing is the best way to answer these questions. For example, waiting for a “cell smart table scan” outside of Exadata can become such a heavy Table Full Scan that I/O takes up the entire response time and performance degrades dramatically. That is why, when analyzing AWR, it is wrong to consider the total percentage of Exadata waits as the contribution of its magic to performance, and even more so to use this percentage to predict performance outside of Exadata. To understand how “exadata” the work of the database is, you need to study the AWR statistics of the “Instance Activity Stats” section (there are a lot of statistics with telling names) and compare them with each other.

And in order to understand how the database will feel outside of Exadata, it is best to make a clone of the database on the target architecture from the backup and analyze the performance of this clone under load. As a rule, Exadata owners have such an opportunity.

Author: Alexey Struchenko, head of the Jet Infosystems database

Source: habr.com

Add a comment