Oracle AWR Reports: Understanding I/O Statistics


One consequence of my job is that I spend a lot of time looking at Oracle Automatic Workload Repository reports, specifically at information about I/O. I really do mean a lot of time (honestly, I’m not kidding, I have had dreams about AWR reports). One thing that comes up very frequently is confusion about how the measurements of IOPS and throughput are displayed in the AWR report Load Profile section. The answer is that they aren’t. Well, not exactly… let me explain.

Physical Read and Write I/O

Right at the top of an AWR report, just after the Database and Host details, you’ll find the familiar Load Profile section. Until recently, it had changed very little through the releases of Oracle since its introduction in 10g. Here’s a sample from 11g Release 2:

Load Profile              Per Second    Per Transaction   Per Exec   Per Call
~~~~~~~~~~~~         ---------------    --------------- ---------- ----------
      DB Time(s):               44.1                0.4       0.07       1.56
       DB CPU(s):                1.6                0.0       0.00       0.06
       Redo size:      154,034,644.3        1,544,561.0
    Logical read:          154,436.1            1,548.6
   Block changes:           82,491.9              827.2
  Physical reads:              150.6                1.5
 Physical writes:           18,135.2              181.9
      User calls:               28.3                0.3
          Parses:              142.7                1.4
     Hard parses:                7.5                0.1
W/A MB processed:                2.1                0.0
          Logons:                0.1                0.0
        Executes:              607.7                6.1
       Rollbacks:                0.0                0.0
    Transactions:               99.7

In my role I have to look at the amount of I/O being driven by a database, so I can size a solution based on flash memory. This means knowing two specific metrics: the number of I/Os per second (IOPS) and the throughput (typically measured in MB/sec). I need to know these values for both read and write I/O so that I can understand the ratio. I also want to understand things like the amount of random versus sequential I/O, but that’s beyond the scope of this post.

The first thing to understand is that none of this information is shown above. There are values for Physical reads and Physical writes but these are actually measured in database blocks. Even if we knew the block size (which we don’t because Oracle databases can have multiple block sizes) we do not know how many I/Os were required. Ten Oracle blocks could be written in one sequential I/O or ten individual “random” I/Os, completely changing the IOPS measurement. To find any of this information we have to descend into the depths of the AWR report to find the Instance Activity Stats section.
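
Incidentally, the Instance Activity Stats are just periodic snapshots of the cumulative counters in V$SYSSTAT, so on a live system you can query the relevant statistics directly. A minimal sketch (the values are cumulative since instance startup, so sample twice and subtract to get rates):

select name, value
  from v$sysstat
 where name in ('physical read total IO requests',
                'physical read total bytes',
                'physical write total IO requests',
                'physical write total bytes');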

In Oracle 12c, the format of the AWR report changed, especially the AWR Load Profile section, which was modified to show the units that each measurement uses. It also includes some new lines such as Read/Write IO Requests and Read/Write IO. Here’s a sample from a 12c database (taken during a 30 second run of SLOB):

Load Profile                    Per Second   Per Transaction  Per Exec  Per Call
~~~~~~~~~~~~~~~            ---------------   --------------- --------- ---------
             DB Time(s):              44.1               0.4      0.07      1.56
              DB CPU(s):               1.6               0.0      0.00      0.06
      Redo size (bytes):     154,034,644.3       1,544,561.0
  Logical read (blocks):         154,436.1           1,548.6
          Block changes:          82,491.9             827.2
 Physical read (blocks):             150.6               1.5
Physical write (blocks):          18,135.2             181.9
       Read IO requests:             150.3               1.5
      Write IO requests:          15,096.9             151.4
           Read IO (MB):               1.2               0.0
          Write IO (MB):             141.7               1.4
             User calls:              28.3               0.3
           Parses (SQL):             142.7               1.4
      Hard parses (SQL):               7.5               0.1
     SQL Work Area (MB):               2.1               0.0
                 Logons:               0.1               0.0
         Executes (SQL):             607.7               6.1
              Rollbacks:               0.0               0.0
           Transactions:              99.7

Now, you might be forgiven for thinking that the Read/Write IO Requests and Read/Write IO (MB) values above tell me the very IOPS and throughput information I need. If this were the case, we could say that this system performed 150 physical read IOPS and 15k write IOPS, with throughput of 1.2 MB/sec reads and 141.7 MB/sec writes. Right?

But that isn’t the case – and to understand why, we need to page down five thousand times through the increasingly-verbose AWR report until we eventually find the Other Instance Activity Stats section (or just Instance Activity Stats in pre-12c reports) and see this information (edited for brevity):

Other Instance Activity Stats                  DB/Inst: ORCL/orcl  Snaps: 7-8
-> Ordered by statistic name

Statistic                                     Total     per Second     per Trans
-------------------------------- ------------------ -------------- -------------
physical read IO requests                     5,123          150.3           1.5
physical read bytes                      42,049,536    1,233,739.3      12,371.2
physical read total IO requests              37,162        1,090.3          10.9
physical read total bytes            23,001,900,544  674,878,987.9   6,767,255.2
physical read total multi block              21,741          637.9           6.4
....
physical write IO requests                  514,547       15,096.9         151.4
physical write bytes                  5,063,483,392  148,563,312.9   1,489,698.0
physical write total IO requests            537,251       15,763.0         158.1
physical write total bytes           18,251,309,056  535,495,967.4   5,369,611.4
physical write total multi block             18,152          532.6           5.3

The physical read/write IO requests and physical read/write bytes figures match up with those above, albeit with the throughput values using different units of bytes/sec instead of MB/sec. But the problem is, these aren’t the “total” values – the statistics with total in their names. So what are those total values showing us?

Over to the Oracle Database 12c Reference Guide:

physical read IO requests:  Number of read requests for application activity (mainly buffer cache and direct load operation) which read one or more database blocks per request. This is a subset of “physical read total IO requests” statistic.

physical read total IO requests: Number of read requests which read one or more database blocks for all instance activity including application, backup and recovery, and other utilities. The difference between this value and “physical read total multi block requests” gives the total number of single block read requests.

The values that don’t have the word total in them, i.e. the values shown in the AWR Load Profile section at the start of a report, are only showing what Oracle describes as “application activity”. That’s all very well, but it’s meaningless if you want to know how much I/O your database is driving to your storage. This is why the values with total in the name are the ones you should consider. And in the case of my sample report above, there is a massive discrepancy between the two: for example, the read throughput value for application activity is just 1.2 MB/sec while the total value is actually 644 MB/sec – over 500x higher! That extra non-application activity is definitely worth knowing about. (In fact, I was running a highly parallelised RMAN backup during the test just to make the point…)
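
If you want to see the size of this discrepancy on a live system without waiting for an AWR report, a quick sketch against V$SYSSTAT will do it – this one computes the ratio of total read bytes to application-only read bytes (cumulative since instance startup):

select round(t.value / nullif(a.value, 0), 1) as total_vs_app_read_ratio
  from (select value from v$sysstat where name = 'physical read total bytes') t,
       (select value from v$sysstat where name = 'physical read bytes') a;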

[Note: There was another section here detailing how to find and include the I/O generated by redo into the totals, but after consultation with guru and legend Tanel Poder it’s come to my attention that this is incorrect. In fact, reads and writes to redo logs are included in the physical read/write total statistics…]

Oracle 12c IO Profile Section

Luckily, Oracle 12c now has a new section which presents all the information in one table. Here’s a sample extracted from the same report as above:

IO Profile                  Read+Write/Second     Read/Second    Write/Second
~~~~~~~~~~                  ----------------- --------------- ---------------
            Total Requests:          16,853.4         1,090.3        15,763.0
         Database Requests:          15,247.2           150.3        15,096.9
        Optimized Requests:               0.1             0.0             0.0
             Redo Requests:             517.5             1.2           516.3
                Total (MB):           1,154.3           643.6           510.7
             Database (MB):             142.9             1.2           141.7
      Optimized Total (MB):               0.0             0.0             0.0
                 Redo (MB):             295.7             0.0           295.7
         Database (blocks):          18,285.8           150.6        18,135.2
 Via Buffer Cache (blocks):          18,282.1           150.0        18,132.0
           Direct (blocks):               3.7             0.6             3.1

Suddenly life is much simpler. You want to know the total IOPS and throughput? It’s all in one place. You want to calculate the ratio of reads to writes? Just compare the read and write columns. Happy days.
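
And if you’re stuck on a pre-12c database without the IO Profile section, the same per-second figures can be derived from the AWR repository tables. Here’s a sketch using DBA_HIST_SYSSTAT and DBA_HIST_SNAPSHOT, assuming the snap ids 7 and 8 from the sample report above and a single instance:

select e.stat_name,
       round((e.value - b.value) /
             ((cast(se.end_interval_time as date) -
               cast(sb.end_interval_time as date)) * 86400), 1) as per_second
  from dba_hist_sysstat b, dba_hist_sysstat e,
       dba_hist_snapshot sb, dba_hist_snapshot se
 where b.snap_id = 7 and e.snap_id = 8              -- begin and end snapshots
   and b.stat_name = e.stat_name
   and b.dbid = e.dbid and b.instance_number = e.instance_number
   and sb.snap_id = b.snap_id and sb.dbid = b.dbid
   and sb.instance_number = b.instance_number
   and se.snap_id = e.snap_id and se.dbid = e.dbid
   and se.instance_number = e.instance_number
   and e.stat_name in ('physical read total IO requests',
                       'physical write total IO requests',
                       'physical read total bytes',
                       'physical write total bytes');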

One word of warning though: there may be other database processes driving I/O which are not tracked in these statistics. I see no evidence of control file reads and writes being shown, although these are insignificant in magnitude. More significant would be I/O from the archiver process for databases running in archive log mode, since each redo log must be sequentially read and re-written as an archive log. Are these included? Yet another possibility would be the Recovery Writer (RVWR) process, which is responsible for writing flashback logs when database flashback logging is enabled. [Discussions with Jonathan Lewis suggest these stats are all included – and let’s face it, he wrote the book on the subject…!] It all adds up… Oracle really needs to provide better clarity on what these statistics are measuring.
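
One place to look for clues is V$IOSTAT_FILE (available from 11g onwards), which breaks physical I/O down by file type and does report control file activity explicitly. A sketch, assuming the small/large request columns behave as documented:

select filetype_name,
       sum(small_read_reqs + large_read_reqs)   as read_reqs,
       sum(small_write_reqs + large_write_reqs) as write_reqs
  from v$iostat_file
 group by filetype_name
 order by filetype_name;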

Conclusion

If you want to know how much I/O is being driven by your database, do not use the information in the Load Profile section of an AWR report. Use the I/O Profile section if available, or otherwise skip to the Instance Activity Stats section and look at the total values for physical reads and writes (and redo). Everything else is just lies, damned lies and (I/O) statistics.


40 Responses to Oracle AWR Reports: Understanding I/O Statistics

  1. kevinclosson says:

    No mention of the fact that instances write 2x or 3x (ASM normal, high redundancy ) more than is tracked in AWR? 🙂

    Are you external redundancy with Violin? Probably not if one uses multiple Violin arrays, right?

    • flashdba says:

      True, very true. But my post was limited to just database activity (for now). Ok so technically it’s the database that’s performing more I/O if you use ASM normal or high redundancy, but I figured the post was long enough already without going down that avenue.

      Maybe I’ll have to write a part two 😉

      In general I would expect customers to use external redundancy on Violin, but as you well know some customers have an unlimited capacity for surprise.

  2. kevinclosson says:

    > Ok so technically it’s the database that’s performing more I/O if you use ASM normal or high redundancy

    …foreground processes issue multiple (2x,3x) writes in direct path and DBWR the same so not only “technically” but absolutely it’s the database. The fact that AWR doesn’t report this I/O overhead has always been a personal beef of mine because it is a huge loss of performance data 😦

    > In general I would expect customers to use external redundancy on Violin

    …just out of curiosity, how would someone implement balanced data placement across two arrays if their I/O demand surpasses what a single array can handle? I suppose that question is off topic. Sorry.

    • flashdba says:

      > how would someone implement balanced data placement across two arrays if their I/O demand surpasses what a single array can handle?

      In the few situations where this happens I’ve always found that the customer wants to manually balance their workload, usually by placement of tablespaces or redo logs in different ASM diskgroups or the use of virtualization technology. It’s an interesting question actually, but in my experience so far most customers run out of capacity before they run out of performance.

      • flashdba says:

        Just thinking a bit more about this Kevin. I have a couple of large customers who use IBM SVC above Violin arrays for this purpose (SVC was a pre-existing architectural choice). Other than that I guess the simplest approach seems to be to spread an ASM diskgroup across multiple arrays, with x number of LUNs on each array. Of course, this does somewhat rely on ASM’s ability to evenly stripe and distribute data.

        • kevinclosson says:

          Right…SVC…or maybe Falconstor (if they are still around)…

          I should think ASM striping between arrays would be as balanced as striping between two simple disks, no? But, since that doesn’t equate to ASM mirroring (and thus would foul the AWR write accounting) it is off topic…sorry..

          • flashdba says:

            I should think it would be a bit … agricultural, shall we say. It would be good to test it in the lab one day. Don’t sweat it Kev, your off-topic thoughts are usually more interesting than my on-topic ones.

    • Hi Kevin,
      So (just to be quick) the report below is from my system (4 RAC nodes and ASM with normal redundancy) – do the IOPS from my system need to be multiplied by 2?
      The report:
      1 Physical Read Total IO Requests Per Sec 13598,1191744341
      2 Physical Read Total IO Requests Per Sec 28134,8593307808
      3 Physical Read Total IO Requests Per Sec 20404,1444740346
      4 Physical Read Total IO Requests Per Sec 14526,8308921438
      1 Physical Read IO Requests Per Sec 13594,3575233023
      2 Physical Read IO Requests Per Sec 28130,4811053771
      3 Physical Read IO Requests Per Sec 20401,2150466045
      4 Physical Read IO Requests Per Sec 14523,8182423435

      I retrieved these metrics from GV$SYSMETRIC.

      So, the IOPS from my storage will be: (13598*2)+(28134*2)+(20404*2)+(14526*2) = 153,324

      • flashdba says:

        Fernando – when taking into account ASM normal redundancy you should not double the read IO requests, only the write IO requests.

        • Hi,
          My bad, I posted the wrong metrics (thanks for clarifying).
          The real metrics:
          1 Physical Write Total Bytes Per Sec 8648692,41011984
          2 Physical Write Total Bytes Per Sec 69056664,7137151
          3 Physical Write Total Bytes Per Sec 18220859,6537949
          4 Physical Write Total Bytes Per Sec 65992692,2139472
          1 Physical Write Total IO Requests Per Sec 607,090545938748
          2 Physical Write Total IO Requests Per Sec 649,367509986685
          3 Physical Write Total IO Requests Per Sec 482,057256990679
          4 Physical Write Total IO Requests Per Sec 630,72105619499

          • flashdba says:

            🙂

            Then the write IOPS will be double what you see there in the metrics “Physical Write Total IO Requests Per Sec” since each write must be mirrored to the primary and secondary extents on ASM.
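
            For example, a sketch against GV$SYSMETRIC in the spirit of Fernando’s query – summing across all instances and doubling only the writes for ASM normal redundancy (triple them for high redundancy):

            select sum(case when metric_name = 'Physical Read Total IO Requests Per Sec'
                            then value else 0 end)     as storage_read_iops,
                   sum(case when metric_name = 'Physical Write Total IO Requests Per Sec'
                            then value else 0 end) * 2 as storage_write_iops  -- x2 for normal redundancy
              from gv$sysmetric
             where group_id = 2;  -- the 60-second metric group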

  3. Kyle Hailey says:

    Great work clearing up some of this confusing stuff.
    And even with all that, we still don’t know what I/O is actually coming from disk versus the file system cache when we aren’t using direct I/O, which is often the case.

    – Kyle
    http://kylehailey.com

    • flashdba says:

      Very true Kyle, although I’d like to think everyone is running with direct I/O. And as Kevin says, adding ASM normal or high redundancy introduces even more problems with trusting the values in an AWR.

  4. K.P says:

    Please help me to clarify about IO Profile Section in my database

    Database (blocks): 10,564.90 10,344.30 220.6
    Via Buffer Cache (blocks): 451.9 437.4 14.5
    Direct (blocks): 10,113.00 9,906.90 206.1

    Is this good or not?

    Thanks
    K.P

    • flashdba says:

      Your database is performing a lot of direct read I/O. It doesn’t do much buffered I/O and it doesn’t perform much in the way of writes.

      I have no way of telling you whether this is good or not, because it depends on your application. It’s like if I said to you, “It took me 34 minutes to drive to work today. Is that good?” – you cannot answer because you have no context.

      • K.P says:

        You are right. “Direct path read” is 1st in the “Top 10 Foreground Events by Total Wait Time” section.
        How should I analyse this event in AWR?
        With Oracle Database 11gR2 and MEMORY_TARGET = 450GB (but SGA only about 165GB), how can I tune this event?
        Thanks,
        K.P

        • flashdba says:

          It really depends on what your application is doing. Analysing AWR reports requires a lot more than just one or two details – it requires context. If direct path read is the top wait event that could be a bad thing because I/O is holding up your database. But it could be a good thing because that might be the most efficient method of getting your data from disk. It might mean that you need to use more parallelism to retrieve the data using more concurrent processes… or it might mean that you need to rethink the application SQL in order to find a more optimal method of locating the information you need. If being a DBA was a simple job, there wouldn’t be so many blogs out there trying to explain the complexities of each task.

          Start off by asking what you want to achieve. If your users aren’t complaining and/or your batch jobs are completing in time, maybe you don’t need to change anything.

          • K.P says:

            Thank you for your time.

            Actually, my database is still working well. I just want to understand AWR deeply – the relationships between the various sections of an AWR report.

            Thank you again.
            K.P

  5. dadbm says:

    Thanks for a very useful post.
    Just a side comment: these AWR values unfortunately provide averages over the AWR snapshot interval. To be more precise I thought of using metrics that are sampled every minute.
    However, I could only find the same I/O totals for the last hour in V$SYSMETRIC_HISTORY, not in DBA_HIST_SYSMETRIC_HISTORY. Instead there are 2 other metrics there:
    IO Megabytes per Second and IO Requests per Second.
    Do you think they are appropriate for the same use?
    Thanks for any hints on this

    — Kirill Loifman

    • flashdba says:

      Sorry for the delayed reply, I’ve been travelling… a lot.

      Those look like the correct metrics but what are you hoping to achieve? I think they will still be samples over the snapshot periods (one hour by default) rather than every minute.
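
      If minute-level granularity over the last hour is enough, something like this sketch against V$SYSMETRIC_HISTORY should do it (group_id = 2 is the 60-second metric group):

      select begin_time, metric_name, round(value, 1) as value
        from v$sysmetric_history
       where metric_name in ('Physical Read Total IO Requests Per Sec',
                             'Physical Write Total IO Requests Per Sec')
         and group_id = 2
       order by begin_time;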

  6. Chagan says:

    Hi,

    Thank for the very useful post.

    Just wanted to ask: in the 12c IO Profile, why is there a big difference between the total and the database-related throughput, i.e. why does DB + Redo MB not equate to the Total MB? Is there something else taking up I/O?

    Thanks!

  7. George says:

    The IO profile section is also included in 11gR2.

  8. Wllg says:

    Hi,

    First of all thanks for a very useful post!!
    I don’t know if I will be clear in my question (I hope so 🙂
    About IOPS: how could I see what the average (or exact) block size was during a specific workload period?

    For instance, imagine I’m analysing an AWR snapshot from a database that has a non-standard block size configured (one 32k tablespace with some tables). During this snapshot the workload could have used either the standard 8k and the 32k, or only the 8k (because at times the tables in the non-standard tablespace are not in use).

    How could I know what the average or specific block size per I/O was in this snapshot? (let’s say no backups or other non-standard-block I/O were running…)

    The reason for this question is IOPS capacity planning in clouds like AWS. The IOPS performance metrics they publish are, I believe, based on a 16K block size. So one of their IOPS is different from one IOPS in my database, and I need to adjust the maths.

    I hope I got my point across 🙂

    • flashdba says:

      Working out the average I/O size is pretty simple if you have the per-second figures for a specific period. You just take the throughput figure (remembering to convert between MB/sec and bytes/sec if necessary) and divide it by the number of I/O requests per second.

      Let’s use this ten minute AWR snapshot as an example – it’s a real report I happen to have in front of me:

      IO Profile                  Read+Write/Second     Read/Second    Write/Second
      ~~~~~~~~~~                  ----------------- --------------- ---------------
                  Total Requests:          81,114.1           199.1        80,915.0
               Database Requests:          80,072.3             0.0        80,072.3
              Optimized Requests:               0.0             0.0             0.0
                   Redo Requests:             811.0             2.7           808.3
                      Total (MB):             819.3             4.5           814.8

      So the read throughput here was 4.5 MB/sec, which is 4,718,592 bytes/second, against 199.1 read requests per second. Therefore the average read size was 4,718,592 / 199.1 = 23,700 bytes (approx).

      But what does that actually tell us? Not a lot, since there is no Oracle process which will perform a 23,700 byte read. In fact, the database blocksize is 8192 bytes, so we can take an educated guess that multiblock reads and control file reads (which are performed in 16k blocks) are tilting the average up, while a minority of smaller reads (probably things like redo blocks, which have a 512 byte blocksize) drag it back down. So nothing scientific really comes from this exercise, just guesswork.
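
      If you’d rather let the database do the arithmetic, the same average can be pulled from the cumulative counters in V$SYSSTAT – a sketch (this gives the average since instance startup rather than for one snapshot):

      select round(b.value / nullif(r.value, 0)) as avg_read_bytes
        from (select value from v$sysstat
               where name = 'physical read total bytes') b,
             (select value from v$sysstat
               where name = 'physical read total IO requests') r;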

      • Wllg says:

        Yeah, you’re right!!
        I would like to know if you can help me one more time! 🙂
        How can I get the IOPS/throughput for a whole Exadata machine at one time, for sizing another (non-Exadata) server in AWS for instance?
        Exadata has a lot of features, and measuring IOPS is more complicated than just getting the numbers for all the databases hosted on the machine, I guess!
        Do you have another post about this?

        Thank you very much!!

        • flashdba says:

          I’m not sure I understand. You mean you have an Exadata machine and you want the combined IOPS/Throughput across the whole appliance? I can’t think of a way, other than by adding up the stats for each database.

          Or maybe you can use Bertrand Drouvot’s ASM I/O statistics scripts:

          https://bdrouvot.wordpress.com/2013/02/15/asm-io-statistics-utility/

          • Wllg says:

            Oh, that’s really nice information. I really appreciate your support.

            I was reading another of your blog posts about IOPS in AWR (comparing the Stat sections and Instance Activity) because I’m very focused on real IOPS numbers while researching a strange situation!

            When sizing, I think it’s very important to get not just the average but also the MAX values for the current database server before trying to configure a new one.
            But I have a situation that makes me wonder whether AWR is a good piece of information for getting the “MAX” values for sizing, because the default AWR report gives me different information compared with the AWR tables.

            Here is the example:

            RAC (but only instance 1 is in use). AWR snapshot with a 30 minute interval (19018-19019 / 20-Apr-17 17:00:19 until 20-Apr-17 17:30:23)

            Other Instance Activity Stats
            -> Ordered by statistic name

            Statistic                                              Total       per Second          per Trans
            ------------------------------------------ ----------------- ---------------- ------------------
            physical read IO requests                          6,010,170         3,330.70          10,186.73
            physical read bytes                        6,205,607,600,128 3,439,002,393.56  10,517,978,983.27
            physical read requests optimized                   6,023,649         3,338.17          10,209.57
            physical read total IO requests                    6,024,717         3,338.76          10,211.38
            physical read total bytes                  6,205,833,396,224 3,439,127,524.47  10,518,361,688.52
            physical read total bytes optimized        6,205,824,385,024 3,439,122,530.67  10,518,346,415.29
            physical read total multi block requests           5,973,254         3,310.24          10,124.16

            and using the same snap_id to query the dba_hist_sysmetric_summary view:

            SNAP_ID BEGIN_TIME       END_TIME         GROUP_ID METRIC_NAME                              NUM_INTERVAL MINVAL      MAXVAL      AVERAGE     STANDARD_DEVIATION SUM_SQUARES
              19019 20/04/2017 17:00 20/04/2017 17:30        2 Physical Read Total Bytes Per Sec                  30      0 86387923055  3447972896         15783609620    7.5812E+21
              19019 20/04/2017 17:00 20/04/2017 17:30        2 Physical Read Total IO Requests Per Sec            30      0 83169.05792 3346.809631         15191.11126    7028360021
              19019 20/04/2017 17:00 20/04/2017 17:30        2 Physical Write Total Bytes Per Sec                 30      0 210177.3635 52392.47823         36428.82626   1.20834E+11
              19019 20/04/2017 17:00 20/04/2017 17:30        2 Physical Write Total IO Requests Per Sec           30      0 13.79597271 5.750924206         2.167278544   1128.409669

            Talking about “Physical Read Total IO Requests Per Sec” and “Physical Read Total Bytes Per Sec” specifically:
            in the dba_hist_sysmetric_summary view, the MAXVAL I have is 83,169 IOPS and 86,387,923,055 bytes. That is the max value observed in this snap_id, yet the default AWR report writes values corresponding to the AVERAGE column.

            I need to identify the max values my current database server reached, and if I use only the default AWR report information I think I will have a problem, because the difference is huge in this peak case and I could size the new server the wrong way.
            Do you have any guesses as to what I’m missing here?

            Thank very much again!!

            Regards

            • flashdba says:

              Right, the standard AWR report isn’t showing you max values, only averages. Personally I don’t see much use in max values since they typically represent a tiny peak in a single moment in time. It’s quite often the case that this performance spike is vastly higher than the average, so sizing your system for the MAXVAL makes no sense.

              A far better method, in my opinion, is to graph many consecutive averages over a long period of time. So, for example, taking an entire week’s worth of hourly AWR reports and graphing them. I do it using this:

              AWR-Generator: https://flashdba.com/database/useful-scripts/awr-generator/

              AWR-Parser: https://flashdba.com/database/useful-scripts/awr-parser/
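
              If you just want numbers to paste into a graphing tool, the hourly averages are also sitting in DBA_HIST_SYSMETRIC_SUMMARY – a minimal sketch, assuming your AWR retention covers the period:

              select begin_time, round(average, 1) as avg_read_iops
                from dba_hist_sysmetric_summary
               where metric_name = 'Physical Read Total IO Requests Per Sec'
                 and begin_time > sysdate - 7   -- last week
               order by begin_time;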

              • Wllg says:

                Hi Flashdba,

                The awr-generator is amazing and very useful. Thank you so much for sharing 🙂
                I have a doubt about one specific calculation that awr-parser makes.

                It concerns the Top5Event* columns, specifically “Top5 Event* Average Time (ms)”. We are talking about getting consecutive averages (including latency) over a long period, which is important.

                Here in my lab (Exadata), almost all AWR reports show a 0 value for the “cell single block physical read” event, but the csv output (awr-parser) writes a value above 0 every time (e.g. 0.234, 0.320, 0.730 and so on).

                Here is the AWR example

                Top 10 Foreground Events by Total Wait Time
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                                Total Wait          % DB
                Event                               Waits        Time(Sec)  Avg(ms) time  Wait Class
                ------------------------------ ---------- ---------------- -------- ----- ----------
                DB CPU                                                27.7          77.8
                cell single block physical rea     27,429              6.9        0  19.3 User I/O

                I think the script takes the “Total Wait Time(Sec)” and divides it by “Waits” to get the result in awr-parser. But considering the example above, is it correct to assume that in a 30 minute snapshot (1800 seconds) the latency for this event across the whole snapshot was 0.25ms (total wait secs / waits)? Given that only 6.9 of the 1800 seconds were spent waiting, it appears the other 1793.1 seconds “with no wait” play no part in the final result.

                I don’t know if I was clear (English is not my first language, sorry for that) but I hope so!!

                Thank you again… you teach me new things every day!!

                Regards

                • flashdba says:

                  Hi there

                  You are correct in your assumptions about the averages. Since AWR reports tend to round up average time values to the nearest millisecond – and since most modern storage platforms can deliver I/O at sub-millisecond latency – I calculate the average by taking the total time and dividing it by the number of waits. Thus, in your example, an average shown as 0 in the AWR report is calculated as 6.9 / 27,429 = 0.252ms.

                  Does that answer your question?

  9. Gustav says:

    Hi, first of all thank you for this great post.

    I would like you to help me answer a question about physical IOPS and STORAGE/DISK sizing (I think it’s your specialty too 🙂 )
    When I think about changing the storage hardware to a new one, I need to calculate what the current storage is doing in order to size a better one (for an environment hosting Oracle databases).

    If I count “physical read (write) total IO requests” (that is, “physical read (write) IO requests” PLUS other I/O) I would actually be measuring a lot of BUFFERED IOPS, which are very fast because they are memory work; the values would be very HIGH and would not necessarily be real disk/storage I/O.

    For real physical IOPS (storage/disk I/O), what do you think I need to track? Something outside of the database, using an O.S. tool like sar, iostat and so on, right?

    Because from a memory perspective, the difference between two storage systems (old and new), real disk IOPS aside, would be compensated by memory, right?

    Thank you very much!

    • flashdba says:

      Hi Gustav

      The values “physical [read|write] total IO requests” will give you the exact physical IOPS, as will the values in the IO Profile section of AWR reports in more modern Oracle versions. That’s all you need, unless you have multiple instances per host – in which case you will need to sum the values from all instances. But these statistics show you just the physical IOPS, not the logical (buffer) IOPS (although it must be considered that every physical read/write results in a corresponding logical read/write).

      You can, of course, use host-level utilities such as iostat to get the same data. Or, if you are using ASM, you could get the statistics from that by using a tool such as:

      https://bdrouvot.wordpress.com/2013/02/15/asm-io-statistics-utility/

      And finally, it’s always possible to get the performance numbers from your storage array if it has this capability.

      I’m not sure if this is answering your question – let me know?

  10. Gustav says:

    Hi,

    Oh really. I think I was confused by this reference in your post:

    Over to the Oracle Database 12c Reference Guide:

    “physical read IO requests: Number of read requests for application activity (“mainly buffer cache” and direct load operation) which read one or more database blocks per request. This is a subset of “physical read total IO requests” statistic.”

    As it references “mainly buffer cache” operations, I thought buffer reads were accounted for.

    Thank you for your attention!

    • flashdba says:

      Yes, I can see how it can be confusing. The manual is trying to explain that these are physical reads which populate the buffer cache, or feed direct load operations… 🙂

  11. Gustav says:

    Cool!
    In an Exadata environment, may I use the following metrics to get the IOPS/throughput too? If not, what do you suggest?

    select to_char(sysdate, 'dd/mm/yyyy hh24:mi') data,
           round((sum(decode(metric_name, 'Physical Read Total Bytes Per Sec', value, 0))) / 1024, 0) r_kbps,
           round((sum(decode(metric_name, 'Physical Read Total IO Requests Per Sec', value, 0))), 1) r_iops,
           round((sum(decode(metric_name, 'Physical Write Total Bytes Per Sec', value, 0))) / 1024, 0) w_kbps,
           round((sum(decode(metric_name, 'Physical Write Total IO Requests Per Sec', value, 0))), 1) w_iops
      from v$sysmetric
     where metric_name in (
             'Physical Read Total Bytes Per Sec',
             'Physical Write Total Bytes Per Sec',
             'Physical Write Total IO Requests Per Sec',
             'Physical Read Total IO Requests Per Sec'
           )
       and group_id = 2;

    Thank you again and again and again ..

  12. Enda says:

    In Exadata or SuperCluster environments, smart scan is not accounted for in any of the Load/IO Profile stats.
    For instance, we have a SuperCluster that reports 21GB per second:
    Read IO (MB): 21,096.5 152.2
    and all the other stats said the same; segment statistics showed physical/logical and direct reads all on one table, etc.
    But it turns out

    cell physical IO bytes eligible for predicate offload was showing 21GB per second
    and
    cell physical IO interconnect bytes returned by smart scan
    showed 30MB per second.

    As the rig had just two IB HCA ports active, we couldn’t have been doing 21GB per second. The stats are showing the read I/O “issued”, not the actual I/O; smart scan saved about 21GB per second, so in fact we had almost no physical I/O 🙂

    • flashdba says:

      You make a great point. Exadata kinda lies in its AWR reports, by showing the I/O values that it *would* be delivering if it weren’t for smart scan. Lesson learnt: don’t trust Exadata when it comes to IO Profiles.
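
      For anyone wanting to check this on their own system, the two statistics Enda mentions are visible in V$SYSSTAT, so a quick sanity check might look like this sketch – comparing what was eligible for offload with what actually came back over the interconnect:

      select name, round(value / 1024 / 1024) as mb
        from v$sysstat
       where name in ('cell physical IO bytes eligible for predicate offload',
                      'cell physical IO interconnect bytes returned by smart scan');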
