Querying DBA_HIST_SNAPSHOT and DBA_HIST_SYSSTAT

Why is it so hard in Oracle to get a decent answer to the question of how many seconds elapsed between two timestamps?

I’m looking at DBA_HIST_SNAPSHOT and wondering how many seconds each snapshot spans, because later on I want to use this to generate metrics like Redo Size per Second, etc.

SQL> desc dba_hist_snapshot
 Name                                Null?    Type
 ----------------------------------- -------- ------------------------
 SNAP_ID                             NOT NULL NUMBER
 DBID                                NOT NULL NUMBER
 INSTANCE_NUMBER                     NOT NULL NUMBER
 STARTUP_TIME                        NOT NULL TIMESTAMP(3)
 BEGIN_INTERVAL_TIME                 NOT NULL TIMESTAMP(3)
 END_INTERVAL_TIME                   NOT NULL TIMESTAMP(3)

So surely I can just subtract the begin time from the end time, right?

SQL> select SNAP_ID, END_INTERVAL_TIME - BEGIN_INTERVAL_TIME as elapsed
  2  from DBA_HIST_SNAPSHOT
  3  where SNAP_ID < 6
  4 order by 1;

   SNAP_ID ELAPSED
---------- ------------------------------
         1 +000000000 00:00:44.281
         2 +000000000 00:04:32.488
         3 +000000000 00:51:39.969
         4 +000000000 01:00:01.675
         5 +000000000 00:59:01.697

Gaaaah…. It’s given me one of those stupid interval datatypes! I’ve never been a fan of these. I just want to know the value in seconds.

Luckily I can cast a timestamp (the datatype in DBA_HIST_SNAPSHOT) as a good old fashioned DATE. We love dates, you can treat them as numbers and add, subtract etc. The integer values represent days, so you just need to multiply by 24 x 60 x 60 = 86400 to get seconds:

SQL> select SNAP_ID, END_INTERVAL_TIME - BEGIN_INTERVAL_TIME as elapsed,
  2  (cast(END_INTERVAL_TIME as date) - cast(BEGIN_INTERVAL_TIME as date))
  3    *86400 as elapsed2
  4  from dba_hist_snapshot
  5  where snap_id < 6
  6  order by 1;

   SNAP_ID ELAPSED                          ELAPSED2
---------- ------------------------------ ----------
         1 +000000000 00:00:44.281                44
         2 +000000000 00:04:32.488               272
         3 +000000000 00:51:39.969              3100
         4 +000000000 01:00:01.675              3602
         5 +000000000 00:59:01.697              3542

That’s much better. But I notice that, in snapshot 1 for example, the elapsed time was 44.281 seconds and in my CAST version it’s only 44 seconds. In casting to the DATA datatype there has been some rounding. Maybe that isn’t an issue, but surely there’s a way to keep that extra accuracy?

Here’s the answer I came up with – using EXTRACT:

SQL> select SNAP_ID, END_INTERVAL_TIME - BEGIN_INTERVAL_TIME as elapsed,
  2  (cast(END_INTERVAL_TIME as date) - cast(BEGIN_INTERVAL_TIME as date))
  3    *86400 as elapsed2,
  4  (extract(day from END_INTERVAL_TIME)-extract(day from BEGIN_INTERVAL_TIME))*86400 +
  5  (extract(hour from END_INTERVAL_TIME)-extract(hour from BEGIN_INTERVAL_TIME))*3600 +
  6  (extract(minute from END_INTERVAL_TIME)-extract(minute from BEGIN_INTERVAL_TIME))*60 +
  7  (extract(second from END_INTERVAL_TIME)-extract(second from BEGIN_INTERVAL_TIME)) as elapsed3
  8  from dba_hist_snapshot
  9  where snap_id < 6
 10  order by 1;

   SNAP_ID ELAPSED                          ELAPSED2   ELAPSED3
---------- ------------------------------ ---------- ----------
         1 +000000000 00:00:44.281                44     44.281
         2 +000000000 00:04:32.488               272    272.488
         3 +000000000 00:51:39.969              3100   3099.969
         4 +000000000 01:00:01.675              3602   3601.675
         5 +000000000 00:59:01.697              3542   3541.697

Not particularly simple, but at least accurate. I’m happy to be told that there’s an easier way?

Why?

Why am I doing this? Because I am trying to look at the performance of some benchmarks I am working on. The load generation tool creates regular AWR snapshots so I want to look at the peak IO rates for each snapshot to save myself generating a million AWR reports.

I am specifically interested in the statistics redo size, physical reads, and physical writes from DBA_HIST_SYSSTAT. The tests aim to push each of these metrics (independently though, not all at the same time… yet!)

With that in mind, and with thanks to Ludovico Caldara for some code on which to build, here is the SQL that I am using to view the performance in each snapshot. First the output though, truncated to save some space on the screen:

SNAPSHOTID SNAPSHOTTIME    REDO_MBSEC REDO_GRAPH           READ_MBSEC READ_GRAPH           WRITE_MBSEC WRITE_GRAPH
---------- --------------- ---------- -------------------- ---------- -------------------- ----------- --------------------
       239 30-AUG 13:37:14     384.83 ******                        0 *                            .04 ******
       240 30-AUG 13:38:07     284.27 ****                          0 *                            .03 ****
       241 30-AUG 13:42:14     296.62 ****                          0                              .03 ****
       242 30-AUG 13:43:00    1242.08 ********************          0 *                            .12 ********************
       243 30-AUG 13:47:14     258.75 ****                          0                              .02 ****
       244 30-AUG 13:48:28     866.83 **************                0 *                            .08 *************
       245 30-AUG 13:52:14     456.24 *******                       0                              .04 *******
       246 30-AUG 13:54:43     773.61 ************                  0                              .07 ************
       247 30-AUG 13:57:38     624.23 **********                    0                              .06 *********
       248 30-AUG 14:00:22     613.98 *********                     0                              .05 *********

I’m currently running redo generation tests so I’m only interested in the redo size metric and the calculation of redo per second, i.e. column 3. I can use the graph at column 4 to instantly see which snapshot I need to look at: 243. That’s the one where redo was being generated at over 1.2Gb/sec – not bad for a two-socket machine attached to a single Violin 6616 array.

Now for the SQL… I warn you now, it’s a bit dense!

-- for educational use only - use at your own risk!
-- display physical IO statistics from DBA_HIST_SYSSTAT
-- specifically redo size, physical reads and physical writes

set lines 140 pages 45
accept num_days prompt 'Enter the number of days to report on [default is 0.5]: '
set verify off

SELECT redo_hist.snap_id AS SnapshotID
,      TO_CHAR(redo_hist.snaptime, 'DD-MON HH24:MI:SS') as SnapshotTime
,      ROUND(redo_hist.statval/elapsed_time/1048576,2) AS Redo_MBsec
,      SUBSTR(RPAD('*', 20 * ROUND ((redo_hist.statval/elapsed_time) / MAX (redo_hist.statval/elapsed_time) OVER (), 2), '*'), 1, 20) AS Redo_Graph
,      ROUND(physical_read_hist.statval/elapsed_time/1048576,2) AS Read_MBsec
,      SUBSTR(RPAD('*', 20 * ROUND ((physical_read_hist.statval/elapsed_time) / MAX (physical_read_hist.statval/elapsed_time) OVER (), 2), '*'), 1, 20) AS Read_Graph
,      ROUND(physical_write_hist.statval/elapsed_time/1048576,2) AS Write_MBsec
,      SUBSTR(RPAD('*', 20 * ROUND ((physical_write_hist.statval/elapsed_time) / MAX (physical_write_hist.statval/elapsed_time) OVER (), 2), '*'), 1, 20) AS Write_Graph
FROM (SELECT s.snap_id
            ,g.value AS stattot
            ,s.end_interval_time AS snaptime
            ,NVL(DECODE(GREATEST(VALUE, NVL(lag (VALUE) OVER (PARTITION BY s.dbid, s.instance_number, g.stat_name
                 ORDER BY s.snap_id), 0)), VALUE, VALUE - LAG (VALUE) OVER (PARTITION BY s.dbid, s.instance_number, g.stat_name
                     ORDER BY s.snap_id), VALUE), 0) AS statval
            ,(EXTRACT(day FROM s.end_interval_time)-EXTRACT(day FROM s.begin_interval_time))*86400 +
             (EXTRACT(hour FROM s.end_interval_time)-EXTRACT(hour FROM s.begin_interval_time))*3600 +
             (EXTRACT(minute FROM s.end_interval_time)-EXTRACT(minute FROM s.begin_interval_time))*60 +
             (EXTRACT(second FROM s.end_interval_time)-EXTRACT(second FROM s.begin_interval_time)) as elapsed_time
        FROM dba_hist_snapshot s,
             dba_hist_sysstat g,
             v$instance i
       WHERE s.snap_id = g.snap_id
         AND s.begin_interval_time >= sysdate-NVL('&num_days', 0.5)
         AND s.instance_number = i.instance_number
         AND s.instance_number = g.instance_number
         AND g.stat_name = 'redo size') redo_hist,
     (SELECT s.snap_id
            ,g.value AS stattot
            ,NVL(DECODE(GREATEST(VALUE, NVL(lag (VALUE) OVER (PARTITION BY s.dbid, s.instance_number, g.stat_name
                 ORDER BY s.snap_id), 0)), VALUE, VALUE - LAG (VALUE) OVER (PARTITION BY s.dbid, s.instance_number, g.stat_name
                     ORDER BY s.snap_id), VALUE), 0) AS statval
        FROM dba_hist_snapshot s,
             dba_hist_sysstat g,
             v$instance i
       WHERE s.snap_id = g.snap_id
         AND s.begin_interval_time >= sysdate-NVL('&num_days', 0.5)
         AND s.instance_number = i.instance_number
         AND s.instance_number = g.instance_number
         AND g.stat_name = 'physical read total bytes') physical_read_hist,
     (SELECT s.snap_id
            ,g.value AS stattot
            ,NVL(DECODE(GREATEST(VALUE, NVL(lag (VALUE) OVER (PARTITION BY s.dbid, s.instance_number, g.stat_name
                 ORDER BY s.snap_id), 0)), VALUE, VALUE - LAG (VALUE) OVER (PARTITION BY s.dbid, s.instance_number, g.stat_name
                     ORDER BY s.snap_id), VALUE), 0) AS statval
        FROM dba_hist_snapshot s,
             dba_hist_sysstat g,
             v$instance i
       WHERE s.snap_id = g.snap_id
         AND s.begin_interval_time >= sysdate-NVL('&num_days', 0.5)
         AND s.instance_number = i.instance_number
         AND s.instance_number = g.instance_number
         AND g.stat_name = 'physical write total bytes') physical_write_hist
WHERE redo_hist.snap_id = physical_read_hist.snap_id
  AND redo_hist.snap_id = physical_write_hist.snap_id
ORDER BY 1;

7 thoughts on “Querying DBA_HIST_SNAPSHOT and DBA_HIST_SYSSTAT”

olivier bernhard says:

February 5, 2013 at 8:08 am

This is a very usefull script to mine the AWR repository. I find it very re-usable since one can simply add an additional metric by simply copy/pasting. Thanks for sharing !

1. flashdba says:
  
  February 5, 2013 at 10:24 am
  
  Thanks for the feedback, it makes sharing worthwhile!
  
Lynn Sattler says:

March 12, 2013 at 6:17 pm

thanks for posting.
Lynn, NW Ohio

Pingback: Retrieve and visualize system statistics metrics from AWR with R | bdt's oracle blog
Richard says:

July 8, 2013 at 4:53 pm

I sympathize with the “Gaaaah…”!

Pan says:

December 12, 2013 at 7:52 pm

Hello,

The script is quite impressive and it works fine for standalone/restart instance. Can you please share the RAC specific SQL code to capture the different events. i mean capture the information across instances and show up by instance wise. The reason behind that we have cluster GRID environment wherein we have 30 instance runs on a single node and it’s pretty hard to know about which node was highly use in term of IOPS generation.

1. flashdba says:
  
  December 13, 2013 at 10:19 am
  
  I’m not entirely sure what you mean? Do you mean capturing data from multiple databases? If you have 30 instances on a single node then they must be different databases, which means it’s virtually impossible to do what you are asking. The only methods I can think of would be creating a horrible mess of database links or writing some kind of wrapper script which logs in to each database to run the SQL. Neither of those sound appealing to me…