Querying DBA_HIST_SNAPSHOT and DBA_HIST_SYSSTAT
August 30, 2012 7 Comments
Why is it so hard in Oracle to get a decent answer to the question of how many seconds elapsed between two timestamps?
I’m looking at DBA_HIST_SNAPSHOT and wondering how many seconds each snapshot spans, because later on I want to use this to generate metrics like Redo Size per Second, etc.
SQL> desc dba_hist_snapshot Name Null? Type ----------------------------------- -------- ------------------------ SNAP_ID NOT NULL NUMBER DBID NOT NULL NUMBER INSTANCE_NUMBER NOT NULL NUMBER STARTUP_TIME NOT NULL TIMESTAMP(3) BEGIN_INTERVAL_TIME NOT NULL TIMESTAMP(3) END_INTERVAL_TIME NOT NULL TIMESTAMP(3)
So surely I can just subtract the begin time from the end time, right?
SQL> select SNAP_ID, END_INTERVAL_TIME - BEGIN_INTERVAL_TIME as elapsed
2 from DBA_HIST_SNAPSHOT
3 where SNAP_ID < 6
4 order by 1;
SNAP_ID ELAPSED
---------- ------------------------------
1 +000000000 00:00:44.281
2 +000000000 00:04:32.488
3 +000000000 00:51:39.969
4 +000000000 01:00:01.675
5 +000000000 00:59:01.697
Gaaaah…. It’s given me one of those stupid interval datatypes! I’ve never been a fan of these. I just want to know the value in seconds.
Luckily I can cast a timestamp (the datatype in DBA_HIST_SNAPSHOT) as a good old fashioned DATE. We love dates, you can treat them as numbers and add, subtract etc. The integer values represent days, so you just need to multiply by 24 x 60 x 60 = 86400 to get seconds:
SQL> select SNAP_ID, END_INTERVAL_TIME - BEGIN_INTERVAL_TIME as elapsed,
2 (cast(END_INTERVAL_TIME as date) - cast(BEGIN_INTERVAL_TIME as date))
3 *86400 as elapsed2
4 from dba_hist_snapshot
5 where snap_id < 6
6 order by 1;
SNAP_ID ELAPSED ELAPSED2
---------- ------------------------------ ----------
1 +000000000 00:00:44.281 44
2 +000000000 00:04:32.488 272
3 +000000000 00:51:39.969 3100
4 +000000000 01:00:01.675 3602
5 +000000000 00:59:01.697 3542
That’s much better. But I notice that, in snapshot 1 for example, the elapsed time was 44.281 seconds and in my CAST version it’s only 44 seconds. In casting to the DATA datatype there has been some rounding. Maybe that isn’t an issue, but surely there’s a way to keep that extra accuracy?
Here’s the answer I came up with – using EXTRACT:
SQL> select SNAP_ID, END_INTERVAL_TIME - BEGIN_INTERVAL_TIME as elapsed,
2 (cast(END_INTERVAL_TIME as date) - cast(BEGIN_INTERVAL_TIME as date))
3 *86400 as elapsed2,
4 (extract(day from END_INTERVAL_TIME)-extract(day from BEGIN_INTERVAL_TIME))*86400 +
5 (extract(hour from END_INTERVAL_TIME)-extract(hour from BEGIN_INTERVAL_TIME))*3600 +
6 (extract(minute from END_INTERVAL_TIME)-extract(minute from BEGIN_INTERVAL_TIME))*60 +
7 (extract(second from END_INTERVAL_TIME)-extract(second from BEGIN_INTERVAL_TIME)) as elapsed3
8 from dba_hist_snapshot
9 where snap_id < 6
10 order by 1;
SNAP_ID ELAPSED ELAPSED2 ELAPSED3
---------- ------------------------------ ---------- ----------
1 +000000000 00:00:44.281 44 44.281
2 +000000000 00:04:32.488 272 272.488
3 +000000000 00:51:39.969 3100 3099.969
4 +000000000 01:00:01.675 3602 3601.675
5 +000000000 00:59:01.697 3542 3541.697
Not particularly simple, but at least accurate. I’m happy to be told that there’s an easier way?
Why?
Why am I doing this? Because I am trying to look at the performance of some benchmarks I am working on. The load generation tool creates regular AWR snapshots so I want to look at the peak IO rates for each snapshot to save myself generating a million AWR reports.
I am specifically interested in the statistics redo size, physical reads, and physical writes from DBA_HIST_SYSSTAT. The tests aim to push each of these metrics (independently though, not all at the same time… yet!)
With that in mind, and with thanks to Ludovico Caldara for some code on which to build, here is the SQL that I am using to view the performance in each snapshot. First the output though, truncated to save some space on the screen:
SNAPSHOTID SNAPSHOTTIME REDO_MBSEC REDO_GRAPH READ_MBSEC READ_GRAPH WRITE_MBSEC WRITE_GRAPH
---------- --------------- ---------- -------------------- ---------- -------------------- ----------- --------------------
239 30-AUG 13:37:14 384.83 ****** 0 * .04 ******
240 30-AUG 13:38:07 284.27 **** 0 * .03 ****
241 30-AUG 13:42:14 296.62 **** 0 .03 ****
242 30-AUG 13:43:00 1242.08 ******************** 0 * .12 ********************
243 30-AUG 13:47:14 258.75 **** 0 .02 ****
244 30-AUG 13:48:28 866.83 ************** 0 * .08 *************
245 30-AUG 13:52:14 456.24 ******* 0 .04 *******
246 30-AUG 13:54:43 773.61 ************ 0 .07 ************
247 30-AUG 13:57:38 624.23 ********** 0 .06 *********
248 30-AUG 14:00:22 613.98 ********* 0 .05 *********
I’m currently running redo generation tests so I’m only interested in the redo size metric and the calculation of redo per second, i.e. column 3. I can use the graph at column 4 to instantly see which snapshot I need to look at: 243. That’s the one where redo was being generated at over 1.2Gb/sec – not bad for a two-socket machine attached to a single Violin 6616 array.
Now for the SQL… I warn you now, it’s a bit dense!
-- for educational use only - use at your own risk! -- display physical IO statistics from DBA_HIST_SYSSTAT -- specifically redo size, physical reads and physical writes set lines 140 pages 45 accept num_days prompt 'Enter the number of days to report on [default is 0.5]: ' set verify off SELECT redo_hist.snap_id AS SnapshotID , TO_CHAR(redo_hist.snaptime, 'DD-MON HH24:MI:SS') as SnapshotTime , ROUND(redo_hist.statval/elapsed_time/1048576,2) AS Redo_MBsec , SUBSTR(RPAD('*', 20 * ROUND ((redo_hist.statval/elapsed_time) / MAX (redo_hist.statval/elapsed_time) OVER (), 2), '*'), 1, 20) AS Redo_Graph , ROUND(physical_read_hist.statval/elapsed_time/1048576,2) AS Read_MBsec , SUBSTR(RPAD('*', 20 * ROUND ((physical_read_hist.statval/elapsed_time) / MAX (physical_read_hist.statval/elapsed_time) OVER (), 2), '*'), 1, 20) AS Read_Graph , ROUND(physical_write_hist.statval/elapsed_time/1048576,2) AS Write_MBsec , SUBSTR(RPAD('*', 20 * ROUND ((physical_write_hist.statval/elapsed_time) / MAX (physical_write_hist.statval/elapsed_time) OVER (), 2), '*'), 1, 20) AS Write_Graph FROM (SELECT s.snap_id ,g.value AS stattot ,s.end_interval_time AS snaptime ,NVL(DECODE(GREATEST(VALUE, NVL(lag (VALUE) OVER (PARTITION BY s.dbid, s.instance_number, g.stat_name ORDER BY s.snap_id), 0)), VALUE, VALUE - LAG (VALUE) OVER (PARTITION BY s.dbid, s.instance_number, g.stat_name ORDER BY s.snap_id), VALUE), 0) AS statval ,(EXTRACT(day FROM s.end_interval_time)-EXTRACT(day FROM s.begin_interval_time))*86400 + (EXTRACT(hour FROM s.end_interval_time)-EXTRACT(hour FROM s.begin_interval_time))*3600 + (EXTRACT(minute FROM s.end_interval_time)-EXTRACT(minute FROM s.begin_interval_time))*60 + (EXTRACT(second FROM s.end_interval_time)-EXTRACT(second FROM s.begin_interval_time)) as elapsed_time FROM dba_hist_snapshot s, dba_hist_sysstat g, v$instance i WHERE s.snap_id = g.snap_id AND s.begin_interval_time >= sysdate-NVL('&num_days', 0.5) AND s.instance_number = i.instance_number AND s.instance_number = g.instance_number AND g.stat_name = 'redo size') redo_hist, (SELECT s.snap_id ,g.value AS stattot ,NVL(DECODE(GREATEST(VALUE, NVL(lag (VALUE) OVER (PARTITION BY s.dbid, s.instance_number, g.stat_name ORDER BY s.snap_id), 0)), VALUE, VALUE - LAG (VALUE) OVER (PARTITION BY s.dbid, s.instance_number, g.stat_name ORDER BY s.snap_id), VALUE), 0) AS statval FROM dba_hist_snapshot s, dba_hist_sysstat g, v$instance i WHERE s.snap_id = g.snap_id AND s.begin_interval_time >= sysdate-NVL('&num_days', 0.5) AND s.instance_number = i.instance_number AND s.instance_number = g.instance_number AND g.stat_name = 'physical read total bytes') physical_read_hist, (SELECT s.snap_id ,g.value AS stattot ,NVL(DECODE(GREATEST(VALUE, NVL(lag (VALUE) OVER (PARTITION BY s.dbid, s.instance_number, g.stat_name ORDER BY s.snap_id), 0)), VALUE, VALUE - LAG (VALUE) OVER (PARTITION BY s.dbid, s.instance_number, g.stat_name ORDER BY s.snap_id), VALUE), 0) AS statval FROM dba_hist_snapshot s, dba_hist_sysstat g, v$instance i WHERE s.snap_id = g.snap_id AND s.begin_interval_time >= sysdate-NVL('&num_days', 0.5) AND s.instance_number = i.instance_number AND s.instance_number = g.instance_number AND g.stat_name = 'physical write total bytes') physical_write_hist WHERE redo_hist.snap_id = physical_read_hist.snap_id AND redo_hist.snap_id = physical_write_hist.snap_id ORDER BY 1;
This is a very usefull script to mine the AWR repository. I find it very re-usable since one can simply add an additional metric by simply copy/pasting. Thanks for sharing !
Thanks for the feedback, it makes sharing worthwhile!
thanks for posting.
Lynn, NW Ohio
Pingback: Retrieve and visualize system statistics metrics from AWR with R | bdt's oracle blog
I sympathize with the “Gaaaah…”!
Hello,
The script is quite impressive and it works fine for standalone/restart instance. Can you please share the RAC specific SQL code to capture the different events. i mean capture the information across instances and show up by instance wise. The reason behind that we have cluster GRID environment wherein we have 30 instance runs on a single node and it’s pretty hard to know about which node was highly use in term of IOPS generation.
I’m not entirely sure what you mean? Do you mean capturing data from multiple databases? If you have 30 instances on a single node then they must be different databases, which means it’s virtually impossible to do what you are asking. The only methods I can think of would be creating a horrible mess of database links or writing some kind of wrapper script which logs in to each database to run the SQL. Neither of those sound appealing to me…