Oracle 12.1.0.2 ASM Filter Driver: Advanced Format Fail
August 11, 2014
[Please note that a more up-to-date post on this subject can be found here]
In my previous post on the subject of the new ASM Filter Driver (AFD) feature introduced in Oracle’s 12.1.0.2 patchset, I installed the AFD to see how it fulfilled its promise that it “filters out all non-Oracle I/Os which could cause accidental overwrites”. However, because I was ten minutes away from my summer vacation at the point of finishing that post, I didn’t actually get round to writing about what happens when you try to create ASM diskgroups on the devices it presents.
Obviously I’ve spent the intervening period constantly worrying about this oversight – indeed, it was only through the judicious application of good food and drink plus some committed relaxation in the sun that I was able to pull through. However, I’m back now and it seems like time to rectify that mistake. So here goes.
Creating ASM Diskgroups with the ASM Filter Driver
It turns out I need not have worried, because it doesn’t work right now… at least, not for me. Here’s why:
First of all, I installed Oracle 12.1.0.2 Grid Infrastructure. I then labelled some block devices presented from my Violin storage array. As I’ve already pasted all the output from those two steps in the previous post, I won’t repeat myself.
The next step is therefore to create a diskgroup. Since I’ve only just come back from holiday and so I’m still half brain-dead, I’ll choose the simple route and fire up the ASM Configuration Assistant (ASMCA) so that I don’t have to look up any of that nasty SQL. Here goes:
But guess what happened when I hit the OK button? It failed, big time. Here’s the alert log – if you don’t like huge amounts of meaningless text I suggest you skip down… a lot… (although thinking about it, my entire blog could be described as meaningless text):
SQL> CREATE DISKGROUP DATA EXTERNAL REDUNDANCY DISK 'AFD:DATA1' SIZE 72704M , 'AFD:DATA2' SIZE 72704M , 'AFD:DATA3' SIZE 72704M , 'AFD:DATA4' SIZE 72704M , 'AFD:DATA5' SIZE 72704M , 'AFD:DATA6' SIZE 72704M , 'AFD:DATA7' SIZE 72704M , 'AFD:DATA8' SIZE 72704M ATTRIBUTE 'compatible.asm'='12.1.0.0.0','au_size'='1M' /* ASMCA */
Fri Jul 25 16:25:33 2014
WARNING: Library 'AFD Library - Generic , version 3 (KABI_V3)' does not support advanced format disks
Fri Jul 25 16:25:33 2014
NOTE: Assigning number (1,0) to disk (AFD:DATA1)
NOTE: Assigning number (1,1) to disk (AFD:DATA2)
NOTE: Assigning number (1,2) to disk (AFD:DATA3)
NOTE: Assigning number (1,3) to disk (AFD:DATA4)
NOTE: Assigning number (1,4) to disk (AFD:DATA5)
NOTE: Assigning number (1,5) to disk (AFD:DATA6)
NOTE: Assigning number (1,6) to disk (AFD:DATA7)
NOTE: Assigning number (1,7) to disk (AFD:DATA8)
NOTE: initializing header (replicated) on grp 1 disk DATA1
NOTE: initializing header (replicated) on grp 1 disk DATA2
NOTE: initializing header (replicated) on grp 1 disk DATA3
NOTE: initializing header (replicated) on grp 1 disk DATA4
NOTE: initializing header (replicated) on grp 1 disk DATA5
NOTE: initializing header (replicated) on grp 1 disk DATA6
NOTE: initializing header (replicated) on grp 1 disk DATA7
NOTE: initializing header (replicated) on grp 1 disk DATA8
NOTE: initializing header on grp 1 disk DATA1
NOTE: initializing header on grp 1 disk DATA2
NOTE: initializing header on grp 1 disk DATA3
NOTE: initializing header on grp 1 disk DATA4
NOTE: initializing header on grp 1 disk DATA5
NOTE: initializing header on grp 1 disk DATA6
NOTE: initializing header on grp 1 disk DATA7
NOTE: initializing header on grp 1 disk DATA8
NOTE: Disk 0 in group 1 is assigned fgnum=1
NOTE: Disk 1 in group 1 is assigned fgnum=2
NOTE: Disk 2 in group 1 is assigned fgnum=3
NOTE: Disk 3 in group 1 is assigned fgnum=4
NOTE: Disk 4 in group 1 is assigned fgnum=5
NOTE: Disk 5 in group 1 is assigned fgnum=6
NOTE: Disk 6 in group 1 is assigned fgnum=7
NOTE: Disk 7 in group 1 is assigned fgnum=8
NOTE: initiating PST update: grp = 1
Fri Jul 25 16:25:33 2014
GMON updating group 1 at 1 for pid 7, osid 16745
NOTE: group DATA: initial PST location: disk 0000 (PST copy 0)
NOTE: set version 1 for asmCompat 12.1.0.0.0
Fri Jul 25 16:25:33 2014
NOTE: PST update grp = 1 completed successfully
NOTE: cache registered group DATA 1/0xD9B6AE8D
NOTE: cache began mount (first) of group DATA 1/0xD9B6AE8D
NOTE: cache is mounting group DATA created on 2014/07/25 16:25:33
NOTE: cache opening disk 0 of grp 1: DATA1 label:DATA1
NOTE: cache opening disk 1 of grp 1: DATA2 label:DATA2
NOTE: cache opening disk 2 of grp 1: DATA3 label:DATA3
NOTE: cache opening disk 3 of grp 1: DATA4 label:DATA4
NOTE: cache opening disk 4 of grp 1: DATA5 label:DATA5
NOTE: cache opening disk 5 of grp 1: DATA6 label:DATA6
NOTE: cache opening disk 6 of grp 1: DATA7 label:DATA7
NOTE: cache opening disk 7 of grp 1: DATA8 label:DATA8
NOTE: cache creating group 1/0xD9B6AE8D (DATA)
NOTE: cache mounting group 1/0xD9B6AE8D (DATA) succeeded
WARNING: cache read a corrupt block: group=1(DATA) dsk=0 blk=1 disk=0 (DATA1) incarn=3493224069 au=0 blk=1 count=1
Fri Jul 25 16:25:33 2014
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_16745.trc:
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
NOTE: a corrupted block from group DATA was dumped to /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_16745.trc
WARNING: cache read (retry) a corrupt block: group=1(DATA) dsk=0 blk=1 disk=0 (DATA1) incarn=3493224069 au=0 blk=1 count=1
Fri Jul 25 16:25:33 2014
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_16745.trc:
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
WARNING: cache read (retry) a corrupt block: group=1(DATA) dsk=0 blk=1 disk=0 (DATA1) incarn=3493224069 au=11 blk=1 count=1
Fri Jul 25 16:25:33 2014
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_16745.trc:
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
NOTE: a corrupted block from group DATA was dumped to /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_16745.trc
WARNING: cache read (retry) a corrupt block: group=1(DATA) dsk=0 blk=1 disk=0 (DATA1) incarn=3493224069 au=11 blk=1 count=1
Fri Jul 25 16:25:33 2014
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_16745.trc:
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ERROR: cache failed to read group=1(DATA) dsk=0 blk=1 from disk(s): 0(DATA1) 0(DATA1)
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
NOTE: cache initiating offline of disk 0 group DATA
NOTE: process _user16745_+asm (16745) initiating offline of disk 0.3493224069 (DATA1) with mask 0x7e in group 1 (DATA) with client assisting
NOTE: initiating PST update: grp 1 (DATA), dsk = 0/0xd0365e85, mask = 0x6a, op = clear
Fri Jul 25 16:25:34 2014
GMON updating disk modes for group 1 at 2 for pid 7, osid 16745
ERROR: disk 0(DATA1) in group 1(DATA) cannot be offlined because the disk group has external redundancy.
Fri Jul 25 16:25:34 2014
ERROR: too many offline disks in PST (grp 1)
Fri Jul 25 16:25:34 2014
ERROR: no read quorum in group: required 1, found 0 disks
ERROR: Could not read PST for grp 1. Force dismounting the disk group.
Fri Jul 25 16:25:34 2014
NOTE: halting all I/Os to diskgroup 1 (DATA)
Fri Jul 25 16:25:34 2014
ERROR: no read quorum in group: required 1, found 0 disks
ASM Health Checker found 1 new failures
Fri Jul 25 16:25:36 2014
ERROR: no read quorum in group: required 1, found 0 disks
Fri Jul 25 16:25:36 2014
ERROR: Could not read PST for grp 1. Force dismounting the disk group.
Fri Jul 25 16:25:36 2014
ERROR: no read quorum in group: required 1, found 0 disks
ERROR: Could not read PST for grp 1. Force dismounting the disk group.
Fri Jul 25 16:25:36 2014
ERROR: no read quorum in group: required 1, found 0 disks
ERROR: Could not read PST for grp 1. Force dismounting the disk group.
Fri Jul 25 16:25:37 2014
NOTE: AMDU dump of disk group DATA initiated at /u01/app/oracle/diag/asm/+asm/+ASM/trace
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_16745.trc (incident=3257):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA1" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
Incident details in: /u01/app/oracle/diag/asm/+asm/+ASM/incident/incdir_3257/+ASM_ora_16745_i3257.trc
Fri Jul 25 16:25:37 2014
Sweep [inc][3257]: completed
Fri Jul 25 16:25:37 2014
SQL> alter diskgroup DATA check
System State dumped to trace file /u01/app/oracle/diag/asm/+asm/+ASM/incident/incdir_3257/+ASM_ora_16745_i3257.trc
NOTE: erasing header (replicated) on grp 1 disk DATA1
NOTE: erasing header (replicated) on grp 1 disk DATA2
NOTE: erasing header (replicated) on grp 1 disk DATA3
NOTE: erasing header (replicated) on grp 1 disk DATA4
NOTE: erasing header (replicated) on grp 1 disk DATA5
NOTE: erasing header (replicated) on grp 1 disk DATA6
NOTE: erasing header (replicated) on grp 1 disk DATA7
NOTE: erasing header (replicated) on grp 1 disk DATA8
NOTE: erasing header on grp 1 disk DATA1
NOTE: erasing header on grp 1 disk DATA2
NOTE: erasing header on grp 1 disk DATA3
NOTE: erasing header on grp 1 disk DATA4
NOTE: erasing header on grp 1 disk DATA5
NOTE: erasing header on grp 1 disk DATA6
NOTE: erasing header on grp 1 disk DATA7
NOTE: erasing header on grp 1 disk DATA8
Fri Jul 25 16:25:37 2014
NOTE: cache dismounting (clean) group 1/0xD9B6AE8D (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 16745, image: oracle@server3.local (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: LGWR not being messaged to dismount
NOTE: cache dismounted group 1/0xD9B6AE8D (DATA)
NOTE: cache ending mount (fail) of group DATA number=1 incarn=0xd9b6ae8d
NOTE: cache deleting context for group DATA 1/0xd9b6ae8d
Fri Jul 25 16:25:37 2014
GMON dismounting group 1 at 3 for pid 7, osid 16745
Fri Jul 25 16:25:37 2014
NOTE: Disk DATA1 in mode 0x7f marked for de-assignment
NOTE: Disk DATA2 in mode 0x7f marked for de-assignment
NOTE: Disk DATA3 in mode 0x7f marked for de-assignment
NOTE: Disk DATA4 in mode 0x7f marked for de-assignment
NOTE: Disk DATA5 in mode 0x7f marked for de-assignment
NOTE: Disk DATA6 in mode 0x7f marked for de-assignment
NOTE: Disk DATA7 in mode 0x7f marked for de-assignment
NOTE: Disk DATA8 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not created
ORA-15018: diskgroup cannot be created
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
Fri Jul 25 16:25:37 2014
ORA-15032: not all alterations performed
ORA-15066: offlining disk "DATA1" in group "DATA" may result in a data loss
ORA-15001: diskgroup "DATA" does not exist or is not mounted
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648] [1] [0 != 1]
Now then. First of all, thanks for making it this far – I promise not to do that again in this post. Secondly, in case you really did just hit page down *a lot* you might want to skip back up and look for the bits I’ve conveniently highlighted in red. Specifically, this bit:
WARNING: Library 'AFD Library - Generic , version 3 (KABI_V3)' does not support advanced format disks
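If you would rather not scroll through a log like that by eye, a few lines of Python can pull out the interesting parts. This is only a rough sketch: the function name summarise_alert_log is made up for illustration, and the regular expressions are based purely on the message formats visible in the alert log above, so they may need adjusting for other versions.

```python
import re
from collections import Counter


def summarise_alert_log(text):
    """Return (ORA error counts, AFD advanced-format warnings) found in text.

    Hypothetical helper: the patterns are assumptions based on the
    messages shown in this post, not on any documented log format.
    """
    # Count every ORA-nnnnn code, wherever it appears in the text.
    ora_counts = Counter(re.findall(r"ORA-\d{5}", text))
    # Collect the library warning about advanced format disks.
    af_warnings = re.findall(
        r"WARNING: Library '[^']*' does not support advanced format disks",
        text,
    )
    return ora_counts, af_warnings


# A short excerpt in the same shape as the alert log above.
sample = (
    "WARNING: Library 'AFD Library - Generic , version 3 (KABI_V3)' "
    "does not support advanced format disks "
    "ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] "
    "ORA-15018: diskgroup cannot be created "
    "ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh]"
)
counts, warnings = summarise_alert_log(sample)
print(counts.most_common())
print(warnings)
```

Because it searches for patterns rather than splitting on lines, it should work whether the log has kept its line breaks or been pasted as one long run of text.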
Many modern storage platforms use Advanced Format – if you want to know what that means, read here. The idea that AFD doesn’t support advanced format is somewhat alarming – and indeed incorrect, according to interactions I have subsequently had with Oracle’s ASM Product Management people. From what I understand, the problem is tracked as bug 19297177 (currently unpublished) and is caused by AFD incorrectly checking the physical block size of the storage device (4 KB in my case) instead of the logical block size (512 bytes). I currently have a request open with Oracle Support for the patch, so when that arrives I will re-test and write another blog article.
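To see which block sizes a Linux device is actually reporting, you can read them from sysfs. The sketch below assumes a Linux host that exposes logical_block_size and physical_block_size under /sys/block/<device>/queue; the function names and the sysfs_root parameter are my own inventions for illustration. The 512n / 512e / 4Kn labels are the usual Advanced Format terms.

```python
from pathlib import Path


def read_block_sizes(device, sysfs_root="/sys/block"):
    """Return (logical, physical) block sizes in bytes for a block device.

    'device' is a kernel name such as 'sdb' or 'dm-5' (illustrative names).
    'sysfs_root' is a hypothetical parameter, present only so the function
    can be pointed at a fake directory tree for testing.
    """
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return logical, physical


def classify_sector_format(logical, physical):
    """Map a (logical, physical) pair onto the usual Advanced Format labels."""
    if logical == 512 and physical == 512:
        return "512n"   # native 512-byte sectors
    if logical == 512 and physical == 4096:
        return "512e"   # 4K physical sectors with 512-byte emulation
    if logical == 4096 and physical == 4096:
        return "4Kn"    # native 4K sectors
    return "other"


# Example usage on a real host (device name is an assumption):
#   logical, physical = read_block_sizes("dm-5")
#   print(classify_sector_format(logical, physical))
```

The devices in this post reported a 512-byte logical and 4k physical block size, which the classifier above would label 512e. A check that looks only at the physical size cannot tell that apart from a native 4K device.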
Until then, I guess I might as well take another well-earned vacation?
Did you ever get a patch for this bug?
I have just patched to 12.1.0.2.2, as there was a MOS note saying not to run afd_config unless you have applied the first PSU (October 2014). Drives that were raw mapper disks and are now AFD with a sector size of 512 work fine, but when I try to run
@mkdg.sql
CREATE DISKGROUP DAT4 EXTERNAL REDUNDANCY DISK 'AFD:DAT_0001'
*
ERROR at line 1:
ORA-15018: diskgroup cannot be created
ORA-15335: ASM metadata corruption detected in disk group 'DAT4'
ORA-15130: diskgroup "DAT4" is being dismounted
ORA-15066: offlining disk "DAT_0001" in group "DAT4" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648]
[1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648]
[1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648]
[1] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [2147483648]
using a new 4K disk it runs into this issue.
After a year I would have thought it would be fixed by now.
Do you think redoing afd_config and afd_label on the new disk would fix this issue?
BTW, all my AFD disks come up disabled on reboot, but CRS can restart the DB.
I haven’t been back and re-tested AFD since I wrote that article. It’s on my to-do list, but then so are many other things…
This seems more and more like a bug with 4096-byte sector drives, which was opened last year but not published. Did Oracle’s ASM contacts ever say they had fixed the bug?
Were you ever able to create a disk group using 4096-byte sector drives? I guess it might work if the drive uses 512-byte emulation.
Well, after a year of trying I was able to get AFD to fully work…
Steps:
1. Install PSU 1 or higher (I used 2)
2. asmcmd afd_reconfigure
3. asmcmd afd_configure
4. asmcmd afd_label label path --migrate
e.g. asmcmd afd_label CRS_0000 /dev/dm-5 --migrate
It doesn’t seem to like /dev/mapper/ devices, so use the related dm block device.
5. asmcmd afd_lsdsk
6. crsctl start has (or crsctl start crs)
CRS must be down for the steps after step 1.
Also make sure /etc/afd.conf has the path to all your devices.
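On the point about using the dm block device rather than the /dev/mapper/ name: on many Linux systems the entries under /dev/mapper are symlinks to the /dev/dm-N nodes, so the mapping can be looked up rather than guessed. A minimal sketch, assuming that symlink layout (some distributions create real device nodes there instead, in which case the path comes back unchanged); resolve_dm_device is a hypothetical helper name.

```python
import os


def resolve_dm_device(mapper_path):
    """Follow a /dev/mapper/<name> symlink to the device node it points at.

    Assumes /dev/mapper entries are symlinks to /dev/dm-N. If the path is
    not a symlink, os.path.realpath simply returns it in absolute form.
    """
    return os.path.realpath(mapper_path)


# Example usage (the mapper name is made up for illustration):
#   print(resolve_dm_device("/dev/mapper/crs_0000"))
```

The resolved path is what you would then pass as the path argument in step 4.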
Thanks Terry, that’s kind of rekindled my interest. I might see if I can grab a test box and give this a go myself.
You’re most welcome. I would suggest waiting for the July ’15 PSU to come out, applying it, and then running the steps above. That way you can see just how far they have gotten with all the needed patches. Hey, they may even have a fix for SDD on AFD, which I saw mentioned in a community.oracle.com post.
If you add in a step to asmdisk remove the existing path, then you should have what’s needed to migrate existing raw disks to AFD. asmdisk migrate afd would read whatever existing disks are there, then generate and run the asmcmd afd_label $path --migrate commands needed. I can’t say often enough that you have to have CRS/HAS down during the migration, and under no circumstances have AFD and non-AFD (e.g. ASMLib) disks up in any way (even just having ASMLib on the same server), as it will forever hose your OS. I had to re-image from a backup server just to get Linux to boot, even in single-user mode.
Good Luck