This is a very quick post, because I’m about to log off and take an extended summer holiday (or vacation as my crazy American friends call it… but then they call football “soccer” too). Before I go, I wanted to document my initial findings with the new ASM Filter Driver feature introduced in this week’s 12.1.0.2 patchset. [For a more recent post on this topic, read here]
Currently a Linux-only feature, the ASM Filter Driver (or AFD) is a replacement for ASMLib and is described by Oracle as follows:
Oracle ASM Filter Driver (Oracle ASMFD) is a kernel module that resides in the I/O path of the Oracle ASM disks. Oracle ASM uses the filter driver to validate write I/O requests to Oracle ASM disks.
The Oracle ASMFD simplifies the configuration and management of disk devices by eliminating the need to rebind disk devices used with Oracle ASM each time the system is restarted.
The Oracle ASM Filter Driver rejects any I/O requests that are invalid. This action eliminates accidental overwrites of Oracle ASM disks that would cause corruption in the disks and files within the disk group. For example, the Oracle ASM Filter Driver filters out all non-Oracle I/Os which could cause accidental overwrites.
Interesting, eh? So let’s find out how that works.
Installation
I found this a real pain as you need to have 12.1.0.2 installed before the AFD is available to label your disks, yet the default OUI mode wants to create an ASM diskgroup… and you cannot do that without any labelled disks.
The only solution I could come up with was to perform a software-only install, which in itself is a pain. I’ll spare you the numerous screenshots of that part and jump straight to the point where I have 12.1.0.2 Grid Infrastructure installed.
I’m following these instructions because I am using a single-instance Oracle Restart system rather than a true cluster.
First of all we need to do this:
[oracle@server3 ~]$ $ORACLE_HOME/bin/asmcmd dsset 'AFD:*'
[oracle@server3 ~]$ $ORACLE_HOME/bin/asmcmd dsget
parameter:AFD:*
profile:AFD:*
[oracle@server3 ~]$ srvctl config asm
ASM home:
Password file:
ASM listener: LISTENER
Spfile: /u01/app/oracle/admin/+ASM/pfile/spfile+ASM.ora
ASM diskgroup discovery string: AFD:*
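If you want to script this step, a quick sanity check on the discovery string might look like the sketch below. The function name discovery_is_afd is my own invention (it is not an Oracle tool), and it only inspects the "parameter:" line in the format shown in the dsget output above, so treat it as an illustration rather than a supported check.

```shell
#!/bin/sh
# Sketch only: discovery_is_afd is a hypothetical helper, not part of Oracle's tooling.
# It reads `asmcmd dsget` output on stdin and succeeds only if the
# "parameter:" line is exactly AFD:* (the format shown in the transcript above).
discovery_is_afd() {
    # -x matches the whole line, -q suppresses output; \* is a literal asterisk.
    grep -qx 'parameter:AFD:\*'
}

# Example use on a real system (commented out so the sketch has no side effects):
#   "$ORACLE_HOME/bin/asmcmd" dsget | discovery_is_afd && echo "discovery string is AFD:*"
```

Reading from stdin rather than calling asmcmd inside the function keeps the check easy to try against saved output.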
Then we need to stop HAS and run the afd_configure command:
[root@server3 ~]# $ORACLE_HOME/bin/crsctl stop has -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'server3'
CRS-2673: Attempting to stop 'ora.asm' on 'server3'
CRS-2673: Attempting to stop 'ora.evmd' on 'server3'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'server3'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'server3' succeeded
CRS-2677: Stop of 'ora.evmd' on 'server3' succeeded
CRS-2677: Stop of 'ora.asm' on 'server3' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'server3'
CRS-2677: Stop of 'ora.cssd' on 'server3' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'server3' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@server3 ~]# $ORACLE_HOME/bin/asmcmd afd_configure
Connected to an idle instance.
AFD-627: AFD distribution files found.
AFD-636: Installing requested AFD software.
AFD-637: Loading installed AFD drivers.
AFD-9321: Creating udev for AFD.
AFD-9323: Creating module dependencies - this may take some time.
AFD-9154: Loading 'oracleafd.ko' driver.
AFD-649: Verifying AFD devices.
AFD-9156: Detecting control device '/dev/oracleafd/admin'.
AFD-638: AFD installation correctness verified.
Modifying resource dependencies - this may take some time.
ASMCMD-9524: AFD configuration failed 'ERROR: OHASD start failed'
Er… that’s not really what I had in mind. But hey, let’s carry on regardless:
[root@server3 oracleafd]# $ORACLE_HOME/bin/asmcmd afd_state
Connected to an idle instance.
ASMCMD-9526: The AFD state is 'LOADED' and filtering is 'DEFAULT' on host 'server3.local'
[root@server3 oracleafd]# $ORACLE_HOME/bin/crsctl start has
CRS-4123: Oracle High Availability Services has been started.
Ok it seems to be working. I wonder what it’s done?
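Since afd_configure reported a failure, it may be worth checking the driver state from a script rather than by eye. The sketch below pulls the state and filtering values out of the ASMCMD-9526 message shown above. The helper name afd_state_field is hypothetical, and it assumes the message wording stays exactly as it appears in my transcript, which I have not verified across versions.

```shell
#!/bin/sh
# Sketch only: afd_state_field is a hypothetical helper, not an Oracle command.
# It reads `asmcmd afd_state` output on stdin and prints the quoted value that
# follows "<field> is", where <field> is "state" or "filtering".
# Assumes the ASMCMD-9526 wording shown in the transcript above.
afd_state_field() {
    sed -n "s/.*$1 is '\([^']*\)'.*/\1/p"
}

# Example use on a real system (commented out so the sketch has no side effects):
#   "$ORACLE_HOME/bin/asmcmd" afd_state | afd_state_field state
```

Lines that do not contain the message, such as "Connected to an idle instance.", are simply ignored.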
Investigation
The first thing I notice is some Oracle kernel modules have been loaded:
[root@server3 ~]# lsmod | grep ora
oracleafd 208499 1
oracleacfs 3307969 0
oracleadvm 506254 0
oracleoks 505749 2 oracleacfs,oracleadvm
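The same check can be scripted. This is a minimal sketch that looks for the oracleafd module name in lsmod-style output; afd_module_loaded is a name I made up for illustration.

```shell
#!/bin/sh
# Sketch only: afd_module_loaded is a hypothetical helper.
# It reads lsmod-style output on stdin (module name in the first column)
# and succeeds if a module called exactly "oracleafd" is listed.
afd_module_loaded() {
    awk '$1 == "oracleafd" { found = 1 } END { exit !found }'
}

# Example use on a real system (commented out so the sketch has no side effects):
#   lsmod | afd_module_loaded && echo "oracleafd is loaded"
```

Matching on the first column avoids false positives from modules that merely list oracleafd as a dependent.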
I also see that, just like ASMLib, a driver has been plonked into the /opt/oracle/extapi directory:
[root@server3 1]# find /opt/oracle/extapi -ls
2752765 4 drwxr-xr-x 3 root root 4096 Jul 25 15:15 /opt/oracle/extapi
2752766 4 drwxr-xr-x 3 root root 4096 Jul 25 15:15 /opt/oracle/extapi/64
2753508 4 drwxr-xr-x 3 root root 4096 Jul 25 15:15 /opt/oracle/extapi/64/asm
2756532 4 drwxr-xr-x 3 root root 4096 Jul 25 15:15 /opt/oracle/extapi/64/asm/orcl
2756562 4 drwxr-xr-x 2 root root 4096 Jul 25 15:15 /opt/oracle/extapi/64/asm/orcl/1
2756578 268 -rwxr-xr-x 1 oracle dba 272513 Jul 25 15:15 /opt/oracle/extapi/64/asm/orcl/1/libafd12.so
And again, just like ASMLib, there is a new directory under /dev called /dev/oracleafd (whereas for ASMLib it’s called /dev/oracleasm):
[root@server3 ~]# ls -la /dev/oracleafd/
total 0
drwxrwx--- 3 oracle dba 80 Jul 25 15:15 .
drwxr-xr-x 21 root root 15820 Jul 25 15:15 ..
brwxrwx--- 1 oracle dba 249, 0 Jul 25 15:15 admin
drwxrwx--- 2 oracle dba 40 Jul 25 15:15 disks
The disks directory is currently empty. Maybe I should create some AFD devices and see what happens?
Labelling
So let’s look at my Violin devices and see if I can label them:
[root@server3 mapper]# ls -l /dev/mapper
total 0
crw-rw---- 1 root root 10, 236 Jul 11 16:52 control
lrwxrwxrwx 1 root root 7 Jul 25 15:49 data1 -> ../dm-3
lrwxrwxrwx 1 root root 7 Jul 25 15:49 data2 -> ../dm-4
lrwxrwxrwx 1 root root 7 Jul 25 15:49 data3 -> ../dm-5
lrwxrwxrwx 1 root root 7 Jul 25 15:49 data4 -> ../dm-6
lrwxrwxrwx 1 root root 7 Jul 25 15:49 data5 -> ../dm-7
lrwxrwxrwx 1 root root 7 Jul 25 15:49 data6 -> ../dm-8
lrwxrwxrwx 1 root root 7 Jul 25 15:49 data7 -> ../dm-9
lrwxrwxrwx 1 root root 8 Jul 25 15:49 data8 -> ../dm-10
lrwxrwxrwx 1 root root 7 Jul 11 16:53 VolGroup-lv_home -> ../dm-2
lrwxrwxrwx 1 root root 7 Jul 11 16:53 VolGroup-lv_root -> ../dm-0
lrwxrwxrwx 1 root root 7 Jul 11 16:52 VolGroup-lv_swap -> ../dm-1
The documentation appears to be incorrect here: it says to use the command $ORACLE_HOME/bin/afd_label, but it’s actually $ORACLE_HOME/bin/asmcmd with afd_label as the first parameter. I’m going to label the devices called /dev/mapper/data*:
[root@server3 mapper]# for lun in 1 2 3 4 5 6 7 8; do
> asmcmd afd_label DATA$lun /dev/mapper/data$lun
> done
Connected to an idle instance.
Connected to an idle instance.
Connected to an idle instance.
Connected to an idle instance.
Connected to an idle instance.
Connected to an idle instance.
Connected to an idle instance.
Connected to an idle instance.
[root@server3 mapper]# asmcmd afd_lsdsk
Connected to an idle instance.
--------------------------------------------------------------------------------
Label Filtering Path
================================================================================
DATA1 ENABLED /dev/mapper/data1
DATA2 ENABLED /dev/mapper/data2
DATA3 ENABLED /dev/mapper/data3
DATA4 ENABLED /dev/mapper/data4
DATA5 ENABLED /dev/mapper/data5
DATA6 ENABLED /dev/mapper/data6
DATA7 ENABLED /dev/mapper/data7
DATA8 ENABLED /dev/mapper/data8
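Labelling is not something you want to get wrong, so a dry run that prints the commands before anything is executed may be worthwhile. The sketch below does only that: it prints one asmcmd afd_label command per device and runs nothing. The helper name print_label_cmds and the rule of upper-casing the device name to form the label are my own assumptions, chosen to match the DATA1..DATA8 labels used above.

```shell
#!/bin/sh
# Sketch only: print_label_cmds is a hypothetical helper.
# For each device path given as an argument it prints (but does not run)
# an `asmcmd afd_label` command, using the upper-cased device name as the label.
print_label_cmds() {
    for dev in "$@"; do
        name=$(basename "$dev")                                    # e.g. data1
        label=$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')  # e.g. DATA1
        printf 'asmcmd afd_label %s %s\n' "$label" "$dev"
    done
}

# Example use (prints the commands for review only):
#   print_label_cmds /dev/mapper/data*
```

Once the printed commands look right, they can be run by hand as root, as in the loop above.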
That seemed to work ok. So what’s going on in the /dev/oracleafd/disks directory now?
[root@server3 ~]# ls -l /dev/oracleafd/disks/
total 32
-rw-r--r-- 1 root root 26 Jul 25 15:52 DATA1
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA2
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA3
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA4
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA5
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA6
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA7
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA8
There they are, just like with ASMLib. But look at the permissions: they are all owned by root, with read-only privileges for other users. In an ASMLib environment these devices are owned by oracle:dba, which means non-Oracle processes can write to them and corrupt them in some situations. Is this how Oracle claims the AFD protects devices?
I haven’t had time to investigate further but I assume that the database will access the devices via this mysterious block device:
[oracle@server3 oracleafd]$ ls -l /dev/oracleafd/admin
brwxrwx--- 1 oracle dba 249, 0 Jul 25 16:25 /dev/oracleafd/admin
It will be interesting to find out.
Destruction
Of course, if you are logged in as root you aren’t going to be protected from any crazy behaviour:
[root@server3 ~]# cd /dev/oracleafd/disks
[root@server3 disks]# ls -l
total 496
-rw-r--r-- 1 root root 475877 Jul 25 16:40 DATA1
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA2
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA3
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA4
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA5
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA6
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA7
-rw-r--r-- 1 root root 26 Jul 25 15:49 DATA8
[root@server3 disks]# od -c -N 256 DATA8
0000000 / d e v / m a p p e r / d a t a
0000020 8 \n
0000032
[root@server3 disks]# dmesg >> DATA8
[root@server3 disks]# od -c -N 256 DATA8
0000000 / d e v / m a p p e r / d a t a
0000020 8 \n z r d b t e 2 l I n i t i a
0000040 l i z i n g c g r o u p s u
0000060 b s y s c p u s e t \n I n i t
0000100 i a l i z i n g c g r o u p
0000120 s u b s y s c p u \n L i n u x
0000140 v e r s i o n 3 . 8 . 1 3 -
0000160 2 6 . 2 . 3 . e l 6 u e k . x 8
0000200 6 _ 6 4 ( m o c k b u i l d @
0000220 c a - b u i l d 4 4 . u s . o r
0000240 a c l e . c o m ) ( g c c v
0000260 e r s i o n 4 . 4 . 7 2 0 1
0000300 2 0 3 1 3 ( R e d H a t 4
0000320 . 4 . 7 - 3 ) ( G C C ) )
0000340 # 2 S M P W e d A p r 1
0000360 6 0 2 : 5 1 : 1 0 P D T 2
0000400
Proof, if ever you need it, that root access is still the fastest and easiest route to total disaster…
[Update July 2015: Ok, so look. I was wrong in this post – these /dev/oracleafd/disks devices are simply pointers to devices in /dev/dm-* and thus I was only overwriting the pointer. To read a more accurate post on the subject, please read here]
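Given that correction, a less destructive way to look at these files is simply to read where each one points. The sketch below assumes, based only on the od output earlier in this post, that the first line of each file holds the path of the labelled device. The function name list_afd_pointers is hypothetical.

```shell
#!/bin/sh
# Sketch only: list_afd_pointers is a hypothetical helper.
# Assumption (taken from the od output above): the first line of each file
# in the disks directory is the path of the device that was labelled.
list_afd_pointers() {
    dir=${1:-/dev/oracleafd/disks}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip if the directory is empty
        printf '%s -> %s\n' "$(basename "$f")" "$(head -n 1 "$f")"
    done
}

# Example use on a real system (read-only, but commented out here):
#   list_afd_pointers
```

It only ever reads the files, so it is safe to run as an unprivileged user, provided that user has read permission on them.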