ASM disk group fails to mount with ORA-15017 and ORA-15066: offlining disk "RACQ$LUN3" in group "DATA" may result in a data loss
I encountered an ORA-15066 error that prevented an ASM disk group from mounting.
This post shows how the issue was analyzed and how a workaround was found.
The environment is a test 21c cluster that had not been shut down properly the day before.
Mounting the DATA disk group failed with ORA-15066:
SQL> alter diskgroup data mount;
alter diskgroup data mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15066: offlining disk "RACQ$LUN3" in group "DATA" may result in a data loss
Mounting the disk group with the FORCE option does not help either; it fails with the same errors:
ASMCMD> mount data -f
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15066: offlining disk "RACQ$LUN3" in group "DATA" may result in a data loss (DBD ERROR: OCIStmtExecute)
The alert log contains the following entries:
2021-08-31T10:31:52.117810+00:00
SQL> alter diskgroup data mount
2021-08-31T10:31:52.124846+00:00
NOTE: cache registered group DATA 2/0x05D14A8C
NOTE: cache began mount (first) of group DATA 2/0x05D14A8C
NOTE: Assigning number (2,5) to disk (/dev/flashgrid/rac2.lun5)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,1) to disk (/dev/flashgrid/rac2.lun6)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,4) to disk (/dev/flashgrid/rac2.lun3)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,2) to disk (/dev/flashgrid/racq.lun3)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,6) to disk (/dev/flashgrid/rac1.lun5)
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: Assigning number (2,0) to disk (/dev/flashgrid/rac1.lun6)
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: Assigning number (2,3) to disk (/dev/flashgrid/rac1.lun3)
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:31:52.294934+00:00
cluster guid (5f307c7210446f13bfcd86fa9d15c5f1) generated for PST Hbeat for instance 1
NOTE: initial disk modes for disk 2 (RACQ$LUN3) in group 2 (DATA) is not completely online: modes 0x1 lflags 0x4
2021-08-31T10:31:52.297979+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:31:58.301169+00:00
ERROR: disk 2 (RACQ$LUN3) in group 2 cannot be offlined because all disks [2(RACQ$LUN3)] with mirror data would be offline.
2021-08-31T10:31:58.301239+00:00
ERROR: too many offline disks in PST (grp 2)
2021-08-31T10:31:58.301932+00:00
NOTE: cache dismounting (clean) group 2/0x05D14A8C (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 7067, NID: 4026531836, image: oracle@rac1.example.com (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: LGWR not being messaged to dismount
NOTE: cache dismounted group 2/0x05D14A8C (DATA)
NOTE: cache ending mount (fail) of group DATA number=2 incarn=0x05d14a8c
NOTE: cache deleting context for group DATA 2/0x05d14a8c
2021-08-31T10:31:58.303103+00:00
GMON dismounting group 2 at 90 for pid 57, osid 7067
2021-08-31T10:31:58.303346+00:00
NOTE: Disk RAC1$LUN6 in mode 0x7f marked for de-assignment
NOTE: Disk RAC2$LUN6 in mode 0x7f marked for de-assignment
NOTE: Disk RACQ$LUN3 in mode 0x1 marked for de-assignment
NOTE: Disk RAC1$LUN3 in mode 0x7f marked for de-assignment
NOTE: Disk RAC2$LUN3 in mode 0x7f marked for de-assignment
NOTE: Disk RAC2$LUN5 in mode 0x7f marked for de-assignment
NOTE: Disk RAC1$LUN5 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15066: offlining disk "RACQ$LUN3" in group "DATA" may result in a data loss
2021-08-31T10:31:58.314373+00:00
ERROR: alter diskgroup data mount
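DATA is a NORMAL redundancy disk group with two regular failure groups and one quorum failure group.
This is one of those cases where an undocumented option may be needed. The disk status is shown before and after mounting the disk group with RESTRICTED FOR RECOVERY: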
SQL> select name,
path,
mount_status,
header_status,
mode_status,
state,
failgroup
from v$asm_disk
where group_number=(select group_number from v$asm_diskgroup where name='DATA')
order by path;
NAME                           PATH                           MOUNT_S HEADER_STATU MODE_ST STATE    FAILGROUP
------------------------------ ------------------------------ ------- ------------ ------- -------- ------------------------------
                               /dev/flashgrid/rac1.lun3       CLOSED  MEMBER       ONLINE  NORMAL
                               /dev/flashgrid/rac1.lun5       CLOSED  MEMBER       ONLINE  NORMAL
                               /dev/flashgrid/rac1.lun6       CLOSED  MEMBER       ONLINE  NORMAL
                               /dev/flashgrid/rac2.lun3       CLOSED  MEMBER       ONLINE  NORMAL
                               /dev/flashgrid/rac2.lun5       CLOSED  MEMBER       ONLINE  NORMAL
                               /dev/flashgrid/rac2.lun6       CLOSED  MEMBER       ONLINE  NORMAL
                               /dev/flashgrid/racq.lun2       CLOSED  FORMER       ONLINE  NORMAL
                               /dev/flashgrid/racq.lun3       CLOSED  MEMBER       ONLINE  NORMAL
8 rows selected.
SQL> alter diskgroup data mount restricted for recovery;
Diskgroup altered.
SQL> select name,
path,
mount_status,
header_status,
mode_status,
state,
failgroup
from v$asm_disk
where group_number=(select group_number from v$asm_diskgroup where name='DATA')
order by path;
NAME                           PATH                           MOUNT_S HEADER_STATU MODE_ST STATE    FAILGROUP
------------------------------ ------------------------------ ------- ------------ ------- -------- ------------------------------
RAC1$LUN3                      /dev/flashgrid/rac1.lun3       CACHED  MEMBER       ONLINE  NORMAL   RAC1
RAC1$LUN5                      /dev/flashgrid/rac1.lun5       CACHED  MEMBER       ONLINE  NORMAL   RAC1
RAC1$LUN6                      /dev/flashgrid/rac1.lun6       CACHED  MEMBER       ONLINE  NORMAL   RAC1
RAC2$LUN3                      /dev/flashgrid/rac2.lun3       CACHED  MEMBER       ONLINE  NORMAL   RAC2
RAC2$LUN5                      /dev/flashgrid/rac2.lun5       CACHED  MEMBER       ONLINE  NORMAL   RAC2
RAC2$LUN6                      /dev/flashgrid/rac2.lun6       CACHED  MEMBER       ONLINE  NORMAL   RAC2
RACQ$LUN3                      /dev/flashgrid/racq.lun3       CACHED  MEMBER       ONLINE  NORMAL   RACQ
7 rows selected.
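With the group mounted, the failure group layout mentioned earlier (two regular failure groups plus one quorum failure group) can be cross-checked. The query below is only a sketch that I did not run as part of this incident; it assumes the standard FAILGROUP_TYPE column of V$ASM_DISK and is meant to be run in the ASM instance:

```sql
-- Count disks per failure group and show whether each group is REGULAR or QUORUM.
-- Run while the disk group is mounted: FAILGROUP is blank for disks of a
-- dismounted group, as the first listing above shows.
select failgroup,
       failgroup_type,
       count(*) as disks
from v$asm_disk
where group_number = (select group_number from v$asm_diskgroup where name = 'DATA')
group by failgroup, failgroup_type
order by failgroup;
```

Based on the listing above, this should return one row per failure group: RAC1, RAC2 and RACQ.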
The alert log for this mount:
2021-08-31T10:37:56.317024+00:00
SQL> alter diskgroup data mount restricted for recovery
2021-08-31T10:37:56.323711+00:00
NOTE: cache registered group DATA 2/0xEA114A90
NOTE: cache began mount (first) of group DATA 2/0xEA114A90
NOTE: Assigning number (2,5) to disk (/dev/flashgrid/rac2.lun5)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,1) to disk (/dev/flashgrid/rac2.lun6)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,4) to disk (/dev/flashgrid/rac2.lun3)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,2) to disk (/dev/flashgrid/racq.lun3)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,6) to disk (/dev/flashgrid/rac1.lun5)
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: Assigning number (2,0) to disk (/dev/flashgrid/rac1.lun6)
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: Assigning number (2,3) to disk (/dev/flashgrid/rac1.lun3)
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:37:56.529659+00:00
cluster guid (5f307c7210446f13bfcd86fa9d15c5f1) generated for PST Hbeat for instance 1
NOTE: initial disk modes for disk 2 (RACQ$LUN3) in group 2 (DATA) is not completely online: modes 0x1 lflags 0x4
2021-08-31T10:37:56.532517+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:02.539918+00:00
NOTE: GMON heartbeating for grp 2 (DATA)
2021-08-31T10:38:02.540568+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
GMON querying group 2 at 125 for pid 57, osid 7067
2021-08-31T10:38:02.540905+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:02.541129+00:00
NOTE: cache is mounting group DATA created on 2021/07/30 12:37:34
NOTE: cache opening disk 0 of grp 2: RAC1$LUN6 path:/dev/flashgrid/rac1.lun6
NOTE: group 2 (DATA) high disk header ckpt advanced to fcn 0.42883
NOTE: 08/31/21 10:38:02 DATA.F1X0 found on disk 0 au 10 fcn 0.42883 datfmt 2
NOTE: cache opening disk 1 of grp 2: RAC2$LUN6 path:/dev/flashgrid/rac2.lun6
NOTE: cache opening disk 3 of grp 2: RAC1$LUN3 path:/dev/flashgrid/rac1.lun3
NOTE: cache opening disk 4 of grp 2: RAC2$LUN3 path:/dev/flashgrid/rac2.lun3
NOTE: cache opening disk 5 of grp 2: RAC2$LUN5 path:/dev/flashgrid/rac2.lun5
NOTE: 08/31/21 10:38:02 DATA.F1X0 found on disk 5 au 10 fcn 0.42883 datfmt 2
NOTE: cache opening disk 6 of grp 2: RAC1$LUN5 path:/dev/flashgrid/rac1.lun5
2021-08-31T10:38:02.541726+00:00
NOTE: cache mounting (first) normal redundancy group 2/0xEA114A90 (DATA)
2021-08-31T10:38:02.963014+00:00
NOTE: attached to recovery domain 2
2021-08-31T10:38:03.009050+00:00
validate pdb 2, flags x4, valid 0, pdb flags x204
* validated domain 2, flags = 0x200
NOTE: cache recovered group 2 to fcn 0.43425
NOTE: redo buffer size is 512 blocks (2105344 bytes)
2021-08-31T10:38:03.011498+00:00
NOTE: LGWR attempting to mount thread 1 for diskgroup 2 (DATA)
NOTE: LGWR found thread 1 closed at ABA 30.7188 lock domain=0 inc#=0 instnum=1
NOTE: LGWR mounted thread 1 for diskgroup 2 (DATA)
2021-08-31T10:38:03.022814+00:00
NOTE: LGWR opened thread 1 (DATA) at fcn 0.43425 ABA 31.7189 lock domain=2 inc#=2 instnum=1 gx.incarn=3927001744 mntstmp=2021/08/31 10:38:03.012000
2021-08-31T10:38:03.023034+00:00
NOTE: cache mounting group 2/0xEA114A90 (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=2 incarn=0xea114a90
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:38:03.103835+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:03.104063+00:00
NOTE: Instance updated compatible.asm to 19.0.0.0.0 for grp 2 (DATA).
2021-08-31T10:38:03.104307+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:03.104463+00:00
NOTE: Instance updated compatible.asm to 19.0.0.0.0 for grp 2 (DATA).
2021-08-31T10:38:03.105130+00:00
NOTE: Instance updated compatible.rdbms to 19.0.0.0.0 for grp 2 (DATA).
2021-08-31T10:38:03.105436+00:00
NOTE: Instance updated compatible.rdbms to 19.0.0.0.0 for grp 2 (DATA).
WARNING: DATA has too many failure groups for a stretch cluster.
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:38:03.148108+00:00
SUCCESS: diskgroup DATA was mounted
2021-08-31T10:38:03.157095+00:00
SUCCESS: alter diskgroup data mount restricted for recovery
2021-08-31T10:38:03.167013+00:00
NOTE: diskgroup resource ora.DATA.dg is online
2021-08-31T10:38:19.113318+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.114528+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.475013+00:00
SQL> ALTER DISKGROUP "DATA" ONLINE QUORUM DISK "RACQ$LUN3"
2021-08-31T10:38:19.478374+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.486520+00:00
NOTE: GroupBlock outside rolling migration privileged region
NOTE: initiating resync of disk group 2 disks
RACQ$LUN3 (2)
NOTE: process _user21083_+asm1 (21083) initiating offline of disk 2.4042374244 (RACQ$LUN3) with mask 0x7e in group 2 (DATA) without client assisting
2021-08-31T10:38:19.521378+00:00
NOTE: sending set offline flag message (2320259704) to 1 disk(s) in group 2
2021-08-31T10:38:19.521792+00:00
WARNING: Disk 2 (RACQ$LUN3) in group 2 mode 0x1 is now being offlined
2021-08-31T10:38:19.522023+00:00
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0xf0f1bc64, mask = 0x6a, op = clear mandatory
2021-08-31T10:38:19.522202+00:00
GMON updating disk modes for group 2 at 135 for pid 52, osid 21083
2021-08-31T10:38:19.522431+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.523238+00:00
NOTE: PST update grp = 2 completed successfully
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0xf0f1bc64, mask = 0x7e, op = clear mandatory
2021-08-31T10:38:19.523484+00:00
GMON updating disk modes for group 2 at 136 for pid 52, osid 21083
2021-08-31T10:38:19.523608+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.524311+00:00
NOTE: PST update grp = 2 completed successfully
NOTE: requesting all-instance membership refresh for group=2
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0x0, mask = 0x11, op = assign mandatory
2021-08-31T10:38:19.532576+00:00
GMON updating disk modes for group 2 at 137 for pid 52, osid 21083
2021-08-31T10:38:19.532724+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.540473+00:00
NOTE: PST update grp = 2 completed successfully
NOTE: requesting all-instance disk validation for group=2
2021-08-31T10:38:19.540830+00:00
NOTE: disk validation pending for 1 disk in group 2/0xea114a90 (DATA)
NOTE: Found /dev/flashgrid/racq.lun3 for disk RACQ$LUN3
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: completed disk validation for 2/0xea114a90 (DATA)
2021-08-31T10:38:19.677322+00:00
NOTE: running client discovery for group 2 (reqid:16866344061815119730)
NOTE: discarding redo for group 2 disk 2
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0x0, mask = 0x19, op = assign mandatory
2021-08-31T10:38:20.176650+00:00
GMON updating disk modes for group 2 at 138 for pid 52, osid 21083
NOTE: group DATA: updated PST location: disks 0005 0000 0002
2021-08-31T10:38:20.225248+00:00
NOTE: PST update grp = 2 completed successfully
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:38:20.225980+00:00
NOTE: membership refresh pending for group 2/0xea114a90 (DATA)
2021-08-31T10:38:20.227553+00:00
GMON querying group 2 at 139 for pid 31, osid 22344
2021-08-31T10:38:20.237960+00:00
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: cache opening disk 2 of grp 2: RACQ$LUN3 path:/dev/flashgrid/racq.lun3
SUCCESS: refreshed membership for 2/0xea114a90 (DATA)
2021-08-31T10:38:20.238448+00:00
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0x0, mask = 0x7f, op = assign mandatory
2021-08-31T10:38:20.239039+00:00
GMON updating disk modes for group 2 at 140 for pid 52, osid 21083
2021-08-31T10:38:20.284132+00:00
NOTE: PST update grp = 2 completed successfully
2021-08-31T10:38:20.284349+00:00
SUCCESS: ALTER DISKGROUP "DATA" ONLINE QUORUM DISK "RACQ$LUN3"
2021-08-31T10:38:21.596382+00:00
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
The alert log shows that the RACQ$LUN3 disk is first taken offline and then brought online.
It also shows that ASM corrected the issue itself.
The FOR RECOVERY mount option used to be documented somewhere on MOS, but I can no longer find where.
It looks like Oracle Support made the document non-public.
I can now remount the disk group cleanly:
SQL> alter diskgroup data dismount;
Diskgroup altered.
SQL> alter diskgroup data mount;
Diskgroup altered.
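After the remount, a quick health check can confirm that nothing was left offline. This is a minimal sketch rather than output captured from this system; it relies only on the standard V$ASM_DISKGROUP and V$ASM_DISK columns and assumes it is run in the ASM instance:

```sql
-- Disk group level: after a regular mount, STATE should read MOUNTED
-- (not RESTRICTED) and OFFLINE_DISKS should be 0.
select name, state, type, offline_disks
from v$asm_diskgroup
where name = 'DATA';

-- Disk level: list any disk of the group that is not fully online.
-- Ideally this returns no rows.
select name, path, mount_status, mode_status, state
from v$asm_disk
where group_number = (select group_number from v$asm_diskgroup where name = 'DATA')
  and mode_status <> 'ONLINE';
```

If the second query returns a row for RACQ$LUN3, the quorum disk did not come back online and the mount problem is likely to recur.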