Category Archives: Automatic Storage Management


12.2 New Feature: the FLEX ASM disk group part 3

In the previous two parts of this mini-series I introduced the Flex ASM disk group and two related concepts, the Quota Group and the File Group. In what should have become the final part (but isn’t) I am interested in checking whether quotas are enforced.

(Un)fortunately I have uncovered a few more things that are worth investigating and blogging about, which is why a) this isn’t the last post and b) it got a bit shorter than the previous two. Had I combined parts 3 and 4 it would have been too long for sure… BTW, you can navigate all posts using the links at the very bottom of the page.

Are quotas enforced?

The purpose of the Quota Group is – as the name suggests – to enforce quotas within a disk group, much like quotas on a file system. This is quite interesting, because you now have a hard limit up to which databases can grow within a disk group, even for non-CDBs.

The question I set out to answer in this part is whether quotas are enforced. As you read in the previous post’s interlude, I have created 2 databases on the FLEX ASM disk group: ORCL and CDB, both container databases.

The current state of affairs for my File Groups is this:

ASMCMD> lsfg
File Group         Disk Group  Quota Group  Used Quota MB  Client Name    Client Type  
DEFAULT_FILEGROUP  FLEX        GENERIC      2488                                       
CDB_CDB$ROOT       FLEX        QG_CDB       6744           CDB_CDB$ROOT   DATABASE     
CDB_PDB$SEED       FLEX        QG_CDB       1656           CDB_PDB$SEED   DATABASE     
PDB1               FLEX        QG_CDB       1784           PDB1           DATABASE     
ORCL_CDB$ROOT      FLEX        GENERIC      9104           ORCL_CDB$ROOT  DATABASE     
ORCL_PDB$SEED      FLEX        GENERIC      1616           ORCL_PDB$SEED  DATABASE     
PDB1_0001          FLEX        GENERIC      9424           PDB1           DATABASE   

Database CDB is nicely tucked away in QG_CDB, but all of the ORCL database’s components still reside in the catch-all GENERIC Quota Group. I wanted to have another Quota Group, QG_ORCL, for my second CDB. Somehow I think that a Quota Group per database makes sense.

ASMCMD> mkqg -G FLEX QG_ORCL quota 20G
Diskgroup altered.
ASMCMD> lsqg
Group_Num  Quotagroup_Num  Quotagroup_Name  Incarnation  Used_Quota_MB  Quota_Limit_MB  
2          1               GENERIC          1            22632          0               
2          2               QG_ORCL          7            0              20480           
2          3               QG_CDB           1            10184          20480           
ASMCMD> 

If you followed parts 1 and 2 you may have noticed that my FLEX disk group’s group_number changed from 5 to 2 after a server reboot.

SQL> select group_number, name, state, type from v$asm_diskgroup
  2  where group_number = 2;

GROUP_NUMBER NAME                           STATE       TYPE
------------ ------------------------------ ----------- ------
           2 FLEX                           CONNECTED   FLEX

Don’t let the change in numbers confuse you.

Back to the example: File Groups ORCL_CDB$ROOT, ORCL_PDB$SEED and PDB1_0001 are not yet within QG_ORCL. This can be rectified using three simple mvfg commands in ASMCMD, or the corresponding SQL commands. After the move commands completed, the Quota Group still had space left, although very limited.
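
The SQL variant is a one-liner per File Group and shows up in the alert.log whenever ASMCMD does the work (see part 2); for example:

SQL> alter diskgroup flex move filegroup PDB1_0001 to QG_ORCL;

Here are the three ASMCMD moves: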

ASMCMD> mvfg -G flex --filegroup PDB1_0001 QG_ORCL
Diskgroup altered.
ASMCMD> mvfg -G flex --filegroup ORCL_PDB$SEED  QG_ORCL
Diskgroup altered.
ASMCMD> mvfg -G flex --filegroup ORCL_CDB$ROOT  QG_ORCL
Diskgroup altered.
ASMCMD> lsfg
File Group         Disk Group  Quota Group  Used Quota MB  Client Name    Client Type  
DEFAULT_FILEGROUP  FLEX        GENERIC      2488                                       
CDB_CDB$ROOT       FLEX        QG_CDB       6744           CDB_CDB$ROOT   DATABASE     
CDB_PDB$SEED       FLEX        QG_CDB       1656           CDB_PDB$SEED   DATABASE     
PDB1               FLEX        QG_CDB       1784           PDB1           DATABASE     
ORCL_CDB$ROOT      FLEX        QG_ORCL      9104           ORCL_CDB$ROOT  DATABASE     
ORCL_PDB$SEED      FLEX        QG_ORCL      1616           ORCL_PDB$SEED  DATABASE     
PDB1_0001          FLEX        QG_ORCL      9424           PDB1           DATABASE     
ASMCMD> lsqg
Group_Num  Quotagroup_Num  Quotagroup_Name  Incarnation  Used_Quota_MB  Quota_Limit_MB  
2          1               GENERIC          1            2488           0               
2          2               QG_ORCL          7            20144          20480           
2          3               QG_CDB           1            10184          20480           
ASMCMD> 

Now I just need to try to push it over the edge to see if the quota has any effect. This is quite simple: all I need to do is create another tablespace in PDB1 (within ORCL):

SQL> show con_name

CON_NAME
------------------------------
PDB1
SQL> create tablespace userdata datafile size 500m;
create tablespace userdata datafile size 500m
*
ERROR at line 1:
ORA-01119: error in creating database file '+FLEX'
ORA-17502: ksfdcre:4 Failed to create file +FLEX
ORA-15437: Not enough quota available in quota group QG_ORCL.

SQL> 

This is similar to what we could already enforce using the storage clause in the create pluggable database command. But what about ASM file types that aren’t data files? Querying the database I can find quite a few of these:

SQL> get /tmp/non_data_files
  1  SELECT
  2      f.group_number as DG_NUMBER,
  3      f.file_number,
  4      f.incarnation as file_incarnation,
  5      f.type,
  6      fg.name as filegroup_name,
  7      a.name as file_name
  8  FROM
  9      v$asm_alias a,
 10      v$asm_file f,
 11      v$asm_filegroup fg
 12  WHERE
 13          a.group_number = f.group_number
 14      AND
 15          a.file_number = f.file_number
 16      AND
 17          a.file_incarnation = f.incarnation
 18      AND
 19          fg.group_number = f.group_number
 20      AND
 21          fg.filegroup_number = f.filegroup_number
 22      AND
 23*         f.type <> 'DATAFILE';
SQL> start /tmp/non_data_files

 DG_NUMBER FILE_NUMBER FILE_INCARNATION TYPE            FILEGROUP_NAME       FILE_NAME
---------- ----------- ---------------- --------------- -------------------- ------------------------------
         2         282        948453843 PASSWORD        DEFAULT_FILEGROUP    pwdorcl.282.948453843
         2         287        948462715 PASSWORD        DEFAULT_FILEGROUP    pwdorcl.287.948462715
         2         293        948462849 CONTROLFILE     ORCL_CDB$ROOT        Current.293.948462849
         2         292        948462849 CONTROLFILE     ORCL_CDB$ROOT        Current.292.948462849
         2         294        948462855 ONLINELOG       ORCL_CDB$ROOT        group_2.294.948462855
         2         295        948462855 ONLINELOG       ORCL_CDB$ROOT        group_1.295.948462855
         2         296        948462861 ONLINELOG       ORCL_CDB$ROOT        group_1.296.948462861
         2         297        948462861 ONLINELOG       ORCL_CDB$ROOT        group_2.297.948462861
         2         304        948463227 ONLINELOG       ORCL_CDB$ROOT        group_3.304.948463227
         2         305        948463231 ONLINELOG       ORCL_CDB$ROOT        group_3.305.948463231
         2         306        948463239 ONLINELOG       ORCL_CDB$ROOT        group_4.306.948463239
         2         307        948463243 ONLINELOG       ORCL_CDB$ROOT        group_4.307.948463243
         2         298        948462891 TEMPFILE        ORCL_CDB$ROOT        TEMP.298.948462891
         2         302        948462937 TEMPFILE        ORCL_PDB$SEED        TEMP.302.948462937
         2         308        948463249 PARAMETERFILE   ORCL_CDB$ROOT        spfile.308.948463249
         2         312        948464283 TEMPFILE        PDB1_0001            TEMP.312.948464283

16 rows selected.

Let’s take the online redo logs and add another log file group to thread 1:

SQL> alter database add logfile thread 1 size 1g;
alter database add logfile thread 1 size 1g
*
ERROR at line 1:
ORA-00301: error in adding log file '+FLEX' - file cannot be created
ORA-17502: ksfdcre:4 Failed to create file +FLEX
ORA-15437: Not enough quota available in quota group QG_ORCL.


SQL> 

This proves that quotas are enforced, at least for a couple of very simple examples.
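
Should you hit this limit for real, the way out is either to move the File Group to a roomier Quota Group or to enlarge the quota. Assuming the modify syntax mirrors the add quotagroup command shown in part 2, enlarging the quota would look like this:

SQL> alter diskgroup flex modify quotagroup QG_ORCL set quota = 40g;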

Summary Part 3

There is certainly a lot more to discover about Quota Groups: datafiles set to autoextend, archive log growth, temp and undo tablespaces, etc. At first glance it looks good. The key, again, is to have suitable monitoring in place that warns DBAs about File Groups running out of space. I wonder if that’s part of Enterprise Manager; otherwise it’s easy to write such checks yourself, either as custom metrics in OEM or in tools such as Nagios.
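
As a starting point for such a check, a query along the following lines – run against the ASM instance, with an arbitrary 90% threshold – would flag Quota Groups that are close to their limit:

SQL> select name, used_quota_mb, quota_limit_mb
  2  from v$asm_quotagroup
  3  where quota_limit_mb > 0
  4  and used_quota_mb > 0.9 * quota_limit_mb;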

12.2 New Feature: the FLEX ASM disk group part 2

In the first part of this series I explained the basics and some potential motivation behind the use of ASM Flex disk groups. In this part I would like to complete the description of new concepts.

New Concepts related to FLEX ASM Disk Groups

With the Flex disk group mounted, the next steps are to create a few new entities. First, I want to create a Quota Group. The Quota Group – as the name implies – will enforce quotas for entities residing within it. Adding one yourself is optional: Oracle creates a default Quota Group for you that does not enforce storage limits. As you will see later, the default Quota Group is assigned to all new databases in the Flex ASM disk group.

The entity to be stored inside the Quota Group is named a File Group, and serves to logically group files such as those belonging to a database. According to the SQL Language Reference version 12.2, the File Group can be created for a

  • Database (non-CDB, CDB, and PDB)
  • Cluster or
  • ASM Volume

Remember that the compatible.rdbms and compatible.asm parameters must be set to 12.2.0.1 (or later) on the Flex ASM disk group, ruling out pre-12.2 databases. For this post I intend to store database-related files in the File Group. And since I like CDBs, I’m using this database type.

Database Creation

With the FLEX disk group in place I fired up dbca in silent mode to create a database for me. Before submitting the command, however, I created a Quota Group while connected to the ASM instance, like this:

SQL> alter diskgroup flex add quotagroup QG_CDB set quota = 20g;

Diskgroup altered.

SQL> select QUOTAGROUP_NUMBER,NAME,USED_QUOTA_MB,QUOTA_LIMIT_MB from v$asm_quotagroup;

QUOTAGROUP_NUMBER NAME                           USED_QUOTA_MB QUOTA_LIMIT_MB
----------------- ------------------------------ ------------- --------------
                1 GENERIC                                    0              0
                3 QG_CDB                                     0          20480
SQL> 

Oracle provides a GENERIC Quota Group without storage limitations when the Flex ASM disk group is created. QG_CDB is the Quota Group I just created. In hindsight I don’t think that creating the Quota Group at this stage was necessary because it wasn’t used straight away… but I’m getting ahead of myself.

Here is the command to create the two-node RAC database on my FLEX diskgroup, which I ran next:

[oracle@rac122pri1 ~]$ dbca -silent -createDatabase -templateName martin_cdb12cr2_001.dbc \
> -gdbName CDB -sysPassword secretpwd1 -systemPassword secretpwd2 -storageType ASM \
> -diskGroupName FLEX -recoveryGroupName FLEX -sampleSchema true \
> -totalMemory 2048 -dbsnmpPassword secretpwd3 -nodeinfo rac122pri1,rac122pri2 \
> -createAsContainerDatabase true -databaseConfigType RAC

I should probably create a disk group +FLEXRECO for my Fast Recovery Area, but since this is a lab system I don’t quite care enough to justify the extra space usage. Your mileage will most definitely vary; this isn’t supposed to be a blueprint, it’s just me dabbling with new technology :)

It takes a little time for the dbca command to return. It appears that File Groups were automatically created for the CDB’s components, mapped to the GENERIC Quota Group (quotagroup_number 1 from the listing above). Querying the ASM instance, I can find these:

SQL> select FILEGROUP_NUMBER, NAME, CLIENT_NAME, USED_QUOTA_MB, QUOTAGROUP_NUMBER from v$asm_filegroup
  2  /

FILEGROUP_NUMBER NAME                 CLIENT_NAME          USED_QUOTA_MB QUOTAGROUP_NUMBER
---------------- -------------------- -------------------- ------------- -----------------
               0 DEFAULT_FILEGROUP                                     0                 1
               1 CDB_CDB$ROOT         CDB_CDB$ROOT                  6704                 1
               2 CDB_PDB$SEED         CDB_PDB$SEED                  1656                 1

Just like with Quota Groups, there is a default entity, named DEFAULT_FILEGROUP. The CDB_CDB$ROOT and CDB_PDB$SEED seem to map to the newly created database. As you can see a little later in this post, the first “CDB” in CDB_CDB$ROOT maps to the database name. CDB$ROOT and PDB$SEED should sound familiar if you have previously worked with Container Databases.

Oracle creating File Groups for databases on its own is not too bad, because it takes half the work away from me. To test whether this holds true for newly created PDBs, I added a PDB named PDB1 to my CDB. Sure enough, after the create pluggable database command returned, there was a new File Group:

SQL> select FILEGROUP_NUMBER, NAME, CLIENT_NAME from v$asm_filegroup;

FILEGROUP_NUMBER NAME                 CLIENT_NAME
---------------- -------------------- --------------------
               0 DEFAULT_FILEGROUP
               1 CDB_CDB$ROOT         CDB_CDB$ROOT
               2 CDB_PDB$SEED         CDB_PDB$SEED
               3 PDB1                 PDB1

The output made me think – somehow name and client_name no longer look that useful for establishing a relation between CDB and PDB1. This might not be necessary from a technical point of view, but somehow I need to understand which PDB1 is meant in the view in case there is more than one. I can think of a PDB1 in CDB as super important, whereas PDB1 in some other CDB is less important. One way would be to name the PDB differently, allowing a human to map the PDB name to a CDB. Having said that, I certainly don’t want references to the CDB in my PDB name; that defeats the purpose of pluggable databases and wouldn’t hold true anyway once I unplug/plug the PDB into a different CDB.

Interlude
If you wonder what happens when you create another PDB1 in a different CDB – so did I. After a quick change to my dbca command and the creation of another CDB named ORCL, I created a new PDB1 within it. This is the result:

SQL> select filegroup_number, name, client_name, guid from v$asm_filegroup;

FILEGROUP_NUMBER NAME                 CLIENT_NAME          GUID
---------------- -------------------- -------------------- --------------------------------
               0 DEFAULT_FILEGROUP
               1 CDB_CDB$ROOT         CDB_CDB$ROOT         4700A987085A3DFAE05387E5E50A8C7B
               2 CDB_PDB$SEED         CDB_PDB$SEED         536DF51E8E28221BE0534764A8C0FD81
               3 PDB1                 PDB1                 537B677EF8DA0F1AE0534764A8C05729
               4 ORCL_CDB$ROOT        ORCL_CDB$ROOT        4700A987085A3DFAE05387E5E50A8C7B
               5 ORCL_PDB$SEED        ORCL_PDB$SEED        537E63B952183748E0534764A8C09A7F
               6 PDB1_0001            PDB1                 537EB5B87E62586EE0534764A8C05530

7 rows selected.

The good news is there is no clash, or even a failure of the “create pluggable database” command. The new database’s CDB$ROOT and PDB$SEED namespaces don’t collide because of the database-name prefix. CDB.PDB1 and ORCL.PDB1 don’t clash because Oracle appends a number to the File Group’s name.

The not-so-good news is that the (File Group) name and client_name have become ambiguous. But there is a solution: playing around a bit I found that the GUID column in v$asm_filegroup maps to the GUID in v$pdbs/v$containers.

SQL> select sys_context('USERENV','CDB_NAME') cdb_name, guid 
  2  from v$pdbs where guid = '537EB5B87E62586EE0534764A8C05530';

CDB_NAME                       GUID
------------------------------ --------------------------------
ORCL                           537EB5B87E62586EE0534764A8C05530

Looks like it’s time to understand those GUIDs better, hopefully there will be more to come soon.

Quotas

As you can see from the previous example, it is not necessary to enforce a quota; however, it is possible. I’ll do this for the sake of completeness (and in preparation for part 3).

NOTE: This section of the blog post was actually written before I had created the second CDB (“ORCL”), which is why you don’t see it in the command output.

Many of the administrative commands related to File Groups and Quota Groups that I entered in SQL can also be issued via asmcmd, as shown here:

ASMCMD> lsqg
Group_Num  Quotagroup_Num  Quotagroup_Name  Incarnation  Used_Quota_MB  Quota_Limit_MB  
5          1               GENERIC          1            10016          0               
5          3               QG_CDB           1            0              20480           

ASMCMD> lsfg
File Group         Disk Group  Quota Group  Used Quota MB  Client Name   Client Type  
DEFAULT_FILEGROUP  FLEX        GENERIC      0                                         
CDB_CDB$ROOT       FLEX        GENERIC      6704           CDB_CDB$ROOT  DATABASE     
CDB_PDB$SEED       FLEX        GENERIC      1656           CDB_PDB$SEED  DATABASE     
PDB1               FLEX        GENERIC      1656           PDB1          DATABASE     

ASMCMD> help mvfg
mvfg
        Moves a file group in a disk group to the specified Quota Group.

Synopsis
        mvfg -G <diskgroup> --filegroup <filegroup> <quotagroup>

Description
        The options for the mvfg command are described below.

        -G diskgroup     - Disk group name.
        --filegroup      - File group name.

Examples
        The following is an example of the mvfg command. The file group
        FG1 in the DATA disk group is moved to the Quota Group QG1.

        ASMCMD [+] > mvfg -G DATA --filegroup FG1 QG1

See Also
       mkqg rmqg chqg lsqg

ASMCMD> 

The first two should be self-explanatory: lsqg is short for “list Quota Groups”, and lsfg is the equivalent for File Groups. The mvfg command takes a couple of parameters but should be straightforward. With the commands introduced it’s time to perform the action: I need to move the File Groups using mvfg:

ASMCMD> mvfg -G flex --filegroup CDB_CDB$ROOT QG_CDB
Diskgroup altered.
ASMCMD> mvfg -G flex --filegroup CDB_PDB$SEED QG_CDB
Diskgroup altered.
ASMCMD> mvfg -G flex --filegroup PDB1 QG_CDB
Diskgroup altered.
ASMCMD> lsfg
File Group         Disk Group  Quota Group  Used Quota MB  Client Name   Client Type  
DEFAULT_FILEGROUP  FLEX        GENERIC      0                                         
CDB_CDB$ROOT       FLEX        QG_CDB       6704           CDB_CDB$ROOT  DATABASE     
CDB_PDB$SEED       FLEX        QG_CDB       1656           CDB_PDB$SEED  DATABASE     
PDB1               FLEX        QG_CDB       1656           PDB1          DATABASE     

ASMCMD> lsqg
Group_Num  Quotagroup_Num  Quotagroup_Name  Incarnation  Used_Quota_MB  Quota_Limit_MB  
5          1               GENERIC          1            0              0               
5          3               QG_CDB           1            10016          20480           
ASMCMD> 

The command completes pretty much instantaneously, so it’s not actually “moving” data; it appears that all that happens is a metadata update. Moving a File Group is translated to the following SQL commands, found in the ASM instance’s alert.log:

2017-07-04 11:01:53.492000 +01:00
SQL> /* ASMCMD */ALTER DISKGROUP FLEX MOVE FILEGROUP CDB_CDB$ROOT TO QG_CDB
SUCCESS: /* ASMCMD */ALTER DISKGROUP FLEX MOVE FILEGROUP CDB_CDB$ROOT TO QG_CDB
2017-07-04 11:02:08.645000 +01:00
SQL> /* ASMCMD */ALTER DISKGROUP FLEX MOVE FILEGROUP CDB_PDB$SEED TO QG_CDB
SUCCESS: /* ASMCMD */ALTER DISKGROUP FLEX MOVE FILEGROUP CDB_PDB$SEED TO QG_CDB
SQL> /* ASMCMD */ALTER DISKGROUP FLEX MOVE FILEGROUP PDB1 TO QG_CDB
SUCCESS: /* ASMCMD */ALTER DISKGROUP FLEX MOVE FILEGROUP PDB1 TO QG_CDB

Summary Part 2

In part 2 of the article series I played around with Quota Groups and File Groups. Getting to grips with these concepts is necessary for a better understanding of how a Flex ASM disk group works. In part 3 I’ll try to investigate the effect of changing File Group properties, and whether quotas are enforced.

12.2 New Feature: the FLEX ASM disk group part 1

I knew about the 12.2 FLEX ASM disk group type from other presenters, but until now – when researching the feature for the upcoming DOAG HA Day – I hadn’t been able to appreciate how cool it is. And I really think it is pretty cool and worth sharing! There is a lot to be said about the feature and these tests, which is why I am splitting it into multiple parts.

Please be aware that this post is about my lab experiments; I have no production experience with FLEX ASM disk groups. As with all new features it might take a while to mature, so test, test, test…

Background

In previous versions of Oracle Grid Infrastructure/Automatic Storage Management (ASM), especially in consolidation environments, certain operations I would have liked to perform were not easily possible. Most properties of a disk group – such as redundancy levels – apply to all files that reside within it. For example, if you wanted normal redundancy for some databases and high redundancy for others, you typically ended up with two “data” disk groups. The same goes for +RECO. I can think of scenarios where “filesystem-like” attributes within a disk group would have been nice. In the build-up to ASM 12.2, Oracle steadily increased ASM limits to allow for more customisation, and in 12.1 you really could go a little over the top: you could have 63 ASM disk groups in 11.2 and up to 511 in 12.1. Although this should allow for plenty of customisation, it adds a little maintenance overhead.

A more granular option to manage storage came with 12.1 and the introduction of Container Databases (CDBs). As part of the Pluggable Database’s creation, the administrator could specify a pdb_storage_clause as in this very basic example:

SQL> create pluggable database pdb1 
  2   admin user someone identified by somepassword
  3    ...
  4   storage (maxsize 200G);

Pluggable database created.

However, if the database was created within a disk group with high redundancy, all files residing in that disk group inherited that property. I couldn’t define a PDB with normal redundancy in a high-redundancy disk group – at least I wasn’t aware of a way to do so in 12.1.

Flex Disk Group

A Flex Disk Group promises fine-granular management of data within the disk group. You can also enforce quotas in a disk group (probably most useful at database level), and you can define properties such as redundancy settings per file type – also per database or Pluggable Database. In other words, you can now have a disk group containing two databases, out of which database 1 uses normal redundancy and database 2 uses high redundancy. If database 2 is a Container Database (CDB), you can even manage settings as far down as the PDB level. A few new concepts need introducing before that can happen. Let’s begin with the most essential part: the new Flex ASM disk group. There is a variation of the theme for extended distance clusters, which is not in scope of this article series.
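
Before diving in, a quick taste of that fine-granular control: redundancy becomes a property of a File Group (File Groups are introduced properly in part 2) rather than of the whole disk group. A sketch, assuming a File Group named PDB1 already exists in the FLEX disk group:

SQL> alter diskgroup flex modify filegroup PDB1 set 'datafile.redundancy' = 'high';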

In my lab system, I have created a new ASM disk group of flex redundancy. The system I am using is a two-node RAC running 12.2.0.1.170620 on Oracle Linux 7.3 with UEK4. I called the disk group FLEX, here is the command used for its creation:

CREATE DISKGROUP FLEX FLEX REDUNDANCY  
 DISK 'AFD:FLEX1' 
 DISK 'AFD:FLEX2'   
 DISK 'AFD:FLEX3' 
 DISK 'AFD:FLEX4'  
 DISK 'AFD:FLEX5'  
ATTRIBUTE 
 'compatible.asm'='12.2.0.1',
 'compatible.rdbms'='12.2.0.1',
 'compatible.advm'='12.2.0.1',
 'au_size'='4M';

Note the use of the ASM Filter Driver, which I am testing as part of my lab setup; it is enabled by default when you install ASM 12.2. Looking at the code example I do realise now that the disk group name is maybe not ideal… The important bits in the example are the use of “FLEX REDUNDANCY”, the five implicit failure groups, and the compatibility settings that need to be 12.2.
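
For completeness: the AFD:FLEX* names above are ASM Filter Driver labels, which have to be stamped onto the block devices before the disk group can be created. A sketch – the device paths are made up, yours will differ:

[oracle@rac122pri1 ~]$ asmcmd afd_label FLEX1 /dev/sdf
[oracle@rac122pri1 ~]$ asmcmd afd_lslbl /dev/sdf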

The documentation (Automatic Storage Management Administrator’s Guide, chapter “Managing Oracle ASM Flex Disk Groups”) states that a Flex disk group generally tolerates the loss of two failure groups (FGs). The same bullet point then elaborates that at least five failure groups are needed to absorb the loss of two FGs. The minimum number of FGs within a Flex disk group is three.

If you now get all excited about this new feature, there is one Big Caveat: you need a 12.2 RDBMS instance to use the feature.

This command results in a flurry of activity in the ASM instance. I have captured the output from command initiation to completion because it’s quite interesting to see what happens in 12.2 ASM when you create a new disk group. Feel free to scroll down past the listing if you aren’t interested in the finer details.

SQL> CREATE DISKGROUP FLEX FLEX REDUNDANCY  DISK 'AFD:FLEX1' SIZE 10239M
 DISK 'AFD:FLEX2' SIZE 10239M
 DISK 'AFD:FLEX3' SIZE 10239M
 DISK 'AFD:FLEX4' SIZE 10239M
 DISK 'AFD:FLEX5' SIZE 10239M
 ATTRIBUTE 'compatible.asm'='12.2.0.1','compatible.rdbms'='12.2.0.1','compatible.advm'='12.2.0.1','au_size'='4M'
NOTE: Assigning number (5,0) to disk (AFD:FLEX1)
NOTE: Assigning number (5,1) to disk (AFD:FLEX2)
NOTE: Assigning number (5,2) to disk (AFD:FLEX3)
NOTE: Assigning number (5,3) to disk (AFD:FLEX4)
NOTE: Assigning number (5,4) to disk (AFD:FLEX5)
2017-07-03 10:38:53.811000 +01:00
NOTE: initializing header (replicated) on grp 5 disk FLEX1
NOTE: initializing header (replicated) on grp 5 disk FLEX2
NOTE: initializing header (replicated) on grp 5 disk FLEX3
NOTE: initializing header (replicated) on grp 5 disk FLEX4
NOTE: initializing header (replicated) on grp 5 disk FLEX5
NOTE: initializing header on grp 5 disk FLEX1
NOTE: initializing header on grp 5 disk FLEX2
NOTE: initializing header on grp 5 disk FLEX3
NOTE: initializing header on grp 5 disk FLEX4
NOTE: initializing header on grp 5 disk FLEX5
NOTE: Disk 0 in group 5 is assigned fgnum=1
NOTE: Disk 1 in group 5 is assigned fgnum=2
NOTE: Disk 2 in group 5 is assigned fgnum=3
NOTE: Disk 3 in group 5 is assigned fgnum=4
NOTE: Disk 4 in group 5 is assigned fgnum=5
GMON updating for reconfiguration, group 5 at 657 for pid 45, osid 25857
NOTE: group 5 PST updated.
NOTE: initiating PST update: grp = 5
GMON updating group 5 at 658 for pid 45, osid 25857
NOTE: set version 0 for asmCompat 12.2.0.1.0 for group 5
NOTE: group FLEX: initial PST location: disks 0000 0001 0002 0003 0004
NOTE: PST update grp = 5 completed successfully
NOTE: cache registered group FLEX 5/0x0A58F009
NOTE: cache began mount (first) of group FLEX 5/0x0A58F009
NOTE: cache is mounting group FLEX created on 2017/07/03 10:38:52
NOTE: cache opening disk 0 of grp 5: FLEX1 label:FLEX1
NOTE: cache opening disk 1 of grp 5: FLEX2 label:FLEX2
NOTE: cache opening disk 2 of grp 5: FLEX3 label:FLEX3
NOTE: cache opening disk 3 of grp 5: FLEX4 label:FLEX4
NOTE: cache opening disk 4 of grp 5: FLEX5 label:FLEX5
* allocate domain 5, valid ? 0
kjbdomatt send to inst 2
NOTE: attached to recovery domain 5
NOTE: cache creating group 5/0x0A58F009 (FLEX)
NOTE: cache mounting group 5/0x0A58F009 (FLEX) succeeded
NOTE: allocating F1X0 (replicated) on grp 5 disk FLEX1
NOTE: allocating F1X0 (replicated) on grp 5 disk FLEX2
NOTE: allocating F1X0 (replicated) on grp 5 disk FLEX3
NOTE: allocating F1X0 on grp 5 disk FLEX1
NOTE: allocating F1X0 on grp 5 disk FLEX2
NOTE: allocating F1X0 on grp 5 disk FLEX3
2017-07-03 10:38:56.621000 +01:00
NOTE: Created Used Space Directory for 1 threads
NOTE: Created Virtual Allocation Locator (1 extents) and Table (5 extents) directories for group 5/0x0A58F009 (FLEX)
2017-07-03 10:39:00.153000 +01:00
NOTE: VAM migration has completed for group 5/0x0A58F009 (FLEX)
NOTE: diskgroup must now be re-mounted prior to first use
NOTE: cache dismounting (clean) group 5/0x0A58F009 (FLEX)
NOTE: messaging CKPT to quiesce pins Unix process pid: 25857, image: oracle@rac122pri1 (TNS V1-V3)
2017-07-03 10:39:01.805000 +01:00
NOTE: LGWR not being messaged to dismount
kjbdomdet send to inst 2
detach from dom 5, sending detach message to inst 2
freeing rdom 5
NOTE: detached from domain 5
NOTE: cache dismounted group 5/0x0A58F009 (FLEX)
GMON dismounting group 5 at 659 for pid 45, osid 25857
GMON dismounting group 5 at 660 for pid 45, osid 25857
NOTE: Disk FLEX1 in mode 0x7f marked for de-assignment
NOTE: Disk FLEX2 in mode 0x7f marked for de-assignment
NOTE: Disk FLEX3 in mode 0x7f marked for de-assignment
NOTE: Disk FLEX4 in mode 0x7f marked for de-assignment
NOTE: Disk FLEX5 in mode 0x7f marked for de-assignment
SUCCESS: diskgroup FLEX was created
NOTE: cache deleting context for group FLEX 5/0x0a58f009
NOTE: cache registered group FLEX 5/0x4718F00C
NOTE: cache began mount (first) of group FLEX 5/0x4718F00C
NOTE: Assigning number (5,0) to disk (AFD:FLEX1)
NOTE: Assigning number (5,1) to disk (AFD:FLEX2)
NOTE: Assigning number (5,2) to disk (AFD:FLEX3)
NOTE: Assigning number (5,3) to disk (AFD:FLEX4)
NOTE: Assigning number (5,4) to disk (AFD:FLEX5)
2017-07-03 10:39:08.161000 +01:00
NOTE: GMON heartbeating for grp 5 (FLEX)
GMON querying group 5 at 663 for pid 45, osid 25857
NOTE: cache is mounting group FLEX created on 2017/07/03 10:38:52
NOTE: cache opening disk 0 of grp 5: FLEX1 label:FLEX1
NOTE: 07/03/17 10:39:07 FLEX.F1X0 found on disk 0 au 10 fcn 0.0 datfmt 1
NOTE: cache opening disk 1 of grp 5: FLEX2 label:FLEX2
NOTE: 07/03/17 10:39:07 FLEX.F1X0 found on disk 1 au 10 fcn 0.0 datfmt 1
NOTE: cache opening disk 2 of grp 5: FLEX3 label:FLEX3
NOTE: 07/03/17 10:39:07 FLEX.F1X0 found on disk 2 au 10 fcn 0.0 datfmt 1
NOTE: cache opening disk 3 of grp 5: FLEX4 label:FLEX4
NOTE: cache opening disk 4 of grp 5: FLEX5 label:FLEX5
NOTE: cache mounting (first) flex redundancy group 5/0x4718F00C (FLEX)
* allocate domain 5, valid ? 0
kjbdomatt send to inst 2
NOTE: attached to recovery domain 5
start recovery: pdb 5, passed in flags x4 (domain enable 0)
validate pdb 5, flags x4, valid 0, pdb flags x204
* validated domain 5, flags = 0x200
NOTE: cache recovered group 5 to fcn 0.0
NOTE: redo buffer size is 512 blocks (2105344 bytes)
NOTE: LGWR attempting to mount thread 1 for diskgroup 5 (FLEX)
NOTE: LGWR found thread 1 closed at ABA 0.11262 lock domain=0 inc#=0 instnum=0
NOTE: LGWR mounted thread 1 for diskgroup 5 (FLEX)
NOTE: setting 11.2 start ABA for group FLEX thread 1 to 2.0
NOTE: LGWR opened thread 1 (FLEX) at fcn 0.0 ABA 2.0 lock domain=5 inc#=12 instnum=1 gx.incarn=1192816652 mntstmp=2017/07/03 10:39:08.437000
NOTE: cache mounting group 5/0x4718F00C (FLEX) succeeded
NOTE: cache ending mount (success) of group FLEX number=5 incarn=0x4718f00c
NOTE: Instance updated compatible.asm to 12.2.0.1.0 for grp 5 (FLEX).
NOTE: Instance updated compatible.asm to 12.2.0.1.0 for grp 5 (FLEX).
NOTE: Instance updated compatible.rdbms to 12.2.0.1.0 for grp 5 (FLEX).
NOTE: Instance updated compatible.rdbms to 12.2.0.1.0 for grp 5 (FLEX).
SUCCESS: diskgroup FLEX was mounted
NOTE: diskgroup resource ora.FLEX.dg is online
SUCCESS: CREATE DISKGROUP FLEX FLEX REDUNDANCY  DISK 'AFD:FLEX1' SIZE 10239M
 DISK 'AFD:FLEX2' SIZE 10239M
 DISK 'AFD:FLEX3' SIZE 10239M
 DISK 'AFD:FLEX4' SIZE 10239M
 DISK 'AFD:FLEX5' SIZE 10239M
 ATTRIBUTE 'compatible.asm'='12.2.0.1','compatible.rdbms'='12.2.0.1','compatible.advm'='12.2.0.1','au_size'='4M'
2017-07-03 10:39:09.429000 +01:00
NOTE: enlarging ACD to 2 threads for group 5/0x4718f00c (FLEX)
2017-07-03 10:39:11.438000 +01:00
SUCCESS: ACD enlarged for group 5/0x4718f00c (FLEX)
NOTE: Physical metadata for diskgroup 5 (FLEX) was replicated.
adrci> 

Quite a bit of activity for just one command… I checked the attributes of the disk group; they don’t seem too alien to me:

SQL> select name, value from v$asm_attribute 
  2   where group_number = 5 
  3   and name not like 'template%';

NAME                           VALUE
------------------------------ ------------------------------
idp.type                       dynamic
idp.boundary                   auto
disk_repair_time               3.6h
phys_meta_replicated           true
failgroup_repair_time          24.0h
thin_provisioned               FALSE
preferred_read.enabled         FALSE
sector_size                    512
logical_sector_size            512
content.type                   data
content.check                  FALSE
au_size                        4194304
appliance._partnering_type     GENERIC
compatible.asm                 12.2.0.1.0
compatible.rdbms               12.2.0.1.0
compatible.advm                12.2.0.1.0
cell.smart_scan_capable        FALSE
cell.sparse_dg                 allnonsparse
access_control.enabled         FALSE
access_control.umask           066
scrub_async_limit              1
scrub_metadata.enabled         FALSE

22 rows selected.

I didn’t specify anything about failure groups in the create disk group command, hence I am getting 5 of them:

SQL> select name, os_mb, failgroup, path from v$asm_disk where group_number = 5;

NAME            OS_MB FAILGROUP                      PATH
---------- ---------- ------------------------------ --------------------
FLEX1           10239 FLEX1                          AFD:FLEX1
FLEX2           10239 FLEX2                          AFD:FLEX2
FLEX3           10239 FLEX3                          AFD:FLEX3
FLEX4           10239 FLEX4                          AFD:FLEX4
FLEX5           10239 FLEX5                          AFD:FLEX5

The result is the new disk group, a new part of my existing lab setup. As you can see in the following output I went with a separate disk group for the Grid Infrastructure Management Repository (GIMR), named +MGMT. In addition I have a disk group named +OCR which (surprise!) I use for OCR and voting files, plus the usual suspects, +DATA and +RECO. Except for +FLEX, all of these are disk group types that have been available forever.

[oracle@rac122pri1 ~]$ asmcmd lsdg
State    Type    Rebal  Sector  Logical_Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512             512   4096  4194304     20476    17068                0           17068              0             N  DATA/
MOUNTED  FLEX    N         512             512   4096  4194304     51180    50676                0               0              0             N  FLEX/
MOUNTED  EXTERN  N         512             512   4096  4194304     40956     6560                0            6560              0             N  MGMT/
MOUNTED  NORMAL  N         512             512   4096  4194304     15348    14480             5116            4682              0             Y  OCR/
MOUNTED  EXTERN  N         512             512   4096  4194304     15356    15224                0           15224              0             N  RECO/
[oracle@rac122pri1 ~]$ 

The values of 0 in required_mirror_free_mb and usable_file_mb for +FLEX aren’t bugs; they are documented to be empty for this type of disk group. You need to consult other entities – to be covered in the next posts – to check the space usage for your databases.
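
As a first taste of those entities, the per-database numbers live in v$asm_filegroup, covered in detail in part 2:

SQL> select name, client_name, used_quota_mb from v$asm_filegroup;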

Wrap Up Part 1

Flex ASM disk groups are hugely interesting and worth keeping an eye on when you are testing Oracle 12.2. I admit that 12.2 is still quite new, and cautious as I am, I won’t use it for production until the first major patch set comes out and I have tested it to death. I am also curious how the use of the ASM Filter Driver will change the way I work with ASM. It might be similar to ASMLib but I have yet to work that out.

In the next part(s) I will cover additional concepts you need to understand in the context of Flex ASM disk groups.

Little things worth knowing: when a transient ASM disk failure cannot be fixed in time

In the previous blog post I used libvirt and KVM in my lab environment to simulate a transient disk failure and how to recover from it. This post takes that example a step further: I am simulating another disk failure, but this time I won’t pretend I can fix the issue and put the disk back. In other words, I simulate the effect of disk_repair_time hitting zero.

Most of what I am covering here is an extension of the previous post. I’ll mention the main details for your benefit, but would like to invite you to refer to the previous post for the full story.

The idea is to show you the output of the ASM alert.log and the effect of the lost disk in the V$ views.

As with the previous post, the code examples in this one are for demonstration purposes only!

The setup in a nutshell

I am using Oracle Linux 7.2 with UEK4 as the host system; KVM and libvirt are responsible for the guests (virtual machines). The guest VM used for this example is named asmtest, and uses Oracle Linux 7.2 as well. There are 9 ASM disks – 8 for +DATA featuring normal redundancy in 2 failure groups. I added +RECO for the sake of completeness with external redundancy. This post is about +DATA. To keep it simple I used an Oracle Restart configuration patched to the July 2016 PSU.
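
One aside before the action starts: the default disk_repair_time is 3.6 hours, yet the drop countdown shown below starts at 600 seconds, so the attribute was evidently lowered for this demonstration – presumably along these lines (600 seconds being 10 minutes):

SQL> alter diskgroup DATA set attribute 'disk_repair_time' = '10m';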

Removing the disk

As with the previous example I am using libvirt to remove a “LUN” temporarily from the guest. And sure enough, the VM picks this up. This is the relevant output, obtained via journalctl -f:

Sep 29 15:34:36 asmtest kernel: sd 2:0:0:0: [sda] Synchronizing SCSI cache
Sep 29 15:34:36 asmtest kernel: sd 2:0:0:0: [sda]
Sep 29 15:34:36 asmtest kernel: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Sep 29 15:34:36 asmtest kernel: sd 2:0:0:0: [sda]
Sep 29 15:34:36 asmtest kernel: Sense Key : Illegal Request [current] 
Sep 29 15:34:36 asmtest kernel: sd 2:0:0:0: [sda]
Sep 29 15:34:36 asmtest kernel: Add. Sense: Logical unit not supported
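
The removal itself is a single libvirt operation on the host; a sketch, assuming the guest is called asmtest and the disk’s target name in the domain XML is sda (yours will differ):

[root@host ~]# virsh detach-disk asmtest sda --live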

In the previous post I keep referring to, I copied the part of the ASM instance’s alert.log that showed how the disk repair timer was ticking down. This time I am simulating the case where – for whatever reason – the transient failure could not be fixed. In that case, this is what you would see in the alert.log:

2016-09-29 15:38:21.752000 +01:00
WARNING: Started Drop Disk Timeout for Disk 1 (DATA_0001) in group 1 with a value 600
WARNING: Disk 1 (DATA_0001) in group 1 will be dropped in: (600) secs on ASM inst 1
2016-09-29 15:41:25.379000 +01:00
WARNING: Disk 1 (DATA_0001) in group 1 will be dropped in: (416) secs on ASM inst 1
2016-09-29 15:44:29.012000 +01:00
WARNING: Disk 1 (DATA_0001) in group 1 will be dropped in: (232) secs on ASM inst 1
2016-09-29 15:47:32.643000 +01:00
WARNING: Disk 1 (DATA_0001) in group 1 will be dropped in: (48) secs on ASM inst 1
2016-09-29 15:50:36.259000 +01:00
WARNING: Disk 1 (DATA_0001) in group 1 will be dropped in: (0) secs on ASM inst 1

The last line in the above output is definitely NOT what you want to see. If at all possible, you should fix the problem that causes DATA_0001 (in this example) to be dropped. If you see the message about the disk being dropped, you are facing the inevitable rebalance operation. Here is the continued output from the ASM alert.log:

WARNING: PST-initiated drop of 1 disk(s) in group 1(.4232658126))
SQL> alter diskgroup DATA drop disk DATA_0001 force /* ASM SERVER */
NOTE: cache closing disk 1 of grp 1: (not open) DATA_0001
NOTE: requesting all-instance membership refresh for group=1
GMON updating for reconfiguration, group 1 at 40 for pid 23, osid 3216
NOTE: cache closing disk 1 of grp 1: (not open) DATA_0001
NOTE: group DATA: updated PST location: disk 0003 (PST copy 0)
NOTE: group DATA: updated PST location: disk 0000 (PST copy 1)
NOTE: group 1 PST updated.
NOTE: membership refresh pending for group 1/0xfc493cce (DATA)
GMON querying group 1 at 41 for pid 16, osid 2701
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
SUCCESS: refreshed membership for 1/0xfc493cce (DATA)
SUCCESS: alter diskgroup DATA drop disk DATA_0001 force /* ASM SERVER */
SUCCESS: PST-initiated drop disk in group 1(4232658126))
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: starting rebalance of group 1/0xfc493cce (DATA) at power 1
Starting background process ARB0
ARB0 started with pid=20, OS id=3655 
NOTE: assigning ARB0 to group 1/0xfc493cce (DATA) with 1 parallel I/O
2016-09-29 15:50:40.257000 +01:00
NOTE: restored redundancy of control and online logs in DATA
NOTE: Rebalance has restored redundancy for any existing control file or redo log in disk group DATA
NOTE: restored redundancy of control and online logs in DATA
2016-09-29 15:50:51.655000 +01:00
NOTE: requesting all-instance membership refresh for group=1
GMON updating for reconfiguration, group 1 at 42 for pid 21, osid 3719
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
NOTE: group DATA: updated PST location: disk 0003 (PST copy 0)
NOTE: group DATA: updated PST location: disk 0000 (PST copy 1)
NOTE: group 1 PST updated.
2016-09-29 15:50:54.647000 +01:00
GMON updating for reconfiguration, group 1 at 43 for pid 21, osid 3719
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
NOTE: group DATA: updated PST location: disk 0003 (PST copy 0)
NOTE: group DATA: updated PST location: disk 0000 (PST copy 1)
NOTE: group 1 PST updated.
2016-09-29 15:50:57.571000 +01:00
NOTE: membership refresh pending for group 1/0xfc493cce (DATA)
GMON querying group 1 at 44 for pid 16, osid 2701
GMON querying group 1 at 45 for pid 16, osid 2701
NOTE: Disk _DROPPED_0001_DATA in mode 0x0 marked for de-assignment
SUCCESS: refreshed membership for 1/0xfc493cce (DATA)
NOTE: Attempting voting file refresh on diskgroup DATA
2016-09-29 15:51:03.589000 +01:00
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 1/0xfc493cce (DATA)

The last line indicates that the rebalance operation is complete. My disk group was almost empty, so the rebalance operation took almost no time. But indeed, when you check V$ASM_DISK, the disk is gone:

SQL> r
  1  select name,path,disk_number,mount_status,header_status,mode_status,state,failgroup,repair_timer
  2* from v$asm_disk where group_number = 1 order by disk_number

NAME       PATH            DISK_NUMBER MOUNT_S HEADER_STATU MODE_ST STATE    FAILGROUP  REPAIR_TIMER
---------- --------------- ----------- ------- ------------ ------- -------- ---------- ------------
DATA_0000  /dev/asm-disk02           0 CACHED  MEMBER       ONLINE  NORMAL   BATCH1                0
DATA_0002  /dev/asm-disk03           2 CACHED  MEMBER       ONLINE  NORMAL   BATCH1                0
DATA_0003  /dev/asm-disk06           3 CACHED  MEMBER       ONLINE  NORMAL   BATCH2                0
DATA_0004  /dev/asm-disk07           4 CACHED  MEMBER       ONLINE  NORMAL   BATCH2                0
DATA_0005  /dev/asm-disk08           5 CACHED  MEMBER       ONLINE  NORMAL   BATCH2                0
DATA_0006  /dev/asm-disk04           6 CACHED  MEMBER       ONLINE  NORMAL   BATCH1                0
DATA_0007  /dev/asm-disk05           7 CACHED  MEMBER       ONLINE  NORMAL   BATCH2                0

7 rows selected.

SQL> select count(*), failgroup from v$asm_disk where name like 'DATA%' group by failgroup;

  COUNT(*) FAILGROUP
---------- ----------
         3 BATCH1
         4 BATCH2

My system is fully operational, and the rebalance did not run into any space problems. Space problems are the last thing you want to have when rebalancing. I did some research about this subject earlier and documented it in a blog post.
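
A quick way of keeping an eye on that headroom is v$asm_diskgroup; usable_file_mb should stay comfortably above zero before and after a disk is dropped:

SQL> select name, total_mb, free_mb, required_mirror_free_mb, usable_file_mb
  2  from v$asm_diskgroup where name = 'DATA';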

In a future blog post I am going to investigate what happens when a partner disk of DATA_0001 fails while the repair timer is ticking down.

Fixing a problem with the ASM spfile preventing RAC 12c from starting

This is a little note to myself on how to fix a corrupt spfile in clustered ASM. I hope you find it useful, too.

Let’s assume you made a change to the ASM (server) parameter file that causes an issue. You are most likely to notice this once CRS is restarted but parts of the stack fail to come up. If “crsctl check crs” mentions any component not started you can try to find out where in the bootstrap process you are stuck. Here is the output from my system.

[root@rac12pri1 ~]# crsctl stat res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               STABLE
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE      rac12pri1                STARTING
ora.crf
      1        ONLINE  OFFLINE                               STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.ctssd
      1        ONLINE  ONLINE       rac12pri1                OBSERVER,STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE rac12pri1                STABLE
ora.gipcd
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.gpnpd
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.mdnsd
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.storage
      1        ONLINE  OFFLINE                               STABLE
--------------------------------------------------------------------------------
[root@rac12pri1 ~]#

I noticed that lots of components are not started. If you are interested in the startup order and dependencies between processes, you can find them documented in the Clusterware Administration and Deployment Guide – Chapter 1, Figure 1-2.

Another useful piece of information is the Clusterware alert.log. Unlike Oracle Clusterware 11.2, where log information was kept in $GRID_HOME, the 12c CRS logs have moved to the ADR. A quick look at the alert.log showed:


2015-07-28 09:16:51.247 [OCSSD(11611)]CRS-8500: Oracle Clusterware OCSSD process is starting with operating system process ID 11611
2015-07-28 09:16:52.347 [OCSSD(11611)]CRS-1713: CSSD daemon is started in hub mode
2015-07-28 09:16:57.974 [OCSSD(11611)]CRS-1707: Lease acquisition for node rac12pri1 number 1 completed
2015-07-28 09:16:59.076 [OCSSD(11611)]CRS-1605: CSSD voting file is online: /dev/vdc1; details in /u01/app/oracle/diag/crs/rac12pri1/crs/trace/ocssd.trc.
2015-07-28 09:16:59.089 [OCSSD(11611)]CRS-1672: The number of voting files currently available 1 has fallen to the minimum number of voting files required 1.
2015-07-28 09:17:08.198 [OCSSD(11611)]CRS-1601: CSSD Reconfiguration complete. Active nodes are rac12pri1 .
2015-07-28 09:17:10.276 [OCTSSD(11694)]CRS-8500: Oracle Clusterware OCTSSD process is starting with operating system process ID 11694
2015-07-28 09:17:11.261 [OCTSSD(11694)]CRS-2403: The Cluster Time Synchronization Service on host rac12pri1 is in observer mode.
2015-07-28 09:17:11.469 [OCTSSD(11694)]CRS-2407: The new Cluster Time Synchronization Service reference node is host rac12pri1.
2015-07-28 09:17:11.469 [OCTSSD(11694)]CRS-2401: The Cluster Time Synchronization Service started on host rac12pri1.
2015-07-28 09:17:43.016 [ORAROOTAGENT(11376)]CRS-5019: All OCR locations are on ASM disk groups [CHM], and none of these disk groups are
   mounted. Details are at "(:CLSN00140:)" in "/u01/app/oracle/diag/crs/rac12pri1/crs/trace/ohasd_orarootagent_root.trc".
2015-07-28 09:18:05.139 [OCSSD(11611)]CRS-1625: Node rac12pri2, number 2, was shut down
2015-07-28 09:18:05.139 [OCSSD(11611)]CRS-1625: Node rac12pri3, number 3, was shut down
2015-07-28 09:18:05.139 [OCSSD(11611)]CRS-1625: Node rac12pri4, number 4, was shut down

In other words, CSSD has found the block device I use for the voting files, and concludes its initial work. However, the oracle ROOT agent (orarootagent) cannot proceed since none of the OCR locations on ASM can be opened. Checking the log file at that particular time I can see where the problem is:

2015-07-28 09:17:42.989946*:kgfo.c@2846: kgfoCheckMount dg=CHM ok=0
2015-07-28 09:17:42.990045 : USRTHRD:3741497088: {0:9:3} -- trace dump on error exit --

2015-07-28 09:17:42.990057 : USRTHRD:3741497088: {0:9:3} Error [kgfoAl06] in [kgfokge] at kgfo.c:2850

2015-07-28 09:17:42.990067 : USRTHRD:3741497088: {0:9:3} ORA-15077: could not locate ASM instance serving a
  required diskgroup

2015-07-28 09:17:42.990077 : USRTHRD:3741497088: {0:9:3} Category: 7

2015-07-28 09:17:42.990115 : USRTHRD:3741497088: {0:9:3} DepInfo: 15077

2015-07-28 09:17:42.990382 : USRTHRD:3741497088: {0:9:3} -- trace dump end --

2015-07-28 09:17:42.990408 :CLSDYNAM:3741497088: [ora.storage]{0:9:3} [start] retcode = 7, kgfoCheckMount(CHM)
2015-07-28 09:17:42.990423 :CLSDYNAM:3741497088: [ora.storage]{0:9:3} [start] (null) category: 7, operation:
 kgfoAl06, loc: kgfokge, OS error: 15077,
 other: ORA-15077: could not locate ASM instance serving a required diskgroup

So there is not a single ASM instance that could serve the required disk group. Hmmm… So maybe I have to back out the change I just made. I have developed a habit of creating backups (pfiles) of spfiles prior to implementing changes. But even if there is no backup of the spfile I can still get the system back, and here are the steps I used. Just as with the database, I need to:

  1. Create a temporary pfile on the file system
  2. Start ASM using this temporary pfile
  3. Create a backup of my (bad) spfile from the ASM disk group
  4. Extract all parameters
  5. Create a proper pfile that I use to start the cluster with
  6. Convert that to a spfile in ASM

Fixing the problem

The first step is to create a temporary pfile. Using the ASM instance’s alert.log I can scroll up to a point in time before the change I made to check which parameters are needed. The following are just an example – your settings are different!

...
Using parameter settings in server-side spfile +CHM/rac12pri/ASMPARAMETERFILE/registry.253.885820125
System parameters with non-default values:
  large_pool_size          = 12M
  remote_login_passwordfile= "EXCLUSIVE"
  asm_diskstring           = "/dev/vd*1"
  asm_diskgroups           = "DATA"
  asm_diskgroups           = "RECO"
  asm_power_limit          = 1
NOTE: remote asm mode is remote (mode 0x202; from cluster type)
Cluster communication is configured to use the following interface(s) for this instance
  169.254.106.70
  169.254.184.41
cluster interconnect IPC version: Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
...

The new pfile, /tmp/init+ASM1.ora, has the following contents:

  large_pool_size          = 12M
  remote_login_passwordfile= "EXCLUSIVE"
  asm_diskstring           = "/dev/vd*1"
  asm_diskgroups           = "DATA"
  asm_diskgroups           = "RECO"
  asm_power_limit          = 1

I can now start the first ASM instance:

[oracle@rac12pri1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 12.1.0.2.0 Production on Tue Jul 28 09:23:23 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup pfile='/tmp/init+ASM1.ora'
ASM instance started

Total System Global Area 1140850688 bytes
Fixed Size                  2933400 bytes
Variable Size            1112751464 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted
SQL>

The alert.log also records the location of the spfile – you should back this up now (using asmcmd or any other tool). Using the backup, you should be able to reconstruct your spfile, but make sure to take the offending parameter out.
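
A sketch of such a backup, using asmcmd and the spfile location recorded in the alert.log (asmcmd spbackup should work equally well):

[oracle@rac12pri1 ~]$ asmcmd cp +CHM/rac12pri/ASMPARAMETERFILE/registry.253.885820125 /tmp/spfile_backup.ora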

I decided to create the spfile as spfileASM.ora in ASM. I amended my temporary pfile with the settings from the recovered spfile and put it back into the cluster.


SQL> create spfile='+CHM/rac12pri/spfileASM.ora' from pfile='/tmp/init+ASM1.ora';

File created.

Why the name change? You cannot create files in ASM that have OMF names. Trying to create the spfile with the original name will cause an error:


SQL> create spfile='+CHM/rac12pri/ASMPARAMETERFILE/registry.253.885820125' from pfile='/tmp/init+ASM1.ora';
create spfile='+CHM/rac12pri/ASMPARAMETERFILE/registry.253.885820125' from pfile='/tmp/init+ASM1.ora'
*
ERROR at line 1:
ORA-17502: ksfdcre:4 Failed to create file
+CHM/rac12pri/ASMPARAMETERFILE/registry.253.885820125
ORA-15177: cannot operate on system aliases

The really nice thing is that this is reflected in the Grid Plug And Play (GPNP) profile immediately. The ASM alert.log showed:

2015-07-28 09:25:01.323000 +01:00
NOTE: updated gpnp profile ASM SPFILE to
NOTE: header on disk 0 advanced to format #2 using fcn 0.0
2015-07-28 09:25:58.332000 +01:00
NOTE: updated gpnp profile ASM diskstring: /dev/vd*1
NOTE: updated gpnp profile ASM diskstring: /dev/vd*1
NOTE: updated gpnp profile ASM SPFILE to +CHM/rac12pri/spfileASM.ora

And the XML profile is updated too (reformatted for better readability):


[oracle@rac12pri1 ~]$ gpnptool get -o-

<?xml version="1.0" encoding="UTF-8"?>
<gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile"
  xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile"
  xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd"
  ProfileSequence="7" ClusterUId="886a0e42a...5d805357c76a"
  ClusterName="rac12pri" PALocation="">
  <gpnp:Network-Profile>
    <gpnp:HostNetwork id="gen" HostName="*">
     <gpnp:Network id="net1" IP="192.168.100.0" Adapter="eth0" Use="public"/>
     <gpnp:Network id="net2" IP="192.168.101.0" Adapter="eth1" Use="cluster_interconnect"/>
     <gpnp:Network id="net3" IP="192.168.102.0" Adapter="eth2" Use="asm,cluster_interconnect"/>
    </gpnp:HostNetwork>
  </gpnp:Network-Profile>
  <orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/>
  <orcl:ASM-Profile id="asm" DiscoveryString="/dev/vd*1" SPFile="+CHM/rac12pri/spfileASM.ora" Mode="remote"/>
  <ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">...</ds:Signature>
</gpnp:GPnP-Profile>

This should be it – the correct values have been restored, the spfile is back on shared storage, and I should be able to start with this combination.
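
Restarting the stack is the usual procedure, run as root on the node (a sketch):

[root@rac12pri1 ~]# crsctl stop crs
[root@rac12pri1 ~]# crsctl start crs

After having issued the stop/start commands to CRS it was indeed all well: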

[root@rac12pri1 ~]# crsctl check cluster
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online   

[root@rac12pri1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       rac12pri1                STABLE
ora.CHM.dg
               ONLINE  ONLINE       rac12pri1                STABLE
ora.DATA.dg
               ONLINE  ONLINE       rac12pri1                STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac12pri1                STABLE
ora.RECO.dg
               ONLINE  ONLINE       rac12pri1                STABLE
ora.net1.network
               ONLINE  ONLINE       rac12pri1                STABLE
ora.ons
               ONLINE  ONLINE       rac12pri1                STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       rac12pri1                169.254.1.137 192.16
                                                             8.101.10 192.168.102
                                                             .10,STABLE
ora.asm
      1        ONLINE  ONLINE       rac12pri1                STABLE
      2        ONLINE  OFFLINE                               STABLE
      3        ONLINE  OFFLINE                               STABLE
ora.cdb.db
      1        OFFLINE OFFLINE                               Instance Shutdown,ST
                                                             ABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
      4        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       rac12pri1                Open,STABLE
ora.ncdb.db
      1        ONLINE  ONLINE       rac12pri1                Open,STABLE
      2        ONLINE  OFFLINE                               STABLE
      3        ONLINE  OFFLINE                               STABLE
      4        ONLINE  OFFLINE                               STABLE
ora.ncdb.fotest.svc
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       rac12pri1                STABLE
ora.oc4j
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.rac12pri1.vip
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.rac12pri2.vip
      1        ONLINE  INTERMEDIATE rac12pri1                FAILED OVER,STABLE
ora.rac12pri3.vip
      1        ONLINE  INTERMEDIATE rac12pri1                FAILED OVER,STABLE
ora.rac12pri4.vip
      1        ONLINE  INTERMEDIATE rac12pri1                FAILED OVER,STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       rac12pri1                STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       rac12pri1                STABLE
--------------------------------------------------------------------------------

Time to start Clusterware on the other nodes and to report “We are back and running” :)

UKOUG post conference geek update part 1 – ACFS for Oracle databases

One of the many interesting things I heard at the conference this time around was that Oracle's future direction includes the use of database files on ACFS. When ACFS came out this was strictly ruled out, but it has been possible for a little while now, I believe since 12.1.0.1.0. With the Oracle Database Appliance (ODA) using this deployment option, and having heard about it at the conference, a little further investigation was in order. During one of the presentations, Markus Michalewicz (@OracleRACPM) referenced a script on his slides that I didn't know. The script is called gDBClone, and I wanted to see how it works. The idea is that the script can be used to create a snap-clone of a database, provided the source is on ACFS and in archivelog mode.

As it turned out there were a few hurdles along the way, and I will point them out so you don't run into the same issues.

UPDATE: these hurdles might exist because Oracle databases aren't supported on ACFS in Oracle Restart: https://docs.oracle.com/database/121/OSTMG/asmfilesystem.htm#OSTMG95961. Please consider yourself warned! For the rest of the article let's pretend that this is a clustered environment. The article is therefore purely educational and not an encouragement to do this in real life.

The script and associated white paper assume that you have two systems: production and dev/test/uat. The goal of the clone procedure is to be able to create a copy of your live database on the dev/test/uat cluster. This database can then be used as the source for ACFS snapshots. This can be represented in ASCII art:


+------+   clone   +------------------+
| PROD |   ---->   |      MASTER      |
+------+           |      /     \     |  snap #1
                   |     /       \    |  snap #2
                   | CLONE1    CLONE2 |
                   +------------------+

You are by no means required to follow this approach; if nothing else, you can use the clone script to run an RMAN duplication in a single command. I once wrote a script to do the same, and it was a truly complex thing to get right.

For this article I’ll assume that you clone PROD (named ORCL) to MASTER, and snap MASTER to CLONE1.

The setup

Since my “travel lab” is not a RAC cluster, I opted for a 12.1.0.2.1 installation of Oracle Restart and the database to keep it simple.

ACFS storage will be provided by an ADVM volume from disk group DATA. In my case this was quite easy to accomplish: since this was a 12.1 system anyway, I created my DATA disk group with ASM, RDBMS and ADVM compatibility set to 12.1.0.2.0.
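
For illustration, creating a disk group with those attributes could look like this minimal sketch (disk path and redundancy are assumptions from my lab, not necessarily what you would use):

SQL> create diskgroup data external redundancy disk '/dev/vdc1'
  2  attribute 'compatible.asm'   = '12.1.0.2.0',
  3            'compatible.rdbms' = '12.1.0.2.0',
  4            'compatible.advm'  = '12.1.0.2.0';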

You create the ASM Dynamic Volume Manager (ADVM) volume on top of an ASM disk group, in my case DATA. This is a simple task and can be performed with a variety of tools; I opted for a call to asmcmd:

ASMCMD> volcreate
usage: volcreate -G <diskgroup> -s <size> [ --column <number> ] [ --width <stripe_width> ]
[--redundancy {high|mirror|unprotected} ] [--primary {hot|cold}] [--secondary {hot|cold}] <volume>
help:  help volcreate
ASMCMD> volcreate -G data -s 10G volume1
ASMCMD> volinfo --all
Diskgroup Name: DATA

         Volume Name: VOLUME1
         Volume Device: /dev/asm/volume1-162
         State: ENABLED
         Size (MB): 10240
         Resize Unit (MB): 64
         Redundancy: UNPROT
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage:
         Mountpath:

Initially I got a message that ASM could not communicate with the (ASM) Volume driver.

SQL> /* ASMCMD */alter diskgroup data add volume 'volume1' size 10G
ORA-15032: not all alterations performed
ORA-15477: cannot communicate with the volume driver
ERROR: /* ASMCMD */alter diskgroup data add volume 'volume1' size 10G

This was solved by loading the Oracle kernel modules with modprobe.

[root@server5 ~]# modprobe  oracleacfs
[root@server5 ~]# modprobe  oracleadvm
[root@server5 ~]# modprobe  oracleoks
[root@server5 ~]# lsmod | grep oracle
oracleadvm            507006  7
oracleacfs           3307457  1
oracleoks             505749  2 oracleadvm,oracleacfs

There is a known issue with the execution of the udev rules (/etc/udev/rules.d/55-usm.rules) that might delay the setting of permissions on the volume device. On my system a udevadm trigger solved it. Still odd (Oracle Linux 6.6 / UEK 3 3.8.13-44.1.1.el6uek.x86_64), especially since a call to "acfsdriverstate supported" reported the drivers as supported.
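
For reference, the workaround boils down to re-running the udev rules and verifying the device permissions afterwards (a sketch):

[root@server5 ~]# udevadm trigger
[root@server5 ~]# ls -l /dev/asm/volume1-162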

Once the volume is created it needs to be formatted with ACFS. This can be done as follows:

[oracle@server5 ~]$  mkfs -t acfs /dev/asm/volume1-162
mkfs.acfs: version                   = 12.1.0.2.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/volume1-162
mkfs.acfs: volume size               = 10737418240  (  10.00 GB )
mkfs.acfs: Format complete.

Trying to register the file system in the ACFS registry pointed me to the first problem with the procedure on Oracle Restart:

[root@server5 ~]# acfsutil registry -a /dev/asm/volume1-162 /u01/oradata
Usage: srvctl <command> <object> [<options>]
    commands: enable|disable|start|stop|status|add|remove|modify|update|getenv|setenv|
       unsetenv|config|upgrade|downgrade
    objects: database|service|asm|diskgroup|listener|home|ons
For detailed help on each command and object and its options use:
  srvctl <command> -help [-compatible] or
  srvctl <command> <object> -help [-compatible]
PRKO-2012 : filesystem object is not supported in Oracle Restart
acfsutil registry: ACFS-03111: unable to add ACFS mount /u01/oradata within Oracle Registry

Interesting, but not a problem in the lab. I usually mount file systems where, in my opinion, they logically belong; in this case that meant mounting the file system on /u01/oradata. Spoiler alert: this is not what you are supposed to do if you want to use the gDBClone script.

To cut a long story short, the script assumes the mount point for a snapped or cloned database to be under /acfs. It also assumes that your system is a RAC environment, and I found it not to work well in 12.1 at all, due to the way it tries to get the database version from the OCR (OLR) profile. First the new ACFS file system is mounted, then made accessible to the oracle user:

[root@server5 ~]# mount -t acfs /dev/asm/volume1-162 /acfs
[root@server5 ~]# chown -R oracle:dba /acfs
[root@server5 ~]# mount | grep acfs
/dev/asm/volume1-162 on /acfs type acfs (rw)

Creating the MASTER database

My source system resides in ASM, and there is no way of creating copy-on-write clones in ASM. As a result the MASTER database must first be moved to ACFS, from where you can take storage snapshots.

A quick hack was required since I only had one machine, so I created the source database (PROD in the ASCII art example) as “orcl” using the following call to dbca, followed by a switch to archivelog mode:

dbca -silent -createDatabase -templateName General_Purpose.dbc  -gdbname orcl \
> -sid orcl -sysPassword pwd1 -systemPassword pwd2  -emConfiguration none  \
> -storageType ASM  -asmsnmpPassword pwd3 -diskGroupName data -recoveryGroupName reco  \
> -totalMemory 2048
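
The archivelog switch is not part of the dbca call above; it is the standard procedure (a sketch, run as SYSDBA):

SQL> shutdown immediate
SQL> startup mount
SQL> alter database archivelog;
SQL> alter database open;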

The next step is to create the MASTER database. The gDBClone script checks whether a database is on ACFS in its checkIfACFS() function. If a database is found to be on ASM (determined from the SPFILE location in the Clusterware profile), the script requires it to be CLONED as opposed to SNAPped.
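
If you are curious what the script will see, the SPFILE location is quickly verified with srvctl (a sketch; 12.1 syntax):

[oracle@server5 ~]$ srvctl config database -d orcl | grep -i spfile

With your environment set to an RDBMS home, a prerequisite for running the script, here is the command to clone ORCL to MASTER: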

# ./gDBClone clone -sdbname orcl -sdbhost server5 -sdbport 1521 -tdbname master -acfs /acfs -debug

It uses an RMAN duplicate under the covers. The arguments are almost self-explanatory: they describe the location of the source database (which can be remote) and where you want to store the clone. Since I desperately want to store the clone on ACFS I specified that in the command. The -debug flag prints more verbose output; a lot of information is also found in /var/log/gDBName/. Note that the script is to be run as root :(

To get there a few tricks were necessary in Oracle Restart environments; I'll feed them back to Oracle to see if they can be added to the script. You probably won't encounter these problems when using a clustered 11.2 installation.

At one point the script checks the VERSION flag in the database resource profile (crsctl stat res ora.<dbname>.db -p | grep '^VERSION'), and since that field no longer shows up in 12.1, the variable remains undefined in Perl and the execution fails. The problem with Oracle Restart is related to setting the database type to SINGLE on the local host (srvctl add database … -c SINGLE -x …): Oracle Restart doesn't understand those switches. Also, in 12c you can't set the db_unique_name to the name of a database already registered in the OCR. Moving the step that registers the database further down in the execution helped.

Please don't get me wrong: the script is very neat in that it allows you to run an RMAN duplicate command over the network, potentially creating the backup on the fly. In 12c, RMAN will pull existing backup pieces instead of creating a backup on the fly, reducing the impact on the production database. Oh, and I almost forgot: you can even convert the clone to a clustered database if it is not one already.
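
For the curious: stripped of all its checks, the core of the operation is an active database duplication. A hand-rolled equivalent might look roughly like this (a sketch with assumed connect strings; these are not the literal commands gDBClone emits):

RMAN> connect target sys@orcl
RMAN> connect auxiliary sys@master
RMAN> duplicate target database to master from active database;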

Creating an ACFS clone database

Next up is the creation of the CLONE database. My first attempts were unsuccessful. The main trick seems to be to keep the ACFS mount point directly underneath the / (root) file system; mounting it elsewhere caused the script to fail. If you can read Perl, check the $acfs variable and the checkIfACFS() routine to understand why. Also, the ADVM compatibility of the disk group containing the volume has to be at least 11.2.0.3.0, or you get this error:

[root@server5 ~]# ./gDBClone snap -sdbname master -tdbname clone1
2014-12-12 11:03:53: I Getting host info...
2014-12-12 11:03:53: I Starting.....
2014-12-12 11:03:53: I Validating environment.....

Enter the source MASTER SYSDBA password:
2014-12-12 11:03:56: I Getting OH version...
2014-12-12 11:04:04: I Checking SCAN listener server5.example.com:1521...
2014-12-12 11:04:05: I Checking database CLONE1 existence...
2014-12-12 11:04:05: I Checking registered instance CLONE1 ...
2014-12-12 11:04:10: I Checking if the Source Database MASTER it's on ASM
2014-12-12 11:04:10: I Source Database MASTER it's on ACFS
2014-12-12 11:04:10: I Checking snapshot CLONE1 existence
2014-12-12 11:04:10: I Setting up clone environment....
2014-12-12 11:04:10: I Starting auxiliary listener....
2014-12-12 11:05:10: I Creating ACFS snapshot.....
2014-12-12 11:05:10: I Start/stop MASTER to check consistency.
2014-12-12 11:05:42: I Checking if the source database MASTER is stored on an ACFS snapshot
acfsutil snap create: ACFS-03048: Snapshot operation could not complete.
acfsutil snap create: ACFS-03174: The Oracle ASM Dynamic Volume Manager (Oracle ADVM) compatibility
    attribute for the disk group is less than 11.2.0.3.0.
2014-12-12 11:05:42: E Error getting ACFS snapshot

This can be fixed quite easily, provided you don't break anything: remember that compatibility can be raised but never lowered.
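
The fix itself is a one-liner in the ASM instance (a sketch; double-check before you run it, since the change is irreversible):

SQL> alter diskgroup data set attribute 'compatible.advm' = '12.1.0.2.0';

With everything in place, the clone is quick: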

[root@server5 ~]# ./gDBClone snap -sdbname master -tdbname clone1
2014-12-12 11:20:17: I Getting host info...
2014-12-12 11:20:17: I Starting.....
2014-12-12 11:20:17: I Validating environment.....

Enter the source MASTER SYSDBA password:
2014-12-12 11:20:19: I Getting OH version...
2014-12-12 11:20:27: I Checking SCAN listener server5.example.com:1521...
2014-12-12 11:20:27: I Checking database CLONE1 existence...
2014-12-12 11:20:28: I Checking registered instance CLONE1 ...
2014-12-12 11:20:31: I Checking if the Source Database MASTER it's on ASM
2014-12-12 11:20:31: I Source Database MASTER it's on ACFS
2014-12-12 11:20:31: I Checking snapshot CLONE1 existence
2014-12-12 11:20:31: I Setting up clone environment....
2014-12-12 11:20:31: I Starting auxiliary listener....
2014-12-12 11:21:31: I Creating ACFS snapshot.....
2014-12-12 11:21:31: I Start/stop MASTER to check consistency.
2014-12-12 11:21:58: I Checking if the source database MASTER is stored on an ACFS snapshot
2014-12-12 11:21:58: I Setting up snapshot database.....
2014-12-12 11:21:58: I Creating Clone parameter files
2014-12-12 11:22:01: I Activating clone database.....
PRKO-2002 : Invalid command line option: -j
2014-12-12 11:23:04: I Successfully created clone "CLONE1" database

Now what does this database look like? I remember a conversation with Andy Colvin about this, since he saw it on our ODA first. Consider this:

[oracle@server5 ~]$ sq

SQL*Plus: Release 12.1.0.2.0 Production on Fri Dec 12 11:51:22 2014

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
/acfs/.ACFS/snaps/CLONE1/MASTER/MASTER/datafile/o1_mf_system_b8p3pxs7_.dbf
/acfs/.ACFS/snaps/CLONE1/MASTER/MASTER/datafile/o1_mf_sysaux_b8p3pz4f_.dbf
/acfs/.ACFS/snaps/CLONE1/MASTER/MASTER/datafile/o1_mf_undotbs1_b8p3q2q0_.dbf
/acfs/.ACFS/snaps/CLONE1/MASTER/MASTER/datafile/o1_mf_users_b8p3qo2o_.dbf

Now the interesting thing is when you look into that top-level ACFS directory:

SQL> !ls -la /acfs/
total 76
drwxr-xr-x.  5 oracle dba       4096 Dec 12 10:35 .
dr-xr-xr-x. 27 root   root      4096 Dec 12 10:30 ..
drwx------.  2 oracle dba      65536 Dec 12 10:31 lost+found
drwxr-xr-x.  3 oracle oinstall  4096 Dec 12 10:52 MASTER

No .ACFS entry, even though ls -la lists hidden files! Even more interestingly, you can actually see what's in the .ACFS directory when referring to it directly:

SQL> !ls -l /acfs/.ACFS
total 8
drwxrwx---. 6 root root 4096 Dec 12 10:31 repl
drwxr-xr-x. 3 root root 4096 Dec 12 11:21 snaps
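
Incidentally, you don't have to poke around the hidden directory at all: acfsutil can list the snapshots as well (a sketch, output omitted):

[root@server5 ~]# acfsutil snap info /acfs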

What I want to do next is to run a few performance benchmarks on a database in ACFS to see how it holds up. But that’s for another day…

Can you have high redundancy files in a normal redundancy diskgroup?

One of the perks of teaching classes is that I get to research the questions I am asked. In the last Exadata Administration class I taught, someone asked: can you have your disk groups in Exadata on normal redundancy, yet have certain databases use high redundancy? This would make a good interview question …

The answer is yes, which I remembered from researching material for the 11g RAC book, but I wanted to prove that it is indeed the case.

Update: I had planned a second blog post to test the effect, but Alex Fatkulin was quicker, and I promise I didn't see his post when I wrote mine; otherwise there probably wouldn't have been one :) In summary, you aren't really any better protected: the disk group remains at normal redundancy, even with the data files at high redundancy. Looking at Alex's results (and I encourage you to do so), I concur with his summary that although the third copy of each extent protects you from corruption, you don't get higher resilience.

This is not Exadata specific by the way. When I face a question around ASM and internals my first idea is to use an Internet search engine and look up the ASM work by Luca Canali. This page is almost always of relevance when looking at ASM file structures: https://twiki.cern.ch/twiki/bin/view/PDBService/ASM_Internals.

ASM Templates

I am not aware of many ASM users who know about ASM templates, although we all make use of them. Templates define the striping and mirroring of files and are explained in chapter 5 of the ASM Administrator's Guide. You are not limited to the Oracle-provided templates; you can create your own as well, which is key to this post. I headed over to the ASM instance on my cluster and created a template:

SQL> alter diskgroup data add template adminclassHigh attributes(high);

Diskgroup altered.
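
A quick look at v$asm_template confirms the new template and its redundancy characteristic (a sketch; template names are stored in upper case):

SQL> select name, redundancy, stripe
  2  from v$asm_template
  3  where name = 'ADMINCLASSHIGH';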

With the template in place I can create, for example, a high redundancy tablespace in my normal redundancy disk group:

SQL> select name, group_number, type from v$asm_diskgroup;

NAME                           GROUP_NUMBER TYPE
------------------------------ ------------ ------
RECO                                      3 NORMAL
DBFS_DG                                   2 NORMAL
DATA                                      1 NORMAL

Contrary to what the documentation suggests, you do not need to change db_create_file_dest for this to work.

SQL> show parameter db_create_file_dest

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_create_file_dest                  string      +DATA

Let’s put that template to use.

SQL> create tablespace testhigh datafile '+data(adminclassHigh)' size 10m;

Tablespace created.

SQL> select file_name from dba_data_Files;

FILE_NAME
-----------------------------------------------------------------------------------------------
+DATA/PRI/02810A3FB7F65856E053D908A8C003E1/DATAFILE/system.363.857664247
+DATA/PRI/02810A3FB7F65856E053D908A8C003E1/DATAFILE/sysaux.364.857664247
+DATA/PRI/02810A3FB7F65856E053D908A8C003E1/DATAFILE/users.365.857664253
+DATA/PRI/02810A3FB7F65856E053D908A8C003E1/DATAFILE/testhigh.357.857880753

Note that I specify the newly created template in parentheses in the datafile clause. The tablespace has been created; next I need the file number and the incarnation to look up how the extents are distributed. Both are encoded in the OMF file name shown above (testhigh.357.857880753).
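
If you prefer not to parse OMF file names, the ASM instance exposes the same numbers, for example via v$asm_alias (a sketch):

SQL> select file_number, file_incarnation
  2  from v$asm_alias
  3  where upper(name) like 'TESTHIGH%';

Armed with the file number and incarnation, the extent map can be pulled from x$kffxp: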

SQL> select DISK_KFFXP,AU_KFFXP,PXN_KFFXP,XNUM_KFFXP,LXN_KFFXP 
  2   from x$kffxp  where number_kffxp = 357 and incarn_kffxp = 857880753;

DISK_KFFXP   AU_KFFXP  PXN_KFFXP XNUM_KFFXP  LXN_KFFXP
---------- ---------- ---------- ---------- ----------
         4      73706          0          0          0
        23      73767          1          0          1
        24      28895          2          0          2
         1      73679          3          1          0
        27      30015          4          1          1
        22      73717          5          1          2
        21      73725          6          2          0
         7      73698          7          2          1
        33      73707          8          2          2

9 rows selected.

What looks a little cryptic can be translated using Luca’s notes as:

  • DISK is the ASM disk where the extent is located (as in V$ASM_DISK)
  • AU indicates the relative position of the allocation unit from the beginning of the disk
  • PXN is the progressive extent number ( = actual extent)
  • XNUM is the ASM file extent number (mirrored extent pairs have the same extent value)
  • LXN indicates the type of extent: 0 = primary extent, 1 = first mirror copy, 2 = second mirror copy

So you can see that each primary extent (LXN 0) has two mirror copies (LXN 1 and 2), also known as high redundancy.

If you omit the template in the datafile clause you get what you’d expect: normal redundancy.

SQL> drop tablespace testhigh including contents and datafiles;

Tablespace dropped.

SQL> create tablespace testhigh datafile '+data' size 10m;

Tablespace created.

SQL>  select file_name from dba_data_Files;

FILE_NAME
-----------------------------------------------------------------------------------------------
+DATA/PRI/02810A3FB7F65856E053D908A8C003E1/DATAFILE/system.363.857664247
+DATA/PRI/02810A3FB7F65856E053D908A8C003E1/DATAFILE/sysaux.364.857664247
+DATA/PRI/02810A3FB7F65856E053D908A8C003E1/DATAFILE/users.365.857664253
+DATA/PRI/02810A3FB7F65856E053D908A8C003E1/DATAFILE/testhigh.357.857880927

And on ASM:

SQL> select DISK_KFFXP,AU_KFFXP,PXN_KFFXP,XNUM_KFFXP,LXN_KFFXP
  2  from x$kffxp  where number_kffxp = 357 and incarn_kffxp = 857880927;

DISK_KFFXP   AU_KFFXP  PXN_KFFXP XNUM_KFFXP  LXN_KFFXP
---------- ---------- ---------- ---------- ----------
        30      73724          0          0          0
        16      73749          1          0          1
         0      73690          2          1          0
        28      73682          3          1          1
        17      73722          4          2          0
        10      73697          5          2          1

6 rows selected.

As you can see, each extent now has one mirror copy, not two: normal redundancy, as expected.
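
For a final cross-check that doesn't require x$kffxp at all, v$asm_file reports the redundancy of each file (a sketch):

SQL> select file_number, type, redundancy
  2  from v$asm_file
  3  where group_number = 1 and file_number = 357;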