Martins Blog

Trying to explain complex things in simple terms

Implications of threaded_execution = true in 12c

Posted by Martin Bach on July 21, 2014

I had an interesting discussion as part of my latest presentation at the UKOUG RAC CIA & Database Combined SIG. Part of my talk was about the implications of the new threaded execution model in Oracle.

Since “we do not use Windows” (except for gaming) I can’t compare the Windows thread model to the new 12c implementation on UNIX. There are however interesting implications when switching to the new model, some of which I’d like to demonstrate here. First of all, threaded execution is not enabled by default. With 12.1.0.1.3 on top of Oracle Restart you get either all or a subset of the following background and auxiliary processes for a CDB:

[oracle@london ~]$ ps -eLf | egrep -i "UID|$ORACLE_SID"
UID        PID  PPID   LWP  C NLWP STIME TTY          TIME CMD
oracle    6041     1  6041  0    1 08:56 ?        00:00:00 ora_pmon_CDB1
oracle    6043     1  6043  0    1 08:56 ?        00:00:00 ora_psp0_CDB1
oracle    6045     1  6045  2    1 08:56 ?        00:00:08 ora_vktm_CDB1
oracle    6049     1  6049  0    1 08:56 ?        00:00:00 ora_gen0_CDB1
oracle    6051     1  6051  0    1 08:56 ?        00:00:00 ora_mman_CDB1
oracle    6055     1  6055  0    1 08:56 ?        00:00:00 ora_diag_CDB1
oracle    6057     1  6057  0    1 08:56 ?        00:00:00 ora_dbrm_CDB1
oracle    6059     1  6059  0    1 08:56 ?        00:00:00 ora_dia0_CDB1
oracle    6061     1  6061  0    1 08:56 ?        00:00:00 ora_dbw0_CDB1
oracle    6063     1  6063  0    1 08:56 ?        00:00:00 ora_lgwr_CDB1
oracle    6065     1  6065  0    1 08:56 ?        00:00:00 ora_ckpt_CDB1
oracle    6067     1  6067  0    1 08:56 ?        00:00:00 ora_lg00_CDB1
oracle    6069     1  6069  0    1 08:56 ?        00:00:00 ora_lg01_CDB1
oracle    6071     1  6071  0    1 08:56 ?        00:00:00 ora_smon_CDB1
oracle    6073     1  6073  0    1 08:56 ?        00:00:00 ora_reco_CDB1
oracle    6075     1  6075  0    1 08:56 ?        00:00:00 ora_lreg_CDB1
oracle    6077     1  6077  0    1 08:56 ?        00:00:00 ora_rbal_CDB1
oracle    6079     1  6079  0    1 08:56 ?        00:00:00 ora_asmb_CDB1
oracle    6081     1  6081  0    1 08:56 ?        00:00:00 ora_mmon_CDB1
oracle    6083     1  6083  0    1 08:56 ?        00:00:00 ora_mmnl_CDB1
grid      6086     1  6086  0    1 08:56 ?        00:00:00 oracle+ASM_asmb_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    6087     1  6087  0    1 08:56 ?        00:00:00 ora_d000_CDB1
oracle    6089     1  6089  0    1 08:56 ?        00:00:00 ora_s000_CDB1
oracle    6092     1  6092  0    1 08:56 ?        00:00:00 ora_mark_CDB1
oracle    6094     1  6094  0    1 08:56 ?        00:00:00 ora_ocf0_CDB1
oracle    6109     1  6109  0    1 08:56 ?        00:00:00 ora_o000_CDB1
grid      6111     1  6111  0    1 08:56 ?        00:00:00 oracle+ASM_ocf0_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid      6113     1  6113  0    1 08:56 ?        00:00:00 oracle+ASM_o000_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    6118     1  6118  0    1 08:56 ?        00:00:00 ora_nss2_CDB1
oracle    6148     1  6148  0    1 08:56 ?        00:00:01 ora_p000_CDB1
oracle    6150     1  6150  0    1 08:56 ?        00:00:02 ora_p001_CDB1
oracle    6154     1  6154  0    1 08:56 ?        00:00:00 ora_o001_CDB1
grid      6156     1  6156  0    1 08:56 ?        00:00:00 oracle+ASM_o001_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    6158     1  6158  0    1 08:56 ?        00:00:00 ora_tmon_CDB1
oracle    6160     1  6160  0    1 08:56 ?        00:00:00 ora_arc0_CDB1
oracle    6162     1  6162  0    1 08:56 ?        00:00:00 ora_arc1_CDB1
oracle    6164     1  6164  0    1 08:56 ?        00:00:00 ora_arc2_CDB1
oracle    6166     1  6166  0    1 08:56 ?        00:00:00 ora_arc3_CDB1
oracle    6169     1  6169  0    1 08:57 ?        00:00:00 ora_tt00_CDB1
oracle    6171     1  6171  0    1 08:57 ?        00:00:00 ora_smco_CDB1
oracle    6176     1  6176  0    1 08:57 ?        00:00:00 ora_aqpc_CDB1
oracle    6180     1  6180  0    1 08:57 ?        00:00:00 ora_p002_CDB1
oracle    6182     1  6182  0    1 08:57 ?        00:00:00 ora_p003_CDB1
oracle    6184     1  6184  0    1 08:57 ?        00:00:00 ora_p004_CDB1
oracle    6186     1  6186  0    1 08:57 ?        00:00:00 ora_p005_CDB1
oracle    6188     1  6188  0    1 08:57 ?        00:00:00 ora_p006_CDB1
oracle    6190     1  6190  0    1 08:57 ?        00:00:00 ora_p007_CDB1
oracle    6192     1  6192  0    1 08:57 ?        00:00:00 ora_qm02_CDB1
oracle    6194     1  6194  0    1 08:57 ?        00:00:00 ora_qm00_CDB1
oracle    6196     1  6196  0    1 08:57 ?        00:00:00 ora_q002_CDB1
oracle    6200     1  6200  0    1 08:57 ?        00:00:00 ora_q004_CDB1
oracle    6202     1  6202  0    1 08:57 ?        00:00:00 ora_q005_CDB1
oracle    6204     1  6204  0    1 08:57 ?        00:00:00 ora_q006_CDB1
oracle    6206     1  6206  0    1 08:57 ?        00:00:00 ora_q007_CDB1
oracle    6208     1  6208  0    1 08:57 ?        00:00:00 ora_q008_CDB1
oracle    6210     1  6210  0    1 08:57 ?        00:00:00 ora_q009_CDB1
oracle    6212     1  6212  0    1 08:57 ?        00:00:00 ora_q00a_CDB1
oracle    6292     1  6292  0    1 08:57 ?        00:00:00 ora_w000_CDB1
oracle    6318     1  6318  0    1 08:57 ?        00:00:00 ora_cjq0_CDB1
oracle    6320     1  6320  0    1 08:57 ?        00:00:00 oracleCDB1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    6322     1  6322  0    1 08:57 ?        00:00:01 oracleCDB1 (LOCAL=NO)
oracle    6324     1  6324  0    1 08:57 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    6326     1  6326  0    1 08:57 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    6407     1  6407  0    1 08:57 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    6438     1  6438  0    1 08:57 ?        00:00:00 ora_p008_CDB1
oracle    6440     1  6440  0    1 08:57 ?        00:00:00 ora_p009_CDB1
oracle    6442     1  6442  0    1 08:57 ?        00:00:01 oracleCDB1 (LOCAL=NO)
oracle    6490     1  6490  0    1 08:58 ?        00:00:00 ora_p00a_CDB1
oracle    6492     1  6492  0    1 08:58 ?        00:00:00 ora_p00b_CDB1
oracle    6640  5653  6640  0    1 09:01 pts/0    00:00:00 egrep -i UID|CDB1

Quite a few more than in 9i, it seems. Most of the background processes are documented in the Database Reference. As you can see, there are only processes (NLWP = 1), no threads. If you wonder about the unusual output format, bear with me; there is a reason I chose those flags for the ps command.
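The flags matter: -L makes ps print one line per light-weight process, LWP is the thread ID and NLWP the number of threads in the process. A quick way to summarize such a listing per PID is to count the LWP lines, sketched here over a few sample lines borrowed from the threaded listing further down (on a live system you would pipe the real ps -eLf output through the awk instead):

```shell
# Count threads per PID from "ps -eLf"-style output.
# Column 2 is the PID, column 4 the LWP (thread) ID, column 6 the NLWP count.
ps_sample='oracle    7146     1  7146  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7147  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7152     1  7152  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7158     1  7158  0    1 09:04 ?        00:00:00 ora_dbw0_CDB1'

# One output line per PID: the PID and how many LWP lines it contributed.
echo "$ps_sample" | awk '{ threads[$2]++ } END { for (p in threads) print p, threads[p] }' | sort
```

On the non-threaded listing above every PID would come back with a count of 1.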

The change with the threaded model

Let’s enable threaded execution. It’s simple:

SYS@CDB$ROOT> alter system set threaded_execution=true scope=spfile;

System altered.

SYS@CDB$ROOT> shutdown immediate

Database closed.
Database dismounted.
ORACLE instance shut down.
SYS@CDB$ROOT> startup
ORA-01017: invalid username/password; logon denied

There’s the first surprise. Not to worry, it’s just that OS-level authentication doesn’t work anymore; you can continue by providing your username and password:

SYS@CDB$ROOT> conn sys as sysdba
Enter password:
Connected.
SYS@CDB$ROOT> alter database mount;

Database altered.

SYS@CDB$ROOT> alter database open;

Database altered.

SYS@CDB$ROOT>

What does that mean for our processes? First let’s use the old way of checking for Oracle processes using just ps -ef:

[oracle@london ~]$ ps -ef | egrep -i "UID|$ORACLE_SID"
UID        PID  PPID  C STIME TTY          TIME CMD
oracle    7138     1  0 09:04 ?        00:00:00 ora_pmon_CDB1
oracle    7140     1  0 09:04 ?        00:00:00 ora_psp0_CDB1
oracle    7142     1  2 09:04 ?        00:00:05 ora_vktm_CDB1
oracle    7146     1  0 09:04 ?        00:00:01 ora_u004_CDB1
oracle    7152     1  4 09:04 ?        00:00:08 ora_u005_CDB1
oracle    7158     1  0 09:04 ?        00:00:00 ora_dbw0_CDB1
grid      7173     1  0 09:04 ?        00:00:00 oracle+ASM_asmb_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid      7202     1  0 09:04 ?        00:00:00 oracle+ASM_ocf0_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid      7226     1  0 09:04 ?        00:00:00 oracle+ASM_o000_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid      7229     1  0 09:04 ?        00:00:00 oracle+ASM_o001_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid      7401     1  0 09:06 ?        00:00:00 oracle+ASM_o002_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    7407     1  0 09:06 ?        00:00:00 oracleCDB1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    7494     1  0 09:06 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    7526     1  0 09:07 ?        00:00:00 oracleCDB1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    7573     1  1 09:07 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    7575     1  3 09:07 ?        00:00:02 oracleCDB1 (LOCAL=NO)
oracle    7578     1  0 09:07 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    7592     1  0 09:07 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    7594     1  0 09:07 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    7596     1  0 09:07 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    7691  5653  0 09:08 pts/0    00:00:00 egrep -i UID|CDB1
[oracle@london ~]$

The list is considerably shorter, and you will notice some strange process names: u004 and u005. If you squint you can see that PMON, PSP0, VKTM and DBW0 are still processes. Let me check for threads (or lightweight processes, LWPs, as they are also called) in the next step:

[oracle@london ~]$ ps -eLf | egrep -i "UID|$ORACLE_SID"
UID        PID  PPID   LWP  C NLWP STIME TTY          TIME CMD
oracle    7138     1  7138  0    1 09:04 ?        00:00:00 ora_pmon_CDB1
oracle    7140     1  7140  0    1 09:04 ?        00:00:00 ora_psp0_CDB1
oracle    7142     1  7142  2    1 09:04 ?        00:00:11 ora_vktm_CDB1
oracle    7146     1  7146  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7147  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7148  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7149  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7155  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7159  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7160  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7161  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7162  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7163  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7165  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7166  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7167  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7152     1  7152  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7153  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7154  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7156  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7164  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7168  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7169  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7170  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7172  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7174  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7175  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7224  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7227  0   46 09:04 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7249  0   46 09:05 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7361  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7362  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7365  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7366  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7367  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7396  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7397  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7404  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7405  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7416  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7418  0   46 09:06 ?        00:00:01 ora_u005_CDB1
oracle    7152     1  7419  0   46 09:06 ?        00:00:02 ora_u005_CDB1
oracle    7152     1  7420  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7421  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7422  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7423  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7424  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7425  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7426  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7427  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7428  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7430  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7431  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7432  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7433  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7434  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7435  0   46 09:06 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7576  0   46 09:07 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7618  0   46 09:07 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7619  0   46 09:07 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7696  0   46 09:08 ?        00:00:00 ora_u005_CDB1
oracle    7152     1  7697  0   46 09:08 ?        00:00:00 ora_u005_CDB1
oracle    7158     1  7158  0    1 09:04 ?        00:00:00 ora_dbw0_CDB1
grid      7173     1  7173  0    1 09:04 ?        00:00:00 oracle+ASM_asmb_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid      7226     1  7226  0    1 09:04 ?        00:00:00 oracle+ASM_o000_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid      7229     1  7229  0    1 09:04 ?        00:00:00 oracle+ASM_o001_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid      7401     1  7401  0    1 09:06 ?        00:00:00 oracle+ASM_o002_cdb1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    7494     1  7494  0    1 09:06 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    7526     1  7526  0    1 09:07 ?        00:00:00 oracleCDB1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    7573     1  7573  0    1 09:07 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    7575     1  7575  1    1 09:07 ?        00:00:02 oracleCDB1 (LOCAL=NO)
oracle    7578     1  7578  0    1 09:07 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    7592     1  7592  0    1 09:07 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    7594     1  7594  0    1 09:07 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    7596     1  7596  0    1 09:07 ?        00:00:00 oracleCDB1 (LOCAL=NO)
oracle    7810     1  7810  0    1 09:10 ?        00:00:00 oracleCDB1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle    7840  5653  7840  0    1 09:11 pts/0    00:00:00 egrep -i UID|CDB1

This is where it gets interesting. First, let’s match this to the database, focusing on the background processes.

SYS@CDB$ROOT> select pname, pid, sosid, spid, stid, execution_type
  2  from v$process where background = 1
  3  order by pname
  4  /

PNAME        PID SOSID                    SPID                     STID                     EXECUTION_
----- ---------- ------------------------ ------------------------ ------------------------ ----------
AQPC          43 7152_7416                7152                     7416                     THREAD
ARC0          35 7152_7362                7152                     7362                     THREAD
ARC1          36 7152_7365                7152                     7365                     THREAD
ARC2          37 7152_7366                7152                     7366                     THREAD
ARC3          38 7152_7367                7152                     7367                     THREAD
ASMB          22 7146_7167                7146                     7167                     THREAD
CJQ0          68 7152_7576                7152                     7576                     THREAD
CKPT          15 7146_7160                7146                     7160                     THREAD
DBRM          11 7146_7155                7146                     7155                     THREAD
DBW0          13 7158                     7158                     7158                     PROCESS
DIA0          12 7152_7156                7152                     7156                     THREAD
DIAG           9 7152_7154                7152                     7154                     THREAD
GEN0           5 7146_7148                7146                     7148                     THREAD
LG00          16 7146_7161                7146                     7161                     THREAD
LG01          18 7146_7163                7146                     7163                     THREAD
LGWR          14 7146_7159                7146                     7159                     THREAD
LREG          20 7146_7165                7146                     7165                     THREAD
MARK          28 7152_7175                7152                     7175                     THREAD
MMAN           7 7146_7149                7146                     7149                     THREAD
MMNL          24 7152_7169                7152                     7169                     THREAD
MMON          23 7152_7168                7152                     7168                     THREAD
NSS2          33 7152_7249                7152                     7249                     THREAD
O002          40 7152_7397                7152                     7397                     THREAD
PMON           2 7138                     7138                     7138                     PROCESS
PSP0           3 7140                     7140                     7140                     PROCESS
Q002          54 7152_7428                7152                     7428                     THREAD
Q004          56 7152_7430                7152                     7430                     THREAD
Q005          57 7152_7431                7152                     7431                     THREAD
Q006          58 7152_7432                7152                     7432                     THREAD
Q007          59 7152_7433                7152                     7433                     THREAD
Q008          60 7152_7434                7152                     7434                     THREAD
Q009          61 7152_7435                7152                     7435                     THREAD
QM00          53 7152_7427                7152                     7427                     THREAD
QM02          44 7152_7426                7152                     7426                     THREAD
RBAL          21 7146_7166                7146                     7166                     THREAD
RECO          19 7152_7164                7152                     7164                     THREAD
SCMN           6 7146_7146                7146                     7146                     THREAD
SCMN          10 7152_7152                7152                     7152                     THREAD
SMCO          41 7152_7404                7152                     7404                     THREAD
SMON          17 7146_7162                7146                     7162                     THREAD
TMON          34 7152_7361                7152                     7361                     THREAD
TT00          39 7152_7396                7152                     7396                     THREAD
VKTM           4 7142                     7142                     7142                     PROCESS
W000          42 7152_7405                7152                     7405                     THREAD

44 rows selected.

Here you can clearly see which of the background processes are threads and which aren’t. Threaded processes share an SPID and have individual thread IDs, so there are multiple STIDs per SPID:

SYS@CDB$ROOT> select count(spid),spid,execution_type from v$process where background = 1 group by spid, execution_type;

COUNT(SPID) SPID                     EXECUTION_
----------- ------------------------ ----------
          1 7158                     PROCESS
         12 7146                     THREAD
          1 7138                     PROCESS
          1 7140                     PROCESS
          1 7142                     PROCESS
         29 7152                     THREAD

6 rows selected.
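The SOSID column simply concatenates the two identifiers. If you want to go from a SOSID back to something you can look up with ps -eLf, you can split it at the underscore with shell parameter expansion; a small sketch, using a value from the query output above:

```shell
# Split a v$process SOSID such as "7152_7416" into process ID and thread ID.
sosid="7152_7416"
pid=${sosid%_*}   # everything before the underscore: the PID (SPID)
lwp=${sosid#*_}   # everything after the underscore: the thread ID (STID)
echo "pid=$pid lwp=$lwp"
# For a genuine process the SOSID has no underscore (e.g. "7158"), so both
# expansions return the unchanged value and PID and LWP coincide, as expected.
```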

Circling back to the OS level for SPID 7146: can you see the threads there as well? You can:

[oracle@london ~]$ ps -eLf | egrep -i "UID|7146"
UID        PID  PPID   LWP  C NLWP STIME TTY          TIME CMD
oracle    7146     1  7146  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7147  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7148  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7149  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7155  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7159  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7160  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7161  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7162  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7163  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7165  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7166  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    7146     1  7167  0   13 09:04 ?        00:00:00 ora_u004_CDB1
oracle    8156  6876  8156  0    1 09:19 pts/2    00:00:00 egrep -i UID|7146
[oracle@london ~]$

But what about user sessions?

Let’s connect to the database (a PDB in this case) as a user.

[oracle@london ~]$ sqlplus martin@localhost/pdb4
...
SQL*Plus: Release 12.1.0.1.0 Production on Fri Jul 18 09:22:39 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Last Successful login time: Fri Jul 18 2014 09:21:27 -04:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

MARTIN@PDB4> select sys_context('userenv','sid') from dual;

SYS_CONTEXT('USERENV','SID')
--------------------------------------------------------------------------------
280
MARTIN@PDB4>

I can see the session from the CDB root:

SYS@CDB$ROOT> select s.username, s.sid, s.serial#, s.con_id, p.spid, p.sosid, p.stid, p.execution_type
  2  from v$session s, v$process p
  3  where s.sid = 280 and s.paddr = p.addr;

USERNAME          SID    SERIAL#     CON_ID SPID                     SOSID                    STID                     EXECUTION_
---------- ---------- ---------- ---------- ------------------------ ------------------------ ------------------------ ----------
MARTIN            280        407          4 8672                     8672                     8672                     PROCESS

SYS@CDB$ROOT>

Interestingly, as the output shows, user processes seem to remain UNIX processes (there is a twist to that too, bear with me). Remember that this process has been created using a net service and the EZConnect syntax. What about my session that bypasses the SQL*Net layer? I connected using sqlplus sys as sysdba:

SYS@CDB$ROOT> select sys_context('userenv','sid') from dual;

SYS_CONTEXT('USERENV','SID')
------------------------------------------------------------------------------------------------------------------------------------------------------
249

SYS@CDB$ROOT> select s.username, s.sid, s.serial#, s.con_id, p.spid, p.sosid, p.stid, p.execution_type
  2  from v$session s, v$process p
  3  where s.sid = 249 and s.paddr = p.addr;

USERNAME          SID    SERIAL#     CON_ID SPID                     SOSID                    STID                     EXECUTION_
---------- ---------- ---------- ---------- ------------------------ ------------------------ ------------------------ ----------
SYS               249        685          1 7152                     7152_8696                8696                     THREAD

SYS@CDB$ROOT>

Now that’s a thread. Surprised? Let’s test on a non-CDB. First I try to connect bypassing the listener:

[oracle@london ~]$ sqlplus martin

SQL*Plus: Release 12.1.0.1.0 Production on Fri Jul 18 10:34:58 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Enter password:
Last Successful login time: Fri Jul 18 2014 10:33:25 -04:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

MARTIN@NCDB> select sys_context('userenv','sid') from dual;

SYS_CONTEXT('USERENV','SID')
--------------------------------------------------------------------------------
249

MARTIN@NCDB>
-- in a different session
SYS@NCDB> select s.username, s.sid, s.serial#, s.con_id, p.spid, p.sosid, p.stid, p.execution_type
  2  from v$session s, v$process p
  3  where s.sid = 249 and s.paddr = p.addr;

USERNAME                              SID    SERIAL#     CON_ID SPID                     SOSID                    STID                     EXECUTION_
------------------------------ ---------- ---------- ---------- ------------------------ ------------------------ ------------------------ ----------
MARTIN                                249         13          0 14634                    14634_15337              15337                    THREAD

This session is a thread. Now using the listener:

[oracle@london ~]$ tnsping ncdb

TNS Ping Utility for Linux: Version 12.1.0.1.0 - Production on 18-JUL-2014 10:36:33

Copyright (c) 1997, 2013, Oracle.  All rights reserved.

Used parameter files:

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = london.example.com)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = NCDB)))
OK (0 msec)
[oracle@london ~]$ sqlplus martin@ncdb

SQL*Plus: Release 12.1.0.1.0 Production on Fri Jul 18 10:36:36 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Enter password:
Last Successful login time: Fri Jul 18 2014 10:36:18 -04:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

MARTIN@NCDB> select sys_context('userenv','sid') from dual;

SYS_CONTEXT('USERENV','SID')
--------------------------------------------------------------------------------
249

And what’s your guess?

SYS@NCDB> r
  1  select s.username, s.sid, s.serial#, s.con_id, p.spid, p.sosid, p.stid, p.execution_type
  2  from v$session s, v$process p
  3* where s.sid = 249 and s.paddr = p.addr

USERNAME                              SID    SERIAL#     CON_ID SPID                     SOSID                    STID                     EXECUTION_
------------------------------ ---------- ---------- ---------- ------------------------ ------------------------ ------------------------ ----------
MARTIN                                249         17          0 15507                    15507                    15507                    PROCESS

You were right: a process. But so far I haven’t shown you my listener.ora file; up to this point the testing didn’t use DEDICATED_THROUGH_BROKER_LISTENER = ON. Let’s set this and reload the listener. Here is the complete example:
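The change itself is a one-liner in listener.ora; a minimal sketch, assuming the default listener name LISTENER (reload afterwards with lsnrctl reload):

```
# $ORACLE_HOME/network/admin/listener.ora (excerpt)
DEDICATED_THROUGH_BROKER_LISTENER = ON
```

With the parameter in place, new dedicated connections through this listener are created by the database itself, as threads, rather than forked by the listener.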

[oracle@london ~]$ sqlplus martin@ncdb

SQL*Plus: Release 12.1.0.1.0 Production on Fri Jul 18 10:45:18 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Enter password:
Last Successful login time: Fri Jul 18 2014 10:36:38 -04:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

MARTIN@NCDB> select sys_context('userenv','sid') from dual;

SYS_CONTEXT('USERENV','SID')
--------------------------------------------------------------------------------
14

-- and in another session

SYS@NCDB> select s.username, s.sid, s.serial#, s.con_id, p.spid, p.sosid, p.stid, p.execution_type
  2  from v$session s, v$process p
  3  where s.sid = 14 and s.paddr = p.addr;

USERNAME               SID    SERIAL#     CON_ID SPID                     SOSID                    STID                     EXECUTION_
--------------- ---------- ---------- ---------- ------------------------ ------------------------ ------------------------ ----------
MARTIN                  14         25          0 14634                    14634_16287              16287                    THREAD

The listener parameter DEDICATED_THROUGH_BROKER_&lt;listener_name&gt; therefore allows you to have user sessions created as threads of an existing process. And now to wrap up: how do you kill such a session? On my Oracle Linux 6.5 system I killed the LWP:

[oracle@london ~]$ ps -eLf | egrep -i "UID|16287"
UID        PID  PPID   LWP  C NLWP STIME TTY          TIME CMD
oracle   14634     1 16287  0   32 10:45 ?        00:00:00 ora_u005_NCDB
oracle   16531 16503 16531  1    1 10:48 pts/4    00:00:00 egrep -i UID|16287
[oracle@london ~]$ kill -9 16287

Which terminates my session:

MARTIN@NCDB> select user from dual
  2  /
select user from dual
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 14634 Thread ID: 16287
Session ID: 14 Serial number: 25

My database stayed up and running. But please do NEVER EVER kill the SPID when threaded_execution is enabled! A single SPID now hosts many sessions and background threads, and killing the process takes all of them down with it: you might bring the whole database down.
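A safer way to get rid of a single session, thread or not, is from within the database, using the SID and serial number from v$session; a sketch with the values from the example above:

```
SQL> alter system kill session '14,25' immediate;
```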

Disclaimer

As with everything on this blog, don’t take it literally: test it! Just because it worked for me on my VM with a specific set of packages does not mean it is applicable to other Linux distributions, let alone Solaris or AIX!


Setting up Direct NFS on Oracle 12c

Posted by Martin Bach on July 9, 2014

Direct NFS is a great feature that I have finally had the time to investigate further. Since I always forget how to set it up, and I didn’t find blog posts about this subject elsewhere, I decided to put something together.

In this configuration I am using a virtual machine named server1 to export a directory to server2. Oracle is not as lenient as I am and may have certain support requirements when it comes to dNFS servers, but I just wanted to get started.

The export of the NFS mount is shown here:

[root@server1 ~]# cat /etc/exports
/u01/oraback	*.example.com(rw,sync,all_squash,insecure,anonuid=54321,anongid=54321)

There is nothing too special about the export definition. The all_squash directive maps all uids and gids to the anonymous user, which by default means a uid and gid of 65534. Since that is most likely not matching the oracle user, I chose to override the defaults with anonuid and anongid. You may have already guessed that I am using the Oracle preinstall RPM, which uses 54321 for the oracle account and the oinstall group respectively.

Oracle also recommends tweaking some network-related parameters in /etc/sysctl.conf. They appear to be aimed at better performance, but I haven’t had time to verify that claim yet. On the other hand, from kernel 2.6 onwards Linux tunes the send and receive buffers automatically, and it’s probably a good idea not to try to outsmart it.

Using “service nfs start” I started the NFS server on my Oracle Linux 6.4 system. I had appropriate firewall rules in place; if you have a firewall you might need to do the same.

Mounting

Mounting the file system on the second node requires a setting in the fstab file to begin with. In this example I want to mount the exported backups in /media/backups on server2.

The corresponding entry in fstab was:

...
server1:/u01/oraback	/media/backup	nfs rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,vers=3,timeo=600	1 2

I kept the habit of enforcing NFSv3 even though 4 and 4.1 are now supported with 12c. With the fstab entry I can now mount the directory:

[root@server2 ~]# mount /media/backup
[root@server2 ~]# mount | grep backup
server1:/u01/oraback on /media/backup type nfs (rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,vers=3,timeo=600,addr=192.168.56.44)

So far kernel NFS has been used, but not Direct NFS. To change that, you need to create a mapping file named oranfstab, which resides in $ORACLE_HOME/dbs. In this file you define how to get to the NFS filer. You can define up to four paths if you have multiple NICs in your database server. My VM does not, so the most basic setup is used:

[root@server2 ~]# cat /u01/app/oracle/product/12.1.0.1/dbhome_1/dbs/oranfstab
server: server1
local: 192.168.56.45
path:  192.168.56.44
export: /u01/oraback mount: /media/backup
export: /m/CLONE mount: /u01/oradata/CLONE
[root@server2 ~]#

The “server” directive gives the NFS filer a name. Multiple sections are introduced using the “server” keyword.

Following the server keyword you define how you get the data: the “local” IP address shown here is the IP of server2. The “path” keyword indicates the path to the NFS filer, server1 or 192.168.56.44. And finally you tell Oracle the name of the export on the filer (/u01/oraback) and where it is mounted locally (/media/backup).
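Since the directives are simple keyword/value pairs, a small awk sketch can double-check which export ends up on which local mount point. The file layout is the one shown above; the one-liner itself is my own, not an Oracle tool:

```shell
# Print export -> local mount point pairs from an oranfstab (sketch).
awk '$1 == "export:" { print $2, "->", $4 }' $ORACLE_HOME/dbs/oranfstab
```

For the file above this prints /u01/oraback -> /media/backup and /m/CLONE -> /u01/oradata/CLONE.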

The last step necessary is to enable dNFS:

[oracle@server2 lib]$ make -f ins_rdbms.mk dnfs_on
rm -f /u01/app/oracle/product/12.1.0.1/dbhome_1/lib/libodm12.so;
cp /u01/app/oracle/product/12.1.0.1/dbhome_1/lib/libnfsodm12.so
 /u01/app/oracle/product/12.1.0.1/dbhome_1/lib/libodm12.so

All done! After the next start of your database you should see a reference to the ODM library in the alert.log, similar to this one:

...
NOTE: remote asm mode is local (mode 0x1; from cluster type)
Sun Nov 17 14:30:41 2013
Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 3.0
Starting background process PMON
Starting background process PSP0
...

Hopefully you will now be able to query the metadata views too.

SQL> select * from v$dnfs_servers

        ID SVRNAME    DIRNAME                 MNTPORT    NFSPORT NFSVERSI      WTMAX      RTMAX     CON_ID
---------- ---------- -------------------- ---------- ---------- -------- ---------- ---------- ----------
         1 server1    /m/clone                  52690       2049 NFSv3.0     1048576    1048576          0

That’s all there is to say about dNFS for this post. Oh and if at first the query against v$dnfs_servers does not return anything, it doesn’t necessarily imply a problem. I just created a file on the mount point and – as if by magic – dNFS kicked in, opened the channel and opened the file descriptors.

References

MOS Doc ID 1464567.1: Collecting The Required Information For Support To Troubleshoot DNFS (Direct NFS) Issues (11.1, 11.2 & 12c)

Posted in 12c Release 1, Linux | 3 Comments »

Why is P1 the only parameter populated in cell smart table scan?

Posted by Martin Bach on July 7, 2014

Anyone who has looked at Exadata might ask the question, and I did so too. After all, cell smart table scan is in wait class User IO so there should be more, right? This is what you find for a smart scan:

NAME                           PARAMETER1           PARAMETER2           PARAMETER3                     WAIT_CLASS
------------------------------ -------------------- -------------------- ------------------------------ ---------------
cell smart table scan          cellhash#                                                                User I/O
cell smart index scan          cellhash#                                                                User I/O

Compare this to the traditional IO request:

NAME                           PARAMETER1           PARAMETER2           PARAMETER3                     WAIT_CLASS
------------------------------ -------------------- -------------------- ------------------------------ ---------------
db file sequential read        file#                block#               blocks                         User I/O
db file scattered read         file#                block#               blocks                         User I/O
direct path read               file number          first dba            block cnt                      User I/O

At first I didn’t understand why that was the case, until I traced a lot more. Using the method described by my colleague Frits Hoogland I traced the Exadata calls as well as the IB interconnect in addition to a SQL trace (level 8). Please don’t do this outside the lab; the trace will get huge!

Now that produces an interesting file, and the SQL trace information amounts to just the occasional line drowning in a sea of other output. So I used a sed one-liner to make those lines stand out a bit:

$ sed  "/WAIT/{x;p;x;G;}" db12c1_ora_11916.trc > kxd_libcell_sqltace.trc

This introduces a newline before and after each line containing WAIT. The SQL Trace information is familiar:

WAIT #139970814085680: nam='cell smart table scan' ela= 251 cellhash#=822451848 p2=0 p3=0 obj#=61471 tim=3470911250866

By the way the obj# is the data_object_ID, not the object_id in DBA_OBJECTS.
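In case the sed expression looks cryptic: x swaps the (empty) hold space into the pattern space, p prints it as a blank line, x swaps back, and G appends the hold space plus a newline. A tiny demonstration with made-up input:

```shell
# Surround every line containing WAIT with blank lines (same sed as above).
printf 'foo\nWAIT #1: cell smart table scan\nbar\n' | sed "/WAIT/{x;p;x;G;}"
```

This prints foo, a blank line, the WAIT line, another blank line, and bar.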

Stepping through the trace

The trace information reflects what is happening during the various smart scan processing steps. This first example shows some of the work during the smart scan initialisation phase, when the cells are set up. I initially struggled to map the “appliance” hexadecimal number to an actual cell, but it’s easy if you grep for kcfis_open_appliance_fd and match the cellhash to v$cell.cell_hashval. Continuing the example:

kcfis_open_appliance:
throttle: initialized for appliance 0x7f4d7e9bc4c0
kcfis_find_appliance_fd: appliance fd not found. appliance 192.168.12.10/ Cached appliance fd count 0
kcfis_open_appliance_fd: 0x7f4d7e9bc4c0
oss_open called for o/192.168.12.10//box/predicate
Sending message RQ_Tag_2827385864_11: 0xf9b3f08 with frag # 0, to box my_box_refid: 2 (box inc: 12)
Command with band 1, payload length 116, payload 0xf9b4110
RQ_Tag_2827385864_11: Command name OPEN, refid 11
Open flags: readwrite - 1, sync - 0, char - 1.
 Device name - /box/predicate
Vrpc completion - RQ_Tag_2827385864_11: cmd 0xf9b3f48, req 0xf9b3a40, frag # 1
Vrpc reply 0xf9b4000 with refid 11 for existing message: 0xf9b3f08
Reply with band 0, payload length 0, payload (nil)
RQ_Tag_2827385864_11: Command name REPLY OPEN, refid 11
Returned open file descriptor  1
opening file number 11

WAIT #139970814085680: nam='cell smart table scan' ela= 259 cellhash#=822451848 p2=0 p3=0 obj#=61471 tim=3470911167788

So here the cell with IP 192.168.12.10 maps to 0x7f4d7e9bc4c0 in the trace. As you can see, immediately before the wait event is written to the trace file a file is opened on the cell. This causes the line to be emitted in the SQL trace file. So far there hasn’t been any significant scanning activity against the data. But the trace line makes sense: the IO request was issued against the cell with hash 822451848. We don’t know what it did though, and which file it accessed.

SYS:db12c1> select cell_path from v$cell where cell_hashval = '822451848';

CELL_PATH
----------------------------
192.168.12.10

I was quite pleased with myself when I found this out.
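Rather than matching the pointers by eye, the same pairing can be pulled from the trace with a short awk sketch; the field positions are assumed from the kcfis excerpt above and may differ between versions:

```shell
# Pair each appliance pointer with the cell IP it was opened against (sketch).
awk '/kcfis_find_appliance_fd/  { ip = $7; sub(/\/$/, "", ip) }
     /kcfis_open_appliance_fd:/ { print $2, "->", ip }' db12c1_ora_11916.trc
```

For the excerpt above this yields 0x7f4d7e9bc4c0 -> 192.168.12.10.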

More trace information. A lot more.

Lesson learned trawling through the trace: just because there is a single line in the trace file doesn’t mean there aren’t other requests either in flight or under preparation.

The next few thousand (!) lines show you the mapping between RDBMS and cell addressing. This part is well explained in Frits’ post referenced earlier. Further down in the trace it gets more interesting from a SQL trace perspective again: here payload maps are generated and issued against the cells. Remember that these IO requests to the cells are performed asynchronously. I found quite a few calls to the code generating payloads for the cells between individual SQL trace WAIT lines. This makes reading the trace a bit confusing; it almost appears as if the SQL trace couldn’t keep up with the speed at which the requests are created and sent.

Looking at the file I found the following pattern preceding a smart scan line in the first half of the process (before data comes back from the cells). Beginning immediately after the last WAIT line, in abbreviated format:

Ioctl completed.
appliance 0x7f4d7e9bc4c0 : cellsrv pid: 9563: predicate: /box/predicate37843
...
kcfis_create_maps_payload. appliance 0x7f4d7ea65570 num_disks 12
throttle: allowing map push: appliance 0x7f4d7ea65570, maps_to_push 7
...
Pushing request : disknumber = 0, offset = 936606973952, len = 1032192 rdba: 43778306 version 0 reqid=10 cver=1
disk=0x7f4d7e9c0aa8 state=1
trans_req_element = 0x7f4d7e9cba00

The appliance name, cellsrv (thread) ID, and the box predicate appear to remain constant for the lifetime of the smart scan. You can find them in other occurrences of the trace as well (but not used in the same context).

The preceding lines are then followed by many more lines beginning with “Pushing request”, iterating over all 12 disks (-> num_disks), for a range of Data Block Addresses. After that map payload has been created it is time to push it to the cell. Here it gets a bit murky: the KXD.* traces show requests being pushed, but not necessarily to the cell mentioned in the event line. Consider this:

kcfis_create_maps_payload. alloc_len 272 num maps 2
throttle: mappayload: maps_to_push 4 iosize_being_pushed 8388608
kcfis_push: pushing maps to appliance 0x7f4d7e9bc4c0
kcfis_issue_ioctl: payload_type 2
oss_ioctl open file: 11, opcode: 99
Sending message RQ_Tag_2827385864_54: 0xf9c5728 with frag # 0, to box my_box_refid: 2 (box inc: 12)
Command with band 1, payload length 272, payload 0x7f4d7e9b9ef8
RQ_Tag_2827385864_54: Command name IOCTL, refid 54
Ioctl arguments fd 1 opcode 99 size 272
oss_ioctl done
oss_wait called for request: 0xf9b3a40
Vrpc completion - RQ_Tag_2341095748_52: cmd 0xf9c6ec8, req 0xf9b3a40, frag # 1
Vrpc reply 0xf9c6f80 with refid 52 for existing message: 0xf9c6e88
Reply with band 1, payload length 40, payload 0x7f4d7ea65a90
RQ_Tag_2341095748_52: Command name REPLY IOCTL, refid 52
oss_wait done, request 0xf9b3a40, result 0x7f4d837e4c48

WAIT #139970814085680: nam='cell smart table scan' ela= 44 cellhash#=3249924569 p2=0 p3=0 obj#=61471 tim=3470911270172

The kcfis_push call is addressed to appliance 0x7f4d7e9bc4c0, which maps to 192.168.12.10. The cell hash from the event (3249924569) maps to a different cell though. You can find the corresponding kcfis_push earlier in the trace. In fact, it is the first occurrence after the last WAIT event line.

trans_req_element = 0x7f4d7e870780
Pushing request : disknumber = 5, offset = 937011707904, len = 4194304 rdba: 44570624 version 0 reqid=1021 cver=1
disk=0x7f4d7e9b5980 state=1
disk=0x7f4d7e9b4968 state=1
disk=0x7f4d7e9b3460 state=1
disk=0x7f4d7e9b2f70 state=1
disk=0x7f4d7e9df5a0 state=1
disk=0x7f4d7e9df0b0 state=1
disk=0x7f4d7e9dea78 state=1
kcfis_create_maps_payload. alloc_len 2648 num maps 35
throttle: mappayload: maps_to_push 10 iosize_being_pushed 102039552
kcfis_push: pushing maps to appliance 0x7f4d7ea65570
kcfis_issue_ioctl: payload_type 2
SKGXP-connecting to the box 192.168.12.8 OSS OS Pid 4541
oss_ioctl open file: 2, opcode: 99
Sending message RQ_Tag_2341095748_52: 0xf9c6e88 with frag # 0, to box my_box_refid: 0 (box inc: 16)
Command with band 3, payload length 2648, payload 0x7f4d7ea62e98
RQ_Tag_2341095748_52: Command name IOCTL, refid 52
Ioctl arguments fd 1 opcode 99 size 2648
oss_ioctl done

This is getting too complex, can you say it again?

Now let’s put this together again, simplified:

WAIT #139970814085680: nam='cell smart table scan' ela= 24 cellhash#=822451848 p2=0 p3=0 obj#=61471 tim=3470911267234

kcfis_push: num-appliances 3. payload_size 0x7f4d837e4cf0 ioctl_issued 0x7f4d837e4cc8 results 0x7f4d837e4c48 payload_type 0x7f4d837e4d20
kcfis_create_maps_payload. appliance 0x7f4d7ea65570 num_disks 12
...
kcfis_create_maps_payload. alloc_len 2648 num maps 35
throttle: mappayload: maps_to_push 10 iosize_being_pushed 102039552
kcfis_push: pushing maps to appliance 0x7f4d7ea65570
kcfis_issue_ioctl: payload_type 2
SKGXP-connecting to the box 192.168.12.8 OSS OS Pid 4541
oss_ioctl open file: 2, opcode: 99
Sending message RQ_Tag_2341095748_52: 0xf9c6e88 with frag # 0, to box my_box_refid: 0 (box inc: 16)
Command with band 3, payload length 2648, payload 0x7f4d7ea62e98
RQ_Tag_2341095748_52: Command name IOCTL, refid 52
Ioctl arguments fd 1 opcode 99 size 2648
oss_ioctl done
...
kcfis_create_maps_payload. appliance 0x7f4d7e9bf518 num_disks 12
...
kcfis_create_maps_payload. alloc_len 1208 num maps 15
throttle: mappayload: maps_to_push 10 iosize_being_pushed 34471936
kcfis_push: pushing maps to appliance 0x7f4d7e9bf518
kcfis_issue_ioctl: payload_type 2
SKGXP-connecting to the box 192.168.12.9 OSS OS Pid 9547
...
oss_ioctl done
...
kcfis_create_maps_payload. appliance 0x7f4d7e9bc4c0 num_disks 12
...
kcfis_create_maps_payload. alloc_len 272 num maps 2
throttle: mappayload: maps_to_push 4 iosize_being_pushed 8388608
kcfis_push: pushing maps to appliance 0x7f4d7e9bc4c0
...
oss_ioctl done
...
oss_wait done, request 0xf9b3a40, result 0x7f4d837e4c48

WAIT #139970814085680: nam='cell smart table scan' ela= 44 cellhash#=3249924569 p2=0 p3=0 obj#=61471 tim=3470911270172

Interestingly the results for the requests come back in that order as well, although judging by the rest of the trace this seems to be coincidence. Compare the appliance values with the summary you just read:

WAIT #139970814085680: nam='cell smart table scan' ela= 44 cellhash#=3249924569 p2=0 p3=0 obj#=61471 tim=3470911270172

Ioctl completed. Payload type 2
Ioctl quarantine response 1 for appliance 0x7f4d7ea65570
appliance 0x7f4d7ea65570 : cellsrv pid: 4541: predicate: /box/predicate114
oss_wait called for request: 0xf9c69c0
Vrpc completion - RQ_Tag_2825762783_53: cmd 0xf9c6768, req 0xf9c69c0, frag # 1
Vrpc reply 0xf9c6820 with refid 53 for existing message: 0xf9c6728
Reply with band 1, payload length 40, payload 0x7f4d7e9bfa38
RQ_Tag_2825762783_53: Command name REPLY IOCTL, refid 53
oss_wait done, request 0xf9c69c0, result 0x7f4d837e4c68

WAIT #139970814085680: nam='cell smart table scan' ela= 64 cellhash#=674246789 p2=0 p3=0 obj#=61471 tim=3470911270269

Ioctl completed. Payload type 2
Ioctl quarantine response 1 for appliance 0x7f4d7e9bf518
appliance 0x7f4d7e9bf518 : cellsrv pid: 9547: predicate: /box/predicate37211
oss_wait called for request: 0xf9c5260
Vrpc completion - RQ_Tag_2827385864_54: cmd 0xf9c5768, req 0xf9c5260, frag # 1
Vrpc reply 0xf9c5820 with refid 54 for existing message: 0xf9c5728
Reply with band 1, payload length 40, payload 0x7f4d7e9bc9e0
RQ_Tag_2827385864_54: Command name REPLY IOCTL, refid 54
oss_wait done, request 0xf9c5260, result 0x7f4d837e4c88

WAIT #139970814085680: nam='cell smart table scan' ela= 54 cellhash#=822451848 p2=0 p3=0 obj#=61471 tim=3470911270351

Ioctl completed. Payload type 2
Ioctl quarantine response 1 for appliance 0x7f4d7e9bc4c0

SQL trace wait events frequently seem to be written after an IOCTL (a standard Linux call) or oss_wait (Oracle) has completed. So it appears as if requests are created and handled on a cell-by-cell basis, which explains why you get so many cell smart table scan events in general. It’s not because the requests are necessarily small in size; the payloads always seem to include more than one disk to scan. Thinking about it now it makes sense, but as an analyst I would have appreciated more insight into what happens under the covers. But we can trace, right?
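To see how the waits distribute across the cells, the ela values can be aggregated per cellhash# straight from the trace. This is a sketch based on the WAIT line format shown in this post:

```shell
# Sum elapsed time (ela, microseconds) per cellhash# over all
# 'cell smart table scan' WAIT lines (sketch).
awk '/cell smart table scan/ {
    for (i = 1; i <= NF; i++) {
        if ($i == "ela=") ela = $(i + 1)
        if ($i ~ /^cellhash#=/) { split($i, a, "="); cell = a[2] }
    }
    sum[cell] += ela
}
END { for (c in sum) print c, sum[c] }' db12c1_ora_11916.trc
```

Mapping the resulting hashes to IPs via v$cell then tells you which cell served most of the scan time.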

Posted in 11g Release 2, 12c Release 1, Exadata | Leave a Comment »

Why does the Optimiser not respect my qb_name() hint?

Posted by Martin Bach on June 30, 2014

I was recently involved in an investigation into a slow-running report on an Exadata system. It was rather interesting: the raw text file with the query was 33 KB in size, and SQL Developer formatted the query to more than 1,000 lines. There were lots of interesting constructs in the query, and the optimiser did its best to make sense of the inline views and various joins.

This almost led to a different post about the importance of understanding query transformation. You will see in a bit what QT has to do with this post. Although access paths and join methods are still very relevant when trying to understand execution plans, you will be amazed by the extent of the work happening during query transformation. I have said so on twitter already, but in case you missed it: Oracle has published a SQL Tuning Guide with the 12c documentation set. This is a really, really good, well-written piece of documentation. And it contains a lot of information about query transformation in chapter 5.

Query Blocks

One of the lessons I learned when looking at complex SQL statements is to think about query blocks. Let’s start with an example. I created two test tables, T1 and T2, both with rather random data. T1 is bigger at about 33 million rows; T2 has about 1 million rows. Both are based on Jonathan Lewis’s presentation about “creating tests“. This is Oracle 12.1.0.1.3 on Exadata.

Here is the statement used for this blog post, unhinted. To keep things simple there are no indexes on T1 or T2.

select /*+ gather_plan_statistics */
  /* test0013 */
 t1.id,t1.N1
from t1
where exists ( select 1 from t2 where t1.id = t2.id and t2.id between 10 and 20)
and t1.n1 = (select n1 from t1 where id = 1 and rownum = 1)   -- the subquery would return 32 rows without the count stopkey
union all
select  t2.id, t2.n1 from t2
where t2.id between 10 and 12
union all
select t2.id, t2.n1 from t2
where t2.id between 13 and 15;

The statement does not make sense for humans, but it should be logically correct (this is my first post on the optimiser, so I am hoping that more knowledgeable readers will point out any inaccuracies should there be any). The intent is to introduce a few query blocks.

This results in the following execution plan:

SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last +alias'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  5zrwsqnwjzzry, child number 0
-------------------------------------
select /*+ gather_plan_statistics */   /* test0013 */  t1.id,t1.N1 from
t1 where exists ( select 1 from t2 where t1.id = t2.id and t2.id
between 10 and 20) and t1.n1 = (select n1 from t1 where id = 1 and
rownum = 1)  union all select  t2.id, t2.n1 from t2 where t2.id between 10
and 12 union all select t2.id, t2.n1 from t2 where t2.id between 13 and
15

Plan hash value: 319705157

-----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |      |      1 |        |      6 |00:00:00.07 |     516K|    516K|       |       |          |
|   1 |  UNION-ALL                               |      |      1 |        |      6 |00:00:00.07 |     516K|    516K|       |       |          |
|*  2 |   HASH JOIN SEMI                         |      |      1 |      1 |      0 |00:00:00.07 |     485K|    485K|  1421K|  1421K|  540K (0)|
|*  3 |    TABLE ACCESS STORAGE FULL             | T1   |      1 |      1 |      0 |00:00:00.07 |     485K|    485K|  1025K|  1025K| 3085K (0)|
|*  4 |     COUNT STOPKEY                        |      |      1 |        |      1 |00:00:00.01 |       4 |      2 |       |       |          |
|*  5 |      TABLE ACCESS STORAGE FULL FIRST ROWS| T1   |      1 |      2 |      1 |00:00:00.01 |       4 |      2 |  1025K|  1025K| 1029K (0)|
|*  6 |    TABLE ACCESS STORAGE FULL             | T2   |      0 |     12 |      0 |00:00:00.01 |       0 |      0 |  1025K|  1025K|          |
|*  7 |   TABLE ACCESS STORAGE FULL              | T2   |      1 |      4 |      3 |00:00:00.01 |   15204 |  15200 |  1025K|  1025K| 3085K (0)|
|*  8 |   TABLE ACCESS STORAGE FULL              | T2   |      1 |      4 |      3 |00:00:00.01 |   15204 |  15200 |  1025K|  1025K| 3085K (0)|
-----------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SET$1
   2 - SEL$5DA710D3
   3 - SEL$5DA710D3 / T1@SEL$1
   4 - SEL$3
   5 - SEL$3        / T1@SEL$3
   6 - SEL$5DA710D3 / T2@SEL$2
   7 - SEL$4        / T2@SEL$4
   8 - SEL$5        / T2@SEL$5

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."ID"="T2"."ID")
   3 - storage(("T1"."ID"<=20 AND "T1"."ID">=10 AND "T1"."N1"=))
       filter(("T1"."ID"<=20 AND "T1"."ID">=10 AND "T1"."N1"=))
   4 - filter(ROWNUM=1)
   5 - storage("ID"=1)
       filter("ID"=1)
   6 - storage(("T2"."ID"<=20 AND "T2"."ID">=10))
       filter(("T2"."ID"<=20 AND "T2"."ID">=10))
   7 - storage(("T2"."ID"<=12 AND "T2"."ID">=10))
       filter(("T2"."ID"<=12 AND "T2"."ID">=10))
   8 - storage(("T2"."ID"<=15 AND "T2"."ID">=13))
       filter(("T2"."ID"<=15 AND "T2"."ID">=13))

54 rows selected.

So far so good; nothing really too surprising here, but keep in mind this is a simple example. Now if I want to pass hints to the optimiser on how to deal with the subqueries I have to refer to the query blocks. Jonathan Lewis has described this in an older but still relevant blog post, and you can find presentations from others on the subject as well. Easy enough: I simply add a /*+ qb_name(name) */ immediately after every select, reference the query block name in the hint (/*+ full(@block_a t1@block_a) */), and I’m done. So let’s try:

select /*+ qb_name(block_a) gather_plan_statistics */
  /* test0013 */
 t1.id,t1.N1
from t1
where exists ( select /*+ qb_name(block_c) */ 1 from t2 where t1.id = t2.id and t2.id between 10 and 20)
and t1.n1 = (select /*+ qb_name(block_d) */ n1 from t1 where id = 1 and rownum = 1)
union all
select /*+ qb_name(block_e) */ t2.id, t2.n1 from t2
where t2.id between 10 and 12
union all
select /*+ qb_name(block_f) */ t2.id, t2.n1 from t2
where t2.id between 13 and 15;

Now surely I should get query blocks block_[ac-f] in the execution plan, right? (When going over the post I noticed that block_b is missing from the SQL statement, by the way; for some reason it got lost, never mind.)

SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last +alias'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  66ytd4g9xmwxb, child number 0
-------------------------------------
select /*+ qb_name(block_a) gather_plan_statistics */   /* test0012 */
t1.id,t1.N1 from t1 where exists ( select /*+ qb_name(block_c) */ 1
from t2 where t1.id = t2.id and t2.id between 10 and 20) and t1.n1 =
(select /*+ qb_name(block_d) */ n1 from t1 where id = 1 and rownum = 1)
union all select /*+ qb_name(block_e) */ t2.id, t2.n1 from t2 where
t2.id between 10 and 12 union all select /*+ qb_name(block_f) */ t2.id,
t2.n1 from t2 where t2.id between 13 and 15

Plan hash value: 319705157

-----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |      |      1 |        |      6 |00:00:00.04 |     516K|    516K|       |       |          |
|   1 |  UNION-ALL                               |      |      1 |        |      6 |00:00:00.04 |     516K|    516K|       |       |          |
|*  2 |   HASH JOIN SEMI                         |      |      1 |      1 |      0 |00:00:00.04 |     485K|    485K|  1421K|  1421K|  385K (0)|
|*  3 |    TABLE ACCESS STORAGE FULL             | T1   |      1 |      1 |      0 |00:00:00.04 |     485K|    485K|  1025K|  1025K| 3085K (0)|
|*  4 |     COUNT STOPKEY                        |      |      1 |        |      1 |00:00:00.01 |       4 |      2 |       |       |          |
|*  5 |      TABLE ACCESS STORAGE FULL FIRST ROWS| T1   |      1 |      2 |      1 |00:00:00.01 |       4 |      2 |  1025K|  1025K| 1029K (0)|
|*  6 |    TABLE ACCESS STORAGE FULL             | T2   |      0 |     12 |      0 |00:00:00.01 |       0 |      0 |  1025K|  1025K|          |
|*  7 |   TABLE ACCESS STORAGE FULL              | T2   |      1 |      4 |      3 |00:00:00.01 |   15204 |  15200 |  1025K|  1025K| 3085K (0)|
|*  8 |   TABLE ACCESS STORAGE FULL              | T2   |      1 |      4 |      3 |00:00:00.01 |   15204 |  15200 |  1025K|  1025K| 3085K (0)|
-----------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SET$1
   2 - SEL$7CA7EA44
   3 - SEL$7CA7EA44 / T1@BLOCK_A
   4 - BLOCK_D
   5 - BLOCK_D      / T1@BLOCK_D
   6 - SEL$7CA7EA44 / T2@BLOCK_C
   7 - BLOCK_E      / T2@BLOCK_E
   8 - BLOCK_F      / T2@BLOCK_F

Surprise, surprise

Umm, where are query blocks A and C (remember there is no hint for block_b)? I was hoping to get rid of all the system-generated names, but no luck. Actually, that’s not quite right, and to understand why you have to look at the 10053 trace, which is mostly harmless. In the trace you find all the query block names registered, right at the beginning:

Registered qb: BLOCK_A 0x3bc2578 (HINT BLOCK_A)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=BLOCK_A nbfros=1 flg=0
    fro(0): flg=4 objn=61471 hint_alias="T1"@"BLOCK_A"

Registered qb: BLOCK_C 0x3bc5610 (HINT BLOCK_C)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=BLOCK_C nbfros=1 flg=0
    fro(0): flg=4 objn=61485 hint_alias="T2"@"BLOCK_C"

Registered qb: BLOCK_D 0xfedf8560 (HINT BLOCK_D)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=BLOCK_D nbfros=1 flg=0
    fro(0): flg=4 objn=61471 hint_alias="T1"@"BLOCK_D"

Registered qb: BLOCK_E 0xfedf7290 (HINT BLOCK_E)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=BLOCK_E nbfros=1 flg=0
    fro(0): flg=4 objn=61485 hint_alias="T2"@"BLOCK_E"

Registered qb: BLOCK_F 0xfedf5f50 (HINT BLOCK_F)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=BLOCK_F nbfros=1 flg=0
    fro(0): flg=4 objn=61485 hint_alias="T2"@"BLOCK_F"

Registered qb: SET$1 0xfedf4b28 (PARSER)

So why is there no BLOCK_A in the Query Block Name / Object Alias section of the plan? Now it’s time to scroll down (a lot) or to search for the query block from the execution plan (SEL$7CA7EA44 for example).
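Instead of scrolling, a grep for the registration lines gets you there directly (the trace file name here is made up):

```shell
# List every query block registration in a 10053 trace (file name hypothetical).
grep -n '^Registered qb:' db12c1_ora_10053.trc
```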

I found it in the query transformation part of the optimiser trace, more specifically in the cost-based query transformation section. There you can see how the optimiser tries various transformations of the SQL statement you issued. Not really surprisingly, the optimiser works on the EXISTS clause, trying to unnest it. Here’s the excerpt from the trace:

*****************************
Cost-Based Subquery Unnesting
*****************************
SU: Unnesting query blocks in query block SET$1 (#1) that are valid to unnest.
Subquery removal for query block BLOCK_D (#5)
RSW: Not valid for subquery removal BLOCK_D (#5)
Subquery unchanged.
Subquery Unnesting on query block BLOCK_A (#4)SU: Performing unnesting that does not require costing.
SU: Considering subquery unnest on query block BLOCK_A (#4).
SU:   Checking validity of unnesting subquery BLOCK_C (#6)
SU:   Passed validity checks.
SU:   Transforming EXISTS subquery to a join.
Registered qb: SEL$7CA7EA44 0xfedfa350 (SUBQUERY UNNEST BLOCK_A; BLOCK_C)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$7CA7EA44 nbfros=2 flg=0
    fro(0): flg=0 objn=61471 hint_alias="T1"@"BLOCK_A"
    fro(1): flg=0 objn=61485 hint_alias="T2"@"BLOCK_C"

Notice the new query block name: Registered qb: SEL$7CA7EA44 0xfedfa350 (SUBQUERY UNNEST BLOCK_A; BLOCK_C). And there you go, the rewritten part of the SQL statement “replaces” the original blocks A and C.

Many thanks go to Karen Morton who helped me along the way and had so many good suggestions!

Posted in 12c Release 1, Exadata, Linux | 5 Comments »

My SDU goes to 11 ^h^h I meant 2097152

Posted by Martin Bach on June 11, 2014

One of the cool new things in 12.1 is that you can set the Session Data Unit to 2 MB. This might not sound like a Big Deal, but getting it to work required me to dig deeper into the TNS layer than I intended… Then I somehow got stuck on the wrong track; thankfully the team at Enkitec helped out with pointers.

This post is rather boring if you just look at it, but it’s probably one of the best examples of how a few sentences in writing can be vastly different from the time it took to get to the bottom of them. The solution explained here works on our 12.1.0.1.3 Exadata lab system, an X2-2, but should likewise be applicable to other configurations.

The Server Setup

I created a new TNS_ADMIN directory just for the new listener I wanted to test with. Experiments with the live listener are not encouraged at all; you might cause disruption. So here’s the listener.ora file I created for my new listener, which listens on the IB network. In the little testing I performed I didn’t get more than a 64k SDU when using bondeth0 as the network.

> cat listener.ora
listener_sdutest =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (SDU = 2097152)
      (ADDRESS =
        (PROTOCOL = TCP)
        (HOST = 192.168.x.x)
        (PORT = 1891)
      )
    )
  )


sid_list_listener_sdutest =
  (sid_desc=
    (GLOBAL_DBNAME=db12c)
    (ORACLE_HOME = /u01/app/oracle/product/12.1.0.1/dbhome_1)
    (sid_name = db12c1)
  )

TRACE_LEVEL_listener_sdutest=16
trace_directory_listener_sdutest=/home/oracle/mbach/tns/trace
diag_adr_enabled_listener_sdutest=off
trace_file_listener_sdutest = bigsdu.trc

Please ignore the tracing information for now. In the file I define the SDU to be 2 MB. Note how the SDU = directive sits immediately beneath the DESCRIPTION: it does not seem to have an effect in the sid_list_listener_sdutest section, at least not in my tests. This is actually the result of about two hours of research… quite underwhelming if you ask me now.

Alongside the listener.ora I created a sqlnet.ora file to set the default_sdu_size, just in case.

> cat sqlnet.ora
DEFAULT_SDU_SIZE=2097152

That’s it! Over to the client.

The Client Configuration

Now I need a 12c client to test. I am using the following tnsnames.ora and sqlnet.ora files:

> cat sqlnet.ora
DEFAULT_SDU_SIZE=2097152

trace_level_client=16
trace_directory_client=/home/oracle/mbach/tns/trace
trace_file_client=client.trc
trace_unique_client = true
DIAG_ADR_ENABLED=OFF

> cat tnsnames.ora
db12cbigsdu =
  (DESCRIPTION =
    (SDU=2097152)
    (ADDRESS =
      (PROTOCOL = TCP)
      (HOST = 192.168.x.x)
      (PORT = 1891)
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = DB12C)
    )
  )

Ignore the tracing information for now; you will see in a little while why it is important. The connection information points me to the IB network again. Now if I start the listener and connect to it, I can grep for the SDU sizes in the trace files.
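If you want to extract only the negotiated values rather than whole trace lines, a small grep refinement does the trick. A minimal sketch, using a fabricated sample line that mimics the nsconneg entries written at trace level 16 (the file name is made up for the demo):

```shell
# Write a sample trace line that mimics the nsconneg output format
cat > /tmp/sample_sdu.trc <<'EOF'
[10-JUN-2014 16:51:16:840] nsconneg: vsn=315, gbl=0xc01, sdu=2097152, tdu=2097152
EOF

# -o prints only the matching part, i.e. just the negotiated SDU value
grep -o 'sdu=[0-9]*' /tmp/sample_sdu.trc
```

On the real trace files from this test the same pattern pulls out the sdu= value for every negotiation line.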

The Result

Now I can check if I have actually been able not only to negotiate, but also to receive the 2M SDU. On the listener side it looks good:

> grep 'nsconneg.*.sdu.*' trace/bigsdu.trc
[10-JUN-2014 16:51:16:840] nsconneg: vsn=315, lov=300, opt=0xc41, sdu=65535, tdu=65535, ntc=0x7f08
[10-JUN-2014 16:51:16:840] nsconneg: Large sdu=2097152,tdu=2097152,compression flg=0
[10-JUN-2014 16:51:16:840] nsconneg: vsn=315, gbl=0xc01, sdu=2097152, tdu=2097152

Likewise, on the client side:

> grep 'nsconneg.*.sdu.*' trace/client_26421.trc
(2998822304) [10-JUN-2014 16:51:16:859] nsconneg: vsn=315, gbl=0xa41, sdu=2097152, tdu=2097152

So it all looks ok!

Considerations for RAC

Based on Stefan Koehler’s comment (see below) I thought I’d add a word of warning for the RAC users amongst us. In the above example I created a single listener, not registered in Clusterware, for a test. For the typical use case, data migration, this listener wouldn’t be on the Exadata system; it would be on the source database system. I usually pull data from the source to Exadata.

If however you need to change the SDU for a RAC listener, please refer to “Setting Parameters for Scan and Node Listeners on RAC, Queuesize, SDU, Ports, etc (Doc ID 1292915.1)” for more information. In summary, changes to the listener configuration files are discouraged because most listener properties are maintained by Clusterware. For the SDU size specifically you set DEFAULT_SDU_SIZE in sqlnet.ora (in $GRID_HOME/network/admin), not in the listener configuration file.
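For reference, the RAC-friendly change then boils down to a single line; the path below is the typical Grid Infrastructure location, so adjust it for your installation:

```
# $GRID_HOME/network/admin/sqlnet.ora on each cluster node
DEFAULT_SDU_SIZE=2097152
```

Remember that the SDU is negotiated: the client side needs a matching setting (in its sqlnet.ora or as an SDU clause in the connect descriptor), otherwise the lower of the two values wins.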

References

How to Determine SDU Value Being Negotiated Between Client and Server (Doc ID 304235.1)

Stefan Koehler on SAP SCN: http://goo.gl/y0MAuH

As well as the usual suspects: Net Services Admin and Reference Guide.

Posted in 12c Release 1, Exadata | Tagged: | 1 Comment »

RAC 12c enhancements: adding an additional SCAN-part 4

Posted by Martin Bach on May 26, 2014

This is going to be the last part of this series, however long it might have become in the end. In the previous articles you read how to create a physical standby database from a RAC One Node database.

Networks (refresher)

To make it easier to follow without going back to the previous articles, here are the networks I’m using, listed for your convenience.

  • 192.168.100/24: Client network
  • 192.168.102/24: Dedicated Data Guard network

Data Guard Broker Configuration

I said it before and I’ll say it again: I like the Data Guard broker interface for managing standby databases. It is the ultimate simplification, and so easy to implement that it is quite safe to operate even in large organisations. As an added bonus you get OEM integration as well, since OEM relies on a broker configuration. The first step in managing the standby database is therefore to create the Data Guard configuration. And by the way, the documentation was correct: the parameter listener_networks is already set! When the standby database starts you can see it being set by the Oracle agent process. The alert.log reads:

Starting background process DMON
DMON started with pid=34, OS id=11023
ORACLE_BASE from environment = /u01/app/oracle
Using default pga_aggregate_limit of 2048 MB
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.58)(PORT=1521))' SCOPE=MEMORY SID='sby_2';
2014-04-28 05:11:47.874000 -04:00
ALTER SYSTEM SET remote_listener=' ron12csby-scan.example.com:1521' SCOPE=MEMORY SID='sby_2';
ALTER SYSTEM SET listener_networks='(( NAME=net2)(LOCAL_LISTENER=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.102.58)
(PORT=1522)))))','((NAME=net2)(REMOTE_LISTENER=ron12csby-dgscan.dg.example.com:1521))' SCOPE=MEMORY SID='sby_2';
2014-04-28 05:12:45.107000 -04:00
Decreasing number of real time LMS from 1 to 0

So there is no more need to worry about local_listener (unless you have many local listeners and are not on 12c), remote_listener and listener_networks. It is a big relief to see that it does work. I can now create the configuration. I like the broker command line interface more than the OEM interface, especially since it is a lot easier to show here in the post.

Before you can work with the broker you need to enable it. Set the following init.ora parameters to values similar to those below. It is important that these values are set identically on all instances. You also need to ensure the broker configuration files are in ASM.
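To sketch what setting these looks like (the disk group and file names below are examples only; pick locations that match your environment), set the configuration files first and start the broker last, since the file locations cannot be changed while the broker is running:

```sql
-- run on every instance of both primary and standby (example names)
ALTER SYSTEM SET dg_broker_config_file1='+DATA/ron/d1.dat' SCOPE=BOTH SID='*';
ALTER SYSTEM SET dg_broker_config_file2='+RECO/ron/d2.dat' SCOPE=BOTH SID='*';
ALTER SYSTEM SET dg_broker_start=TRUE SCOPE=BOTH SID='*';
```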

SQL> show parameter dg_broker

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
dg_broker_config_file1               string      +DATA/ron/d1.dat
dg_broker_config_file2               string      +RECO/ron/d2.dat
dg_broker_start                      boolean     TRUE
SQL>

Make sure to apply the settings on both clusters! Now connect using the command line interface and create the configuration. Note that I’m connecting using ronprinet2, which is the TNS name for the primary database using the broker network (192.168.102/24).

[oracle@ron12cprinode1 ~]$ dgmgrl sys/xxx@ronprinet2
DGMGRL for Linux: Version 12.1.0.1.0 - 64bit Production

Copyright (c) 2000, 2012, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected as SYSDBA.
DGMGRL> help create configuration

Creates a broker configuration

Syntax:

  CREATE CONFIGURATION <configuration name> AS
    PRIMARY DATABASE IS <database name>
    CONNECT IDENTIFIER IS <connect identifier>;

DGMGRL> create configuration dgtest as primary database is "RON" connect identifier is "ronprinet2";
Configuration "dgtest" created with primary database "RON"

DGMGRL> show database verbose "RON"

Database - RON

  Role:              PRIMARY
  Intended State:    OFFLINE
  Instance(s):
    pri_2

  Properties:
    DGConnectIdentifier             = 'ronprinet2'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'ASYNC'
    RedoRoutes                      = ''
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyLagThreshold               = '0'
    TransportLagThreshold           = '0'
    TransportDisconnectedThreshold  = '30'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '4'
    LogArchiveMinSucceedDest        = '1'
    FastStartFailoverTarget         = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.51)
(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=ron_DGMGRL)(INSTANCE_NAME=pri_2)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Database Status:
DISABLED

DGMGRL>

A few more things are worth mentioning here: first of all, the database does not know about the other instance. That is nothing to worry about: you will see how it is addressed during the online relocation. The StaticConnectIdentifier is also pointing to the wrong network. Let’s have a look at the only instance:

DGMGRL> show instance verbose pri_2

Instance 'pri_2' of database 'RON'

  Host Name: ron12cprinode1.example.com
  PFILE:
  Properties:
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.51)(PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=ron_DGMGRL)(INSTANCE_NAME=pri_2)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Instance Status:
DISABLED

Unsurprisingly the StaticConnectIdentifier is pointing to the wrong interface again. Let’s correct this.

DGMGRL> edit instance pri_2 set property StaticConnectIdentifier=
> '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.102.51)(PORT=1522))
> (CONNECT_DATA=(SERVICE_NAME=ron_DGMGRL)(INSTANCE_NAME=pri_2)(SERVER=DEDICATED)))';
Property "staticconnectidentifier" updated

This command updated the setting at both the database and instance level. OK, let’s move on. I need to add the standby database.

DGMGRL> add database "RONDG" AS CONNECT IDENTIFIER IS ronsbynet2 MAINTAINED AS PHYSICAL;
Database "RONDG" added

DGMGRL> show database verbose "RONDG"

Database - RONDG

  Role:              PHYSICAL STANDBY
  Intended State:    OFFLINE
  Transport Lag:     (unknown)
  Apply Lag:         (unknown)
  Apply Rate:        (unknown)
  Real Time Query:   OFF
  Instance(s):
    sby_2

  Properties:
    DGConnectIdentifier             = 'ronsbynet2'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'ASYNC'
    RedoRoutes                      = ''
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyLagThreshold               = '0'
    TransportLagThreshold           = '0'
    TransportDisconnectedThreshold  = '30'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '4'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = ''
    LogFileNameConvert              = ''
    FastStartFailoverTarget         = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.58)
(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rondg_DGMGRL)(INSTANCE_NAME=sby_2)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Database Status:
DISABLED

DGMGRL> show instance verbose sby_2

Instance 'sby_2' of database 'RONDG'

  Host Name: ron12csbynode1.example.com
  PFILE:
  Properties:
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.58)(PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=rondg_DGMGRL)(INSTANCE_NAME=sby_2)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Instance Status:
DISABLED

Again let’s fix the static connection identifier.

DGMGRL> edit instance sby_2 set property StaticConnectIdentifier =
> '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.102.58)(PORT=1522))(CONNECT_DATA=(SERVICE_NAME=rondg_DGMGRL)
(INSTANCE_NAME=sby_2)(SERVER=DEDICATED)))';
Property "staticconnectidentifier" updated

Now everything is in place, I can enable the configuration.

DGMGRL> enable configuration
Enabled.
DGMGRL> show configuration

Configuration - dgtest

  Protection Mode: MaxPerformance
  Databases:
  RON   - Primary database
    RONDG - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

DGMGRL>

If you haven’t set the standby file management property to auto yet, please do so. I really don’t like MRP0 aborting because a datafile named UNNAMED0000010 has been added; that causes confusion better avoided.
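If you manage the databases through the broker anyway, this property can be set there as well; a minimal example using the database names from this configuration:

```
DGMGRL> edit database "RONDG" set property StandbyFileManagement='AUTO';
DGMGRL> edit database "RON" set property StandbyFileManagement='AUTO';
```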

If you get errors about missing standby redo logs (SRLs), please add them. I have done so on both the standby and the primary, but didn’t include the output here.
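For completeness, SRLs are added per thread, with one more group per thread than you have online redo log groups, and sized exactly like the online logs. A sketch only; the group numbers and the 50M size are chosen purely for illustration and must match your environment:

```sql
-- example only: size must match your online redo logs
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 11 SIZE 50M;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 12 SIZE 50M;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 1 GROUP 13 SIZE 50M;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 2 GROUP 14 SIZE 50M;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 2 GROUP 15 SIZE 50M;
ALTER DATABASE ADD STANDBY LOGFILE THREAD 2 GROUP 16 SIZE 50M;
```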

Switchover

That should be all that’s needed in preparation. Since I’m impatient I wanted to see if the switchover works. First I check the configuration:

DGMGRL> show configuration

Configuration - dgtest

  Protection Mode: MaxPerformance
  Databases:
  RON   - Primary database
    RONDG - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

Success means: ready to switch over. Let’s try:

DGMGRL> switchover to "RONDG"
Performing switchover NOW, please wait...
Operation requires a connection to instance "sby_2" on database "RONDG"
Connecting to instance "sby_2"...
Connected as SYSDBA.
New primary database "RONDG" is opening...
Operation requires startup of instance "pri_2" on database "RON"
Starting instance "pri_2"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "RONDG"
DGMGRL> show configuration

Configuration - dgtest

  Protection Mode: MaxPerformance
  Databases:
  RONDG - Primary database
    RON   - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

Great, so that worked. It was important that the StaticConnectIdentifiers were set correctly. During testing I found that relying on the DGConnectIdentifier did not work when it pointed to the second SCAN listener: the SCAN listeners do not “know” about the statically registered %_DGMGRL services, and you receive the dreaded “listener does not currently know of service requested in connect descriptor” error. I also had to pick a specific node listener for the StaticConnectIdentifier. I wanted to have both instances as potential targets for starting the instance, but that failed too. So in a way I map node 1 to instance sby_2 by pointing the broker to its listener.

Online Relocation

Now on to part two: what happens to the standby when I perform an online relocation of the primary? Let’s try.

[oracle@ron12csbynode1 admin]$ srvctl status database -d rondg
Instance sby_2 is running on node ron12csbynode1
Online relocation: INACTIVE
[oracle@ron12csbynode1 admin]$ srvctl status service -d rondg
Service ron12c is running on instance(s) sby_2

[oracle@ron12csbynode1 admin]$ srvctl relocate database -d rondg -timeout 2 -verbose -node ron12csbynode2
Configuration updated to two instances
Instance sby_1 started
Services relocated
Waiting for up to 2 minutes for instance sby_2 to stop ...
Instance sby_2 stopped
Configuration updated to one instance
[oracle@ron12csbynode1 admin]$

In the meantime I connected to the primary (while it was relocating) and switched a few logs. This works, as you can see in the alert.log of RON, which is the standby database now:

2014-05-23 10:30:16.943000 -04:00
Recovery of Online Redo Log: Thread 1 Group 6 Seq 119 Reading mem 0
  Mem# 0: +DATA/RON/ONLINELOG/group_6.280.846066567
  Mem# 1: +RECO/RON/ONLINELOG/group_6.330.846066593
2014-05-23 10:30:42.897000 -04:00
Archived Log entry 226 added for thread 1 sequence 119 ID 0x64c27dbf dest 1:
2014-05-23 10:30:49.482000 -04:00
Media Recovery Waiting for thread 1 sequence 120 (in transit)
2014-05-23 10:30:52.043000 -04:00
Primary database is in MAXIMUM PERFORMANCE mode
Re-archiving standby log 8 thread 2 sequence 97
2014-05-23 10:31:02.879000 -04:00
Recovery of Online Redo Log: Thread 1 Group 5 Seq 120 Reading mem 0
  Mem# 0: +DATA/RON/ONLINELOG/group_5.279.846066497
  Mem# 1: +RECO/RON/ONLINELOG/group_5.329.846066535
2014-05-23 10:31:12.400000 -04:00
RFS[8]: Assigned to RFS process (PID:13182)
RFS[8]: Selected log 9 for thread 2 sequence 98 dbid 1681008808 branch 838874793
2014-05-23 10:31:14.987000 -04:00
Archived Log entry 227 added for thread 2 sequence 97 ID 0x64c27dbf dest 1:
2014-05-23 10:31:50.408000 -04:00
Media Recovery Waiting for thread 2 sequence 98 (in transit)

2014-05-23 10:32:02.193000 -04:00
RFS[8]: Selected log 8 for thread 2 sequence 99 dbid 1681008808 branch 838874793
2014-05-23 10:32:05.872000 -04:00
Recovery of Online Redo Log: Thread 2 Group 9 Seq 98 Reading mem 0
  Mem# 0: +DATA/RON/ONLINELOG/group_9.283.846066895
  Mem# 1: +RECO/RON/ONLINELOG/group_9.333.846066947

The logs are from two threads: the relocation temporarily converts the single instance database into a cluster database by starting instances on both hosts.

Eventually instance sby_2 stops and causes TNS errors in the standby’s alert.log:

***********************************************************************

Fatal NI connect error 12528, connecting to:
 (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=ron12csby-dgscan.dg.example.com)(PORT=1521))
(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=RONDG_DGB)(INSTANCE_NAME=sby_2)(CID=(PROGRAM=oracle)
(HOST=ron12cprinode1.example.com)(USER=oracle))))

  VERSION INFORMATION:
TNS for Linux: Version 12.1.0.1.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 12.1.0.1.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 12.1.0.1.0 - Production
  Time: 23-MAY-2014 10:34:25
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12564
...

Nothing to be alarmed about: this is normal, and it is reflected in the output of the relocate command.

The good news is that the standby keeps receiving redo:

2014-05-23 10:35:40.756000 -04:00
RFS[9]: Assigned to RFS process (PID:13329)
RFS[9]: Selected log 9 for thread 2 sequence 100 dbid 1681008808 branch 838874793
2014-05-23 10:36:08.256000 -04:00
Archived Log entry 231 added for thread 2 sequence 100 ID 0x64c27dbf dest 1:
2014-05-23 10:36:14.754000 -04:00
Media Recovery Waiting for thread 2 sequence 101 (in transit)
2014-05-23 10:36:15.972000 -04:00
RFS[7]: Opened log for thread 2 sequence 101 dbid 1681008808 branch 838874793
2014-05-23 10:36:20.162000 -04:00
Archived Log entry 232 added for thread 2 sequence 101 rlc 838874793 ID 0x64c27dbf dest 2:
2014-05-23 10:36:29.656000 -04:00
Media Recovery Log +RECO/RON/ARCHIVELOG/2014_05_23/thread_2_seq_101.393.848313375

And after instance 2 is shut down, you can see redo from thread 1 being transferred.

2014-05-23 10:46:58.257000 -04:00
RFS[6]: Selected log 5 for thread 1 sequence 122 dbid 1681008808 branch 838874793
Archived Log entry 233 added for thread 1 sequence 121 ID 0x64c27dbf dest 1:
2014-05-23 10:47:00.479000 -04:00
Media Recovery Waiting for thread 1 sequence 122 (in transit)
2014-05-23 10:47:03.686000 -04:00
Recovery of Online Redo Log: Thread 1 Group 5 Seq 122 Reading mem 0
  Mem# 0: +DATA/RON/ONLINELOG/group_5.279.846066497
  Mem# 1: +RECO/RON/ONLINELOG/group_5.329.846066535

If you are unsure about the thread-to-instance mapping, check v$thread. This example is from the primary:

SQL> select thread#, status, enabled, instance from v$thread;

   THREAD# STATUS ENABLED  INSTANCE
---------- ------ -------- ------------------------------
         1 OPEN   PUBLIC   sby_1
         2 CLOSED PUBLIC   sby_2

In summary my test showed that an online relocation does not seem to cause trouble for the standby database. I still need to run a benchmark such as Swingbench against it to see how the online relocation behaves when the system is under load; the database was pretty much idle during this test.

Amending the Broker configuration

Circling back to the introduction: what happens to the Data Guard broker configuration after the database has (albeit briefly) been a RAC database? Here is the output from the configuration after the first relocation:

DGMGRL> show database verbose "RONDG"

Database - RONDG

  Role:              PRIMARY
  Intended State:    TRANSPORT-ON
  Instance(s):
    sby_1
    sby_2

  Properties:
    DGConnectIdentifier             = 'ronsbynet2'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'ASYNC'
    RedoRoutes                      = ''
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyLagThreshold               = '0'
    TransportLagThreshold           = '0'
    TransportDisconnectedThreshold  = '30'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '4'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = ''
    LogFileNameConvert              = ''
    FastStartFailoverTarget         = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    StaticConnectIdentifier(*)
    StandbyArchiveLocation(*)
    AlternateLocation(*)
    LogArchiveTrace(*)
    LogArchiveFormat(*)
    TopWaitEvents(*)
    (*) - Please check specific instance for the property value

Database Status:
SUCCESS

Notice how the second instance (sby_1) has been added. The first question I had was: what has this done to the connection identifiers? As expected, the identifier for the newly added instance points to the wrong network:

DGMGRL> show instance verbose sby_1;

Instance 'sby_1' of database 'RONDG'

  Host Name: ron12csbynode2.example.com
  PFILE:
  Properties:
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.60)(PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=rondg_DGMGRL)(INSTANCE_NAME=sby_1)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Instance Status:
SUCCESS

So that needs to be changed as well. It would make sense to systematically perform a relocation as part of the build of the Data Guard environment, on both primary and standby, to register all instances and update their properties. Not that it is a big issue; the switchover is most likely not going to fail, but you will have to start the database manually if the connection information is wrong.
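The fix is the same edit as before. Assuming node 2’s dedicated Data Guard address is 192.168.102.60 on port 1522 (an assumption based on the addressing scheme used throughout this series; verify against your listener configuration), it would look like this:

```
DGMGRL> edit instance sby_1 set property StaticConnectIdentifier =
> '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.102.60)(PORT=1522))
> (CONNECT_DATA=(SERVICE_NAME=rondg_DGMGRL)(INSTANCE_NAME=sby_1)(SERVER=DEDICATED)))';
```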

Summary

It’s been quite a journey! This article series explained how to add a second SCAN to the environment, including all the supporting infrastructure such as VIPs and listeners. You then saw how to duplicate a database for Data Guard, and finally how to create the broker configuration and test switchover and online relocation.

Posted in 12c Release 1, Data Guard, Linux | Leave a Comment »

Mysterious new Oracle compression type

Posted by Martin Bach on May 13, 2014

As part of our research for our joint presentation at OGH.nl and E4 my colleague Frits Hoogland made an interesting discovery. He verified the compression format for updated rows previously compressed using Hybrid Columnar Compression (HCC). In that particular example we researched HCC data on Exadata storage. As you may know, reading HCC data in its compressed form is limited to Exadata, Pillar Axiom, and the ZFS Storage Appliance (ZFSSA).

Background

So far I had taken it for granted that Oracle HCC compressed data can’t be updated in place. As has been shown in many posts and books (for example by Kerry Osborne), an update of a row that has been compressed using HCC triggers the move of that particular row to a new block, which is flagged for OLTP compression. OLTP compression is a variation of the BASIC compression, or de-duplication, that Jonathan Lewis expertly covered in a four part series hosted by All Things Oracle. The link to the first part of it can be found here. Even if you think you don’t need to bother with BASIC compression you are mistaken: it is still very relevant, especially in the Exadata/HCC context. Before reading on about block changes etc. it might be a good idea to refer to Kerry’s blog post mentioned before. If you want to really dive into the matter I suggest two more posts as reference:

This almost starts to feel like a scientific article where you can’t see the text for all the footnotes but credit is due!

An interesting discovery

So as it happened, Frits discovered a new compression type when updating rows in an HCC compressed table on 11.2.0.3.22. My examples that follow are on 12.1.0.1.0. Consider this table:

  
SQL> CREATE TABLE t1_wp
  2  enable row movement
  3  column store
  4  compress for query low
  5  AS
  6  WITH v1 AS
  7  (SELECT rownum n FROM dual CONNECT BY level <= 10000)
  8  SELECT  rownum id,
  9    rpad(rownum,500) t_pad,
 10    TRUNC(sysdate) - 180 + TRUNC((rownum-1)/3)/86400 date_created,
 11    TRUNC(sysdate) - 180 + TRUNC((rownum-1)/3)/86400 + dbms_random.value(1800, 86400)/86400 date_completed,
 12    CASE
 13	 WHEN mod(rownum,100000) = 0
 14	 THEN CAST('RARE' AS VARCHAR2(12))
 15	 WHEN mod(rownum,10000) = 0
 16	 THEN CAST('FAIRLY RARE' AS VARCHAR2(12))
 17	 WHEN mod(rownum,1000) = 0
 18	 THEN CAST('NOT RARE' AS VARCHAR2(12))
 19	 WHEN mod(rownum,100) = 0
 20	 THEN CAST('COMMON' AS	 VARCHAR2(12))
 21	 ELSE CAST('THE REST' AS VARCHAR2(12))
 22    END state,
 23    TO_CHAR(mod(rownum, 4)) spcol
 24  FROM v1,
 25    v1
 26  WHERE rownum <= 1e6;

This basically creates 1 million rows with HCC “compress for query low”. I chose Query Low because it is the easiest to work with, as its compression unit size is roughly 32k. The compressed table is quite small:

SQL> select bytes/power(1024,2) m, blocks from user_segments
  2  where segment_name = 'T1_WP';

         M     BLOCKS
---------- ----------
        16       2048

Now what I am after is an update on a specific CU. I want to know the minimum and maximum ID per block:

select min(id),max(id),blockn from (
 select id,DBMS_ROWID.ROWID_RELATIVE_FNO(rowid) , 
   DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) as blockn 
   from martin.t1_wp
) group by blockn order by blockn;

   MIN(ID)    MAX(ID)	  BLOCKN
---------- ---------- ----------
         1       1964     262147
      1965       4194     262150
      4195       6424     262154
      6425       7148     262158
      7149       9375     262161
      9376      11598     262165
     11599      13795     262169
     13796      14970     262173
     14971      17127     262177
     17128      19270     262181
...

And so forth, for a total of 501 rows.

What happens now if I update IDs 1 to 10? Before I answer that question I’d like to show you the compression type of those rows, followed by a dump of block 262147:

SQL> select rowid, dbms_compression.get_compression_type(user, 'T1_WP', rowid) compressionType 
  2  from t1_wp where id between 1 and 10;

ROWID              COMPRESSIONTYPE
------------------ ---------------
AAAO7sAALAABAADAAA               8
AAAO7sAALAABAADAAB               8
AAAO7sAALAABAADAAC               8
AAAO7sAALAABAADAAD               8
AAAO7sAALAABAADAAE               8
AAAO7sAALAABAADAAF               8
AAAO7sAALAABAADAAG               8
AAAO7sAALAABAADAAH               8
AAAO7sAALAABAADAAI               8
AAAO7sAALAABAADAAJ               8

10 rows selected.

As you can see the compression type is 8, or Query Low. Let’s look at the dump:

Block header dump:  0x02c40003
 Object id on Block? Y
 seg/obj: 0xeeec  csc: 0x99a.535752ba  itc: 3  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x2c40000 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0xffff.000.00000000  0x00000000.0000.00  C---    0  scn 0x099a.535752ba
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x03   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
bdba: 0x02c40003
data_block_dump,data header at 0x7f50921c447c
===============
tsiz: 0x1f80
hsiz: 0x1c
pbl: 0x7f50921c447c
     76543210
flag=-0------
ntab=1
nrow=1
frre=-1
fsbo=0x1c
fseo=0x30
avsp=0x14
tosp=0x14
        r0_9ir2=0x0
        mec_kdbh9ir2=0x0
                      76543210
        shcf_kdbh9ir2=----------
                  76543210
        flag_9ir2=--R-----      Archive compression: Y
                fcls_9ir2[0]={ }
0x16:pti[0]     nrow=1  offs=0
0x1a:pri[0]     offs=0x30
block_row_dump:
tab 0, row 0, @0x30
tl: 8016 fb: --H-F--N lb: 0x0  cc: 1
nrid:  0x02c40004.0
col  0: [8004]
Compression level: 01 (Query Low)
 Length of CU row: 8004
kdzhrh: ------PC- CBLK: 3 Start Slot: 00
 NUMP: 03
 PNUM: 00 POFF: 7964 PRID: 0x02c40004.0
 PNUM: 01 POFF: 15980 PRID: 0x02c40005.0
 PNUM: 02 POFF: 23996 PRID: 0x02c40006.0
*---------
CU header:
CU version: 0   CU magic number: 0x4b445a30
CU checksum: 0x147384ec
CU total length: 28107
CU flags: NC-U-CRD-OP
ncols: 6
nrows: 1964
algo: 0
CU decomp length: 27815   len/value length: 1048622
row pieces per row: 1
num deleted rows: 0
START_CU:
 00 00 1f 44 0f 03 00 00 00 03 00 00 1f 1c 02 c4 00 04 00 00 00 00 3e 6c 02
 c4 00 05 00 00 00 00 5d bc 02 c4 00 06 00 00 00 4b 44 5a 30 ec 84 73 14 00
 00 6d cb eb 06 00 06 07 ac 00 10 00 2e 01 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
...

And now for the update, which is a simple “update t1_wp set spcol = 'UPDATED' where id between 1 and 10”, affecting exactly 10 rows.

Checking the ROWIDs again:

SQL> select rowid, dbms_compression.get_compression_type(user, 'T1_WP', rowid) compressionType
  2  from t1_wp where id between 1 and 10;

ROWID		   COMPRESSIONTYPE
------------------ ---------------
AAAO7sAALAABAfeAAA               1
AAAO7sAALAABAfeAAB               1
AAAO7sAALAABAfeAAC               1
AAAO7sAALAABAfeAAD               1
AAAO7sAALAABAfeAAE               1
AAAO7sAALAABAfeAAF               1
AAAO7sAALAABAfeAAG               1
AAAO7sAALAABAfeAAH               1
AAAO7sAALAABAfeAAI               1
AAAO7sAALAABAfeAAJ               1

10 rows selected.

So that’s interesting: the ROWIDs have changed, and the compression type is now reported as 1, meaning no compression. What if I commit?


SQL> commit;

Commit complete

SQL> select rowid, dbms_compression.get_compression_type(user, 'T1_WP', rowid) compressionType
  2  from t1_wp where id between 1 and 10;

ROWID              COMPRESSIONTYPE
------------------ ---------------
AAAO7sAALAABAfeAAA              64
AAAO7sAALAABAfeAAB              64
AAAO7sAALAABAfeAAC              64
AAAO7sAALAABAfeAAD              64
AAAO7sAALAABAfeAAE              64
AAAO7sAALAABAfeAAF              64
AAAO7sAALAABAfeAAG              64
AAAO7sAALAABAfeAAH              64
AAAO7sAALAABAfeAAI              64
AAAO7sAALAABAfeAAJ              64

10 rows selected.
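The numeric values returned by dbms_compression.get_compression_type correspond to constants in the package specification. A sketch to decode them directly in the query; the constant names below are taken from the 12.1 DBMS_COMPRESSION spec, so treat the mapping as an assumption if you are on a different release:

```sql
-- constant values per the 12.1 DBMS_COMPRESSION package specification
SELECT rowid,
       DECODE(dbms_compression.get_compression_type(user, 'T1_WP', rowid),
              1,  'COMP_NOCOMPRESS',
              8,  'COMP_QUERY_LOW',
              64, 'COMP_BLOCK',
              'OTHER') AS comptype
FROM   t1_wp
WHERE  id BETWEEN 1 AND 10;
```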

There it is, the new and mysterious compression type 64. To my knowledge it is not documented in 11.2, but it is in 12.1, as DBMS_COMPRESSION.COMP_BLOCK. What does it look like? Picking the ROWID for id = 10 (AAAO7sAALAABAfeAAJ):

SQL> select id from t1_wp where rowid = 'AAAO7sAALAABAfeAAJ';

        ID
----------
        10

SQL> alter system dump datafile 7 block 46401502;

Block header dump:  0x02c407de
 Object id on Block? Y
 seg/obj: 0xeeec  csc: 0x99a.53575da9  itc: 2  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x2c40781 ver: 0x01 opc: 0
     inc: 0  exflg: 0

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000a.008.00038f01  0x00000f5d.720e.0a  --U-   15  fsc 0x0000.53576117
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
bdba: 0x02c407de
data_block_dump,data header at 0x7fbc909ea264
===============
tsiz: 0x1f98
hsiz: 0x4c
pbl: 0x7fbc909ea264
     76543210
flag=-0----X-
ntab=2
nrow=20
frre=-1
fsbo=0x4c
fseo=0xbe
avsp=0x72
tosp=0x72
        r0_9ir2=0x1
        mec_kdbh9ir2=0x1
                      76543210
        shcf_kdbh9ir2=----------
                  76543210
        flag_9ir2=--R-LNOC      Archive compression: N
                fcls_9ir2[0]={ }
                perm_9ir2[6]={ 2 3 4 5 0 1 }
0x1c:pti[0]     nrow=5  offs=0
0x20:pti[1]     nrow=15 offs=5
0x24:pri[0]     offs=0x1f5c
0x26:pri[1]     offs=0x1f70
...
block_row_dump:
tab 0, row 0, @0x1f5c
tl: 20 fb: --H-FL-- lb: 0x0  cc: 2
col  0: [ 8]  54 48 45 20 52 45 53 54
col  1: [ 7]  55 50 44 41 54 45 44
bindmp: 00 0d 02 d0 54 48 45 20 52 45 53 54 cf 55 50 44 41 54 45 44
tab 0, row 1, @0x1f70
tl: 10 fb: --H-FL-- lb: 0x0  cc: 1
col  0: [ 7]  78 71 0b 0d 01 01 04
bindmp: 00 03 cf 78 71 0b 0d 01 01 04
...
tab 0, row 4, @0x1f8e
tl: 10 fb: --H-FL-- lb: 0x0  cc: 1
col  0: [ 7]  78 71 0b 0d 01 01 01
bindmp: 00 03 cf 78 71 0b 0d 01 01 01
tab 1, row 0, @0x1d55
tl: 519 fb: --H-FL-- lb: 0x1  cc: 6
col  0: [ 8]  54 48 45 20 52 45 53 54
col  1: [ 7]  55 50 44 41 54 45 44
col  2: [ 2]  c1 02
col  3: [500]
 31 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
...
col  4: [ 7]  78 71 0b 0d 01 01 01
col  5: [ 7]  78 71 0b 0d 0b 09 01
bindmp: 2c 01 05 00 ca c1 02 fa 01 f4 31 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 04 cf 78
 71 0b 0d 0b 09 01
tab 1, row 1, @0x1b4e
....
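As a side note, the block address can be cross-checked by decoding the ROWID itself. Here is a quick Python sketch of the base-64 layout of an extended ROWID (data object#, relative file#, block#, row#); inside the database DBMS_ROWID would of course do the same job:

```python
# Decode an Oracle extended ROWID (18 characters, base-64 encoded with
# the alphabet A-Za-z0-9+/) into its components: OOOOOOFFFBBBBBBRRR,
# i.e. data object#, relative file#, block#, row#.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def b64decode_int(chars):
    value = 0
    for c in chars:
        value = value * 64 + ALPHABET.index(c)
    return value

def decode_rowid(rowid):
    return {
        "object": b64decode_int(rowid[0:6]),   # data object id
        "rfile":  b64decode_int(rowid[6:9]),   # relative file number
        "block":  b64decode_int(rowid[9:15]),  # block number
        "row":    b64decode_int(rowid[15:18]), # row number within the block
    }

print(decode_rowid("AAAO7sAALAABAfeAAJ"))
# {'object': 61164, 'rfile': 11, 'block': 264158, 'row': 9}
```

The object id 61164 is 0xeeec, matching the seg/obj entry in the block header above, and relative file 11, block 264158 is exactly what the bdba 0x02c407de decomposes into (the top 10 bits of the data block address are the relative file number).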

I have not come to a conclusion about this yet. There are similarities to the block dump of a BASIC-compressed block in that there are two tables (ntab=2), and the first one looks a bit like a symbol table to me. It also uses the familiar perm_9ir2 field to let us know the column order has changed. The flag is new though, or at least to me. The only reference I could find to flag=-0----X- was in the bug database, interestingly related to an OLTP-compressed block. The tl of the first row of tab 1 does not compute either, something I have only seen with BASIC/OLTP-compressed data blocks.
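One thing in the dump that does decode cleanly is the single-column rows of tab 0: at 7 bytes each, starting 78 71, they look very much like DATE values in Oracle’s internal format (century+100, year+100, month, day, hour+1, minute+1, second+1). A small Python sketch, assuming that interpretation is correct:

```python
# Decode Oracle's 7-byte internal DATE representation:
# century+100, year-in-century+100, month, day, hour+1, minute+1, second+1.
def decode_oracle_date(hex_string):
    b = [int(x, 16) for x in hex_string.split()]
    year = (b[0] - 100) * 100 + (b[1] - 100)
    return (year, b[2], b[3], b[4] - 1, b[5] - 1, b[6] - 1)

# col 0 of tab 0, row 1 in the dump above
print(decode_oracle_date("78 71 0b 0d 01 01 04"))
# (2013, 11, 13, 0, 0, 3)
```

Applied to the other 7-byte values in tab 0 this yields timestamps a few seconds apart, which would fit a symbol table of repeated date column values.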

The problem is that I can’t use the interpretation of BASIC compression for the bindmp. Take the bindmp of tab 1, row 0:

bindmp: 2c 01 05 00 ca c1 02 fa 01 f4 31 20 20 20 20 20 20 20 20 20 20 20 20…

Going by the rules I know, the first byte is the flag byte (2c = --H-FL--), followed by the lock byte (01), followed by the number of columns stored at this location (05); after that, the way of interpreting the bindmp for OLTP/BASIC-compressed data falls apart. What is interesting, though, is that there seems to be a pattern:

> grep bindmp orcl_ora_30181.trc | while read -r line; do echo "${line:0:64}"; done
bindmp: 00 0d 02 d0 54 48 45 20 52 45 53 54 cf 55 50 44 41 54 45
bindmp: 00 03 cf 78 71 0b 0d 01 01 04
bindmp: 00 03 cf 78 71 0b 0d 01 01 03
bindmp: 00 03 cf 78 71 0b 0d 01 01 02
bindmp: 00 03 cf 78 71 0b 0d 01 01 01
bindmp: 2c 01 05 00 ca c1 02 fa 01 f4 31 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 03 fa 01 f4 32 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 04 fa 01 f4 33 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 05 fa 01 f4 34 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 06 fa 01 f4 35 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 07 fa 01 f4 36 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 08 fa 01 f4 37 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 09 fa 01 f4 38 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0a fa 01 f4 39 20 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0b fa 01 f4 31 30 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0c fa 01 f4 31 31 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0d fa 01 f4 31 32 20 20 20 20 20 20 20
bindmp: 2c 01 05 00 ca c1 0e fa 01 f4 31 33 20 20 20 20 20 20 20
bindmp: 2c 01 06 d0 54 48 45 20 52 45 53 54 cf 55 50 44 41 54 45
bindmp: 2c 01 06 d0 54 48 45 20 52 45 53 54 cf 55 50 44 41 54 45

In another part of this article I will try to work out what exactly is represented by the bindmp.

Posted in 12c Release 1, Exadata

RAC 12c enhancements: adding an additional SCAN-part 3

Posted by Martin Bach on May 9, 2014

Travel time can be writing time, and while sitting in the departure lounge waiting for my flight I am using the opportunity to add part 3 of the series. In the previous two parts you could read how to add a second SCAN and the necessary infrastructure to the cluster. Now it is time to create the standby database. It is assumed that a RAC One Node database has already been created on the primary cluster and is running in archivelog mode.

Static Registration with the Listeners

The first step is to statically register the databases with their respective listeners. The example below shows the primary cluster first, followed by the standby. The registration is needed during switchover operations, when the broker restarts databases as needed. Without static registration you cannot connect to a database remotely while it is shut down.

# static registration on the primary cluster-$GRID_HOME/network/admin/listener.ora
SID_LIST_LISTENER_DG=
  (SID_LIST=
    (SID_DESC=
      (GLOBAL_DBNAME=RON)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.1/dbhome_1)
      (SID_NAME=pri_1)
    )
    (SID_DESC=
      (GLOBAL_DBNAME=RON)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.1/dbhome_1)
      (SID_NAME=pri_2)
    )

    (SID_DESC=
      (GLOBAL_DBNAME=RON_DGMGRL)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.1/dbhome_1)
      (SID_NAME=pri_1)
    )
    (SID_DESC=
      (GLOBAL_DBNAME=RON_DGMGRL)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.1/dbhome_1)
      (SID_NAME=pri_2)
    )
  )

# static registration on the standby cluster-$GRID_HOME/network/admin/listener.ora
SID_LIST_LISTENER_DG=
  (SID_LIST=
    (SID_DESC=
      (GLOBAL_DBNAME=rondg)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.1/dbhome_1)
      (SID_NAME=sby_1)
    )
    (SID_DESC=
      (GLOBAL_DBNAME=rondg)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.1/dbhome_1)
      (SID_NAME=sby_2)
    )

    (SID_DESC=
      (GLOBAL_DBNAME=rondg_DGMGRL)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.1/dbhome_1)
      (SID_NAME=sby_1)
    )
    (SID_DESC=
      (GLOBAL_DBNAME=rondg_DGMGRL)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.1/dbhome_1)
      (SID_NAME=sby_2)
    )
  )

The static listener registration has to be performed on each cluster node, primary and standby cluster alike.

Database Duplication: Creating the Physical Standby

With the listener registration complete you can start the duplication. I am using backup-based duplication here; you could equally go for duplication from the active database. To be able to perform the backup-based duplication you obviously need a backup first. I created it on the primary:

RMAN> backup incremental level 0 database format '/u01/oraback/%U';

Starting backup at 28-APR-14
using channel ORA_DISK_1
using channel ORA_DISK_2
channel ORA_DISK_1: starting incremental level 0 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00003 name=+DATA/RON/DATAFILE/sysaux.257.838874315
input datafile file number=00006 name=+DATA/RON/DATAFILE/users.259.838874605
channel ORA_DISK_1: starting piece 1 at 28-APR-14
channel ORA_DISK_2: starting incremental level 0 datafile backup set
channel ORA_DISK_2: specifying datafile(s) in backup set
input datafile file number=00001 name=+DATA/RON/DATAFILE/system.258.838874441
input datafile file number=00004 name=+DATA/RON/DATAFILE/undotbs1.260.838874611
input datafile file number=00008 name=+DATA/RON/DATAFILE/undotbs2.268.838875559
channel ORA_DISK_2: starting piece 1 at 28-APR-14
channel ORA_DISK_1: finished piece 1 at 28-APR-14
piece handle=/u01/oraback/0pp6r8f8_1_1 tag=TAG20140428T044110 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:48
channel ORA_DISK_1: starting incremental level 0 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00010 name=+DATA/RON/DD7D8C1D4C234B38E04325AAE80AF577/DATAFILE/sysaux.272.838876477
input datafile file number=00011 name=+DATA/RON/DD7D8C1D4C234B38E04325AAE80AF577/DATAFILE/users.275.838876481
channel ORA_DISK_1: starting piece 1 at 28-APR-14
channel ORA_DISK_2: finished piece 1 at 28-APR-14
piece handle=/u01/oraback/0qp6r8f8_1_1 tag=TAG20140428T044110 comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:53
channel ORA_DISK_2: starting incremental level 0 datafile backup set
channel ORA_DISK_2: specifying datafile(s) in backup set
input datafile file number=00012 name=+DATA/RON/DD7D8C1D4C234B38E04325AAE80AF577/DATAFILE/example.273.838876479
input datafile file number=00009 name=+DATA/RON/DD7D8C1D4C234B38E04325AAE80AF577/DATAFILE/system.274.838876479
channel ORA_DISK_2: starting piece 1 at 28-APR-14
channel ORA_DISK_1: finished piece 1 at 28-APR-14
piece handle=/u01/oraback/0rp6r8gp_1_1 tag=TAG20140428T044110 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:29
channel ORA_DISK_1: starting incremental level 0 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00007 name=+DATA/RON/DD7C48AA5A4404A2E04325AAE80A403C/DATAFILE/sysaux.265.838874953
channel ORA_DISK_1: starting piece 1 at 28-APR-14
channel ORA_DISK_2: finished piece 1 at 28-APR-14
piece handle=/u01/oraback/0sp6r8h1_1_1 tag=TAG20140428T044110 comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:25
channel ORA_DISK_2: starting incremental level 0 datafile backup set
channel ORA_DISK_2: specifying datafile(s) in backup set
input datafile file number=00005 name=+DATA/RON/DD7C48AA5A4404A2E04325AAE80A403C/DATAFILE/system.266.838874953
channel ORA_DISK_2: starting piece 1 at 28-APR-14
...
channel ORA_DISK_2: finished piece 1 at 28-APR-14
piece handle=/u01/oraback/0up6r8ht_1_1 tag=TAG20140428T044110 comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:00:25
Finished backup at 28-APR-14

Starting Control File and SPFILE Autobackup at 28-APR-14
piece handle=+RECO/RON/AUTOBACKUP/2014_04_28/s_846045788.270.846045815 comment=NONE
Finished Control File and SPFILE Autobackup at 28-APR-14

RMAN> backup current controlfile for standby;

Starting backup at 28-APR-14
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=282 instance=pri_1 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=43 instance=pri_1 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including standby control file in backup set
channel ORA_DISK_1: starting piece 1 at 28-APR-14
channel ORA_DISK_1: finished piece 1 at 28-APR-14
piece handle=/u01/oraback/10p6rafh_1_1 tag=TAG20140428T051528 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 28-APR-14

Starting Control File and SPFILE Autobackup at 28-APR-14
piece handle=+RECO/RON/AUTOBACKUP/2014_04_28/s_846047742.271.846047749 comment=NONE
Finished Control File and SPFILE Autobackup at 28-APR-14

RMAN>

Don’t forget to back up the current controlfile for a standby database: that’s immensely important for backup-based duplication, but it is taken care of automatically when duplicating “from active database”. Once the backup is created, ensure it is available on the standby host. In my case I am using the same backup location on the primary as well as on the standby. Being a good citizen, and because I want to test active duplication from backup later, I created a new TNS entry in /u01/app/oracle/product/12.1.0.1/dbhome_1/network/admin/tnsnames.ora:

SETUP =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.102.58)(PORT = 1522))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = rondg)
      (INSTANCE_NAME = sby_2)
    )
  )

This is needed since there won’t be cross-registration of the new SCAN listener with listener_dg:

[oracle@ron12csbynode1 ~]$ lsnrctl status listener_dg

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 28-APR-2014 04:57:13

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_DG)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_DG
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                23-APR-2014 10:54:58
Uptime                    4 days 18 hr. 2 min. 15 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0.1/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/ron12csbynode1/listener_dg/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_DG)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.102.58)(PORT=1522)))
Services Summary...
Service "rondg" has 2 instance(s).
  Instance "sby_1", status UNKNOWN, has 1 handler(s) for this service...
  Instance "sby_2", status UNKNOWN, has 1 handler(s) for this service...
Service "rondg_DGMGRL" has 2 instance(s).
  Instance "sby_1", status UNKNOWN, has 1 handler(s) for this service...
  Instance "sby_2", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
[oracle@ron12csbynode1 ~]$

If you are not using active duplication you also need to take care of the password file. After a lot of trial and error it became apparent that a) you can’t have the password file in ASM and b) you need to copy the original password file from the primary database. If the source password file is in ASM you can use asmcmd cp +data/RON/orapwRON.ora /tmp to copy it out of ASM. When I tried to create a password file locally I could not connect: every time I tried I got an ORA-1033 “ORACLE initialization or shutdown in progress”. For the purpose of the duplication I am using node 1 on the standby cluster; later on you can extend the database to both nodes.

You will also need an initialisation parameter file to start the standby instance; here is what I used:

[oracle@ron12csbynode1 dbs]$ cat initsby.ora
*.archive_lag_target=0
*.audit_file_dest='/u01/app/oracle/admin/rondg/adump'
*.audit_trail='db'
*.cluster_database=true
*.compatible='12.1.0.0.0'
*.db_block_size=8192
*.db_create_file_dest='+DATA'
*.db_domain=''
*.db_name='ron'
*.db_unique_name='rondg'
*.db_recovery_file_dest='+RECO'
*.db_recovery_file_dest_size=5025m
*.dg_broker_start=TRUE
*.diagnostic_dest='/u01/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=priXDB)'
*.enable_pluggable_database=true
*.log_archive_dest_1='location=USE_DB_RECOVERY_FILE_DEST','valid_for=(ALL_LOGFILES, ALL_ROLES)'
*.nls_language='ENGLISH'
*.nls_territory='UNITED KINGDOM'
*.open_cursors=300
*.pga_aggregate_target=512m
*.processes=300
*.remote_login_passwordfile='exclusive'
*.sga_target=1024m
*.standby_file_management='auto'

Since the SIDs will be sby_1 and sby_2 I created symlinks for these, pointing to the pfile. With everything in place it was time to duplicate!
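For reference, the symlinks from the previous step might look like the sketch below. The file names assume SIDs sby_1 and sby_2 with a pfile called initsby.ora; a scratch directory stands in for $ORACLE_HOME/dbs so the example is self-contained:

```shell
# Sketch: init<SID>.ora symlinks pointing at the shared pfile.
# Assumes SIDs sby_1 and sby_2; a temporary directory is used here
# in place of $ORACLE_HOME/dbs.
dbs=$(mktemp -d)
touch "$dbs/initsby.ora"               # the pfile shown above
ln -s initsby.ora "$dbs/initsby_1.ora" # read when starting SID sby_1
ln -s initsby.ora "$dbs/initsby_2.ora" # read when starting SID sby_2
ls -l "$dbs"
```

Either instance then finds its init<SID>.ora in $ORACLE_HOME/dbs and resolves it to the single shared pfile.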

[oracle@ron12cprinode1 oraback]$ rman target sys/secret@ronprinet2 auxiliary sys/secret@setup

Recovery Manager: Release 12.1.0.1.0 - Production on Mon Apr 28 05:14:35 2014

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

connected to target database: RON (DBID=1681008808)
connected to auxiliary database: RON (not mounted)

RMAN> duplicate target database for standby;

...

contents of Memory Script:
{
   switch clone datafile all;
}
executing Memory Script

datafile 1 switched to datafile copy
input datafile copy RECID=18 STAMP=846048497 file name=+DATA/RONDG/DATAFILE/system.271.846047875
datafile 3 switched to datafile copy
input datafile copy RECID=19 STAMP=846048500 file name=+DATA/RONDG/DATAFILE/sysaux.272.846047875
datafile 4 switched to datafile copy
input datafile copy RECID=20 STAMP=846048502 file name=+DATA/RONDG/DATAFILE/undotbs1.270.846047877
datafile 5 switched to datafile copy
input datafile copy RECID=21 STAMP=846048503 file name=+DATA/RONDG/F1CEE6116436678FE0433264A8C016E9/DATAFILE/system.278.846048325
datafile 6 switched to datafile copy
input datafile copy RECID=22 STAMP=846048505 file name=+DATA/RONDG/DATAFILE/users.269.846047877
datafile 7 switched to datafile copy
input datafile copy RECID=23 STAMP=846048506 file name=+DATA/RONDG/F1CEE6116436678FE0433264A8C016E9/DATAFILE/sysaux.277.846048381
datafile 8 switched to datafile copy
input datafile copy RECID=24 STAMP=846048508 file name=+DATA/RONDG/DATAFILE/undotbs2.256.846047879
datafile 9 switched to datafile copy
input datafile copy RECID=25 STAMP=846048510 file name=+DATA/RONDG/F1CF406751F56F3EE0433264A8C0E61D/DATAFILE/system.275.846048145
datafile 10 switched to datafile copy
input datafile copy RECID=26 STAMP=846048512 file name=+DATA/RONDG/F1CF406751F56F3EE0433264A8C0E61D/DATAFILE/sysaux.274.846048189
datafile 11 switched to datafile copy
input datafile copy RECID=27 STAMP=846048513 file name=+DATA/RONDG/F1CF406751F56F3EE0433264A8C0E61D/DATAFILE/users.257.846048191
datafile 12 switched to datafile copy
input datafile copy RECID=28 STAMP=846048515 file name=+DATA/RONDG/F1CF406751F56F3EE0433264A8C0E61D/DATAFILE/example.276.846048143
Finished Duplicate Db at 28-APR-14

RMAN>

After the duplication has finished, ensure you switch to using an SPFILE instead of the pfile. You also need to add *.control_files to the (s)pfile. You could also create the spfile in ASM, like I did here.

Registration with Clusterware

With the duplication complete you can register the database and a service with Clusterware:

[oracle@ron12csbynode1 ~]$ srvctl add database -d rondg -oraclehome /u01/app/oracle/product/12.1.0.1/dbhome_1 \
> -dbtype RACONENODE -server ron12csbynode1,ron12csbynode2 -instance sby -spfile '+DATA/RONDG/spfilesby.ora' \
> -role PHYSICAL_STANDBY -startoption MOUNT -diskgroup DATA,RECO

[oracle@ron12csbynode1 ~]$ srvctl add service -d rondg -s ron12c

You don’t need to start that service now; it will be started when the database is opened:

[oracle@ron12csbynode1 ~]$ crsctl stat res ora.rondg.ron12c.svc -p
NAME=ora.rondg.ron12c.svc
TYPE=ora.service.type
...
START_DEPENDENCIES=hard(ora.rondg.db,type:ora.cluster_vip_net1.type) \
weak(type:ora.listener.type) dispersion(type:ora.service.type) \
pullup(type:ora.cluster_vip_net1.type) pullup:always(ora.rondg.db)
...

Done for now

This concludes the third part of this article. We are getting closer! There is a primary and a standby database now, and all that remains to be done is the creation of the Data Guard configuration to start log shipping. You can read about that in part 4. The final part will also include demonstrations of Data Guard behaviour when you relocate the instance from node 1 to node 2. I will also show you how to perform a switchover operation.

Posted in 12c Release 1, Data Guard, Linux

RAC 12c enhancements: adding an additional SCAN-part 2

Posted by Martin Bach on April 28, 2014

In the first part of this article you could read how to add an additional network resource, additional VIPs and a SCAN to a 12.1.0.1.2 cluster. In this part I hope to show you the next steps, such as adding the SCAN listeners and other resources.

New SCAN listener

With the second SCAN added it is time to add the next set of SCAN listeners. This is really simple, and here is the code to add them:

[oracle@ron12cprinode1 ~]$ srvctl add scan_listener -netnum 2 -listener dgscanlsnr

After starting the SCAN listeners on network 2, I can see they are indeed working correctly:

[oracle@ron12cprinode1 ~]$ srvctl start scan_listener -k 2

[oracle@ron12cprinode1 ~]$ srvctl config scan_listener -k 2
SCAN Listener DGSCANLSNR_SCAN1_NET2 exists. Port: TCP:1521
Registration invited nodes:
Registration invited subnets:
SCAN Listener DGSCANLSNR_SCAN2_NET2 exists. Port: TCP:1521
Registration invited nodes:
Registration invited subnets:
SCAN Listener DGSCANLSNR_SCAN3_NET2 exists. Port: TCP:1521
Registration invited nodes:
Registration invited subnets:
[oracle@ron12cprinode1 ~]$

This step needs to be repeated on the second cluster as well. As it’s more or less the same, I haven’t repeated the output here.

Now it’s getting slightly busier in the infrastructure! In between the various tasks I verified the status of Clusterware. At this point I had additional VIPs, a new SCAN, and new SCAN listeners.

crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
ora.OCR.dg
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
ora.RECO.dg
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
ora.asm
               ONLINE  ONLINE       ron12cprinode1           Started,STABLE
               ONLINE  ONLINE       ron12cprinode2           Started,STABLE
ora.net1.network
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
ora.net2.network
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
ora.ons
               ONLINE  ONLINE       ron12cprinode1           STABLE
               ONLINE  ONLINE       ron12cprinode2           STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.DGSCANLSNR_SCAN1_NET2.lsnr
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.DGSCANLSNR_SCAN2_NET2.lsnr
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.DGSCANLSNR_SCAN3_NET2.lsnr
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.cvu
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.oc4j
      1        OFFLINE OFFLINE                               STABLE
ora.ron.db
      1        ONLINE  ONLINE       ron12cprinode1           Open,STABLE
ora.ron.ron12c.svc
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.ron12cprinode1.vip
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.ron12cprinode1_2.vip
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.ron12cprinode2.vip
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.ron12cprinode2_2.vip
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.scan1_net2.vip
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.scan2_net2.vip
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       ron12cprinode1           STABLE
ora.scan3_net2.vip
      1        ONLINE  ONLINE       ron12cprinode2           STABLE
--------------------------------------------------------------------------------

Additional Node Listeners

Now I need another set of listeners on the 2nd network. You can use netca for this, but it’s far easier to add it on the command line.

I went ahead and used port 1522 for this new set of listeners to remove some of the ambiguity of having multiple listeners. There is no technical need to do so: since the listeners listen on different networks (IP addresses), they could all use port 1521.

Here is the code for reference:

[oracle@ron12cprinode1 ~]$ srvctl add listener -listener LISTENER_DG -netnum 2
[oracle@ron12cprinode1 ~]$ srvctl config listener
Name: LISTENER
Network: 1, Owner: oracle
Home: <CRS home>
End points: TCP:1521
Name: LISTENER_DG
Network: 2, Owner: oracle
Home: <CRS home>
End points: TCP:1522

Who needs to start a GUI if this can be done with 1 line ;) Don’t forget to create the node listeners on the standby cluster as well.

Listener (cross) registration

One of the perks of 12c Clusterware is that you don’t need to set listener_networks manually anymore. If you haven’t heard about this particular parameter here is some background.

You have to define listener_networks to allow cross-registration between the cluster listeners (SCAN!) and the node listeners. Following the above example, I need to ensure that the SCAN on network 1, the one created during the installation, registers with the listeners on the same network. I also need to ensure that my Data Guard SCAN listener registers with the DG node listeners. In versions prior to 12c this was done by manually setting the initialisation parameter listener_networks.

In 12c the documentation claims this happens automatically, and the ever-sceptical me didn’t want to believe this at first. But after a restart of the primary database I could see the following entries in the alert.log; these are set by Clusterware, not me:

NOTE: dependency between database ron and diskgroup resource ora.DATA.dg is established
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.51)(PORT=1521))' SCOPE=MEMORY SID='pri_1';
ALTER SYSTEM SET remote_listener=' ron12cpri-scan.example.com:1521' SCOPE=MEMORY SID='pri_1';
ALTER SYSTEM SET listener_networks='(( NAME=net2)(LOCAL_LISTENER=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.102.51)(PORT=1522)))))','((NAME=net2)(REMOTE_LISTENER=ron12cpri-dgscan.dg.example.com:1521))' SCOPE=MEMORY SID='pri_1';
2014-04-28 02:44:46.242000 -04:00
ALTER DATABASE   MOUNT

192.168.100.51 is the IP address of ron12cprinode1-vip, and local_listener defaults to the listener on that interface. The remote listener is also set to the recommended value automatically. The interesting bit lies in the third command, and I want to make sure it all works: 192.168.102.51 is the first VIP on the second (DG) network, and you saw that ron12cpri-dgscan.dg.example.com has just been created. For those of you who remember setting listener_networks in 11.2, this automation is a big relief.

Did it work?

Cross-registration is best checked with the listeners. There are quite a few listeners to check (do a ps -ef | grep tns for fun on one of your cluster nodes):

  • local node listener on network 1
  • (any) SCAN listener on network 1
  • local node listener on network 2
  • (any) SCAN listener on network 2

A quick check to the listeners on the primary node reveals the following:

[oracle@ron12cprinode1 ~]$ lsnrctl status

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 28-APR-2014 02:50:25

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                23-APR-2014 10:55:44
Uptime                    4 days 15 hr. 54 min. 41 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0.1/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/ron12cprinode1/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.100.50)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.100.51)(PORT=1521)))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "pdb1" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "priXDB" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron12c" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron_DGB" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
The command completed successfully
[oracle@ron12cprinode1 ~]$

As was expected, the node listener “knows” about my database instance. What about the initial SCAN listeners?

[oracle@ron12cprinode1 ~]$ lsnrctl status LISTENER_SCAN1

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 28-APR-2014 02:54:31

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_SCAN1
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                23-APR-2014 10:55:41
Uptime                    4 days 15 hr. 58 min. 50 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0.1/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/ron12cprinode1/listener_scan1/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN1)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.100.54)(PORT=1521)))
Services Summary...
Service "pdb1" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "priXDB" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron12c" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron_DGB" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
The command completed successfully
[oracle@ron12cprinode1 ~]$

Looks ok too. The new Data Guard SCAN listener is next:

[oracle@ron12cprinode2 ~]$ lsnrctl status DGSCANLSNR_SCAN1_NET2

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 28-APR-2014 02:54:54

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=DGSCANLSNR_SCAN1_NET2)))
STATUS of the LISTENER
------------------------
Alias                     DGSCANLSNR_SCAN1_NET2
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                28-APR-2014 02:16:46
Uptime                    0 days 0 hr. 38 min. 8 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0.1/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/ron12cprinode2/dgscanlsnr_scan1_net2/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=DGSCANLSNR_SCAN1_NET2)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.102.54)(PORT=1521)))
Services Summary...
Service "pdb1" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron12c" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron_DGB" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
The command completed successfully
[oracle@ron12cprinode2 ~]$

And finally, what about the new LISTENER_DG, the Data Guard node listener:

[oracle@ron12cprinode1 admin]$ lsnrctl status LISTENER_DG

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 28-APR-2014 03:14:05

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_DG)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_DG
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                23-APR-2014 10:55:41
Uptime                    4 days 16 hr. 18 min. 25 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0.1/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/ron12cprinode1/listener_dg/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_DG)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.102.51)(PORT=1522)))
Services Summary...
Service "RON" has 3 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "pdb1" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron12c" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
Service "ron_DGB" has 1 instance(s).
  Instance "pri_1", status READY, has 1 handler(s) for this service...
The command completed successfully
[oracle@ron12cprinode1 admin]$
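Checking each listener by hand gets tedious with four of them on the node. The "Service" lines can be pulled out with a small helper — a sketch, assuming the 12.1 lsnrctl output format shown above:

```shell
#!/bin/bash
# List the services a listener has registered, given 'lsnrctl status' output
# on stdin. The listener names in the comment below are the ones from this
# post -- adjust to your environment.

services_of() {
    # lsnrctl prints one 'Service "<name>" has N instance(s).' line per service
    grep -oE '^Service "[^"]+"' | tr -d '"' | awk '{ print $2 }' | sort -u
}

# Typical use (requires running listeners):
#   for lsnr in LISTENER LISTENER_SCAN1 DGSCANLSNR_SCAN1_NET2 LISTENER_DG; do
#       echo "### ${lsnr}"; lsnrctl status "${lsnr}" | services_of
#   done
```

Running the commented loop on each node gives a quick per-listener summary of what LREG has registered where.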

So it appears the listeners have received service registrations from the LREG process of my database. In the next part of the series I’ll add the static database registration to listener.ora and duplicate my database for use as a physical standby.

Posted in 12c Release 1, Linux

RAC 12c enhancements: adding an additional SCAN-part 1

Posted by Martin Bach on April 22, 2014

Based on customer requests, Oracle has added the ability to create a second SCAN, completely independent of the SCAN defined during cluster creation. Why would you want to use this feature? A few reasons that spring to mind are:

  • Consolidation: customers insist on using a different network
  • Separate network for Data Guard traffic

To demonstrate the concept I am going to show you in this blog post how I

  1. Add a new network resource
  2. Create new VIPs
  3. Add a new SCAN
  4. Add a new SCAN listener

It actually sounds more complex than it is, but I have a feeling I need to split this article into multiple parts as it would otherwise be far too long.

The lab setup

When you install RAC 11.2 and 12.1 you are prompted to specify a Single Client Access Name, or SCAN. This SCAN is usually defined in the corporate DNS server and resolves to 3 IP addresses. This allows for an easy way to implement client-side load balancing. The SCAN is explained in more detail in Pro Oracle Database 11g RAC on Linux for 11.2 and on OTN for 11.2 and 12.1. To spice the whole configuration up a little bit I decided to use RAC One Node on the clusters I am using for this demonstration.

I created two 12.1.0.1.2 clusters for this Data Guard test. Hosts ron12cprinode1 and ron12cprinode2 form the primary cluster; ron12csbynode1 and ron12csbynode2 form the standby cluster. The RAC One Node database is named RON:

[oracle@ron12cprinode1 ~]$ srvctl config database -db ron
Database unique name: ron
Database name: ron
Oracle home: /u01/app/oracle/product/12.1.0.1/dbhome_1
Oracle user: oracle
Spfile: +DATA/ron/spfilepri.ora
Password file: +DATA/ron/orapwpri
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: ron
Database instances:
Disk Groups: RECO
Mount point paths:
Services: ron12c
Type: RACOneNode
Online relocation timeout: 30
Instance name prefix: pri
Candidate servers: ron12cprinode1,ron12cprinode2
Database is administrator managed
[oracle@ron12cprinode1 ~]$

To make things even more interesting I defined the ORACLE_SID prefix as “pri” on the primary and “sby” on the standby.

[oracle@ron12cprinode1 ~]$ ps -ef | grep smon
oracle    2553     1  0 Feb06 ?        00:00:09 asm_smon_+ASM1
oracle   15660 15578  0 05:05 pts/3    00:00:00 grep smon
oracle   28241     1  0 Feb07 ?        00:00:18 ora_smon_pri_1
[oracle@ron12cprinode1 ~]$

A quick check with gpnptool reveals the network usage before the addition of the second SCAN:

<gpnp:Network-Profile>
 <gpnp:HostNetwork id="gen" HostName="*">
  <gpnp:Network id="net1" IP="192.168.100.0" Adapter="eth0" Use="public"/>
  <gpnp:Network id="net2" IP="192.168.101.0" Adapter="eth1" Use="cluster_interconnect"/>
 </gpnp:HostNetwork>
</gpnp:Network-Profile>

There is the default network (“netnum 1”) that is created on the network defined as “public” during the installation. I have another spare network port (eth2) reserved for the new network and Data Guard traffic. Currently network 1 is the only one available.

[root@ron12cprinode1 ~]# srvctl config network
Network 1 exists
Subnet IPv4: 192.168.100.0/255.255.255.0/eth0, static
Subnet IPv6:

As you can see, RAC 12c now supports IPv6. I have another network available that I want to dedicate to Data Guard traffic. For this purpose I added all nodes to DNS. I am a bit old-fashioned when it comes to DNS; I still use bind most of the time. Here is an excerpt of my reverse name resolution file:

; hosts - primary cluster
50	PTR	ron12cprinode1.dg.example.com.
51	PTR	ron12cprinode1-vip.dg.example.com.
52	PTR	ron12cprinode2.dg.example.com.
53	PTR	ron12cprinode2-vip.dg.example.com.
; Data Guard SCAN - primary cluster
54	PTR	ron12cpri-scan.dg.example.com.
55	PTR	ron12cpri-scan.dg.example.com.
56	PTR	ron12cpri-scan.dg.example.com.

; hosts - standby cluster
57	PTR	ron12csbynode1.dg.example.com.
58	PTR	ron12csbynode1-vip.dg.example.com.
59	PTR	ron12csbynode2.dg.example.com.
60	PTR	ron12csbynode2-vip.dg.example.com.
; Data Guard SCAN - standby cluster
61	PTR	ron12csby-scan.dg.example.com.
62	PTR	ron12csby-scan.dg.example.com.
63	PTR	ron12csby-scan.dg.example.com.

The domain is *.dg.example.com; the primary database’s client traffic will be routed through *.example.com.
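A SCAN only behaves properly if it resolves to three addresses, so the zone file above is worth sanity-checking. A small sketch, assuming the “last-octet, PTR, fqdn” record layout shown in the excerpt (the zone file path in the comment is illustrative):

```shell
#!/bin/bash
# Count the PTR records for a SCAN name in a bind reverse zone fragment.
# Each SCAN should come back with exactly three.

count_scan_ptrs() {
    # $1 = fully qualified SCAN name (trailing dot included); zone data on stdin
    awk -v scan="$1" '$2 == "PTR" && $3 == scan { n++ } END { print n + 0 }'
}

# Typical use:
#   count_scan_ptrs ron12cpri-scan.dg.example.com. < /var/named/dg.rev
```

A result other than 3 for either SCAN name means the zone file needs another look before continuing.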

Adding the new network

The first step is to make Clusterware aware of the second network. I am doing this on both clusters. Notice that the primary nodes are called *pri* whereas the standby nodes are called *sby*.

[root@ron12cprinode1 ~]# srvctl add network -netnum 2 -subnet 192.168.102.0/255.255.255.0/eth2 -nettype static -verbose
Successfully added Network.

[root@ron12csbynode1 ~]# srvctl add network -netnum 2 -subnet 192.168.102.0/255.255.255.0/eth2 -nettype static -verbose
Successfully added Network.

So this worked; I now have two networks:

[root@ron12cprinode1 ~]# srvctl config network
Network 1 exists
Subnet IPv4: 192.168.100.0/255.255.255.0/eth0, static
Subnet IPv6:
Network 2 exists
Subnet IPv4: 192.168.102.0/255.255.255.0/eth2, static
Subnet IPv6:
[root@ron12cprinode1 ~]#

In the next step I have to add VIPs for the new nodes on the *.dg.example.com subnet. The VIPs must be added on all cluster nodes, 4 in my case.

[oracle@ron12cprinode2 ~]$ srvctl add vip -h

Adds a VIP to the Oracle Clusterware.

Usage: srvctl add vip -node <node_name> -netnum <network_number> -address {<name>|<ip>}/<netmask>[/if1[|if2...]] [-skip] [-verbose]
    -node     <node_name>          Node name
    -address                       <vip_name|ip>/<netmask>[/if1[|if2...]] VIP address specification for node applications
    -netnum   <net_num>            Network number (default number is 1)
    -skip                          Skip reachability check of VIP address
    -verbose                       Verbose output
    -help                          Print usage
[oracle@ron12cprinode2 ~]$

So I did this on each node of both clusters:

[root@ron12cprinode1 ~]# srvctl add vip -node ron12cprinode1 -netnum 2 -address 192.168.102.51/255.255.255.0/eth2 -verbose
Network exists: 2/192.168.102.0/255.255.255.0/eth2, type static
Successfully added VIP.
[root@ron12cprinode2 ~]# srvctl add vip -node ron12cprinode2 -netnum 2 -address 192.168.102.53/255.255.255.0/eth2 -verbose
Network exists: 2/192.168.102.0/255.255.255.0/eth2, type static
Successfully added VIP.

[root@ron12csbynode1 ~]# srvctl add vip -node ron12csbynode1 -netnum 2 -address 192.168.102.58/255.255.255.0/eth2 -verbose
Network exists: 2/192.168.102.0/255.255.255.0/eth2, type static
Successfully added VIP.
[root@ron12csbynode2 ~]# srvctl add vip -node ron12csbynode2 -netnum 2 -address 192.168.102.60/255.255.255.0/eth2 -verbose
Network exists: 2/192.168.102.0/255.255.255.0/eth2, type static
Successfully added VIP.
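The four srvctl add vip invocations follow the same pattern, so they can be generated from a node/address map instead of typed by hand. A sketch — the commands are echoed for review rather than executed (run them as root once you are happy); the netmask and interface are the ones used in this post:

```shell
#!/bin/bash
# Generate the 'srvctl add vip' commands for network 2 from "node address"
# pairs on stdin, one command per line on stdout.

gen_vip_cmds() {
    while read -r node addr; do
        [ -z "${node}" ] && continue
        echo "srvctl add vip -node ${node} -netnum 2 -address ${addr}/255.255.255.0/eth2 -verbose"
    done
}

gen_vip_cmds <<'EOF'
ron12cprinode1 192.168.102.51
ron12cprinode2 192.168.102.53
ron12csbynode1 192.168.102.58
ron12csbynode2 192.168.102.60
EOF
```

Keeping the node/address table in one place also makes it harder to fat-finger an address on one of the four nodes.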

And I need to start the VIPs. They have some funny names, as you can see in crsctl status resource (the names can’t be chosen freely; see the output of srvctl add vip -h above):

[root@ron12cprinode1 ~]# srvctl status vip -vip ron12cprinode1_2
VIP 192.168.102.51 is enabled
VIP 192.168.102.51 is not running
[root@ron12cprinode1 ~]# srvctl start vip -vip ron12cprinode1_2
[root@ron12cprinode1 ~]# srvctl start vip -vip ron12cprinode2_2
[root@ron12cprinode1 ~]# srvctl status vip -vip ron12cprinode1_2
VIP 192.168.102.51 is enabled
VIP 192.168.102.51 is running on node: ron12cprinode1
[root@ron12cprinode1 ~]#

Add the second SCAN

At this time you can add the second SCAN. The command syntax is shown here:

[oracle@ron12cprinode1 ~]$ srvctl add scan -h

Adds a SCAN VIP to the Oracle Clusterware.

Usage: srvctl add scan -scanname <scan_name> [-netnum <network_number>]
    -scanname <scan_name>          Domain name qualified SCAN name
    -netnum  <net_num>             Network number (default number is 1)
    -subnet                        <subnet>/<netmask>[/if1[|if2...]] NET address specification for network
    -help                          Print usage

Executed on my first cluster node, the command is easier to comprehend:

[root@ron12cprinode1 ~]# srvctl add scan -scanname ron12cpri-dgscan.dg.example.com -netnum 2

[root@ron12cprinode1 ~]# srvctl status scan
SCAN VIP scan1 is enabled
SCAN VIP scan1 is running on node ron12cprinode2
SCAN VIP scan2 is enabled
SCAN VIP scan2 is running on node ron12cprinode1
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node ron12cprinode1

You need to create the SCAN on both clusters. On my primary cluster the SCAN has been created with the following configuration:

[root@ron12cprinode1 ~]# srvctl config scan -netnum 2
SCAN name: ron12cpri-dgscan.dg.example.com, Network: 2
Subnet IPv4: 192.168.102.0/255.255.255.0/eth2
Subnet IPv6:
SCAN 0 IPv4 VIP: 192.168.102.54
SCAN name: ron12cpri-dgscan.dg.example.com, Network: 2
Subnet IPv4: 192.168.102.0/255.255.255.0/eth2
Subnet IPv6:
SCAN 1 IPv4 VIP: 192.168.102.55
SCAN name: ron12cpri-dgscan.dg.example.com, Network: 2
Subnet IPv4: 192.168.102.0/255.255.255.0/eth2
Subnet IPv6:
SCAN 2 IPv4 VIP: 192.168.102.56
[root@ron12cprinode1 ~]#
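If you script your cluster builds, the three SCAN VIP addresses can be pulled out of that output, for example to compare them against what DNS hands back. A sketch, assuming the 12.1 srvctl output format shown above:

```shell
#!/bin/bash
# Extract the SCAN VIP addresses from 'srvctl config scan' output on stdin.

scan_vips() {
    # matches the 'SCAN <n> IPv4 VIP: <address>' lines only
    grep -oE 'IPv4 VIP: [0-9][0-9.]*' | awk '{ print $3 }'
}

# Typical use:
#   srvctl config scan -netnum 2 | scan_vips
```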

You can see the new VIPs in the output of ifconfig, just as you would with the primary SCAN:

eth2      Link encap:Ethernet  HWaddr 52:54:00:FE:E2:D5
          inet addr:192.168.102.50  Bcast:192.168.102.255  Mask:255.255.255.0
          inet6 addr: fe80::5054:ff:fefe:e2d5/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:523 errors:0 dropped:0 overruns:0 frame:0
          TX packets:339 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:118147 (115.3 KiB)  TX bytes:72869 (71.1 KiB)

eth2:1    Link encap:Ethernet  HWaddr 52:54:00:FE:E2:D5
          inet addr:192.168.102.55  Bcast:192.168.102.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

So there is nothing too surprising in the output; it’s exactly the same picture as with the public SCAN created during the installation.

End of part 1

This already seems like a lot of text to me so I think it’s time to pause here. The next parts will demonstrate the addition of the SCAN listeners, the new node listeners on the *.dg.example.com network and finally the duplication of the primary RAC One Node database for use as a standby database.

Posted in 12c Release 1, KVM

 