In the first part of this article I looked at a number of views and some netstat output to show how redo is transported from the primary database to its standby systems. The long story short is that TT02 (“async ORL multi”) was found sending redo to CDB3 asynchronously whilest NSS2 (“sync”) transferred redo to the synchronised target – CDB2. Unlike v$dataguard_process wanted me to believe, it really wasn’t LGWR sending redo over the network.
In this little article I would like to show you how the standby databases CDB2 and CDB3 receive redo and how you can map this back to the primary database, closing the loop.
How does CDB2 receive redo?
First I’m looking at CDB2, which receives redo via synchronous mode. I should be able to narrow the communication down between primary and standby by referring to the LGWR and TT02 process IDs in the CLIENT_PID column on the standby. As a quick reminder, 14986 is the PID for LGWR, 15029 belongs to NSS2, and 15252 maps to TT02. Let’s try:
SQL> select db_unique_name,database_role from v$database; DB_UNIQUE_NAME DATABASE_ROLE ------------------------------ ---------------- CDB2 PHYSICAL STANDBY SQL> select name, pid, role, action, client_pid, client_role, sequence#, block#, dest_id 2 from v$dataguard_process where client_pid in (14986,15029,15252); NAME PID ROLE ACTION CLIENT_PID CLIENT_ROLE SEQUENCE# BLOCK# DEST_ID ----- ---------- ----------------------- ------------ ---------- ---------------- ---------- ---------- ---------- rfs 5517 RFS sync IDLE 14986 log writer 95 174565 0
So it would appear the process responsible for shipping redo to “SYNC” destinations is the log writer. Actually, the output of v$dataguard_process is quite interesting, which is why I’m adding it here for the sake of completeness:
SQL> select name, pid, role, action, client_pid, client_role, sequence#, block#, dest_id 2 from v$dataguard_process order by action; NAME PID ROLE ACTION CLIENT_PID CLIENT_ROLE SEQUENCE# BLOCK# DEST_ID ----- ---------- ----------------------- ------------ ---------- ---------------- ---------- ---------- ---------- rfs 5517 RFS sync IDLE 14986 log writer 95 229446 0 rfs 5350 RFS archive IDLE 15224 archive gap 0 0 1 rfs 5346 RFS ping IDLE 15124 gap manager 95 0 0 rfs 5354 RFS archive IDLE 15233 archive gap 0 0 1 MRP0 5348 managed recovery IDLE 0 none 0 0 0 rfs 5352 RFS archive IDLE 15240 archive gap 0 0 1 LGWR 5207 log writer IDLE 0 none 0 0 0 TT01 5259 redo transport timer IDLE 0 none 0 0 0 TT00 5255 gap manager IDLE 0 none 0 0 0 ARC1 5263 archive redo IDLE 0 none 0 0 0 ARC2 5265 archive redo IDLE 0 none 0 0 0 ARC3 5267 archive redo IDLE 0 none 0 0 0 TMON 5242 redo transport monitor IDLE 0 none 0 0 0 ARC0 5257 archive local IDLE 0 none 0 0 0 14 rows selected.
This view tells me that LGWR is attached to the RFS sync proces. But now I know better than that, and it is similar to what I saw on the primary. Looking a little closer, I can see that strictly speaking, the RFS process is connected to NSS2:
[root@server2 ~]# netstat -tunalp | egrep 'Active|Proto|5517' Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp6 0 0 192.168.100.22:1521 192.168.100.21:15515 ESTABLISHED 5517/oracleCDB2
I am repeating the values for the primary here so you don’t have to go back to the previous article:
[root@server1 ~]# ps -ef | egrep 'lgwr|lg0|nss' oracle 14986 1 0 09:58 ? 00:01:19 ora_lgwr_CDB1 oracle 14990 1 0 09:58 ? 00:00:00 ora_lg00_CDB1 oracle 14994 1 0 09:58 ? 00:00:00 ora_lg01_CDB1 oracle 15029 1 0 09:58 ? 00:00:43 ora_nss2_CDB1 [root@server1 ~]# netstat -vW --numeric-ports -ee -p -la --tcp | egrep '14986|14990|14994|15029' tcp 0 0 server1.example.com:15515 server2.example.com:1521 ESTABLISHED oracle 16400768 15029/ora_nss2_CDB1
You will notice that port 15515 on server1 belongs to ora_nss2_CDB1.
Going back a little to v$dataguard_process, it seems a bit weird to see MRP0 as “idle” when the database is in managed recovery mode using real time apply. Trying something else I am querying v$managed_standby and voila: MRP0 is said to apply logs:
SQL> select process,pid,status,client_process,client_pid,sequence#,block# 2 from v$managed_standby order by status; PROCESS PID STATUS CLIENT_P CLIENT_PID SEQUENCE# BLOCK# --------- ---------- ------------ -------- ---------------------------------------- ---------- ---------- DGRD 5255 ALLOCATED N/A N/A 0 0 DGRD 5259 ALLOCATED N/A N/A 0 0 MRP0 5348 APPLYING_LOG N/A N/A 95 246625 ARCH 5257 CLOSING ARCH 5257 92 4096 ARCH 5263 CLOSING ARCH 5263 93 2048 ARCH 5265 CLOSING ARCH 5265 94 342016 ARCH 5267 CONNECTED ARCH 5267 0 0 RFS 5350 IDLE UNKNOWN 15224 0 0 RFS 5354 IDLE UNKNOWN 15233 0 0 RFS 5352 IDLE UNKNOWN 15240 0 0 RFS 5517 IDLE LGWR 14986 95 246626 RFS 5346 IDLE Archival 15124 0 0 12 rows selected.
I guess that’s true, as the system is in constant recovery using the standby logfiles.
And what about CDB3?
On the other hand, CDB3 – to which redo is shipped asynchronously – lists TT02 as it’s counterpart:
SQL> select db_unique_name,database_role from v$database; DB_UNIQUE_NAME DATABASE_ROLE ------------------------------ ---------------- CDB3 PHYSICAL STANDBY SQL> select name, pid, role, action, client_pid, client_role, sequence#, block#, dest_id 2 from v$dataguard_process order by action; NAME PID ROLE ACTION CLIENT_PID CLIENT_ROLE SEQUENCE# BLOCK# DEST_ID ----- ------------------------ ----------------------- ------------ ---------- ---------------- ---------- ---------- ---------- rfs 14803 RFS ping IDLE 15124 gap manager 96 0 0 rfs 14809 RFS archive IDLE 15233 archive gap 0 0 0 rfs 14811 RFS async IDLE 15252 async ORL multi 96 34674 0 MRP0 11825 managed recovery IDLE 0 none 0 0 0 ARC0 11776 archive local IDLE 0 none 0 0 0 ARC2 11786 archive redo IDLE 0 none 0 0 0 TT00 11774 gap manager IDLE 0 none 0 0 0 ARC3 11788 archive redo IDLE 0 none 0 0 0 TMON 11706 redo transport monitor IDLE 0 none 0 0 0 LGWR 11676 log writer IDLE 0 none 0 0 0 ARC1 11784 archive redo IDLE 0 none 0 0 0 TT01 11778 redo transport timer IDLE 0 none 0 0 0 12 rows selected.
Unlike the case with CDB2, the local RFS process is indeed connecting to TT02 on server1:
[root@server2 ~]# netstat -vW --numeric-ports -ee -p -la --tcp | egrep '^Active|^Proto|14811' Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name tcp6 0 0 server2.example.com:1521 server1.example.com:12936 ESTABLISHED oracle 4658198 14811/oracleCDB3
… and on server1:
[root@server1 ~]# netstat -vW --numeric-ports -ee -p -la --tcp | egrep '^Active|^Proto|12936' Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name tcp 0 0 server1.example.com:12936 server2.example.com:1521 ESTABLISHED oracle 15820538 15252/ora_tt02_CDB1
This should be enough evidence, I rest my case :)
Summary
So I guess that answers my question: On my small VMs in the lab, NSSn is responsible for shipping redo to targets in “SYNC” mode. The redo transport server processes TTnn ship redo to destinations that are defined for asynchronous transport.