Martins Blog

Trying to explain complex things in simple terms

Archive for August, 2016

Little things worth knowing: using Remote Diagnostic Agent more effectively with profiles

Posted by Martin Bach on August 26, 2016

RDA, or the Remote Diagnostic Agent, has been around for a little while. Over time, and with the growing number of Oracle’s acquisitions, it has become, shall we say, a little more difficult to handle. It appears to me as if every one of the acquired products has its diagnostics handled by RDA, making it hard to focus on something specific such as the database.

I won’t go into much detail about the Remote Diagnostic Agent in this post; please make sure you read and understand the documentation on MOS and the README* files that come with it before using it.

I based this blog post on Oracle Linux 7.2 and RDA 8.12.16.6.14 in case you find this via a search engine.

Basic Usage

In the first step you need to download RDA from My Oracle Support. I used Doc ID 314422.1 to get it for Linux x86-64. Installation was simple: as always, I unzipped the zipfile into the RDBMS owner’s home, which has plenty of free space for data collection on my lab systems. You might already have a copy of RDA somewhere that shipped with your Oracle product, as it comes bundled with a lot of them. I am focusing on the Oracle database and infrastructure and, as far as I know, I have to download RDA in this case.

If you follow the documentation, you verify the installation and begin with the setup. And here is where I am starting to struggle with the tool’s concept, but let me explain why:

[oracle@server1 rda]$ ./rda.sh -S
------------------------------------------------------------------------------
RDA.BEGIN: Initializes the Data Collection
------------------------------------------------------------------------------
Enter the Oracle home to be used for data analysis
> /u01/app/oracle/product/12.1.0.2/dbhome_1

------------------------------------------------------------------------------
RDA.CONFIG: Collects Key Configuration Information
------------------------------------------------------------------------------
------------------------------------------------------------------------------
SAMPLE.SAMPLE: Controls Sampling
------------------------------------------------------------------------------
------------------------------------------------------------------------------
RDA.CCR: Collects OCM Diagnostic Information
------------------------------------------------------------------------------
Do you want to diagnose OCM installations (Y/N)?
Press Return to accept the default (N)
> n

------------------------------------------------------------------------------
RDA.CUST: Collects Customer-Specific Information
------------------------------------------------------------------------------
------------------------------------------------------------------------------
OS.OS: Collects the Operating System Information
------------------------------------------------------------------------------
------------------------------------------------------------------------------
OS.PROF: Collects the User Profile
------------------------------------------------------------------------------
------------------------------------------------------------------------------
OS.NET: Collects Network Information
------------------------------------------------------------------------------
Do you want RDA to perform the network ping tests (Y/N)?
Press Return to accept the default (N)
> 
...

And this takes you on a long journey where you are asked to collect diagnostic information about pretty much every Oracle product, or at least so it seems. I was only interested in my local Oracle database installation, but nevertheless I was prompted for Oracle VM, for example:


...

------------------------------------------------------------------------------
OS.OVMS: Collects Oracle VM Server Information
------------------------------------------------------------------------------
Should RDA analyze Oracle VM Server (Y/N)?
Press Return to accept the default (N)
> n

------------------------------------------------------------------------------
OS.OVMM: Collects Oracle VM Manager Information
------------------------------------------------------------------------------
Do you want RDA to analyze Oracle VM Manager (Y/N)?
Press Return to accept the default (N)
> n

...

A useful question for some, but not for me at this point. And so on; the list seems to get longer with every release. And I am not willing to answer 42,000 questions every time I deploy RDA! That left me with two options:

  • Patiently go through the list and dutifully answer every question. If you inadvertently hit the return key one time too often (which happens quite easily), you can correct the problem later by editing the configuration file.
  • Consider using profiles

Option 1 might be viable because it’s a one-off process but I personally don’t find it very practical for various reasons. And yes, it seems possible to let RDA “guess” your environment but that didn’t work as I expected.

Profiles

Profiles on the other hand are easy to use! You can view them online on MOS 391983.1 or alternatively as part of RDA’s built-in man-page by executing ./rda.sh -L and searching for “Available profiles:”. If you haven’t set the PAGER variable you might end up with more, but I think less is more :)

[oracle@server1 rda]$ PAGER=less ./rda.sh -L

...

Available profiles:
  AS10g                            Oracle Application Server 10g problems
  AS10g_Discoverer                 Discoverer 10g problems

...

  DB10g                            Oracle Database 10g problems
  DB11g                            Oracle Database 11g problems
  DB12c                            Oracle Database 12c problems
  DB8i                             Oracle Database 8i problems
  DB9i                             Oracle Database 9i problems
  DB_Assessment                    Oracle Database assessment collections
  DB_BackupRecovery                Oracle Database backup and recovery

...
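If you only want the profile names, for example to feed them into a script, the listing can be filtered. The following is a minimal sketch that assumes the layout shown above: a line starting with “Available profiles:” followed by indented entries, with the section ending at the next non-indented line. The function name list_rda_profiles is made up for illustration and is not part of RDA.

```shell
# list_rda_profiles: hypothetical helper, not part of RDA.
# Reads the output of "./rda.sh -L" on stdin and prints one profile
# name per line.
# Assumption: the profile section ends at the first non-indented line.
list_rda_profiles() {
    awk '
        /^Available profiles:/ { in_section = 1; next }
        in_section && /^[^[:space:]]/ { in_section = 0 }   # next section starts
        in_section && NF { print $1 }                        # first column = name
    '
}

# Usage (run from the RDA directory):
#   ./rda.sh -L | list_rda_profiles | grep '^DB'
```

Printing only the first column keeps the output easy to loop over; the descriptions remain available in the full listing.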

These profiles are logical groupings of the data collection modules provided by the RDA framework. From a remote support point of view they are fantastic and leave little room for user error: you always get the information you need, not a report that only has half of the information required for troubleshooting. If you would like to see more detail you can combine the -M (for the manual) and -p (for profile) options as in this example:

[oracle@server1 rda]$ ./rda.sh -M -p DB11g
NAME
    Profile DB11g - Oracle Database 11g problems

MODULES
    The DB11g profile uses the following modules:
      DB:DCdb    Controls Oracle RDBMS Data Collection
      DB:DCdba   Collects Oracle RDBMS Information
      DB:DCdbm   Collects Oracle RDBMS Memory Information
      DB:DCdnfs  Collects Direct NFS Information
      DB:DClog   Collects Oracle Database Trace and Log Files
      DB:DCsp    Collects SQL*Plus/iSQL*Plus Information
      EM:DCagt   Collects Enterprise Manager Agent Information
      EM:DCdbc   Collects Database Control Information
      EM:DCgrid  Controls Grid Control Data Collection
      OS:DCinst  Collects the Oracle Installation Information
      OS:DCnet   Collects Network Information
      OS:DConet  Collects Oracle Net Information
      OS:DCos    Collects the Operating System Information
      OS:DCperf  Collects Performance Information
      OS:DCprof  Collects the User Profile

COPYRIGHT NOTICE
    Copyright (c) 2002, 2016, Oracle and/or its affiliates. All rights
    reserved.

TRADEMARK NOTICE
    Oracle and Java are registered trademarks of Oracle and/or its affiliates.
    Other names may be trademarks of their respective owners.

[oracle@server1 rda]$ 

By the way, rda.sh -M displays the entire man page.

Coming back to the profiles: I found the Assessments to be quite useful and a good starting point if you would like to get an overview:

[oracle@server1 rda]$ ./rda.sh -L | grep Assessment
  ADBA                             Collects ACS Oracle Database Assessment
  Apps_DB_Assessment               Oracle Applications Database assessment
  Asm_Assessment                   Oracle ASM assessment collections
  DB_Assessment                    Oracle Database assessment collections
  Exadata_Assessment               Oracle Exadata assessment collections
  Maa_Assessment                   Maximum Availability Architecture
  Maa_Exa_Assessment               Maximum Availability Architecture with
  Rac_Assessment                   Real Application Cluster assessment
  DB_Assessment                    DB Assessment collections

Specifying a profile does not entirely relieve you from having to enter information in an interactive session, but it reduces the time needed to complete the initial configuration.
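This post only shows -p in combination with -M. As far as I know the same -p option is what you pass to the setup step (-S), but please treat that as an assumption and check the man page of your RDA version before relying on it. The sketch below only prints the command it would run, and only after confirming that the profile name appears in a listing supplied on stdin; rda_setup_command is a hypothetical name.

```shell
# rda_setup_command: hypothetical helper, not part of RDA.
# Reads a profile listing (such as the output of "./rda.sh -L") on stdin
# and prints, without running it, a setup command for the given profile.
# Assumption: -p can be combined with -S in the same way as with -M.
rda_setup_command() {
    profile="$1"
    # Succeeds only if some line has the profile name as its first column.
    if awk -v p="$profile" '$1 == p { found = 1 } END { exit !found }'; then
        printf './rda.sh -S -p %s\n' "$profile"
    else
        printf 'unknown profile: %s\n' "$profile" >&2
        return 1
    fi
}

# Usage (run from the RDA directory):
#   ./rda.sh -L | rda_setup_command DB12c
```

Printing the command instead of executing it is deliberate: the setup is still interactive, so you probably want to review the command and run it yourself.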

Next time you need to run RDA, have a look at the available profiles; maybe there’s one that serves your needs!


OSWatcher integration in Trace File Analyzer (TFA)

Posted by Martin Bach on August 2, 2016

Some time ago I wrote a post about using OSWatcher for system analysis. Neil Chandler (@ChandlerDBA) rightfully pointed out that although OSWatcher was cool, TFA was the way to go. TFA can include OSWatcher, but more importantly it adds a lot of value over and above what OSWatcher does.

I guess it depends on what you want to do. I still think that OSWatcher is a good starting point and enough for most problems on single instance systems. When it comes to clustered environments, TFA looks a lot more appealing though.

In this article I am taking a closer look at using TFA, which is part of Oracle 11.2.0.4 and 12.1.0.2. TFA is automatically updated as part of the quarterly patches, which is nice because the default/base release does not seem to be working properly. Thankfully TFA can also be patched outside the regular patch cycle.

What is TFA?

TFA is a tool which, amongst other things, helps you gather information about incidents across your cluster. If you ever worked on Exadata half-racks or other clusters with more than 4 nodes you will quickly start to appreciate having to use only one tool for this task. The TFA output is suitable for attaching to a Service Request which should, at least in theory, help speed up the problem resolution.

It is also an excellent parsing tool with equally good reporting capabilities thanks to its “analyze” command.

As an added benefit you get a lot of tools that were previously known as “RAC and DB Support Tools Bundle”. This includes OSWatcher as well, the reason for this post.

Plus you don’t have to worry about starting OSWatcher when booting: TFA is started via a systemd unit file in Oracle Linux 7, and I found it started as a service in Oracle Linux 6. On OL7.x you can check its status using the standard systemd commands, as shown here:

[oracle@rac12sbnode1 ~]$ systemctl status oracle-tfa
oracle-tfa.service - Oracle Trace File Analyzer
   Loaded: loaded (/etc/systemd/system/oracle-tfa.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2016-08-02 09:46:24 BST; 3h 14min ago
 Main PID: 27799 (init.tfa)
   CGroup: /system.slice/oracle-tfa.service
           ├─14670 /bin/sleep 30
           ├─27799 /bin/sh /etc/init.d/init.tfa run >/dev/null 2>&1 </dev/null
           └─27890 /u01/app/12.1.0.2/grid/jdk/jre/bin/java -Xms128m -Xmx512m \
                     oracle.rat.tfa.TFAMain /u01/app/12.1.0.2/grid/tfa/rac12sbnode1/tfa_home
[oracle@rac12sbnode1 ~]$ 
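For scripted checks the human-readable status output is more than you need. Here is a small sketch, assuming the unit is called oracle-tfa as in the output above; tfa_active_state is a made-up helper that pulls the state from the “Active:” line.

```shell
# tfa_active_state: hypothetical helper.
# Reads "systemctl status oracle-tfa" output on stdin and prints the
# word following "Active:" (for example "active" or "inactive").
tfa_active_state() {
    awk '$1 == "Active:" { print $2; exit }'
}

# Usage:
#   systemctl status oracle-tfa | tfa_active_state
# systemd also offers a direct query that needs no parsing:
#   systemctl is-active oracle-tfa
```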

Updating TFA

It is quite likely that your version of TFA is older than the one available from MOS ID 1513912.2, which appears to be its main landing page. I applied the proactive bundle patch for July 2016 to my two-node RAC cluster and found the TFA version to be 12.1.2.7.0. At the time of writing Oracle has released TFA 12.1.2.8.0.

The update is quite simple, but needs to be performed as root. To be sure I’m not doing something I shouldn’t be doing I checked the current version:

[oracle@rac12sbnode1 ~]$ tfactl print status

.----------------------------------------------------------------------------------------------------.
| Host         | Status of TFA | PID   | Port | Version    | Build ID             | Inventory Status |
+--------------+---------------+-------+------+------------+----------------------+------------------+
| rac12sbnode1 | RUNNING       | 23081 | 5000 | 12.1.2.7.0 | 12127020160304140533 | COMPLETE         |
| rac12sbnode2 | RUNNING       |  6296 | 5000 | 12.1.2.7.0 | 12127020160304140533 | COMPLETE         |
'--------------+---------------+-------+------+------------+----------------------+------------------'
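On clusters with more than a couple of nodes it is easier to let a script confirm that every node runs the same TFA version before and after patching. The following is a sketch under the assumption that the table keeps the column order shown above (host, status, PID, port, version, build ID, inventory status); both function names are hypothetical.

```shell
# tfa_status_summary: hypothetical helper.
# Reads "tfactl print status" output on stdin and prints
# host, status and version separated by tabs, one node per line.
# Assumption: columns appear in the order shown in this post.
tfa_status_summary() {
    awk -F'|' -v OFS='\t' '
        NF >= 8 && $2 !~ /Host/ {
            host = $2; status = $3; version = $6
            gsub(/^ +| +$/, "", host)      # trim the padding blanks
            gsub(/^ +| +$/, "", status)
            gsub(/^ +| +$/, "", version)
            print host, status, version
        }'
}

# tfa_cluster_consistent: returns 0 only if every node is RUNNING and
# all nodes report exactly one common version.
tfa_cluster_consistent() {
    tfa_status_summary | awk -F'\t' '
        $2 != "RUNNING" { bad = 1 }
        { versions[$3] = 1 }
        END {
            n = 0
            for (v in versions) n++
            exit (bad || n != 1)
        }'
}

# Usage:
#   tfactl print status | tfa_cluster_consistent && echo "TFA looks consistent"
```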

In the next step, after switching to the root account, I staged the TFA software and executed the installer. This will automatically distribute the new version across all nodes in the cluster.

[root@rac12sbnode1 patches]# . oraenv
ORACLE_SID = [root] ? +ASM1
The Oracle base has been set to /u01/app/oracle
[root@rac12sbnode1 patches]# unzip -q TFALite_v12.1.2.8.0.zip 
[root@rac12sbnode1 patches]# ./installTFALite 
TFA Installation Log will be written to File : /tmp/tfa_install_25296_2016_08_02-09_40_12.log

Starting TFA installation

TFA HOME : /u01/app/12.1.0.2/grid/tfa/rac12sbnode1/tfa_home
TFA Build Version: 121280 Build Date: 201606232222
Installed Build Version: 121270 Build Date: 201603041405

TFA is already installed. Patching /u01/app/12.1.0.2/grid/tfa/rac12sbnode1/tfa_home...
TFA patching typical install from zipfile is written to /u01/app/12.1.0.2/grid/tfa/rac12sbnode1/tfapatch.log

TFA will be Patched on: 
rac12sbnode1
rac12sbnode2

Do you want to continue with patching TFA? [Y|N] [Y]: y

Checking for ssh equivalency in rac12sbnode2
Node rac12sbnode2 is not configured for ssh user equivalency

SSH is not configured on these nodes : 
rac12sbnode2

Do you want to configure SSH on these nodes ? [Y|N] [Y]: y

Configuring SSH on rac12sbnode2...

Generating keys on rac12sbnode1...

Copying keys to rac12sbnode2...

/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@rac12sbnode2's password: 

Creating ZIP: /u01/app/12.1.0.2/grid/tfa/rac12sbnode1/tfa_home/internal/tfapatch.zip

Using SSH to patch TFA to remote nodes :

Applying Patch on rac12sbnode2:

TFA_HOME: /u01/app/12.1.0.2/grid/tfa/rac12sbnode2/tfa_home
Stopping TFA Support Tools...
Shutting down TFA
Removed symlink /etc/systemd/system/multi-user.target.wants/oracle-tfa.service.
Removed symlink /etc/systemd/system/graphical.target.wants/oracle-tfa.service.
. . . . . 
. . . 
Successfully shutdown TFA..
Copying files from rac12sbnode1 to rac12sbnode2...

Current version of Berkeley DB in  is 5.0.84, so no upgrade required
Running commands to fix init.tfa and tfactl in rac12sbnode2...
Updating init.tfa in rac12sbnode2...
Starting TFA in rac12sbnode2...
Starting TFA..
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Waiting up to 100 seconds for TFA to be started..
. . . . . 
Successfully started TFA Process..
. . . . . 
TFA Started and listening for commands

Enabling Access for Non-root Users on rac12sbnode2...


Applying Patch on rac12sbnode1:

Stopping TFA Support Tools...

Shutting down TFA for Patching...

Shutting down TFA
Removed symlink /etc/systemd/system/graphical.target.wants/oracle-tfa.service.
Removed symlink /etc/systemd/system/multi-user.target.wants/oracle-tfa.service.
. . . . . 
. . . 
Successfully shutdown TFA..

Current version of Berkeley DB is 5.0.84, so no upgrade required

Copying TFA Certificates...

Running commands to fix init.tfa and tfactl in localhost

Starting TFA in rac12sbnode1...

Starting TFA..
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Waiting up to 100 seconds for TFA to be started..
. . . . . 
Successfully started TFA Process..
. . . . . 
TFA Started and listening for commands

Enabling Access for Non-root Users on rac12sbnode1...

root@rac12sbnode2's password: 
Removed SSH configuration on rac12sbnode2...

.--------------------------------------------------------------------.
| Host         | TFA Version | TFA Build ID         | Upgrade Status |
+--------------+-------------+----------------------+----------------+
| rac12sbnode1 |  12.1.2.8.0 | 12128020160623222219 | UPGRADED       |
| rac12sbnode2 |  12.1.2.8.0 | 12128020160623222219 | UPGRADED       |
'--------------+-------------+----------------------+----------------'

[root@rac12sbnode1 patches]# 

This has upgraded TFA in one easy step.

Support Tools Bundle Missing with stock-TFA

If you read MOS 1513912.2 carefully, you undoubtedly spotted that beginning with TFA 12.1.2.3.0 the RAC and DB Support Tools Bundle is included with TFA, alongside some other very useful utilities. But you only get them after deploying TFA from MOS. Here is the list as shown post-patch:

[oracle@rac12sbnode1 ~]$ tfactl toolstatus
.-------------------------------------------.
|           External Support Tools          |
+--------------+--------------+-------------+
| Host         | Tool         | Status      |
+--------------+--------------+-------------+
| rac12sbnode1 | alertsummary | DEPLOYED    |
| rac12sbnode1 | exachk       | DEPLOYED    |
| rac12sbnode1 | ls           | DEPLOYED    |
| rac12sbnode1 | pstack       | DEPLOYED    |
| rac12sbnode1 | orachk       | DEPLOYED    |
| rac12sbnode1 | sqlt         | DEPLOYED    |
| rac12sbnode1 | grep         | DEPLOYED    |
| rac12sbnode1 | summary      | DEPLOYED    |
| rac12sbnode1 | prw          | NOT RUNNING |
| rac12sbnode1 | vi           | DEPLOYED    |
| rac12sbnode1 | tail         | DEPLOYED    |
| rac12sbnode1 | param        | DEPLOYED    |
| rac12sbnode1 | dbglevel     | DEPLOYED    |
| rac12sbnode1 | darda        | DEPLOYED    |
| rac12sbnode1 | history      | DEPLOYED    |
| rac12sbnode1 | oratop       | DEPLOYED    |
| rac12sbnode1 | oswbb        | RUNNING     |
| rac12sbnode1 | dbperf       | RUNNING     |
| rac12sbnode1 | changes      | DEPLOYED    |
| rac12sbnode1 | events       | DEPLOYED    |
| rac12sbnode1 | ps           | DEPLOYED    |
| rac12sbnode1 | srdc         | DEPLOYED    |
'--------------+--------------+-------------'

The stock version, although it gets patched with the proactive bundle patch, does not include them. I ran this command before applying the TFA patch, but after having applied the proactive bundle patch to my cluster:

[oracle@rac12sbnode1 ~]$ tfactl toolstatus
.------------------------.
| External Support Tools |
+-------+-------+--------+
| Host  | Tool  | Status |
+-------+-------+--------+
'-------+-------+--------'

This is actually a feature, not a bug, as documented in MOS 2054786.1. The note states quite clearly that the RAC and DB Support Tools bundle is only installed if you deploy the MOS version. I just did that; I am good.
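To find out from a script whether a given tool such as oswbb is present, the table can be parsed. This is a minimal sketch assuming the three-column layout shown above (host, tool, status); tfa_tool_status is a hypothetical helper name. With the empty table of the stock version it simply returns a non-zero exit code.

```shell
# tfa_tool_status: hypothetical helper.
# Reads "tfactl toolstatus" output on stdin and prints the status of the
# tool named in $1 (first match). Returns 1 if the tool is not listed,
# which is what the empty table of a stock TFA installation produces.
tfa_tool_status() {
    awk -F'|' -v tool="$1" '
        NF >= 5 {
            name = $3; status = $4
            gsub(/^ +| +$/, "", name)      # trim the padding blanks
            gsub(/^ +| +$/, "", status)
            if (name == tool) { print status; found = 1; exit }
        }
        END { exit !found }'
}

# Usage:
#   tfactl toolstatus | tfa_tool_status oswbb || echo "Support Tools Bundle missing?"
```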

TFA Tools

I really love the idea of having these tools available. The TFA user guide, also available from MOS 1513912.2 (tab “Users Guide”), explains from page 39 onwards how to use them.

For example:

[oracle@rac12sbnode1 ~]$ tfactl oratop -h
Usage : /u01/app/12.1.0.2/grid/tfa/rac12sbnode1/tfa_home/bin/tfactl.pl oratop
         -database <dbname> <Oratop Options> <logon>

Options: 
-database <dbname> Database name to run oratop
<logon> : defalut will be / as sysdba. Specify a different user using
          {username[/password][@connect_identifier] | / }
          [AS {SYSDBA|SYSOPER}]
          connect_identifier: host[:port]/[service_name]
<Oratop Options>:
-k : FILE#:BLOCK#, section 4 lt is (EVENT/LATCH)
-m : MODULE/ACTION, section 4 (default is USERNAME/PROGRAM)
-s : SQL mode, section 4 (default is process mode)
-c : database service mode (default is connect string)
-f : detailed format, 132 columns (default: standard, 80 columns)
-b : batch mode (default is text-based user interface)
-n : maximum number of iterations (requires number)
-i : interval delay, requires value in seconds (default: 5s)

e.g:
   /u01/app/12.1.0.2/grid/tfa/rac12sbnode1/tfa_home/bin/tfactl.pl oratop -database testdb1
   /u01/app/12.1.0.2/grid/tfa/rac12sbnode1/tfa_home/bin/tfactl.pl oratop -database testdb1 -bn1


This makes for interesting output. The following example is from a system running the Swingbench Order Entry benchmark on an overloaded set of VMs:


Oracle 12c - Primary NCDB   11:19:31 up:  14h,   2 ins,    5 sn,   1 us, 2.7G mt,    4% fra,   0 er,                      93.3% db
ID %CPU LOAD %DCU   AAS  ASC  ASI  ASW  ASP  AST  UST MBPS IOPS IORL LOGR PHYR PHYW  %FR   PGA TEMP UTPS UCPS SSRT DCTR DWTR  %DBT
 1   50    2   16   2.2    0    1    2    0    3    3  222  108   1m  25k  23k   25    6  373M    0   12   57  12m   14   85  59.2
 2   31    2    9   1.5    1    2    0    0    2    2  105  104 876u  19k  18k   14    5  385M    0    7   42  10m   13   86  40.8

EVENT (C)                                                         TOTAL WAITS   TIME(s)  AVG_MS  PCT                    WAIT_CLASS
DB CPU                                                                             1636           39                              
log file parallel write                                                 49704       849    17.2   20                    System I/O
control file sequential read                                           344732       687     2.0   16                    System I/O
log file switch (checkpoint incomplete)                                   319       612  1739.7   15                 Configuration
log file sync                                                            6328       404    63.2   10                        Commit

ID   SID     SPID USERNAME  PROGRAM    SRV  SERVICE  PGA  SQLID/BLOCKER OPN  E/T  STA  STE  WAIT_CLASS  EVENT/*LATCH           W/T
 2   249     9949 B/G       LG01       DED  SYS$BAC 1.4M                     14h  ACT  WAI  System I/O  log file parallel wri  49m
 1    11    25634 B/G       DBW0       DED  SYS$BAC  12M                     14h  ACT  WAI  System I/O  db file parallel writ  42m
 2   248     9945 B/G       LG00       DED  SYS$BAC 1.4M                     14h  ACT  WAI  System I/O  log file parallel wri  31m
 1   247    25636 B/G       LGWR       DED  SYS$BAC 1.7M                     14h  ACT  WAI  System I/O  *test excl. non-paren  24m
 1    22     6342 SOE       JDBC Thin  DED  NCDB    3.2M                       0  ACT  WAI  Commit      log file sync          12m
 2   247     9941 B/G       LGWR       DED  SYS$BAC 1.6M                     14h  ACT  WAI  Other       LGWR any worker group  12m
 1   265     6344 SOE       JDBC Thin  DED  NCDB    5.3M  7ws837zynp1zv SEL    0  ACT  I/O  User I/O    direct path read        5m
 1   257     6340 SOE       JDBC Thin  DED  NCDB    5.5M  8zz6y2yzdqjp0 SEL    0  ACT  CPU  User I/O    cpu runqueue            4m
 2    46    11208 SOE       JDBC Thin  DED  NCDB    6.4M  7ws837zynp1zv SEL    0  ACT  I/O  User I/O    direct path read        3m
 2   274    11206 SOE       JDBC Thin  DED  NCDB    5.4M  7ws837zynp1zv SEL    0  ACT  I/O  User I/O    direct path read        2m

Nice little overview :) But I’m digressing…

OSWatcher

There are two ways for TFA to access OSWatcher information:

  1. Using the analyze command to provide a summary view
  2. Invoking OSWatcher directly

The first option provides a nice overview. I’ve been running Swingbench on the system with far too many users, which you can see here:

[oracle@rac12sbnode1 ~]$ tfactl analyze -comp osw -since 1h
INFO: analyzing host: rac12sbnode1

                     Report title: OSW top logs
                Report date range: last ~1 hour(s)
       Report (default) time zone: GMT - Greenwich Mean Time
              Analysis started at: 02-Aug-2016 03:27:14 PM BST
            Elapsed analysis time: 0 second(s).
               Configuration file: /u01/app/12.1.0.2/grid/tfa/rac12sbnode1/tfa_home/ext/tnt/conf/tnt.prop
              Configuration group: osw
                        Parameter: 
              Total osw rec count:            174, from 02-Aug-2016 02:00:16 PM BST to 02-Aug-2016 03:26:54 PM BST
OSW recs matching last ~1 hour(s):            120, from 02-Aug-2016 02:27:19 PM BST to 02-Aug-2016 03:26:54 PM BST
                        statistic: t     first   highest   (time)   lowest   (time)  average  non zero  3rd last  2nd last      last  trend
            top.loadavg.last01min:        3.91      5.48 @02:32PM     2.08 @02:46PM     3.42       103      3.21      2.91      2.64   -32%
            top.loadavg.last05min:        3.10      3.91 @02:32PM     3.05 @02:53PM     3.42       103      3.54      3.43      3.32     7%
            top.loadavg.last15min:        3.13      3.59 @03:22PM     3.13 @02:27PM     3.37       103      3.53      3.49      3.45    10%
                top.tasks.running:           5         9 @02:34PM        1 @02:27PM        3       119         2         1         2   -60%
               top.tasks.sleeping:         316       325 @02:40PM      315 @02:49PM      320       119       320       321       320     1%

INFO: analyzing host: rac12sbnode2

                     Report title: OSW top logs
                Report date range: last ~1 hour(s)
       Report (default) time zone: GMT - Greenwich Mean Time
              Analysis started at: 02-Aug-2016 03:27:15 PM BST
            Elapsed analysis time: 0 second(s).
               Configuration file: /u01/app/12.1.0.2/grid/tfa/rac12sbnode2/tfa_home/ext/tnt/conf/tnt.prop
              Configuration group: osw
                        Parameter: 
              Total osw rec count:            174, from 02-Aug-2016 02:00:14 PM BST to 02-Aug-2016 03:26:52 PM BST
OSW recs matching last ~1 hour(s):            120, from 02-Aug-2016 02:27:16 PM BST to 02-Aug-2016 03:26:52 PM BST
                        statistic: t     first   highest   (time)   lowest   (time)  average  non zero  3rd last  2nd last      last  trend
            top.loadavg.last01min:        2.75      5.73 @02:40PM     2.23 @02:52PM     3.44       111      3.91      4.31      3.88    41%
            top.loadavg.last05min:        2.78      4.16 @02:41PM     2.78 @02:27PM     3.40       111      3.52      3.65      3.67    32%
            top.loadavg.last15min:        2.93      3.60 @03:13PM     2.93 @02:27PM     3.32       111      3.49      3.53      3.55    21%
                top.tasks.running:           2         8 @03:10PM        1 @03:00PM        3       120         2         2         1   -50%

[oracle@rac12sbnode1 ~]$
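The 1-minute load average is a quick indicator in this report. As a rough sketch, the hypothetical helper below prints each host whose peak 1-minute load average exceeds a CPU count you pass in. It assumes the column layout shown above, where the third whitespace-separated field of a statistic line is the “highest” value, and it relies on the common rule of thumb that a load average above the number of CPUs points to processes queuing for CPU.

```shell
# osw_overloaded_hosts: hypothetical helper.
# Reads "tfactl analyze -comp osw" output on stdin and prints
# "<host> <peak 1-minute load average>" for every host whose peak
# exceeds the CPU count given as $1.
# Assumption: field 3 of the top.loadavg.last01min line is "highest".
osw_overloaded_hosts() {
    awk -v cpus="$1" '
        /^INFO: analyzing host:/ { host = $NF }   # remember the current host
        $1 == "top.loadavg.last01min:" && $3 + 0 > cpus + 0 { print host, $3 }'
}

# Usage (2 is the CPU count the OSWatcher report later in this post
# shows for rac12sbnode1):
#   tfactl analyze -comp osw -since 1h | osw_overloaded_hosts 2
```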

As you can imagine the system is somewhat overloaded. The minimum interval to report on seems to be 1 hour, because -since only accepts hours or days here:

[oracle@rac12sbnode1 ~]$ tfactl analyze -comp osw -since 5m

ERROR: Invalid value for -since. Supported values are n<h|d>

The analyze command can do a lot more; have a look at the documentation to find out more.
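If you wrap these commands in scripts, it can help to validate the -since value up front. The sketch below encodes only what the tool output in this post states: the error message above lists n<h|d> for analyze, and the usage text of tfactl oswbb -h lists n[mhd]. The function name since_ok is hypothetical, and other TFA versions may accept different units.

```shell
# since_ok: hypothetical helper.
# Returns 0 if $2 is an acceptable -since value for the tfactl
# sub-command named in $1, based on the messages quoted in this post:
#   analyze -> n<h|d>    oswbb -> n[mhd]
since_ok() {
    case "$1" in
        analyze) units='hd' ;;
        oswbb)   units='mhd' ;;
        *)       return 2 ;;     # unknown sub-command
    esac
    # A positive integer followed by exactly one permitted unit letter.
    printf '%s\n' "$2" | grep -Eq "^[1-9][0-9]*[$units]\$"
}

# Usage:
#   since_ok analyze 5m || echo "analyze does not take minutes"
```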

But you can run OSWatcher directly as well:

[oracle@rac12sbnode1 ~]$ tfactl oswbb -h

Usage : /u01/app/12.1.0.2/grid/tfa/rac12sbnode1/tfa_home/bin/tfactl.pl oswbb \
          [<OSWatcher Analyzer Options> | -since n[mhd] ]

Options: 

-since n[mhd] Run OSWatcher analyzer for last n [m]inutes or [h]ours or [d]ays.

<OSWatcher Analyzer Options>: -P <name> -L <name> -6 -7 -8 -B <time> -E <time> -A 
     -P <profile name>  User specified name of the html profile generated
                        by oswbba. This overrides the oswbba automatic naming
                        convention for html profiles. All profiles
                        whether user specified named or auto generated
                        named will be located in the /profile directory.

     -A <analysis name> Same as option A from the menu. Will generate
                        an analysis report in the /analysis directory or
                        user can also specify the name of the analysis file
                        by specifying full qualified path name of file.
                        The "A" option can not be used together with the
                        "S" option.
     -S <>              Will generate an analysis of a subset of the data
                        in the archive directory. This option must be used
                        together with the -b and -e options below. See the
                        section "Specifying the begin/end time of the analysis"
                        above. The "S" option can not be used together with
                        the "A" option.

     -START <filename>  Used with the analysis option to specify the first
                        file located in the oswvmstat directory to analyze.

     -STOP <filename>   Used with the analysis option to specify the last
                        file located in the oswvmstat directory to analyze.

     -b <begin time>    Used with the -S option to specify the begin time
                        of the analysis period. Example format:
                        -b Jan 09 13:00:00 2013

     -e <end time>      Used with the -S option to specify the end time
                        of the analysis period. Example format:
                        -e Jan 09 13:15:00 2013

     -L <location name> User specified location of an existing directory
                        to place any gif files generated
                        by oswbba. This overrides the oswbba automatic
                        convention for placing all gif files in the
                        /gif directory. This directory must pre-exist!
     -6                 Same as option 6 from the menu. Will generate
                        all cpu gif files.


     -7                 Same as option 7 from the menu. Will generate
                        all memory gif files.

     -8                 Same as option 8 from the menu. Will generate
                        all disk gif files.



     -NO_IOSTAT         Ignores files in the oswiostat directory from
                        analysis

     -NO_TOP            Ignores files in the oswtop directory from
                        analysis

     -NO_NETSTAT        Ignores files in the oswnetstat directory from
                        analysis

     -NO_PS             Ignores files in the oswps directory from
                        analysis

     -MEM_ALL           Analyzes virtual and resident memory allocations
                        for all processes. This is very resource intensive.

     -NO_Linux          Ignores files in the oswmeminfo directory from
                        analysis

e.g:
   /u01/app/12.1.0.2/grid/tfa/rac12sbnode1/tfa_home/bin/tfactl.pl oswbb
   /u01/app/12.1.0.2/grid/tfa/rac12sbnode1/tfa_home/bin/tfactl.pl oswbb -since 2h

[oracle@rac12sbnode1 ~]$

Those options look quite similar to the ones I have shown you in my previous post about OSWatcher, so I won’t go into detail. Here is an example; note how I can specify the last 10 minutes:

[oracle@rac12sbnode1 ~]$ tfactl oswbb -since 10m

Validating times in the archive...

Warning. The end date you entered is not contained in the archive directory
The end date you entered is:     Tue Aug 02 15:39:06 BST 2016
The last date in the archive is: Tue Aug 02 15:38:55 BST 2016
Defaulting to using the last date in the archive

Scanning file headers for version and platform info...


Parsing file rac12sbnode1_iostat_16.08.02.1500.dat ...


Parsing file rac12sbnode1_vmstat_16.08.02.1500.dat ...




Parsing file rac12sbnode1_top_16.08.02.1500.dat ...


Parsing file rac12sbnode1_ps_16.08.02.1500.dat ...

...

After the analysis has completed, the report is opened in a pager and shown.

This report is best viewed in a fixed font editor like textpad...

OSWatcher Analyzer

Input Archive:       /u01/app/oracle/tfa/repository/suptools/rac12sbnode1/oswbb/oracle/archive
Archive Source Dest: /u01/app/oracle/tfa/repository/suptools/rac12sbnode1/oswbb/oracle/archive
Archive Start Time:  Aug 2 15:28:54 2016
Archive Stop Time:   Aug 2 15:38:55 2016
Hostname:            RAC12SBNODE1
OS Version:          Linux
Snapshot Freq:       30
CPU COUNT:           2

############################################################################
# Contents Of This Report:
#
# Section 1: System Status
# Section 2: System Slowdowns 
#   Section 2.1: System Slowdown RCA Process Level Ordered By Impact
# Section 3: System General Findings
# Section 4: CPU Detailed Findings
#   Section 4.1: CPU Run Queue:
#   Section 4.2: CPU Utilization: Percent Busy
#   Section 4.3: CPU Utilization: Percent Sys
# Section 5: Memory Detailed Findings
#   Section 5.1: Memory: Process Swap Queue 
#   Section 5.2: Memory: Scan Rate 
#   Section 5.3  Memory: Page In: 
#   Section 5.4  Memory: Page Tables (Linux only): 
#   Section 5.5: Top 5 Memory Consuming Processes Beginning
#   Section 5.6: Top 5 Memory Consuming Processes Ending
# Section 6: Disk Detailed Findings
#   Section 6.1: Disk Percent Utilization Findings
#   Section 6.2: Disk Service Times Findings
#   Section 6.3: Disk Wait Queue Times Findings
#   Section 6.4: Disk Throughput Findings
#   Section 6.5: Disk Reads Per Second
#   Section 6.6: Disk Writes Per Second

...

Summary

TFA really is a very useful tool, and this is not only due to the integration of OSWatcher. A lot of useful information that is beyond the scope of this article is available, and the search function is invaluable when trying to hunt down problems in your cluster. Maybe I’ll dedicate another post to that at some later time …
