11G OCM Agent Not Starting Automatically


Whenever I start up gridctrl, the OEM agent is not started automatically.

Location of Start Up Script

Referring back to 11G OCM Upgrade OMR to 11.1.0.7 (2), the startup script location is /etc/rc.d/init.d/gcstartup.

I noticed that I had hacked the 10g version of the seedstup script instead of using the 11g version. I compared the two scripts and found a single difference: the 10g copy has an extra line that exports ORACLE_BASE.

# cd /opt/oracle/app/OracleHomes
# diff \
> db10g/install/unix/scripts/seedstup \
> db11g/install/unix/scripts/seedstup
32d31
< export ORACLE_BASE=/opt/oracle/app

I updated the gcstartup script to use the 11g version.

Diagnosis of Problem

Since gcstartup runs as a service under RHEL, I decided to look for the problem by running it from the console. As the service was already started, I could test the scripts by shutting it down:

# service gcstartup stop

LSNRCTL for Linux: Version 11.1.0.7.0 - Production on 19-NOV-2010 22:07:58

Copyright (c) 1991, 2008, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC)))
The command completed successfully

SQL*Plus: Release 11.1.0.7.0 - Production on Fri Nov 19 22:08:08 2010

Copyright (c) 1982, 2008, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
With the Partitioning option

SQL> Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
With the Partitioning option
opmnctl: stopping opmn and all managed processes...
/bin/su: invalid option -- u
Try `/bin/su --help' for more information.

The last two lines of output point to a problem in the agentstup script (in /opt/oracle/app/OracleHomes/agent10g/install/unix/scripts/): the su command is invoked with an extra parameter (-u) before the user name, and /bin/su rejects that as an invalid option.
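A quick way to confirm this kind of diagnosis is to search the script for su invocations that pass -u. The following is a minimal sketch, assuming the faulty line has the form "$SU -u ..." or "/bin/su -u ..."; the helper name find_su_dash_u is hypothetical and is not part of the Oracle-supplied scripts.

```shell
#!/bin/sh
# find_su_dash_u FILE
# Hypothetical helper: print, with line numbers, every line in FILE that
# passes -u to su (either a literal "su" / "/bin/su" or a $SU variable).
find_su_dash_u() {
    grep -nE '(^|[[:space:]/])(su|\$SU)[[:space:]]+-u([[:space:]]|$)' "$1"
}

# Example, using the agent script path from this post:
# find_su_dash_u /opt/oracle/app/OracleHomes/agent10g/install/unix/scripts/agentstup
```

No output (and a non-zero exit status from grep) means no such invocation was found in the file.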

Fixing the Problem

The corrected version of the script is:

#!/bin/sh
#Script to start and stop the Agent during shutdown and restart of the machine

export ORACLE_HOME=/opt/oracle/app/OracleHomes/agent10g
installUser=oracle
executingUser=$USER
SU=/bin/su

case "$1" in
  start)
    #Commenting as is not required after the bug 5329412
    #OH=`$ORACLE_HOME/bin/emctl getemhome | grep EMHOME | cut -c8-`
    #agentTZ=`grep agentTZRegion $OH/sysman/config/emd.properties | cut -c15-`
    #grep $agentTZ $ORACLE_HOME/sysman/admin/ossupportedtzs.lst
    #if [ $? = 0 ]; then
    #  TZ=$agentTZ
    #  export TZ
    #fi
    if [ .$executingUser = .$installUser ] ; then
      $ORACLE_HOME/bin/emctl start agent
    else
      $SU $installUser $ORACLE_HOME/bin/emctl start agent
    fi
    ;;
  stop)
    if [ .$executingUser = .$installUser ] ; then
      $ORACLE_HOME/bin/emctl stop agent
    else
      $SU $installUser $ORACLE_HOME/bin/emctl stop agent
    fi
    ;;
  *)
    echo $"Usage: $0 {start|stop}"
    exit 1
esac
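The script above relies on su passing everything after the user name to that user's shell, which works here because emctl is itself a shell script. A more explicit variant passes the command through su's -c option. The following is only a sketch of that alternative, not the Oracle-supplied code: the function name run_as_install_user and the DRY_RUN switch are hypothetical, and it uses id -un instead of $USER on the assumption that $USER may not be set when the script runs at boot.

```shell
#!/bin/sh
# Sketch only: run_as_install_user and DRY_RUN are hypothetical names,
# not part of the Oracle-supplied agentstup script.
installUser=oracle
SU=/bin/su

run_as_install_user() {
    if [ "$(id -un)" = "$installUser" ]; then
        # Already the install user: run the command directly.
        if [ -n "$DRY_RUN" ]; then
            echo "$*"
        else
            "$@"
        fi
    else
        # Otherwise switch user; "$*" joins the arguments into a single
        # string for su's -c option.
        if [ -n "$DRY_RUN" ]; then
            echo "$SU $installUser -c '$*'"
        else
            "$SU" "$installUser" -c "$*"
        fi
    fi
}

# Example: show what would be executed without running it.
# DRY_RUN=1 run_as_install_user /opt/oracle/app/OracleHomes/agent10g/bin/emctl start agent
```

With DRY_RUN set, the function only prints the command it would run, which makes it possible to check the su invocation without stopping or starting the agent.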

Verifying the Fix

As the service is now down, I can test the whole sequence by starting the service:

# service gcstartup start

LSNRCTL for Linux: Version 11.1.0.7.0 - Production on 19-NOV-2010 22:33:58

Copyright (c) 1991, 2008, Oracle.  All rights reserved.

Starting /opt/oracle/app/OracleHomes/db11g/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 11.1.0.7.0 - Production
System parameter file is /opt/oracle/app/OracleHomes/db11g/network/admin/listener.ora
Log messages written to /opt/oracle/app/OracleHomes/db11g/log/diag/tnslsnr/gridctrl/listener/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=gridctrl.yaocm.id.au)(PORT=1521)))

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.1.0.7.0 - Production
Start Date                19-NOV-2010 22:33:59
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /opt/oracle/app/OracleHomes/db11g/network/admin/listener.ora
Listener Log File         /opt/oracle/app/OracleHomes/db11g/log/diag/tnslsnr/gridctrl/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=gridctrl.yaocm.id.au)(PORT=1521)))
Services Summary...
Service "PLSExtProc" has 1 instance(s).
  Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

SQL*Plus: Release 11.1.0.7.0 - Production on Fri Nov 19 22:33:59 2010

Copyright (c) 1982, 2008, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> ORACLE instance started.

Total System Global Area  535662592 bytes
Fixed Size                  1314580 bytes
Variable Size             159383788 bytes
Database Buffers          369098752 bytes
Redo Buffers                5865472 bytes
Database mounted.
Database opened.
SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
With the Partitioning option
opmnctl: starting opmn and all managed processes...
Oracle Enterprise Manager 10g Release 5 Grid Control 10.2.0.5.0.
Copyright (c) 1996, 2009 Oracle Corporation.  All rights reserved.
Starting agent .............................................................................................................. started.

The question is how I missed this when I set things up back in January.
