OEM Alerts for Network Issues


I appear to have some sort of network issue in my 11g OCM network.

Symptoms

I received two (2) OEM alerts for different targets:

  1. For LISTENER_gridctrl.yaocm.id.au, the Response Time metric was reported to be 18.590s
  2. For repos.yaocm.id.au, the User Logon Time metric was reported to be 20.025s

Diagnosis

Running the tnsping repos command from various hosts in the network gives the following response times:

Host      TNSPING Response (s)
clyde     7.840
gridctrl  12.540
penrith   13.510
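
To collect these timings repeatably, the tnsping output can be parsed by a script. The following is a minimal sketch, assuming tnsping is on the PATH and ends a successful run with a line of the form "OK (7840 msec)". The function names are my own and are not part of any Oracle tool.

```python
import re
import subprocess
from typing import Optional

# A successful tnsping run ends with a line such as "OK (7840 msec)".
OK_LINE = re.compile(r"OK \((\d+) msec\)")


def parse_tnsping_seconds(output: str) -> Optional[float]:
    """Return the last round trip reported by tnsping, in seconds.

    Returns None when no "OK (n msec)" line is present, for example
    when the alias could not be resolved.
    """
    matches = OK_LINE.findall(output)
    if not matches:
        return None
    return int(matches[-1]) / 1000.0


def tnsping_seconds(alias: str) -> Optional[float]:
    """Run `tnsping <alias>` and parse its timing (hypothetical helper)."""
    result = subprocess.run(
        ["tnsping", alias], capture_output=True, text=True, check=False
    )
    return parse_tnsping_seconds(result.stdout)
```

Calling tnsping_seconds("repos") on each host would produce one row of the table above.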

So, the problem seems to be network-wide. Investigating the IP layer, I see the following average response times to ping -c 5 gridctrl:

Host      OS      PING Response (ms)
clyde     RHEL    3.205
gridctrl  RHEL    0.035
penrith   WIN XP  0.000
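
The average can also be pulled out of the ping summary by a script. This is only a sketch, assuming the Linux summary line of the form "rtt min/avg/max/mdev = a/b/c/d ms" as printed on the RHEL hosts. Other ping implementations print a different summary and would need their own pattern. The function name is hypothetical.

```python
import re
from typing import Optional

# Linux ping summary line: "rtt min/avg/max/mdev = 2.9/3.2/3.6/0.2 ms"
RTT_LINE = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_ping_avg_ms(output: str) -> Optional[float]:
    """Return the average round-trip time in milliseconds, or None."""
    match = RTT_LINE.search(output)
    if match is None:
        return None
    # Group 2 is the "avg" field of min/avg/max/mdev.
    return float(match.group(2))
```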

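Network latency accounts for almost none of the response time reported by tnsping: the ping round trips take a few milliseconds at most, while the tnsping responses take between 7 and 14 seconds.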
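
To put a number on that, the share of each tnsping response that a single ping round trip could account for can be worked out from the two sets of measurements. This is a rough sketch that treats one round trip as the whole network cost, which is a simplification.

```python
# Measurements from the diagnosis: host -> (tnsping seconds, ping average ms)
measurements = {
    "clyde": (7.840, 3.205),
    "gridctrl": (12.540, 0.035),
    "penrith": (13.510, 0.000),
}


def ping_share_percent(tnsping_s: float, ping_ms: float) -> float:
    """Percentage of the tnsping time covered by one ping round trip."""
    return ping_ms / (tnsping_s * 1000.0) * 100.0


for host, (tnsping_s, ping_ms) in measurements.items():
    share = ping_share_percent(tnsping_s, ping_ms)
    print(f"{host}: {share:.4f}% of the tnsping response")
```

Even for clyde, the slowest host at the IP layer, the share is well under a tenth of one percent.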

Resolution

As a temporary fix, I have raised the thresholds for the alerts:

Target                         Metric           Warning (ms)  Critical (ms)
LISTENER_gridctrl.yaocm.id.au  Response Time    20000         30000
repos.yaocm.id.au              User Logon Time  60000
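
As a sanity check, the values from the two alerts can be compared against the raised thresholds. This sketch assumes that the thresholds are expressed in milliseconds and that an alert fires when the value is greater than the threshold. The classify function is my own illustration, not an OEM API.

```python
from typing import Optional


def classify(value_ms: float,
             warning_ms: Optional[float],
             critical_ms: Optional[float]) -> str:
    """Return "critical", "warning" or "clear" for a metric value.

    A threshold of None means that level is not set.
    """
    if critical_ms is not None and value_ms > critical_ms:
        return "critical"
    if warning_ms is not None and value_ms > warning_ms:
        return "warning"
    return "clear"


# (target, metric, alerted value in ms, warning ms, critical ms)
observed = [
    ("LISTENER_gridctrl.yaocm.id.au", "Response Time", 18590, 20000, 30000),
    ("repos.yaocm.id.au", "User Logon Time", 20025, 60000, None),
]

for target, metric, value, warning, critical in observed:
    print(f"{target} {metric}: {classify(value, warning, critical)}")
```

Under those assumptions, both of the alerted values now fall below the warning level.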

This is not satisfactory, but these metrics are not on the critical path for the OCM exam.
