OEM Alarm Clean Up


I have cleaned up some more issues with Grid Control.

Targets with Unknown Status

I removed the following target because the status is unknown:

  • generic_mom_managed_host

Tuning Critical Alarm

There is one (1) critical alarm:

Loader Throughput (rows per second) for gridctrl.yaocm.id.au:4889_Management_Service,XMLLoader0 exceeded the critical threshold (3000). Current value: 3086.2

This alarm is for the OMS and Repository: Management Services and Repository target. The statistics for this alert are:

Last Known Value          3444
Average Value             2785.72
High Value                3481.18
Low Value                 201.03
Warning Threshold         2700
Critical Threshold        3000
Occurrences Before Alert  2
Corrective Action         None

As you can see from the above graph, the metric is almost consistently about 3,500.

I have two (2) choices:

  1. Fix the underlying problem
  2. Adjust the alarm thresholds

Being fundamentally slack, I choose to change the alarm thresholds to:

Level               Old Threshold  New Threshold
Warning Threshold   2700           4000
Critical Threshold  3000           4300
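
As a rough sanity check on the new numbers, the sketch below compares each threshold with the High Value reported above (3481.18). The helper name headroom is made up for illustration and is not part of Grid Control; the figures are copied from the statistics and the threshold table.

```python
# Figures copied from the alert statistics and the threshold table above.
high_value = 3481.18

old = {"warning": 2700, "critical": 3000}
new = {"warning": 4000, "critical": 4300}


def headroom(threshold, observed):
    """Percentage by which a threshold sits above (+) or below (-) an observed value."""
    return (threshold - observed) / observed * 100


for level in ("warning", "critical"):
    print(
        f"{level:8s} old: {headroom(old[level], high_value):+6.1f}%  "
        f"new: {headroom(new[level], high_value):+6.1f}%"
    )
```

A negative figure means the observed high already exceeds the threshold, which is the case for both old values; both new values sit above the observed high.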

I seem to have to wait for two (2) consecutive alert evaluations (at ten (10) minute intervals) before the alarm clears.
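
That delay fits the Occurrences Before Alert setting of 2, if the same count applies to clearing as to raising. The sketch below is only a simplified model of that assumption, not how Grid Control is actually implemented; the function evaluate and its parameters are hypothetical.

```python
def evaluate(samples, critical, occurrences=2, initially_raised=False):
    """Return the alarm state (True = raised) after each sample.

    Assumption: the state changes only after `occurrences` consecutive
    samples disagree with the current state, for raising and clearing alike.
    """
    raised = initially_raised
    streak = 0
    states = []
    for value in samples:
        breaching = value > critical
        if breaching != raised:
            streak += 1
            if streak >= occurrences:
                raised = breaching
                streak = 0
        else:
            streak = 0
        states.append(raised)
    return states


# Two values taken from the alert above, evaluated against the new
# critical threshold (4300) with the alarm already raised.
print(evaluate([3444, 3086.2], critical=4300, initially_raised=True))
```

Under this model the alarm is still raised after the first evaluation and clears on the second, which is about twenty minutes at ten-minute intervals.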
