Implementation of a Logical Standby Database: A Failure


Three (3) years ago, I recommended and set up a logical standby database for a critical business function. This is now widely viewed as a failure.

Summary

My implementation of a logical standby database is viewed as a failure by three (3) key stakeholders:

  1. The business client sees the outages needed to rebuild the logical standby database every time there is a problem as unacceptable.
  2. Senior management sees the time needed by DBAs to maintain the logical standby database as being too high.
  3. The DBAs see logical standby databases as unreliable, needing to be rebuilt in order to solve problems with hanging, slow performance, and long lags.

The root cause of my failure was not properly training the DBAs in how to manage logical standby databases. As a result, any time the logical standby database became ‘stuck’, the DBA would opt for a rebuild to get around the problem. This rebuild caused an outage and took several hours of DBA time.

Background

The critical business function was built around a third party product that had an Oracle RDBMS 11.2 database as its back-end. However, this product lacked certain reporting abilities that were needed to create near real-time dashboards about the performance of this business function.

The normal solution would have been to load the data from the product into an ETL process that populated data-marts which, in turn, fed the dashboards. For budget reasons, this solution was not pursued immediately; a temporary solution was required until the budget was worked out and approved.

I recommended a logical standby database as the interim solution because:

  • It built upon our existing experience with physical standby databases;
  • It was included in our Enterprise Edition licence as Oracle Data Guard SQL Apply;
  • It allowed developers to modify the product schema to support reporting, through extra indices and materialised views, in a different database without interfering with vendor support in the primary database (see the sketch after this list);
  • It allowed developers to install additional schemas to hold:
    • Extracts from product schema
    • PL/SQL procedures
    • Jobs
  • Changes in the product schema were immediately replicated to the reporting database without DBA intervention;
  • The deployment of the solution would be quick.
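
As referenced above, here is a minimal sketch of how extra reporting objects can be created on a logical standby without touching the primary. The schema and object names (PRODUCT, REPORTING, ORDERS) are hypothetical, and the statements should be checked against your own build procedure before use:

    -- Keep tables maintained by SQL Apply protected, but allow work in local schemas.
    ALTER DATABASE GUARD STANDBY;

    -- Temporarily lift the guard in this session to add a reporting index
    -- to a replicated table (hypothetical PRODUCT.ORDERS).
    ALTER SESSION DISABLE GUARD;
    CREATE INDEX product.orders_rpt_ix ON product.orders (order_date, status);
    ALTER SESSION ENABLE GUARD;

    -- A materialised view in a local, non-replicated reporting schema.
    CREATE MATERIALIZED VIEW reporting.daily_order_summary AS
      SELECT order_date, status, COUNT(*) AS order_count
      FROM   product.orders
      GROUP  BY order_date, status;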

The ETL process is done by jobs on the reporting database, and the results are populated into tables that feed the dashboards. So, this is still a data-mart solution, but one implemented within a logical standby database.
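
A minimal sketch of one such job, assuming a hypothetical REPORTING schema with a REFRESH_DASHBOARD_TABLES procedure that performs the extract (the names and schedule are illustrative, not the actual implementation):

    BEGIN
      DBMS_SCHEDULER.CREATE_JOB(
        job_name        => 'REPORTING.REFRESH_DASHBOARD',
        job_type        => 'STORED_PROCEDURE',
        job_action      => 'REPORTING.REFRESH_DASHBOARD_TABLES',
        repeat_interval => 'FREQ=MINUTELY;INTERVAL=15',
        enabled         => TRUE,
        comments        => 'Extract from the replicated product schema into dashboard tables');
    END;
    /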

Problems Encountered

Over the past three (3) years, there have been two (2) major problems with logical standby databases, as described by the following MOS notes:

  1. Slow Logical apply, due to objects: DBMS_TABCOMP_TEMP_UNCMP OR DBMS_TABCOMP_TEMP_CMP (Doc ID 1114000.1)
  2. A logical standby apply process is slow, with the BUILDER showing error ORA-44604 (Doc ID 1498892.1)

First Problem

The first problem, Slow Logical apply, due to objects: DBMS_TABCOMP_TEMP_UNCMP OR DBMS_TABCOMP_TEMP_CMP (Doc ID 1114000.1), was identified very early on, and the build procedure for the logical standby database was updated to include the work-around.
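
The MOS note itself remains the authoritative source for the work-around; I have not reproduced it here. As a hedged sketch of the general shape of such a fix, skip rules along the following lines keep the compression advisor's transient tables out of SQL Apply (the schema name is a placeholder, and SQL Apply is stopped while the rules are added):

    ALTER DATABASE STOP LOGICAL STANDBY APPLY;

    BEGIN
      -- Placeholder schema; the advisor tables appear in whichever schema ran it.
      DBMS_LOGSTDBY.SKIP(stmt => 'DML',        schema_name => 'SCOTT', object_name => 'DBMS_TABCOMP_TEMP%');
      DBMS_LOGSTDBY.SKIP(stmt => 'SCHEMA_DDL', schema_name => 'SCOTT', object_name => 'DBMS_TABCOMP_TEMP%');
    END;
    /

    ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;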

Recently, I was astounded to learn that other DBAs were ignoring this work-around. They said they did not understand why it was needed and were reluctant to follow any procedure that they did not understand.

Thus, they always rebuilt the Logical Standby Database whenever the logical apply slowed to a crawl.

Second Problem

The second problem, A logical standby apply process is slow, with the BUILDER showing error ORA-44604 (Doc ID 1498892.1), occurred after a release upgrade on the primary database. The vendor supplied a large number of DDL statements as part of each release.

As a consequence, the logical standby database failed to catch up in a reasonable time. In earlier releases, this meant a complete rebuild of the logical standby database.
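
Deciding whether a lagging standby is worth tuning starts with measuring how far behind it actually is. A simple check against the standard V$LOGSTDBY_PROGRESS view on the logical standby:

    -- Approximate apply lag in minutes on the logical standby.
    SELECT applied_time,
           newest_time,
           ROUND((newest_time - applied_time) * 24 * 60) AS lag_minutes
    FROM   v$logstdby_progress;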

For the latest release, I used the technique I described in Rough and ready performance tuning of logical standby database instance. When the other DBAs saw the logical standby database catch up to the primary database within an hour, they were amazed that such a thing was possible. They had resigned themselves to doing a complete rebuild of the logical standby database.
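
The details are in that post. Purely as an illustration of the kind of knobs involved, and not the exact technique, SQL Apply resources can be raised through DBMS_LOGSTDBY.APPLY_SET; the values below are hypothetical and would need to fit the memory and CPU actually available on the standby:

    ALTER DATABASE STOP LOGICAL STANDBY APPLY;

    BEGIN
      DBMS_LOGSTDBY.APPLY_SET('MAX_SGA', 2048);        -- LCR cache, in MB
      DBMS_LOGSTDBY.APPLY_SET('MAX_SERVERS', 24);      -- overall SQL Apply processes
      DBMS_LOGSTDBY.APPLY_SET('APPLY_SERVERS', 16);    -- parallel appliers
      DBMS_LOGSTDBY.APPLY_SET('PREPARE_SERVERS', 4);   -- preparers for log mining
    END;
    /

    ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;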

Conclusion

Both of these problems revealed that the other DBAs had only one (1) way to deal with logical standby database problems: rebuild. This was a process they knew, and it was reliable in that it fixed whatever the problem was. There was no need to go deeper. In their eyes, their logical standby database tool-kit was adequate for the task.

If I had spent the time to train the other DBAs in logical standby databases, they might have had a much larger tool-kit to draw upon when confronted by logical standby database problems.
