Powell’s approach is expressed in the Introduction as follows:
Traditionally, relational database model design (and particularly the topic of normalization), has been much too precise for commercial environments. There is an easy way to interpret normalization, and this book contains original ideas in that respect.
Here, I think, is a common misconception: the conflation of database design with normalization. Normalization is a technique, not the design process. I am looking for a way of conceiving how to organize information in a database, and normalization does not give me that.
The concept of design according to Powell is:
Design is the process of ensuring that it all works without actually building it. Design is a little like testing something on paper before spending thousands of hours building it in possibly the wrong way.
I think Powell is talking about the physical design here. He wants to get a prototype working on paper first before implementing it. I am more interested in the logical design process.
The objectives of database design are listed by Powell (Ch. 1) as:
- “Aim for a well-structured database model”
- “Data integrity”
- “Support both planned queries and ad-hoc or unplanned queries”
- “Support the objectives of the business”
- “Provide adequate performance for any required change activity”
- “Each table in a database model should preferably represent a single subject or topic”
- “Future growth must always be a serious consideration”
- “Future changes can be accommodated for, but potential structural changes can be difficult to allow for”
- “Minimize dependence between applications and database model structures if you expect change.”
There is really no clear direction in these objectives. Brooks (in The Design of Design) follows Vitruvius’s dictum of firmness, usefulness, and delight (p. 140). The objective of firmness would appear to conflict with that of design evolution. However, my reading of Brooks is that he is talking about a consistency of style and vision. That is what firmness is about. There is a solidity to the design.
I think the last of Powell’s objectives can be read in terms of coupling in Computer Science. The lower the degree of coupling between the application and the database, the greater the freedom to modify one with little impact on the other.
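One way to picture low coupling is an application that reads only through a view, so the base tables can be restructured behind it. The following is a minimal sketch using SQLite, not anything taken from Powell; the table, view, and function names (`customer_v1`, `customer_v2`, `customer_names`, `customer_name`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original structure: the name is stored in a single column.
conn.executescript("""
    CREATE TABLE customer_v1 (id INTEGER PRIMARY KEY, full_name TEXT);
    INSERT INTO customer_v1 VALUES (1, 'Ada Lovelace');
    CREATE VIEW customer_names AS
        SELECT id, full_name AS name FROM customer_v1;
""")


def customer_name(conn, customer_id):
    """Application code: it knows the view, not the base tables."""
    row = conn.execute(
        "SELECT name FROM customer_names WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


before = customer_name(conn, 1)

# Structural change: the name is split into two columns in a new table,
# and the view is redefined to present the same shape as before.
conn.executescript("""
    DROP VIEW customer_names;
    CREATE TABLE customer_v2 (
        id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT
    );
    INSERT INTO customer_v2 VALUES (1, 'Ada', 'Lovelace');
    DROP TABLE customer_v1;
    CREATE VIEW customer_names AS
        SELECT id, first_name || ' ' || last_name AS name FROM customer_v2;
""")

after = customer_name(conn, 1)
```

The application function is unchanged across the restructuring; only the view definition absorbs the change. A view is just one possible decoupling layer, chosen here because it keeps the sketch short.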
I think other books I have read, like Powell’s, are confused over what is actually in the database. They talk about information and data. The database is really a set of propositions that are true at a certain point in time. I think C. J. Date’s approach is more disciplined in this respect, as he does not flounder around with the distinction between knowledge, information, and data.
In his discussion of normalization (Ch. 4), Powell is really discussing data quality. I think data quality should be an explicit design attribute for databases. The general model is that data coming into the database has a low quality and various users of the database have differing levels of data quality to match their needs.
The authors I have read so far seem to mistake data integrity for high data quality. I think data integrity merely means that what is entered into the database comes out again as the same value. It is entirely possible for a database of high data quality to have poor data integrity, because poor-quality data is refined into higher-quality data.
Powell is very down on the academic description of Normal Forms, and prefers a more colloquial expression. I think Churcher’s approach in Beginning Database Design is still a better way. She approaches Normal Forms by way of a folksy story.
As for 4NF and 5NF, C. J. Date says that these are of historical interest only (2005, p. 143). I think Powell does a disservice by being so dismissive of the normal forms because he hinders people from developing their knowledge of database theory.
Although Powell considers denormalization to be an advanced technique (Ch. 6), he seems to be rather flippant about the data quality issue. In my view, denormalizing the database lowers the quality of data. In other words, the truth of the propositions in the database is no longer reliable.
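A small illustration of how denormalization can undermine the truth of the stored propositions: when a customer's city is repeated on every order row, a partial update leaves the database asserting two different cities for the same customer. This is a minimal sketch using SQLite; the table and column names (`orders_denorm`, `customer_city`) and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Denormalized: the customer's city is copied onto every order row.
conn.executescript("""
    CREATE TABLE orders_denorm (
        order_id INTEGER PRIMARY KEY,
        customer TEXT,
        customer_city TEXT
    );
    INSERT INTO orders_denorm VALUES (1, 'Acme', 'Leeds');
    INSERT INTO orders_denorm VALUES (2, 'Acme', 'Leeds');
""")

# The customer moves, but only one of the redundant copies is updated.
conn.execute(
    "UPDATE orders_denorm SET customer_city = 'York' WHERE order_id = 2"
)

# The database now holds contradictory propositions about Acme's city.
cities = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT customer_city FROM orders_denorm "
        "WHERE customer = 'Acme' ORDER BY customer_city"
    )
]
```

After the update, `cities` contains both 'Leeds' and 'York', so at least one row is stating something false. In a normalized design the city would be stored once, and this particular contradiction could not arise.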
I am surprised that Powell (Ch. 6) considers the data warehouse to be different from a relational database. A database is merely a set of true propositions about the world whatever the number of propositions a database contains. An OLTP database and a data warehouse have the same conceptual basis.
In the chapter on Data Warehouses (Ch. 7), Powell misses the essential distinction between OLTP and data warehouses: they have different approaches to time. An OLTP database concentrates on the present: what is the state of the business at this very moment? A data warehouse maintains a timeline of the past: such and such happened at that time. Yet, they are both concerned with true propositions.
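The difference in the treatment of time can be sketched with two tables: one that is overwritten to hold only the current state, and one that accumulates dated facts. This is a minimal illustration using SQLite, not a description of how any real warehouse is loaded; the names (`stock_now`, `stock_history`, `record`) and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- OLTP style: one row per item, holding only the present state.
    CREATE TABLE stock_now (item TEXT PRIMARY KEY, qty INTEGER);
    -- Warehouse style: one row per observation, each true as of a date.
    CREATE TABLE stock_history (item TEXT, qty INTEGER, as_of TEXT);
""")


def record(conn, item, qty, as_of):
    """Overwrite the current state and append to the timeline."""
    conn.execute(
        "INSERT OR REPLACE INTO stock_now VALUES (?, ?)", (item, qty)
    )
    conn.execute(
        "INSERT INTO stock_history VALUES (?, ?, ?)", (item, qty, as_of)
    )


record(conn, "widget", 10, "2010-09-01")
record(conn, "widget", 7, "2010-09-14")

# The present: a single proposition about the item.
current = conn.execute(
    "SELECT qty FROM stock_now WHERE item = 'widget'"
).fetchall()

# The past: every proposition, each qualified by when it held.
timeline = conn.execute(
    "SELECT as_of, qty FROM stock_history "
    "WHERE item = 'widget' ORDER BY as_of"
).fetchall()
```

Both tables hold true propositions. The first says what the quantity is now; the second says what the quantity was on each date.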
I did not follow Powell’s examples too closely. I think the design examples were too straightforward. There was no exploration of alternate paths in order to get closer to a conceptual model.
Powell’s book can be dismissive at times of the underlying relational theory. At least the methodology he espouses is not as bureaucratic as that of Hernandez’s Database Design for Mere Mortals.
Of the three books reviewed so far, I think Churcher’s book has the best approach by concentrating on use cases.
Brooks, Fred. The Design of Design. Addison-Wesley, ©2010.
Churcher, Clare. Beginning Database Design. Apress, ©2007.
Date, Chris J. Database in Depth: Relational Theory for Practitioners. O’Reilly Media, Inc., ©2005.
Hernandez, Michael James. Database Design for Mere Mortals. Addison-Wesley, ©2003.
Powell, Gavin. Beginning Database Design and Implementation. Wrox Press, ©2006. Books24x7. <http://common.books24x7.com/book/id_12449/book.asp> (accessed September 14, 2010).