I am now reading Database Design for Mere Mortals (by Michael James Hernandez) in my enthusiasm about learning about design after my encounter with Fred Brook‘ The Design of Design. I am also reading this book in light of my experience with Clare Churcher‘s book, Beginning Database Design, which I had reviewed previously here.
The key drivers for database design are expressed as:
Yet the primary reason you should be concerned with database is that it’s crucial to the consistency, integrity, and accuracy of the data in a database. (p.28)
At least, we know what we are aiming for in the design of a database. Notice that performance or ease of uase are not among the objectives. Nor is the agility or flexibility of the database design. The important thing is to get the information correct first, and then worry about these other things. Of these three (3) drivers, the most important is accuracy because:
Inaccurate information is probably the most determinental result of improper database design—it can adversely affect your organization’s bottom line. (p.28)
Italics in original
At one previous place of employment, every business unit had some way of calculating the cash position of the organisation, and they all came up with different answers. There was much angst among the senior managers until an edict was promulugated that the cash position reports produced by the computer centre were the correct ones. This put more stress on the database team to ensure that the data was accurate.
From these key drivers, the objectives of a good design are listed as:
- “The database supports both required and ad hoc information retrieval.
- “The tables are constructed properly and efficiently.
- “Data integrity is imposed at the field, table, and relationship levels.
- “The database supports business rules relevant to the organization.
- “The database lends itself to future growth.” (p.33) (Italics in original)
In contrast to Churcher (2007), Hernandez does not attempt to explain or directly use normal forms (p.36) because he reckons that the audience would find it too confusing without the proper foundations. So, we have a group of database designers who are ignorant of the underlying theory, and who will be unable to advance their knowledge because they lack the proper foundation. At least Churcher’s book gives her readers a chance to deepen their knowledge. Instead, Hernandez stresses the importance of database theory:
The relational database is based on two branches of mathematics known as set theory and first-order predicate logic. This very fact is what allows the relational databse to guarantee accurate information. These branches of mathematics also provide the basis for formulating good design methodologies and the building blocks necessary to create good relational database structures. (p.30)
Italics in original
Hernandez appears to be strongly influenced by the Structured Systems Analysis and Design methodology (SSADM). The parody of this methodology is the Waterfall Model. Hernandez is big on Mission Statements (pp. 101-106) and interviews (pp. 91-99) to gather requirements. It is from these interviews and examination of the current database (pp. ) that Hernandez generates a list of tables (p. 181 ff).
This is the same sort of design error that Churcher also makes: the design follows on from the identification of entities whether be objects or subjects. This is really a static view of the data model. Both Churcher and Hernandez are seeking the concrete before looking at the activity. This is looking for nouns before looking at verbs. I would propose that the designer should look at the verbs first, and then look for the nouns. The essence of a database is to do things not merely to exist.
The discussion about keys by Hernandez seems harmless except for the rule that:
Each primary key within the database must be unique—no two tables should have the same primary key unless one of them is a subset table. (p. 267) (Italics in original)
This reasoning precludes the use of surrogate keys and makes database design more complex than it should be. The wrongness of this rule is that it assumes that all primary keys have the same data domain. This is really a very silly rule. Why add an extra burden if there is no sense to it. The only place I can think such a rule makes sense is for NoSQL databases where the primary keys do actually occupy the same domain.
So much of what Hernandez writes reminds of those piles of forms we used to fill out for the SSADM. Most of those forms were dutifully filed away and forgotten. It was just hazing for junior programmers. (We used to think the acronym was missing ‘IS’ for Information Systems.)
With regards to Business Rules (p. 403 ff), Hernandez makes the following distinction:
There are two major types of business rules: database oriented and application-oriented. (p. 407) (Italics in original)
Here Hernandez is mixing logical and physical design. At the logical level of the design, there are only business rules. How and where they are implemented is a matter for the physical design phase.
All in all, this book has now become the standard method in database design. Yet, I am left hungering for more.