What is a database?


What is a database? That is the fundamental question for database design.

There appear to be four classes of definitions:

Physical Definitions

The physical definitions of a database appear to be standard in encyclopedias and in reference manuals from RDBMS vendors. This emphasis on the physicality of the database is also apparent in the historical texts that were current around the early 1990s.

Wikipedia gives the following definition:

A database consists of an organized collection of data for one or more uses, typically in digital form.

Wikipedia contributors (2010)
Emphasis in original

The glossary of the Oracle 11.2 Concepts guide also presents a physical view, defining a database as:

Organized collection of data treated as a unit. The purpose of a database is to store and retrieve related information. Every Oracle database instance accesses only one database in its lifetime.

Oracle Corporation (2010, Glossary-4)

These industry quotes are all current. It is also interesting to track the historical notions of what a database is.

Deen’s definition is:

a generalised integrated collection of data which is structured on natural data relationships so that it provides all necessary access paths to each unit of data in order to fulfil the differing needs of all users.

Deen (1977, 5)

Hughes (1991, 48-49) defines database systems as having three characteristics:

  1. Large amount of data on external storage;
  2. “…complex interrelationships”;
  3. Data is sharable.

Flemming and Von Halle are much more honest:

The database aspect of relational databases is much more confusing. There is no universal definition of a relational database except that it is a computerized structure for storing data that the user regards as relational tables (rows and columns). Beyond that concept, every relational DBMS product relies on its own interpretation of what a database is.

Flemming and Von Halle (1989, 58)
Italics in original

These definitions refer to a physical design, not a logical design. Anyone who follows this type of definition would be correct to consider the physical layout of data on disk or in memory.

General Definitions

Powell (2006) says that a database is:

A collection of information, preferably related information, and preferably organized.

No Definitions

The following authors give no explicit definition of what a database is:

  • Churcher (2007)
  • Kent (2000)
  • Kimball et al. (2008)
  • Pascal (2000)

Relational Definitions

All of these definitions involve C.J. Date. It is almost a dichotomy in the database field: there is the C.J. Date explanation, and there is nearly everyone else. This has been the case for the 40 years since the birth of relational database theory.

Date and Darwen define a database as follows:

A database is a named container for relvars; the content of a given database at any given time is a set of (database) relvars.

Date and Darwen (1998, 144)
Emphasis in original

Once we know where we are going and know where we are, we can start planning the journey. If the above definition is what we expect of a database, then we need to design a container of relvars. This leads to the question: what are relvars?
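The "named container for relvars" reading can be made concrete with a small sketch. It is only an illustration of the quoted definition, not code from Date and Darwen: the class names Relvar and Database, the relvar S, and the sample rows are all hypothetical, and attribute types are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Relvar:
    """A relation variable: a name, a heading, and a current relation value."""
    name: str
    heading: tuple                  # attribute names only (types omitted)
    body: frozenset = frozenset()   # current value: a set of tuples

    def assign(self, rows):
        """Replace the current relation value with a new one."""
        new_body = frozenset(tuple(row) for row in rows)
        if any(len(row) != len(self.heading) for row in new_body):
            raise ValueError(f"row does not match heading of {self.name}")
        self.body = new_body


@dataclass
class Database:
    """A named container; its content at any time is a set of relvars."""
    name: str
    relvars: dict = field(default_factory=dict)   # keyed by relvar name

    def add(self, relvar):
        if relvar.name in self.relvars:
            raise ValueError(f"relvar {relvar.name} already exists")
        self.relvars[relvar.name] = relvar


# Hypothetical example data, not taken from any of the cited texts.
db = Database("suppliers")
s = Relvar("S", ("sno", "city"))
db.add(s)
s.assign([("S1", "London"), ("S2", "Paris")])
```

In this sketch the container itself never holds rows directly. It only holds relvars, and each relvar holds its current value, which mirrors the wording of the definition.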

In 2004, Date wrote that “a database is really a collection of true propositions” (Date 2004, 15; emphasis in original). In 2007, Date was still saying the same thing:

…a database can be regarded as a collection of propositions, assumed by convention to be ones that evaluate to TRUE.

Date (2007, 81)
(Italics in original)

This is a summary of what he wrote earlier:

I’ve said a database can be thought of as a collection of true propositions. In fact, a database, together with the operators that apply to propositions represented in that database (or to sets of such propositions, rather), is a logical system. …it was Codd’s very great insight…that…a database isn’t really a collection of data; rather, it’s a collection of facts, or in other words, true propositions.

Date (2005, 76)
(Italics in original)
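One way to picture the "collection of true propositions" view is to pair a predicate with a set of tuples and read each tuple as one proposition. The sketch below is an assumption-laden illustration, not Date's own notation: the predicate text, the heading, and the EMP rows are invented for the example.

```python
# Hypothetical predicate for a relvar EMP. Each tuple in the body
# instantiates the predicate and yields one proposition.
PREDICATE = "Employee {name} works in department {dept}."
HEADING = ("name", "dept")

# Hypothetical current value of EMP: a set of tuples.
emp = {("Alice", "Sales"), ("Bob", "Research")}


def propositions(body):
    """Return the propositions represented by the tuples, in sorted order."""
    return sorted(PREDICATE.format(**dict(zip(HEADING, row))) for row in body)


def is_asserted(body, row):
    """A proposition is asserted (taken to be TRUE) if its tuple is present."""
    return tuple(row) in body


for sentence in propositions(emp):
    print(sentence)
```

Read this way, inserting a tuple asserts a proposition and deleting a tuple withdraws it, which is why the quoted passages speak of facts rather than of data.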

In 2007, Date and Darwen argued that we should use the term dbvar (database variable) instead of database, in order to emphasize that it contains values (Date and Darwen 2007, 389).

Once we have the concept of a dbvar, we can more confidently approach the problem of a temporal database.

…the overall database can be thought of as a sequence of database values, where each such value is timestamped with the time of the update that produced it, and the complete sequence is ordered chronologically. The most recent database value in the sequence is, of course, the current one…

Date et al. (2003, 302)
Emphasis in original
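The quoted idea of a chronologically ordered sequence of timestamped database values can be sketched as follows. This is a minimal illustration under stated assumptions, not the model developed in the cited book: the class name Dbvar, its methods, and the representation of a database value as a frozen set of (relvar name, row) pairs are all hypothetical simplifications.

```python
from datetime import datetime


class Dbvar:
    """A database variable whose history is a list of (timestamp, value) pairs."""

    def __init__(self, initial, at):
        # Each database value is stored as an immutable set.
        self.history = [(at, frozenset(initial))]

    def update(self, new_value, at):
        """Log a new database value; timestamps must strictly increase."""
        if at <= self.history[-1][0]:
            raise ValueError("updates must be logged in chronological order")
        self.history.append((at, frozenset(new_value)))

    @property
    def current(self):
        """The most recent database value in the sequence."""
        return self.history[-1][1]

    def as_of(self, when):
        """Return the value that was current at `when`, or None if none existed."""
        result = None
        for stamp, value in self.history:
            if stamp > when:
                break
            result = value
        return result


# Hypothetical example data.
t1 = datetime(2010, 1, 1)
t2 = datetime(2010, 6, 1)
db = Dbvar({("EMP", "Alice")}, at=t1)
db.update({("EMP", "Alice"), ("EMP", "Bob")}, at=t2)
```

Here db.current returns the value logged at t2, while db.as_of(datetime(2010, 3, 1)) returns the earlier value that contains only Alice.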

Bibliography

Churcher, Clare (2007), Beginning Database Design, APress: USA.

Date, C.J. (2004), “Chapter 1: An Overview of Database Management”, An introduction to database systems, 8th Ed., Pearson Education, Inc.: USA, see pp. 3-31.

Date, C.J. (2005), “Chapter 4: Relation Variables”, Database in depth: relational theory for practitioners, O’Reilly Media, Inc.: USA, see pp. 61-80.

Date, C.J. (2007), “Chapter 3: Constraints and Predicates”, Logic and Databases: The Roots of Relational Theory, Trafford Publishing: USA, see pp. 67-94.

Date, C.J., and Darwen, Hugh (1998), “Chapter 6: RM Prescriptions”, Foundation for Object/Relational Databases: The Third Manifesto, Addison Wesley Longman: USA, see pp. 99-169.

Date, C.J., and Darwen, Hugh (2007), “Appendix D: What Is a Database?”, Databases, types and the relational model: the third manifesto, Addison Wesley Longman: USA, see pp. 389-391.

Date, C.J., Darwen, Hugh, and Lorentzos, Nikos A. (2003), “Chapter 15: Stated Times and Logged Times”, Temporal data and the relational model: a detailed investigation into the application of interval and relation theory to the problem of temporal database management, Morgan Kaufmann: USA, see pp. 297-312.

Deen, S.M. (1977), “Chapter 1: Introduction”, Fundamentals of Data Base Systems, Macmillan: UK, see pp. 1-10.

Flemming, Candice C. and Von Halle, Barbara (1989), “Chapter 3: Relational Concepts and SQL”, Handbook of relational database design, Addison-Wesley: USA, see pp. 31-67.

Hughes, John G. (1991), “Chapter 2: Principles of object-oriented systems”, Object-Oriented Databases, Prentice Hall International: UK, see pp. 48-78.

Kent, William (2000), Data and Reality, 1stBooks: USA.

Kimball, Ralph, Ross, Margy, Thornthwaite, Warren, Mundy, Joy, and Becker, Bob (2008), The Data Warehouse Lifecycle Toolkit, Second Edition, John Wiley & Sons, Books24x7 <http://common.books24x7.com/book/id_24441/book.asp> (accessed September 25, 2010).

Oracle Corporation (2010), “Glossary”, Oracle® Database Concepts 11g Release 2 (11.2), Oracle Corporation: USA, published February 2010.

Pascal, Fabian (2000), Practical Issues in Database Management: A Reference for the Thinking Practitioner, Addison-Wesley: USA.

Powell, Gavin (2006), “Glossary”, Beginning Database Design and Implementation, Wrox Press: USA, Books24x7 <http://common.books24x7.com/book/id_12449/book.asp> (accessed October 4, 2010).

Wikipedia contributors (2010), “Database,” Wikipedia, The Free Encyclopedia, <http://en.wikipedia.org/w/index.php?title=Database&oldid=388211008> (accessed October 3, 2010).
