Monday, November 25, 2013

Codd's Rules

Edgar Frank "Ted" Codd is considered a pioneer of relational model for databases. He proposed 13 rules to say why a DataBase Management System (DBMS) is called a Relational and what it has to give in order to be called as Relational.

These are the fundamental rules to be followed by every database. However some databases do not follow some of the rules Codd specified.

The following are the rules:

Rule 0 – Foundation Rule

It states that the system must qualify as a RELATIONAL as a database as well as MANAGEMENT SYSTEM.

RELATIONAL specifies that data is represented in the form of tuples (records) and grouped into relations.
A database is said to be RELATIONAL if the data is stored in the form of tables and there can exist a relationship among the tables.

MANAGEMENT SYSTEM specifies that the database is maintained well through some procedures and processes so that the functionality of the database is good.

Rule 1 – The Information Rule

The information in the RDBMS is always presented in the form of rows and columns.

The data present in those rows and columns is called VALUE.

Rule 2 – The Guaranteed Access Rule

All the data should be accessible.

There should be a PRIMARY KEY on one of the columns of the table.

Data can be accessed from the database by specifying the name of the table. When a particular row is to be accessed then the name of the table and the name of the column on which the PRIMARY KEY is specified should be mentioned.

Rule 3 – Systematic Treatment of NULL Values

DBMS should allow the fields of the table remain empty or NULL. The moto behind this is there will be an indication that some data is missing.

These NULLs are different from the regular values and are independent of data types. NULL cannot be considered as ZERO (0) even.

NULL refers neither positive nor negative. It just denotes an empty space.

Rule 4 – Active Online Catalog based on the Relational Model

Here the Catalog means Dictionary. There should be a catalog which contains the definitions of database objects like Tables, Views, Indexes, Statistics etc.

"Active Online" specifies that there should be access to the users to know the catalog information.

"Based on the Relational Model" User should be able to know this information through the query language which he/she uses to get the information from the regular database objects.

There is a uniform means specified by SQL standards to know this metadata called INFORMATION_SCHEMA.

Rule 5 – The Comprehensive Data Sub-Language Rule

There should be a language communicating through which the data can be accessed by the Users.

Such language contains some syntax and semantics defined. This resulted in the development of Structured Query Language (SQL).

Every database user should know this language to access the data. Otherwise if the data is accessible if user uses any kind of language then there won't be a proper organization in the database.

This SQL is used in the database interface as well as within the application programs.

This language supports Data Definition, Data Manipulation, Constraints, Indexes etc.

Rule 6 – The View Updating Rule

All the Views that are theoretically updatable must be updated by the system.

This means the content of the view should be modified if the content of the table with which the view is referencing, is modified.

Some databases do not follow this rule. In Microsoft SQL Server, you can see that this rule is followed.

Rule 7 – High-Level Insert, Update and Delete

All the operations like INSERT, UPDATE and DELETE should be supported.

Even the SET operations of SET THEORY must be supported.

This means that data can be retrieved from a relational database in sets constructed of data from multiple rows and/or multiple tables.

A single statement is enough to update multiple number of rows and values so that there is less work for user.

Rule 8 – Physical Data Independence

Any changes in the physical storage of the data should not effect the application performance.

If some supporting tables are modified or moved from one disk to another then a situation shouldn't arise where application should be changed.

Rule 9 – Logical Data Independence

Any changes to the Logical storage like tables should not affect the application structure.

If a table structure is changed, like it is split into two tables then the user shouldn't have a situation that he/she has to get the data from two tables. Though there are two tables, user should get the result in the form of single table obtained by joining the two tables.

Logical Data Independence is tougher to achieve than Physical Data Independence.

Rule 10 – Integrity Independence

Integrity Constraints are used to enforce accuracy and consistency in relational databases.

They must be specified separately from the application program and are stored in the catalog.

If any change has to be made to these constraints then that change should not effect the application.

The database should maintain its own integrity rather than depending on other programs. This makes RDBMS independent of the Front-End.

Rule 11 – Distribution Independence

If a database is distributed across different networks then there shouldn't be any difference in the performance of it.

Even the users shouldn't know that the database is distributed across the network and can be able to work as usual.

Rule 12 – The Non-Subversion Rule

Suppose a user is provided a low level access to the data then it should be ensured that he/she is unable to cross the integrity rule and change the data.

Such security is achieved by using some locking and encryption mechanisms.

No comments:

Post a Comment