Wednesday, 15 March 2017

NORMALIZATION

Database Normalization:
Database normalization is the process of efficiently organizing data in a database. There are two reasons of the normalization process:
·      Eliminating redundant data, for example, storing the same data in more than one tables.
·     Ensuring data dependencies make sense.
What is Normalization?
·         Normalization is a database design technique which organizes tables in a manner that reduces redundancy and dependency of data.
·         It divides larger tables to smaller tables and link them using relationships.
·      Normalization helps produce database systems that are cost effective, cost effective and have better security models.
·         most practical applications normalization achieves its best in 3rd Normal Form. 
  Types of  Database Normal Forms
1NF (First Normal Form) Rules:
·         Each table cell should contain single value.
·      Each record needs to be unique.
Consider we have the following table:
CREATE TABLE CUSTOMERS(
       ID   INT              NOT NULL,
       NAME VARCHAR (20)     NOT NULL,
       AGE  INT              NOT NULL,
       ADDRESS  CHAR (25),
       ORDERS   VARCHAR(155)
);
So if we populate this table for a single customer having multiple orders, then it would be something as follows:
ID
NAME
AGE
ADDRESS
ORDERS
100
Sachin
36
Bangalore
Cannon XL-200
100
Sachin
36
Bangalore
Battery XL-200
100
Sachin
36
Bangalore
Tripod Large

But as per 1NF, we need to ensure that there are no repeating groups of data. So let us break above table into two parts and join them using a key as follows:
CUSTOMERS table:
CREATE TABLE CUSTOMERS(
       ID   INT              NOT NULL,
       NAME VARCHAR (20)     NOT NULL,
       AGE  INT              NOT NULL,
       ADDRESS  CHAR (25),
       PRIMARY KEY (ID)
);
This table would have the following record:
ID
NAME
AGE
ADDRESS
100
Sachin
36
Bangalore

ORDERS table:
CREATE TABLE ORDERS(
       ID   INT              NOT NULL,
       CUSTOMER_ID INT       NOT NULL,
       ORDERS   VARCHAR(155),
       PRIMARY KEY (ID)
);
This table would have the following records:
ID
CUSTOMER_ID
ORDERS
10
100
Cannon XL-200
11
100
Battery XL-200
12
100
Tripod Large

2NF (Second Normal Form) Rules :
·      Rule 1- Be in 1NF
·      Rule 2- Single Column Primary Key
Consider the following example:

This table has a composite primary key [Customer ID, Store ID]. The non-key attribute is [Purchase Location]. In this case, [Purchase Location] only depends on [Store ID], which is only part of the primary key. Therefore, this table does not satisfy second normal form.
To bring this table to second normal form, we break the table into two tables, and now we have the following:

What we have done is to remove the partial functional dependency that we initially had. Now, in the table [TABLE_STORE], the column [Purchase Location] is fully dependent on the primary key of that table, which is [Store ID].
3NF (ThirdNormal Form) Rules:
·      Rule 1- Be in 2NF
·      Rule 2- Has no transitive functional dependencies 
  What is a transitive functional dependencies?
·         A transitive functional dependency is when changing a non-key column , might cause any of the other non-key columns to change.
·         Functional dependencies are a very important component of the normalize data process
To move our 2NF table into 3NF we again need to need divide our table.

In the table able, [Book ID] determines [Genre ID], and [Genre ID] determines [Genre Type]. Therefore, [Book ID] determines [Genre Type] via [Genre ID] and we have transitive functional dependency, and this structure does not satisfy third normal form.

Now all non-key attributes are fully functional dependent only on the primary key. In [TABLE_BOOK], both [Genre ID] and [Price] are only dependent on [Book ID]. In [TABLE_GENRE], [Genre Type] is only dependent on [Genre ID]. 
Boyce Codd Normal Form (BCNF):
Even when a database is in 3rd Normal Form, still there would be anomalies resulted if it has more than one Candidate Key.  
4NF (Fourth Normal Form) Rules
If no database table instance contains two or more, independent and multivalued data describing the relevant entity , then it is in 4th Normal Form.
5NF (Fifth Normal Form) Rules
A table is in 5th Normal Form only if it is in 4NF and it cannot be decomposed in to any number of smaller tables without loss of data.
6NF (Sixth Normal Form) Proposed

6th Normal Form is not standardized yet however it is being discussed by database experts for some time.

No comments:

Post a Comment