CSE2004
Database Management System
Normalization
Normalization
It is a process for assigning attributes to entities.
It reduces data redundancies and helps eliminate the data
anomalies.
Anomalies are problems that can occur in poorly
planned, un-normalised databases where all the data is
stored in one table
Why Normalization?
Sample Database Table R
E
D
Row 1 X U
Row 2 X N
D
Row 3 X A
N
Row 4 X C
Row 5 X I
E
S
Redundancies lead to
Increases the size of Database
Insertion Problem
Deletion Problem
Updation Problem
Example (Students Table)
Registration Name of the School Name of the Contact
Number Student Faculty Number
1 AAAAA SCOPE Dr. ZZZZZ 12345678910
2 BBBBB SCOPE Dr. ZZZZZ 12345678910
3 CCCCC SCOPE Dr. ZZZZZ 12345678910
4 DDDDD SCOPE Dr. ZZZZZ 12345678910
5 EEEEE SCOPE Dr. ZZZZZ 12345678910
Normalization will solve?
Normalization will solve?
Registration Name of School School Name of the Contact
Number the Faculty Number
Student SCOPE Dr. ZZZZZ 12345678910
1 AAAAA SCOPE
It is not about eliminating redundancy
2 BBBBB SCOPE
3 CCCCC SCOPE It is all about minimizing data
4 DDDDD SCOPE redundancy
5 EEEEE SCOPE
Normalization
Follows Divide and Rule
Logical, Independent but related data
Normalization Stages
Normalization works through a series of stages called
“NORMAL FORMS”
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form
First Normal Form
First Stage of Normalization Process
It gives us with a Scalable Table Design which can be
extended (Easily)
If the table is not even in First Normal Form, it is considered
as “POOR DATABASE DESIGN”
First Normal Form (Rule 1)
Each Column should contain atomic values
Entries like X, Y and W, Z will be treated as violation
Column 1 Column 2 Column 3
A X, Y
B W, X
C Y
D Z
First Normal Form (Rule 2)
A column should contain values that are of same type
Do not inter-mix different types of values in any column
D-O-B NAME Column 3
01-04-2021 X
14-04-2021 Y
07-03-2020 Z
A 01-01-2019
First Normal Form (Rule 3)
Each column should have unique name
Same names may lead to confusion
D-O-B NAME NAME
01-04-2021 X P
14-04-2021 Y Q
07-03-2020 Z R
09-09-2017 A S
First Normal Form (Rule 4)
Order in which the data stored is does not matter
SQL Query will help us to fetch the data
REG. NO NAME NAME
4 X P
1 Y Q
3 Z R
2 A S
Example (1 NF)
Student Table
Registration Name of the Name of the
Number Student Subject
1 AAAAA DBMS, OS
2 BBBBB JAVA
3 CCCCC C, DBMS
4 DDDDD C++
Example (1 NF) – To ensure atomicity
Student Table
Registration Name of the Name of the
Number Student Subject
1 AAAAA DBMS
1 AAAAA OS
2 BBBBB JAVA
3 CCCCC C
3 CCCCC DBMS
4 DDDDD C++
Second Normal Form (2NF)
It should be in the First Normal Form
It should not have Partial-Dependencies
Second Normal Form (2NF)
Partial-Dependency Example
It should be in the First Normal Form
It should not have Partial-Dependencies
Know about Dependency (Primary Key)
Second Normal Form (2NF)
Dependency for understanding
Dependency Example (Students Table)
Student Student Registration School Address
_ID Name Number
1 AAAAA 20BAI SCOPE Chennai
2 AAAAA 20BCE SCOPE Mumbai
3 BBBBB 20CPS SCOPE Kolkata
4 CCCCC 20BCE SCOPE Andhra
Second Normal Form (2NF)
Partial Dependency
If two columns act as primary key
Student Table Subject Table (Subject ID, Name)
Score Table
Second Normal Form (2NF)
Partial Dependency
Score Table
Score_ID Student_ID Subject_ID Marks Faculty
1 1 1 99 Dr. XXXX
2 1 2 97 Dr. YYYY
3 2 1 100 Dr. XXXX
4 2 2 96 Dr. YYYY
5 2 4 84 Dr. ZZZZ
Second Normal Form (2NF)
Partial Dependency
Score Table
Score_ID Student_ID Subject_ID Marks Faculty
1 1 1 99 Dr. XXXX
2 1 2 97 Dr. YYYY
3 2 1 100 Dr. XXXX
4 2 2 96 Dr. YYYY
5 2 4 84 Dr. ZZZZ
Second Normal Form (2NF)
Partial Dependency
Score Table
Score_ID Student_ID Subject_ID Marks Faculty
1 1 1 99 Dr. XXXX
2 1 2 97 Dr. YYYY
3 2 1 100 Dr. XXXX
4 2 2 96 Dr. YYYY
5 2 4 84 Dr. ZZZZ
Second Normal Form (2NF)
Partial Dependency
Score Table
Score_ID Student_ID Subject_ID Marks Faculty
1 1 1 99 Dr. XXXX
2 1 2 97 Dr. YYYY
3 2 1 100 Dr. XXXX
4 2 2 96 Dr. YYYY
5 2 4 84 Dr. ZZZZ
Second Normal Form (2NF)
Partial Dependency
Removing Partial Dependency (Faculty)
Subject_ID Subject Name Faculty
1 Python Dr. XXXX
2 C Dr. YYYY
3 C++ Dr. AAAA
4 Java Dr. ZZZZ
Third Normal Form (3NF)
Table must be in 2NF
It should not have Transitive Dependency
Three Tables were created in 2NF
[Link] Table
[Link] Table
[Link] Table / Faculty Table
Third Normal Form (3NF)
Transitive Dependency not allowed
Table must be in 2NF
It should not have Transitive Dependency
Score Table
Score_ID Student_ID Subject_ID Marks Faculty
1 1 1 99 Dr. XXXX
2 1 2 97 Dr. YYYY
3 2 1 100 Dr. XXXX
4 2 2 96 Dr. YYYY
5 2 4 84 Dr. ZZZZ
Third Normal Form (3NF)
Transitive Dependency not allowed
Once, Exam name is added, then total marks will have some
differences based on theory and lab
So, total marks may change
Therefore, total marks is dependent on exam name
Third Normal Form (3NF)
Transitive Dependency not allowed
Now, the score table looks like
Score_ID Student_ID Subject_ID Marks Exam Total
Name Marks
1 1 1 99
2 1 2 97
3 2 1 100
4 2 2 96
5 2 4 84
Third Normal Form (3NF)
Transitive Dependency is Removed
Fourth Table is created
Exam Table
Exam Name Total Marks
BCNF (Boyce-Codd)
Table must be in 3NF
Check Prime Attribute and Non-prime Attribute
A non prime attribute should not derive prime attribute
BCNF (Boyce-Codd)
Enrollment
Student_ID Subject Faculty
101 Java Dr. XXXXX
101 C++ Dr. BBBBB
102 Java Dr. CCCCC
103 C Dr. DDDDD
104 Java Dr. CCCCC
Satisfies 1 NF (Atomic, Unique Column names)
Satisfies 2 NF (No Partial Dependencies)
Satisfied 3 NF (No Transitive Dependencies)
BCNF (Boyce-Codd)
Enrollment
Student_ID Subject Faculty
101 Java Dr. XXXXX
101 C++ Dr. BBBBB
102 Java Dr. CCCCC
103 C Dr. DDDDD
104 Java Dr. CCCCC
The problem is
StudentID, Subject (prime, because of candidate key) fetches
professor (Acceptable)
Professor (non prime / or not a primary key) fetching subject
and student (not acceptable)
BCNF (Boyce-Codd)
Solution
Create two tables
[Link] Table Student ID Faculty ID
Faculty ID Name Subject
2. Faculty Table
[Link]
nTr1ory9o2MgsOmmx2w8FB3&index=4