HARAMAYA UNIVERSITY
COLLEGE OF COMPUTING AND INFORMATICS
DEPARTMENT OF INFORMATION TECHNOLOGY
FUNDAMENTALS OF DATABASE SYSTEMS
CHAPTER FOUR
LECTURE 2
NORMALIZATION
2 CONTENTS
Introduction
Normalization
Normal Forms
4
INTRODUCTION … (1)
Data anomalies are inconsistencies in the data stored in a database as a result of an
operation such as update, insertion, and/or deletion.
Such inconsistencies may arise when a particular record stored in multiple locations
and not all of the copies are updated.
Redundant data wastes disk space and creates maintenance problems.
If data that exists in more than one place must be changed, the data must be changed
in exactly the same way in all locations.
A customer address change is much easier to implement if that data is stored only in
the Customers table and nowhere else in the database.
5
INTRODUCTION … (2)
6
INTRODUCTION … (3)
The table contains two entity’s information: Student and Department
Insertion If we add a department called “ISY”, we have to wait to have a
Anomaly student who registered in that department.
Updating For example, If we change the HoD of CS department, we have to
Anomaly change in 2 rows. This may lead for inconsistency as the number
of row is increased and unknown
Deletion All of the department detail will be lost if we delete a student
Anomaly information
7
NORMALIZATION … (1)
Normalization is a technique of organizing data in database in to multiple related tables to
minimize data redundancy.
Data redundancy is repetition of similar data in multiple places that increases the size of the
database (Memory Space).
Data redundancy leads to insertion, updating, and deletion anomalies.
We can prevent such anomalies by implementing level of normalization called Normal Forms
(NF).
Normalization rules divides larger tables into smaller tables and links them using relationships.
The purpose of Normalization in SQL is to eliminate redundant data and ensure data is stored
logically.
Normalization is a solution for the former problem.
8
NORMALIZATION … (2)
9
NORMALIZATION … (3)
10
NORMAL FORMS … (1)
The database community has developed a series of guidelines for ensuring that
databases are normalized.
These are referred to as normal forms and are numbered from one (the lowest form of
normalization, referred to as first normal form or 1NF) through five (fifth normal form or
5NF).
o First Normal Form
o Second Normal Form
o Third Normal Form
o BCNF (Boyce-Codd normal Form)
o Fourth Normal Form
11
NORMAL FORMS … (2)
First Normal Form (1NF)
For a table to be in the First Normal Form, it should follow the following 4 rules:
o Rule 1: It should only have single(atomic) valued attributes/columns.
o Rule 2: Values stored in a column should be of the same domain.
If the domain is Integer. All values of that column should be Integer
o Rule 3: All the columns in a table should have unique names.
o Rule 4: And the order in which data is stored, does not matter.
The 1st Normal form expects you to design your table in such a way that it can easily
be extended and it is easier for you to retrieve data from it whenever required.
12
NORMAL FORMS … (3)
NOT NORMALIZED TABLE
STUDENT_I
NAME COURSE CODE COURSE NAME INSTRUCTOR
D
Dignity Enatnesh,
101 ITec2021, ITec2022 DB, Networking
Matters Wogayehu
Respect
102 ITec2021 DB Enatnesh
Yourself
103 Stay Positive ITec2022 Networking Wogayehu
Table-1
Issue: Multiple values in a single column (Courses
and Instructors).
13
NORMAL FORMS … (3)
Solution: 1NF (First Normal Form)
• Remove repeating groups and create separate rows for each course.
STUDENT_I COURSE
NAME CODE COURSE NAME INSTRUCTOR
D
101 Dignity Matters ITec2021 DB Enatnesh
101 Dignity Matters ITec2022 Networking Wogayehu
102 Respect Yourself ITec2021 DB Enatnesh
103 Stay Positive ITec2022 Networking Wogayehu
Table-2
14
NORMAL FORMS … (4)
2NF (Second Normal Form) - Remove Partial Dependencies from Table 2
• Rule: A table must be in 1NF, and all non-key attributes should be fully dependent
on the primary key.
• Current Primary Key: (Student_ID, CourseCode) → Composite Key
• Problem: The Name column depends only on Student_ID, not on CourseCode.
• Solution: Split into Two Tables
Student_I Student_I Course Instruct
Name Course
D Code or
D
101 ITec2021 DB Enatnesh
Dignity
101 Networkin
101 ITec2022 Wogayehu
Matters g
Table-3 102 ITec2021 DB Enatnesh
Respect Table-4
102 Networkin
15
NORMAL FORMS … (4)
3NF (Third Normal Form) - Remove Transitive Student_I
Name
Dependencies from Table 4 D
• Rule: A table must be in 2NF, and there should 101 Dignity Matters
be no transitive dependencies (i.e., non-key 102 Respect Yourself
columns should not depend on other non-key 103 Stay Positive
Table-5
columns).
• Problem: The Instructor column depends on
Student_I
the Course, not directly on Student_ID. Course Code
D
• Solution: Create a Separate Course Table
101 ITec2021
Course Instruct
Course 101 ITec2022
Code or
102 ITec2021
ITec2021 DB Enatnesh
Table-6 Table-7
103 ITec2022
Networkin
ITec2022 Wogayehu
g
16
SUMMARY
Normalization Form Definition
1NF (First Normal Eliminates duplicate columns and ensures each
Form) cell has a single value (atomicity).
Remove partial dependencies (a column should
2NF (Second Normal
depend on the whole primary key, not just part
Form)
of it).
Remove transitive dependencies (A → B → C). A
3NF (Third Normal
non-key column should not depend on another
Table-8
Form)
non-key column.
TEACHING YOU IS GOOD LUCK