0% found this document useful (0 votes)

9 views20 pages

Normalization MOd 3 DBMS

The document discusses database normalization, a process aimed at reducing data redundancy and improving data integrity through various normal forms (1NF, 2NF, 3NF, BCNF, 4NF). It outlines the importance of normalization in preventing anomalies such as insertion, deletion, and updation, while also detailing the advantages and disadvantages of the process. Additionally, it explains functional dependencies and their role in achieving different normal forms, emphasizing the need for a structured and efficient database design.

Uploaded by

gagandevganpc01

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

9 views20 pages

Normalization MOd 3 DBMS

Uploaded by

gagandevganpc01

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

Unit – C Normalization: Normalization concept, First normal form(1NF), Second Normal

form(2NF),Third Normal form(3NF), BCNF, Fourth normal form(4NF), Normalization using Multi-
valued Dependencies and Normalization using Functional Dependencies.

Normalization - Database normalization is the process of structuring a relational database in

accordance with a series of so-called normal forms in order to reduce data redundancy and
improve data integrity.

Normalization is an important process in database design that helps improve the database's
e iciency, consistency, and accuracy. It makes it easier to manage and maintain the data and
ensures that the database is adaptable to changing business needs.

 Database normalization is the process of organizing the attributes of the database to reduce or
eliminate data redundancy (having the same data but at di erent places).

 Data redundancy unnecessarily increases the size of the database as the same data is repeated
in many places. Inconsistency problems also arise during insert, delete, and update
operations.

 In the relational model, there exist standard methods to quantify how e icient a databases is.
These methods are called normal forms and there are algorithms to covert a given database
into normal forms.

 Normalization generally involves splitting a table into multiple ones which must be linked each
time a query is made requiring data from the split tables.

Why do we need Normalization?

The primary objective for normalizing the relations is to eliminate the below anomalies. Failure to
reduce anomalies results in data redundancy, which may threaten data integrity and cause
additional issues as the database increases. Normalization consists of a set of procedures that
assist you in developing an e ective database structure.

 Insertion Anomalies: Insertion anomalies occur when it is not possible to insert data into a
database because the required ﬁelds are missing or because the data is incomplete. For
example, if a database requires that every record has a primary key, but no value is provided for
a particular record, it cannot be inserted into the database.

 Deletion anomalies: Deletion anomalies occur when deleting a record from a database and can
result in the unintentional loss of data. For example, if a database contains information about
customers and orders, deleting a customer record may also delete all the orders associated
with that customer.

 Updation anomalies: Updation anomalies occur when modifying data in a database and can
result in inconsistencies or errors. For example, if a database contains information about
employees and their salaries, updating an employee’s salary in one record but not in all related
records could lead to incorrect calculations and reporting.
Before Normalization: The table is prone to redundancy and anomalies (insertion, update, and
deletion).

After Normalization: The data is divided into logical tables to ensure consistency, avoid redundancy
and remove anomalies making the database e icient and reliable.

Features of Database Normalization

 Elimination of Data Redundancy: One of the main features of normalization is to eliminate the
data redundancy that can occur in a database. Data redundancy refers to the repetition of data
in di erent parts of the database. Normalization helps in reducing or eliminating this
redundancy, which can improve the e iciency and consistency of the database.

 Ensuring Data Consistency: Normalization helps in ensuring that the data in the database is
consistent and accurate. By eliminating redundancy, normalization helps in preventing
inconsistencies and contradictions that can arise due to di erent versions of the same data.

 Simpliﬁcation of Data Management: Normalization simpliﬁes the process of managing data in

a database. By breaking down a complex data structure into simpler tables, normalization
makes it easier to manage the data, update it, and retrieve it.

 Improved Database Design: Normalization helps in improving the overall design of the
database. By organizing the data in a structured and systematic way, normalization makes it
easier to design and maintain the database. It also makes the database more ﬂexible and
adaptable to changing business needs.

 Avoiding Update Anomalies: Normalization helps in avoiding update anomalies, which can
occur when updating a single record in a table a ects multiple records in other tables.
Normalization ensures that each table contains only one type of data and that the relationships
between the tables are clearly deﬁned, which helps in avoiding such anomalies.

 Standardization: Normalization helps in standardizing the data in the database. By organizing

the data into tables and deﬁning relationships between them, normalization helps in ensuring
that the data is stored in a consistent and uniform manner.

Advantages of Normalization

 Normalization eliminates data redundancy and ensures that each piece of data is stored in
only one place, reducing the risk of data inconsistency and making it easier to maintain data
accuracy.

 By breaking down data into smaller, more speciﬁc tables, normalization helps ensure that
each table stores only relevant data, which improves the overall data integrity of the
database.

 Normalization simpliﬁes the process of updating data, as it only needs to be changed in one
place rather than in multiple places throughout the database.
 Normalization enables users to query the database using a variety of di erent criteria, as
the data is organized into smaller, more speciﬁc tables that can be joined together as
needed.

 Normalization can help ensure that data is consistent across di erent applications that use
the same database, making it easier to integrate di erent applications and ensuring that all
users have access to accurate and consistent data.

Disadvantages of Normalization

 Normalization can result in increased performance overhead due to the need for
additional join operations and the potential for slower query execution times.

 Normalization can result in the loss of data context, as data may be split across multiple
tables and require additional joins to retrieve.

 Proper implementation of normalization requires expert knowledge of database design and

the normalization process.

 Normalization can increase the complexity of a database design, especially if the data
model is not well understood or if the normalization process is not carried out correctly.

A functional dependency (FD) in a relational database management system (DBMS) is a

constraint between two sets of attributes (columns) in a table. It states that the value of one set
of attributes, called the determinant (left-hand side), uniquely determines the value of another
set of attributes, called the dependent (right-hand side).

Functional dependencies are crucial for database normalization, a process that minimizes
data redundancy and improves data integrity and consistency.

Notation

An FD is typically denoted using an arrow:

X -> Y

 X is the Determinant (left-hand side).

 Y is the Dependent (right-hand side).

Meaning: If two rows in the table have the same values for all attributes in X, then they must
also have the same values for all attributes in Y

Example: In an Employee table with attributes {EmployeeID, Name, Department}, if EmployeeID

is unique, then:

EmployeeID -> Name, Department

This means knowing the EmployeeID uniquely determines both the Name and Department.
Types of Functional Dependencies

Functional dependencies are classiﬁed into several types based on the relationship between
the determinant and the dependent set of attributes, which are particularly important in the
context of normalization:

 Trivial Functional Dependency: An FD X -> Y is trivial if Y is a subset of X (Y \subseteq X).

This is always true and provides no new information about the data relationships.

o Example: {EmployeeID, Name\} -> EmployeeID

 Non-Trivial Functional Dependency: An FD X -> Y is non-trivial if Y is not a subset of X (Y

\not\subseteq X). This shows an important relationship between attributes.

o Example: $EmployeeID \rightarrow Name$

 Full Functional Dependency (used for 2NF): An FD $X \rightarrow Y$ holds if $Y$ is

functionally dependent on $X$ and is not functionally dependent on any proper subset of
$X$. This is only relevant when $X$ is a composite key.

o Example (Composite Key $\{OrderNo, ProductNo\}$): $\{OrderNo, ProductNo\}

\rightarrow Quantity$ (Quantity requires the full key).

 Partial Functional Dependency (violates 2NF): An FD $X \rightarrow Y$ holds if a non-key

attribute $Y$ is functionally dependent on only a part of a composite candidate key $X$.

o Example (Composite Key $\{OrderNo, ProductNo\}$): $OrderNo \rightarrow

OrderDate$ (OrderDate only depends on part of the key).

 Transitive Functional Dependency (violates 3NF): An FD $X \rightarrow Z$ holds when $X

\rightarrow Y$ and $Y \rightarrow Z$ and $Y$ is not a superkey. The dependency is indirect,
through the intermediate attribute set $Y$.

o Example: $EmployeeID \rightarrow DepartmentID$ and $DepartmentID \rightarrow

DepartmentName$. Therefore, $EmployeeID \rightarrow DepartmentName$ is a
transitive dependency.

Normal Forms in DBMS

First Normal Form (1NF)- A table is said to be in First Normal Form (1NF) if and only if:

All the attributes (columns) in the relation contain atomic (indivisible) values only.
There should be no repeating groups or arrays.

The First Normal Form (1NF) is the most basic rule for organizing data in a relational database. It
ensures that your data is structured correctly, making it easy to search, sort, and manage.

Core Principle: Atomic Values

The main rule of 1NF is that every column in every row of a table must contain atomic (indivisible)
values.

 Atomic means that a column cannot hold multiple values, and it cannot hold a structure or
group of values that can be broken down further.

 Think of it like a single cell in a spreadsheet: it should hold only one piece of information.

Violations of 1NF (What to Avoid)

A table violates 1NF in two main ways:

1. Repeating Groups

This is when a single row has multiple columns that are meant to hold the same kind of information.

StudentID StudentName Course1 Course2 Course3

101 Alice Math Physics History

102 Bob Chemistry Art

This table violates 1NF because Course1, Course2, and Course3 are essentially the same attribute
(Course) repeated across the columns. If Alice takes a fourth course, you'd have to add a new
column, which is bad database design.

2. Multi-Valued Attributes

This is when a single cell contains more than one value, often separated by commas or some other
delimiter.

StudentID StudentName Courses

101 Alice Math, Physics, History

102 Bob Chemistry, Art

This table violates 1NF because the Courses cell for Alice contains three separate courses. If you
wanted to ﬁnd all students taking only Physics, it would be di icult to search this column. The value
is not atomic.

Achieving 1NF

To bring a table into 1NF, you must eliminate repeating groups and multi-valued attributes by
creating separate rows for each distinct atomic value.

The 1NF Solution Table - Instead of one row with multiple courses, you create a new row for every
course a student takes, duplicating the student's information as needed.
StudentID StudentName Course

101 Alice Math

101 Alice Physics

101 Alice History

102 Bob Chemistry

102 Bob Art

Why this works:

1. Atomicity: Every cell contains a single, indivisible value (one StudentID, one StudentName,
one Course).

2. Scalability: If a student takes a new course, you simply add a new row. You don't have to
change the structure (columns) of the table.

3. Searchability: It's now very easy to query the table, for example, to ﬁnd all courses taken by
student '101' or all students taking 'Physics'.

In simple terms, 1NF is about ensuring that each piece of data gets its own dedicated, single slot in
the table.

Second Normal Form (2NF)

The Second Normal Form (2NF) is the next level of database normalization. It builds directly upon
1NF, adding a speciﬁc rule about how non-key columns relate to the primary key.

Prerequisite: You Must Already Be in 1NF

A table must satisfy 1NF before it can be evaluated for 2NF.

The Core Rule of 2NF

The rule for 2NF is:

Every non-key attribute must be fully functionally dependent on the entire Primary Key.

Let's break down this concept:

1. Primary Key vs. Composite Key

 Primary Key: A column (or a set of columns) that uniquely identiﬁes every row.

 Composite Key: A Primary Key composed of two or more columns. 2NF is only relevant
when you have a composite key.

2. Functional Dependency
 Deﬁnition: An attribute B is functionally dependent on an attribute A (written as $A
\rightarrow B$) if knowing the value of A uniquely determines the value of B.

o Example: $\text{StudentID} \rightarrow \text{StudentName}$ (If you know the ID, you
know the Name).

3. Partial Dependency (The 2NF Violation)

A partial dependency occurs when a non-key attribute is determined by only a part of the
composite Primary Key, not the entire key. This is the violation that 2NF eliminates.

Violation Example

Imagine a table used to track which instructors teach which courses, and what the tuition fee is.

CourseID InstructorID CourseName InstructorName TuitionFee

CSC101 I001 Database Dr. Smith $500

CSC101 I002 Database Ms. Jones $500

MTH202 I001 Calculus Dr. Smith $450

In this table:

 Composite Primary Key: $(\text{CourseID}, \text{InstructorID})$ - You need both to

uniquely identify a row (e.g., to distinguish between Dr. Smith and Ms. Jones teaching the
same course).

 Non-Key Attributes: $\text{CourseName}$, $\text{InstructorName}$, $\text{TuitionFee}$.

The Partial Dependencies:

1. $\text{CourseID} \rightarrow \text{CourseName}$ and $\text{CourseID} \rightarrow

\text{TuitionFee}$

o The $\text{CourseName}$ (Database) and $\text{TuitionFee}$ ($500) are determined

only by the $\text{CourseID}$ (CSC101), which is only part of the key.

o The $\text{TuitionFee}$ is repeated every time the course is listed with a di erent
instructor, which is redundancy.

2. $\text{InstructorID} \rightarrow \text{InstructorName}$

o The $\text{InstructorName}$ (Dr. Smith) is determined only by the

$\text{InstructorID}$ (I001), which is only part of the key.

o The $\text{InstructorName}$ is repeated for every course the instructor teaches.

Achieving 2NF (The Solution)

To achieve 2NF, you must remove the partial dependencies by splitting the original table into new
tables, where all non-key attributes are dependent on the entire Primary Key of their new table.

1. Course Table (Removing $\text{CourseID}$ Dependencies)

CourseID (PK) CourseName TuitionFee

CSC101 Database $500

MTH202 Calculus $450

 Constraint: All columns are fully dependent on the entire Primary Key ($\text{CourseID}$).

2. Instructor Table (Removing $\text{InstructorID}$ Dependencies)

InstructorID (PK) InstructorName

I001 Dr. Smith

I002 Ms. Jones

 Constraint: All columns are fully dependent on the entire Primary Key
($\text{InstructorID}$).

3. Enrollment/Teaching Table (The Relationship)

CourseID (Part of PK/FK) InstructorID (Part of PK/FK)

CSC101 I001

CSC101 I002

MTH202 I001

 Constraint: This table only holds the key attributes and forms the relationship. Any
information you add here must be dependent on both ($\text{CourseID},
\text{InstructorID}$).

Beneﬁts of 2NF:
 Less Redundancy: The $\text{TuitionFee}$ and $\text{InstructorName}$ are now stored
only once.

 Easier Updates: If the $\text{TuitionFee}$ for CSC101 changes, you only update one row in
the $\text{Course}$ table, not multiple rows in the original table.

 Reduced Anomalies: You can now add a new course without assigning an instructor, or
add a new instructor without having them teach a course yet.

Third Normal Form (3NF)

The Third Normal Form (3NF) is the most widely applied level of normalization in practical
database design. It ensures that all data in a table directly describes the entity identiﬁed by the
primary key and nothing else.

Prerequisite: You Must Already Be in 2NF

A table must satisfy 2NF before it can be evaluated for 3NF.

 1NF Rule: All cells must have atomic values.

 2NF Rule: All non-key attributes must be fully dependent on the entire Primary Key
(relevant only with a composite key).

The Core Rule of 3NF

The rule for 3NF eliminates a speciﬁc type of dependency that still exists after 2NF:

There must be no transitive dependencies among non-key attributes.

Let's clarify what this means:

Transitive Dependency (The 3NF Violation)

A transitive dependency occurs when a non-key attribute is dependent on another non-key

attribute, instead of being directly dependent on the Primary Key.

 In simpler terms: A non-key column shouldn't determine the value of another non-key
column.

The ﬂow of dependency looks like this:

$$\text{Primary Key} \rightarrow \text{Non-Key Attribute A} \rightarrow \text{Non-Key Attribute B}$$

Violation Example (The Employee Department Table)

Imagine a table used to track employee information, including which department they work in and
where that department is located.
EmployeeI EmployeeNa Department DepartmentNa DepartmentLocati
D (PK) me ID me on

101 Alice D01 Sales New York

102 Bob D02 IT Chicago

103 Charlie D01 Sales New York

This table is in 2NF because:

 The Primary Key is $\text{EmployeeID}$.

 All other columns ($\text{EmployeeName}$, $\text{DepartmentID}$,

$\text{DepartmentName}$, $\text{DepartmentLocation}$) are directly dependent on
$\text{EmployeeID}$.

The Transitive Dependency:

1. The $\text{EmployeeID}$ determines the $\text{DepartmentID}$. ($\text{EmployeeID}

\rightarrow \text{DepartmentID}$)

2. The $\text{DepartmentID}$ determines the $\text{DepartmentName}$ and

$\text{DepartmentLocation}$. ($\text{DepartmentID} \rightarrow \text{DepartmentName},
\text{DepartmentLocation}$)

The non-key attributes $\text{DepartmentName}$ and $\text{DepartmentLocation}$ are

determined by another non-key attribute ($\text{DepartmentID}$), not directly by the
$\text{EmployeeID}$.

The Redundancy Problem:

 The $\text{DepartmentName}$ and $\text{DepartmentLocation}$ (e.g., Sales, New York) are

repeated for every employee in that department.

 If the Sales department moves to London, you would have to update multiple rows, leading
to potential update anomalies (if you miss one employee's row, the data becomes
inconsistent).

Achieving 3NF (The Solution)

To achieve 3NF, you must remove the transitive dependencies by moving the transitively
dependent attributes into a new, separate table. The determinant non-key attribute
($\text{DepartmentID}$) becomes the Primary Key of the new table.

1. Employee Table (The Primary Entity)

EmployeeID (PK) EmployeeName DepartmentID (FK)

101 Alice D01

102 Bob D02

103 Charlie D01

 Constraint: All non-key attributes ($\text{EmployeeName}$ and $\text{DepartmentID}$) are

dependent only on the $\text{EmployeeID}$.

2. Department Table (The Secondary Entity)

DepartmentID (PK) DepartmentName DepartmentLocation

D01 Sales New York

D02 IT Chicago

 Constraint: All columns are fully dependent on the $\text{DepartmentID}$.

Beneﬁts of 3NF:

 Minimal Redundancy: Department details are stored only once.

 Easy Updates: If the Sales department moves, you only update one row in the
$\text{Department}$ table.

 Reduced Anomalies: You can add a new department to the $\text{Department}$ table even
if no employee has been assigned to it yet.

Boyce-Codd Normal Form (BCNF)

The Boyce-Codd Normal Form (BCNF) is a stricter version of 3NF. It addresses a speciﬁc,
uncommon, but critical anomaly that can still occur in tables that are already in 3NF, especially
when the table has multiple overlapping composite candidate keys.

Prerequisite: You Must Already Be in 3NF

A table must satisfy 3NF before it can be evaluated for BCNF.

 3NF Rule: Eliminate transitive dependencies (No non-key column determines another non-
key column).
The Core Rule of BCNF

The BCNF rule simpliﬁes the 3NF rule by applying the concept of functional dependency to all
determinants (attributes that determine others):

For every non-trivial functional dependency ($X \rightarrow Y$), $X$ must be a superkey.

Let's break down this concept:

1. Functional Dependency ($X \rightarrow Y$): Knowing the value of $X$ uniquely
determines the value of $Y$.

2. Trivial Dependency: When $Y$ is a subset of $X$ (e.g., $(\text{StudentID}, \text{Course})

\rightarrow \text{Course}$). BCNF only cares about non-trivial dependencies.

3. Superkey: Any attribute or set of attributes that uniquely identiﬁes a row (this includes the
Primary Key and any other Candidate Keys).

In simpler terms: The only columns that should be able to determine the value of other
columns in the table are the ones that uniquely identify the entire row (the superkeys).

Violation Example

Consider a table tracking students and their advisors for speciﬁc subjects.

StudentID Subject Advisor

S100 Physics Dr. Smith

S100 Math Dr. Jones

S200 Math Dr. Jones

S300 Physics Dr. Smith

Facts and Dependencies:

1. A student can have multiple subjects, but a student has only one advisor for a given
subject.

o Candidate Key 1 (Primary Key): $(\text{StudentID}, \text{Subject})$

2. An advisor is only assigned to one speciﬁc subject (i.e., a Physics advisor only advises
on Physics).

o Dependency: $\text{Advisor} \rightarrow \text{Subject}$ (Non-trivial functional

dependency)
The Violation:

 The table is in 3NF. There are no transitive dependencies ($\text{PK} \rightarrow \text{Non-
Key A} \rightarrow \text{Non-Key B}$).

 The table violates BCNF because of the dependency: $\text{Advisor} \rightarrow

\text{Subject}$

o The determinant is $\text{Advisor}$.

o Is $\text{Advisor}$ a Superkey? No, because $\text{Advisor}$ (Dr. Smith) does not

uniquely identify the row (S100, S300 both have Dr. Smith as an advisor).

o Because $\text{Advisor}$ is not a Superkey but is determining another attribute

($\text{Subject}$), the table is not in BCNF.

The Problem:

 Redundancy: The relationship between $\text{Advisor}$ and $\text{Subject}$ (e.g., Dr.

Smith, Physics) is repeated for every student advised by Dr. Smith.

 Update Anomaly: If Dr. Smith switches subjects from Physics to Chemistry, you must
update multiple rows, risking inconsistency.

Achieving BCNF

To achieve BCNF, you must separate the attribute that is acting as a non-superkey determinant
($\text{Advisor}$) and the attribute it determines ($\text{Subject}$) into a new table.

1. Advisor-Subject Table (The New Entity)

Advisor (PK) Subject

Dr. Smith Physics

Dr. Jones Math

 Constraint: Here, $\text{Advisor}$ is the Superkey (Primary Key), and it fully determines the
$\text{Subject}$. BCNF achieved for this table.

2. Student-Advisor Table (The Relationship)

StudentID (Part of PK/FK) Advisor (Part of PK/FK)

S100 Dr. Smith

StudentID (Part of PK/FK) Advisor (Part of PK/FK)

S100 Dr. Jones

S200 Dr. Jones

S300 Dr. Smith

 Constraint: The Primary Key is $(\text{StudentID}, \text{Advisor})$. BCNF achieved for this
table. (No other columns, so the rule is vacuously true).

Fourth Normal Form (4NF) Explained

The Fourth Normal Form (4NF) is an advanced level of normalization that goes beyond the
concepts of functional and transitive dependencies addressed by 3NF and BCNF. It speciﬁcally
addresses a type of redundancy called Multi-valued Dependency (MVD).

Prerequisite: You Must Already Be in BCNF

A table must satisfy BCNF before it can be evaluated for 4NF.

 BCNF Rule: Every determinant (an attribute that determines others) must be a Superkey.

The Core Rule of 4NF

The rule for 4NF is:

A table is in 4NF if and only if it is in BCNF and contains no non-trivial Multi-valued

Dependencies (MVDs) other than a candidate key.

What is a Multi-valued Dependency (MVD)?

An MVD exists when one attribute ($A$) determines multiple values for another attribute ($B$)
independently of other attributes ($C$) in the table. It is denoted as $A \twoheadrightarrow B$.

 Key Insight: MVDs typically occur in tables with at least three columns where two of the
non-key attributes are independent of each other, but both are dependent on the key.

 The problem is that this independence forces a Cartesian product of the independent
values to be listed for every row, leading to massive redundancy.

Violation Example (The Student Activity Table)

Imagine a table to track students, the courses they take, and their hobbies. Assume the following
real-world rules:

1. A student can take multiple courses.

2. A student can have multiple hobbies.

3. The courses a student takes are completely independent of their hobbies.

StudentID Course Hobby

S100 Math Cricket

S100 Math Chess

S100 Physics Cricket

S100 Physics Chess

S200 History Reading

The Dependencies:

 Primary Key (Candidate Key): $(\text{StudentID}, \text{Course}, \text{Hobby})$ — All three

attributes are needed to uniquely identify a row.

 Multi-valued Dependencies (MVDs):

o $\text{StudentID} \twoheadrightarrow \text{Course}$ (A student has multiple

courses).

o $\text{StudentID} \twoheadrightarrow \text{Hobby}$ (A student has multiple

hobbies).

The Redundancy Problem:

The table must list every combination of a student's courses with their hobbies to represent all the
facts.

 Student S100 takes two courses ($\text{Math}$, $\text{Physics}$) and has two hobbies
($\text{Cricket}$, $\text{Chess}$).

 The table must have $2 \times 2 = 4$ rows, even though $\text{Math}$ isn't related to
$\text{Cricket}$ any more than it is to $\text{Chess}$. The $\text{Course}$ and
$\text{Hobby}$ data are cross-multiplied unnecessarily.

 If S100 takes a 3rd course, say Chemistry, you must add two new rows (one for
Chemistry/Cricket and one for Chemistry/Chess), leading to an Insertion Anomaly.

The table is in BCNF (since the only determinant is the $\text{StudentID}$ which is part of the
composite key, and $\text{StudentID}$ alone is not a superkey, but the dependencies are MVDs, not
Functional Dependencies). However, the table is not in 4NF because it contains two non-trivial
MVDs ($\text{StudentID} \twoheadrightarrow \text{Course}$ and $\text{StudentID}
\twoheadrightarrow \text{Hobby}$) determined by the same attribute ($\text{StudentID}$).

Achieving 4NF (The Solution)

To achieve 4NF, you must separate the independent multi-valued attributes into their own
tables.

1. Student-Course Table

StudentID (PK/FK) Course (PK)

S100 Math

S100 Physics

S200 History

2. Student-Hobby Table

StudentID (PK/FK) Hobby (PK)

S100 Cricket

S100 Chess

S200 Reading

Beneﬁts of 4NF:

 Eliminates Redundancy: Course and Hobby information are now stored only once per
relationship, eliminating the cross-product explosion.

 No Anomalies: If Student S100 adds $\text{Chemistry}$ as a course, you add only one row
to the $\text{Student-Course}$ table, not multiple rows to the original table.

4NF is the goal for eliminating redundancy caused by multiple, independent one-to-many
relationships in a single table.

Fifth Normal Form (5NF) / Project-Join Normal Form (PJNF)

The Fifth Normal Form (5NF), also known as Project-Join Normal Form (PJNF), is the highest level
of normalization concerned with eliminating redundancy. It deals with a more complex dependency
called Join Dependency.

Prerequisite: You Must Already Be in 4NF

A table must satisfy 4NF before it can be evaluated for 5NF.

 4NF Rule: The table must be in BCNF and contain no non-trivial Multi-valued
Dependencies (MVDs).

The Core Rule of 5NF

The rule for 5NF is:

A table is in 5NF if and only if it is in 4NF, and every non-trivial Join Dependency in that table is
implied by the candidate keys.

In simpler terms, a table in 4NF is in 5NF if it cannot be losslessly decomposed into any number
of smaller tables without the decomposition being based on the table's candidate keys.

What is a Join Dependency (JD)?

A Join Dependency is a constraint on a table ($R$) stating that it can be losslessly reconstructed by
joining three or more smaller tables ($R_1, R_2, \dots, R_n$), where each smaller table is a
projection (subset of columns) of the original table.

$$R = R_1 \bowtie R_2 \bowtie \dots \bowtie R_n$$

 Trivial JD: A JD is trivial if one of the decomposed tables ($R_i$) contains all of the attributes
of the original table ($R$). This is not a concern for 5NF.

 Non-Trivial JD: A non-trivial JD exists if you need to join three or more projections to get
back the original table, and this decomposition is not a result of a superkey. MVD is a
special case of JD where the decomposition is into only two projections ($n=2$).

Violation Example

5NF violations are rare and typically occur when there is a compound constraint that restricts valid
data combinations, which cannot be expressed using functional or multi-valued dependencies.

Consider a table called Supplier-Part-Project with the schema (Supplier, Part, Project), where the
primary key is the combination of all three: $(\text{Supplier}, \text{Part}, \text{Project})$.

Supplier Part Project

S1 P1 J1

S1 P2 J2

S2 P1 J1

Assume a complex rule:

If supplier $S$ supplies part $P$, AND supplier $S$ supplies a part to project $J$, AND part $P$ is
used in project $J$, THEN supplier $S$ must supply part $P$ to project $J$.

This constraint implies a non-trivial Join Dependency $JD(\{\text{Supplier}, \text{Part}\},

\{\text{Supplier}, \text{Project}\}, \{\text{Part}, \text{Project}\})$.

 The table is in 4NF because the only key is the entire table, and there are no non-trivial
MVDs (you can't decompose it into just two tables without a lossy join or creating spurious
tuples).

 However, it violates 5NF because the table can be losslessly decomposed into three
relations, and this dependency is not implied by the candidate key.
Solution: Decompose to 5NF

To satisfy 5NF, we must decompose the original table into the three smaller tables:

1. $\text{R}_1(\text{Supplier}, \text{Part})$

2. $\text{R}_2(\text{Supplier}, \text{Project})$

3. $\text{R}_3(\text{Part}, \text{Project})$

This decomposition eliminates the redundancy and ensures that the complex constraint is
implicitly maintained, preventing the insertion of invalid combinations.

In Practice

5NF is highly theoretical and rarely applied in commercial database design, as tables that violate
4NF but not 5NF are infrequent. Normalization usually stops at BCNF or 4NF.

Normal
Forms Description of Normal Forms

First Normal A relation is in ﬁrst normal form if every attribute in that relation is single-valued
Form (1NF) attribute.

Second A relation that is in First Normal Form and every non-primary-key attribute is
Normal Form fully functionally dependent on the primary key, then the relation is in Second
(2NF) Normal Form (2NF).

A relation is in the third normal form, if there is no transitive dependency for

non-prime attributes as well as it is in the second normal form. A relation is in
3NF if at least one of the following conditions holds in every non-trivial function
dependency X –> Y.

 X is a super key.
Third Normal
Form (3NF)  Y is a prime attribute (each element of Y is part of some candidate key).
Normal
Forms Description of Normal Forms

For BCNF the relation should satisfy the below conditions

 The relation should be in the 3rd Normal Form.

Boyce-Codd
Normal Form  X should be a super-key for every functional dependency (FD) X−>Y in a
(BCNF) given relation.

A relation R is in 4NF if and only if the following conditions are satisﬁed:

Fourth
 It should be in the Boyce-Codd Normal Form (BCNF).
Normal Form
(4NF)  The table should not have any Multi-valued Dependency.

A relation R is in 5NF if and only if it satisﬁes the following conditions:

 R should be already in 4NF.

Fifth Normal
Form (5NF)  It cannot be further non loss decomposed (join dependency).

DBMS NF
No ratings yet
DBMS NF
11 pages
Database Normalization Explained
No ratings yet
Database Normalization Explained
78 pages
HSVG245 Normalization
No ratings yet
HSVG245 Normalization
13 pages
Database Normalization Explained
No ratings yet
Database Normalization Explained
12 pages
4th Unit Normalization and File Organization
No ratings yet
4th Unit Normalization and File Organization
52 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
4 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
8 pages
Unit 2 DBMS
No ratings yet
Unit 2 DBMS
13 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
48 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
8 pages
Functional Dependencies and Normalization Guide
No ratings yet
Functional Dependencies and Normalization Guide
36 pages
DBMS Unit 4
No ratings yet
DBMS Unit 4
8 pages
Normalization
No ratings yet
Normalization
4 pages
Module 4
No ratings yet
Module 4
17 pages
Database Design: Normalization Explained
No ratings yet
Database Design: Normalization Explained
18 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
159 pages
Understanding Database Normalization Techniques
No ratings yet
Understanding Database Normalization Techniques
3 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
11 pages
Comprehensive Guide to Database Normalization
No ratings yet
Comprehensive Guide to Database Normalization
10 pages
RDBMS Normalization Explained
No ratings yet
RDBMS Normalization Explained
8 pages
Understanding Update Anomalies in DBMS
No ratings yet
Understanding Update Anomalies in DBMS
15 pages
Schema Refinement and Normalization Guide
No ratings yet
Schema Refinement and Normalization Guide
21 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
15 pages
Notes Chapter 2.1 Lecture 2.1.2 (Introduction To Normalization and 1NF)
No ratings yet
Notes Chapter 2.1 Lecture 2.1.2 (Introduction To Normalization and 1NF)
4 pages
Database Normalization Explained
No ratings yet
Database Normalization Explained
23 pages
Database Normalization Techniques Explained
No ratings yet
Database Normalization Techniques Explained
73 pages
Functional Dependency
No ratings yet
Functional Dependency
20 pages
Database Normalization Techniques Explained
No ratings yet
Database Normalization Techniques Explained
36 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
4 pages
Database Normalization Explained
No ratings yet
Database Normalization Explained
19 pages
Identifying Business Data for Normalization
No ratings yet
Identifying Business Data for Normalization
11 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
4 pages
Database Normalization Explained
No ratings yet
Database Normalization Explained
5 pages
Understanding Normalization in DBMS
No ratings yet
Understanding Normalization in DBMS
16 pages
Functional Dependencies & Normal Forms
No ratings yet
Functional Dependencies & Normal Forms
11 pages
DBMS Schema Refinement & Normalization Guide
No ratings yet
DBMS Schema Refinement & Normalization Guide
39 pages
Functional Dependency and Normalization Guide
No ratings yet
Functional Dependency and Normalization Guide
14 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
16 pages
Understanding Database Design Anomalies
No ratings yet
Understanding Database Design Anomalies
22 pages
Database Normalization and Anomalies Guide
No ratings yet
Database Normalization and Anomalies Guide
22 pages
Understanding Functional Dependency & Normalization
No ratings yet
Understanding Functional Dependency & Normalization
12 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
9 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
1 page
Lect # 8 - DBMS Fundamentals - 085928
No ratings yet
Lect # 8 - DBMS Fundamentals - 085928
16 pages
Unit 1 Notes RDBMS
No ratings yet
Unit 1 Notes RDBMS
19 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
27 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
14 pages
Data Normalization Guidelines for Databases
No ratings yet
Data Normalization Guidelines for Databases
14 pages
Relational Database Design & Normalization
No ratings yet
Relational Database Design & Normalization
5 pages
Understanding Second Normal Form
No ratings yet
Understanding Second Normal Form
21 pages
Normalization: Reducing Data Redundancy
No ratings yet
Normalization: Reducing Data Redundancy
26 pages
Database Normalization Explained
No ratings yet
Database Normalization Explained
22 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
33 pages
Dimensional Analysis in Engineering
No ratings yet
Dimensional Analysis in Engineering
11 pages
Grade 3 Music Lesson Plan: Sounds & Silence
No ratings yet
Grade 3 Music Lesson Plan: Sounds & Silence
4 pages
Theatre Types and Stage Areas
No ratings yet
Theatre Types and Stage Areas
13 pages
General Remedial Works Specification
No ratings yet
General Remedial Works Specification
3 pages
Kafka's Metamorphosis: Alienation Explored
No ratings yet
Kafka's Metamorphosis: Alienation Explored
6 pages
Modern 2-Bed Flat in Springfield Gardens
No ratings yet
Modern 2-Bed Flat in Springfield Gardens
2 pages
EdTech Market Insights and Trends 2021
No ratings yet
EdTech Market Insights and Trends 2021
11 pages
Wastewater Treatment Quiz Questions
No ratings yet
Wastewater Treatment Quiz Questions
4 pages
Surat to Vashi Pickup Request Details
No ratings yet
Surat to Vashi Pickup Request Details
2 pages
Cloud Computing Concepts Overview
No ratings yet
Cloud Computing Concepts Overview
15 pages
Congdon B
No ratings yet
Congdon B
74 pages
Suremag Notes
No ratings yet
Suremag Notes
3 pages
Quality Control in Road Construction
No ratings yet
Quality Control in Road Construction
9 pages
Mustang Sally Lyrics and Chords
No ratings yet
Mustang Sally Lyrics and Chords
14 pages
Understanding Brahma-samhita's Teachings
No ratings yet
Understanding Brahma-samhita's Teachings
56 pages
HSBC Job Opportunities in Vizag
No ratings yet
HSBC Job Opportunities in Vizag
2 pages
ECO120 Final Exam July 2024
No ratings yet
ECO120 Final Exam July 2024
6 pages
Geosynthetic Encased Columns For Soft Soil Improvement Marcio Almeida Ebook Complete Online Chapters
100% (6)
Geosynthetic Encased Columns For Soft Soil Improvement Marcio Almeida Ebook Complete Online Chapters
63 pages
100 Word Search Puzzle Printable
No ratings yet
100 Word Search Puzzle Printable
1 page
Inductance and Magnetic Circuits Exercises
No ratings yet
Inductance and Magnetic Circuits Exercises
9 pages
Raspberry Pi PoE+ HAT Overview
No ratings yet
Raspberry Pi PoE+ HAT Overview
6 pages
The Victims of Slavery Colonization and The Holocaust A Comparative History of Persecution 9781472508263 9781472509970 9781474219105 9781472508690 - Compress
No ratings yet
The Victims of Slavery Colonization and The Holocaust A Comparative History of Persecution 9781472508263 9781472509970 9781474219105 9781472508690 - Compress
275 pages
Essntial C2000 PDF
No ratings yet
Essntial C2000 PDF
70 pages
Easy Apron Pattern Instructions
No ratings yet
Easy Apron Pattern Instructions
10 pages
Understanding the Research Process
No ratings yet
Understanding the Research Process
9 pages
Key Project Management Milestones Guide
No ratings yet
Key Project Management Milestones Guide
53 pages
Pros and Cons of GM Foods Explained
No ratings yet
Pros and Cons of GM Foods Explained
7 pages
Writing Effective CDRs for Engineers
No ratings yet
Writing Effective CDRs for Engineers
3 pages
M.Sc. Chemistry Entrance Exam 2014 Guide
No ratings yet
M.Sc. Chemistry Entrance Exam 2014 Guide
19 pages
Culligan CTM Water Softener Overview
No ratings yet
Culligan CTM Water Softener Overview
2 pages