0% found this document useful (0 votes)
12 views81 pages

Logical Database Design Techniques

Chapter 4 discusses logical database design, focusing on structuring data for relational models. It outlines steps for building and validating a logical data model, including deriving relations, normalizing data, and ensuring integrity constraints. The chapter also covers various relationship types and their representation in database tables, emphasizing the importance of non-redundant data storage.

Uploaded by

nobody5111227
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
12 views81 pages

Logical Database Design Techniques

Chapter 4 discusses logical database design, focusing on structuring data for relational models. It outlines steps for building and validating a logical data model, including deriving relations, normalizing data, and ensuring integrity constraints. The chapter also covers various relationship types and their representation in database tables, emphasizing the importance of non-redundant data storage.

Uploaded by

nobody5111227
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Chapter 4

Logical Database Design

1
Topics
Topics Subtopics
4. 4.1. Logical Database Design for Relational Model
Logical
Database 4.2. Steps to Build Logical Data Model
Design [Link] Relations for Logical Data Model
[Link] Relations Using Normalization
[Link] Relations Against User Transactions
[Link] key and Integrity Constraints
[Link] Logical Data Model with User
[Link] Logical Data Models into Global Model
4.3. Redundancy and Data Anomaly
[Link] Dependency (FD)
[Link]
[Link] of normalization

2
Topics (Continued)
Topics Subtopics
4. 4.4. Process of Normalization (1NF, 2NF, 3NF, BCNF)
Logical
Database [Link] Form
Design [Link]. NF- First Normal Form
[Link]. NF- Second Normal Form (2NF)
[Link]. 3NF- Third Normal Form
[Link]. Boyce–Codd Normal Form (BCNF)
[Link]. Multivalued Dependency and Fourth Normal Form
[Link]. Join Dependencies and Fifth Normal Form
4.5 Pitfalls of Normalization
4.6. Denormalization

3
Logical Database Design for Relational Model
➢ Logical database design is the process of deciding how
to arrange the attributes of the entities in a given
business environment into database structures, such
as the tables of a relational database.
➢ Logical design is the process of constructing a model
of the data used in an enterprise based on a specific
data model (in our case relational), but independent of
a particular DBMS and other physical considerations.

4
Logical Database Design for Relational Model
➢ The goal of logical database design is to create well
structured tables that properly reflect the company's
business environment.
➢ The tables will be able to store data about the
company's entities in a non-redundant manner and
foreign keys will be placed in the tables so that all the
relationships among the entities will be supported.
➢ The purpose of logical design is to translate the
conceptual representation to the logical structure of
the database, which includes designing the relations.
5
Logical Database Design for Relational Model

 Logical design is a semiphysical realisation of the

concepts.

 We say semiphysical because we are really not

concerned with the actual physical file that is stored

in memory; rather, we are concerned with placing data

into relational tables that we will visualise as a

physical organisation of data.


6
Logical Database Design for Relational Model

 Recall, a relational database is a database of

twodimensional tables called relations. The tables are

composed of rows (tuples) and columns (attributes).

 In a relational database, all attributes must be atomic

(simple), and keys must be not null.

 The process of converting an ER diagram into a

database is called mapping.


7
Logical Database Design for Relational Model
Steps to Build and Validate Logical Data Model
 Translate the conceptual data model into a logical data
model and then validate this model to check that it is
structurally correct using normalisation and supports
the required transactions.
 These are the steps involved in building and validating
a Logical Data Model.
 Step 1. Derive relations for logical data model.
 Step 2. Validate relations using normalisation.
 Step 3. Validate relations against user transactions.
 Step 4. Define integrity constraints.
 Step 5. Review logical data model with user.
 Step 6. Merge logical data models into global model
(optional step).
8
Logical Database Design for Relational Model
Steps to Build and Validate Logical Data Model

 In this step, we create relations for the logical data


model that will represent the entities, relationships,
and attributes that have been identified.
 We will look at:
 How to represent entity types (and attributes)
 How to represent relationship types (and attributes
(if any))

9
Derive relations for logical data model
How to represent entity types
1) Strong entity types
➢ For each strong entity in the data model, create a relation
that includes all the simple attributes of that entity.
➢ For composite attributes, include only the constituent
simple attributes(its components only).
2) Weak entity types
➢ For each weak entity in the data model, create a relation that
includes all the simple attributes of that entity.
➢ A weak entity must include its Partial key and its owner entity
type PK as a FK. The combination of the two keys form the
PK of the weak entity.

10
Example
Employee
Emp_NO{PK}
Name
Fname
Mname
Lname
Gender
Partial Key of
Salary
Week entity
Dep NO
Dependent
1..1
Name
Claims Gender
0..* Relationship

Employee (Emp_NO, Fname,Mname ,Lname,Gender, Salary)


Primary Key Emp_NO
DEPENDENT (DepNo, EmpNo, Name, Gender, Relationship )
Primary Key DepNo, EmpNo

11
Derive relations for logical data model
How to represent relationship types
3) One-to-many (1:*) binary relationship types
➢ For each 1:* binary relationship, the entity on the ‘one side’ of
the relationship is designated as the parent entity and the
entity on the ‘many side’ is designated as the child entity.
➢ To represent this relationship, post a copy of the Primary Key
attribute(s) of parent entity into the relation representing the
child entity, to act as a foreign key.
➢ Add attributes that describes the relationship to the child
entity(if exist).

12
Example

Child entity Employee


Emp_NO{PK}
Name Parent entity
Sex
Salary

1..*
Branch
Branch_No {PK}
Allocates
Branch Name
0..1
Employee (Emp_NO, Name, Sex, Salary,Branch No)
Primary Key Emp_NO
Branch (Branch_No, Branch Name)
Primary Key Branch_No

13
Derive relations for logical data model
How to represent relationship types
4) One-to-one (1:1) binary relationship types
➢ Creating relations to represent a 1:1 relationship is more complex as the
cardinality cannot be used to identify the parent and child entities in a
relationship.
➢ Instead, the participation constraints are used to decide whether it is
best to represent the relationship by:
1. Combining the entities involved into one relation
2. OR by creating two relations and posting a copy of the Primary
Key from one relation to the other.
➢ Consider the following :
(a) mandatory participation on both sides of 1:1 relationship
(b) mandatory participation on one side of 1:1 relationship
(c) optional participation on both sides of 1:1 relationship

14
Derive relations for logical data model
How to represent relationship types
(a) one-to-one (1:1) - Mandatory participation on both sides

➢ One alternative is to post a copy of the primary key


attribute(s) from one side of the relationship to the other
side as a foreign key.

15
Derive relations for logical data model
How to represent relationship types
➢ Option One:

➢ Staff(staffNo, name, position, salary)

Primary Key staffNo

➢ Car(regNo, make, model, staffNo)

Primary Key regNo

Foreign Key staffNo references Staff(staffNo)

➢ In this example, we post a copy of staffNo into the Car


relation as a foreign key field.

16
Derive relations for logical data model
How to represent relationship types
➢ Option Two:

➢ Car(regNo, make, model)

Primary Key regNo

➢ Staff(staffNo, name, position, salary, regNo)

Primary Key staffNo

Foreign Key regNo references Car (regNo)

➢ In this example, we post a copy of regNo into the Staff


relation as a foreign key field.

17
Derive relations for logical data model
How to represent relationship types

➢ Another alternative is to combine the entities involved

into one relation and choose one of the primary keys of

original entities to be primary key of the new relation,

while the other is used as an alternate key.

➢ Staff(staffNo, name, position, salary, regNo, make, model)

Primary Key staffNo

18
Derive relations for logical data model
How to represent relationship types
(b) one-to-one (1:1) Mandatory participation on one side
➢ Identify parent and child entities using participation constraints.
Entity with optional participation in relationship is designated as
the parent entity, and entity with mandatory participation is
designated as the child entity.

➢ A copy of the primary key of the parent entity (entity with


optional participation) is placed in the relation representing the
child entity (entity with mandatory participation).

➢ If the relationship has one or more attributes, these attributes


should follow the posting of the primary key to the child relation.
19
Derive relations for logical data model
How to represent relationship types

➢Staff(staffNo, name, position, salary)


Primary Key staffNo
➢Car(regNo, make, model, staffNo)
Primary Key regNo
Foreign Key staffNo references
Staff(staffNo)

➢ In this example, we post a copy of staffNo into the Car


relation as a foreign key field because the participation of
Staff in the relationship is optional
20
Derive relations for logical data model
How to represent relationship types
( c) one-to-one (1:1) Optional participation on both sides
➢ One solution is to post from one side of the relationship to
the other side of the relationship. Since both sides are
‘optional’ you look at the data and post from the side with the
most ‘optional’ participation to the other side.
Staff(staffNo, name, position, salary)
Primary Key staffNo
Car(regNo, make, model, staffNo)
Primary Key regNo
Foreign Key staffNo references
Staff(staffNo)

➢Assuming most of the Cars are allocated and only a minority of Staff uses a

Car, we will post staffNo into Car as a foreign key.


21
Derive relations for logical data model
How to represent relationship types
➢ Another alternative is to create a new relation. We post a
copy of the primary key attribute(s) of the entities that
participate in the relationship into the new relation, to
act as foreign keys.
➢ Staff(staffNo, name, position, salary)
Primary Key staffNo
➢ Car(regNo, make, model)
Primary Key regNo
➢ Drives(staffNo, regNo)
Primary Key staffNo
Foreign Key staffNo references Staff(staffNo)
Foreign Key regNo references Car(regNo)
➢ staffNo or regNo is sufficient as the Primary Key field as each entity only
participates once in the relationship.
22
Derive relations for logical data model
How to represent relationship types
5) Recursive relationships Mapping conclusion
a) One-to-many (1:*) recursive relationships
Single relation with two copies for the primary key with different names.

b) Many-to-many (*:*) recursive relationships


Two relations

 One relation for the entity type.

 And create a new relation to represent the relationship. The new relation
would only have two attributes, both copies of the primary key(i.e. the
primary key has two attributes, both taken from the primary key of the
entity).

23
Example a)
One-to-many (1:*) recursive relationships
The representation of
a 1:* recursive
relationship is
similar to 1:* binary Is_managed_by
relationship. Staff
However, in this
0..*
case, both the parent
and child entity is Employee
the same entity.
Employee_I{PK}
Manager 0..1 Employee_Name
Employee_DOB

EMPLOYEE (Employee_ID, Employee_Name, Employee_DOB, Manager_ID)


Primary Key Employee ID
Foreign Key Manager_ID

24
24
Example b)
Many-to-many (*:*) recursive relationships

Contains 
Component
0..*
Quantity
Item
Item No{PK}
Product 0..* Description
Unit_Cost

ITEM (Item_No, Description, Unit_Cost)


Primary Key: Item_No
COMPONENT (Item_No, Componenet_No, Quantity)
Primary Key: Item_No, Componenet_No
Foreign Key: Item_No references ITEM(Item_No)
Foreign Key: Componenet_No references ITEM(Item_No)

25
25
Derive relations for logical data model
How to represent relationship types
6) Many-to-many (*:*) binary relationship types

➢ Create a relation to represent the relationship and include any


attributes that are part of the relationship.

➢ We post a copy of the Primary Key attribute(s) of the entities that


participate in the relationship into the new relation, to act as
foreign keys.

➢ These foreign keys will also form the Primary Key of the new
relation, possibly in combination with some of the attributes of the
relationship.
26
Example
Employee
Emp_NO{PK}
Name
Sex
Salary

1..*
Project

Hours Proj_No {PK}


Works on
Project Name
0..*

Employee (Emp_NO, Name, Sex, Salary)


Primary Key Emp_NO
Project (Proj_No, ProjectName)
Primary Key Proj_No
Work-on(EmpNo,ProjNo, hours)
Primary Key Emp_NO Proj_No,
27
Derive relations for logical data model
How to represent relationship types
7) Complex relationship types
➢ Create a relation to represent the relationship and include any
attributes that are part of the relationship.
➢ Post a copy of the Primary Key attribute(s) of the entities that
participate in the complex relationship into the new relation, to act
as foreign keys.
➢ Any foreign keys that represent a ‘many’ relationship (for example,
1..*, 0..*) generally will also form the Primary Key of this new
relation, possibly in combination with some of the attributes of the
relationship.

28
Example
Sdate
Edate

Business Supplier
Contracts
BizNo {PK} SupNo {PK}

Lawyer
LawNo, {PK}

Business (BizNo., ……)


Supplier(SupNo, ……..)
Lawyer(LawNo,…….…)
Contract(BizNo, SupNo, LawNo, StDate, EDate)

29
Derive relations for logical data model
How to represent relationship types

8) Multi-valued attributes
➢ Create a new relation to represent multi-valued attribute
and include Primary Key of entity in new relation, to act as
a foreign key.
➢ Unless the multi-valued attribute is itself an alternate key of
the entity, the Primary Key of the new relation is the
combination of the multi-valued attribute and the Primary
Key of the entity.

30
Example

The Primary Key of new relation:


- the multi-valued attribute if itself an
alternate key. ( e.g. we sure that there
are no duplicated tel_nos for Employee
employees ). Emp_NO{PK}
-OR the combination of the multi- Name
valued attribute and the Primary Key Sex
of the entity. ( e.g. 2 brothers work in Salary
same company). Tel_no [1..*]

Employee (Emp_NO, Name, Sex, Salary)


Primary Key Emp_NO

Telephone(Tel_no, EmpNo)
Primary Key Tel_no
OR
Telephone(Tel_no, EmpNo)
Primary Key Tel_no, EmpNo

31
Summary of how to map entities and
relationships to relations

32
Validate relations using normalisation

 The logical design should contain only properly normalized


tables.

 Check composition of each table using the rules of


normalisation, to avoid unnecessary duplication of data.

 For each identified table (old and new), you must ensure that
all attributes are fully dependent on the identified primary key.

 Ensure each table is in at least 3NF.

33
Validate relations against user transactions
 Ensure that the relations in the logical data model support the
required transactions.

 ensure that the relations created in the previous step also support
these transactions, and thereby ensure that no error has been
introduced while creating relations.

 One approach is to examine transaction’s data requirements to


ensure that the data is present in one or more tables.

 If a transaction requires data in more than one table, check these


tables are linked through the primary key/foreign key mechanism.

34
Check key and Integrity Constraints

 Check integrity constraints are represented in the logical data


model. This includes identifying:
 Required data (e.g. Not NULL): Some attributes must always
contain a value.
 Attribute domain constraints: Every attribute has a set of values
that are legal (domain).
 Structural Constraints: Handled by relationship cardinality and
participation.
 Entity integrity: Primary key must be unique and not null.
 Referential integrity (see Referential Integrity constraints).

35
Check key and Integrity Constraints
 General constraints (Triggers): Updates to entities may be
controlled by constraints governing transactions that are
represented by the updates.

 For example, a property management system may have a rule that


prevents a member of staff from managing more than 100 properties
at the same time.

36
Check key and Integrity Constraints
(Referential Integrity constraints)
 As stated previously a foreign key value is an attribute that is copied to
one table and is a primary key value in another table. Foreign key values
are used to link tables together.

 The attribute staffNo in the Client table is a foreign key value as it is the
primary key value in the Staff table.

37
Cascading Referential Integrity constraints

 We must ensure that cascading referential integrity constraints


are specified that define conditions under which a candidate
key or foreign key may be inserted, deleted, or updated:

 Insert tuple into child relation: Value of foreign key must match
primary key value of parent table or else be null. Inserting a Client
record: staffNo must match an existing staffNo from the Staff table
or else be NULL.

 Delete tuple from child relation: No difference or effect on


parent table. Deleting a Client record: No effect on Staff table.
38
Cascading Referential Integrity constraints

 Update foreign key of child tuple: Value of foreign key must


match primary key value of parent table or else be null.
Updating a Client record: staffNo must match an existing
staffNo from the Staff table or else be NULL.

 Insert tuple into parent relation: No difference or effect


on child table. Inserting a Staff record: No effect on Client table.

39
Cascading Referential Integrity constraints
 Delete tuple from parent relation: This has repercussions on the
child table. What if we deleted a Staff member that is associated with
Clients? The corresponding Client records would have a staff member
that no longer exists in the Staff table, therefore violates referential
integrity.
 There are a few options:
 No Action: Prevent a deletion from parent relation if there are any
referenced child tuples.
 Cascade: When a parent tuple is deleted automatically delete any referenced
child tuples.
 Set Null: When a parent is deleted, the foreign key values in all
corresponding child tuples are automatically set to null.
 Set default: When a parent is deleted, the foreign key values in all
corresponding child tuples are automatically set to their default values.
 No Check: When a parent tuple is deleted, do nothing to ensure that
referential integrity is maintained.
40
Cascading Referential Integrity constraints
 Update primary key of parent tuple: Again, this has
repercussions on the child table. What would happen if all staff
numbers in the Staff table were updated from 5 to 6 characters?
For example, s1234 became s01234? Staff numbers in the Client
table would no longer be valid.
 If the primary key value of a parent relation tuple is updated,
referential integrity is lost if there exists a child tuple
referencing the old primary key value. To ensure referential
integrity, the strategies as for (Delete tuple from parent
relation) will suffice.
 Note that the options available in MySQL are: cascade, set null,
no action, and restrict. No Action is a keyword from standard
SQL. In MySQL, it is equivalent to RESTRICT.

41
Review logical data model with user

 Review the logical data model with the users to ensure that
they consider the model to be a true representation of the data
requirements of the enterprise.

42
Merge logical data models into global model

 Merge logical data models into a single global logical data model
that represents all user views of a database.

 The activities in this step include:

 Step 1. Merge local logical data models into global model.

 Step [Link] global logical data model.

 Step 3. Review global logical data model with users.

43
Redundancy and Data Anomaly

 What is Redundancy

 Redundancy refers to the duplication of data within a


database system. While some degree of redundancy is
inevitable and even necessary for efficient data retrieval,
excessive redundancy can lead to various issues, including
 increased storage requirements,

 data inconsistency, and

 decreased performance.

44
Redundancy and Data Anomaly
 Redundancy, in the context of a DBMS, occurs when the
same data is stored in multiple locations within a database.

 It can arise due to various reasons, such as


 denormalized database design,

 a lack of proper data modeling, and

 the replication of data for backup or distribution purposes.

 Redundancy can exist at the attribute level (repeating data


values within a single record) or at the relation level (repeating
entire records across multiple tables).
45
Redundancy and Data Anomaly
 Row Level Redundancy:  If the SID is primary key
to each row, you can use
it to remove the
duplicates as shown
SID SName Age below:
SID SName Age
1 Jojo 20

2 Kit 25 1 Jojo 20

1 Jojo 20 2 Kit 25

46
Redundancy and Data Anomaly (Cont..)
 Column Level Redundancy:
 Now Rows are same but in column level because of Sid
is primary key but columns are same.
Redundant
Column
Sid Sname Cid Cname Fid Fname Salary Values

1 AA C1 DBMS F1 Jojo 30000

2 BB C2 JAVA F2 KK 50000

3 CC C1 DBMS F1 Jojo 30000

4 DD C1 DBMS F1 Jojo 30000

47
Redundancy and Data Anomaly
 Example for Redundancy
 Concept of redundancy in DBMS with a simple student table.

student_id student_name student_age dept_id dept_name dept_head

Computer Steve
1 Tony Stark 18 100
Science Rogers
Computer Steve
2 Thor Odinson 18 100
Science Rogers

► Every student record in this student table has the identical


department data, dept_id, dept_name, and dept_head. The student
table becomes redundant as a result of this.
48
Redundancy and Data Anomaly
 What is Data Anomaly
 Problems that can occur in poorly planned, unnormalized
databases where all the data is stored in one table (a flat-file
database).
 Problems caused due to redundancy are anomalies.
 There are three types of anomalies that may occur when the
database is not normalized.
 Insertion Anomaly
 Update Anomaly
 Deletion Anomaly

49
Redundancy and Data Anomaly
 Anomaly Example
 Below table University consists of seven attributes: Sid,
Sname, Cid, Cname, Fid, Fname, and Salary. And the Sid
acts as a key attribute or a primary key in the relation.

50
Redundancy and Data Anomaly
 Insertion Anomaly
 An Insert Anomaly occurs when certain attributes cannot be
inserted into the database without the presence of other
attributes.
 Suppose a new faculty joins the University, and the Database
Administrator inserts the faculty data into the above table. But he is
not able to insert because Sid is a primary key, and can’t be NULL.
So this type of anomaly is known as an insertion anomaly.

51
Redundancy and Data Anomaly
 Delete Anomaly
 A Delete Anomaly exists when certain attributes are lost
because of the deletion of other attributes.
 When the Database Administrator wants to delete the student
details of Sid=2 from the above table, then it will delete the
faculty and course information too which cannot be recovered
further.

SQL:
DELETE FROM University
WHERE Sid=2;

52
Redundancy and Data Anomaly
 Update Anomaly
 An Update Anomaly exists when one or more instances of
duplicated data is updated, but not all.
 When the Database Administrator wants to change the salary
of faculty F1 from 30000 to 40000 in above table University,
then the database will update salary in more than one row due
to data redundancy. So, this is an update anomaly in a table.

SQL:
UPDATE University
SET Salary= 40000
WHERE Fid=“F1”;

To remove all these anomalies, we need to normalize


the data in the database.
53
Functional Dependency (FD)
 A functional dependency (FD) is a relationship between
two attributes, typically between the primary key and other
non-key attributes within a table.

 A functional dependency denoted by X→Y , is an association


between two sets of attribute X and Y. Here, X is called the
determinant, and Y is called the dependent.
 For example,
 SIN ———-> Name, Address, Birthdate
 Here, SIN determines Name, Address and Birthdate. So, SIN is the
determinant and Name, Address and Birthdate are the dependents.

54
Functional Dependency (FD)

 Types of functional dependency

 The following are types functional dependency in DBMS:

1. Fully-Functional Dependency

2. Partial Dependency

3. Transitive Dependency

4. Trivial Dependency

5. Multivalued Dependency

55
Functional Dependency (FD)
1. Full functional Dependency
 A functional dependency X→Y is said to be a full functional
dependency, if removal of any attribute A from X, the
dependency does not hold any more. i.e. Y is fully functional
dependent on X, if it is Functionally Dependent on X and not
on any of the proper subset of X.

 For example,
 {Emp_num,Proj_num} → Hour is a full functional dependency.
Here, Hour is the working time by an employee in a project.

56
Functional Dependency (FD)
2. Partial functional Dependency
 functional dependency X → Y is said to be a partial functional
dependency, if after removal of any attribute A from X, the
dependency does not holds. i.e. Y is dependent on a proper
subset of X. So X is partially dependent on X.

 For example,
 If {Emp_num,Proj_num} → Emp_name but also Emp_num →
Emp_name then Emp_name is partially functionally dependent
on {Empl_num,Proj_num}.

57
Functional Dependency (FD)
3. Transitive Dependency
 A functional dependency is X → Z is said to be a transitive
functional dependency if there exists the functional
dependencies X → Y and Y → Z. i.e. it is an indirect
relationship.
 For example,
 Staff_No→Branch_No and Branch_No→BAddress

58
Functional Dependency (FD)
4. Trivial Dependency
 A functional dependency X → Y is said to be a trivial
functional dependency if Y is a subset of X.
 For example,
 {Emp_num,Emp_name} → Emp_num is a trivial functional
dependency since Emp_num is a subset of {Emp_num,Emp_name}.
5. Multivalued Dependency
 Multivalued dependency occurs in the situation where there
are multiple independent multivalued attributes in a single
table.
 A multivalued dependency is a complete constraint between
two sets of attributes in a relation. It requires that certain
tuples be present in a relation.
59
Functional Dependency (FD)
 Example: Consider the following table

 The functional dependencies


 car_model -> manufr_year
 car_model-> colour are multivalued dependency since
manufr_year and color both are multivalued attribute.
60
Normalization

 Normalization is the process of decomposing the relations


into well structured relations to organize the data in the
database to remove redundancy of data, insertion anomaly,
update anomaly and deletion anomaly.
 The normal forms used for normalization are:
 First normal form(1NF)
 Second normal form(2NF)
 Third normal form(3NF)
 Boyce & Codd normal form (BCNF)
 Fourth normal form (4NF)
 Fifth normal form (5NF)
61
Normalization

 Normalization is the process of decomposing the relations


into well structured relations to organize the data in the
database to remove redundancy of data, insertion anomaly,
update anomaly and deletion anomaly.
 The normal forms used for normalization are:
 First normal form(1NF)
 Second normal form(2NF)
 Third normal form(3NF)
 Boyce & Codd normal form (BCNF)
 Fourth normal form (4NF)
 Fifth normal form (5NF)
62
Normalization – 1NF
 The first normal form is based on the simple or atomic
attribute and single valued attribute. A relation is said to be in
1NF if all the attributes of the relation are atomic and single
valued.
 Example-

 The Relation employee is not in 1NF, because employees with


employee id 102 and 104 are having two phone numbers. i.e. the
Emp_mobile attribute is a multi valued attribute.
63
Normalization – 1NF
 After normalization to 1NF the relation will be like this:

64
Normalization – 2NF
 The second normal form is based on the full functional
dependency. A relation is said to be in 2NF if it is first in 1NF and all
non-key attributes are fully functional dependent on the primary key..
i.e. no partial dependency exists.
 Example- Consider the following example:

 This table has a composite primary key {Customer ID, Store ID}. The non-
key attribute is Purchase Location. In this case, Purchase Location only
depends on Store ID, which is a part of the primary key. Therefore, this table
does not satisfy second normal form.
65
Normalization – 2NF
 To bring this table to second normal form, we have to break the
table into two tables as shown below.

 Now, in the table TABLE_STORE, the column Purchase Location


is fully dependent on the primary key of that table, which is
Store ID.
66
Normalization – 3NF
 The third normal form is based on the transitive dependency.
A relation is said to be in 3NF if it is first in 2NF and no
transitive functional dependency exists.
 Example: Consider the following table:

 In the above table, Book ID determines Genre ID, and Genre ID


determines Genre Type. Therefore, Book ID determines Genre
Type via Genre ID and we have transitive functional dependency,
and this structure does not satisfy third normal form.
67
Normalization – 3NF
 To bring this table to 3NF, we split the table into two as follows:

 Now all non-key attributes are fully functional dependent only


on the primary key. In TABLE_BOOK, both Genre ID and Price
are only dependent on Book ID. In TABLE_GENRE, Genre Type
is only dependent on Genre ID.
68
Normalization – BCNF

 Boyce Codd normal form (BCNF) - is the advance version


of 3NF. It is stricter than 3NF.

 A table is in BCNF if every functional dependency X → Y, X is


the super key of the table.

 For BCNF, the table should be in 3NF, and for every FD, LHS is
super key.

69
Normalization – BCNF
 Example: assume there is a company where employees work
in more than one department.

EMPLOYEE table:

EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO


264 India Designing D394 283
264 India Testing D394 300
364 UK Stores D283 232
364 UK Developing D283 549

70
Normalization – BCNF
 In the above table Functional dependencies are as
follows:

 EMP_ID → EMP_COUNTRY

 EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}

 Candidate key: {EMP-ID, EMP-DEPT}

 The table is not in BCNF because neither EMP_DEPT nor


EMP_ID alone are keys.

71
Normalization – BCNF
 To convert the given table into BCNF, we decompose it into
three tables:
EMP_COUNTRY table: EMP_DEPT table:

EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO


Designing D394 283
264 India
Testing D394 300
264 India Stores D283 232
EMP_DEPT_MAPPING table: Developing D283 549
EMP_ID EMP_DEPT
D394 283
D394 300
D283 232
D283 549
72
Normalization – BCNF
 Functional dependencies:

1. EMP_ID → EMP_COUNTRY

2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}

 Candidate keys:
 For the first table: EMP_ID

 For the second table: EMP_DEPT

 For the third table: {EMP_ID, EMP_DEPT}

 Now, this is in BCNF because left side part of both the functional
dependencies is a key.
73
Normalization – 4NF
➢ A relation will be in 4NF if it is in Boyce Codd normal form and
has no multi-valued dependency.

➢ For a dependency A → B, if for a single value of A, multiple values


of B exists, then the relation will be a multi-valued dependency.
• Example STUDENT
STU_ID COURSE HOBBY
21 Computer Dancing
21 Math Singing
34 Chemistry Dancing
74 Biology Cricket
59 Physics Hockey
74
Normalization – 4NF

 The given STUDENT table is in 3NF, but the COURSE and


HOBBY are two independent entity. Hence, there is no
relationship between COURSE and HOBBY.

 In the STUDENT relation, a student with STU_ID, 21 contains


two courses, Computer and Math and two hobbies, Dancing
and Singing. So there is a Multi-valued dependency on STU_ID,
which leads to unnecessary repetition of data.

75
Normalization – 4NF

 So to make the above table into 4NF, we can decompose it into


two tables:
STUDENT_COURSE STUDENT_HOBBY

STU_ID COURSE STU_ID HOBBY


21 Computer 21 Dancing
21 Math 21 Singing
34 Chemistry 34 Dancing
74 Biology 74 Cricket
59 Physics 59 Hockey

76
Denormalization

 Denormalization is a database optimization technique where


we add redundant data in the database to get rid of the
complex join operations.

 This is done to speed up database access speed.


Denormalization is done after normalization for improving the
performance of the database.

 The data from one table is included in another table to reduce


the number of joins in the query and hence helps in speeding
up the performance.
77
Denormalization
 A denormalized database should never be confused by a
database that has never been normalized.
 Example: Suppose after normalization we have two tables first,
Student table and second, Branch table. The student has the
attributes as Roll_no , Student-name , Age , and Branch_id .

78
Denormalization
 The branch table is related to the Student table with Branch_id
as the foreign key in the Student table.

► If we want the name of students along with the name of the branch name
then we need to perform a join operation. The problem here is that if the
table is large we need a lot of time to perform the join operations. So, we
can add the data of Branch_name from Branch table to the Student table and
this will help in reducing the time that would have been used in join
operation and thus optimize the database.
79
Denormalization

 Advantages of Denormalization

 Query execution is fast since we have to join fewer tables.

 Disadvantages of Denormalization

1. As data redundancy is there, update and insert operations are


more expensive and take more time. Since we are not
performing normalization, so this will result in redundant data.

2. Data Integrity is not maintained in denormalization. As there


is redundancy so data can be inconsistent.

80
81

You might also like