Logical Database Design Techniques
Logical Database Design Techniques
1
Topics
Topics Subtopics
4. 4.1. Logical Database Design for Relational Model
Logical
Database 4.2. Steps to Build Logical Data Model
Design [Link] Relations for Logical Data Model
[Link] Relations Using Normalization
[Link] Relations Against User Transactions
[Link] key and Integrity Constraints
[Link] Logical Data Model with User
[Link] Logical Data Models into Global Model
4.3. Redundancy and Data Anomaly
[Link] Dependency (FD)
[Link]
[Link] of normalization
2
Topics (Continued)
Topics Subtopics
4. 4.4. Process of Normalization (1NF, 2NF, 3NF, BCNF)
Logical
Database [Link] Form
Design [Link]. NF- First Normal Form
[Link]. NF- Second Normal Form (2NF)
[Link]. 3NF- Third Normal Form
[Link]. Boyce–Codd Normal Form (BCNF)
[Link]. Multivalued Dependency and Fourth Normal Form
[Link]. Join Dependencies and Fifth Normal Form
4.5 Pitfalls of Normalization
4.6. Denormalization
3
Logical Database Design for Relational Model
➢ Logical database design is the process of deciding how
to arrange the attributes of the entities in a given
business environment into database structures, such
as the tables of a relational database.
➢ Logical design is the process of constructing a model
of the data used in an enterprise based on a specific
data model (in our case relational), but independent of
a particular DBMS and other physical considerations.
4
Logical Database Design for Relational Model
➢ The goal of logical database design is to create well
structured tables that properly reflect the company's
business environment.
➢ The tables will be able to store data about the
company's entities in a non-redundant manner and
foreign keys will be placed in the tables so that all the
relationships among the entities will be supported.
➢ The purpose of logical design is to translate the
conceptual representation to the logical structure of
the database, which includes designing the relations.
5
Logical Database Design for Relational Model
concepts.
9
Derive relations for logical data model
How to represent entity types
1) Strong entity types
➢ For each strong entity in the data model, create a relation
that includes all the simple attributes of that entity.
➢ For composite attributes, include only the constituent
simple attributes(its components only).
2) Weak entity types
➢ For each weak entity in the data model, create a relation that
includes all the simple attributes of that entity.
➢ A weak entity must include its Partial key and its owner entity
type PK as a FK. The combination of the two keys form the
PK of the weak entity.
10
Example
Employee
Emp_NO{PK}
Name
Fname
Mname
Lname
Gender
Partial Key of
Salary
Week entity
Dep NO
Dependent
1..1
Name
Claims Gender
0..* Relationship
11
Derive relations for logical data model
How to represent relationship types
3) One-to-many (1:*) binary relationship types
➢ For each 1:* binary relationship, the entity on the ‘one side’ of
the relationship is designated as the parent entity and the
entity on the ‘many side’ is designated as the child entity.
➢ To represent this relationship, post a copy of the Primary Key
attribute(s) of parent entity into the relation representing the
child entity, to act as a foreign key.
➢ Add attributes that describes the relationship to the child
entity(if exist).
12
Example
1..*
Branch
Branch_No {PK}
Allocates
Branch Name
0..1
Employee (Emp_NO, Name, Sex, Salary,Branch No)
Primary Key Emp_NO
Branch (Branch_No, Branch Name)
Primary Key Branch_No
13
Derive relations for logical data model
How to represent relationship types
4) One-to-one (1:1) binary relationship types
➢ Creating relations to represent a 1:1 relationship is more complex as the
cardinality cannot be used to identify the parent and child entities in a
relationship.
➢ Instead, the participation constraints are used to decide whether it is
best to represent the relationship by:
1. Combining the entities involved into one relation
2. OR by creating two relations and posting a copy of the Primary
Key from one relation to the other.
➢ Consider the following :
(a) mandatory participation on both sides of 1:1 relationship
(b) mandatory participation on one side of 1:1 relationship
(c) optional participation on both sides of 1:1 relationship
14
Derive relations for logical data model
How to represent relationship types
(a) one-to-one (1:1) - Mandatory participation on both sides
15
Derive relations for logical data model
How to represent relationship types
➢ Option One:
16
Derive relations for logical data model
How to represent relationship types
➢ Option Two:
17
Derive relations for logical data model
How to represent relationship types
18
Derive relations for logical data model
How to represent relationship types
(b) one-to-one (1:1) Mandatory participation on one side
➢ Identify parent and child entities using participation constraints.
Entity with optional participation in relationship is designated as
the parent entity, and entity with mandatory participation is
designated as the child entity.
➢Assuming most of the Cars are allocated and only a minority of Staff uses a
And create a new relation to represent the relationship. The new relation
would only have two attributes, both copies of the primary key(i.e. the
primary key has two attributes, both taken from the primary key of the
entity).
23
Example a)
One-to-many (1:*) recursive relationships
The representation of
a 1:* recursive
relationship is
similar to 1:* binary Is_managed_by
relationship. Staff
However, in this
0..*
case, both the parent
and child entity is Employee
the same entity.
Employee_I{PK}
Manager 0..1 Employee_Name
Employee_DOB
24
24
Example b)
Many-to-many (*:*) recursive relationships
Contains
Component
0..*
Quantity
Item
Item No{PK}
Product 0..* Description
Unit_Cost
25
25
Derive relations for logical data model
How to represent relationship types
6) Many-to-many (*:*) binary relationship types
➢ These foreign keys will also form the Primary Key of the new
relation, possibly in combination with some of the attributes of the
relationship.
26
Example
Employee
Emp_NO{PK}
Name
Sex
Salary
1..*
Project
28
Example
Sdate
Edate
Business Supplier
Contracts
BizNo {PK} SupNo {PK}
Lawyer
LawNo, {PK}
29
Derive relations for logical data model
How to represent relationship types
8) Multi-valued attributes
➢ Create a new relation to represent multi-valued attribute
and include Primary Key of entity in new relation, to act as
a foreign key.
➢ Unless the multi-valued attribute is itself an alternate key of
the entity, the Primary Key of the new relation is the
combination of the multi-valued attribute and the Primary
Key of the entity.
30
Example
Telephone(Tel_no, EmpNo)
Primary Key Tel_no
OR
Telephone(Tel_no, EmpNo)
Primary Key Tel_no, EmpNo
31
Summary of how to map entities and
relationships to relations
32
Validate relations using normalisation
For each identified table (old and new), you must ensure that
all attributes are fully dependent on the identified primary key.
33
Validate relations against user transactions
Ensure that the relations in the logical data model support the
required transactions.
ensure that the relations created in the previous step also support
these transactions, and thereby ensure that no error has been
introduced while creating relations.
34
Check key and Integrity Constraints
35
Check key and Integrity Constraints
General constraints (Triggers): Updates to entities may be
controlled by constraints governing transactions that are
represented by the updates.
36
Check key and Integrity Constraints
(Referential Integrity constraints)
As stated previously a foreign key value is an attribute that is copied to
one table and is a primary key value in another table. Foreign key values
are used to link tables together.
The attribute staffNo in the Client table is a foreign key value as it is the
primary key value in the Staff table.
37
Cascading Referential Integrity constraints
Insert tuple into child relation: Value of foreign key must match
primary key value of parent table or else be null. Inserting a Client
record: staffNo must match an existing staffNo from the Staff table
or else be NULL.
39
Cascading Referential Integrity constraints
Delete tuple from parent relation: This has repercussions on the
child table. What if we deleted a Staff member that is associated with
Clients? The corresponding Client records would have a staff member
that no longer exists in the Staff table, therefore violates referential
integrity.
There are a few options:
No Action: Prevent a deletion from parent relation if there are any
referenced child tuples.
Cascade: When a parent tuple is deleted automatically delete any referenced
child tuples.
Set Null: When a parent is deleted, the foreign key values in all
corresponding child tuples are automatically set to null.
Set default: When a parent is deleted, the foreign key values in all
corresponding child tuples are automatically set to their default values.
No Check: When a parent tuple is deleted, do nothing to ensure that
referential integrity is maintained.
40
Cascading Referential Integrity constraints
Update primary key of parent tuple: Again, this has
repercussions on the child table. What would happen if all staff
numbers in the Staff table were updated from 5 to 6 characters?
For example, s1234 became s01234? Staff numbers in the Client
table would no longer be valid.
If the primary key value of a parent relation tuple is updated,
referential integrity is lost if there exists a child tuple
referencing the old primary key value. To ensure referential
integrity, the strategies as for (Delete tuple from parent
relation) will suffice.
Note that the options available in MySQL are: cascade, set null,
no action, and restrict. No Action is a keyword from standard
SQL. In MySQL, it is equivalent to RESTRICT.
41
Review logical data model with user
Review the logical data model with the users to ensure that
they consider the model to be a true representation of the data
requirements of the enterprise.
42
Merge logical data models into global model
Merge logical data models into a single global logical data model
that represents all user views of a database.
43
Redundancy and Data Anomaly
What is Redundancy
decreased performance.
44
Redundancy and Data Anomaly
Redundancy, in the context of a DBMS, occurs when the
same data is stored in multiple locations within a database.
2 Kit 25 1 Jojo 20
1 Jojo 20 2 Kit 25
46
Redundancy and Data Anomaly (Cont..)
Column Level Redundancy:
Now Rows are same but in column level because of Sid
is primary key but columns are same.
Redundant
Column
Sid Sname Cid Cname Fid Fname Salary Values
2 BB C2 JAVA F2 KK 50000
47
Redundancy and Data Anomaly
Example for Redundancy
Concept of redundancy in DBMS with a simple student table.
Computer Steve
1 Tony Stark 18 100
Science Rogers
Computer Steve
2 Thor Odinson 18 100
Science Rogers
49
Redundancy and Data Anomaly
Anomaly Example
Below table University consists of seven attributes: Sid,
Sname, Cid, Cname, Fid, Fname, and Salary. And the Sid
acts as a key attribute or a primary key in the relation.
50
Redundancy and Data Anomaly
Insertion Anomaly
An Insert Anomaly occurs when certain attributes cannot be
inserted into the database without the presence of other
attributes.
Suppose a new faculty joins the University, and the Database
Administrator inserts the faculty data into the above table. But he is
not able to insert because Sid is a primary key, and can’t be NULL.
So this type of anomaly is known as an insertion anomaly.
51
Redundancy and Data Anomaly
Delete Anomaly
A Delete Anomaly exists when certain attributes are lost
because of the deletion of other attributes.
When the Database Administrator wants to delete the student
details of Sid=2 from the above table, then it will delete the
faculty and course information too which cannot be recovered
further.
SQL:
DELETE FROM University
WHERE Sid=2;
52
Redundancy and Data Anomaly
Update Anomaly
An Update Anomaly exists when one or more instances of
duplicated data is updated, but not all.
When the Database Administrator wants to change the salary
of faculty F1 from 30000 to 40000 in above table University,
then the database will update salary in more than one row due
to data redundancy. So, this is an update anomaly in a table.
SQL:
UPDATE University
SET Salary= 40000
WHERE Fid=“F1”;
54
Functional Dependency (FD)
1. Fully-Functional Dependency
2. Partial Dependency
3. Transitive Dependency
4. Trivial Dependency
5. Multivalued Dependency
55
Functional Dependency (FD)
1. Full functional Dependency
A functional dependency X→Y is said to be a full functional
dependency, if removal of any attribute A from X, the
dependency does not hold any more. i.e. Y is fully functional
dependent on X, if it is Functionally Dependent on X and not
on any of the proper subset of X.
For example,
{Emp_num,Proj_num} → Hour is a full functional dependency.
Here, Hour is the working time by an employee in a project.
56
Functional Dependency (FD)
2. Partial functional Dependency
functional dependency X → Y is said to be a partial functional
dependency, if after removal of any attribute A from X, the
dependency does not holds. i.e. Y is dependent on a proper
subset of X. So X is partially dependent on X.
For example,
If {Emp_num,Proj_num} → Emp_name but also Emp_num →
Emp_name then Emp_name is partially functionally dependent
on {Empl_num,Proj_num}.
57
Functional Dependency (FD)
3. Transitive Dependency
A functional dependency is X → Z is said to be a transitive
functional dependency if there exists the functional
dependencies X → Y and Y → Z. i.e. it is an indirect
relationship.
For example,
Staff_No→Branch_No and Branch_No→BAddress
58
Functional Dependency (FD)
4. Trivial Dependency
A functional dependency X → Y is said to be a trivial
functional dependency if Y is a subset of X.
For example,
{Emp_num,Emp_name} → Emp_num is a trivial functional
dependency since Emp_num is a subset of {Emp_num,Emp_name}.
5. Multivalued Dependency
Multivalued dependency occurs in the situation where there
are multiple independent multivalued attributes in a single
table.
A multivalued dependency is a complete constraint between
two sets of attributes in a relation. It requires that certain
tuples be present in a relation.
59
Functional Dependency (FD)
Example: Consider the following table
64
Normalization – 2NF
The second normal form is based on the full functional
dependency. A relation is said to be in 2NF if it is first in 1NF and all
non-key attributes are fully functional dependent on the primary key..
i.e. no partial dependency exists.
Example- Consider the following example:
This table has a composite primary key {Customer ID, Store ID}. The non-
key attribute is Purchase Location. In this case, Purchase Location only
depends on Store ID, which is a part of the primary key. Therefore, this table
does not satisfy second normal form.
65
Normalization – 2NF
To bring this table to second normal form, we have to break the
table into two tables as shown below.
For BCNF, the table should be in 3NF, and for every FD, LHS is
super key.
69
Normalization – BCNF
Example: assume there is a company where employees work
in more than one department.
EMPLOYEE table:
70
Normalization – BCNF
In the above table Functional dependencies are as
follows:
EMP_ID → EMP_COUNTRY
71
Normalization – BCNF
To convert the given table into BCNF, we decompose it into
three tables:
EMP_COUNTRY table: EMP_DEPT table:
1. EMP_ID → EMP_COUNTRY
Candidate keys:
For the first table: EMP_ID
Now, this is in BCNF because left side part of both the functional
dependencies is a key.
73
Normalization – 4NF
➢ A relation will be in 4NF if it is in Boyce Codd normal form and
has no multi-valued dependency.
75
Normalization – 4NF
76
Denormalization
78
Denormalization
The branch table is related to the Student table with Branch_id
as the foreign key in the Student table.
► If we want the name of students along with the name of the branch name
then we need to perform a join operation. The problem here is that if the
table is large we need a lot of time to perform the join operations. So, we
can add the data of Branch_name from Branch table to the Student table and
this will help in reducing the time that would have been used in join
operation and thus optimize the database.
79
Denormalization
Advantages of Denormalization
Disadvantages of Denormalization
80
81