Chapter 2: Overview
of Database Designs
By: Kuribachew Gizaw
(PhD)
Dec 2024
Outline
2.1 Database designs
2.2
Conceptual Database Design---ER-
modeling
2.3 Logical Database Design
2.3.1 Mapping,
2.3.2 Normalization
2.4 Physical design
Overview of Database Design
In many software design tools, the database design methodologies and
software engineering methodologies are intertwined since these activities
are strongly related.
Database development process include the following activities:
Planning of Information systems Design
Requirements Analysis,
Design (Conceptual, Logical and Physical Design)
Tuning
Implementation
Operation and Support
Database Design
The prime interest of a database system is the
Design part which is again sub divided into other
three sub-phases.
These sub-phases are:
1. Conceptual Design
2. Logical Design, and
3. Physical Design
Conceptual Database Design
Conceptual Design is about “what kind of objects a database contains” .
Is based on some dedicated modeling technique
(Entity-Relationship, UML)
The result of Conceptual Design identifies:
the types of objects about which data will be collected
the properties of objects that will be presented as data items
such dependencies among objects and data items that should be reflected in
the database
Entity – Relationship (ER) model
Popular high-level conceptual data model
Description of the data requirements of users
Concepts like:
entities
attributes
relationships and
constraints are used in this model
It is a top-down approach
Entity, Attributes, & Relationships
Entity
It is a basic object which an ER model represents
It is the thing about which an organization wants to keep data
It is an aggregation of a number of data elements where each data element is an
attribute of the entity
Attributes
Each entity has attributes – the particular properties that describe
it. e.g. name is an attribute of the entity STUDENT
An entity has a value for each of its attributes
The legal values of an attribute are specified by a domain.
Example: Domain for sex attribute may be {M, F}
Types of Entities
Strong entity type –
Does not depend on anything else in the database for its
existence. It contains its own primary key–
Weak entity type
depends on some related entity for its existence. It has no
candidate keys without including the primary key of the entity on
which it depends.
All entities / relationships of the same type have the same attributes
Has
Staff Dependent
Relationship
Relationship – is a set of meaningful associations among
entity types
Has
Branch Staff
• Degree of Relationship
– the number of participating entities in a relationship
• Relationship of degree one is called unary/recursive
– A recursive relationship is a relationship where the same entity type participates
more than once in different roles.
• Relationship of degree two is called binary
• Relationship of degree three is called ternary
• Relationship of degree four is called quaternary (rare occurrence)
Structural Constraints
1. Cardinality: The maximum number of possible
relationship occurrences for an entity participating in a
given relationship type.
In binary relationship there are:-
An entity of one entity type can be associated with only one
entity of another entity type and vise versa
Manages
Staff Branch
1..1 0..1
Members of a staff can
manage zero or one branch
a branch always has multiplicity
one manager
Chen Notation for ER Modeling Crow’s Feet Notation for ER Modeling
attribute
Attribute Entity Name
attribute AttributeName1
Primary key attribute [derivedAttribute]
Multi-valued attribute attribute {multivaluedAttribute}
compositeAttribute(…)
Derived attribute attribute
Attribute
Relationship name
One-to-one relationship 1 1
M Relationship name
One-to-many relationship 1
Relationship name
Many-to-many relationship N M
Mandatory for A and B 1 M Relationship
A B A B
Mandatory participation of B, M
1 Relationship
optional participation of A A B A B
Activity 2
Draw an ER diagram that represents all the involved entities, their
relationship and constraints in ABC_online shopping.
Entities
Customer
Order
Product
Staff
Relationship can be drawn from the following description
Any customer orders any product
One Staff contacts many customers
Logical DB Design
Logical database design is the process of constructing a
model of the information used in an enterprise based on
a specific data model, but
independent of a particular DBMS and other physical considerations.
Two basic activities During Logical DB Design
Mapping
Normalization
Mapping is the process of mapping the ER model to a
relational model
Normalization is the process of producing a set of relations
with desirable properties according to the data requirement
of an enterprise.
Some of the Rules used to Map conceptual
schemas to relation schemas
1. Strong entity types
Create a new relation that includes all the simple attributes (or
simple components of composite attributes)
Leave out multi-valued attributes
Pick a PK.
2. Weak entity types
Create a new relation that includes all the simple attributes (or
simple components of composite attributes)
Include as foreign key
the primary key(s) of the owner entity types (s).
e.g. add an SSN attribute to the dependent relation.
PK is the combination of the FK and the partial key.
Example: Mapping a regular entity
This …
Becomes …
Example: Mapping a Composite Attribute
Resulting relation
Example: Mapping a weak entity, This …
Becomes…
Rules continued …
3. One-to-many (1:*) binary relationship types
Add to the relation on the many side an FK to the
other relation.
E.g. Customer_ID is added to the Order relation to
represent the customer in the Customer relation.
(see next slide)
Include any relationship attributes
Rules continued …
4. Many-to-many (*:*) binary relationship types
Create a new relation containing
FKs to both participating relations
You can possibly include additional relationship attributes
to this relation
E.g. see the next slide
PK of the Order_Line relation is the combination of both
FKs
Sometimes you may also need to add one or more of the
relationship attributes to form the PK.
Rules continued …
5. One-to-one(1:1) binary relationship types
Add to one of the participating relations an FK to the
other relation:
You may include any additional relationship
attributes.
Rules continued …
6. Multi-valued attributes
create a new relation containing
FK to the relation of the entity
the attribute or attributes (if composite)
PK is the FK plus
the attribute if it was a simple one
one or more of the attributes for composite ones.
Example: This …
is modeled as…
Rules continued…
7. Ternary relationship
Create a new relation containing a foreign key
referencing each of the three entities involved
Include any relationship attributes as required
PK is usually a combination of all the three FKs but
May also require some relationship attributes
May not need all the three FKs if some entity can only
participate once with each combination of the others.
… is mapped to the following Relations
Example: This conceptual model (ER model) …
Supervises
Supervisee 0..* Supervisor 0..1 Works For
Employee Department
1..* 1..1
name Manages DName
eid{pk} 1..1 0..1 number {pk}
1..1
sex start_Date
bdate Controls
Addr 1..*
1..1 Works on
1..* 0..*
Has 1..* Project
Hours pName
0..* number {pk}
Dependent
Name
Sex
bdate
… is mapped to this logical model (Relation
Schemas)
Assignment: Referential integrity constraints are not
shown in the schema. Include them.
DEPARTMENT dname dnumber mgrssn mgr_start_date
EMPLOYEE name eid bdate addr sex salary dno super_id
PROJECT pname pnumber plocation dnum
DEPENDENT eid dependent-name sex bdate relationship
WORKS_ON eid pno hours
NORMALIZATION
Normalization
Themain objective in developing a logical
data model is to create an accurate
representation of the data, its relationships
and constraints.
Toachieve this objective, we must identify a
suitable set of relations.
A technique we can use to help identify such
relations is known as normalization.
The Purpose of Normalization
Normalization- a technique for producing a set
of relations with desirable properties, given
the data requirement of an enterprise.
Normalization reduces data redundancy in a
database.
Bydoing so it eliminates serious manipulation
anomalies.
The Purpose of …
Normalization is often performed as a series of tests
on a relation to determine whether it satisfies or
violates the requirements of a given normal form.
Three normal forms were initially proposed, called
first normal form (1NF), second normal form (2NF),
and third normal form (3NF).
Subsequently, a stronger definition of third normal
form was introduced by R. Boyce and E.F. Codd,
referred to as Boyce-Codd Normal Form (BCNF).
The purpose of …
Higher normal forms that go beyond BCNF
were introduced later such as fourth (4NF)
and fifth (5NF) normal form.
However, these later normal forms deal
with practical situations that are very rare.
We shall see only the first three in this
course.
The purpose of …
The process of normalization is a formal
method that identifies relations based on their
primary key (or candidate keys) and the
functional dependencies among their
attributes
Normalization supports DB designers by
presenting a series of tests, which can be
applied to individual relations so that a
relational schema can be normalized to a
specific form to prevent the possible
occurrence of update anomalies.
Data Redundancy and Update
Anomalies
A major aim of relational database design is to
group attributes into relations so as to minimize
data redundancy and thereby reduce the file storage
space required by the implemented base relations.
As an example to demonstrate these problems,
consider the following two alternative ways of
capturing data. The first alternative uses two tables,
whereas the second one uses one table.
Alternative One:
Staff Relation
IDNo fName sAddress Position Salary branchNo
SL21 Alie Arat Killo Manager 3000 B5
SG37 Chaltu Megenagna Snr Asst 1200 B3
SG14 Kiflu Merkato Deputy 1800 B3
SA9 Kassa Shiro meda Arat Assistant 900 B7
SG5 Hirut Killo Manager 2400 B3
SL41 Taye Merkato Assistant 900 B5
Branch Relation
branchNo bAddress btelNo
B5 Arada 55 66 77
B7 Mexico 66 77 88
B3 Merkato 77 88 99
Alternative Two:-
Staff_Branch Relation
Staff_No fName sAddress position salary branchN bAddress telNo
o
SL21 Alie Arat Killo Manager 3000 B5 Arada 55 66 77
SG37 Chaltu Megenagna Snr Asst 1200 B3 Merkato 77 88 99
SG14 Kiflu Merkato Deputy 1800 B3 Merkato 77 88 99
SA9 Kassa Shiro Meda Assistant 900 B7 Mexico 66 77 88
SG5 Hirut Arat Killo Manager 2400 B3 Merkato 77 88 99
SL41 Taye Merkato Assistant 900 B5 Arada 55 66 77
Functional Dependencies
Oneof the main concepts associated with
normalization is functional dependency.
Functional
dependency - describes the relationship
between attributes in a relation.
If A and B are attributes of relation R, B is functionally
dependent on A (denoted AB), if each value of A is
associated with exactly one value of B. (A and B may each
consist of one or more attributes).
Functional Dependencies continued…
Functional dependency is a property of the meaning
or semantics of the attributes in a relation.
The semantics indicate how attributes relate to one
another, and specify the functional dependencies
between attributes.
When a functional dependency is present, the
dependency is specified as a constraint between the
attributes
Functional Dependencies continued…
Consider a relation with attributes A and B, where
attribute B is functionally dependent on attribute A.
If we know the value of A and we examine the relation that holds
the dependency, we find only one value of B in all the rows that
have a given value of A at any moment in time.
That is, when two rows have the same value of A, they also have
the same value of B.
However, for a given value of B there may be several different
value of A.
Functional Dependencies continued…
B is functionally
A B
Dependent on A
• Determinant – the determinant of a functional dependency refers
to the attribute or group of attributes on the left hand side of the
arrow.
• That is,when a functional dependency exists, the attribute or
group of attributes on the left-hand side of the arrow is called the
determinant. For example, A is the determinant of B in the above
model.
Example – Functional dependencies
Consider the attributes iDNo and position of the Staff
relation given below. For a specific iDNo, for example SL21,
we determine the position of that member of staff as
Manager.
In other words, the position attribute is functionally
dependent on the iDNo, but the opposite is not true, iDNo is
not functionally dependent on position. A member of staff
holds one position, however, there may be several
members of staff with the same position.
For each id number, there is only one position, on the other
hand, there are several id numbers associated with a
position.
iDNo is functionally
iDNo position
dependent on position
The Process of Normalization
Normalization is a formal technique for analyzing relations based on their
primary key and functional dependencies.
The technique involves a series of rules that can be used to test individual
relations so that a database can be normalized to any degree.
Normalization is often executed as a series of steps.
The Process of Normalization…
Each step corresponds to a specific normal form
that has known properties.
Thefirst normal form (1NF) is critical in creating
appropriate relations. All the subsequent normal
forms are optional.
However, to avoid the update anomalies, it is
normally recommended that we proceed to at
least 3NF.
Stages of Normalization
Unnormalised
(UDF)
Remove repeating groups
First normal form
(1NF)
Remove partial dependencies
Second normal form
(2NF)
Remove transitive dependencies
Third normal form
(3NF)
Remove remaining functional
dependency anomalies
Boyce-Codd normal
form (BCNF)
Remove multivalued dependencies
Fourth normal form
(4NF)
Remove remaining anomalies
Fifth normal form
(5NF)
First Normal Form (1NF)
Unnormalized form (UNF) – a table that
contains one or more repeating groups.
1NF – a relation in which the intersection of
each row and column contains one and only
one value.
Itstates that the domain of an attribute must
include only atomic (simple, indivisible)
values.
Unnormalised Normal Form (UNF)
ORDER
Customer No: 001964 Order Number: 00012345
Name: Mark Campbell Order Date: 14-Feb-2002
Address: 1 The House
Leytonstone
E11 9ZZ
Product Product Unit Order Line
Number Description Price Quantity Total
T5060 Hook 5.00 5 25.00
PT42 Bolt 2.50 10 20.50
QZE48 Spanner 20.00 1 20.00
Order Total: 65.50
ORDER (orderNo, orderDate, custNo, custName, custAdd,
(prodNo, prodDesc, unitPrice, ordQty, lineTotal)*, orderTotal)
Example - UNF to 1NF
ORDER (orderNo, orderDate, custNo, custName, custAdd,
(prodNo, prodDesc, unitPrice, ordQty, lineTotal)*, orderTotal)
1. Remove the outermost repeating group (and any nested repeated groups it may contain)
and create a new relation to contain it. (rename original to indicate 1NF)
ORDER-1 (orderNo, orderDate, custNo, custName, custAdd, orderTotal)
(prodNo, prodDesc, unitPrice, ordQty, lineTotal)
2. Add to this relation a copy of the PK of the relation immediately enclosing it.
ORDER-1 (orderNo, orderDate, custNo, custName, custAdd, orderTotal)
(orderNo, prodNo, prodDesc, unitPrice, ordQty, lineTotal)
3. Name the new entity (appending the number 1 to indicate 1NF)
ORDER-LINE-1 (orderNo, prodNo, prodDesc, unitPrice, ordQty, lineTotal)
4. Determine the PK of the new entity
ORDER-LINE-1 (orderNo, prodNo, prodDesc, unitPrice, ordQty, lineTotal)
Second Normal Form (2NF)
2NF is based on the concept of full functional
dependency.
Full functional dependency – indicates that if A (which is
primary key) and B (non key attribute) are attributes of
a relation, we say B is fully functionally dependent on A if
B is functionally dependent on A, but not on any proper
subset of A.
In other words, A functional dependency A B: is said to
be a full functional dependency if removal of any
attribute from A results in the dependency not being
sustained any more, otherwise it is called partially
dependency.
Second Normal Form (2NF)
Definition: A relation is in 2NF if, and only if, it is in 1NF and
every non-key attribute is fully dependent on the primary key.
Steps from 1NF to 2NF:
• Remove the attributes that are only partially functionally dependent on
the composite key, and place them in a new relation.
• Add to this relation a copy of the attribute(s) which are the determinants
of these attributes. These will automatically become the primary key of
this new relation.
• Name the new entity (appending the number 2 to indicate 2NF)
• Rename the original entity (ending with a 2 to indicate 2NF)
Example - 1NF to 2NF
ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
1. Remove the attributes that are only partially functionally dependent on the composite
key, and place them in a new relation.
ORDER-LINE-1 (order-no, prod-no, ord-qty, line-total)
(prod-desc, unit-price)
2. Add to this relation a copy of the attribute(s) which determines these attributes. These will
automatically become the primary key of this new relation..
ORDER-LINE-1 (order-no, prod-no, ord-qty, line-total)
(prod-no, prod-desc, unit-price)
3. Name the new entity (appending the number 2 to indicate 2NF)
PRODUCT-2 (prod-no, prod-desc, unit-price)
4. Rename the original entity (ending with a 2 to indicate 2NF)
ORDER-LINE-2 (order-no, prod-no, ord-qty, line-total)
Third Normal Form (3NF)
3NF is based on the idea of transitive
dependency.
Transitivedependency: A condition where A, B,
and C are attributes of a relation such that if AB
and BC, then C is transitively dependent on A
via B (provided that A is not functionally
dependent on B or C)
It exists when a non-key attribute determines another
non-key attribute
Transitive Dependency
Definition: A transitive dependency exists when there is an
intermediate functional dependency.
Formal Notation: If A B and B C, then it can be stated
that the following transitive dependency exists:
ABC
Third Normal Form (3NF)
Definition: A relation is in 3NF if, and only if, it is in 2NF and every
non-key attribute is non-transitively dependent on the primary key.
Steps from 2NF to 3NF:
• Remove the attributes that are transitively dependent on non-key
attribute(s), and place them in a new relation.
• Add to this relation a copy of the attribute(s) which are the
determinants of these attributes. These will automatically become
the primary key of this new relation.
• Name the new entity (appending the number 3 to indicate 3NF)
• Rename the original entity (ending with a 3 to indicate 3NF)
Example - 2NF to 3NF
ORDER-2 (order-no, order-date, cust-no, cust-name, cust-add, order-total
1. Remove the attributes that are transitively dependent on non-key attributes, and place
them in a new relation.
ORDER-2 (order-no, order-date, cust-no, order-total
(cust-name, cust-add )
2. Add to this relation a copy of the attribute(s) which determines these attributes. These will
automatically become the primary key of this new relation.
ORDER-2 (order-no, order-date, cust-no, order-total
(cust-no, cust-name, cust-add )
3. Name the new entity (appending the number 3 to indicate 3NF)
CUSTOMER-3 (cust-no, cust-name, cust-add )
4. Rename the original entity (ending with a 3 to indicate 3NF)
ORDER-3 (order-no, order-date, cust-no, order-total
The resulting Relations which are
each in 3NF are the following:-
ORDER-3 (order-no, order-date, cust-no, order-total)
CUSTOMER-3 (cust-no, cust-name, cust-add )
PRODUCT-2 (prod-no, prod-desc, unit-price)
ORDER-LINE-2 (order-no, prod-no, ord-qty, line-total)
Physical
Database Design
Physical Database Design
Physical database design is the process of producing a
description of the implementation of the database on
secondary storage.
Physical design describes the base relation, file organization,
and indexes used to achieve efficient access to the data,
and any associated integrity constraints and security
measures.
Sources of information for the physical design process
include global logical data model and documentation that
describes model.
Logical database design is concerned with the what; physical
database design is concerned with the how.
It describes the storage structures and access methods
used to achieve efficient access to the data.
Steps in physical database design
1. Translate logical data model for target DBMS
2. Design base relation
3. Design representation of derived data
4. Design enterprise constraint
5. Design physical representation
6. Analyse transactions
7. Choose file organization
8. Choose indexes
9. Estimate disk space and system requirement
10. Design user view
11. Design security mechanisms
12. Consider controlled redundancy
13. Monitor and tune the operational system
Thank You!