UNIT – II
Syllabus: Database Design and Storage Structures
High-Level Conceptual Data Models for Database Design. Entity
Types, Entity Sets, Attributes and Keys.
Relationship Types, Relationship Sets, Roles and Structural
Constraints. Weak Entity Types.
Extended ER Features. Refining the ER Design. Naming
Conventions and Design Issues. ER to Relational Mapping.
File Organization and Storage. Secondary Storage Devices. File
Organization Techniques. Single-Level Ordered Index. Multi-Level
Indexes. Indexes on Multiple Keys
2.1
High-Level Conceptual Modeling
Conceptual modeling is a very important phase in designing a
successful database application.
Entity-Relationship (ER) model is a popular high-level
conceptual data model.
This model and its variations are frequently used for the
conceptual design of database applications, and many database
design tools employ its concepts.
2.2
2.3
The first step shown is requirements collection and
analysis. During this step, the database designers
interview prospective database users to understand and
document their data requirements.
The next step is to create a conceptual schema for the
database, using a high-level conceptual data model.
This step is called conceptual design.
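The next step in database design is the actual
implementation of the database, using a commercial
DBMS. The conceptual schema is transformed into the
data model of the chosen DBMS; this step is called
logical design (or data model mapping).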
The last step is the physical design phase, during
which the internal storage structures, file
organizations, indexes, access paths, and physical
design parameters for the database files are
specified.
2.4
Entity-Relationship Model
Entity, Relationship, and E-R Diagram
Entity
Attributes
Weak/Regular Entity
Relationship
Degree
Mapping Cardinality
Relationship with Attributes
Participation
Keys
Extended E-R Features
Entity Specialization/Generalization
Relationship Aggregation
Design of an E-R Database Schema
2.5
Entity, Relationship, and E-R Diagram
A database can be modeled as:
a collection of entities,
relationships among entities.
A database can be illustrated by an E-R diagram.
2.6
E-R Diagrams
Rectangles represent entity sets.
Diamonds represent relationship sets.
Lines link attributes to entity sets and entity sets to relationship sets.
Ellipses represent attributes
Double ellipses represent multivalued attributes. (will study later)
Dashed ellipses denote derived attributes. (will study later)
Underline indicates primary key attributes (will study later)
2.7
Entity Sets
An entity is an object that exists and is distinguishable
from other objects.
Example: specific person, company, event, plant
Entities have attributes
Example: people have names and addresses
An entity set is a set of entities of the same type that
share the same properties.
Example: set of all persons, companies, trees,
holidays
2.8
Entity Sets customer and loan
customer: customer-id, customer-name, customer-street, customer-city
loan: loan-number, amount
2.9
Attributes
An entity is represented by a set of attributes, that is, descriptive
properties possessed by all members of an entity set.
Example:
customer = (customer-id, customer-name,
customer-street, customer-city)
loan = (loan-number, amount)
Domain – the set of permitted values for each attribute
Attribute types:
Simple and composite attributes.
Single-valued and multi-valued attributes
E.g. multivalued attribute: phone-numbers
Derived attributes
Can be computed from other attributes
E.g. age, given date of birth
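The attribute types listed above can be sketched in Python. This is only an illustration under stated assumptions: the class names Customer and Address, the field names, and the sample values are made up for the example and do not come from the slides.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Address:
    # Composite attribute: built from simpler component attributes.
    street: str
    city: str


@dataclass
class Customer:
    customer_id: str        # simple, single-valued attribute
    name: str               # simple, single-valued attribute
    address: Address        # composite attribute
    date_of_birth: date     # stored attribute
    phone_numbers: List[str] = field(default_factory=list)  # multivalued

    def age(self, today: date) -> int:
        # Derived attribute: computed from date_of_birth, never stored.
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


# Hypothetical sample entity.
hayes = Customer("C-101", "Hayes", Address("Main", "Harrison"),
                 date(1990, 6, 15), ["555-0100", "555-0101"])
print(hayes.age(date(2024, 6, 14)))  # prints 33 (one day before the birthday)
```

Keeping age as a method rather than a stored field mirrors the idea that a derived attribute can always be recomputed from the attributes it depends on.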
2.10
Composite Attributes
2.11
E-R Diagram With Composite, Multivalued, and
Derived Attributes
2.12
Weak Entity and Regular/Strong Entity
A weak entity is an entity that is existence-dependent
on some other entity. By contrast, a regular entity (or
“a strong entity”) is an entity which is not weak.
The existence of a weak entity set depends on the
existence of an identifying entity set.
It must relate to the identifying entity set via a total,
one-to-many relationship set from the identifying to
the weak entity set.
E.g. an employee’s dependents might be weak
entities: they can’t exist (so far as the database is
concerned) if the relevant employee does not exist.
A weak entity type can be related to more than one
regular entity type.
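The existence dependency described above can be sketched with Python's built-in sqlite3 module. The table and column names (employee, dependent, emp_id, dep_name) are assumptions for this example. The weak entity's key combines the owner's key with a name that is only unique per employee, and ON DELETE CASCADE is one possible way to model "cannot exist without the employee".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    emp_id TEXT PRIMARY KEY,
    name   TEXT NOT NULL
);
-- Weak entity: identified by the owner's key plus its own partial key.
CREATE TABLE dependent (
    emp_id   TEXT NOT NULL REFERENCES employee(emp_id) ON DELETE CASCADE,
    dep_name TEXT NOT NULL,
    PRIMARY KEY (emp_id, dep_name)
);
""")

conn.execute("INSERT INTO employee VALUES ('E1', 'Smith')")
conn.executemany("INSERT INTO dependent VALUES (?, ?)",
                 [("E1", "Alice"), ("E1", "Bob")])

# Removing the identifying entity removes its dependents as well.
conn.execute("DELETE FROM employee WHERE emp_id = 'E1'")
remaining = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
print(remaining)  # prints 0
```

The NOT NULL foreign key corresponds to total participation of the weak entity set in its identifying relationship.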
2.13
Weak Entity and Regular/Strong Entity
We depict a weak entity by double rectangles.
The identifying relationship is depicted using a
double diamond.
2.14
Relationship Sets
A relationship is an association among several entities
Example:
Hayes (customer entity) is linked to A-102 (account entity)
by the depositor relationship set.
A relationship set is a mathematical relation among n ≥ 2
entities, each taken from entity sets
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
where (e1, e2, …, en) is a relationship
Example:
(Hayes, A-102) ∈ depositor
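The set-theoretic definition above can be checked directly in Python: a relationship set is a subset of the Cartesian product of the participating entity sets. The entity values other than Hayes and A-102 are invented for the example, and is_relationship_set is a hypothetical helper name.

```python
from itertools import product

# Entity sets (sample values; only Hayes and A-102 appear in the notes).
customer = {"Hayes", "Jones", "Smith"}
account = {"A-101", "A-102", "A-215"}

# A relationship set: each tuple is one relationship.
depositor = {("Hayes", "A-102"), ("Jones", "A-101"), ("Jones", "A-215")}


def is_relationship_set(rel, *entity_sets):
    """True if the i-th component of every tuple comes from the i-th entity set."""
    return rel <= set(product(*entity_sets))


print(("Hayes", "A-102") in depositor)                   # prints True
print(is_relationship_set(depositor, customer, account)) # prints True
```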
2.15
Relationship Set borrower
2.16
Relationship Sets (Cont.)
An attribute can also be property of a relationship set.
For instance, the depositor relationship set between entity sets
customer and account may have the attribute access-date
2.17
Degree of a Relationship Set
Refers to the number of entity sets that participate in a relationship
set.
Relationship sets that involve two entity sets are binary (or degree
two). Generally, most relationship sets in a database system are
binary.
Relationship sets may involve more than two entity sets.
E.g. Suppose employees of a bank may have jobs
(responsibilities) at multiple branches, with different
jobs at different branches. Then there is a ternary
relationship set between entity sets employee, job
and branch
Relationships between more than two entity sets are rare. Most
relationships are binary. (More on this later.)
2.18
E-R Diagram with a Ternary Relationship
2.19
Binary Vs. Non-Binary Relationships
Some relationships that appear to be non-binary may be better
represented using binary relationships
E.g. A ternary relationship parents, relating a child to his/her
father and mother, is best replaced by two binary relationships,
father and mother
Using two binary relationships allows partial information (e.g.
only the mother being known)
But there are some relationships that are naturally non-binary
E.g. works-on
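One way to see why a relationship such as works-on is naturally ternary: replacing it with its three binary projections can lose information. The sketch below builds two different sets of (employee, job, branch) triples that have identical binary projections. The identifiers e1, j1, b1 and so on are made up for the example.

```python
def binary_projections(ternary):
    """Project a set of (employee, job, branch) triples onto each pair of roles."""
    emp_job = {(e, j) for e, j, b in ternary}
    emp_branch = {(e, b) for e, j, b in ternary}
    job_branch = {(j, b) for e, j, b in ternary}
    return emp_job, emp_branch, job_branch


works_on = {("e1", "j1", "b1"), ("e1", "j2", "b2"), ("e2", "j1", "b2")}
# A different ternary relationship set with one extra triple.
other = works_on | {("e1", "j1", "b2")}

same = binary_projections(works_on) == binary_projections(other)
print(works_on != other, same)  # prints True True
```

Because both sets project to the same three binary relationship sets, the binary version cannot tell whether e1 does job j1 at branch b2.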
2.20
Roles
Entity sets of a relationship need not be distinct.
E.g. the works-for relationship set relates the employee entity set to itself.
The labels “manager” and “worker” are called roles; they specify
how employee entities interact via the works-for relationship set.
Roles are indicated in E-R diagrams by labeling the lines that
connect diamonds to rectangles.
Role labels are optional, and are used to clarify the semantics of the
relationship.
2.21
Mapping Cardinalities
Express the number of entities to which another entity can be
associated via a relationship set.
Most useful in describing binary relationship sets.
For a binary relationship set the mapping cardinality must be
one of the following types:
One to one
One to many
Many to one
Many to many
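The four types above can be told apart by counting how many partners each entity has. The function below is a rough sketch, and mapping_cardinality is a hypothetical name. It reports the tightest type consistent with one sample set of (a, b) pairs, read from A to B. A real cardinality constraint is declared on the schema and must hold for every legal instance, so this check is only illustrative.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship set of (a, b) pairs, read from A to B."""
    pairs = set(pairs)  # a relationship set holds each pair at most once
    a_counts = Counter(a for a, _ in pairs)
    b_counts = Counter(b for _, b in pairs)
    a_many = any(n > 1 for n in a_counts.values())  # some a has several b
    b_many = any(n > 1 for n in b_counts.values())  # some b has several a
    if a_many and b_many:
        return "many-to-many"
    if a_many:
        return "one-to-many"
    if b_many:
        return "many-to-one"
    return "one-to-one"


# Sample borrower instances as (customer, loan) pairs.
print(mapping_cardinality({("c1", "l1"), ("c1", "l2")}))  # prints one-to-many
print(mapping_cardinality({("c1", "l1"), ("c2", "l1")}))  # prints many-to-one
```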
2.22
Mapping Cardinalities
One to one One to many
Note: Some elements in A and B may not be mapped to any
elements in the other set
2.23
Mapping Cardinalities
Many to one Many to many
Note: Some elements in A and B may not be mapped to any
elements in the other set
2.24
Mapping Cardinality
We express cardinality constraints by drawing either a directed
line (→), signifying “one,” or an undirected line (—), signifying
“many,” between the relationship set and the entity set.
E.g.: One-to-one relationship:
A customer is associated with at most one loan via the
relationship borrower
A loan is associated with at most one customer via borrower
2.25
One-To-Many Relationship
In the one-to-many relationship a loan is associated with at most
one customer via borrower, a customer is associated with
several (including 0) loans via borrower
2.26
Many-To-One Relationships
In a many-to-one relationship a loan is associated with several
(including 0) customers via borrower, a customer is associated
with at most one loan via borrower
2.27
Many-To-Many Relationship
A customer is associated with several (possibly 0) loans
via borrower
A loan is associated with several (possibly 0) customers
via borrower
2.28
Mapping Cardinalities affect ER Design
Can make access-date an attribute of account, instead of a relationship
attribute, if each account can have only one customer
I.e., the relationship from account to customer is many to one, or
equivalently, customer to account is one to many
2.29
Relationship Sets with Attributes
2.30
Participation of an Entity Set in a
Relationship Set
Total participation (indicated by double line): every entity in the entity
set participates in at least one relationship in the relationship set
E.g. participation of loan in borrower is total:
every loan must have a customer associated with it via borrower
Partial participation: some entities may not participate in any
relationship in the relationship set
E.g. participation of customer in borrower is partial
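Total versus partial participation can be checked on a sample instance by comparing an entity set with the entities that actually appear in the relationship set. The helper name participation and the sample data are assumptions for this sketch.

```python
def participation(entity_set, relationship_set, position):
    """Return 'total' if every entity appears at the given tuple position."""
    participating = {t[position] for t in relationship_set}
    return "total" if entity_set <= participating else "partial"


customer = {"Hayes", "Jones", "Smith"}
loan = {"L-15", "L-23"}
# borrower as (customer, loan) pairs; Smith has no loan.
borrower = {("Hayes", "L-15"), ("Jones", "L-23")}

print(participation(loan, borrower, 1))      # prints total
print(participation(customer, borrower, 0))  # prints partial
```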
2.31
Keys
A super key of an entity set is a set of one or more attributes
whose values uniquely determine each entity.
A candidate key of an entity set is a minimal super key
Customer-id is candidate key of customer
account-number is candidate key of account
Although several candidate keys may exist, one of the
candidate keys is selected to be the primary key.
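The definitions of super key and candidate key can be exercised on a small sample of entities. This is a sketch under assumptions: the helper names and the sample rows are invented. A key is really a constraint chosen by the designer for all legal instances, so passing this check on one sample does not prove that an attribute set is a key.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Minimal super keys of the sample, found by trying smaller sets first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


attributes = ("customer-id", "customer-name", "customer-city")
customers = [
    {"customer-id": "C1", "customer-name": "Hayes", "customer-city": "Harrison"},
    {"customer-id": "C2", "customer-name": "Jones", "customer-city": "Harrison"},
    {"customer-id": "C3", "customer-name": "Jones", "customer-city": "Harrison"},
]
print(candidate_keys(customers, attributes))  # prints [('customer-id',)]
```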
2.32
Keys for Relationship Sets
The combination of primary keys of the participating entity sets
forms a super key of a relationship set.
(customer-id, account-number) is a super key of depositor
NOTE: This means a pair of entities can have at most one
relationship in a particular relationship set.
E.g. if we wish to track all access dates to each account by
each customer, we cannot assume a relationship for each
access. We can use a multivalued attribute though
Must consider the mapping cardinality of the relationship set
when deciding what the candidate keys are
Need to consider semantics of relationship set in selecting the
primary key in case of more than one candidate key
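The note above can be sketched as a Python dictionary keyed by the pair (customer-id, account-number). Each pair appears at most once, and repeated accesses go into a multivalued access-date attribute. The function name record_access and the sample values are assumptions.

```python
# depositor maps (customer-id, account-number) to a set of access dates.
depositor = {}


def record_access(customer_id, account_number, access_date):
    # The key pair identifies the relationship; a repeated access extends
    # the multivalued attribute instead of creating a second relationship.
    depositor.setdefault((customer_id, account_number), set()).add(access_date)


record_access("C1", "A-102", "2024-01-05")
record_access("C1", "A-102", "2024-02-10")
record_access("C2", "A-102", "2024-02-11")

print(len(depositor))                        # prints 2
print(sorted(depositor[("C1", "A-102")]))    # prints ['2024-01-05', '2024-02-10']
```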
2.33
Recursive Relationship Type is: SUPERVISION
(participation role names are shown)
2.34
Attribute of a Relationship Type is:
Hours of WORKS_ON
2.35
COMPANY ER Schema Diagram
using (min, max) notation
2.36
ER DIAGRAM FOR A BANK
DATABASE
© The Benjamin/Cummings Publishing Company, Inc. 1994, Elmasri/Navathe, Fundamentals of Database Systems, Second Edition
2.37
Extended ER Model
EER is a high-level data model that incorporates the extensions to the
original ER model. Enhanced ERDs are high-level models that
represent the requirements and complexities of complex databases.
In addition to ER model concepts, EE-R includes -
Subclasses and Superclasses.
Specialization and Generalization.
Category or union type.
Aggregation.
These concepts are used to create EE-R diagrams.
Subclasses and superclasses:
A superclass is an entity type that can be further subdivided into subtypes (subclasses).
2.38
E.g. Triangle, Square, and Circle are subclasses of the superclass Shape.
A subclass is a group of entities within the superclass that share certain
common characteristics.
The properties and characteristics of the superclass are passed down to
the subclass.
2.39
Attribute inheritance is the property by which subtype entities
inherit values of all attributes and instances of all relationships of
the supertype. This important property makes it unnecessary to
include supertype attributes or relationships redundantly with the
subtypes.
Example of Subtype and Super type Relationship
2.40
Specialization
Top-down design process; we designate subgroupings within an
entity set that are distinctive from other entities in the set.
These subgroupings become lower-level entity sets that have
attributes or participate in relationships that do not apply to the
higher-level entity set.
Depicted by a triangle component labeled IS A (E.g. customer “is
a” person).
Attribute inheritance – a lower-level entity set inherits all the
attributes and relationship participation of the higher-level entity
set to which it is linked.
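Attribute inheritance has a close analogue in class inheritance. The sketch below is an analogy rather than an E-R construct. The attribute names (name, street, city, credit_rating, salary) are assumptions for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    # Higher-level entity set: attributes shared by all persons.
    name: str
    street: str
    city: str


@dataclass
class Customer(Person):
    # Lower-level entity set: customer "is a" person, plus its own attribute.
    credit_rating: int


@dataclass
class Employee(Person):
    salary: int


c = Customer("Hayes", "Main", "Harrison", 7)
# Inherited attributes come first, followed by the subclass's own.
print([f.name for f in fields(Customer)])
# prints ['name', 'street', 'city', 'credit_rating']
```

The supertype attributes are declared once on Person and are not repeated in Customer or Employee.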
2.41
Specialization Example
2.42
Generalization
A bottom-up design process – combine a number of entity
sets that share the same features into a higher-level entity set.
Specialization and generalization are simple inversions of each
other; they are represented in an E-R diagram in the same way.
The terms specialization and generalization are used
interchangeably.
2.43
2.44
Specialization and Generalization
(Contd.)
Can have multiple specializations of an entity set based on
different features.
E.g. permanent-employee vs. temporary-employee, in
addition to officer vs. secretary vs. teller
Each particular employee would be
a member of one of permanent-employee or temporary-
employee,
and also a member of one of officer, secretary, or teller
The ISA relationship is also referred to as the superclass-subclass
relationship
2.45
Design Constraints on a
Specialization/Generalization
Constraint on which entities can be members of a given lower-level
entity set.
condition-defined
E.g. all customers over 65 years are members of the senior-
citizen entity set; senior-citizen IS A person.
user-defined
Constraint on whether or not entities may belong to more than one
lower-level entity set within a single generalization.
Disjoint
an entity can belong to only one lower-level entity set
Noted in E-R diagram by writing disjoint next to the ISA
triangle
Overlapping
an entity can belong to more than one lower-level entity set
2.46
Design Constraints on a
Specialization/Generalization (Contd.)
Completeness constraint -- specifies whether or not an entity in
the higher-level entity set must belong to at least one of the
lower-level entity sets within a generalization.
total : an entity must belong to one of the lower-level entity sets
partial: an entity need not belong to one of the lower-level
entity sets
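The disjointness and completeness constraints can be expressed as simple set checks on a sample instance. The helper names is_disjoint and is_total and the sample entities are assumptions for this sketch.

```python
from itertools import combinations


def is_disjoint(*subclasses):
    """Disjoint: no entity belongs to more than one lower-level entity set."""
    return all(a.isdisjoint(b) for a, b in combinations(subclasses, 2))


def is_total(superclass, *subclasses):
    """Total: every higher-level entity belongs to some lower-level entity set."""
    return superclass <= set().union(*subclasses)


employee = {"e1", "e2", "e3"}
officer, teller, secretary = {"e1"}, {"e2"}, {"e3"}
print(is_disjoint(officer, teller, secretary))          # prints True
print(is_total(employee, officer, teller, secretary))   # prints True

person = {"p1", "p2", "p3"}
customer, staff = {"p1", "p2"}, {"p2"}
print(is_disjoint(customer, staff))      # prints False (overlapping: p2)
print(is_total(person, customer, staff)) # prints False (partial: p3)
```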
2.47
Aggregation
Represents the connection between a whole entity and its parts.
Consider the ternary relationship works-on, which we saw earlier
Suppose we want to record managers for tasks performed by an
employee at a branch
2.48
Aggregation (Cont.)
Relationship sets works-on and manages represent overlapping
information
Every manages relationship corresponds to a works-on relationship
However, some works-on relationships may not correspond to any
manages relationships
So we can’t discard the works-on relationship
Eliminate this redundancy via aggregation
Treat relationship as an abstract entity
Allows relationships between relationships
Abstraction of relationship into new entity
Without introducing redundancy, the following diagram represents:
An employee works on a particular job at a particular branch
An employee, branch, job combination may have an associated
manager
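One possible relational rendering of this aggregation, using Python's built-in sqlite3 module: manages refers to a works-on combination as a whole through a composite foreign key. The table and column names and the sample rows are assumptions, and try_manage is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE works_on (
    emp_id TEXT, branch TEXT, job TEXT,
    PRIMARY KEY (emp_id, branch, job)
);
-- manages points at the aggregated works_on relationship as a unit.
CREATE TABLE manages (
    emp_id TEXT, branch TEXT, job TEXT,
    manager_id TEXT NOT NULL,
    PRIMARY KEY (emp_id, branch, job),
    FOREIGN KEY (emp_id, branch, job)
        REFERENCES works_on (emp_id, branch, job)
);
""")
conn.execute("INSERT INTO works_on VALUES ('e1', 'Downtown', 'teller')")
conn.execute("INSERT INTO works_on VALUES ('e2', 'Downtown', 'auditor')")
conn.execute("INSERT INTO manages VALUES ('e1', 'Downtown', 'teller', 'm7')")


def try_manage(emp_id, branch, job, manager_id):
    """Attempt to record a manager; fails if no such works_on row exists."""
    try:
        conn.execute("INSERT INTO manages VALUES (?, ?, ?, ?)",
                     (emp_id, branch, job, manager_id))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_manage("e9", "Uptown", "teller", "m7"))  # prints False
```

The second works_on row has no manager, which matches the statement that a combination may, but need not, have an associated manager.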
2.49
E-R Diagram With Aggregation
2.50
E-R Design Decisions
The use of an attribute or entity set to represent an object.
Whether a real-world concept is best expressed by an entity set
or a relationship set.
The use of a ternary relationship versus a pair of binary
relationships.
The use of specialization/generalization – contributes to
modularity in the design.
The use of aggregation – can treat the aggregate entity set as a
single unit without concern for the details of its internal structure.
2.51
E-R Diagram for a Banking Enterprise
2.52
Summary of Symbols Used in E-R
Notation
2.53
Summary of Symbols (Cont.)
2.54
Alternative E-R Notations
2.55