Chapter 3
Data Modelling Using the Entity-
Relationship(ER) Model
Outlines
Phases of database design
E/R data model
Elements of E/R data model
Case study
Phase of database design
• Database design is made up of three main phases. Namely,
conceptual design
Logical design
Physical design
• Conceptual database design: to build the conceptual
representation of the database, which includes identification
of the important entities, relationships, and attributes.
• Logical database design: to translate the conceptual
representation to logical structure of the database, which
includes designing the relations.
• Physical database design: to decide how the logical structure
is to be physically implemented (as base relations) in the
target database Management System(DBMS).
• The conceptual design stage develops a conceptual data
model that later on is transformed into the implementation
model. The conceptual data model mainly refers to the Entity
– Relationship (E/R) Data Model.
• the conceptual design also includes functional requirements
which describe the kinds of operations (or transactions) that
will be performed on the data.
Entity-Relationship (E/R) data model
The Entity-Relationship (E/R) data model was introduced by
Peter Chen in 1976-77 as a way to unify the network and
relational database views.
It is popular because it is very simple to create and read.
It is a diagrammatic representation of the logical structure of
the Database.
It views the real world as a set of objects (entities) and
relationships among these objects.
Elements of E/R model
• The three basic notions of the E/R model are:
Entity: represents existing real-world objects or concepts, such
as places, objects, events, persons, orders, customers, and so on.
Relationship: represents associations between objects, such as
the fact that a customer may place an order.
Attribute: describes the entity, such as the invoice date or the
customer first name.
Case study
• Consider the database system to be developed for “xyz”
software share company. The following are brief short listed
structures of the company.
The company runs various projects with a total of 68 full-time
employees and over 120 part-time employees.
A project has a unique id and name that may be designed for
new software development or for a release of a new version
of software that had been developed by the company.
The projects are having start date, due date, complete date
and status that describe their progress. Every project is lead
by a senior manager organized into teams of five to eight
programmers coordinated by a team leader.
The owners of the projects are the customers of the
company. A single customer can own one or more projects.
The customers have unique id, name and address.
The company is organized into departments that are
identified by a unique name and lead by department heads. A
department head can only lead a single department in his/her
employment by the company.
An employee can only belong to one department. Every
employee is identified by an Id, a name, an address, and a
position. In addition full-time employees have monthly salary
and allowance rate; and part-time employees have contract
period and hourly rate. Working schedule of both full-time
employees and part-time employees is maintained on weekly
bases.
Entity Sets
• Entities are usually recognizable concepts, either concrete or
abstract, such as person, places, things, or events which have
relevance to the database. An entity set is then a set
consisting of the same type of entities that share same
properties.
• Consider the case study; some specific examples of entities
are then:
EMPLOYEES, PROJECTS, CUSTOMERS …
• The candidate entities from the requirement statements are
the nouns and the adjective noun phrases.
• The “EMPLOYEES” entity set represents all the set of
employees and the “Projects” entity set represents all the set
of projects.
• Entities are classified as independent (Strong) or dependent
(Weak).
A strong (independent) entity is one that does not rely on
other entities for identification.
A weak (dependent) entity is one that relies on other entities
for identification. An individual occurrence of an entity set is
also known as an instance (object).
Attributes
• Attributes are descriptive properties that are associated with
an entity. A set of attributes describe an entity.
A particular instance of an attribute is called a value.
• For example, “Employee Id” and “Name” are the attributes of
the “EMPLOYEES” entity set; and “Kevin Jones” is one value of
the attribute “Name”.
• The domain of an attribute is the collection of all possible
values an attribute can have. The domain of “Name” is a
character string.
• Attributes can be classified as identifiers or descriptors.
Identifiers: more commonly called keys, uniquely identify an
instance of an entity.
Example: “Employee Id” uniquely identifies an employee entity
from the entity set.
Descriptor: describes a non-unique characteristic of an entity
instance.
Example: “Name” is a descriptor for the “EMPLOYEES” entity
set.
• Other way of categorizing Attributes is as Simple and
Composite attributes.
Simple Attributes: are attributes also known as Atomic
Attributes that can not be divided into subparts mainly of
primitive types.
Example: “Age” and “Gender” of the “EMPLOYEES” entity set.
Composite Attributes: are attributes that are composed of
smaller subparts that can be subdivided into the subparts
(Attributes).
Example: “Address” of the “EMPLOYEES” entity set that can be
divided into “City”,“Home Address”, “Phone”, and “P.O. Box”
Hierarchical Composite Attributes
• Another classification of attributes is based on the values that
they can hold as: Single-valued and Multi-valued attributes.
Single-valued Attributes: are attributes having only one
possible value at any time.
Example: “Name” and “Gender” of the “EMPLOYEES” entity set.
Multi-valued Attributes: are attributes that are having possibly
more than one value.
Example: “Address” of the “EMPLOYEES” entity set.
• Attributes can also be categorized Stored and Derived
attributes.
Derived Attributes: are attributes that can be calculated from
the related stored attributes, entities or general states.
Stored Attributes: on the other hand are attributes that can
not be calculated in any way from the stored attributes.
Example: “Birth Date” of the “EMPLOYEES” entity set is a stored
attribute, whereas “Age” is a derived attribute that can be
calculated from the “Birth Date” and “Current Date”.
Relationship Sets
• A Relationship represents an association between two or
more entities. An example of a relationship would be:
- “EMPLOYEES” are Assigned to “TEAMS”
- “CUSTOMERS” Owns “PROJECTS”
- “TEAMS” works on “PROJECTS”
• A Relationship Set is then a set consisting same types of
relationships. The entities involved in the relationship are
known as participating entities and the function the entity
plays in a relationship is called the entity’s role.
Example: In the Assigned relationship “EMPLOYEES” and
“TEAMS” entity sets are the participating entity sets; and the
“EMPLOYEES” entity has a role as a “Programmer” or “Team
Leader” in the relationship.
• Relationships are classified in terms of degree, connectivity,
cardinality, and existence.
Degree: The degree of a relationship is the number of
entities associated with the relationship. The n-ary (multi-way)
relationship is the general form for degree n. Special cases are
the binary, and ternary, where the degree is 2, and 3,
respectively
• Connectivity: The connectivity of a relationship describes
the mapping of associated entity instances in the
relationship. The values of connectivity are “one” or “many”.
• Cardinality:The cardinality of a relationship is the actual
number of related occurrences for each of the two entities.
The basic types of connectivity for relations are: one-to-one,
one-to-many, and many to-many.
• Existence: denotes whether the existence of an entity
instance is dependent upon the existence of another, related,
entity instance. The existence of an entity in a relationship is
defined as either mandatory or optional.
Design Principles
• E/R Diagram
The Entity Relationship (E/R) data model is a diagrammatical
data model. The elements of the E/R model are represented by:
- Rectangles - for the Entity sets,
- Ellipses - for the Attributes,
- Diamonds - for the Relationships, and
- Lines - for the links between the attributes and the entity sets
and between the entity sets and the relationships.
• - Double border Rectangles - for the weak entity sets.
- Double border Ellipses - for the multi-valued attributes.
- Dashed border Ellipses - for the derived attributes.
- Arrow Head Line - for the link between an entity set and a
one-to-one or many-to-one relationship. The arrow is headed
to the one side entity set.
• Example
- “EMPLOYEES” are Assigned to “TEAMS”
- “CUSTOMERS” Owns “PROJECTS”
- “TEAMS” works on “PROJECTS”
• Composite attributes are represented by linked ellipses as depicted in the
above figure with the attributes “Address” and “H Addrs”.
• Relationship Attributes: Attribute(s) may be used in some relationships
to describe the relationship further. Consider the relationship “WorksOn”
between the “TEAMS” and “PROJECTS” entity sets. The relationship can
be further described if an attribute “Task” is added to it as follows.
• Multi-way Relationship: Consider the three way
relationship between the “PROJECTS”, “TEAMS”, and
“SOFTWARE” entity sets.
Cardinality Limits of a Relationship: The credential limit of
a relationship is labeled as:
- 0..* or 0..∞ indicating zero or more participation of the entity
in the relationship.
- 1..* or 1..∞ indicating one or more participation of the entity
in the relationship.
- 0..1 indicating zero or one participation of the entity in the
relationship.
- 1..1 indicating exactly one participation of the entity in the
relationship.
• The multi-way (ternary) relationship shown in figure 5 above
can be reduced to a binary relationship with the use of an
entity set in place of the relationship and having three new
relationships for the links in between the participating entity
sets and the relationship.
• If the multi-way relationship set that is transformed into the
binary relationship had any attributes, these are assigned to
the entity set that replaces the relationship.
• Entity Set Roles in a Relationship: In some relationships a
single entity set may participate more than once in such case a
label is on the link line from the entity set is used to
differentiate the participation of the entity set.
• Total Participation in a Relationship: The participation of
the entity set in a relationship is said to be in total if every
element of the entity set is at least related to one element in
the other participating entities through the relationship;
otherwise the participation is said to partial. A total
participation in E/R model is represented by a double line
from the entity set to the relationship.
Consider the relationship between the “SOFTWARE” and
“ASSIGNMENT” entity sets above. Every Software is produced
in an Assignment, hence the relationship from the
“SOFTWARE” to
the “ASSIGNMENT” is total.
Design Issues
• The following are some useful principles to be followed in
designing databases.
1. Faithfulness - first and for most, the design should be
faithful to the specifications. That is classes or entity sets and
their attributes should reflect reality.
2. Avoiding Redundancy - be careful to say everything only
once.
3. Simplicity - avoid introducing more elements into your
design than are absolutely necessary.
• 4. Picking the Right kind of Element - Sometimes we have options
regarding the type of design element used to represent a real-world concept.
Use of Entity Set versus Attributes: Generally, of something has more information
associated with it than just its name, it probably needs to be an entity set.
However, if it has only its name to contribute to the design, then it is probably
better to make it an attribute.
Example: A “SOFTWARE” entity set may have a “Version” attribute, or “VERSION”
can be argued to be an entity set.
Entity versus Relationship Sets: Since relationships represent events there will
always be confusion between the entity sets and relations.
Binary versus n-ary Relationship Sets: Generally, of something has more
information associated with it than just its name, it probably needs to be an entity
set. However, if it has only its name to contribute to the design, then it is
probably better to make it an attribute.
Keys and Constraints
• Constraints in a database design are assertions that need to be
conformed to maintain the actual aspect of the real world system
under construction. Some of the commonly used classifications of
database constraints are:
[Link] – are attributes or set of attributes that can be used to
uniquely identify an entity within the entity set. A key in E/R model
is represented by underlining the attributes belonging to the key
[Link]-value constraints – are requirements that specify a
given attribute or set of attributes are unique in certain context.
3. Referential integrity constraints – are requirements for
an existence of an entity in the database so as it can be
referenced by a certain relationship. Referential integrity in E/R
model is represented by a rounded arrow head pointing to the
entity set required for existence.
4. Domain constraints – are requirements on an attribute
value to be in a specified range of values. No specific notation,
but side marks may be used to represent domain constraints.
5. Participation constraints – are assertions whether the
participation of an entity in a relationship is total or partial.
Keys
• As described above keys are attributes or set of attributes
that suffice to distinguish entities from each other.
A super key also know as super set is then a set of one or
more attributes that in group (collectively) can identify an
entity uniquely from the entity set.
Example: Consider the “EMPLOYEES” entity set, then
- “EmpId”, “EmpId, Name”, “NationalId”, “NationalId, BDate”, …
are super keys
- “Name”, “BDate” are NOT super keys
• The more interesting super set is the minimal super set that
is referred to as the candidate key.
The candidate key is the sufficient and the necessary set of
attributes to distinguish an entity set.
Example: In the “EMPLOYEES” entity set
- “EmpId”, “NationalId”, “Name, BDate” (assuming that there
is no coincidence that employees with the same name may
born on the same day) … are candidate keys
• The designer of the database is the one that makes the
choice of the candidate keys for implementation, but the
choice has to be made carefully. Primary key is a term used
to refer to the candidate key that is selected by the designer
for implementation.