ENTITY RELATIONSHIP MODELING
The entity-relationship model (or ER model) is a way of graphically representing the logical
relationships of entities (or objects) in order to create a database. The ER model was first
proposed by Peter Pin-Shan Chen of Massachusetts Institute of Technology (MIT) in the 1970s.
ERD is a data modeling technique that can help define business processes and can be used as the
foundation for a relational database.
In ER modeling, the structure for a database is portrayed as a diagram, called an entity-
relationship diagram (or ER diagram), that resembles the graphical breakdown of a sentence into
its grammatical parts.
Entities are rendered as points, polygons, circles, or ovals.
Relationships are portrayed as lines connecting the points, polygons, circles, or ovals. Any ER
diagram has an equivalent relational table, and any relational table has an equivalent ER
diagram. A relationship can be defined according to the number of entities associated with it,
known as the degree.
In a logical sense, entities are the equivalent of grammatical nouns, such as employees,
departments, products, or networks.
An entity can be defined by means of its properties, called attributes. Relationships are the
equivalent of verbs or associations, such as the act of purchasing, the act of repairing, being a
member of a group, or being a supervisor of a department.
E-R DIAGRAM
An entity-Relationship Diagram (ERD) is a graphical representation of an information system
that shows the relationship between people, objects, places, concepts or events within that
system.
Entity-Relationship (ER) model, a high-level data model that is useful in developing a
conceptual design for a database. Creation of an ER diagram, which is one of the first steps in
designing a database, helps the designer(s) to understand and to specify the desired components
of the database and the relationships among those components. An ER model is a diagram
containing entities or "items", relationships among them, and attributes of the entities and the
relationships.
While useful for organizing data that can be represented by a relational structure, an entity-
relationship diagram can't sufficiently represent semi-structured or unstructured data, and an
ERD is unlikely to be helpful on its own in integrating data into a pre-existing information
system.
1
Components of an ER Diagram
Three main components of an ERD are:
-The entities, which are objects or concepts that can have data stored about them
-The relationship between those entities and
-The cardinality, which defines that relationship in terms of numbers
For example, an ER diagram representing the information system for a company's sales
department might start with graphical representations of entities such as the sales representative,
the customer, the customer's address, the customer's order, the product and the warehouse. (See
diagram) Then lines or other symbols can be used to represent the relationship between entities,
and text can be used to label the relationships.
Finally, cardinality notations define the attributes of the relationship between the entities.
Cardinalities can denote that an entity is optional (for example, a sales rep could have no
customers or could have many) or mandatory (for example, the must be at least one product
listed in an order.)
2
The three main cardinal relationships are:
One-to-one (1:1). For example, if each customer in a database is associated with one
mailing address.
One-to-many (1: M). For example, a single customer might place an order for multiple
products. The customer is associated with multiple entities, but all those entities have a
single connection back to the same customer.
Many-to-many (M : N). For example, at a company where all call center agents work
with multiple customers, each agent is associated with multiple customers, and multiple
customers might also be associated with multiple agents.
While there are tools to help draw entity-relationship diagrams, such as CASE (computer-aided
software engineering) tools, some relational database management systems also have design
capabilities built in.
Optionality and Cardinality
Symbols at the ends of the relationship lines indicate the optionality and the cardinality of each
relationship. “Optionality” expresses whether the relationship is optional or mandatory.
“Cardinality” expresses the maximum number of relationships.
As a relationship line is followed from an entity to another, near the related entity two symbols
will appear. The first of those is the optionality indicator. A circle ( ) indicates that the
relationship is optional—the minimum number of relationships between each instance of the first
entity and instances of the related entity is zero. One can think of the circle as a zero, or a letter
O for “optional.” A stroke ( | ) indicates that the relationship is mandatory—the minimum
number of relationships between each instance of the first entity and instances of the related
entity is one.
The second symbol indicates cardinality. A stroke ( | ) indicates that the maximum number of
relationships is one. A “crows-foot” ( ) indicates that many such relationships between
instances of the related entities might exist.
The following diagram indicates all of the possible combinations:
3
In our model, we wish to indicate that each school may enroll many students, or may not enroll
any students at all. We also wish to indicate that each student attends exactly one school. The
following diagram indicates this optionality and cardinality:
It is important to note that relationship optionality and cardinality constraints apply specifically
to the system being modeled, not to all possible systems. According to the example modeled
above, a school might not enroll any students—that relationship is optional. A school without
students is not much of a school, and indeed if the system being modeled were a school system
enrollment database, the relationship would probably be mandatory. However, if the system
being modeled is an extracurricular honors program, there may be schools that have no students
currently participating in the program. Consider the function of the system and consult the other
documents in the data model to clarify modeling decisions.
4
ENTITY RELATIONSHIP DIAGRAM -EXAMPLES
Country Bus Company
Entities
Bus - Company owns busses and will hold information about them.
Route - Buses travel on routes and will need described.
Town - Buses pass through towns and need to know about them
Driver - Company employs drivers, personnel will hold their data.
Stage - Routes are made up of stages
Garage - Garage houses buses, and need to know where they are.
Relationships
A bus is allocated to a route and a route may have several buses.
Bus-route (m:1) is serviced by
A route comprises of one or more stages.
route-stage (1:m) comprises
One or more drivers are allocated to each stage.
driver-stage (m:1) is allocated
A stage passes through some or all of the towns on a route.
stage-town (m:n) passes-through
A route passes through some or all of the towns
route-town (m:n) passes-through
Some of the towns have a garage
garage-town (1:1) is situated
A garage keeps buses and each bus has one `home' garage
garage-bus (m:1) is garaged
Draw E-R Diagram
Figure : Bus Company
5
ATTRIBUTES
Bus (reg-no,make,size,deck,no-pass)
Route (route-no,avg-pass)
Driver (emp-no,name,address,tel-no)
Town (name)
Stage (stage-no)
Garage (name,address)
In the text, lines may be directed (have an arrow on the end) to signify mapping
cardinalities for relationship sets.
Figure 2.7: An E-R diagram
Figure 2.8: One-to-many from customer to account
Figure 2.9: Many-to-one from customer to account
6
Benefits of an ER- Diagram
1. Visual representation
The foremost and most important ERD benefit is that it provides a visual representation of the
design. It is normally crucial to have an ERD if you are looking to come up with an effective
database design. This is because the patterns assist the designer in focusing on the way the
database will primarily work with all the data flows and interactions. It is common to the ERD
being used together with data flow diagrams so as to attain a better visual representation.
2. Effective communication
7
An ERD clearly communicates the key entities in a certain database and their relationship with
each other. ERD normally uses symbols for representing three varying kinds of information.
Diamonds are used for representing the relationships, ovals are usually used for representing
attributes and boxes represent the entities. This allows a designer to effectively communicate
what exactly the database will be like
3. Simple to understand
ERD is easy to understand and simple to create. In effect, this design can be used to be shown to
the representatives for both approval and confirmation. The representatives can also make their
contributions to the design, allowing the possibilities of rectifying and enhancing the design.
4. High flexibility
The ERD model is quite flexible to use as other relationships can be derived easily from the
already existing ones. This can be done using other relational tables and mathematical formulae.
The ERD thereby acts like the blueprint for the database and it allows the creation of an accurate
design that reflects the needs of the project.
8
DATABASE SECURITY
Introduction
Database security refers to the collective measures used to protect and secure a database or
database management software from illegitimate use and malicious threats and attacks.
It is a broad term that includes a multitude of processes, tools and methodologies that ensure
security within a database environment.
Database security covers and enforces security on all aspects and components of databases. This
includes:
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator
and or other information security professional.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing existing system for any known or unknown vulnerabilities and defining and
implementing a road map/plan to mitigate them
9
Summary of security threats to a database
Figure 5.1 A summary of the potential threats to computer systems.
10
Database Security Concepts/Terms of providing security in a database
a. Authorization;
b. Views;
c. Backup and recovery;
d. Integrity;
e. Encryption;
f. Raid.
1. Authorization
Authorization is the granting of a right or privilege that enables a subject to have legitimate
access to a system or a system’s object. Authorization controls can be built into the software,
and govern not only what database system or object a specified user can access, but also what the
user may do with it. The process of authorization involves authentication of a subject requesting
access to an object, where ‘subject’ represents a user or program and ‘object’ represents a
database table, view, procedure, trigger, or any other object that can be created within the
database system.
2. Views
A view is a virtual table that does not necessarily exist in the database but can be produced upon
request by a particular user, at the time of request. The view mechanism provides a powerful and
flexible security mechanism by hiding parts of the database from certain users. The user is not
aware of the existence of any columns or rows that are missing from the view. A view can be
defined over several tables with a user being granted the appropriate privilege to use it, but not to
use the base tables. In this way, using a view is more restrictive than simply having certain
privileges granted to a user on the base table(s).
3. Backup and recovery
Backup is the process of periodically taking a copy of the database and log file (and possibly
programs) onto offline storage media. A DBMS should provide backup facilities to assist with
the recovery of a database following failure. To keep track of database transactions, the DBMS
maintains a special file called a log file (or journal) that contains information about all updates to
the database. It is always advisable to make backup copies of the database and log file at regular
intervals and to ensure that the copies are in a secure location. In the event of a failure that
renders the database unusable, the backup copy and the details captured in the log file are used to
restore the database to the latest possible consistent state. Journaling is the process of keeping
and maintaining a log file (or journal) of all changes made to the database to enable recovery to
be undertaken effectively in the event of a failure.
11
4. Integrity constraints
Contribute to maintaining a secure database system by preventing data from becoming invalid,
and hence giving misleading or incorrect results.
5. Encryption
Is the encoding of the data by a special algorithm that renders the data unreadable by any
program without the decryption key. If a database system holds particularly sensitive data, it may
be deemed necessary to encode it as a precaution against possible external threats or attempts to
access it. Some DBMSs provide an encryption facility for this purpose. The DBMS can access
the data (after decoding it), although there is degradation in performance because of the time
taken to decode it. Encryption also protects data transmitted over communication lines. There are
a number of techniques for encoding data to conceal the information; some are termed
irreversible and others reversible. Irreversible techniques, as the name implies, do not permit the
original data to be known. However, the data can be used to obtain valid statistical information.
Reversible techniques are more commonly used. To transmit data securely over insecure
networks requires the use of a cryptosystem, which includes:
• an encryption key to encrypt the data (plaintext);
• an encryption algorithm that, with the encryption key, transforms the plain text into
ciphertext;
• a decryption key to decrypt the ciphertext;
• a decryption algorithm that, with the decryption key, transforms the ciphertext back into
plain text.
6. Redundant Array of Independent Disks (RAID)
RAID works by having a large disk array comprising an arrangement of several independent
disks that are organized to improve reliability and at the same time increase performance. The
hardware that the DBMS is running on must be fault-tolerant, meaning that the DBMS should
continue to operate even if one of the hardware components fails. This suggests having
redundant components that can be seamlessly integrated into the working system whenever there
is one or more component failures.
The main hardware components that should be fault-tolerant include disk drives, disk controllers,
CPU, power supplies, and cooling fans. Disk drives are the most vulnerable components with the
shortest times between failures of any of the hardware components.
One solution is the use of Redundant Array of Independent Disks (RAID) technology. RAID
works by having a large disk array comprising an arrangement of several independent disks that
are organized to improve reliability and at the same time increase performance.
12
DATA INTEGRITY
Data integrity
A key strength of a relational database is its ability to force records to be consistent. This is
called the 'data integrity' of a database.
Common ways of doing this include:
1. Referential integrity.
Referential integrity forces the rule that a primary key cannot be duplicated in a table. It
also ensures that if the primary key one table is changed, then the foreign keys in the
other tables are also updated.
2. Data validation
Rules can be put in place that forces field data to be in a certain format. For example, a
date field may have to use dd/mmm/yyyy i.e. 02/Sep/2007 and not Sep 2nd 07 for
example. This is called data validation
13