0% found this document useful (0 votes)
15 views5 pages

Chapter 14 Database

Chapter 14 discusses normalization as a database design technique aimed at creating a suitable set of relations that minimize data redundancy and support enterprise data requirements. It highlights the importance of functional dependencies, the issues of data anomalies, and the process of normalization through various normal forms, including 1NF, 2NF, 3NF, and BCNF. The chapter emphasizes the benefits of normalization in reducing inconsistencies and storage costs while maintaining the integrity of the data.

Uploaded by

Saman
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
15 views5 pages

Chapter 14 Database

Chapter 14 discusses normalization as a database design technique aimed at creating a suitable set of relations that minimize data redundancy and support enterprise data requirements. It highlights the importance of functional dependencies, the issues of data anomalies, and the process of normalization through various normal forms, including 1NF, 2NF, 3NF, and BCNF. The chapter emphasizes the benefits of normalization in reducing inconsistencies and storage costs while maintaining the integrity of the data.

Uploaded by

Saman
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd

Chapter 14:- Normalization

When we design a database for an enterprise, the main objective is to create an accurate
representation of the data, relationships between the data, and constraints on the data that is
pertinent to the enterprise. To help achieve this objective, we can use one or more database design
techniques. In Chapters 12 and 13 we described a technique called ER modeling. In this chapter and
the next we describe another database design technique called normalization.

The Purpose of Normalization


A technique for producing a set of relations with desirable properties, given the data requirements
of an enterprise is called Normalization.

The purpose of normalization is to identify a suitable set of relations that support the data
requirements of an enterprise.

The characteristics of a suitable set of relations include the following:

 the minimal number of attributes necessary to support the data requirements of the
enterprise
 attributes with a close logical relationship (described as functional dependency) are found in
the same relation
 minimal redundancy, with each attribute represented only once, with the important
exception of attributes that form all or part of foreign keys, which are essential for the
joining of related relations.

Data Redundancy and Data Anomalies


a major aim of relational database design is to group attributes into relations to minimize data
redundancy. If this aim is achieved, the potential benefits for the implemented database include the
following:

 updates to the data stored in the database are achieved with a minimal number of
operations, thus reducing the opportunities for data inconsistencies occurring in the
database;
 reduction in the file storage space required by the base relations thus minimizing costs.

Of course, relational databases also rely on the existence of a certain amount of data redundancy.
This redundancy is in the form of copies of primary keys (or candidate keys) acting as foreign keys in
related relations to enable the modeling of relationships between data. In this section we illustrate
the problems associated with unwanted data redundancy by comparing the Staff and Branch
relations shown in Figure 14.2 with the StaffBranch relation shown in Figure 14.3. The StaffBranch
relation is an alternative format of the Staff and Branch relations. The relations have the following
form:

Staff (staffNo, sName, position, salary, branchNo)

Branch (branchNo, bAddress)

StaffBranch (staffNo, sName, position, salary, branchNo, bAddress)


Note that the primary key for each relation is underlined.

In the StaffBranch relation there is redundant data; the details of a branch are repeated for every
member of staff located at that branch. In contrast, the branch details appear only once for each
branch in the Branch relation, and only the branch number (branchNo) is repeated in the Staff
relation to represent where each member of staff is located. Relations that have redundant data
may have problems called update anomalies, which are classified as insertion, deletion, or
modification anomalies.

Insertion Anomalies:- There are two main types of insertion anomaly, which we illustrate using
the StaffBranch relation shown in Figure 14.3:

 To insert the details of new members of staff into the StaffBranch relation, we must include
the details of the branch at which the staff are to be located. For example, to insert the
details of new staff located at branch number B007, we must enter the correct details of
branch number B007 so that the branch details are consistent with values for branch B007 in
other tuples of the StaffBranch relation. The relations shown in Figure 14.2 do not suffer
from this potential inconsistency, because we enter only the appropriate branch number for
each staff member in the Staff relation. Instead, the details of branch number B007 are
recorded in the database as a single tuple in the Branch relation.
 To insert details of a new branch that currently has no members of staff into the StaffBranch
relation, it is necessary to enter nulls into the attributes for staff, such as staffNo. However,
as staffNo is the primary key for the StaffBranch relation, attempting to enter nulls for
staffNo violates entity integrity (see Section 4.3), and is not allowed. We therefore cannot
enter a tuple for a new branch into the StaffBranch relation with a null for the staffNo. The
design of the relations shown in Figure 14.2 avoids this problem, because branch details are
entered in the Branch relation separately from the staff details. The details of staff ultimately
located at that branch are entered at a later date into the Staff relation.

Deletion Anomalies:- If we delete a tuple from the StaffBranch relation that represents the last
member of staff located at a branch, the details about that branch are also lost from the database.
For example, if we delete the tuple for staff number SA9 (Mary Howe) from the StaffBranch relation,
the details relating to branch number B007 are lost from the database. The design of the relations in
Figure 14.2 avoids this problem, because branch tuples are stored separately from staff tuples and
only the attribute branchNo relates the two relations. If we delete the tuple for staff number SA9
from the Staff relation, the details on branch number B007 remain unaffected in the Branch relation.

Modification Anomalies:- If we want to change the value of one of the attributes of a particular
branch in the StaffBranch relation—for example, the address for branch number B003—we must
update the tuples of all staff located at that branch. If this modification is not carried out on all the
appropriate tuples of the StaffBranch relation, the database will become inconsistent. In this
example, branch number B003 may appear to have different addresses in different staff tuples. The
previous examples illustrate that the Staff and Branch relations of Figure 14.2 have more desirable
properties than the StaffBranch relation of Figure 14.3. This demonstrates that although the
StaffBranch relation is subject to update anomalies, we can avoid these anomalies by decomposing
the original relation into the Staff and Branch relations. There are two important properties
associated with decomposition of a larger relation into smaller relations:

 The lossless-join property ensures that any instance of the original relation can be identified
from corresponding instances in the smaller relations.
 The dependency preservation property ensures that a constraint on the original relation can
be maintained by simply enforcing some constraint on each of the smaller relations. In other
words, we do not need to perform joins on the smaller relations to check whether a
constraint on the original relation is violated.

Functional Dependency:- Describes the relationship between attributes in a relation. For example,

if A and B are attributes of relation R, B is functionally dependent on A (denoted A ® B), if each value
of A is associated with exactly one value of B. (A and B may each consist of one or more attributes.)

Determinant:- Refers to the attribute, or group of attributes, on the left-hand side of the arrow of a
functional dependency.

An Example of a functional Dependency:- Consider the attributes staffNo and position of the
Staff relation in Figure 14.2. For a specific staffNo—for example, SL21—we can determine the
position of that member of staff as Manager. In other words, staffNo functionally determines
position, as shown in Figure 14.5(a). However, Figure 14.5(b) illustrates that the opposite is not true,
as position does not functionally determine staffNo. A member of staff holds one position; however,
there may be several members of staff with the same position.

The relationship between staffNo and position is one-to-one (1:1): for each staff number there is
only one position. On the other hand, the relationship between position and staffNo is one-to-many
(1:*): there are several staff numbers associated with a given position. In this example, staffNo is the
determinant of this functional dependency. For the purposes of normalization, we are interested in
identifying functional dependencies between attributes of a relation that have a one-to-one
relationship between the attribute(s) that makes up the determinant on the left-hand side and the
attribute(s) on the right-hand side of a dependency.

When identifying functional dependencies between attributes in a relation, it is important to


distinguish clearly between the values held by an attribute at a given point in time and the set of all
possible values that an attribute may hold at different times. In other words, a functional
dependency is a property of a relational schema (intension) and not a property of a particular
instance of the schema. This point is illustrated in the following example.

Full Functional Dependency:- Indicates that if A and B are attributes of a relation, B is fully
functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.

A functional dependency A ® B is a full functional dependency if removal of any attribute from A


results in the dependency no longer existing. A functional dependency A ® B is a partial dependency
if there is some attribute that can be removed from A and yet the dependency still holds.

Transitive Dependency:- A condition where A, B, and C are attributes of a relation such that if A ®
B and B ® C, then C is transitively dependent on A via B (provided that A is not functionally
dependent on B or C).

The Process of Normalization


Normalization is a formal technique for analyzing relations based on their primary key (or candidate
keys) and functional dependencies (Codd, 1972b). The technique involves a series of rules that can
be used to test individual relations so that a database can be normalized to any degree. When a
requirement is not met, the relation violating the requirement must be decomposed into relations
that individually meet the requirements of normalization.

Three normal forms were initially proposed called First Normal Form (1NF), Second Normal Form
(2NF), and Third Normal Form (3NF). Subsequently, R. Boyce and E. F. Codd introduced a stronger
definition of third normal form called Boyce–Codd Normal Form (BCNF) (Codd, 1974). With the
exception of 1NF, all these normal forms are based on functional dependencies among the attributes
of a relation (Maier, 1983). Higher normal forms that go beyond BCNF were introduced later such as
Fourth Normal Form (4NF) and Fifth Normal Form (5NF) (Fagin, 1977, 1979). However, these later
normal forms deal with situations that are very rare.

You might also like