0% found this document useful (0 votes)

3 views10 pages

Normalization NotesByRK

The document outlines informal design guidelines for relation schemas, emphasizing the importance of representing single entities in tuples, avoiding anomalies, and minimizing NULL values. It discusses functional dependencies, inference rules, and the process of normalization through various normal forms (1NF to 5NF) to ensure data integrity and reduce redundancy. Additionally, it covers concepts such as keys, prime attributes, and the significance of lossless joins and dependency preservation during normalization.

Uploaded by

davidboon5859

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

3 views10 pages

Normalization NotesByRK

Uploaded by

davidboon5859

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

NORMALIZATION

Informal Design Guidelines for Relation Schemas

Informal guidelines may be used as measures to determine the quality of relation schema design.
GUIDELINE 1: Informally, each tuple in a relation should represent one entity or relationship instance.
(Applies to individual relations and their attributes).
 Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be
mixed in the same relation
 Only foreign keys should be used to refer to other entities
 Entity and relationship attributes should be kept apart as much as possible.
Bottom Line: Design a schema that can be explained easily relation by relation. The semantics of
attributes should be easy to interpret.
GUIDELINE 2: Design a schema that does not suffer from the insertion, deletion and update anomalies.
If there are any present, then note them so that applications can be made to take them into account
 Mixing attributes of multiple entities may cause problems
 Information is stored redundantly wasting storage
 Problems with update anomalies
– Insertion anomalies
– Deletion anomalies
– Modification anomalies
GUIDELINE 3: Relations should be designed such that their tuples will have as few NULL values as
possible
 Attributes that are NULL frequently could be placed in separate relations (with the primary key)
 Reasons for nulls:
– attribute not applicable or invalid
– attribute value unknown (may exist)
– value known to exist, but unavailable
GUIDELINE 4: The relations should be designed to satisfy the lossless join condition. No spurious
tuples should be generated by doing a natural-join of any relations.
Bad designs for a relational database may result in erroneous results for certain JOIN operations. The
"lossless join" property is used to guarantee meaningful results for join operations
Update Anomalies with examples
Mixing attributes of multiple entities may cause several problems. As Information is stored redundantly
wasting storage, it can also lead to several update anomalies.
Consider the relation:
EMP_PROJ ( Emp#, Proj#, Ename, Pname, No_hours)
 Update Anomaly: Changing the name of project number P1 from “Billing” to “Customer-
Accounting” may cause this update to be made for all 100 employees working on project P1.
 Insert Anomaly: Cannot insert a project unless an employee is assigned to. Inversely - Cannot
insert an employee unless an he/she is assigned to a project.
 Delete Anomaly: When a project is deleted, it will result in deleting all the employees who work
on that project. Alternately, if an employee is the sole employee on a project, deleting that
employee would result in deleting the corresponding project.

Functional Dependency (FDs)

FDs are constraints that are derived from the meaning and interrelationships of the data attributes.
A Functional Dependency indicated by XY where X is a set of attributes and Y is a set of Attributes
holds if whenever two tuples have the same value for X, they must have the same value for Y. i.e. For
any two tuples t1 and t2 in any relation instance r(R): If t1[X]=t2[X], then t1[Y]=t2[Y]
 Examples: social security number determines employee name
SSN -> ENAME
 project number determines project name and location
PNUMBER -> {PNAME, PLOCATION}
 employee ssn and project number determines the hours per week that the employee works on the
project
{SSN, PNUMBER} -> HOURS

Inference Rules for FDs

Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold.
Armstrongs Inference Rules:
Armstrong's inference rules:
IR1. (Reflexive) If Y subset-of X, then X -> Y
IR2. (Augmentation) If X -> Y, then XZ -> YZ
(Notation: XZ stands for X U Z)
IR3. (Transitive) If X -> Y and Y -> Z, then X -> Z
additional inference rules:
(Decomposition) If X -> YZ, then X -> Y and X -> Z
(Union) If X -> Y and X -> Z, then X -> YZ
(Psuedotransitivity) If X -> Y and WY -> Z, then WX -> Z

Closure (2 Definitions)
 Closure of a set F of FDs is the set F+ of all FDs that can be inferred from F
 Closure of a set of attributes X with respect to F is the set X + of all attributes that are
functionally determined by X

Equivalence of Sets of FDs

 Two sets of FDs F and G are equivalent if:
- every FD in F can be inferred from G, and
- every FD in G can be inferred from F
 Hence, F and G are equivalent if F + =G +
Definition: F covers G if every FD in G can be inferred from F (i.e., if G + subset-of F +)
 F and G are equivalent if F covers G and G covers F

Minimal Sets of FDs

A set of FDs is minimal if it satisfies the following conditions:
(1) Every dependency in F has a single attribute for its RHS.
(2) We cannot remove any dependency from F and have a set of dependencies that is equivalent to F.
(3) We cannot replace any dependency X -> A in F with a dependency Y -> A, where Y proper-
subset-of X ( Y subset-of X) and still have a set of dependencies that is equivalent to F.
NORMAL FORMS AND NORMALIZATION
Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes
into smaller relations
 Normal form: Condition using keys and FDs of a relation to certify whether a relation schema is
in a particular normal form
Denormalization: the process of storing the join of higher normal form relations as a base relation—
which is in a lower normal form

Properties to be satisfied during normalization process (Goals of Normalization)

These include two properties:
■ The nonadditive join or lossless join property, which guarantees that the spurious tuple generation
problem does not occur with respect to the relation schemas created after decomposition
■ The dependency preservation property, which ensures that each functional dependency is represented
in some individual relation resulting after decomposition.
The nonadditive join property is extremely critical and must be achieved at any cost, whereas the
dependency preservation property, although desirable, is sometimes sacrificed
Concepts of KEYS
 A superkey of a relation schema R = {A1, A2, ...., An} is a set of attributes S subset-of R with the
property that no two tuples t1 and t2 in any legal relation state r of R will have t1[S] = t2[S]
 A key K is a superkey with the additional property that removal of any attribute from K will
cause K not to be a superkey any more. In other words a key is a minimal superkey.
 Candidate keys and Primary Key: If a relation schema has more than one key, each is called a
candidate key One of the candidate keys is arbitrarily designated to be the primary key, and the
others are called secondary keys.

Prime and Non-Prime Attribute

 A Prime attribute must be a member of some candidate key
 A Non-prime attribute is not a prime attribute—that is, it is not a member of any candidate key

First Normal Form (1NF)

Definition: A Relation is in 1NF if every attribute of the relation is simple and atomic.
1NF Disallows composite attributes, multivalued attributes, and nested relations
Example:

In the above relation Dlocations is a multivalued attribute which can hold multiple values and
hence not in 1NF. To convert the relation into 1NF, the attribute values has to be made simple and
atomic. This can be done in 3 ways:
1st: Represent each values of the DLocations individually as shown:

The problem with this approach is however there is redundant data and it can lead to anomalies.
2nd: For each location of the department, create new attributes to represent the same.

DNAME DNUMBER DMGRSSN DLOCATION1 DLOCATION2 DLOCATION3

Research 5 333445555 Bellaire Sugarland Houston
Administration 4 987654321 Stafford - -
Headquarters 1 888665555 Houston - -

The problem with this approach is other tuples will have null values for Dlocation2, Dlocation3.

3rd: Remove the attribute Dlocations that violates 1NF and place it in a separate relation
DEPT_LOCATIONS along with the primary key Dnumber of DEPARTMENT.

For nested relations:

First normal form also disallows multivalued attributes that are themselves composite. These are called
nested relations because each tuple can have a relation within it. For example:

Has multiple-level nesting which must be unnest to make the relation into a set of 1NF relations as
follows:

Second Normal Form (2NF)

Uses the concepts of FDs, primary key
 Prime attribute - attribute that is member of the primary key K
 Full functional dependency - a FD Y -> Z where removal of any attribute from Y means the FD
does not hold any more
Examples: - {SSN, PNUMBER} -> HOURS is a full FD since neither SSN -> HOURS nor
PNUMBER -> HOURS hold
- {SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial dependency ) since SSN -
> ENAME also holds

Definition: A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is
fully functionally dependent on the primary key.
Example: consider the following relation shown below. Here the non-prime attribute Hours is Fully
functionally dependent on the PK of the relation where as other non-prime attributes are partially
dependent. Hence the relation is not in 2NF. The relation can be broken down into EP1, EP2 and EP3 to
ensure they satisfy 2NF.
Third Normal Form (3NF)
3NF is based on transitive dependency.
Definition:
 Transitive functional dependency - a FD X -> Z that can be derived from two FDs X -> Y
and Y -> Z
Examples:
- SSN -> DMGRSSN is a transitive FD since
SSN -> DNUMBER and DNUMBER -> DMGRSSN hold
- SSN -> ENAME is non-transitive since there is no set of attributes X where SSN -> X and X ->
ENAME

Definition: A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime
attribute A in R is transitively dependent on the primary key

Alternate Definition: A relation schema R is in third normal form (3NF) if whenever a FD X -> A holds
in R, then either:
(a) X is a superkey of R, or
(b) A is a prime attribute of R

For example consider the following relation EMP_DEPT. The relation schema EMP_DEPT in Figure is
in 2NF, since no partial dependencies on a key exist. However, EMP_DEPT is not in 3NF because of the
transitive dependency of Dmgr_ssn (and also Dname) on Ssn via Dnumber. We can normalize
EMP_DEPT by decomposing it into the two 3NF relation schemas ED1 and ED2 shown
Boyce Codd Normal Form (BCNF)
BCNF is a Strict version of 3NF
Definition: A relation schema R is in Boyce-Codd Normal Form (BCNF) if whenever an FD X -> A
holds in R, then X is a superkey of R.

Consider the following example: R is in 3NF but not in BCNF as C is not a superkey. To convert the
relation into BCNF, decompose the relations such that X is always a superkey. This can be done as shown
below.

R1(A, C) and R2(A, B)

R1 (B, C) and R2(B, A)
R1 (C, B) and R2(C, A)

Fourth Normal Form (4NF)

4NF is based on Multi-Valued Dependency (MVDs)
Definition:
A multivalued dependency (MVD) X —>> Y specified on relation schema R, where X and Y are both
subsets of R, specifies the following constraint on any relation state r of R: If two tuples t1 and t2 exist in r
such that t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with the following properties, where
we use Z to denote (R 2 (X υ Y)):
· t3[X] = t4[X] = t1[X] = t2[X].
· t3[Y] = t1[Y] and t4[Y] = t2[Y].
· t3[Z] = t2[Z] and t4[Z] = t1[Z].

Definition:
A relation schema R is in 4NF with respect to a set of dependencies F (that includes functional
dependencies and multivalued dependencies) if, for every nontrivial multivalued dependency X —>> Y in
F+, X is a superkey for R.
An all key relation such as EMP does not have any FD’s but has MVD’s. A relation not in 4NF can be
decomposed to remove the redundancy caused by the MVD as shown below.

Fifth Normal Form (5NF)

5NF is based on Join Dependancy
Definition:
A join dependency (JD), denoted by JD(R1, R2, ..., Rn), specified on relation schema R, specifies a
constraint on the states r of R. The constraint states that every legal state r of R should have a non-
additive join decomposition into R1, R2, ..., Rn; that is, for every such r we have
* (pR1(r), pR2(r), ..., pRn(r)) = r

Definition:
 A relation schema R is in fifth normal form (5NF) (or Project-Join Normal Form (PJNF))
with respect to a set F of functional, multivalued, and join dependencies if, for every nontrivial
join dependency JD(R1, R2, ..., Rn) in F+ (that is, implied by F), every Ri is a superkey of R.
The relation SUPPLY is in 4NF because of no
MVD’s however it is no in 5NF, this can be
broken down into R1, R2, R3 to achieve 5NF as
follows

Algorithms:
For finding closure:

For finding key:

Informal Guidelines for Database Normalization
No ratings yet
Informal Guidelines for Database Normalization
53 pages
Functional Dependencies & Normalization Guide
No ratings yet
Functional Dependencies & Normalization Guide
52 pages
Attributes and Relations in DBMS
No ratings yet
Attributes and Relations in DBMS
36 pages
Lecture 6 Normalization
No ratings yet
Lecture 6 Normalization
9 pages
Informal Guidelines for Database Design
No ratings yet
Informal Guidelines for Database Design
34 pages
Functional Dependencies & Normalization Guide
No ratings yet
Functional Dependencies & Normalization Guide
35 pages
Functional Dependencies in Database Normalization
No ratings yet
Functional Dependencies in Database Normalization
26 pages
DBMS Normalization Techniques Explained
No ratings yet
DBMS Normalization Techniques Explained
38 pages
Functional Dependencies & Normalization Guide
No ratings yet
Functional Dependencies & Normalization Guide
36 pages
Database Normalization and SQL Basics
No ratings yet
Database Normalization and SQL Basics
10 pages
Disallowing Spurious Tuples in DBMS
No ratings yet
Disallowing Spurious Tuples in DBMS
12 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
49 pages
CS307 - Lecture 5
No ratings yet
CS307 - Lecture 5
44 pages
Functional Dependencies & Database Normalization
No ratings yet
Functional Dependencies & Database Normalization
35 pages
Understanding Database Normalization Techniques
No ratings yet
Understanding Database Normalization Techniques
44 pages
Functional Dependency & Normalization Guide
No ratings yet
Functional Dependency & Normalization Guide
48 pages
Database Normalization Guidelines
No ratings yet
Database Normalization Guidelines
14 pages
DBMS Design Guidelines for KTU Students
No ratings yet
DBMS Design Guidelines for KTU Students
4 pages
Understanding Functional Dependencies and Normal Forms
No ratings yet
Understanding Functional Dependencies and Normal Forms
45 pages
SQL Functional Dependencies & Normalization
No ratings yet
SQL Functional Dependencies & Normalization
151 pages
Relational Database Design Guidelines
No ratings yet
Relational Database Design Guidelines
4 pages
Functional Dependencies & Database Normalization
No ratings yet
Functional Dependencies & Database Normalization
35 pages
Database Design Theory: Normalization & Keys
No ratings yet
Database Design Theory: Normalization & Keys
13 pages
Functional Dependencies in DB Normalization
No ratings yet
Functional Dependencies in DB Normalization
55 pages
Unit-4 Notes
No ratings yet
Unit-4 Notes
19 pages
Functional Dependencies & Database Normalization
No ratings yet
Functional Dependencies & Database Normalization
44 pages
Database Concepts: Lecture 10: Functional Dependencies and Normalization
No ratings yet
Database Concepts: Lecture 10: Functional Dependencies and Normalization
22 pages
Module 4 Dbms Student
No ratings yet
Module 4 Dbms Student
51 pages
Functional Dependencies and Normalization
No ratings yet
Functional Dependencies and Normalization
60 pages
Database Normalization and FDs Guide
No ratings yet
Database Normalization and FDs Guide
48 pages
Functional Dependencies & Normalization Guide
No ratings yet
Functional Dependencies & Normalization Guide
59 pages
Functional Dependencies & Database Normalization
No ratings yet
Functional Dependencies & Database Normalization
84 pages
Normalization and Indexing in Databases
No ratings yet
Normalization and Indexing in Databases
16 pages
Functional Dependencies and Normalization
No ratings yet
Functional Dependencies and Normalization
55 pages
Functional Dependencies and Normal Forms
No ratings yet
Functional Dependencies and Normal Forms
34 pages
Database Schema Design and Refinement
No ratings yet
Database Schema Design and Refinement
95 pages
Relational Database Design Principles
No ratings yet
Relational Database Design Principles
52 pages
Relational Algebra and Normalization Guide
No ratings yet
Relational Algebra and Normalization Guide
35 pages
Functional Dependencies and Normalization
No ratings yet
Functional Dependencies and Normalization
65 pages
Understanding 2nd Normal Form in Databases
No ratings yet
Understanding 2nd Normal Form in Databases
25 pages
Functional Dependencies & Database Normalization
No ratings yet
Functional Dependencies & Database Normalization
29 pages
Functional Dependencies & Normalization Guide
No ratings yet
Functional Dependencies & Normalization Guide
95 pages
Database Normalization Guidelines
No ratings yet
Database Normalization Guidelines
36 pages
Database Normalization and Dependencies
No ratings yet
Database Normalization and Dependencies
38 pages
Understanding the Relational Model in DBMS
No ratings yet
Understanding the Relational Model in DBMS
32 pages
Database Normalization Techniques
No ratings yet
Database Normalization Techniques
51 pages
Understanding Functional Dependencies
No ratings yet
Understanding Functional Dependencies
51 pages
Normalization and Spurious Tuples in DBMS
No ratings yet
Normalization and Spurious Tuples in DBMS
22 pages
Functional Dependencies and Normalization Guide
No ratings yet
Functional Dependencies and Normalization Guide
36 pages
Functional Dependencies & Normalization Guide
No ratings yet
Functional Dependencies & Normalization Guide
48 pages
Understanding Functional Dependencies and Normalization
No ratings yet
Understanding Functional Dependencies and Normalization
61 pages
Functional Dependencies in DB Normalization
No ratings yet
Functional Dependencies in DB Normalization
73 pages
09 - Functional Dependencies Normalization
No ratings yet
09 - Functional Dependencies Normalization
60 pages
Relational Database Design and Normalization
No ratings yet
Relational Database Design and Normalization
4 pages
Candidate Key and Functional Dependency
No ratings yet
Candidate Key and Functional Dependency
42 pages
Relational Database Normalization Guide
No ratings yet
Relational Database Normalization Guide
38 pages
Database Normalization Techniques
No ratings yet
Database Normalization Techniques
27 pages
Understanding Database Normalization
No ratings yet
Understanding Database Normalization
73 pages

Normalization NotesByRK

Uploaded by

Normalization NotesByRK

Uploaded by

NORMALIZATION

Informal Design Guidelines for Relation Schemas

Functional Dependency (FDs)

Inference Rules for FDs

Equivalence of Sets of FDs

Minimal Sets of FDs

Properties to be satisfied during normalization process (Goals of Normalization)

Prime and Non-Prime Attribute

First Normal Form (1NF)

DNAME DNUMBER DMGRSSN DLOCATION1 DLOCATION2 DLOCATION3

For nested relations:

Second Normal Form (2NF)

R1(A, C) and R2(A, B)

Fourth Normal Form (4NF)

Fifth Normal Form (5NF)

For finding key:

You might also like