0% found this document useful (0 votes)
6 views11 pages

A3 Task

Normalization is a process used in database design to minimize data redundancy and avoid anomalies by structuring data into related tables. It consists of three stages: 1st Normal Form (1NF), 2nd Normal Form (2NF), and 3rd Normal Form (3NF), each progressively refining the data structure. The process also addresses issues of referential integrity, ensuring that linked records are properly managed during updates and deletions.

Uploaded by

2510047
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views11 pages

A3 Task

Normalization is a process used in database design to minimize data redundancy and avoid anomalies by structuring data into related tables. It consists of three stages: 1st Normal Form (1NF), 2nd Normal Form (2NF), and 3rd Normal Form (3NF), each progressively refining the data structure. The process also addresses issues of referential integrity, ensuring that linked records are properly managed during updates and deletions.

Uploaded by

2510047
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Normalisation

When considering a real-life situation that you want to model using a relational database, it
can sometimes be difficult to see how to divide up the data into different related tables in the
most efficient way. The process of normalisation has been developed to help database
designers create designs that avoid data redundancy (duplicated data), which, in turn,
minimises storage requirements and helps to avoid the problems that occur with duplicated
data. For example, if a customer’s postcode is held on a database in two different tables, then
should the customer move house and change their postcode, it may not be updated in more
than one table, which may cause anomalies.
The process of normalisation is an important part of the database design process. The outcome
of the process of normalisation will be a verified database design which identifies the main
structure of the database, including:
▸ the tables that the database will be split into
▸ the indexes used, including the primary and foreign keys and any composite keys required
for the tables
▸ the structure of the individual tables from which the data dictionary can be created.
Key term
Verification – Verification is the process of checking that something is correct. Normalisation
is the process used to verify the database design.
Following on from the normalisation process, database designers need to consider the issue
of referential integrity and how the updating and deletion of records on related tables is dealt
with.
The stages of normalisation
Normalisation is a three-stage process which begins with raw data, such as you might extract
from a paper-based system that the database you are designing will replace.
Summary of the stages of normalization
1. The process starts off with raw data, which is referred to as un-normalised (UNF).
2. The first stage takes raw data and modifies its structure to the meet the requirements
of 1st normal form (1NF).
3. The second stage further modifies the 1NF structure to meet the requirements of 2nd
normal form (2NF).
4. The third stage takes the 2NF data structure and modifies it to meet the requirements
of 3rd normal form (3NF).
To demonstrate the process we will return to the example of the online ordering system. Figure
2.6 gives an idea of the sort of paper-based order form the database will emulate. This is our
un-normalised data (UNF). Note that the order number and the customer details appear just
once but the product details repeat for each item ordered.

Figure 2.6: Paper-based order form


Stage 1: 1st Normal form
The first stage of normalisation involves modifying the data to meet the requirements of 1st
normal form (1NF) by doing the following.
▸ Remove any calculated fields (such as total order value).
▸ Make sure that each data item is atomic. The customer name is not atomic since it can be
split into ‘First name’ and ‘Surname’. The same is true of the customer address but, for
simplicity, we will ignore this attribute.
▸ Remove repeating groups so that there is a data item for every record for each attribute. So
for example, in the paper-based order form, the order number and customer number occur
just once but we must include them in each record on our new 1NF table.
▸ Finally, we must select a primary key (which needs to be unique). There is no one value
which is unique but we can create a composite key using the Order Number and the Product
ID.
Key term
Atomic – in this context, atomic means broken down into individual parts. For example, in
terms of its suitability for a database field a person’s name is not atomic because it can be
broken down into title, first name, middle names and surname.
The completed 1NF table is shown here (Table 2.10). More orders have been added to make
the process clearer. The key fields are indicated by underlining the attribute name.

Stage 2: 2nd Normal form


The second stage of normalisation is to modify the data to meet the requirements of 2nd normal
form (2NF). If a table has a primary key based on a single attribute, then nothing needs to be
done at the 2NF stage. However, our table does have a composite key. In this case, we need to
move any data that is only dependent on one part of the composite key attributes to a separate
table.
The data that are only dependent on one part of the composite key attributes are the Description
and Price, as these only depend on the Product ID and should not be in the OrderNumber table.
If you know the Product ID, then you can find the Description and Price, but knowing the
order number alone will not uniquely identify the product (as several items may be on the
order). The completed 2NF tables now look like this (Tables 2.11–12).
Key term
Dependent on – one field depends on another if you can only find out the unique value of the
second field if you know the value of the first one. For example, if you know the Product ID,
then you can find the description and the price.
Note that the duplicates are removed from the Product table but the Product ID is left in the
main table as a foreign key.
Stage 3: 3rd Normal form
The third and final stage in normalisation is to modify the data to meet the requirements of 3rd
normal form (3NF). The requirement of 3NF is that none of the non-primary key attributes
should depend on any other attributes. If they do, then they need to be moved to another table,
leaving a copy of the primary key field for the new table in the original table as a foreign key.
In the main table, Customer no. will allow you to find a unique customer name (First and
Surname) and, since these are not key attributes, they need to be moved to another table (the
Customer table) with a copy of the key for that new table (customer number) left in the main
table as a foreign key. The 3NF tables are shown here (Tables 2.13–15).
However, the data shown in Tables 2.13–15 are not fully normalised yet because in the main
table there is dependence between Order number and Customer number. Since an order is only
placed by one customer, if you know the order number, you can find the customer number.
Order number and customer number therefore need to go in their own table (the Orders table)
as shown here in the completed 3NF tables (Tables 2.16–19).
Note that the remaining table is named the Order products table since it identifies which
products are on each order.
Research
Although for this qualification you only need to understand the process of normalisation up to
3rd Normal form (3NF), there are higher levels of normalisation. Find out what these are and
why they might be needed.
Database anomalies
The process of normalisation is designed to help reduce the possibility of anomalies occurring
as the database is used. To see how anomalies might occur in a database which has not been
fully normalised we shall look at an expanded version of the table we used to demonstrate how
data at 1NF looked in the section Stage 1 of The stages of normalisation (see Table 2.20).

The following type of anomalies can occur with this kind of 1NF database table:
▸ insert
▸ delete
▸ update.
Insert anomalies
To insert a new order into the database, the customer’s name has to been entered even if that
customer has previously placed an order. This gives rise to the possibility of errors or
inconsistencies.
Another possible source of anomalies is the fact that an existing customer might need to enter
their address every time they place an order. Compare this with the 3NF version of the database
(Table 2.17). In this, the customer details (only Customer Fname and Sname are shown, but
address details would be in the same table) are recorded once only per customer in the
Customer table. Only the Customer no. attribute needs to be recorded in the Orders table.
Another issue would be registering of new customers. In the 1NF version of the database, a
new customer could only be registered if they placed an order as the Product ID is part of the
composite key. Therefore, it is not possible to create a record with a null value in the Product
ID attribute (that is, it is impossible to create a customer account without placing a record).
Delete anomalies
If a customer places an order and then later cancels it, it needs to be deleted. However, if that
customer does not have any other orders, then all their details will be lost since without an
order you cannot have any customer details (you cannot have a customer record). This is
avoided in the 3NF version of the database as customer details are held in a separate table.
Update anomalies
If a customer changes their address, for example, then all the records containing orders for that
customer may need to be searched to change the address on those orders too. Again, this is
avoided in the 3NF version of the database, as customer details are held in a separate table.
Another issue which makes the 1NF version of the database unworkable is that of products
and their details. In this version, only products which have been ordered can exist on the
database. The only way to add a new product is if someone orders it. This clearly would not
be acceptable in practice.
As you can see from the normalisation example we have completed, the process helps you to
decide on the indexing, including the primary, foreign and composite keys required for each
table. The completed table diagrams can be used as the basis for the data dictionary, which
will include full details of the tables, attributes, data types and validation required for the whole
database.
Referential integrity and cascading update/ delete
One issue that can occur with a relational database is how to deal with linked records when
deleting or updating a record at the ‘one’ end of a one-to-many relationship. Consider the
Customer and Orders relationship. One customer can place many orders and the link is made
by placing the Customer table primary key attribute as a foreign key in the Order table.
But what happens if a customer closes his account with the company? If we delete the customer
record alone, that will leave orders on the Order table with no corresponding customer. These
might be considered ‘lost’ orders and might cause problems with an application which expects
to see every order linked to a customer record. One alternative is to use cascading updates,
where deleting a record on the Customer table would automatically delete related records on
the Order table. Microsoft® Access® (and SQL Server) provides an option to switch this
facility on, which is known as referential integrity.
Another problem may occur if you were to allow updates to the key field to which other records
are related. Imagine an employee database where the key field for the Employee table was the
employee’s NI number. Staff annual appraisal records are linked to the employee record by
inserting the primary key from the Employee table (the NI number) as a foreign key on the
Appraisals table. Consider what would happen if it was discovered that a data entry error had
been made in one employee’s NI number and it was updated to be correct. This would mean
that all of the employee’s appraisal records would be ‘lost’ because the NI number on the
parent record had changed. The referential integrity setting in Microsoft® Access® and SQL
Server allows you to prevent changes to key fields which exist as foreign keys in other tables.

You might also like