Chapter 11
Database
Limitation of file based approach
• Data is not validated on entry
• There can be duplicate data.
• There can be empty fields.
• Lack of built in controls during data entry leads to data integrity
problems.
• Data privacy issues as access authority cannot be given to a particular
part of file.
• Data redundancy(data duplication across files)[once data is stored in a
file there should be no need for it to be stored again).
• Data dependency concerns: if new data is added programs also need
to be re written and it will not support new application
Data Dictionary
• A Data Dictionary is a collection of names, definitions, and attributes about
data elements that are being used or captured in a database, information
system.
• Data Dictionary has
1. table name
2. field name // attribute
3. data type
4. type of validation
5. Primary Key
6. Foreign Key
7. relationships
The Relational Database
• Relational Databases use two or more tables linked together (to form
a relationship)
• Relational Databases do not store all the data in the same table.
• Repeated data is moved into it's own table as shown in the image
below:
Relation and Entity
• Relation – special type of table used in a relational database.
• An entity is used to represent an object in the real world that
can be distinguished from other objects. An entity can be a
physical object (such as a person or a place) or a concept (such
as an activity or a task) for which we need to record data in the
database.
For example, a physical object could be an employee, a
customer or a product, and a concept could be an online order, a
school course or a booking.
• An attribute is used to represent a property, a quality or a
characteristic that describes an entity. For example, the name of
an employee, or the date and time that a booking was
submitted.
Instance
• Each set of values that corresponds to a specific teacher is
called an instance of the Teacher entity.
• In order to distinguish between the different instances of an
entity we need to establish an entity identifier (also known an
a key attribute).
• This is an attribute (or set of attributes) that can be used to
uniquely identify each instance of the entity.
• For example, the attribute TeacherId is a unique number that is
assigned to each teacher when they are hired. Therefore, it is
unique for each instance of the entity and can be used as an
entity identifier. An entity identifier can't be null.
Primary key, Candidate key, Secondary key
and Foreign key
• Primary key – attribute or attributes for which the values is unique.
• Candidate key – attribute which can be chosen as a primary key
• Secondary key- a candidate key which has not been chosen a a
primary key.
• Foreign key – an attribute in one table that refers to a primary key in
another table.
What is a relationship?
• A relationship is formed when our two tables are joined together.
• Relationships make use of foreign keys to allow the two tables to
communicate with each other and share their data.
• Foreign keys are identified using a primary key as shown in the image
below:
Primary Key Primary Key
This ensures referential integrity that the value can be entered only in one table when
the same value already exists in the referenced table
Entity Relationship Modelling
• ER model is a type of model based on the notion of real world entities
and relationship among them.
• We can map real world scenarios into ER database model.
• ER model creates a set of entities with their attributes , set of
constraints and relation between them.
• A relationship of two or more entities is used to represent an
interaction or association that exists between those entities. For
example, an employee works in a company, therefore a
relationship exists between the entities Employee and Company.
• In an entity relationship (ER) diagram, each entity is
represented by a rectangle. A relationship between entities is
shown as a line and can be one of three types:
Creating an ER diagram for theatrical agency
• Agency needs a database to handle their bookings.
• Each booking is for a venue.
• Each booking might be for one or more bands
Step 1: Choose Entities
1. Booking
2. Band
3. Member
4. Venue
Step 2: Identify the relationships
1. Booking with Venue
2. Booking with Band
3. Band with Member
A preliminary Entity Relationship Diagram:
Member Band Booking Venue
Step 3 : Decide the cardinalities of the
relationship
• One to one or 1:1
• One to many or 1:M
• Many to one or M:1
• Many to Many or M:M
Relation between member and band entities
The M:1 relationship
Member Band
The M:1 relationship with more details[ member must belong to a
band and band must have more than one member]
MAX MIN
MIN MAX
Member Band
Relationship between booking and venue
• M:1 relationship
Booking Venue
Many booking possible for 1 venue but a venue may be
present without booking
Relationship between Band and booking
entities
• M:M relationship
Band Booking
Full E-R diagram
Belongs to is booked for is made at
Member Band Booking Venue
has is for is booked for
Question 1:
Draw the ER diagram
Question 2
Logical ER model
Consider the entities Candidate and Club
Candidate Club
Resolving many-to-many relationships
A third entity called membership could be created. A compound key would
be used to uniquely identify each record in the new table.
Membership
The primary key from the Candidate table (Candidate number)
and the primary key from the Club table (Club ID) are present
as foreign keys in the Membership table. They combine to
create a compound key which can be used to uniquely identify
each member of each club. We have created two one to many
relationships to remove the many to many relationship from
the database. The relationships between the entities would
look like this on an entity relationship diagram:
Many to Many Relationship
• Many to many relationships create uncertainty and duplications on
data that will eventually result in wrong statements for queries.
Example
Each person can use many banks and each bank can have many
customers.
• In the Entity-Relationship (ER) Diagram, ‘person_id’ is specified as
Primary Key, and Primary Keys should be unique. There are rows that
have the same Primary Key in people table.
• A common way to avoid problems while setting many to many
relationships is creating a new table which is called ‘join table’,
’junction table’, ’cross-reference table’, ‘bridging table’, ‘intersection
table’ in various resources.
Normalisation of Database
Database Normalisation is a technique of organizing the data in the
database. Normalization is a systematic approach of decomposing
tables to eliminate data redundancy and undesirable characteristics like
Insertion, Update and Deletion Anomalies.
It is a multi-step process that puts data into tabular form by removing
duplicated data from the relation tables.
Normalization is used for mainly two purpose,
Eliminating redundant (useless) data.
Ensuring data dependencies make sense i.e. data is logically stored.
Database Normalization Examples
Assume a student table with information regarding the choice of the
subjects.
Database Normal Forms
Now let's move into 1st Normal Forms 1NF (First Normal Form) Rules
• Each table cell should contain a single value.
• Each record needs to be unique.
2NF (Second Normal Form) Rules
• Rule 1- Be in 1NF
• Rule 2- Single Column Primary Key
• No partial dependencies
What is Partial Dependency?
• Partial Dependency occurs when a non-prime attribute is functionally
dependent on part of a candidate key.
• The 2nd Normal Form (2NF) eliminates the Partial Dependency.
Example
Dependency
• When we can ask for branch name of student with student_id 10, and
we can get it. Similarly, if we ask for name of student with student_id
10 or 11, we will get it. So all we need is student_id and every other
column depends on it, or can be fetched using it.
• This is Dependency and we also call it Functional Dependency.
Partial Dependency Example
The primary key for this table is a composition of two columns which is
student_id & subject_id but the teacher's name only depends on
subject, hence the subject_id, and has nothing to do with student_id.
This is Partial Dependency, where an attribute in a table depends on
only a part of the primary key and not on the whole key.
To remove partial dependency
2 nd NF
Requirements for Third Normal Form
• For a table to be in the third normal form,
• It should be in the Second Normal form.
• And it should not have Transitive Dependency.
What is Transitive Dependency?
When a non-prime attribute depends on other non-prime attributes rather
than depending upon the prime attributes or primary key.
Here we can see that total_marks is only
dependent on the exam_name
The Database Management System[DBMS]
• A database management system (DBMS) is system software for
creating and managing databases.
• A DBMS makes it possible for end users to create, read, update and
delete data in a database.
DBMS Schemas: Internal, Conceptual, External
[Note:In computer programming, a schema (pronounced SKEE-mah) is the organization or structure for a database. The activity of data modeling
leads to a schema. (The plural form is schemata.]
Database systems comprise of complex data structures. Thus, to make
the system efficient for retrieval of data and reduce the complexity of
the users, developers use the three level model.
• Internal Level: Actual PHYSICAL storage structure and access paths.
• Conceptual or Logical Level: Structure and constraints for the entire
database
• External or View level: Describes various user views
Facilities of DBMS
• It allows users to define the database, usually through a Data
Definition Language (DDL).
• The DDL allows users to specify the data types and structures and the
constraints on the data to be stored in the database.
• It allows users to insert, update, delete, and retrieve data from the
database, usually through a Data Manipulation Language (DML).
• It also provides a query processor that allows a query to be created
and processed. The query is the mechanism of extracting and
manipulating data from the database.
• It also has the capability to create reports to present formatted
output.
Role of DBA
• Data Administrator (DA) is responsible for the management of the
data resource, including database planning; development and
maintenance of standards and conceptual/logical database design.
Improve performance of a database
Index for a Table
Indexes are used to quickly locate data without having to search every
row in a database table every time a database table is accessed.
Indexes can be created using primary or secondary key of database
table, providing the basis for both rapid random lookups and efficient
access of ordered records.
Structured Query Language (SQL)
• SQL is programming language provided by a DBMS.
• SQL lets you access and manipulate databases.
Data Definition Language(DDL)
• DDL is part of SQL provided for creating or altering tables
• Data Definition Language standards that all the database languages
adhere to – MySQL, Oracle, SQL Server, etc.
Data Definition Language Commands
• CREATE DATABASE
The Create Database command creates a new database.
Syntax:
CREATE DATABASE <DatabaseName>;
Example:
CREATE DATABASE Bandbooking;
CREATE TABLE
Now that we have a new database created, we need to create tables in our
database.
Syntax:
CREATE TABLE <TableName> (
<Column1> <DataType>,
<Column2> <DataType>,
<Column3> <DataType>,
.
.
.
<ColumnN> <DataType>,
)
• Example
CREATE TABLE Band(
BandName varchar(25),
NumberOfMembers integer );
Alter command
Alter commands in DDL. They are used to modify the tables to add,
rename, edit or delete a column. Alter command can also rename the
table itself.
1. The Alter Table command allows us to add a column to an existing
table
Syntax:
ALTER TABLE <TableName> ADD <ColumnName> <DataType>
Modify a column
Syntax:
ALTER TABLE <TableName> MODIFY <ColumnName> <DataType>
Rename a table
Syntax:
ALTER TABLE <TableName> RENAME TO <NewTableName>
Add Primary and Foreign Key
Example
ALTER TABLE Band ADD PRIMARY KEY (BandName);
ALTER TABLE Band-Booking ADD FOREIGN KEY (BandName
REFERENCES Band(BandName);
Data Manupulation Language(DML)
A data manipulation language (DML) is a computer
programming language used for reading , adding (inserting), deleting,
and modifying (updating) data in a database.
• Examples of DML:
• INSERT – is used to insert data into a table.
• UPDATE – is used to update existing data within a table.
• DELETE – is used to delete records from a database table.
• SELECT – is used to obtain data from database or for query
Practice Select Command
[Link]
SELECT ORDER BY
SELECT GROUP BY
SELECT WHERE