Understanding Relational Database Models
Understanding Relational Database Models
Course Delivery
Unit-2
Relational Model
• Represents database as a collection of relations
• Flat file of records
• Tuple -- row
• Attribute -- column header
• Table – relation
Domains,Attributes , Tuples and Relations
• Domain D is a set of atomic values
• examples
• Usa_phone_numbers: The set of ten digit phone numbers
valid in united states
• Local_phone_numbers: The set of seven digit phone numbers
valid within a particular area code in united states
• Social_security_numbers: the set of valid nine digit socail
security numbers
• Names: The set of character strings that represents names of a
person
• Employee_ages: possible ages of emp in a company
• Data type can be specified for each domain
• Usa_phone_numbers – character string of form (ddd)ddd-dddd
• D – numeric digit
• First 3 digits– valid telephone area code
• Employee_ages: integer number between 15 and 80
Formal Definitions - Schema
Slide 5- 4
Formal Definitions - Tuple
Slide 5- 5
• STUDENT(Name, Ssn, Home_phone, Address, Office_phone,
Age, Gpa)
• STUDENT(Name : string , Ssn: string , Home_phone:string,
Address:string, Office_phone:string, Age:integer, Gpa:real)
• dom(Name) = Names
• r(R): relation state – this is a set of tuples (rows)
• r(R) = {t1, t2, …, tn} where each ti is an ordered list of n values
• ti = <v1, v2, …, vn> where each vj is an element-of dom(Aj)
• Relation intension -- schema R
• Relation extension -- Relation state
Formal Definitions - Example
Slide 5- 7
Definition Summary
Slide 5- 8
Example of a Relation
Slide 5- 9
• r(R) dom (A1) X dom (A2) X ....X dom(An)
• Total number of tuples
|dom (A1)| X |dom (A2)| X ....X |dom(An)|
• Several attributes can have same domain, attribute names
indicate different roles
• Same domain USA_phone_numbers
• Home_phone
• Office_phone
Formal Definitions - Domain
Slide 5- 11
Characteristics of Relations
Slide 5- 19
Domain constraint
• Within each tuple , value of each attribute A must be atomic
value from domain dom(Ai)
• No two tuple in a relation can have same values
• SK – Subset of attribute(SUPER KEY)
• T1[SK] T2[SK]
• Key K of relation schems R is superkey of R with property that
removing any attribute A from K leaves set of attributes K’ that
is not a superkey of R anymore
Summary Chart:
Candidate Key Can uniquely identify records (any Student_ID or Email in Students
of them could be the primary key)
Unique Key Ensures uniqueness, can have null Username or Email in Users
Alternate Key A candidate key not chosen as Student_ID in Students (if Email is
primary key primary key)
Out of all the Candidate keys that can be possible for a table,
only one key will be used to retrieve unique tuples from the
table. This Candidate key is called the Primary Key.
There can be only one Primary key for a table. Depending on how
the Candidate Key is constructed, the primary key can be a singl
attribute or a group of attributes. But the important point is that th
Primary key should be a unique and non-null attribute(s)
There can be two ways to create a Primary key for the table.
The first way is to alter an already created to add the Primary
key constraint on an attribute. This is shown below:
The above table has following super keys. All of the following sets of
super key are able to uniquely identify a row of the employee table.
Slide 5- 34
• Key satisfies two properties
• Two distict tuples in any state of relation cannot have
identical values
• Minimal superkey(candidate key)– superkey from which we
cannot remove any attributes and still have uniqueness
constraint hold
• Key is superkey(not vice versa)
{Ssn} – key
{Ssn,name,age} – superkey
More than one key – candidate key
One among them – Primary
Another constraint
Null allowed or not
Every student name must have valid non null value
Name constrained to NOT NULL
Relational Database Schema
• S is a set of relational schemas S={R1,R2….Rm} and
a set of integrity constraints IC
• Relational database state DB of S is a set of relational
states DB={r1,r2….rm} such that each ri is a state of
Ri and such that ri relation state satisfies the IC
• Schema diagram
• Database state satisfying all IC – Valid state
• Invalid state
• Entity Integrity constraint – no primary key value can be
NULL
Referential integrity
• Tuples in one relation that refers to another relation must refer
to an existing tuple in that relation
• Dno of EMP must match Dnumber value in DEPT
Foreign key
• Two relations R1 and R2
• Set of attributes FK in relation schema R1 is foreign key of R2
that references Relation R2 if it satisfies following rules
• Attributes in FK have same domain as primary key attributes PK of R2
• Value of FK in t1 of current state r1(R1) either occurs as a value of PK for some t2 in state
r2(R2) or is NULL
• R1 – referencing relation
• R2 – referenced relation
• Two conditions satisfied– referential integrity constraint from
R1 to R2 is said to hold
• Dno-- foreign key of EMP referring to DEPT
• Dno should match value of primary key of DEPT
• John smith refers to research dept(diagram)
• Foreign key can refer to its own relation
• Super_ssn in EMP refers to supervisor of an EMP
• Eg
• John smith references tuple for EMP ‘franklin ’
• Franklin wong is supervisor of john smith
Other types of constraints
• Semantic integrity constraint
• Not part of DDL
• Salary of employee should not exceed salary of employee
supervisor
• Maximum num of hours an emp can work on project per
week is 56
• All constraint discussed so far are called state constraints
• Define constraints that a valid state of database must satisfy
• Transition constraint
• Defined to deal with state changes in database
eg : Salary of an employee can only increase
Update Operations on Relations
Slide 5- 44
Possible violations for each operation
Slide 5- 45
Possible violations for each operation
Slide 5- 46
Possible violations for each operation
Slide 5- 47
Update operations
• Insert
• Delete
• Update
• Modify
• Insert
• Operation:
• Referential integrity
• Update operation
Slide 6- 53
Unary Relational Operations: SELECT
• The SELECT operation (denoted by (sigma)) is used to select a subset
of the tuples from a relation based on a selection condition.
• The selection condition acts as a filter
• Keeps only those tuples that satisfy the qualifying condition
• Tuples satisfying the condition are selected whereas the other
tuples are discarded (filtered out)
• In general, the select operation is denoted by
<selection condition>(R)
• the symbol (sigma) is used to denote the select operator
• the selection condition is a Boolean (conditional) expression specified on the attributes
of relation R
Slide 6- 54
Unary Relational Operations: SELECT
• Examples:
• Select the EMPLOYEE tuples whose department number is 4:
DNO = 4 (EMPLOYEE)
• Select the employee tuples whose salary is greater than $30,000:
SALARY > 30,000 (EMPLOYEE)
Slide 6- 55
Unary Relational Operations: SELECT
(contd.)
• SELECT is commutative:
• <condition1>( < condition2> (R)) = <condition2> ( < condition1> (R))
Slide 6- 56
The following query results refer to this
database state
Slide 6- 57
Unary Relational Operations:
PROJECT
Slide 6- 58
Unary Relational Operations:
PROJECT (cont.)
Slide 6- 59
Unary Relational Operations:
PROJECT (contd.)
Slide 6- 60
• Select the tuples for all employees who either work in department no
4 and make over $25000 per year, or work in department 5 and make
over $30000
Single expression versus sequence of
relational operations (Example)
• To retrieve the first name, last name, and salary of all
employees who work in department number 5, we must
apply a select and a project operation
• We can write a single relational algebra expression as
follows:
• FNAME, LNAME, SALARY( DNO=5(EMPLOYEE))
OR
• We can explicitly show the sequence of operations, giving a
name to each intermediate relation:
• DEP5_EMPS DNO=5(EMPLOYEE)
• RESULT FNAME, LNAME, SALARY (DEP5_EMPS)
Slide 6- 63
Unary Relational Operations:
RENAME
• The RENAME operator is denoted by (rho)
• In some cases, we may want to rename the attributes of a relation or
the relation name or both
Slide 6- 64
Unary Relational Operations:
RENAME (contd.)
Slide 6- 65
Unary Relational Operations:
RENAME (contd.)
• For convenience, we also use a shorthand for renaming attributes in
an intermediate relation:
• If we write:
• RESULT FNAME, LNAME, SALARY (DEP5_EMPS)
• RESULT will have the same attribute names as
DEP5_EMPS (same attributes as EMPLOYEE)
• If we write:
• RESULT (F, M, L, S, B, A, SX, SAL, SU, DNO)
FNAME, LNAME, SALARY (DEP5_EMPS)
• The 10 attributes of DEP5_EMPS are renamed to F,
M, L, S, B, A, SX, SAL, SU, DNO, respectively
Slide 6- 66
• TEMP DNO=5(EMPLOYEE)
• R(First_name, Last_name,Salary) FNAME, LNAME, SALARY (TEMP)
Relational Algebra Operations from
Set Theory: UNION
• UNION Operation
• Binary operation, denoted by
• The result of R S, is a relation that includes all tuples that are either in R or
in S or in both R and S
• Duplicate tuples are eliminated
• The two operand relations R and S must be “type
compatible” (or UNION compatible)
• R and S must have same number of attributes
• Each pair of corresponding attributes must be type
compatible (have same or compatible domains)
Slide 6- 69
Relational Algebra Operations from
Set Theory: UNION
• Example:
• To retrieve the social security numbers of all employees who
either work in department 5 (RESULT1 below) or directly supervise
an employee who works in department 5 (RESULT2 below)
• We can use the UNION operation as follows:
DEP5_EMPS DNO=5 (EMPLOYEE)
RESULT1 SSN(DEP5_EMPS)
RESULT2(SSN) SUPERSSN(DEP5_EMPS)
RESULT RESULT1 RESULT2
• The union operation produces the tuples that are in either
RESULT1 or RESULT2 or both
Slide 6- 70
Result of union operation
Relational Algebra Operations from
Set Theory
Slide 6- 72
Relational Algebra Operations from Set Theory:
INTERSECTION
• INTERSECTION is denoted by
• The result of the operation R S, is a relation
that includes all tuples that are in both R and S
• The attribute names in the result will be the
same as the attribute names in R
• The two operand relations R and S must be
“type compatible”
Slide 6- 73
Relational Algebra Operations from Set
Theory: SET DIFFERENCE (cont.)
Slide 6- 74
Some properties of UNION, INTERSECT,
and DIFFERENCE
Slide 6- 76
Relational Algebra Operations from Set Theory:
CARTESIAN PRODUCT
Slide 6- 77
Relational Algebra Operations from Set
Theory: CARTESIAN PRODUCT (cont.)
Slide 6- 78
Relational Algebra Operations from Set
Theory: CARTESIAN PRODUCT (cont.)
Slide 6- 79
Cartesian product
Binary Relational Operations: JOIN
Slide 6- 83
Binary Relational Operations: JOIN
(cont.)
• Example: Suppose that we want to retrieve the name of the
manager of each department.
• To get the manager’s name, we need to combine each DEPARTMENT
tuple with the EMPLOYEE tuple whose SSN value matches the
MGRSSN value in the department tuple.
• We do this by using the join operation.
Slide 6- 84
Some properties of JOIN
Slide 6- 86
Some properties of JOIN
Slide 6- 87
Binary Relational Operations:
EQUIJOIN
• EQUIJOIN Operation
• The most common use of join involves join conditions with equality
comparisons only
• Such a join, where the only comparison operator used is =, is called
an EQUIJOIN.
Slide 6- 88
Additional Relational Operations
(cont.)
Slide 6- 89
Additional Relational Operations
(cont.)
• The left outer join operation keeps every tuple in the first or left
relation R in R S; if no matching tuple is found in S, then the
attributes of S in the join result are filled or “padded” with null
values.
• A similar operation, right outer join, keeps every tuple in the second
or right relation S in the result of R S.
• A third operation, full outer join, denoted by keeps all tuples
in both the left and the right relations when no matching tuples are
found, padding them with null values as needed.
Slide 6- 90
(INNER JOIN)
LEFT JOIN):
(RIGHT JOIN):
(FULL JOIN):
Additional Relational Operations
(cont.)
Slide 6- 93
LEFT JOIN):
(RIGHT JOIN):
(FULL JOIN):
Binary Relational Operations:
NATURAL JOIN Operation
• NATURAL JOIN Operation
• Another variation of JOIN called NATURAL JOIN — denoted by * — was
created to get rid of the second (superfluous) attribute in an EQUIJOIN
condition.
• because one of each pair of attributes with identical values is superfluous
• The standard definition of natural join requires that the two join
attributes, or each pair of corresponding join attributes, have the same
name in both relations
• If this is not the case, a renaming operation is applied first.
Slide 6- 98
Binary Relational Operations NATURAL JOIN (contd.)
Slide 6- 99
Example of NATURAL JOIN operation
Slide 6- 100
Complete Set of Relational
Operations
Slide 6- 101
Binary Relational Operations:
DIVISION
• DIVISION Operation
• The division operation is applied to two relations
• R(Z) S(X), where X subset Z. Let Y = Z - X (and hence Z
= X Y); that is, let Y be the set of attributes of R that are not
attributes of S.
Slide 6- 106
Additional Relational Operations: Aggregate
Functions and Grouping
Slide 6- 107
Aggregate Function Operation
Slide 6- 108
Using Grouping with Aggregation
Slide 6- 111
Additional Relational Operations
(cont.)
• Although it is possible to retrieve employees at each level and then
take their union, we cannot, in general, specify a query such as
“retrieve SSNs of all employees e’ directly supervised at level one by
employee e whose name is ‘James Borg’
• without utilizing a looping mechanism.
• The SQL3 standard includes syntax for recursive closure.
Slide 6- 112
Two level recursive query
Additional Relational Operations
(cont.)
Slide 6- 119
Additional Relational Operations
(cont.)
• Example: An outer union can be applied to two relations
whose schemas are STUDENT(Name, SSN, Department,
Advisor) and INSTRUCTOR(Name, SSN, Department, Rank).
• Tuples from the two relations are matched based on having the
same combination of values of the shared attributes— Name,
SSN, Department.
• If a student is also an instructor, both Advisor and Rank will have a
value; otherwise, one of these two attributes will be null.
• The result relation STUDENT_OR_INSTRUCTOR will have the
following attributes:
STUDENT_OR_INSTRUCTOR (Name, SSN, Department,
Advisor, Rank)
Slide 6- 120
Examples of Queries in Relational
Algebra
Q1: Retrieve the name and address of all employees who work for the
‘Research’ department.
RESEARCH_DEPT DNAME=’Research’ (DEPARTMENT)
RESEARCH_EMPS (RESEARCH_DEPT DNUMBER= DNOEMPLOYEE EMPLOYEE)
RESULT FNAME, LNAME, ADDRESS (RESEARCH_EMPS)
Slide 6- 121
Illustrating aggregate functions and
grouping
Slide 6- 122