Relational Algebra
What is Relational Algebra?
• Relational algebra is a notation for specifying queries about
the contents of relations.
• Relational algebra eases the task of reasoning about
queries.
• Operations in relational algebra have counterparts in SQL.
• To process a query, a DBMS translates SQL into a notation
similar to relational algebra.
Five Basic Operations of Relational Algebra
• The Five Basic Operations of Relational Algebra
Group I: Three standard set-theoretic binary operations:
1. Union
2. Difference
3. Cartesian Product
Group II: Two special unary operations on relations:
4. Projection
5. Selection
• Relational Algebra consists of all expressions obtained by
combining these five basic operations in syntactically correct ways.
Union
• Standard set-theoretic definition of union:
A ∪ B = {x : x ∈ A or x ∈ B}
Example
{a, b, c} ∪ {a, d, e} = {a, b, c, d, e}
Union
• For relations, we want the result to be a relation again:
• a set R ⊆ D1 × D2 × · · · × Dn for some n and domains
D1, D2, . . . , Dn
• (or, in other words, a proper table, with each column
associated with a single domain of values).
• So we require in order to take a union of relations R and
S ,that R and S have the same number of columns and
that corresponding columns have the same domains.
Union-compatible relations
• Two relations R and S are union-compatible if
they have the same number of columns and
corresponding columns have the same domains.
Example: not union-compatible
Example-1 Example-2
This is a relation between people This is a relation between people,
and email addresses email addresses and phone
numbers
Email NAME Phone Email NAME
hameed@[Link] Hameed 12345 hameed@gcu. Hameed
[Link]
ali@[Link] Ali 67890 ali@[Link] Ali
baber@[Link] Baber
125438 baber@[Link] Baber
[Link]
Different number of columns
Example: not union-compatible
Email NAME D.O.B NAME
hameed@[Link] Hameed 1999 Hameed
ali@[Link] Ali 1998 Ali
baber@[Link] Baber 2000 Baber
Different domains for the second column
Example: union-compatible
D.O.B NAME D.O.B NAME
1997 Alina 1999 Hameed
2002 Fatima 1998 Ali
2005 Ayesha 2000 Baber
union-compatible: Same number of Column and having same
type
Union of two relations
• Let R and S be two union-compatible relations.
• Then their union R ∪ S is a relation which contains tuples from both
relations:
Note:
Note that union is a partial operation on relations: it is only defined
for some (compatible) relations, not for all of them.
Similar to division for numbers (result of division by 0 is not defined)
Union
C B A C B A C B A
5 3 1 5 4 1 5 3 1
5 4 1 6 3 2 5 4 1
6 3 2
R S R∪S
• R ∪ S is only allowed if R and S have exactly the same attributes.
• In SQL, (SELECT * FROM R) UNION (SELECT * FROM S) only
requires that R and S have the same number of attributes. The
result takes the attributes of R.
Difference of Two Relations
• Let R and S be two union-compatible relations.
Then their difference R - S is a relation which
contains tuples which are in R but not in S:
NOTE:
Note that difference is also a partial operation on
relations.
Difference
C B A C B A C B A
5 3 1 5 4 1 5 3 1
5 4 1 6 3 2
R S R-S
• R - S is only allowed if R and S have exactly the same attributes.
• In SQL, (SELECT * FROM R) MINUS (SELECT * FROM S) only requires
that R and S have the same number of attributes.
• The result takes the attributes of R.
Difference of Two Relations
• Let R and S be two union-compatible relations.
Then their difference R - S is a relation which
contains tuples which are in R but not in S:
NOTE:
Note that difference is also a partial operation on relations.
Difference
C B A C B A C B A
5 3 1 5 4 1 5 3 1
5 4 1 6 3 2
R R-S
S
R - S is only allowed if R and S have exactly the same attributes.
In SQL, (SELECT * FROM R) MINUS (SELECT * FROM S) only requires that R and S have the
same number of attributes.
The result takes the attributes of R.
Intersection of two relations
• Let R and S be two union-compatible relations. Then their intersection a relation
R ∩ S which contains tuples which are both in R and S:
Note:
1. Note that intersection is also a partial operation on relations.
2. Set-intersection is defined in terms of set-difference:
r ∩s = r - (r - s)
3 Thus, set-intersection must follow the same compatibility rules as set-difference:
same arity, corresponding domains.
Cartesian Product
• Cartesian product is a total operation on relations.
• Usual set theoretic definition of product:
Example 8
Extended Cartesian Product
Example
The Projection Operator
Projection
Projection
Projection
Project
Selection
Selection
Select
Combining Selection and Projection
Project Operation Properties
Example:
Age Name Roll_no Query1. Display the name of students in
20 A 1 student table
17 B 2
𝜋 𝑛𝑎𝑚𝑒 ( 𝑆𝑡𝑢𝑑𝑒𝑛𝑡 )
16 C 3
Name
19 D 4
A
18 E 5
B
18 F 6
C
D
E
F
Example:
Age Name Roll_no Query 2. Display the roll numbers and
20 A 1 name of students in student table
17 B 2
𝜋 𝑟𝑜𝑙𝑙 ,𝑛𝑎𝑚𝑒 ( 𝑆𝑡𝑢𝑑𝑒𝑛𝑡 )
16 C 3 𝑛𝑜
19 D 4 Name Roll_no
18 E 5 A 1
18 F 6 B 2
C 3
D 4
E 5
F 6
Example:
Age Name Roll_no Query 3. Display the age of students in
20 A 1 student table
17 B 2
𝜋 𝑎𝑔𝑒 ( 𝑆𝑡𝑢𝑑𝑒𝑛𝑡 )
16 C 3
19 D 4 Age
18 E 5 20
18 F 6 17
16
19
18
18
Example:
Age Name Roll_no Query 4. Display the roll number and name
20 A 1 of students whose age is greater than 17.
17 B 2
16 C 3
𝜋 𝑅𝑜𝑙𝑙
𝑛𝑜
,𝑛𝑎𝑚𝑒 ( 𝜎 𝑎𝑔𝑒>17 ( 𝑆𝑡𝑢𝑑𝑒𝑛𝑡 ) )
19 D 4 Age
18 E 5 20
18 F 6 17
16
19
18
18
)( JOIN
Join
• A join operation combines the product, selection and possibly projection. The join operator combines
data from one tuple of a relation with tuples from another or same relation when certain criteria are
met. The criteria involve a relationship among the attributes in the join relations. Different types of
joins are as follows:
• Theta Join
• Equi Join
• Natural Join
• Outer Join
Natural Join
C B A
5 3 1 D C B A
5 4 1 2 5 3 1
5 4 2
2 5 4 1
6 3 2 R 1 5 4 1
D C B
2 5 4 2
2 5 3
1 5 4 2
2 5 4
1 5 4
R S
1 6 3 S
Natural Join
• R S contains all tuples t such that t[ABC ] is in R,
and t[BCD] in S
• This is different from the cross product
• SELECT * FROM R, S in SQL.
• Natural Join is a join which is performed if there is a
common attribute between the relations.
Natural Join
• A natural join of two tables combines those tables i.e. extends the tuples of one
relation with their counterparts.
• Two tuples (of each table) will be concatenated if their shared (specified)
attributes match.
• It omits tuples which don’t have an associated tuple in the other relation
The list of attribute names shall be given as comma separated list
Binary Relational Operations: JOIN ()
• The sequence of CARTESIAN PRODECT followed by SELECT is used quite
commonly to identify and select related tuples from two relations.
• A special operation, called JOIN combines this sequence into a single
operation.
• This operation is very important for any relational database with more than a
single relation, because it allows us combine related tuples from various
relations
Binary Relational Operations: JOIN ()
• The general form of a join operation on two relations
where R and S can be any relations that result from general relational
.algebra expressions
Natural Join
• Let’s look at the following relation instances of Employee & Parking:
Employee Parking
Emp_Office Emp_name Emp_id Parking_space Parking_lot Emo_id
10 Bob 1001 6 A 1001
11 Alice 1002 14 A 1002
10 Sandy 1003
17 B 1003
11 Larry 1004
6 B 1004
11 Susan 1005
12 A 1005
Query: Display the employee details and their associated parking spaces
Natural Join
• If we perform the operation: Employee Parking
• The operation selects those rows from Parking to combine with the row from Employee where Emp_id
in Employee is equal to the Emp_id in Parking. The resulting relation looks like this:
Parking_space Parking_lot Emp_Office Emp_name Emp_id
6 A 10 Bob 1001
14 A 11 Alice 1002
17 B 10 Sandy 1003
6 B 11 Larry 1004
12 A 11 Susan 1005
Natural Join
• Example: Suppose there are two relations FACULTY and COURSE as follows:
•
Designation Department Name FactID FID Title Course Id
Assistant CS Usman Khalil 1001 1001 DBMS CS-520
Associate Fin Abdullah 1002 OOP CS-511
Associate Eco Ayesha Ashfaq 1003 1003 FM CS-430
Lecturer Fin Tanveer Hussain 1004
Course
Faculty
Natural Join will be :
Faculty Course
Natural Join
Faculty Course
Title CourseID Designation Department Name FactID
DBMS CS-520 Assistant CS Usman Khalil 1001
FM CS-430 Associate Eco Ayesha 1003
Ashfaq
Natural Join
STUDENT SUB_REGD
Phone Age Name Regno Name Regno
8652398452 15 Ali 001 Physics 001
7894562310 17 Umer 002 Computer 002
OS 002
STUDENT⋈ SUB_REGD
STUDENT⋈ SUB_REGD
Subject Phone Age Name Regno
Physics 8652398452 15 Ali 001
Computer 7894562310 17 Umer 002
O.S 7894562310 17 Umer 002
Natural Join
• RA:(Π regno, name, age, phone (student)) ⋈ (Π regno, subject (sub_regd)
• SQL: SELECT * FROM student natural join sub_regd;
STUDENT⋈ SUB_REGD
Subject Phone Age Name Regno
Physics 8652398452 15 Ali 001
Computer 7894562310 17 Umer 002
O.S 7894562310 17 Umer 002
Result: Produces a new temporary relation with regno, name, phone, sregno and subject
attributes of all students. The records that satisfy the join condition regno = sregno are
included in the final result.
Theta Join
• Theta join is the result of performing a SELECT operation using a comparison
operator theta on the product. Theta is denoted by
• In normal cross product, all rows of one relation are merged with all rows of
second relation. In Theta join, only selected rows of first relation are merged with
all rows of second relation . It is denoted as follows:
Theta Join
Theta join is a join which combines the tuples from different relations according
to the given theta condition. The join condition in theta join is denoted by theta
• R1 ⋈
(θ) symbol. This join uses all kind of comparison operator.
<join condition> R2
Here, the <join condition> is of the form R1.a θ R2.b, and
θ is any of the comparison operators {=, <, <=, >, >=,
≠}
Theta Join
STUDENT COURSE
C_ID AGE NAME S_ID C_Name C_1D
11 25 Andrew 1 C Foundation 11
11 30 Angel 2 C++ 21
22 35 Sarah 3
STUDENT (COURSE)
C_Name Age NAME S_ID
C Foundation 25 Andrew 1
C Foundation 30 Angel 2
Theta Join
STUDENT_DETAILS STUDENT_RESULT
Address NAME RollNo Marks Subject RollNo.
London Andrew 1 10 DBMS 1
Spain Angel 2 20 C++ 1
30 Java 2
40 C++ 2
STUDENT_DETAILS )
Theta Join
Marks Subject Address Name Roll No
10 DBMS London Andrew 1
20 C++ London Andrew 1
30 Java Spain Angel 2
40 C++ Spain Angel 2
Theta Join
• Find all the hospitals within 5 miles of a school
Theta Join
• Find all user clicks made within 5 seconds of page
load
Theta Join
AGE_GROUP CUSTOMER
Age_Group Max_age Min_Age Age Customer Name
18-30 30 18 35 Ayesha
31-60 60 31 25 Ali
Above 60 100 61 62 Umair
SQL: SELECT customer_name, age_group FROM customer, age_group WHERE age
between min_age and max_age;
RA: πcustomer_name, age_group(σage >= min_age ^ age <= max_age (customer X age_group))
Theta Join
• Example: Suppose there are two relations FACULTY and COURSE as follows:
•
Designation Department Name FactID FID Title Course Id
Assistant CS Usman 1001 1001 DBMS CS-520
Khalil
Associate Fin Abdullah 1002 OOP CS-511
Associate Eco Ayesha 1003 1003 FM CS-430
Ashfaq
Lecturer Fin Tanveer 1004
Hussain
Course
Faculty
FID Title CourseID Designation Department Name FactID
1001 DBMS CS-520 Associate Finance Abdullah 1002
OOP CS-511 Associate Finance Abdullah 1002
1003 FM CS-430 Associate Finance Abdullah 1002
1001 DBMS CS-520 Associate Economics Ayesha Ashfaq 1003
OOP CS-511 Associate Economics Ayesha Ashfaq 1003
1003 FM CS-430 Associate Economics Ayesha Ashfaq 1003
JOIN
• Movie(title: string, year: int, length: int, genre: string, studioName: string, producerC#: int)
• MovieStar(name: string, address: string, gender: char, birthdate: date)
• StarsIn( movieTitle: string, movieYear: string, starName: string)
• MovieExec(name: string, address: string, CERT#: int, netWorth: int)
• Studio(name: string, address: string, presC#: int)
It is translated as follows:
Practice Question #1
commission city name salesman_id Salesman_id grade city Cust_name Customer-id
0.15 NY James 5001
5001 100 NY Nick Ryman 3002
0.13 Paris Nail 5002
0.11 Lon Pit 5005 5001 200 NY Brad Davis 3007
0.14 Paris Mc Lyon 5006 5002 200 California Graham Zusi 3005
0.13 Rome Paul 5007
5002 300 London Julian Green 3008
0.12 San Lauson 5003
5006 300 Paris Fabian John 3004
5003 100 Berlin Geoff Camrn 3009
Q# Find the salesmen and customers
with their name and cities, who 5007 200 Moscow Jozy Altidor 3003
belongs to the same city 5005 London Brad Guzan 3001
Tree Diagram
Practice Question#2
commission city name salesman_id Salesman_id grade city Cust_name Customer-id
0.15 NY James 5001
5001 100 NY Nick Ryman 3002
0.13 Paris Nail 5002
0.11 Lon Pit 5005 5001 200 NY Brad Davis 3007
0.14 Paris Mc Lyon 5006 5002 200 California Graham Zusi 3005
0.13 Rome Paul 5007
5002 300 London Julian Green 3008
0.12 San Lauson 5003
5006 300 Paris Fabian John 3004
5003 100 Berlin Geoff Camrn 3009
Q# From the following tables
write a query to find the salesperson(s) 5007 200 Moscow Jozy Altidor 3003
and the customer(s) 5005 London Brad Guzan 3001
he handle.
Return Customer Name, city, Salesman
name, commission.
Solution
SELECT customer . customername, customer . city, salesman .
name, [Link]
FROM customer INNER JOIN salesman
ON customer . salesman_id = salesman . salesman_id
π cust . cname, cust . city, salesman . name, salesman . commission (cust ⋈ cust . salesman_id = salesman . salesman_id salesman)
(cust ⋈ cust . salesman_id = salesman . salesman_id salesman)
π cust . cname, cust . city, salesman . name, salesman . commission
Solution
Commission Salesman City Customer Name
0.15 James Hoog New York Nick Rimando
0.15 James Hoog New York Brad Davis
0.13 Nail Knite California Graham Zusi
0.13 Nail Knite London Julian Green
0.14 Mc Lyon Paris Fabian Johnson
0.12 Lauson Hen Berlin Geoff Cameron
0.13 Paul Adam Moscow Jozy Altidor
0.11 Pit Alex London Brad Guzan
Extended Relational-Algebra-Operations
Generalized Projection
Aggregate Functions
Aggregate Functions Cont.
Salesman_id Customer_id Ord_date Purch_amt Ord_no
5002 3005 2012-10-05 150.5 70001
5005 3001 2012-09-10 270.65 70009
5001 3002 2012-10-05 65.26 70002
5003 3009 2012-08-17 110.5 70004
5002 3005 2012-09-10 948.5 70005
5001 3007 2012-07-27 2400.6 70008
5001 3002 2012-09-10 5760 70010
5006 3004 2012-10-10 1983.43 70003
5003 3009 2012-10-10 2480.4 70012
5002 3008 2012-06-27 250.45 70011
5007 3003 2012-08-17 75.29 70013
.Aggregate Functions Cont
Consider the following relational schema:
Orders ( ord_no, purch_amt, ord_date, customer_id, salesman_id)
write a query to calculate total purchase amount of all orders.
Return total purchase amount.
;SELECT SUM (purch_amt) FROM orders
.Aggregate Functions Cont
Consider the following relational schema:
Orders ( ord_no, purch_amt, ord_date, customer_id, salesman_id)
write a SQL query to calculate average purchase amount of all orders.
Return average purchase amount
;SELECT AVG (purch_amt) FROM orders
π AVG (purch_amt)
γ AVG (purch_amt) orders
.Aggregate Functions Cont
Consider the following relational schema:
customers ( ord_no, purch_amt, ord_date, customer_id,
salesman_id)
write a query to count the number of customers. Return number
of customers.
;SELECT COUNT(*) FROM customer
π COUNT (*)
γ COUNT (*) customer
.Aggregate Functions Cont
Consider the following relational schema:
Orders ( ord_no, purch_amt, ord_date, customer_id,
salesman_id)
write a query to find the maximum purchase amount.
SELECT MAX (purch_amt) FROM orders;
π MAX (purch_amt)
γ MAX (purch_amt)
orders
Practice Question#3
DPT_ALLOTMENT DPT_NAME DPT_CODE EMP_DEPTCOD EMP_LNAME EMP_FNAME EMP_IDNO
57 Robbin Michale 127323
65000 IT 57
63 Snares Carlos 526689
15000 FINANCE 63
57 Dosio Enric 843795
240000 HR 47 63 Snares Jhon 328717
55000 RD 27 47 Dosni Joseph 444527
47 Emily Zanifer 659831
75000 QC 89
57 Sitaraman Kuleswar 847674
Table: emp_dept 47 Gabriel Henrey 748681
57 Manuel Alex 555935
27 Mardy George 539569
Write a query to find the names of 63 Saule Mario 733843
departments where more than two employees
27 Snappy Alan 631548
are working. Return dpt_name
57 Foster Maria 839139
Table: emp_details
Solution
SELECT emp_department.dpt_name
FROM emp_details
INNER JOIN emp_department
ON emp_dept =dpt_code
GROUP BY emp_department.dpt_name
HAVING COUNT(*) > 2;
SQL and Corresponding Relational Algebra
SQL and Corresponding Relational Algebra
Renaming
Renaming
Definition: RENAMING
Graded Activity 2
1. Consider the following relational database; Give an expression in the relational algebra for
each request:
employee (person-name, street, city)
works (person-name, company-name, salary)
company (company-name, city)
manages (person-name, manager-name)
a. Modify the database so that Jones now lives in Newtown.
b. Give all employees of First Bank Corporation a 10 percent salary raise.
c. Give all managers in this database a 10 percent salary raise.
d. Give all managers in this database a 10 percent salary raise, unless the salary would
be greater than $100,000. In such cases, give only a 3 percent raise.
e. Delete all tuples in the works relation for employees of Small Bank Corporation.
2. Why do you think relational algebra is called “relational algebra”?
Syntax I
Syntax II
Syntax III
Syntax IV
Syntax V
Syntax VI
Syntax VII
Syntax VIII
Syntax IX
Syntax X
Syntax XI
Syntax XII
Translating SQL into the Relational Algebra
• Movie(title: string, year: int, length: int, genre: string, studioName: string, producerC#: int)
• MovieStar (name: string, address: string, gender: char, birthdate: date)
• StarsIn(movieTitle: string, movieYear: string, starName: string)
• MovieExec(name: string, address: string, CERT#: int, netWorth: int)
• Studio(name: string, address: string, presC#: int)
Select-from-where statements without subqueries
• Consider a general SELECT-FROM-WHERE statement of the form
SELECT Select-list
FROM R1, . . . , R2 T2, . . .
WHERE Where-condition.
When the statement does not use subqueries in its where-condition, we can easily translate it into the
relational algebra as follows:
Note that:
1. An alias R2 T2 in the FROM-clause corresponds to a renaming
2. It is possible that there is no WHERE clause. In that case, it is of course unnecessary to
include the selection σ in the relational algebra expression.
3. If we omit the projection (π) we obtain the translation of the following special case:
SELECT *
FROM R1, . . . , R2 T2, . . .
WHERE Where-condition
.Example 1
• Consider the following SELECT-FROM-WHERE statement.
:Its translation is as follows
Normalizing Where-subqueries into Exists and Not Exists form
• In general, however, we also have to be able to translate statements in which
such subqueries do occur.
• Subqueries can occur in the WHERE clause through the operators =, , <=, >=,
<>; through the quantifiers ANY, or ALL;
• or through the operators EXISTS and IN and their negations NOT EXISTS and
NOT IN.
• We can easily rewrite all of these cases using only EXISTS and NOT EXISTS
Example 2
• The SQL-statement:
can be rewritten equivalently as
Quick Quiz
• Schema
• Employees(number, name, age, salary)
Supervises(boss, employee)
• Does the schema allow for an employee with no
boss?
• Find the names and salaries of all bosses who have
an employee earning more than 100.
Quick Quiz: Solution
• Schema
• Employees(number, name, age, salary) Supervises(boss, employee)
• Does the schema allow for an employee with no boss?
Yes
• Find the names and salaries of all bosses who have an employee earning more
than 100.
References:
• Slides of Relational Databases, Logic, and Complexity by Phokion G. Kolaitis; University of California,
Santa Cruz & IBM Research-Almaden
• Foundations of Databases Relational Query Languages; Slides from Werner Nutt, Thomas Eiter and
Leonid Libkin
• Foundation of Database by Serge Abiteboul, Richard Hull, Victor Vianu
• Relational Algebra-I, Advanced Database Theory, Slides from Dr. Syed Asad Raza Kazmi, Computer
Science Department,GC University Lahore