number of copies of a row in the result is m − n, where m corresponds to the first
relation.
NESTED QUERIES
A nested query is a querythat has another query embedded within it; the embedded
query is called a subquery .
Introduction to Nested Queries
As an example, let us rewrite the following query, which we discussed earlier, using a nested
subquery:
(Q1) Find the names of sailors who have reserved boat 103.
SELECT [Link]
FROM Sailors S
WHERE [Link] IN ( SELECT [Link]
FROM Reserves R
WHERE [Link] = 103 )
The nested subquery computes the (multi)set of sids for sailors who have reserved boat
103 (the set contains 22, 31, and 74 on instances R2 and S3), and the top-level query
retrieves the names of sailors whose sid is in this set. The IN operator allows us to
test whether a value is in a given set of elements; an SQL query is used to generate
the set to be tested. Notice that it is very easy to modify this query to find all sailors
who have not reserved boat 103—we can just replace IN by NOT IN!
(Q2) Find the names of sailors who have reserved a red boat.
SELECT [Link]
FROM Sailors S
WHERE [Link] IN ( SELECT [Link]
FROM Reserves R
WHERE [Link] IN ( SELECT [Link]
FROM Boats B
WHERE [Link] = ‘red’ )
The innermost subquery finds the set of bids of red boats (102 and 104 on instance
B1). The subquery one level above finds the set of sids of sailors who have reserved
one of these boats. On instances B1, R2, and S3, this set of sids contains 22, 31, and64.
Dept of CSE, Unit-2 Page 30
The top-level query finds the names of sailors whose sid is in this set of sids. For the
example instances, we get Dustin, Lubber, and Horatio.
(Q21) Find the names of sailors who have not reserved a red boat.
SELECT [Link]
FROM Sailors S
WHERE [Link] NOT IN ( SELECT [Link]
FROM Reserves R
WHERE [Link] IN ( SELECT [Link]
FROM Boats B
WHERE [Link] = ‘red’ )
This query computes the names of sailors whose sid is not in the set 22, 31, and 64.
Correlated Nested Queries
In the nested queries that we have seen thus far, the inner subquery has been completely
independent of the outer query. In general the inner subquery could depend on the row
that is currently being examined in the outer query (in terms of our conceptual
evaluation strategy). Let us rewrite the following query once more:
(Q1) Find the names of sailors who have reserved boat number 103.
SELECT [Link]
FROM Sailors S
WHERE EXISTS ( SELECT *
FROM Reserves R
WHERE [Link] = 103
AND [Link] = [Link] )
The EXISTS operator is another set comparison operator, such as IN. It allows us to
test whether a set is nonempty. Thus, for each Sailor row S, we test whether the set of
Reserves rows R such that [Link] = 103 AND [Link] = [Link] is nonempty. If so, sailor S has
reserved boat 103, and we retrieve the name. The subquery clearly depends on the current
row S and must be re-evaluated for each row in Sailors. The occurrence of S in the
subquery (in the form of the literal [Link]) is called a correlation, and such queries are
called correlated queries.
Set-Comparison Operators
We have already seen the set-comparison operators EXISTS, IN, and UNIQUE, along with
their negated versions. SQL also supports op ANY and op ALL, where op is one of the
Dept of CSE, Unit-2 Page 31
arithmetic comparison operators {<, <=, =, <>, >=, >}. (SOME is also available, but it is
just a synonym for ANY.)
(Q22) Find sailors whose rating is better than some sailor called Horatio.
SELECT [Link]
FROM Sailors S
WHERE [Link] > ANY ( SELECT [Link]
FROM Sailors S2
WHERE [Link] = ‘Horatio’ )
If there are several sailors called Horatio, this query finds all sailors whose rating is
better than that of some sailor called Horatio. On instance S3, this computes the
sids 31, 32, 58, 71, and 74. What if there were no sailor called Horatio? In this case
the comparison [Link] > ANY . . . is defined to return false, and the above query
returns an empty answer set. To understand comparisons involving ANY, it is useful to
think of the comparison being carried out repeatedly. In the example above, [Link]
is successively compared with each rating value that is an answer to the nested query.
Intuitively, the subquery must return a row that makes the comparison true, in order
for [Link] > ANY . . . to return true.
(Q23) Find sailors whose rating is better than every sailor called Horatio.
We can obtain all such queries with a simple modification to Query Q22: just replace
ANY with ALL in the WHERE clause of the outer query. On instance S3, we would get
the sids 58 and 71. If there were no sailor called Horatio, the comparison [Link]
> ALL . . . is defined to return true! The query would then return the names of all
sailors. Again, it is useful to think of the comparison being carried out repeatedly.
Intuitively, the comparison must be true for every returned row in order for [Link]
> ALL . . . to return true.
As another illustration of ALL, consider the following query:
(Q24) Find the sailors with the highest rating.
SELECT [Link]
FROM Sailors S
WHERE [Link] >= ALL ( SELECT [Link]
FROM Sailors S2 )
Dept of CSE, Unit-2 Page 32
The subquery computes the set of all rating values in Sailors. The outer WHERE con-
dition is satisfied only when [Link] is greater than or equal to each of these rating
values, i.e., when it is the largest rating value. In the instance S3, the condition is
only satisfied for rating 10, and the answer includes the sids of sailors with this rating,
i.e., 58 and 71.
Note that IN and NOT IN are equivalent to = ANY and <> ALL, respectively.
More Examples of Nested Queries
Let us revisit a query that we considered earlier using the INTERSECT operator.
(Q6) Find the names of sailors who have reserved both a red and a green boat.
SELECT [Link]
FROM Sailors S, Reserves R, Boats B
WHERE [Link] = [Link] AND [Link] = [Link] AND [Link] = ‘red’
AND [Link] IN ( SELECT [Link]
FROM Sailors S2, Boats B2, Reserves R2
WHERE [Link] = [Link] AND [Link] = [Link]
AND [Link] = ‘green’ )
As it turns out, writing this query (Q6) using INTERSECT is more complicated because
we have to use sids to identify sailors (while intersecting) and have to return sailor
names:
SELECT [Link]
FROM Sailors S3
WHERE [Link] IN (( SELECT [Link]
FROM Boats B, Reserves R
WHERE [Link] = [Link] AND [Link] = ‘red’ )
INTERSECT
(SELECT [Link]
FROM Boats B2, Reserves R2
WHERE [Link] = [Link] AND [Link] = ‘green’ ))
Our next example illustrates how the division operation in relational algebra can be
expressed in SQL.
(Q9) Find the names of sailors who have reserved all boats.
SELECT [Link]
FROM Sailors S
WHERE NOT EXISTS (( SELECT [Link]
FROM Boats B )
EXCEPT
(SELECT [Link]
FROM Reserves R
Dept of CSE, Unit-2 Page 33
WHERE [Link] = [Link] ))
Notice that this query is correlated—for each sailor S, we check to see that the set of
boats reserved by S includes all boats. An alternative way to do this query without
using EXCEPT follows:
SELECT [Link]
FROM Sailors S
WHERE NOT EXISTS ( SELECT [Link]
FROM Boats B
WHERE NOT EXISTS ( SELECT [Link]
FROM Reserves R
WHERE [Link] = [Link]
AND [Link] = [Link] ))
Intuitively, for each sailor we check that there is no boat that has not been reserved by
this sailor.
AGGREGATE OPERATORS
We now consider a powerful class of constructs for computing aggregate
values such as MIN and SUM. These features represent a significant extension of rela-
tional algebra. SQL supports five aggregate operations, which can be applied on any
column, say A, of a relation:
1. COUNT ([DISTINCT] A): The number of (unique) values in the A column.
2. SUM ([DISTINCT] A): The sum of all (unique) values in the A column.
3. AVG ([DISTINCT] A): The average of all (unique) values in the A column.
4. MAX (A): The maximum value in the A column.
5. MIN (A): The minimum value in the A column.
Note that it does not make sense to specify DISTINCT in conjunction with MIN or MAX
(although SQL-92 does not preclude this).
(Q25) Find the average age of all sailors.
SELECT AVG ([Link])
FROM Sailors S
On instance S3, the average age is 37.4. Of course, the WHERE clause can be used to
restrict the sailors who are considered in computing the average age:
(Q26) Find the average age of sailors with a rating of 10.
Dept of CSE, Unit-2 Page 34