Query statements scan one or more tables or expressions and return the computed result rows. This topic describes the syntax for SQL queries in GoogleSQL for Spanner.
The following table lists and describes the syntax notation rules that GoogleSQL documentation commonly uses.
| Notation | Example | Description |
|---|---|---|
| Square brackets | [ ] |
Optional clauses |
| Parentheses | ( ) |
Literal parentheses |
| Vertical bar | | |
Logical XOR (exclusive OR) |
| Curly braces | { } |
A set of options, such as { a | b | c }. Select one option. |
| Ellipsis | ... |
The preceding item can repeat. |
| Comma | , |
Literal comma |
| Comma followed by an ellipsis | , ... |
The preceding item can repeat in a comma-separated list. |
| Item list | item [, ...] |
One or more items |
[item, ...] |
Zero or more items | |
| Double quotes | "" |
The enclosed syntax characters (for example, "{"..."}") are literal
and required. |
| Angle brackets | <> |
Literal angle brackets |
query_statement: [ statement_hint_expr ] [ table_hint_expr ] [ group_hint_expr ] [ join_hint_expr ] query_expr query_expr: [ WITH cte[, ...] ] { select | ( query_expr ) | set_operation } [ ORDER BY expression [{ ASC | DESC }] [, ...] ] [ LIMIT count [ OFFSET skip_rows ] ] [ FOR UPDATE ] select: SELECT [ { ALL | DISTINCT } ] [ AS { typename | STRUCT | VALUE } ] select_list [ FROM from_clause[, ...] ] [ WHERE bool_expression ] [ GROUP [ group_hint_expr ] BY group_by_specification ] [ HAVING bool_expression ]
SELECT statement
SELECT
[ { ALL | DISTINCT } ]
[ AS { typename | STRUCT | VALUE } ]
select_list
select_list:
{ select_all | select_expression } [, ...]
select_all:
[ expression. ]*
[ EXCEPT ( column_name [, ...] ) ]
[ REPLACE ( expression AS column_name [, ...] ) ]
select_expression:
expression [ [ AS ] alias ]
The SELECT list defines the columns that the query will return. Expressions in
the SELECT list can refer to columns in any of the from_items in its
corresponding FROM clause.
Each item in the SELECT list is one of:
*expressionexpression.*SELECT *SELECT *, often referred to as select star, produces one output column for
each column that's visible after executing the full query.
SELECT * FROM (SELECT "apple" AS fruit, "carrot" AS vegetable);
/*-------+-----------+
| fruit | vegetable |
+-------+-----------+
| apple | carrot |
+-------+-----------*/
SELECT expressionItems in a SELECT list can be expressions. These expressions evaluate to a
single value and produce one output column, with an optional explicit alias.
If the expression doesn't have an explicit alias, it receives an implicit alias according to the rules for implicit aliases, if possible. Otherwise, the column is anonymous and you can't refer to it by name elsewhere in the query.
SELECT expression.*An item in a SELECT list can also take the form of expression.*. This
produces one output column for each column or top-level field of expression.
The expression must either be a table alias or evaluate to a single value of a
data type with fields, such as a STRUCT.
The following query produces one output column for each column in the table
groceries, aliased as g.
WITH groceries AS
(SELECT "milk" AS dairy,
"eggs" AS protein,
"bread" AS grain)
SELECT g.*
FROM groceries AS g;
/*-------+---------+-------+
| dairy | protein | grain |
+-------+---------+-------+
| milk | eggs | bread |
+-------+---------+-------*/
More examples:
WITH locations AS
(SELECT STRUCT("Seattle" AS city, "Washington" AS state) AS location
UNION ALL
SELECT STRUCT("Phoenix" AS city, "Arizona" AS state) AS location)
SELECT l.location.*
FROM locations l;
/*---------+------------+
| city | state |
+---------+------------+
| Seattle | Washington |
| Phoenix | Arizona |
+---------+------------*/
WITH locations AS
(SELECT ARRAY<STRUCT<city STRING, state STRING>>[("Seattle", "Washington"),
("Phoenix", "Arizona")] AS location)
SELECT l.LOCATION[offset(0)].*
FROM locations l;
/*---------+------------+
| city | state |
+---------+------------+
| Seattle | Washington |
+---------+------------*/
SELECT * EXCEPTA SELECT * EXCEPT statement specifies the names of one or more columns to
exclude from the result. All matching column names are omitted from the output.
WITH orders AS
(SELECT 5 as order_id,
"sprocket" as item_name,
200 as quantity)
SELECT * EXCEPT (order_id)
FROM orders;
/*-----------+----------+
| item_name | quantity |
+-----------+----------+
| sprocket | 200 |
+-----------+----------*/
SELECT * REPLACEA SELECT * REPLACE statement specifies one or more
expression AS identifier clauses. Each identifier must match a column name
from the SELECT * statement. In the output column list, the column that
matches the identifier in a REPLACE clause is replaced by the expression in
that REPLACE clause.
A SELECT * REPLACE statement doesn't change the names or order of columns.
However, it can change the value and the value type.
WITH orders AS
(SELECT 5 as order_id,
"sprocket" as item_name,
200 as quantity)
SELECT * REPLACE ("widget" AS item_name)
FROM orders;
/*----------+-----------+----------+
| order_id | item_name | quantity |
+----------+-----------+----------+
| 5 | widget | 200 |
+----------+-----------+----------*/
WITH orders AS
(SELECT 5 as order_id,
"sprocket" as item_name,
200 as quantity)
SELECT * REPLACE (quantity/2 AS quantity)
FROM orders;
/*----------+-----------+----------+
| order_id | item_name | quantity |
+----------+-----------+----------+
| 5 | sprocket | 100 |
+----------+-----------+----------*/
SELECT DISTINCTA SELECT DISTINCT statement discards duplicate rows and returns only the
remaining rows. SELECT DISTINCT can't return columns of the following types:
PROTOSTRUCTARRAYGRAPH_ELEMENTGRAPH_PATHSELECT ALLA SELECT ALL statement returns all rows, including duplicate rows.
SELECT ALL is the default behavior of SELECT.
Queries that return a STRUCT at the root of the return type aren't
supported in Spanner APIs. For example, the following
query is supported only as a subquery:
SELECT STRUCT(1, 2) FROM Users;
Returning an array of structs is supported. For example, the following queries are supported in Spanner APIs:
SELECT ARRAY(SELECT STRUCT(1 AS A, 2 AS B)) FROM Users;
SELECT ARRAY(SELECT AS STRUCT 1 AS a, 2 AS b) FROM Users;
However, query shapes that can return an ARRAY<STRUCT<...>> typed NULL
value or an ARRAY<STRUCT<...>> typed value with an element that's NULL
aren't supported in Spanner APIs, so the following query
is supported only as a subquery:
SELECT ARRAY(SELECT IF(STARTS_WITH(Users.username, "a"), NULL, STRUCT(1, 2)))
FROM Users;
See Querying STRUCT elements in an ARRAY for more examples on
how to query STRUCTs inside an ARRAY.
Also see notes about using STRUCTs in subqueries.
SELECT AS STRUCTSELECT AS STRUCT expr [[AS] struct_field_name1] [,...]
This produces a value table with a
STRUCT row type, where the
STRUCT field names and types match the column names
and types produced in the SELECT list.
Example:
SELECT ARRAY(SELECT AS STRUCT 1 a, 2 b)
SELECT AS STRUCT can be used in a scalar or array subquery to produce a single
STRUCT type grouping multiple values together. Scalar
and array subqueries (see Subqueries) are normally not
allowed to return multiple columns, but can return a single column with
STRUCT type.
Anonymous columns are allowed.
Example:
SELECT AS STRUCT 1 x, 2, 3
The query above produces STRUCT values of type
STRUCT<int64 x, int64, int64>. The first field has the name x while the
second and third fields are anonymous.
The example above produces the same result as this SELECT AS VALUE query using
a struct constructor:
SELECT AS VALUE STRUCT(1 AS x, 2, 3)
Duplicate columns are allowed.
Example:
SELECT AS STRUCT 1 x, 2 y, 3 x
The query above produces STRUCT values of type
STRUCT<int64 x, int64 y, int64 x>. The first and third fields have the same
name x while the second field has the name y.
The example above produces the same result as this SELECT AS VALUE query
using a struct constructor:
SELECT AS VALUE STRUCT(1 AS x, 2 AS y, 3 AS x)
SELECT AS typenameSELECT AS typename
expr [[AS] field]
[, ...]
A SELECT AS typename statement produces a value table where the row type
is a specific named type. Currently, protocol buffers are the
only supported type that can be used with this syntax.
When selecting as a type that has fields, such as a proto message type,
the SELECT list may produce multiple columns. Each produced column must have
an explicit or implicit alias that matches a unique field of
the named type.
When used with SELECT DISTINCT, or GROUP BY or ORDER BY using column
ordinals, these operators are first applied on the columns in the SELECT list.
The value construction happens last. This means that DISTINCT can be applied
on the input columns to the value construction, including in
cases where DISTINCT wouldn't be allowed after value construction because
grouping isn't supported on the constructed type.
The following is an example of a SELECT AS typename query.
SELECT AS tests.TestProtocolBuffer mytable.key int64_val, mytable.name string_val
FROM mytable;
The query returns the output as a tests.TestProtocolBuffer protocol
buffer. mytable.key int64_val means that values from the key column are
stored in the int64_val field in the protocol buffer. Similarly, values from
the mytable.name column are stored in the string_val protocol buffer field.
To learn more about protocol buffers, see Work with protocol buffers.
SELECT AS VALUESELECT AS VALUE produces a value table from any
SELECT list that produces exactly one column. Instead of producing an
output table with one column, possibly with a name, the output will be a
value table where the row type is just the value type that was produced in the
one SELECT column. Any alias the column had will be discarded in the
value table.
Example:
SELECT AS VALUE 1
The query above produces a table with row type INT64.
Example:
SELECT AS VALUE STRUCT(1 AS a, 2 AS b) xyz
The query above produces a table with row type STRUCT<a int64, b int64>.
Example:
SELECT AS VALUE v FROM (SELECT AS STRUCT 1 a, true b) v WHERE v.b
Given a value table v as input, the query above filters out certain values in
the WHERE clause, and then produces a value table using the exact same value
that was in the input table. If the query above didn't use SELECT AS VALUE,
then the output table schema would differ from the input table schema because
the output table would be a regular table with a column named v containing the
input value.
FROM clauseFROM from_clause[, ...] from_clause: from_item [ tablesample_operator ] from_item: { table_name [ table_hint_expr ] [ as_alias ] | { join_operation | ( join_operation ) } | ( query_expr ) [ table_hint_expr ] [ as_alias ] | field_path | unnest_operator | cte_name [ table_hint_expr ] [ as_alias ] | graph_table_operator [ as_alias ] } as_alias: [ AS ] alias
The FROM clause indicates the table or tables from which to retrieve rows,
and specifies how to join those rows together to produce a single stream of
rows for processing in the rest of the query.
tablesample_operatorSee TABLESAMPLE operator.
graph_table_operatorSee GRAPH_TABLE operator.
table_nameThe name of an existing table.
SELECT * FROM Roster;
join_operationSee Join operation.
query_expr( query_expr ) [ [ AS ] alias ] is a table subquery.
field_pathIn the FROM clause, field_path is any path that
resolves to a field within a data type. field_path can go
arbitrarily deep into a nested data structure.
Some examples of valid field_path values include:
SELECT * FROM T1 t1, t1.array_column;
SELECT * FROM T1 t1, t1.struct_column.array_field;
SELECT (SELECT ARRAY_AGG(c) FROM t1.array_column c) FROM T1 t1;
SELECT a.struct_field1 FROM T1 t1, t1.array_of_structs a;
SELECT (SELECT STRING_AGG(a.struct_field1) FROM t1.array_of_structs a) FROM T1 t1;
Field paths in the FROM clause must end in an
array or a repeated field. In
addition, field paths can't contain arrays
or repeated fields before the end of the path. For example, the path
array_column.some_array.some_array_field is invalid because it
contains an array before the end of the path.
unnest_operatorSee UNNEST operator.
cte_nameCommon table expressions (CTEs) in a WITH Clause act like
temporary tables that you can reference anywhere in the FROM clause.
In the example below, subQ1 and subQ2 are CTEs.
Example:
WITH
subQ1 AS (SELECT * FROM Roster WHERE SchoolID = 52),
subQ2 AS (SELECT SchoolID FROM subQ1)
SELECT DISTINCT * FROM subQ2;
UNNEST operatorunnest_operator: { UNNEST( array ) [ as_alias ] | array_path [ as_alias ] } [ table_hint_expr ] [ WITH OFFSET [ as_alias ] ] array: { array_expression | array_path } as_alias: [AS] alias
The UNNEST operator takes an array and returns a table with one row for each
element in the array. The output of UNNEST is one value table column.
For these ARRAY element types, SELECT * against the value table column
returns multiple columns:
STRUCTPROTOInput values:
array_expression: An expression that produces an array and that's not an
array path.array_path: The
path to an ARRAY type.
UNNEST operation, the path
must
start with
a
range variable name.UNNEST operation, the path can optionally start with a
range variable name.The UNNEST operation with any correlated array_path must
be on the right side of a CROSS JOIN, LEFT JOIN, or
INNER JOIN operation.
as_alias: If specified, defines the explicit name of the value table
column containing the array element values. It can be used to refer to
the column elsewhere in the query.
WITH OFFSET: UNNEST destroys the order of elements in the input
array. Use this optional clause to return an additional column with
the array element indexes, or offsets. Offset counting starts at zero for
each row produced by the UNNEST operation. This column has an
optional alias; If the optional alias isn't used, the default column name is
offset.
Example:
SELECT * FROM UNNEST ([10,20,30]) as numbers WITH OFFSET;
/*---------+--------+
| numbers | offset |
+---------+--------+
| 10 | 0 |
| 20 | 1 |
| 30 | 2 |
+---------+--------*/
You can also use UNNEST outside of the FROM clause with the
IN operator.
For several ways to use UNNEST, including construction, flattening, and
filtering, see Work with arrays.
To learn more about the ways you can use UNNEST explicitly and implicitly,
see Explicit and implicit UNNEST.
UNNEST and structsFor an input array of structs, UNNEST
returns a row for each struct, with a separate column for each field in the
struct. The alias for each column is the name of the corresponding struct
field.
Example:
SELECT *
FROM UNNEST(
ARRAY<
STRUCT<
x INT64,
y STRING,
z ARRAY<INT64>>>[
(1, 'foo', [10, 11]),
(3, 'bar', [20, 21])]);
/*---+-----+----------+
| x | y | z |
+---+-----+----------+
| 1 | foo | {10, 11} |
| 3 | bar | {20, 21} |
+---+-----+----------*/
UNNEST and protocol buffersFor an input array of protocol buffers, UNNEST returns a row for each
protocol buffer, with a separate column for each field in the
protocol buffer. The alias for each column is the name of the corresponding
protocol buffer field.
Example:
SELECT *
FROM UNNEST(
ARRAY<googlesql.examples.music.Album>[
NEW googlesql.examples.music.Album (
'The Goldberg Variations' AS album_name,
['Aria', 'Variation 1', 'Variation 2'] AS song
)
]
);
/*-------------------------+--------+----------------------------------+
| album_name | singer | song |
+-------------------------+--------+----------------------------------+
| The Goldberg Variations | NULL | [Aria, Variation 1, Variation 2] |
+-------------------------+--------+----------------------------------*/
As with structs, you can alias UNNEST to define a range variable. You
can reference this alias in the SELECT list to return a value table where each
row is a protocol buffer element from the array.
SELECT proto_value
FROM UNNEST(
ARRAY<googlesql.examples.music.Album>[
NEW googlesql.examples.music.Album (
'The Goldberg Variations' AS album_name,
['Aria', 'Var. 1'] AS song
)
]
) AS proto_value;
/*---------------------------------------------------------------------+
| proto_value |
+---------------------------------------------------------------------+
| {album_name: "The Goldberg Variations" song: "Aria" song: "Var. 1"} |
+---------------------------------------------------------------------*/
UNNESTArray unnesting can be either explicit or implicit. To learn more, see the following sections.
The UNNEST keyword is required in explicit unnesting. For example:
WITH Coordinates AS (SELECT [1,2] AS position)
SELECT results FROM Coordinates, UNNEST(Coordinates.position) AS results;
This example and the following examples use the array_path called
Coordinates.position to illustrate unnesting.
The UNNEST keyword isn't used in implicit unnesting.
For example:
WITH Coordinates AS (SELECT [1,2] AS position)
SELECT results FROM Coordinates, Coordinates.position AS results;
When you use array_path with implicit UNNEST, array_path must be prepended
with the table. For example:
WITH Coordinates AS (SELECT [1,2] AS position)
SELECT results FROM Coordinates, Coordinates.position AS results;
UNNEST and NULL valuesUNNEST treats NULL values as follows:
NULL and empty arrays produce zero rows.NULL values produces rows containing NULL values.TABLESAMPLE operator
tablesample_clause:
TABLESAMPLE sample_method (sample_size percent_or_rows )
sample_method:
{ BERNOULLI | RESERVOIR }
sample_size:
numeric_value_expression
percent_or_rows:
{ PERCENT | ROWS }
Description
You can use the TABLESAMPLE operator to select a random sample of a dataset.
This operator is useful when you're working with tables that have large
amounts of data and you don't need precise answers.
sample_method: When using the TABLESAMPLE operator, you must specify the
sampling algorithm to use:
BERNOULLI: Each row is independently selected with the probability
given in the percent clause. As a result, you get approximately
N * percent/100 rows.RESERVOIR: Takes as parameter an actual sample size
K (expressed as a number of rows). If the input is smaller than K, it
outputs the entire input relation. If the input is larger than K,
reservoir sampling outputs a sample of size exactly K, where any sample of
size K is equally likely.sample_size: The size of the sample.percent_or_rows: The TABLESAMPLE operator requires that you choose either
ROWS or PERCENT. If you choose PERCENT, the value must be between
0 and 100. If you choose ROWS, the value must be greater than or equal
to 0.Examples
The following examples illustrate the use of the TABLESAMPLE operator.
Select from a table using the RESERVOIR sampling method:
SELECT MessageId
FROM Messages TABLESAMPLE RESERVOIR (100 ROWS);
Select from a table using the BERNOULLI sampling method:
SELECT MessageId
FROM Messages TABLESAMPLE BERNOULLI (0.1 PERCENT);
Use TABLESAMPLE with a subquery:
SELECT Subject FROM
(SELECT MessageId, Subject FROM Messages WHERE ServerId="test")
TABLESAMPLE BERNOULLI(50 PERCENT)
WHERE MessageId > 3;
Use a TABLESAMPLE operation with a join to another table.
SELECT S.Subject
FROM
(SELECT MessageId, ThreadId FROM Messages WHERE ServerId="test") AS R
TABLESAMPLE RESERVOIR(5 ROWS),
Threads AS S
WHERE S.ServerId="test" AND R.ThreadId = S.ThreadId;
GRAPH_TABLE operatorTo learn more about this operator, see
GRAPH_TABLE operator in the
Graph Query Language (GQL) reference guide.
join_operation: { cross_join_operation | condition_join_operation } cross_join_operation: from_item cross_join_operator [ join_hint_expr ] from_item condition_join_operation: from_item condition_join_operator [ join_hint_expr ] from_item join_condition cross_join_operator: { CROSS JOIN | , } condition_join_operator: { [INNER] [ join_method ] JOIN | FULL [OUTER] [ join_method ] JOIN | LEFT [OUTER] [ join_method ] JOIN | RIGHT [OUTER] [ join_method ] JOIN } join_method: { HASH } join_condition: { on_clause | using_clause } on_clause: ON bool_expression using_clause: USING ( column_list )
The JOIN operation merges two from_items so that the SELECT clause can
query them as one source. The join operator and join condition specify how to
combine and discard rows from the two from_items to form a single source.
[INNER] JOINAn INNER JOIN, or simply JOIN, effectively calculates the Cartesian product
of the two from_items and discards all rows that don't meet the join
condition. Effectively means that it's possible to implement an INNER JOIN
without actually calculating the Cartesian product.
FROM A INNER JOIN B ON A.w = B.y
/*
Table A Table B Result
+-------+ +-------+ +---------------+
| w | x | * | y | z | = | w | x | y | z |
+-------+ +-------+ +---------------+
| 1 | a | | 2 | k | | 2 | b | 2 | k |
| 2 | b | | 3 | m | | 3 | c | 3 | m |
| 3 | c | | 3 | n | | 3 | c | 3 | n |
| 3 | d | | 4 | p | | 3 | d | 3 | m |
+-------+ +-------+ | 3 | d | 3 | n |
+---------------+
*/
FROM A INNER JOIN B USING (x)
/*
Table A Table B Result
+-------+ +-------+ +-----------+
| x | y | * | x | z | = | x | y | z |
+-------+ +-------+ +-----------+
| 1 | a | | 2 | k | | 2 | b | k |
| 2 | b | | 3 | m | | 3 | c | m |
| 3 | c | | 3 | n | | 3 | c | n |
| 3 | d | | 4 | p | | 3 | d | m |
+-------+ +-------+ | 3 | d | n |
+-----------+
*/
Example
This query performs an INNER JOIN on the Roster
and TeamMascot tables.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;
/*---------------------------+
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Buchanan | Lakers |
| Coolidge | Lakers |
| Davis | Knights |
+---------------------------*/
You can use a correlated INNER JOIN to flatten an array
into a set of rows. To learn more, see
Convert elements in an array to rows in a table.
CROSS JOINCROSS JOIN returns the Cartesian product of the two from_items. In other
words, it combines each row from the first from_item with each row from the
second from_item.
If the rows of the two from_items are independent, then the result has
M * N rows, given M rows in one from_item and N in the other. Note that
this still holds for the case when either from_item has zero rows.
In a FROM clause, a CROSS JOIN can be written like this:
FROM A CROSS JOIN B
/*
Table A Table B Result
+-------+ +-------+ +---------------+
| w | x | * | y | z | = | w | x | y | z |
+-------+ +-------+ +---------------+
| 1 | a | | 2 | c | | 1 | a | 2 | c |
| 2 | b | | 3 | d | | 1 | a | 3 | d |
+-------+ +-------+ | 2 | b | 2 | c |
| 2 | b | 3 | d |
+---------------+
*/
You can use a correlated cross join to convert or
flatten an array into a set of rows, though the (equivalent) INNER JOIN is
preferred over CROSS JOIN for this case. To learn more, see
Convert elements in an array to rows in a table.
Examples
This query performs an CROSS JOIN on the Roster
and TeamMascot tables.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster CROSS JOIN TeamMascot;
/*---------------------------+
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Adams | Knights |
| Adams | Lakers |
| Adams | Mustangs |
| Buchanan | Jaguars |
| Buchanan | Knights |
| Buchanan | Lakers |
| Buchanan | Mustangs |
| ... |
+---------------------------*/
CROSS JOINs can be written implicitly with a comma. This is
called a comma cross join.
A comma cross join looks like this in a FROM clause:
FROM A, B
/*
Table A Table B Result
+-------+ +-------+ +---------------+
| w | x | * | y | z | = | w | x | y | z |
+-------+ +-------+ +---------------+
| 1 | a | | 2 | c | | 1 | a | 2 | c |
| 2 | b | | 3 | d | | 1 | a | 3 | d |
+-------+ +-------+ | 2 | b | 2 | c |
| 2 | b | 3 | d |
+---------------+
*/
You can't write comma cross joins inside parentheses. To learn more, see Join operations in a sequence.
FROM (A, B) // INVALID
You can use a correlated comma cross join to convert or flatten an array into a set of rows. To learn more, see Convert elements in an array to rows in a table.
Examples
This query performs a comma cross join on the Roster
and TeamMascot tables.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster, TeamMascot;
/*---------------------------+
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Adams | Knights |
| Adams | Lakers |
| Adams | Mustangs |
| Buchanan | Jaguars |
| Buchanan | Knights |
| Buchanan | Lakers |
| Buchanan | Mustangs |
| ... |
+---------------------------*/
FULL [OUTER] JOINA FULL OUTER JOIN (or simply FULL JOIN) returns all fields for all matching
rows in both from_items that meet the join condition. If a given row from one
from_item doesn't join to any row in the other from_item, the row returns
with NULL values for all columns from the other from_item.
FROM A FULL OUTER JOIN B ON A.w = B.y
/*
Table A Table B Result
+-------+ +-------+ +---------------------------+
| w | x | * | y | z | = | w | x | y | z |
+-------+ +-------+ +---------------------------+
| 1 | a | | 2 | k | | 1 | a | NULL | NULL |
| 2 | b | | 3 | m | | 2 | b | 2 | k |
| 3 | c | | 3 | n | | 3 | c | 3 | m |
| 3 | d | | 4 | p | | 3 | c | 3 | n |
+-------+ +-------+ | 3 | d | 3 | m |
| 3 | d | 3 | n |
| NULL | NULL | 4 | p |
+---------------------------+
*/
FROM A FULL OUTER JOIN B USING (x)
/*
Table A Table B Result
+-------+ +-------+ +--------------------+
| x | y | * | x | z | = | x | y | z |
+-------+ +-------+ +--------------------+
| 1 | a | | 2 | k | | 1 | a | NULL |
| 2 | b | | 3 | m | | 2 | b | k |
| 3 | c | | 3 | n | | 3 | c | m |
| 3 | d | | 4 | p | | 3 | c | n |
+-------+ +-------+ | 3 | d | m |
| 3 | d | n |
| 4 | NULL | p |
+--------------------+
*/
Example
This query performs a FULL JOIN on the Roster
and TeamMascot tables.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster FULL JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;
/*---------------------------+
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Buchanan | Lakers |
| Coolidge | Lakers |
| Davis | Knights |
| Eisenhower | NULL |
| NULL | Mustangs |
+---------------------------*/
LEFT [OUTER] JOINThe result of a LEFT OUTER JOIN (or simply LEFT JOIN) for two
from_items always retains all rows of the left from_item in the
JOIN operation, even if no rows in the right from_item satisfy the join
predicate.
All rows from the left from_item are retained;
if a given row from the left from_item doesn't join to any row
in the right from_item, the row will return with NULL values for all
columns exclusively from the right from_item. Rows from the right
from_item that don't join to any row in the left from_item are discarded.
FROM A LEFT OUTER JOIN B ON A.w = B.y
/*
Table A Table B Result
+-------+ +-------+ +---------------------------+
| w | x | * | y | z | = | w | x | y | z |
+-------+ +-------+ +---------------------------+
| 1 | a | | 2 | k | | 1 | a | NULL | NULL |
| 2 | b | | 3 | m | | 2 | b | 2 | k |
| 3 | c | | 3 | n | | 3 | c | 3 | m |
| 3 | d | | 4 | p | | 3 | c | 3 | n |
+-------+ +-------+ | 3 | d | 3 | m |
| 3 | d | 3 | n |
+---------------------------+
*/
FROM A LEFT OUTER JOIN B USING (x)
/*
Table A Table B Result
+-------+ +-------+ +--------------------+
| x | y | * | x | z | = | x | y | z |
+-------+ +-------+ +--------------------+
| 1 | a | | 2 | k | | 1 | a | NULL |
| 2 | b | | 3 | m | | 2 | b | k |
| 3 | c | | 3 | n | | 3 | c | m |
| 3 | d | | 4 | p | | 3 | c | n |
+-------+ +-------+ | 3 | d | m |
| 3 | d | n |
+--------------------+
*/
Example
This query performs a LEFT JOIN on the Roster
and TeamMascot tables.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster LEFT JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;
/*---------------------------+
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Buchanan | Lakers |
| Coolidge | Lakers |
| Davis | Knights |
| Eisenhower | NULL |
+---------------------------*/
RIGHT [OUTER] JOINThe result of a RIGHT OUTER JOIN (or simply RIGHT JOIN) for two
from_items always retains all rows of the right from_item in the
JOIN operation, even if no rows in the left from_item satisfy the join
predicate.
All rows from the right from_item are returned;
if a given row from the right from_item doesn't join to any row
in the left from_item, the row will return with NULL values for all
columns exclusively from the left from_item. Rows from the left from_item
that don't join to any row in the right from_item are discarded.
FROM A RIGHT OUTER JOIN B ON A.w = B.y
/*
Table A Table B Result
+-------+ +-------+ +---------------------------+
| w | x | * | y | z | = | w | x | y | z |
+-------+ +-------+ +---------------------------+
| 1 | a | | 2 | k | | 2 | b | 2 | k |
| 2 | b | | 3 | m | | 3 | c | 3 | m |
| 3 | c | | 3 | n | | 3 | c | 3 | n |
| 3 | d | | 4 | p | | 3 | d | 3 | m |
+-------+ +-------+ | 3 | d | 3 | n |
| NULL | NULL | 4 | p |
+---------------------------+
*/
FROM A RIGHT OUTER JOIN B USING (x)
/*
Table A Table B Result
+-------+ +-------+ +--------------------+
| x | y | * | x | z | = | x | y | z |
+-------+ +-------+ +--------------------+
| 1 | a | | 2 | k | | 2 | b | k |
| 2 | b | | 3 | m | | 3 | c | m |
| 3 | c | | 3 | n | | 3 | c | n |
| 3 | d | | 4 | p | | 3 | d | m |
+-------+ +-------+ | 3 | d | n |
| 4 | NULL | p |
+--------------------+
*/
Example
This query performs a RIGHT JOIN on the Roster
and TeamMascot tables.
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster RIGHT JOIN TeamMascot ON Roster.SchoolID = TeamMascot.SchoolID;
/*---------------------------+
| LastName | Mascot |
+---------------------------+
| Adams | Jaguars |
| Buchanan | Lakers |
| Coolidge | Lakers |
| Davis | Knights |
| NULL | Mustangs |
+---------------------------*/
In a join operation, a join condition helps specify how to
combine rows in two from_items to form a single source.
The two types of join conditions are the ON clause and
USING clause. You must use a join condition when you perform a
conditional join operation. You can't use a join condition when you perform a
cross join operation.
ON clauseON bool_expression
Description
Given a row from each table, if the ON clause evaluates to TRUE, the query
generates a consolidated row with the result of combining the given rows.
Definitions:
bool_expression: The boolean expression that specifies the condition for
the join. This is frequently a comparison operation or
logical combination of comparison operators.Details:
Similarly to CROSS JOIN, ON produces a column once for each column in each
input table.
A NULL join condition evaluation is equivalent to a FALSE evaluation.
If a column-order sensitive operation such as UNION or SELECT * is used with
the ON join condition, the resulting table contains all of the columns from
the left input in order, and then all of the columns from the right input in
order.
Examples
The following examples show how to use the ON clause:
WITH
A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3),
B AS ( SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4)
SELECT * FROM A INNER JOIN B ON A.x = B.x;
WITH
A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3),
B AS ( SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4)
SELECT A.x, B.x FROM A INNER JOIN B ON A.x = B.x;
/*
Table A Table B Result (A.x, B.x)
+---+ +---+ +-------+
| x | * | x | = | x | x |
+---+ +---+ +-------+
| 1 | | 2 | | 2 | 2 |
| 2 | | 3 | | 3 | 3 |
| 3 | | 4 | +-------+
+---+ +---+
*/
WITH
A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS ( SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT * FROM A LEFT OUTER JOIN B ON A.x = B.x;
WITH
A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS ( SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT A.x, B.x FROM A LEFT OUTER JOIN B ON A.x = B.x;
/*
Table A Table B Result
+------+ +---+ +-------------+
| x | * | x | = | x | x |
+------+ +---+ +-------------+
| 1 | | 2 | | 1 | NULL |
| 2 | | 3 | | 2 | 2 |
| 3 | | 4 | | 3 | 3 |
| NULL | | 5 | | NULL | NULL |
+------+ +---+ +-------------+
*/
WITH
A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS ( SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT * FROM A FULL OUTER JOIN B ON A.x = B.x;
WITH
A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS ( SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT A.x, B.x FROM A FULL OUTER JOIN B ON A.x = B.x;
/*
Table A Table B Result
+------+ +---+ +-------------+
| x | * | x | = | x | x |
+------+ +---+ +-------------+
| 1 | | 2 | | 1 | NULL |
| 2 | | 3 | | 2 | 2 |
| 3 | | 4 | | 3 | 3 |
| NULL | | 5 | | NULL | NULL |
+------+ +---+ | NULL | 4 |
| NULL | 5 |
+-------------+
*/
USING clauseUSING ( column_name_list )
column_name_list:
column_name[, ...]
Description
When you are joining two tables, USING performs an
equality comparison operation on the columns named in
column_name_list. Each column name in column_name_list must appear in both
input tables. For each pair of rows from the input tables, if the
equality comparisons all evaluate to TRUE, one row is added to the resulting
column.
Definitions:
column_name_list: A list of columns to include in the join condition.column_name: The column that exists in both of the tables that you are
joining.Details:
A NULL join condition evaluation is equivalent to a FALSE evaluation.
If a column-order sensitive operation such as UNION or SELECT * is used
with the USING join condition, the resulting table contains columns in this
order:
column_name_list in the order they appear in the USING
clause.A column name in the USING clause must not be qualified by a
table name.
If the join is an INNER JOIN or a LEFT OUTER JOIN, the output
columns are populated from the values in the first table. If the
join is a RIGHT OUTER JOIN, the output columns are populated from the values
in the second table. If the join is a FULL OUTER JOIN, the output columns
are populated by coalescing the values from the left and right
tables in that order.
Examples
The following example shows how to use the USING clause with one
column name in the column name list:
WITH
A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 9 UNION ALL SELECT NULL),
B AS ( SELECT 2 as x UNION ALL SELECT 9 UNION ALL SELECT 9 UNION ALL SELECT 5)
SELECT * FROM A INNER JOIN B USING (x);
/*
Table A Table B Result
+------+ +---+ +---+
| x | * | x | = | x |
+------+ +---+ +---+
| 1 | | 2 | | 2 |
| 2 | | 9 | | 9 |
| 9 | | 9 | | 9 |
| NULL | | 5 | +---+
+------+ +---+
*/
The following example shows how to use the USING clause with
multiple column names in the column name list:
WITH
A AS (
SELECT 1 as x, 15 as y UNION ALL
SELECT 2, 10 UNION ALL
SELECT 9, 16 UNION ALL
SELECT NULL, 12),
B AS (
SELECT 2 as x, 10 as y UNION ALL
SELECT 9, 17 UNION ALL
SELECT 9, 16 UNION ALL
SELECT 5, 15)
SELECT * FROM A INNER JOIN B USING (x, y);
/*
Table A Table B Result
+-----------+ +---------+ +---------+
| x | y | * | x | y | = | x | y |
+-----------+ +---------+ +---------+
| 1 | 15 | | 2 | 10 | | 2 | 10 |
| 2 | 10 | | 9 | 17 | | 9 | 16 |
| 9 | 16 | | 9 | 16 | +---------+
| NULL | 12 | | 5 | 15 |
+-----------+ +---------+
*/
The following examples show additional ways in which to use the USING clause
with one column name in the column name list:
WITH
A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 9 UNION ALL SELECT NULL),
B AS ( SELECT 2 as x UNION ALL SELECT 9 UNION ALL SELECT 9 UNION ALL SELECT 5)
SELECT x, A.x, B.x FROM A INNER JOIN B USING (x)
/*
Table A Table B Result
+------+ +---+ +--------------------+
| x | * | x | = | x | A.x | B.x |
+------+ +---+ +--------------------+
| 1 | | 2 | | 2 | 2 | 2 |
| 2 | | 9 | | 9 | 9 | 9 |
| 9 | | 9 | | 9 | 9 | 9 |
| NULL | | 5 | +--------------------+
+------+ +---+
*/
WITH
A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 9 UNION ALL SELECT NULL),
B AS ( SELECT 2 as x UNION ALL SELECT 9 UNION ALL SELECT 9 UNION ALL SELECT 5)
SELECT x, A.x, B.x FROM A LEFT OUTER JOIN B USING (x)
/*
Table A Table B Result
+------+ +---+ +--------------------+
| x | * | x | = | x | A.x | B.x |
+------+ +---+ +--------------------+
| 1 | | 2 | | 1 | 1 | NULL |
| 2 | | 9 | | 2 | 2 | 2 |
| 9 | | 9 | | 9 | 9 | 9 |
| NULL | | 5 | | 9 | 9 | 9 |
+------+ +---+ | NULL | NULL | NULL |
+--------------------+
*/
WITH
A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 2 UNION ALL SELECT NULL),
B AS ( SELECT 2 as x UNION ALL SELECT 9 UNION ALL SELECT 9 UNION ALL SELECT 5)
SELECT x, A.x, B.x FROM A RIGHT OUTER JOIN B USING (x)
/*
Table A Table B Result
+------+ +---+ +--------------------+
| x | * | x | = | x | A.x | B.x |
+------+ +---+ +--------------------+
| 1 | | 2 | | 2 | 2 | 2 |
| 2 | | 9 | | 2 | 2 | 2 |
| 2 | | 9 | | 9 | NULL | 9 |
| NULL | | 5 | | 9 | NULL | 9 |
+------+ +---+ | 5 | NULL | 5 |
+--------------------+
*/
WITH
A AS ( SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 2 UNION ALL SELECT NULL),
B AS ( SELECT 2 as x UNION ALL SELECT 9 UNION ALL SELECT 9 UNION ALL SELECT 5)
SELECT x, A.x, B.x FROM A FULL OUTER JOIN B USING (x);
/*
Table A Table B Result
+------+ +---+ +--------------------+
| x | * | x | = | x | A.x | B.x |
+------+ +---+ +--------------------+
| 1 | | 2 | | 1 | 1 | NULL |
| 2 | | 9 | | 2 | 2 | 2 |
| 2 | | 9 | | 2 | 2 | 2 |
| NULL | | 5 | | NULL | NULL | NULL |
+------+ +---+ | 9 | NULL | 9 |
| 9 | NULL | 9 |
| 5 | NULL | 5 |
+--------------------+
*/
The following example shows how to use the USING clause with
only some column names in the column name list.
WITH
A AS (
SELECT 1 as x, 15 as y UNION ALL
SELECT 2, 10 UNION ALL
SELECT 9, 16 UNION ALL
SELECT NULL, 12),
B AS (
SELECT 2 as x, 10 as y UNION ALL
SELECT 9, 17 UNION ALL
SELECT 9, 16 UNION ALL
SELECT 5, 15)
SELECT * FROM A INNER JOIN B USING (x);
/*
Table A Table B Result
+-----------+ +---------+ +-----------------+
| x | y | * | x | y | = | x | A.y | B.y |
+-----------+ +---------+ +-----------------+
| 1 | 15 | | 2 | 10 | | 2 | 10 | 10 |
| 2 | 10 | | 9 | 17 | | 9 | 16 | 17 |
| 9 | 16 | | 9 | 16 | | 9 | 16 | 16 |
| NULL | 12 | | 5 | 15 | +-----------------+
+-----------+ +---------+
*/
The following query performs an INNER JOIN on the
Roster and TeamMascot table.
The query returns the rows from Roster and TeamMascot where
Roster.SchoolID is the same as TeamMascot.SchoolID. The results include a
single SchoolID column.
SELECT * FROM Roster INNER JOIN TeamMascot USING (SchoolID);
/*----------------------------------------+
| SchoolID | LastName | Mascot |
+----------------------------------------+
| 50 | Adams | Jaguars |
| 52 | Buchanan | Lakers |
| 52 | Coolidge | Lakers |
| 51 | Davis | Knights |
+----------------------------------------*/
ON and USING equivalencyThe ON and USING join conditions aren't
equivalent, but they share some rules and sometimes they can produce similar
results.
In the following examples, observe what is returned when all rows
are produced for inner and outer joins. Also, look at how
each join condition handles NULL values.
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4)
SELECT * FROM A INNER JOIN B ON A.x = B.x;
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4)
SELECT * FROM A INNER JOIN B USING (x);
/*
Table A Table B Result ON Result USING
+---+ +---+ +-------+ +---+
| x | * | x | = | x | x | | x |
+---+ +---+ +-------+ +---+
| 1 | | 2 | | 2 | 2 | | 2 |
| 2 | | 3 | | 3 | 3 | | 3 |
| 3 | | 4 | +-------+ +---+
+---+ +---+
*/
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT * FROM A LEFT OUTER JOIN B ON A.x = B.x;
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT * FROM A LEFT OUTER JOIN B USING (x);
/*
Table A Table B Result ON Result USING
+------+ +---+ +-------------+ +------+
| x | * | x | = | x | x | | x |
+------+ +---+ +-------------+ +------+
| 1 | | 2 | | 1 | NULL | | 1 |
| 2 | | 3 | | 2 | 2 | | 2 |
| 3 | | 4 | | 3 | 3 | | 3 |
| NULL | | 5 | | NULL | NULL | | NULL |
+------+ +---+ +-------------+ +------+
*/
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4)
SELECT * FROM A FULL OUTER JOIN B ON A.x = B.x;
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4)
SELECT * FROM A FULL OUTER JOIN B USING (x);
/*
Table A Table B Result ON Result USING
+---+ +---+ +-------------+ +---+
| x | * | x | = | x | x | | x |
+---+ +---+ +-------------+ +---+
| 1 | | 2 | | 1 | NULL | | 1 |
| 2 | | 3 | | 2 | 2 | | 2 |
| 3 | | 4 | | 3 | 3 | | 3 |
+---+ +---+ | NULL | 4 | | 4 |
+-------------+ +---+
*/
Although ON and USING aren't equivalent, they can return the same
results in some situations if you specify the columns you want to return.
In the following examples, observe what is returned when a specific row
is produced for inner and outer joins. Also, look at how each
join condition handles NULL values.
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT A.x FROM A INNER JOIN B ON A.x = B.x;
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT x FROM A INNER JOIN B USING (x);
/*
Table A Table B Result ON Result USING
+------+ +---+ +---+ +---+
| x | * | x | = | x | | x |
+------+ +---+ +---+ +---+
| 1 | | 2 | | 2 | | 2 |
| 2 | | 3 | | 3 | | 3 |
| 3 | | 4 | +---+ +---+
| NULL | | 5 |
+------+ +---+
*/
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT A.x FROM A LEFT OUTER JOIN B ON A.x = B.x;
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT x FROM A LEFT OUTER JOIN B USING (x);
/*
Table A Table B Result ON Result USING
+------+ +---+ +------+ +------+
| x | * | x | = | x | | x |
+------+ +---+ +------+ +------+
| 1 | | 2 | | 1 | | 1 |
| 2 | | 3 | | 2 | | 2 |
| 3 | | 4 | | 3 | | 3 |
| NULL | | 5 | | NULL | | NULL |
+------+ +---+ +------+ +------+
*/
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT A.x FROM A FULL OUTER JOIN B ON A.x = B.x;
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT x FROM A FULL OUTER JOIN B USING (x);
/*
Table A Table B Result ON Result USING
+------+ +---+ +------+ +------+
| x | * | x | = | x | | x |
+------+ +---+ +------+ +------+
| 1 | | 2 | | 1 | | 1 |
| 2 | | 3 | | 2 | | 2 |
| 3 | | 4 | | 3 | | 3 |
| NULL | | 5 | | NULL | | NULL |
+------+ +---+ | NULL | | 4 |
| NULL | | 5 |
+------+ +------+
*/
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT B.x FROM A FULL OUTER JOIN B ON A.x = B.x;
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT x FROM A FULL OUTER JOIN B USING (x);
/*
Table A Table B Result ON Result USING
+------+ +---+ +------+ +------+
| x | * | x | = | x | | x |
+------+ +---+ +------+ +------+
| 1 | | 2 | | 2 | | 1 |
| 2 | | 3 | | 3 | | 2 |
| 3 | | 4 | | NULL | | 3 |
| NULL | | 5 | | NULL | | NULL |
+------+ +---+ | 4 | | 4 |
| 5 | | 5 |
+------+ +------+
*/
In the following example, observe what is returned when COALESCE is used
with the ON clause. It provides the same results as a query
with the USING clause.
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT COALESCE(A.x, B.x) FROM A FULL OUTER JOIN B ON A.x = B.x;
WITH
A AS (SELECT 1 as x UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT NULL),
B AS (SELECT 2 as x UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5)
SELECT x FROM A FULL OUTER JOIN B USING (x);
/*
Table A Table B Result ON Result USING
+------+ +---+ +------+ +------+
| x | * | x | = | x | | x |
+------+ +---+ +------+ +------+
| 1 | | 2 | | 1 | | 1 |
| 2 | | 3 | | 2 | | 2 |
| 3 | | 4 | | 3 | | 3 |
| NULL | | 5 | | NULL | | NULL |
+------+ +---+ | 4 | | 4 |
| 5 | | 5 |
+------+ +------+
*/
The FROM clause can contain multiple JOIN operations in a sequence.
JOINs are bound from left to right. For example:
FROM A JOIN B USING (x) JOIN C USING (x)
-- A JOIN B USING (x) = result_1
-- result_1 JOIN C USING (x) = result_2
-- result_2 = return value
You can also insert parentheses to group JOINs:
FROM ( (A JOIN B USING (x)) JOIN C USING (x) )
-- A JOIN B USING (x) = result_1
-- result_1 JOIN C USING (x) = result_2
-- result_2 = return value
With parentheses, you can group JOINs so that they are bound in a different
order:
FROM ( A JOIN (B JOIN C USING (x)) USING (x) )
-- B JOIN C USING (x) = result_1
-- A JOIN result_1 = result_2
-- result_2 = return value
When comma cross joins are present in a query with a sequence of JOINs, they
group from left to right like other JOIN types:
FROM A JOIN B USING (x) JOIN C USING (x), D
-- A JOIN B USING (x) = result_1
-- result_1 JOIN C USING (x) = result_2
-- result_2 CROSS JOIN D = return value
There can't be a RIGHT JOIN or FULL JOIN after a comma cross join unless
it's parenthesized:
FROM A, B RIGHT JOIN C ON TRUE // INVALID
FROM A, B FULL JOIN C ON TRUE // INVALID
FROM A, B JOIN C ON TRUE // VALID
FROM A, (B RIGHT JOIN C ON TRUE) // VALID
FROM A, (B FULL JOIN C ON TRUE) // VALID
A join operation is correlated when the right from_item contains a
reference to at least one range variable or
column name introduced by the left from_item.
In a correlated join operation, rows from the right from_item are determined
by a row from the left from_item. Consequently, RIGHT OUTER and FULL OUTER
joins can't be correlated because right from_item rows can't be determined
in the case when there is no row from the left from_item.
All correlated join operations must reference an array in the right from_item.
This is a conceptual example of a correlated join operation that includes a correlated subquery:
FROM A JOIN UNNEST(ARRAY(SELECT AS STRUCT * FROM B WHERE A.ID = B.ID)) AS C
from_item: Afrom_item: UNNEST(...) AS C(SELECT AS STRUCT * FROM B WHERE A.ID = B.ID)This is another conceptual example of a correlated join operation.
array_of_IDs is part of the left from_item but is referenced in the
right from_item.
FROM A JOIN UNNEST(A.array_of_IDs) AS C
The UNNEST operator can be explicit or implicit.
These are both allowed:
FROM A JOIN UNNEST(A.array_of_IDs) AS IDs
FROM A JOIN A.array_of_IDs AS IDs
In a correlated join operation, the right from_item is re-evaluated
against each distinct row from the left from_item. In the following
conceptual example, the correlated join operation first
evaluates A and B, then A and C:
FROM
A
JOIN
UNNEST(ARRAY(SELECT AS STRUCT * FROM B WHERE A.ID = B.ID)) AS C
ON A.Name = C.Name
Caveats
LEFT JOIN, when the input table on the right side is empty
for some row from the left side, it's as if no rows from the right side
satisfied the join condition in a regular LEFT JOIN. When there are no
joining rows, a row with NULL values for all columns on the right side is
generated to join with the row from the left side.CROSS JOIN, when the input table on the right side is
empty for some row from the left side, it's as if no rows from the right
side satisfied the join condition in a regular correlated INNER JOIN. This
means that the row is dropped from the results.Examples
This is an example of a correlated join, using the Roster and PlayerStats tables:
SELECT *
FROM
Roster
JOIN
UNNEST(
ARRAY(
SELECT AS STRUCT *
FROM PlayerStats
WHERE PlayerStats.OpponentID = Roster.SchoolID
)) AS PlayerMatches
ON PlayerMatches.LastName = 'Buchanan'
/*------------+----------+----------+------------+--------------+
| LastName | SchoolID | LastName | OpponentID | PointsScored |
+------------+----------+----------+------------+--------------+
| Adams | 50 | Buchanan | 50 | 13 |
| Eisenhower | 77 | Buchanan | 77 | 0 |
+------------+----------+----------+------------+--------------*/
A common pattern for a correlated LEFT JOIN is to have an UNNEST operation
on the right side that references an array from some column introduced by
input on the left side. For rows where that array is empty or NULL,
the UNNEST operation produces no rows on the right input. In that case, a row
with a NULL entry in each column of the right input is created to join with
the row from the left input. For example:
SELECT A.name, item, ARRAY_LENGTH(A.items) item_count_for_name
FROM
UNNEST(
[
STRUCT(
'first' AS name,
[1, 2, 3, 4] AS items),
STRUCT(
'second' AS name,
[] AS items)]) AS A
LEFT JOIN
A.items AS item;
/*--------+------+---------------------+
| name | item | item_count_for_name |
+--------+------+---------------------+
| first | 1 | 4 |
| first | 2 | 4 |
| first | 3 | 4 |
| first | 4 | 4 |
| second | NULL | 0 |
+--------+------+---------------------*/
In the case of a correlated INNER JOIN or CROSS JOIN, when the input on the
right side is empty for some row from the left side, the final row is dropped
from the results. For example:
SELECT A.name, item
FROM
UNNEST(
[
STRUCT(
'first' AS name,
[1, 2, 3, 4] AS items),
STRUCT(
'second' AS name,
[] AS items)]) AS A
INNER JOIN
A.items AS item;
/*-------+------+
| name | item |
+-------+------+
| first | 1 |
| first | 2 |
| first | 3 |
| first | 4 |
+-------+------*/
WHERE clauseWHERE bool_expression
The WHERE clause filters the results of the FROM clause.
Only rows whose bool_expression evaluates to TRUE are included. Rows
whose bool_expression evaluates to NULL or FALSE are
discarded.
The evaluation of a query with a WHERE clause is typically completed in this
order:
FROMWHEREGROUP BY and aggregationHAVINGDISTINCTORDER BYLIMITEvaluation order doesn't always match syntax order.
The WHERE clause only references columns available via the FROM clause;
it can't reference SELECT list aliases.
Examples
This query returns returns all rows from the Roster table
where the SchoolID column has the value 52:
SELECT * FROM Roster
WHERE SchoolID = 52;
The bool_expression can contain multiple sub-conditions:
SELECT * FROM Roster
WHERE STARTS_WITH(LastName, "Mc") OR STARTS_WITH(LastName, "Mac");
Expressions in an INNER JOIN have an equivalent expression in the
WHERE clause. For example, a query using INNER JOIN and ON has an
equivalent expression using CROSS JOIN and WHERE. For example,
the following two queries are equivalent:
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster INNER JOIN TeamMascot
ON Roster.SchoolID = TeamMascot.SchoolID;
SELECT Roster.LastName, TeamMascot.Mascot
FROM Roster CROSS JOIN TeamMascot
WHERE Roster.SchoolID = TeamMascot.SchoolID;
GROUP BY clauseGROUP [ group_hint_expr ] BY groupable_items
Description
The GROUP BY clause groups together rows in a table that share common values
for certain columns. For a group of rows in the source table with
non-distinct values, the GROUP BY clause aggregates them into a single
combined row. This clause is commonly used when aggregate functions are
present in the SELECT list, or to eliminate redundancy in the output.
Definitions
group_hint_expr: Hints that you can apply to the
GROUP BY clause. To learn more, see Group hints.groupable_items: Group rows in a table that share common values
for certain columns. To learn more, see
Group rows by groupable items.GROUP BY groupable_item[, ...] groupable_item: { value | value_alias | column_ordinal }
Description
The GROUP BY clause can include groupable expressions
and their ordinals.
Definitions
value: An expression that represents a non-distinct, groupable value.
To learn more, see Group rows by values.value_alias: An alias for value.
To learn more, see Group rows by values.column_ordinal: An INT64 value that represents the ordinal assigned to a
groupable expression in the SELECT list.
To learn more, see Group rows by column ordinals.The GROUP BY clause can group rows in a table with non-distinct
values in the GROUP BY clause. For example:
WITH PlayerStats AS (
SELECT 'Adams' as LastName, 'Noam' as FirstName, 3 as PointsScored UNION ALL
SELECT 'Buchanan', 'Jie', 0 UNION ALL
SELECT 'Coolidge', 'Kiran', 1 UNION ALL
SELECT 'Adams', 'Noam', 4 UNION ALL
SELECT 'Buchanan', 'Jie', 13)
SELECT SUM(PointsScored) AS total_points, LastName
FROM PlayerStats
GROUP BY LastName;
/*--------------+----------+
| total_points | LastName |
+--------------+----------+
| 7 | Adams |
| 13 | Buchanan |
| 1 | Coolidge |
+--------------+----------*/
GROUP BY clauses may also refer to aliases. If a query contains aliases in
the SELECT clause, those aliases override names in the corresponding FROM
clause. For example:
WITH PlayerStats AS (
SELECT 'Adams' as LastName, 'Noam' as FirstName, 3 as PointsScored UNION ALL
SELECT 'Buchanan', 'Jie', 0 UNION ALL
SELECT 'Coolidge', 'Kiran', 1 UNION ALL
SELECT 'Adams', 'Noam', 4 UNION ALL
SELECT 'Buchanan', 'Jie', 13)
SELECT SUM(PointsScored) AS total_points, LastName AS last_name
FROM PlayerStats
GROUP BY last_name;
/*--------------+-----------+
| total_points | last_name |
+--------------+-----------+
| 7 | Adams |
| 13 | Buchanan |
| 1 | Coolidge |
+--------------+-----------*/
To learn more about the data types that are supported for values in the
GROUP BY clause, see Groupable data types.
The GROUP BY clause can refer to expression names in the SELECT list. The
GROUP BY clause also allows ordinal references to expressions in the SELECT
list, using integer values. 1 refers to the first value in the
SELECT list, 2 the second, and so forth. The value list can combine
ordinals and value names. The following queries are equivalent:
WITH PlayerStats AS (
SELECT 'Adams' as LastName, 'Noam' as FirstName, 3 as PointsScored UNION ALL
SELECT 'Buchanan', 'Jie', 0 UNION ALL
SELECT 'Coolidge', 'Kiran', 1 UNION ALL
SELECT 'Adams', 'Noam', 4 UNION ALL
SELECT 'Buchanan', 'Jie', 13)
SELECT SUM(PointsScored) AS total_points, LastName, FirstName
FROM PlayerStats
GROUP BY LastName, FirstName;
/*--------------+----------+-----------+
| total_points | LastName | FirstName |
+--------------+----------+-----------+
| 7 | Adams | Noam |
| 13 | Buchanan | Jie |
| 1 | Coolidge | Kiran |
+--------------+----------+-----------*/
WITH PlayerStats AS (
SELECT 'Adams' as LastName, 'Noam' as FirstName, 3 as PointsScored UNION ALL
SELECT 'Buchanan', 'Jie', 0 UNION ALL
SELECT 'Coolidge', 'Kiran', 1 UNION ALL
SELECT 'Adams', 'Noam', 4 UNION ALL
SELECT 'Buchanan', 'Jie', 13)
SELECT SUM(PointsScored) AS total_points, LastName, FirstName
FROM PlayerStats
GROUP BY 2, 3;
/*--------------+----------+-----------+
| total_points | LastName | FirstName |
+--------------+----------+-----------+
| 7 | Adams | Noam |
| 13 | Buchanan | Jie |
| 1 | Coolidge | Kiran |
+--------------+----------+-----------*/
HAVING clauseHAVING bool_expression
The HAVING clause filters the results produced by GROUP BY or
aggregation. GROUP BY or aggregation must be present in the query. If
aggregation is present, the HAVING clause is evaluated once for every
aggregated row in the result set.
Only rows whose bool_expression evaluates to TRUE are included. Rows
whose bool_expression evaluates to NULL or FALSE are
discarded.
The evaluation of a query with a HAVING clause is typically completed in this
order:
FROMWHEREGROUP BY and aggregationHAVINGDISTINCTORDER BYLIMITEvaluation order doesn't always match syntax order.
The HAVING clause references columns available via the FROM clause, as
well as SELECT list aliases. Expressions referenced in the HAVING clause
must either appear in the GROUP BY clause or they must be the result of an
aggregate function:
SELECT LastName
FROM Roster
GROUP BY LastName
HAVING SUM(PointsScored) > 15;
If a query contains aliases in the SELECT clause, those aliases override names
in a FROM clause.
SELECT LastName, SUM(PointsScored) AS ps
FROM Roster
GROUP BY LastName
HAVING ps > 0;
Aggregation doesn't have to be present in the HAVING clause itself, but
aggregation must be present in at least one of the following forms:
SELECT list.SELECT LastName, SUM(PointsScored) AS total
FROM PlayerStats
GROUP BY LastName
HAVING total > 15;
HAVING clause.SELECT LastName
FROM PlayerStats
GROUP BY LastName
HAVING SUM(PointsScored) > 15;
SELECT list and HAVING clause.When aggregation functions are present in both the SELECT list and HAVING
clause, the aggregation functions and the columns they reference don't need
to be the same. In the example below, the two aggregation functions,
COUNT() and SUM(), are different and also use different columns.
SELECT LastName, COUNT(*)
FROM PlayerStats
GROUP BY LastName
HAVING SUM(PointsScored) > 15;
ORDER BY clause
ORDER BY expression
[COLLATE collation_specification]
[{ ASC | DESC }]
[, ...]
collation_specification:
language_tag[:collation_attribute]
The ORDER BY clause specifies a column or expression as the sort criterion for
the result set. If an ORDER BY clause isn't present, the order of the results
of a query isn't defined. Column aliases from a FROM clause or SELECT list
are allowed. If a query contains aliases in the SELECT clause, those aliases
override names in the corresponding FROM clause. The data type of
expression must be orderable.
Optional Clauses
COLLATE: You can use the COLLATE clause to refine how data is ordered
by an ORDER BY clause. Collation refers to a set of rules that determine
how strings are compared according to the conventions and
standards of a particular written language, region, or country. These rules
might define the correct character sequence, with options for specifying
case-insensitivity. You can use COLLATE only on columns of type STRING.
collation_specification represents the collation specification for the
COLLATE clause. The collation specification can be a string literal or
a query parameter. To learn more see
collation specification details.
ASC | DESC: Sort the results in ascending or descending
order of expression values. ASC is the
default value.
Examples
Use the default sort order (ascending).
SELECT x, y
FROM (SELECT 1 AS x, true AS y UNION ALL
SELECT 9, true)
ORDER BY x;
/*------+-------+
| x | y |
+------+-------+
| 1 | true |
| 9 | true |
+------+-------*/
Use descending sort order.
SELECT x, y
FROM (SELECT 1 AS x, true AS y UNION ALL
SELECT 9, true)
ORDER BY x DESC;
/*------+-------+
| x | y |
+------+-------+
| 9 | true |
| 1 | true |
+------+-------*/
It's possible to order by multiple columns. In the example below, the result
set is ordered first by SchoolID and then by LastName:
SELECT LastName, PointsScored, OpponentID
FROM PlayerStats
ORDER BY SchoolID, LastName;
When used in conjunction with
set operators,
the ORDER BY clause applies to the result set of the entire query; it doesn't
apply only to the closest SELECT statement. For this reason, it can be helpful
(though it isn't required) to use parentheses to show the scope of the ORDER
BY.
This query without parentheses:
SELECT * FROM Roster
UNION ALL
SELECT * FROM TeamMascot
ORDER BY SchoolID;
is equivalent to this query with parentheses:
( SELECT * FROM Roster
UNION ALL
SELECT * FROM TeamMascot )
ORDER BY SchoolID;
but isn't equivalent to this query, where the ORDER BY clause applies only to
the second SELECT statement:
SELECT * FROM Roster
UNION ALL
( SELECT * FROM TeamMascot
ORDER BY SchoolID );
You can also use integer literals as column references in ORDER BY clauses. An
integer literal becomes an ordinal (for example, counting starts at 1) into
the SELECT list.
Example - the following two queries are equivalent:
SELECT SUM(PointsScored), LastName
FROM PlayerStats
GROUP BY LastName
ORDER BY LastName;
SELECT SUM(PointsScored), LastName
FROM PlayerStats
GROUP BY 2
ORDER BY 2;
Collate results using English - Canada:
SELECT Place
FROM Locations
ORDER BY Place COLLATE "en_CA"
Collate results using a parameter:
#@collate_param = "arg_EG"
SELECT Place
FROM Locations
ORDER BY Place COLLATE @collate_param
Using multiple COLLATE clauses in a statement:
SELECT APlace, BPlace, CPlace
FROM Locations
ORDER BY APlace COLLATE "en_US" ASC,
BPlace COLLATE "ar_EG" DESC,
CPlace COLLATE "en" DESC
Case insensitive collation:
SELECT Place
FROM Locations
ORDER BY Place COLLATE "en_US:ci"
Default Unicode case-insensitive collation:
SELECT Place
FROM Locations
ORDER BY Place COLLATE "und:ci"
query_expr { UNION { ALL | DISTINCT } | INTERSECT { ALL | DISTINCT } | EXCEPT { ALL | DISTINCT } } query_expr
Set operators combine or filter results from two or more input queries into a single result set.
Definitions
query_expr: One of two input queries whose results are combined or filtered
into a single result set.UNION: Returns the combined results of the left and right input queries.
Values in columns that are matched by position are concatenated vertically.INTERSECT: Returns rows that are found in the results of both the left and
right input queries.EXCEPT: Returns rows from the left input query that aren't present in the
right input query.ALL: Executes the set operation on all rows.DISTINCT: Excludes duplicate rows in the set operation.Positional column matching
Other column-related rules
UNION
ALL, all column types must support
equality comparison.Parenthesized set operators
UNION ALL and UNION DISTINCT are considered different.EXCEPT set operations, for example,
query results can vary depending on the operation grouping.The following examples illustrate the use of parentheses with set operations:
-- Same set operations, no parentheses.
query1
UNION ALL
query2
UNION ALL
query3;
-- Different set operations, parentheses needed.
query1
UNION ALL
(
query2
UNION DISTINCT
query3
);
-- Invalid
query1
UNION ALL
query2
UNION DISTINCT
query3;
-- Same set operations, no parentheses.
query1
EXCEPT ALL
query2
EXCEPT ALL
query3;
-- Equivalent query with optional parentheses, returns same results.
(
query1
EXCEPT ALL
query2
)
EXCEPT ALL
query3;
-- Different execution order with a subquery, parentheses needed.
query1
EXCEPT ALL
(
query2
EXCEPT ALL
query3
);
Set operator behavior with duplicate rows
Consider a given row R that appears exactly m times in the first input query
and n times in the second input query, where m >= 0 and n >= 0:
UNION ALL, row R appears exactly m + n times in the
result.INTERSECT ALL, row R appears exactly MIN(m, n) times in the
result.EXCEPT ALL, row R appears exactly MAX(m - n, 0) times in the
result.UNION DISTINCT, the DISTINCT
is computed after the UNION is computed, so row R appears exactly
one time.INTERSECT DISTINCT, row R appears once in the output if m > 0 and
n > 0.EXCEPT DISTINCT, row R appears once in the output if
m > 0 and n = 0.UNIONThe UNION operator returns the combined results of the left and right input
queries. Columns are matched according to the rules described previously and
rows are concatenated vertically.
Examples
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2, 3]) AS number
UNION ALL
SELECT 1;
/*--------+
| number |
+--------+
| 1 |
| 2 |
| 3 |
| 1 |
+--------*/
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2, 3]) AS number
UNION DISTINCT
SELECT 1;
/*--------+
| number |
+--------+
| 1 |
| 2 |
| 3 |
+--------*/
The following example shows multiple chained operators:
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2, 3]) AS number
UNION DISTINCT
SELECT 1
UNION DISTINCT
SELECT 2;
/*--------+
| number |
+--------+
| 1 |
| 2 |
| 3 |
+--------*/
INTERSECTThe INTERSECT operator returns rows that are found in the results of both the
left and right input queries.
Examples
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2, 3, 3, 4]) AS number
INTERSECT ALL
SELECT * FROM UNNEST(ARRAY<INT64>[2, 3, 3, 5]) AS number;
/*--------+
| number |
+--------+
| 2 |
| 3 |
| 3 |
+--------*/
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2, 3, 3, 4]) AS number
INTERSECT DISTINCT
SELECT * FROM UNNEST(ARRAY<INT64>[2, 3, 3, 5]) AS number;
/*--------+
| number |
+--------+
| 2 |
| 3 |
+--------*/
The following example shows multiple chained operations:
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2, 3, 3, 4]) AS number
INTERSECT DISTINCT
SELECT * FROM UNNEST(ARRAY<INT64>[2, 3, 3, 5]) AS number
INTERSECT DISTINCT
SELECT * FROM UNNEST(ARRAY<INT64>[3, 3, 4, 5]) AS number;
/*--------+
| number |
+--------+
| 3 |
+--------*/
EXCEPTThe EXCEPT operator returns rows from the left input query that aren't present
in the right input query.
Examples
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2, 3, 3, 4]) AS number
EXCEPT ALL
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2]) AS number;
/*--------+
| number |
+--------+
| 3 |
| 3 |
| 4 |
+--------*/
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2, 3, 3, 4]) AS number
EXCEPT DISTINCT
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2]) AS number;
/*--------+
| number |
+--------+
| 3 |
| 4 |
+--------*/
The following example shows multiple chained operations:
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2, 3, 3, 4]) AS number
EXCEPT DISTINCT
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2]) AS number
EXCEPT DISTINCT
SELECT * FROM UNNEST(ARRAY<INT64>[1, 4]) AS number;
/*--------+
| number |
+--------+
| 3 |
+--------*/
The following example modifies the execution behavior of the set operations. The
first input query is used against the result of the last two input queries
instead of the values of the last two queries individually. In this example,
the EXCEPT result of the last two input queries is 2. Therefore, the
EXCEPT results of the entire query are any values other than 2 in the first
input query.
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2, 3, 3, 4]) AS number
EXCEPT DISTINCT
(
SELECT * FROM UNNEST(ARRAY<INT64>[1, 2]) AS number
EXCEPT DISTINCT
SELECT * FROM UNNEST(ARRAY<INT64>[1, 4]) AS number
);
/*--------+
| number |
+--------+
| 1 |
| 3 |
| 4 |
+--------*/
LIMIT and OFFSET clauseLIMIT count [ OFFSET skip_rows ]
Limits the number of rows to return in a query. Optionally includes the ability to skip over rows.
Definitions
LIMIT: Limits the number of rows to produce.
count is an INT64 constant expression that represents the
non-negative, non-NULL limit. No more than count rows are produced.
LIMIT 0 returns 0 rows.
If there is a set operation, LIMIT is applied after the set operation is
evaluated.
OFFSET: Skips a specific number of rows before applying LIMIT.
skip_rows is an INT64 constant expression that represents
the non-negative, non-NULL number of rows to skip.
Details
The rows that are returned by LIMIT and OFFSET have undefined order unless
these clauses are used after ORDER BY.
A constant expression can be represented by a general expression, literal, or parameter value.
Examples
SELECT *
FROM UNNEST(ARRAY<STRING>['a', 'b', 'c', 'd', 'e']) AS letter
ORDER BY letter ASC LIMIT 2;
/*---------+
| letter |
+---------+
| a |
| b |
+---------*/
SELECT *
FROM UNNEST(ARRAY<STRING>['a', 'b', 'c', 'd', 'e']) AS letter
ORDER BY letter ASC LIMIT 3 OFFSET 1;
/*---------+
| letter |
+---------+
| b |
| c |
| d |
+---------*/
FOR UPDATE clauseSELECT expression
FOR UPDATE;
UPDATE
expression;
In serializable isolation,
when you use the SELECT query to scan a table, add a FOR UPDATE clause to
enable exclusive locks at the row-and-column granularity level, otherwise known
as cell-level. The lock remains in place for
the lifetime of the read-write transaction. During this time, the FOR UPDATE
clause prevents other transactions from modifying the locked cells until the
current transaction completes. For more information, see
Use SELECT FOR UPDATE in serializable isolation.
Unlike in serializable isolation, FOR UPDATE doesn't acquire locks under
repeatable read isolation. For more information, see
Use SELECT FOR UPDATE in repeatable read isolation.
Example:
SELECT MarketingBudget
FROM Albums
WHERE SingerId = 1 and AlbumId = 1
FOR UPDATE;
UPDATE Albums
SET MarketingBudget = 100000
WHERE SingerId = 1 and AlbumId = 1;
You can't use the FOR UPDATE clause in the following ways:
LOCK_SCANNED_RANGES hintWITH clause
WITH cte[, ...]
A WITH clause contains one or more common table expressions (CTEs).
A CTE acts like a temporary table that you can reference within a single
query expression. Each CTE binds the results of a subquery
to a table name, which can be used elsewhere in the same query expression,
but rules apply.
cte: cte_name AS ( query_expr )
A common table expression (CTE) contains a subquery and a name associated with the CTE.
WITH clause, but rules apply.In this example, a WITH clause defines two CTEs that
are referenced in the related set operation, where one CTE is referenced by
each of the set operation's input query expressions:
WITH subQ1 AS (SELECT SchoolID FROM Roster),
subQ2 AS (SELECT OpponentID FROM PlayerStats)
SELECT * FROM subQ1
UNION ALL
SELECT * FROM subQ2
WITH isn't supported in a subquery. This returns an error:
SELECT account
FROM (
WITH result AS (SELECT * FROM NPCs)
SELECT *
FROM result)
You can use a FOR UPDATE clause in a CTE subquery to lock the scanned
range of the subquery.
The following query exclusively
locks col1 and col2 in table t.
WITH t1 AS (SELECT col1, col2 FROM t FOR UPDATE)
SELECT * FROM t1;
However, a FOR UPDATE clause in the outer query won't propagate into the CTE.
In the following example, an exclusive lock won't apply to any cells in table t.
WITH t2 AS (SELECT col1, col2 FROM t)
SELECT * FROM t2 FOR UPDATE;
WITH clause isn't supported in DML statements.
Temporary tables defined by the WITH clause are stored in memory.
Spanner dynamically allocates memory for all temporary tables
created by a query. If the available resources aren't sufficient then the query
will fail.
Common table expressions (CTEs) can be referenced inside the query expression
that contains the WITH clause.
Here are some general rules and constraints to consider when working with CTEs:
WITH clause must have a unique name.WITH clause is only visible to other CTEs in the same
WITH clause that were defined after it.UNION DISTINCT isn't allowed inside a WITH RECURSIVE clause.References between common table expressions (CTEs) in the WITH clause can go
backward but not forward.
This is what happens when you have two CTEs that reference
themselves or each other in a WITH clause. Assume that A is the first CTE and B
is the second CTE in the clause:
This produces an error. A can't reference itself because self-references
aren't supported:
WITH
A AS (SELECT 1 AS n UNION ALL (SELECT n + 1 FROM A WHERE n < 3))
SELECT * FROM A
-- Error
This produces an error. A can't reference B because references between
CTEs can go backwards but not forwards:
WITH
A AS (SELECT * FROM B),
B AS (SELECT 1 AS n)
SELECT * FROM B
-- Error
B can reference A because references between CTEs can go backwards:
WITH
A AS (SELECT 1 AS n),
B AS (SELECT * FROM A)
SELECT * FROM B
/*---+
| n |
+---+
| 1 |
+---*/
This produces an error. A and B reference each other, which creates a
cycle:
WITH
A AS (SELECT * FROM B),
B AS (SELECT * FROM A)
SELECT * FROM B
-- Error
An alias is a temporary name given to a table, column, or expression present in
a query. You can introduce explicit aliases in the SELECT list or FROM
clause, or GoogleSQL infers an implicit alias for some expressions.
Expressions with neither an explicit nor implicit alias are anonymous and the
query can't reference them by name.
You can introduce explicit aliases in either the FROM clause or the SELECT
list.
In a FROM clause, you can introduce explicit aliases for any item, including
tables, arrays, subqueries, and UNNEST clauses, using [AS] alias. The AS
keyword is optional.
Example:
SELECT s.FirstName, s2.SongName
FROM Singers AS s, (SELECT * FROM Songs) AS s2;
You can introduce explicit aliases for any expression in the SELECT list using
[AS] alias. The AS keyword is optional.
Example:
SELECT s.FirstName AS name, LOWER(s.FirstName) AS lname
FROM Singers s;
In the SELECT list, if there is an expression that doesn't have an explicit
alias, GoogleSQL assigns an implicit alias according to the following
rules. There can be multiple columns with the same alias in the SELECT list.
SELECT abc
implies AS abc.SELECT abc.def.ghi implies AS ghi.SELECT (struct_function()).fname implies AS
fname.In all other cases, there is no implicit alias, so the column is anonymous and can't be referenced by name. The data from that column will still be returned and the displayed query results may have a generated label for that column, but the label can't be used like an alias.
In a FROM clause, from_items aren't required to have an alias. The
following rules apply:
FROM abc implies AS abc.
FROM abc.def.ghi implies AS ghi
WITH OFFSET has the implicit alias
offset.
FROM UNNEST(x) doesn't have an implicit alias.
After you introduce an explicit alias in a query, there are restrictions on where else in the query you can reference that alias. These restrictions on alias visibility are the result of GoogleSQL name scoping rules.
FROM clauseGoogleSQL processes aliases in a FROM clause from left to right,
and aliases are visible only to subsequent path expressions in a FROM
clause.
Example:
Assume the Singers table had a Concerts column of ARRAY type.
SELECT FirstName
FROM Singers AS s, s.Concerts;
Invalid:
SELECT FirstName
FROM s.Concerts, Singers AS s; // INVALID.
FROM clause aliases are not visible to subqueries in the same FROM
clause. Subqueries in a FROM clause can't contain correlated references to
other tables in the same FROM clause.
Invalid:
SELECT FirstName
FROM Singers AS s, (SELECT (2020 - ReleaseDate) FROM s) // INVALID.
You can use any column name from a table in the FROM as an alias anywhere in
the query, with or without qualification with the table name.
Example:
SELECT FirstName, s.ReleaseDate
FROM Singers s WHERE ReleaseDate = 1975;
If the FROM clause contains an explicit alias, you must use the explicit alias
instead of the implicit alias for the remainder of the query (see
Implicit Aliases). A table alias is useful for brevity or
to eliminate ambiguity in cases such as self-joins, where the same table is
scanned multiple times during query processing.
Example:
SELECT * FROM Singers as s, Songs as s2
ORDER BY s.LastName
Invalid — ORDER BY doesn't use the table alias:
SELECT * FROM Singers as s, Songs as s2
ORDER BY Singers.LastName; // INVALID.
SELECT listAliases in the SELECT list are visible only to the following clauses:
GROUP BY clauseORDER BY clauseHAVING clauseExample:
SELECT LastName AS last, SingerID
FROM Singers
ORDER BY last;
GROUP BY, ORDER BY, and HAVING clausesThese three clauses, GROUP BY, ORDER BY, and HAVING, can refer to only the
following values:
FROM clause and any of their columns.SELECT list.GROUP BY and ORDER BY can also refer to a third group:
SELECT list. The integer 1
refers to the first item in the SELECT list, 2 refers to the second item,
etc.Example:
SELECT SingerID AS sid, COUNT(Songid) AS s2id
FROM Songs
GROUP BY 1
ORDER BY 2 DESC;
The previous query is equivalent to:
SELECT SingerID AS sid, COUNT(Songid) AS s2id
FROM Songs
GROUP BY sid
ORDER BY s2id DESC;
A SELECT list or subquery containing multiple explicit or implicit aliases
of the same name is allowed, as long as the alias name isn't referenced
elsewhere in the query, since the reference would be
ambiguous.
Example:
SELECT 1 AS a, 2 AS a;
/*---+---+
| a | a |
+---+---+
| 1 | 2 |
+---+---*/
GoogleSQL provides an error if accessing a name is ambiguous, meaning it can resolve to more than one unique object in the query or in a table schema, including the schema of a destination table.
The following query contains column names that conflict between tables, since
both Singers and Songs have a column named SingerID:
SELECT SingerID
FROM Singers, Songs;
The following query contains aliases that are ambiguous in the GROUP BY clause
because they are duplicated in the SELECT list:
SELECT FirstName AS name, LastName AS name,
FROM Singers
GROUP BY name;
The following query contains aliases that are ambiguous in the SELECT list and
FROM clause because they share a column and field with same name.
Person table has three columns: FirstName,
LastName, and PrimaryContact.PrimaryContact column represents a struct with these fields:
FirstName and LastName.The alias P is ambiguous and will produce an error because P.FirstName in
the GROUP BY clause could refer to either Person.FirstName or
Person.PrimaryContact.FirstName.
SELECT FirstName, LastName, PrimaryContact AS P
FROM Person AS P
GROUP BY P.FirstName;
A name is not ambiguous in GROUP BY, ORDER BY or HAVING if it's both
a column name and a SELECT list alias, as long as the name resolves to the
same underlying object. In the following example, the alias BirthYear isn't
ambiguous because it resolves to the same underlying column,
Singers.BirthYear.
SELECT LastName, BirthYear AS BirthYear
FROM Singers
GROUP BY BirthYear;
In GoogleSQL, a range variable is a table expression alias in the
FROM clause. Sometimes a range variable is known as a table alias. A
range variable lets you reference rows being scanned from a table expression.
A table expression represents an item in the FROM clause that returns a table.
Common items that this expression can represent include
tables,
value tables,
subqueries,
joins, and parenthesized joins.
In general, a range variable provides a reference to the rows of a table
expression. A range variable can be used to qualify a column reference and
unambiguously identify the related table, for example range_variable.column_1.
When referencing a range variable on its own without a specified column suffix, the result of a table expression is the row type of the related table. Value tables have explicit row types, so for range variables related to value tables, the result type is the value table's row type. Other tables don't have explicit row types, and for those tables, the range variable type is a dynamically defined struct that includes all of the columns in the table.
Examples
In these examples, the WITH clause is used to emulate a temporary table
called Grid. This table has columns x and y. A range variable called
Coordinate refers to the current row as the table is scanned. Coordinate
can be used to access the entire row or columns in the row.
The following example selects column x from range variable Coordinate,
which in effect selects column x from table Grid.
WITH Grid AS (SELECT 1 x, 2 y)
SELECT Coordinate.x FROM Grid AS Coordinate;
/*---+
| x |
+---+
| 1 |
+---*/
The following example selects all columns from range variable Coordinate,
which in effect selects all columns from table Grid.
WITH Grid AS (SELECT 1 x, 2 y)
SELECT Coordinate.* FROM Grid AS Coordinate;
/*---+---+
| x | y |
+---+---+
| 1 | 2 |
+---+---*/
The following example selects the range variable Coordinate, which is a
reference to rows in table Grid. Since Grid isn't a value table,
the result type of Coordinate is a struct that contains all the columns
from Grid.
WITH Grid AS (SELECT 1 x, 2 y)
SELECT Coordinate FROM Grid AS Coordinate;
/*--------------+
| Coordinate |
+--------------+
| {x: 1, y: 2} |
+--------------*/
@{hint_key=hint_value[, ...]}
GoogleSQL supports hints, which make the query optimizer use a specific operator in the execution plan. If performance is an issue for you, a hint might be able to help by suggesting a different query execution plan shape.
The syntax for specifying hints differs between the GoogleSQL and PostgreSQL dialects:
@{ hint_key = value, ... }./*@ hint_key = value, ... */.While the syntax varies, the available hints and their intended effects remain largely consistent. For details on PostgreSQL hint syntax and usage, see Hints in PostgreSQL.
Definitions
hint_key: The name of the hint key.hint_value: The value for hint_key.Examples
@{KEY_ONE=TRUE}
@{KEY_TWO=10, KEY_THREE=FALSE}
USE_ADDITIONAL_PARALLELISMTRUEFALSE (default)
TRUE, the execution engine favors using more
parallelism when possible. Because this can reduce resources available
to other operations, you may want to avoid this hint if you run
latency-sensitive operations on the same instance.
OPTIMIZER_VERSION1 to Nlatest_versiondefault_version
Executes the query using the specified optimizer version. Possible
values are 1 to N (the latest optimizer
version), default_version, or
latest_version. If the hint isn't set, the optimizer
executes against the package that's set in database options or
specified through the client API. If neither of those are set, the
optimizer uses the
default version.
In terms of version setting precedence, the value set by the client API takes precedence over the value in the database options and the value set by this hint takes precedence over everything else.
For more information, see Query optimizer.
OPTIMIZER_STATISTICS_PACKAGEpackage_namelatest
Executes the query using the specified optimizer statistics package.
Possible values for package_name can be found by
running the following query:
SELECT * FROM INFORMATION_SCHEMA.SPANNER_STATISTICSIf the hint isn't set, the optimizer executes against the package that's set in the database option or specified through the client API. If neither of those are set, the optimizer defaults to the latest package.
The value set by the client API takes precedence over the value in the database options and the value set by this hint takes precedence over everything else.
The specified package needs to be pinned by the database option or
have allow_gc=false to prevent garbage collection.
For more information, see Query optimizer statistics packages.
ALLOW_DISTRIBUTED_MERGETRUE (default)FALSE
If TRUE (default), the engine favors using a
distributed merge sort algorithm for certain ORDER BY queries.
When applicable, global sorts are changed to local sorts. This gives
the advantage of parallel sorting close to where the data is stored.
The locally sorted data is then merged to provide globally
sorted data. This allows for removal of full global sorts and
potentially improved latency.
This feature can increase parallelism of certain ORDER BY queries. This hint has been provided so that users can experiment with turning off the distributed merge algorithm if desired.
LOCK_SCANNED_RANGESexclusiveshared (default)
Use this hint to request an exclusive lock on a set of ranges scanned by a transaction. Acquiring an exclusive lock helps in scenarios when you observe high write contention, that is, you notice that multiple transactions are concurrently trying to read and write to the same data, resulting in a large number of aborts.
Without the hint, it's possible that multiple simultaneous transactions will acquire shared locks, and then try to upgrade to exclusive locks. This will cause a deadlock, because each transaction's shared lock is preventing the other transaction(s) from upgrading to exclusive. Spanner aborts all but one of the transactions.
When requesting an exclusive lock using this hint, one transaction acquires the lock and proceeds to execute, while other transactions wait their turn for the lock. Throughput is still limited because the conflicting transactions can only be performed one at a time, but in this case Spanner is always making progress on one transaction, saving time that would otherwise be spent aborting and retrying transactions.
This hint is supported on all statement types, both query and DML.
Spanner always enforces serializability Lock mode hints can affect which transactions wait or abort in contended workloads, but don't change the isolation level.
Because this is just a hint, it shouldn't be considered equivalent to a mutex. In other words, you shouldn't use Spanner exclusive locks as a mutual exclusion mechanism for the execution of code outside of Spanner. For more information, see Locking.
You can't use both the
The default Spanner scan method is
The default Spanner query execution method is
If
For more information, see informational foreign keys.
The following table hints are supported:
FOR UPDATE clause and the
LOCK_SCANNED_RANGES hint in the same query. An error is
returned. For more information, see .
SCAN_METHODAUTO (default)
|
BATCH
|
ROW
|
COLUMNAR
|
NO_COLUMNAR
Use this hint to enforce the query scan method.
AUTO
(automatic). The AUTO setting specifies that
batch-oriented query processing and columnar formatted data, if
available, could be used to improve query
performance.
The ROW scan method forces the scan to read row formatted
data and use row-oriented execution, whereas the BATCH scan
method
uses row formatted data and batch-oriented execution.
The COLUMNAR scan method will read columnar
formatted data, if available, and utilize batch-oriented execution.
The NO_COLUMNAR scan method disables
automatic selection of columnar formatted data but allows
automatic selection of batch-oriented query processing, similar
to the AUTO scan method.
You can't manually set the scan method to
AUTO,
but if you remove the hint, then
uses the AUTO scan method.
For more information, see
Optimize scans.
EXECUTION_METHODDEFAULT
|
BATCH
|
ROW
Use this hint to enforce the query execution method.
DEFAULT. The DEFAULT setting specifies that
batch-oriented execution might be used to improve query performance,
depending on the heuristics of the query. If you want to change the
default execution method, you can use a statement hint to enforce the
BATCH-oriented or ROW-oriented execution
method. You can't manually set the query execution method to
DEFAULT; to do so, remove the statement hint, and
Spanner will set it to the default method. For more
information, see
Optimize query execution.
USE_UNENFORCED_FOREIGN_KEY
TRUE (default)
| FALSE
Use this hint to enforce the query scan method.
TRUE (default), the query optimizer relies on
informational
(NOT ENFORCED) foreign key relationships
to improve query performance. For example, if set to
TRUE, the optimizer can remove redundant scans, and push
some LIMIT operators through the join operators.
USE_UNENFORCED_FOREIGN_KEY overrides the value of the
use_unenforced_foreign_key_for_query_optimization
database option for the applied statement. This hint might introduce
incorrect results if the data is inconsistent with the foreign key
relationships.
ALLOW_TIMESTAMP_PREDICATE_PUSHDOWN
TRUE
| FALSE (default)
If set to
TRUE, the query execution engine uses the
timestamp predicate pushdown optimization technique. This technique
improves the efficiency of queries that use timestamps and data with an
age-based tiered storage policy. For more information, see
Optimize queries with timestamp predicate pushdown.
Table hints
| Hint key | Possible values | Description |
|---|---|---|
FORCE_INDEX |
STRING
|
Note: |
GROUPBY_SCAN_OPTIMIZATION |
TRUE| FALSE |
The group by scan optimization can make queries faster if they use
The optimization is applied if the optimizer estimates that it will
make the query more efficient. The hint overrides that decision. If
the hint is set to |
SCAN_METHOD |
AUTO (default)| BATCH| ROW| COLUMNAR| NO_COLUMNAR
|
Use this hint to enforce the query scan method.
The default Spanner scan method is |
INDEX_STRATEGY |
FORCE_INDEX_UNION |
Use the |
SEEKABLE_KEY_SIZE |
0 to 16 |
Forces the seekable key size to be equal to the specified value. The seekable key size is the length of the key (primary key or index key) that's used in a seekable condition, while the rest of the key is used in a residual condition.
This hint requires the |
The following example shows how to use a secondary index
when reading from a table, by appending an index directive of the form
@{FORCE_INDEX=index_name} to the table name:
SELECT s.SingerId, s.FirstName, s.LastName, s.SingerInfo
FROM Singers@{FORCE_INDEX=SingersByFirstLastName} AS s
WHERE s.FirstName = "Catalina" AND s.LastName > "M";
You can include multiple indexes in a query, though only a single index is supported for each distinct table reference. Example:
SELECT s.SingerId, s.FirstName, s.LastName, s.SingerInfo, c.ConcertDate
FROM Singers@{FORCE_INDEX=SingersByFirstLastName} AS s JOIN
Concerts@{FORCE_INDEX=ConcertsBySingerId} AS c ON s.SingerId = c.SingerId
WHERE s.FirstName = "Catalina" AND s.LastName > "M";
Read more about index directives in Secondary Indexes.
The following join hints are supported:
| Hint key | Possible values | Description |
|---|---|---|
FORCE_JOIN_ORDER |
TRUEFALSE (default)
|
If set to TRUE, use the join order that's specified in the
query.
|
JOIN_METHOD |
HASH_JOINAPPLY_JOINMERGE_JOINPUSH_BROADCAST_HASH_JOIN
|
When implementing a logical join, choose a specific alternative to
use for the underlying join method. Learn more in
Join methods.
To use a HASH join, either use HASH JOIN or
JOIN@{JOIN_METHOD=HASH_JOIN}, but not both.
|
HASH_JOIN_BUILD_SIDE |
BUILD_LEFTBUILD_RIGHT |
Specifies which side of the hash join is used as the build
side. Can only be used with JOIN_METHOD=HASH_JOIN
|
BATCH_MODE |
TRUE (default)FALSE |
Used to disable batched apply join in favor of row-at-a-time apply join.
Can only be used with JOIN_METHOD=APPLY_JOIN.
|
FACTORIZED_MODE |
FACTORIZE_LEFTFACTORIZE_RIGHTFACTORIZE_BOTH
|
Specifies which sides of the join should be factorized (deduplicated) prior to the join. Learn more in Factorized Mode. |
HASH_JOIN_EXECUTION |
MULTI_PASS (default)ONE_PASS
|
For a hash join, specifies what should be done when the hash table size
reaches its memory limit.
Can only be used when JOIN_METHOD=HASH_JOIN.
See Hash Join Execution for more
details.
|
Join methods are specific implementations of the various logical join types. Some join methods are available only for certain join types. The choice of which join method to use depends on the specifics of your query and of the data being queried. The best way to figure out if a particular join method helps with the performance of your query is to try the method and view the resulting query execution plan. See Query Execution Operators for more details.
| Join Method | Description | Operands |
|---|---|---|
HASH_JOIN |
The hash join operator builds a hash table out of one side (the build side), and probes in the hash table for all the elements in the other side (the probe side). | Different variants are used for various join types. View the query execution plan for your query to see which variant is used. Read more about the Hash join operator. |
APPLY_JOIN |
The apply join operator gets each item from one side (the input side), and evaluates the subquery on other side (the map side) using the values of the item from the input side. | Different variants are used for various join types. Cross apply is used for inner join, and outer apply is used for left joins. Read more about the Cross apply operator and Outer apply operator |
MERGE_JOIN |
The merge join operator joins two streams of sorted data. The optimizer adds Sort operators to the plan if the data isn't already providing the required sort property for the given join condition. The engine provides a distributed merge sort by default, which when coupled with merge join may allow for larger joins, potentially avoiding disk spilling and improving scale and latency. | Different variants are used for various join types. View the query execution plan for your query to see which variant is used. Read more about the Merge join operator. |
PUSH_BROADCAST_HASH_JOIN |
The push broadcast hash join operator builds a batch of data from the build side of the join. The batch is then sent in parallel to all the local splits of the probe side of the join. On each of the local servers, a hash join is executed between the batch and the local data. This join is most likely to be beneficial when the input can fit within one batch, but isn't strict. Another potential area of benefit is when operations can be distributed to the local servers, such as an aggregation that occurs after a join. A push broadcast hash join can distribute some aggregation where a traditional hash join can't. | Different variants are used for various join types. View the query execution plan for your query to see which variant is used. Read more about the Push broadcast hash join operator. |
When you execute a join, multiple rows on one side can match a single row or multiple rows on the other side. This can lead to overhead from repeated join-key comparisons or hashtable probes during execution.
Factorization is a join execution technique that reduces redundant work by
deduplicating join-key values on one or both sides before joining them.
Deduplication decreases the number of processed rows and eliminates repeated
computations without altering the join result.
Use the hint values FACTORIZE_LEFT, FACTORIZE_RIGHT, or FACTORIZE_BOTH to
explicitly control which join sides are factorized.
Factorized mode is supported only for INNER JOINs with equality conditions.
Explicit deduplication can introduce its own overhead, potentially outweighing the benefits of factorized join execution. The performance impact depends on several factors, including the join-key's duplication factor, the chosen join method (hash, apply, or merge), the join cardinality (for example, one-to-many, many-to-one, or many-to-many), and the overall join topology. However, the most benefit is expected for queries with a large number of many-to-many joins, which is typical in graph traversal scenarios. For more guidance on when to apply the factorized execution mode, see Best practices for tuning Spanner Graph queries.
To execute a hash join between two tables, Spanner first scans
rows from the build side and loads them into a hash table. Then it scans rows
from the probe side, while comparing them against the hash table. If the hash
table reaches its memory limit, depending on the value of the
HASH_JOIN_EXECUTION query hint, the hash join has one of the following
behaviors:
HASH_JOIN_EXECUTION=MULTI_PASS (default): The query engine splits the
build side table into partitions in a way that the size of a hash table
corresponding to each partition is less than the memory size limit. For
every partition of the build side table, the probe side is scanned once.HASH_JOIN_EXECUTION=ONE_PASS: The query engine writes both the build side
table and the probe side table to disk in partitions in a way that the hash
table of the build side table in each partition is less than the memory
limit. The probe side is only scanned once.The following group hints are supported:
| Group hint key | Group hint values | Description |
|---|---|---|
GROUP_METHOD |
HASH_GROUPSTREAM_GROUP |
Specifies an alternative to choose when implementing a GROUP BY operator. |
Hints are supported for graphs. For more information, see Graph hints.
In addition to standard SQL tables, GoogleSQL supports value tables. In a value table, rather than having rows made up of a list of columns, each row is a single value of a specific type, and there are no column names.
In the following example, a value table for a STRUCT is produced with the
SELECT AS VALUE statement:
SELECT * FROM (SELECT AS VALUE STRUCT(123 AS a, FALSE AS b))
/*-----+-------+
| a | b |
+-----+-------+
| 123 | FALSE |
+-----+-------*/
Value tables are often but not exclusively used with compound data types. A value table can consist of any supported GoogleSQL data type, although value tables consisting of scalar types occur less frequently than structs.
Spanner doesn't support value tables as base tables in database schemas and doesn't support returning value tables in query results. As a consequence, value table producing queries aren't supported as top-level queries.
Value tables can also occur as the output of the UNNEST
operator or a subquery. The WITH clause
introduces a value table if the subquery used produces a value table.
In contexts where a query with exactly one column is expected, a value table query can be used instead. For example, scalar and array subqueries normally require a single-column query, but in GoogleSQL, they also allow using a value table query.
In SET operations like UNION ALL you can combine tables with value tables,
provided that the table consists of a single column with a type that matches the
value table's type. The result of these operations is always a value table.
These examples include statements which perform queries on the
Roster and TeamMascot,
and PlayerStats tables.
The following tables are used to illustrate the behavior of different query clauses in this reference.
The Roster table includes a list of player names (LastName) and the
unique ID assigned to their school (SchoolID). It looks like this:
/*-----------------------+
| LastName | SchoolID |
+-----------------------+
| Adams | 50 |
| Buchanan | 52 |
| Coolidge | 52 |
| Davis | 51 |
| Eisenhower | 77 |
+-----------------------*/
You can use this WITH clause to emulate a temporary table name for the
examples in this reference:
WITH Roster AS
(SELECT 'Adams' as LastName, 50 as SchoolID UNION ALL
SELECT 'Buchanan', 52 UNION ALL
SELECT 'Coolidge', 52 UNION ALL
SELECT 'Davis', 51 UNION ALL
SELECT 'Eisenhower', 77)
SELECT * FROM Roster
The PlayerStats table includes a list of player names (LastName) and the
unique ID assigned to the opponent they played in a given game (OpponentID)
and the number of points scored by the athlete in that game (PointsScored).
/*----------------------------------------+
| LastName | OpponentID | PointsScored |
+----------------------------------------+
| Adams | 51 | 3 |
| Buchanan | 77 | 0 |
| Coolidge | 77 | 1 |
| Adams | 52 | 4 |
| Buchanan | 50 | 13 |
+----------------------------------------*/
You can use this WITH clause to emulate a temporary table name for the
examples in this reference:
WITH PlayerStats AS
(SELECT 'Adams' as LastName, 51 as OpponentID, 3 as PointsScored UNION ALL
SELECT 'Buchanan', 77, 0 UNION ALL
SELECT 'Coolidge', 77, 1 UNION ALL
SELECT 'Adams', 52, 4 UNION ALL
SELECT 'Buchanan', 50, 13)
SELECT * FROM PlayerStats
The TeamMascot table includes a list of unique school IDs (SchoolID) and the
mascot for that school (Mascot).
/*---------------------+
| SchoolID | Mascot |
+---------------------+
| 50 | Jaguars |
| 51 | Knights |
| 52 | Lakers |
| 53 | Mustangs |
+---------------------*/
You can use this WITH clause to emulate a temporary table name for the
examples in this reference:
WITH TeamMascot AS
(SELECT 50 as SchoolID, 'Jaguars' as Mascot UNION ALL
SELECT 51, 'Knights' UNION ALL
SELECT 52, 'Lakers' UNION ALL
SELECT 53, 'Mustangs')
SELECT * FROM TeamMascot
GROUP BY clauseExample:
SELECT LastName, SUM(PointsScored)
FROM PlayerStats
GROUP BY LastName;
| LastName | SUM |
|---|---|
| Adams | 7 |
| Buchanan | 13 |
| Coolidge | 1 |
UNIONThe UNION operator combines the result sets of two or more SELECT statements
by pairing columns from the result set of each SELECT statement and vertically
concatenating them.
Example:
SELECT Mascot AS X, SchoolID AS Y
FROM TeamMascot
UNION ALL
SELECT LastName, PointsScored
FROM PlayerStats;
Results:
| X | Y |
|---|---|
| Jaguars | 50 |
| Knights | 51 |
| Lakers | 52 |
| Mustangs | 53 |
| Adams | 3 |
| Buchanan | 0 |
| Coolidge | 1 |
| Adams | 4 |
| Buchanan | 13 |
INTERSECTThis query returns the last names that are present in both Roster and PlayerStats.
SELECT LastName
FROM Roster
INTERSECT ALL
SELECT LastName
FROM PlayerStats;
Results:
| LastName |
|---|
| Adams |
| Coolidge |
| Buchanan |
EXCEPTThe query below returns last names in Roster that are not present in PlayerStats.
SELECT LastName
FROM Roster
EXCEPT DISTINCT
SELECT LastName
FROM PlayerStats;
Results:
| LastName |
|---|
| Eisenhower |
| Davis |
Reversing the order of the SELECT statements will return last names in
PlayerStats that are not present in Roster:
SELECT LastName
FROM PlayerStats
EXCEPT DISTINCT
SELECT LastName
FROM Roster;
Results:
(empty)
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-06-09 UTC.