SQL Basics
At the end of this topic, students will learn the following:
Detailed description of the features of SQL
The basic structure of a SQL statement and also the basic elements of the language e.g
keywords, data types, and expressions.
How SQL handles missing data through NULL values
Statements
There are about 40 statements in the main body of the SQL Language which can be categorized
into five groups as follows:
1. Data Definition
2. Data Manipulation
3. Access Control
4. Transaction Control
5. Programmatic SQL
The most important and frequently used statements are summarized below according to the
categories described above.
Data Definition
Statement Description
CREATE TABLE Adds a new table to the database
DROP TABLE Removes a table from the database
ALTER TABLE Changes the structure of the existing table
CREATE VIEW Adds a new view to the database
DROP VIEW Removes a view from the database
CREATE INDEX Builds an index for a column
DROP INDEX Removes the index for a column
CREATE SCHEMA Adds a new schema to the database
DROP SCHEMA Removes a schema from the database
CREATE DOMAIN Adds a new data value domain
ALTER DOMAIN Changes a domain definition
DROP DOMAIN Removes a domain from the database
Data Manipulation
Statement Description
SELECT Retrieves data from the database
INSERT Adds new rows of data or tuples to the database
DELETE Removes rows of data or tuples from the database
UPDATE Modifies existing database data
Access Control
Statement Description
GRANT Grants users access privileges
REVOKE Removes users access privileges
Transaction Control
Statement Description
COMMIT Ends the current transaction
ROLLBACK Aborts the current transaction
SET TRANSACTION Defines data access characteristics of the current transaction
Programmatic SQL
Statement Description
DECLARE Defines a cursor for a query
EXPLAIN Describes the data access plan for a query
OPEN Opens a cursor to retrieve query results
FETCH Retrieves a row of query results
CLOSE Closes a cursor
PREPARE Prepares a SQL statement for dynamic execution
EXECUTE Executes a SQL statement dynamically
DESCRIBE Describe a prepared query
TQ:
1. SQL languages is broadly divided into ………… They are: …………..
2. Major operations of SQL commands of databases are: ……………..
As you must have observed in the descriptions above, each statement requests a specific action
from the DBMS. Every SQL statement begins with a keyword that describes what the statement
does. This is called a verb. Examples of typical verbs are CREATE, DELETE, INSERT, DROP, and
ROLLBACK. The SQL statement continues with one or more clauses which may specify the data
to be acted on by the statement, or provide more details about what the statement is supposed
to do. Every clause also begins with a keyword such as FROM, WHERE, HAVING, GROUPBY etc.
Some clauses are optional while others are required. Also, the content and structure of the
clauses vary. While many clauses contain table or column names, others may contain keywords,
constants or expressions.
The ANSI/ISO SQL standard specifies the SQL keywords that are used as verbs and in statement
clauses. These keywords cannot be used to name database objects such as tables and columns.
(Familiarize yourselves with the ANSI/ISO SQL2 Keywords)
Names
Names are used to identify objects in a database. The most fundamental named objects in a
database are table names, column names and user names. The original ANSI/ISO standard
specified that SQL names must contain 1 to 18 characters, must begin with a letter and must
not contain any spaces or special characters. This was reviewed to a maximum of 128
characters in the SQL2 standard.
Table Names
Ideally, tables names should be short but descriptive e.g. STUDENTS, CUSTOMERS, OFFICES etc.
The choice of table names is usually up to the database designer or developer in departmental
or personal databases. However, in large corporate databases, there may be standards
specified for naming tables in order to avoid conflicts in table names. A qualified table name
specifies both the name of the table’s owner and the name of the table separated by a period
(.). For example computer science department could access the STUDENTS table owned by IMT
by using the qualified table name [Link]
Column Names
When a column name is specified in a SQL statement, SQL can normally determine from the
context which column is intended. However, if the statement involves two columns with the
same name from two different tables, a qualified column name must be used to unambiguously
identify the column intended. For example, a column named CITY in the OFFICES table has the
qualified column name [Link]
Data Types
The ANSI/ISO SQL standard specifies the various types of data that can be stored in a database.
Typical data types include the following:
Integers: INT (len), INTEGER )len)
Decimal numbers: NUMERIC (precision, scale), DECIMAL (precision, scale), DEC
(precision, scale)
Floating Point numbers: FLOAT (precision)
Fixed-length character strings: CHAR (len), CHARACTER (len)
Variable-length character strings: VARCHAR (len), CHAR VARYING (len)
Money amounts: MONEY, CURRENCY
Dates and times: DATE, TIME (precision), TIMESTAMP (precision)
Boolean data
Long text
Unstructured byte streams
Constants (Numeric, String, Date and Time, and Symbolic)
Some SQL statements require that a numeric, character, or date data value must be expressed
in text form. Integer and decimal constants are written as ordinary decimal numbers in SQL
statements, with an optional leading plus or minus sign. For example 25 -450 1234.56
+987654.3210. Not all SQL dialects allow the leading sign, so it’s best to avoid it. Commas are
not used between the digits of numeric constants. Floating point constants are specified using
the E notation, e.g. 1.3E3 2.6E-7 0.13745E67 etc.
Character data are enclosed in single quotes (‘….’) as specified by the ANSI/ISO standard, e.g.
‘Minna’ ‘Micheal, Mary J.’ etc. Constant data values for dates, times, and time intervals are also
specified as string constants in SQL products that support date/time data. However the format
of these constants varies from one DBMS to the next.
The SQL language also includes special symbolic constants that return data values maintained in
the DBMS. For example, the symbolic constant CURRENT_DATE returns the value of the current
date and can be used in queries.
Constant data values are also used in expressions.
Expressions
Expressions are used in the SQL language to calculate values that are retrieved from a database
and also to calculate values used to search the database. For example,
SELECT CITY
FROM OFFICES
WHERE TARGET > (1.1*SALES) + 10000.00
The ANSI/ISO standard specifies four arithmetic operators that can be used in expressions:
addition(+), subtraction(-), multiplication(*), and division(/). Parentheses are also used to form
more complicated expression
Built-In Functions
Built-in functions provide data type conversion facilities in most SQL implementations. For
example in DB2, the built in functions MONTH( ) and YEAR( ) take a DATE or TIMESTAMP value
as INPUT and return as integer that is the month or year portion on the value. Built-in functions
are also used for data reformatting.
Missing Data (NULL Values)
There are certain instances in the real world situation in which certain pieces of data are
inevitably missing, unknown, or don’t apply. SQL supports missing, unknown, or inappropriate
data explicitly through the concept of a NULL value. A NULL value is an indicator that tells SQL
and users that the data is missing or not applicable
Simple queries
Simple queries are queries that retrieve data from a single table in the database. The SELECT
statement is used to express SQL queries which could be simple or complex.
The SELECT Statement
The SELECT statement is the most powerful and complex of the SQL statements. There are
many options afforded by the SELECT statement and it’s possible to start simply and then work
up to more complex queries. The SELECT statement retrieves data from the database and
returns it in form of a query result. The keywords SELECT, FROM and WHERE make up the basic
SELECT statement. The full form of the SELECT statement consists of six clauses as follows:
SELECT [ALL|DISTINCT] select-item (column-list, constant, expression)
FROM table-specification (source tables)
WHERE search –condition (desired rows)
GROUP BY grouping-column (summary query)
HAVING search-condition (desired groups)
ORDER BY sort-specification
Examples:
1. List the sales offices with their targets and actual sales
2. List the Eastern region sales offices with their targets and sales
3. List the Eastern region sales offices whose sales exceed their targets, sorted in
alphabetical order by city
4. What are the average target and sales for Eastern region offices?
Query Results
The result of a SQL query is always a table of data, just like the tables in the database. The
number of columns in the query result corresponds to the number of items in the SELECT
clause, while the number of rows depends on the number of rows that meet the search criteria.
Query results can be a single value in some cases. These results are still a table consisting of one
column and one row as seen in example 2 below. It is also possible for a query to produce zero
rows of query results as in example 3 below. The query results are still a table although an
empty table.
SQL support for missing data also extends to query results, for example, if the data item in the
database has a NULL value, the NULL value also appears in the query result as in example 4
below. The fact that SQL query always produce a table of data is very important as it suffices
that query results can be stored back into the database as a table and can be the target of
further queries.
Examples:
1. What are the name, quota and sales of employee number 107?
2. What are the average sales of the sales representatives?
3. List the name and hire date of anyone with sales over $500,000.
4. List the names sales representatives, their quotas and their managers
Calculated Columns
Selecting all columns (SELECT *)
Duplicate Rows (DISTINCT)
If a query includes the primary key of a table in its select list, then every row of the query result
will be unique. However, if the primary key is not included in the query result, duplicate rows
can occur. Duplicate rows of query results can be eliminated by inserting the keyword DISTINCT
in the SELECT statement. The keyword ALL can also be specified to explicitly indicate that
duplicate rows are to be included in the query result.
Example: List the employee numbers of all sales office managers.
Row Selection (WHERE Clauses)
The WHERE clause is used to specify the rows you want to retrieve in a table. The WHERE
clause consists of the keyword WHERE followed by the search condition that specifies the rows
to be retrieved. Here are few examples:
1. List the offices where sales exceed target
2. Show me the name, sales and quota of employee number 105
3. Show me the name and sales of employees managed by Bob smith (employee 104)
Search Conditions
SQL offers a rich set of search conditions that allows the specification of many different kinds of
queries efficiently and naturally. The five basic search conditions (called predicates) are
discussed below.
1. The Comparison Test (=, <>, <, <=, >, >=): This is the most common search condition
used in a SQL query. SQL computes and compares the value of one expression to the
value of another expression for each row of data. The expressions can be a column
name, a constant, or a complex arithmetic expression. When SQL compares the values
of the two expressions in the comparison test, three results can occur:
If the result of the comparison is true, the test yields a TRUE result
If the result of the comparison is false, the test yields a FALSE result.
If either of the two expressions produces a NULL value, the comparison yields a NULL
result.
2. The Range Test (BETWEEN): The range test checks whether a data value lies between
two specified values. It involves three SQL expressions. The first expression defines the
value to be tested; the second and third defines the low and high ends of the range to
be checked. The data types of the three expressions must be comparable. The negated
version of the range test (NOT BETWEEN) checks for values that fall outside the range.
The range test syntax is as follows:
test-expression [NOT] BETWEEN low-expression AND high-expression
Example 1: List the order numbers of orders that fall between 20,000 and
30,000.
Example 2: List sales people whose sales are not between 80 percent and 120
percent of quota.
3. The Set Membership Test (IN): The set membership test, checks whether data value
matches one of a list of target values. The negated version of the set membership test
(NOT IN), is used to check whether the data value does not match any of the target
values. The set membership test syntax is as follows:
test-expression [NOT] IN (constants)
Example: List the salespeople who work in New York, Atlanta, or Denver.
4. The Pattern Matching Test (LIKE): The pattern matching test checks to see whether the
data value in a column matches a specified pattern. The pattern is a string that could
include one or more wildcard character. The wildcard characters include the percent
sign (%), the underscore (_). The percent sign (%) wildcard matches any sequence of
zero or more characters while the underscore (_) wildcard matches a single character.
Strings that do not match a pattern can also be located using the (NOT LIKE) form of the
pattern matching test. The syntax is as follows:
column-name [NOT] LIKE pattern
5. The NULL Value Test (IS NULL): The NULL value test is used to explicitly check for NULL
values in a search condition. The negated form of the NULL value test (IS NOT NULL) is
used to find rows that do not contain a NULL value. The NULL value test syntax is as
follows:
column-name IS [NOT] NULL
Example 1: Find the salesperson not yet assigned to an office.
Example 2: List the salespeople who have been assigned to an office.
Sorting Query Results (ORDER BY Clause)
The ORDER BY clause is used in the SELECT statement to sort the results of a query.
Using the ORDER BY clause, sorting can be requested in an ascending or descending
sequence based on any item in the select list. SQL sorts data in ascending sequence by
default. To request sorting in descending sequence, the keyword DESC is used. If the
column of the query result to be used for sorting is a calculated column, a column
number is used is used instead of a column name. The syntax is as follows:
ORDER BY column-name{column-number} ASC{DESC}
Example 1: Show the sales for each office, sorted in alphabetical order by region
and within each region by city.
Example 2: List the offices, sorted in descending order by sales, so that the
offices with the largest sales appear first.
Rules for Single-Table Query Processing
Single-table queries are simple and usually easy to understand by just reading the SELECT
statement. The following rules describe the procedure for generating the results of a SQL query
that includes the clauses described in the SELECT statement.
Start with the table named in the FROM clause
If there is a WHERE clause, apply its search condition to each row of the table, retaining
only those rows for which the search condition is TRUE, and discarding those rows for
which the search condition is FALSE or NULL.
For each remaining row, calculate the value of each item in the select list to produce a
single row of query results. For each column reference, use the value of the column in
the current row.
If SELECT DISTINCT is specified, eliminate any duplicate rows of the query results that
were produced.
If there is an ORDER BY clause, sort the query as specified.
The rows generated by these steps comprise the query results.
Further Reading
[Link]/sql