What is Data?
● Data is raw facts and figures that can be in the form of numbers, text,
images, or videos. It is the foundation of information and knowledge,
providing valuable insights and driving decision-making processes.
What is a Database?
● A database is a collection of organized data that can be easily accessed,
managed, and updated. It provides a structured way to store and retrieve
information.
Database Management System (DBMS)
● A database management system (DBMS) is software that manages
databases. It provides an interface for users to interact with the database
and perform operations like creating, updating, and deleting data.
Types of DBMS
● Hierarchical DBMS:
● In a hierarchical database, data is organized in a tree-like structure
where each record has one parent record and multiple child records.
● Network DBMS:
● Network databases allow each record to have multiple parent and
child records, creating a more flexible structure than hierarchical
databases.
● Records are linked together in a graph-like structure.
● Object-Oriented DBMS (OODBMS):
● OODBMS stores data in the form of objects, similar to
object-oriented programming.
● Relational DBMS (RDBMS):
● Relational databases store data in tabular form, where each table
consists of rows and columns.
Relational Databases Management
System
● Relational databases store data in tabular form, where each table
consists of rows and columns.
● They enforce a relational model, where relationships between tables
are defined using keys.
● SQL (Structured Query Language) is commonly used to query and
manipulate data in relational databases.
● Examples include MySQL, PostgreSQL, Oracle Database, Microsoft
SQL Server, SQLite.
Difference between DBMS and RDBMS
What is SQL?
SQL (Structured Query Language)
SQL is a programming language used for managing and manipulating relational
databases. It allows users to perform tasks like querying the database, inserting,
updating, and deleting data.
Comparing SQL and NoSQL Databases
SQL and NoSQL are two types of databases that serve different purposes and
have distinct characteristics.
CRUD Operations in SQL
CRUD stands for Create, Read, Update, and Delete, which are the four basic
functions for managing data in a database. Here are examples of CRUD
operations in SQL
Create (INSERT):
The INSERT statement is used to add new records into a table. In the example,
we're inserting a new employee record with specific values for each column.
Retrieve (SELECT):
The SELECT statement retrieves data from one or more tables.
Update (UPDATE):
The UPDATE statement modifies existing records in a table.
Delete (DELETE):
The DELETE statement removes records from a table based on a condition.
Pandas Vs SQL
Pandas and SQL play complementary roles in the data analysis process. Pandas
is well-suited for initial data manipulation and exploratory analysis tasks,
especially with smaller datasets, while SQL excels in querying and aggregating
data directly from databases, making it essential for analyzing large-scale
datasets efficiently. Depending on the specific requirements and the nature of the
data being analyzed, one may choose either pandas, SQL, or a combination of
both for effective data analysis workflows.
Data Exploration in SQL
Data exploration in SQL involves analyzing and understanding the structure,
content, and quality of data stored in a database. It helps in gaining insights into
the dataset.
SQL SELECT Command
The SELECT statement is one of the fundamental commands in SQL used for
retrieving data from one or more tables in a database. It allows users to specify
which columns or expressions to retrieve, as well as filtering, grouping, and
ordering the results.
● SELECT : Keyword indicating the beginning of the query.
● column1, column2, … : Columns to retrieve data from.
● FROM : Keyword specifying the table from which to retrieve data.
● table_name : Name of the table from which data is retrieved.
Basic SELECT Query
- The above query retrieves all columns from the employees table. Here *
means all the data.
Selecting Specific Columns
- This query retrieves only the employee_id, first_name, and last_name
columns from the employees table.
Selecting with Alias
- This query retrieves fname and lname columns with aliases First_Name
and Last_Name respectively.
Derived Column
- This query calculates the Bonus by which is 15% of each employee
monthly salary (salary * 0.15) and retrieves first_name, last_name, and the
calculated field "Bonus". This Bonus column is called a derived column.
Selecting with LIMIT Clause
- This query retrieves the first 5 rows from the employees table. This is
similar to the (.head) method in pandas.
Selecting with OFFSET and LIMIT
This query retrieves 5 rows from the employees table, starting from the 11th row
(offsetting the first 10 rows).
Describe
- This query returns you the information about the columns in the employees
table. It includes the
● column names (Field),
● data types (Type),
● whether the column allows null values (Null),
● whether it's part of a primary key (Key),
● the default value (Default),
● and any additional information (Extra).
Selecting with DISTINCT
DISTINCT keyword is used within the SELECT statement to retrieve unique
values from a specified column or combination of columns in a table. It filters out
duplicate values, returning only distinct values.
● SELECT DISTINCT : Specifies that only unique values should be
retrieved.
● Dep_id : Columns from which distinct values are to be retrieved.
● FROM employees : Specifies the table from which data is retrieved.
Aggregation Function
Aggregation functions are used within the SELECT statement to perform
calculations on a single column or on groups of rows and returns a single value.
These functions allow for summarizing data, calculating statistics, and deriving
insights from the dataset.
Here are some commonly used aggregation functions in SQL:
COUNT(): Counts the number of rows or non-null values in a column.
SUM(): Calculates the sum of values in a numeric column.
AVG(): Calculates the average of values in a numeric column.
MIN(): Finds the minimum value in a column.
MAX(): Finds the maximum value in a column.
All Aggregation Functions on single column
Finding all the aggregated values with respective salary column.
Data Filtering in SQL
Data filtering is achieved using the WHERE clause within the SELECT
statement. This clause allows you to specify conditions that rows must meet to
be included in the result set, effectively filtering out unwanted data.
SQL WHERE Clause
The WHERE clause in SQL is used to filter rows returned by a SELECT
statement based on specified conditions. It allows you to selectively retrieve data
that meets certain criteria from a table or set of tables.
● SELECT : Specifies the columns to retrieve data from.
● column1, column2, … : Columns to be retrieved.
● FROM : Indicates the table from which to retrieve data.
● table_name : Name of the table from which data is retrieved.
● WHERE : Condition that specifies the filtering criteria.
Basic Filtering
- This query retrieves all columns from the employees table where the
department_id is equal to 10.
Using Comparison Operators
- This query retrieves all columns from the employees table where the salary
is greater than 50000.
Using Logical Operators
- This query retrieves all columns from the employees table where the
department_id is equal to 20 and the salary is greater than 60000.
Using Membership Operator
- This query retrieves all columns from the employees table where the
department_id is either 10, 20, or 30.
Using Not Operator
- This query retrieves all columns from the employees table where the
department_id is not 10, 20, or 30.
Using Between Operator
- This query retrieves all columns from the employees table where the salary
falls between 50000 and 80000, inclusive.
Using LIKE Operator for Pattern Matching
- This query retrieves all columns from the employees table where the
first_name starts with the letter 'M'.
Using Regexp Operator (Regular Expression)
- This query retrieves all columns from the employees table where the
first_name starts with the letter 'M'.
Using IS NULL Operator for NULL Values
- This query retrieves all columns from the employees table where the
contact is NULL.
Working of NOT operator
NOT counterparts allow for negating conditions in the WHERE clause, providing
flexibility in querying databases to find rows that do not meet specific criteria.
● NOT BETWEEN - Retrieves rows where the value does not fall within the
specified range.
● NOT REGEXP - Retrieves rows that do not match the specified pattern.
● NOT IN - Retrieves rows where the specified value does not match any
value in the list.
SQL GROUP BY Clause
GROUP BY clause is used to group rows that have the same values into
summary rows, typically to perform aggregate functions such as COUNT, SUM,
AVG, MIN, and MAX on each group.
Example of Aggregated values in GROUP BY
Working of GROUP BY
Limitations of GROUP BY
When using the GROUP BY clause in SQL, it's common to include columns in
the SELECT statement that are also present in the GROUP BY clause. However,
there are limitations to using only those columns in the SELECT command which
are used in the GROUP BY clause. Here are some limitations:
Limited Output
If you only use columns present in the GROUP BY clause in the SELECT
statement, you'll only get grouped data based on those columns. Other details
won't be included in the output.
Loss of Detail
You'll lose specific details from columns not included in the GROUP BY clause.
This can make it harder to understand the context of the aggregated results.
Can't Retrieve Specific Data
You can't directly retrieve data from columns not included in the GROUP BY
clause. So, if you need specific information from those columns, you'll need to
find another way to get it.
Possible Errors with Aggregate Functions
Using aggregate functions on columns not in the GROUP BY clause can lead to
errors or ambiguous results. SQL needs clarity on how to group rows, and using
aggregate functions on non-grouped columns can be confusing.
Less Flexible Queries
Using only grouped columns in the SELECT statement limits your flexibility. You
might need to write extra queries or modify the existing one to get the information
you need from other columns.
In short, while using columns from the GROUP BY clause in the SELECT
statement gives you grouped data, it also means you'll miss out on details from
other columns and have less flexibility in your queries.
GROUP BY Syntax
● SELECT : Indicates the columns to retrieve data from.
● column1, column2, … : Columns to group by and/or retrieve data from.
● aggregate_function(column) : Aggregate functions applied to the
grouped data.
● FROM : Specifies the table from which to retrieve data.
● table_name : Name of the table from which data is retrieved.
● GROUP BY : Specifies the columns to group the result set by.
GROUP BY EXAMPLES
Grouping by a Single Column
- This query will group the orders by product_id and calculate the total
quantity of each product.
Grouping by Multiple Columns
- This query will group the orders by both customer_id and product_id and
calculate the total quantity of each product ordered by each customer.
Using Grouping with Filtering
- This query will group the orders by product_id only for customers from
India and calculate the total quantity of each product ordered by them.
SQL HAVING Clause
HAVING clause is used in combination with the GROUP BY clause to filter
groups of rows returned by a GROUP BY clause based on specified conditions. It
allows you to filter grouped data using aggregate functions.
HAVING Syntax
● SELECT : Indicates the columns to retrieve data from.
● column1, column2, … : Columns to group by and/or retrieve data from.
● aggregate_function(column) : Aggregate functions applied to the
grouped data.
● FROM : Specifies the table from which to retrieve data.
● table_name : Name of the table from which data is retrieved.
● GROUP BY : Specifies the columns to group the result set by.
● HAVING : Condition that filters groups based on aggregate values.
Examples
- This query groups the orders by product_id, calculates the total quantity of
each product using the SUM aggregate function, and then filters out
products with a total quantity greater than 100 using the HAVING clause.
- The HAVING clause is essential for filtering grouped data based on
aggregate values, allowing you to specify conditions on the result of
aggregate functions applied to grouped data.
SQL ORDER BY Clause
ORDER BY clause is used to sort the result set returned by a SELECT statement
based on one or more columns. It allows you to specify the order in which rows
should be displayed in the query result.
ORDER BY Syntax
● SELECT : Indicates the columns to retrieve data from.
● column1, column2, … : Columns to be retrieved.
● FROM : Specifies the table from which to retrieve data.
● table_name : Name of the table from which data is retrieved.
● ORDER BY : Specifies the column(s) to sort the result set by.
● ASC : (Optional) Specifies ascending order (default).
● DESC : (Optional) Specifies descending order.
Example
Order by with Single Column
- This query retrieves the employee_id and hire_date columns from the
employees table and sorts the result set by hire_date in descending order,
meaning the most recent hires will appear first.
Order by with Multiple Columns
- This query retrieves all columns from the employees table and sorts the
result set first by hire_date in ascending order. If there are employees hired
on the same date, it then sorts those employees by last_name in
descending order.
Case Statement
- The Case Statement will act as a IF - ELSE statement where based on the
conditions
- This statement starts with Case keyword and When keyword will help you
to look into a condition and Then keyword will help you to reflect on that
condition.
- In one Case statement you can write multiple conditions and at the end
you will end this statement with the END keyword.
Use Case:
Case Statement helps you create a new column based on the existing
column based on the condition it is similarly act like value mapping this
also helps to create a derived column.
- In the above query we are using Case Statement to create a new column
which is rating_category where the values in this column are based on the
conditions applied on the rating column in the mobiles data.
Order of Execution
- SQL adheres the order for the query execution similarly like Order of a
math formula following the BODMAS rule.
- The Order of execution of an SQL query is like general order.
1. FROM/JOIN
- The FROM and/or JOIN clauses are executed first to determine the data of
interest.
2. WHERE
- The WHERE clause is executed to filter out records that do not meet the
constraints.
3. GROUP BY
- The GROUP BY clause is executed to group the data based on the values
in one or more columns.
4. HAVING
- The HAVING clause is executed to remove the created grouped records
that don’t meet the constraints.
5. SELECT
- The SELECT clause is executed to derive all desired columns and
expressions.
6. ORDER BY
- The ORDER BY clause is executed to sort the derived values in ascending
or descending order.
7. LIMIT/OFFSET
- Finally, the LIMIT and/or OFFSET clauses are executed to keep or skip a
specified number of rows.
What is a Relation in a Database?
Before we dive into the relationship types, let’s clarify some basic terminology:
Entity: An entity represents a person, place, thing, or any item about which data
is stored in the database. In practice, entities are often synonymous with
database tables.
Relation: The term “relation” is used to describe the relationships that exist
between tables in a relational database. These relationships are established
through keys (primary and foreign keys).
Types of Relationships:
1. One-to-Many Relationship (1:N):
● In a one-to-many relationship, one record in the first table (parent table) can
be associated with multiple records in the second table (child table).
● Example: A customer can place multiple orders, but each order belongs to
only one customer.
2. One-to-One Relationship (1:1):
● In a one-to-one relationship, each record in the first table corresponds to
exactly one record in the second table.
● Example: A person’s passport information (stored in one table) is directly
related to the person (stored in another table).
3. Many-to-One Relationship (N:1):
● This is essentially the reverse of the one-to-many relationship.
● Many records in the first table can be associated with a single record in the
second table.
● Example: Multiple orders can be associated with a single customer.
4. Self-Referencing Relationship:
● This occurs when a table relates to itself.
● Example: An organizational chart where employees report to other
employees within the same table.
Keys
In relational databases, the fundamental concepts of primary and foreign keys
play pivotal roles. These keys are implemented as constraints within SQL tables.
A table may feature a primary key, a foreign key, or both, depending on its design
and relational structure.
Primary Key
A primary key serves the crucial function of uniquely identifying each record
within a table. It's essential to note that only one primary key is permitted per
table, and this constraint can be assigned to any single column or combination of
columns. Consequently, every value within the designated column(s) must be
distinct.
Conventionally, in a table structure, the primary key often resides in an ID
column, frequently accompanied by the AUTO_INCREMENT keyword. This
configuration ensures that the value automatically increments as new records are
added to the table.
Example 1
This SQL script creates a new table named "users" with columns for ID (which is
set as the primary key), first name, last name, address, and email. The ID
column is configured to automatically increment with each new record.
Example 2
Alter an existing table and set the primary key to the first_name column.
Composite key
A composite key, in the context of databases, is a combination of two or more columns in a
table that uniquely identifies a record within that table. Unlike a primary key, which consists of a
single column, a composite key comprises multiple columns. Together, these columns must be
unique to ensure the uniqueness of each row in the table.
Example
In above table Employe_ID and Department_ID columns are together acting as
Primary key ie. Composite key.
Foreign Key
A foreign key is a relational database
concept that establishes a connection
between two tables. It can be applied to
one or multiple columns and serves to link
data between these tables. In this
relationship, the table containing the
foreign key is termed the child table, while
the table containing the referenced key,
also known as the candidate key, is termed
the parent table.
This arrangement facilitates data sharing
between the two tables. Moreover, the
presence of a foreign key prevents the
insertion of invalid data into the child table
if the corresponding data is not present in
the parent table, ensuring referential
integrity within the database.
Example 1
Create a new table and turn any columns that reference IDs in other tables into foreign keys.
Example 2
Alter an existing table and create a foreign key.
Joins
In SQL, a JOIN clause aids in fetching a result set by merging data from multiple
tables using a common column present in both tables.
1. Inner Join:
● An inner join returns only the matching rows from both tables.
● It combines rows based on a common column (usually a primary key and
foreign key relationship).
2. Left Outer Join (Left Join):
● A left join returns all rows from the left table and matching rows from the
right table.
● If there’s no match in the right table, null values are included.
3. Right Outer Join (Right Join):
● A right join is similar to a left join but includes all rows from the right table.
● If there’s no match in the left table, null values are included.
4. Full Outer Join (Full Join):
● A full join returns all rows from both tables, including unmatched rows.
● Null values are included where there’s no match.
5. Cross Join (Cartesian Product):
● A cross join combines every row from the first table with every row from the
second table.
● It doesn’t require a common column.
6. Self Join:
● A self join occurs when a table is joined with itself.
● Useful for hierarchical data or comparing rows within the same table.
Consider the following example using an Employees table:
1. Column ID: Acts as the primary key.
2. Column ManagerId: Foreign key that stores values that can be matched up
with the values in Column ID.
● Each employee has a unique Id (Column X).
● The ManagerId column contains the Id of the employee’s manager.
The result will include the employee’s information along with their manager’s
name:
Set Operations
1. UNION:
The UNION operator combines result sets from two SELECT statements.
It returns unique rows (eliminates duplicates).
Example:
2. UNION ALL:
UNION ALL combines all rows from both SELECT statements (including
duplicates).
Example:
Derived Tables
A derived table (also known as an inline view or subquery) is a subquery nested
within the FROM clause of another query. It generates a temporary table-like
result set that you can use just like any other table in your SQL statements.
Derived tables are particularly handy when you need to perform additional
calculations, filtering, or transformations on existing data before joining it with
other tables.
Example 1: Basic Derived Table
Let’s start with a simple example. Consider the following query that
retrieves data from the products table:
Example 2: Joining Derived Tables
You can also join derived tables. Let’s say we have additional table orders:
In this example:
● The derived table retrieves the customerNumber, customerName and
country columns from the customers table where the customers are from
the USA.
● We join this derived table with the orders on the customerNumber.
Common Table Expressions (CTEs)
CTEs does the same work that the derived table doest but it is more readable,
self-referencing, and can be referenced multiple times.
1. Syntax of a CTE:
● The WITH keyword introduces the CTE.
● my_cte is the CTE name.
● The CTE query (inside parentheses) retrieves data from the source table.
● The main query (after the closing parenthesis) references the CTE.
Example 1: Basic Derived CTE
Taking the same example as Derived tables, retrieving data from the products
table:
Example 2: Joining Derived Tables
Just like a derived table you can also join CTE. Let’s say we have table name
orders:
In this example:
● The CTE retrieves the customerNumber, customerName and country
columns from the customers table where the customers are from the USA.
name of the CTE is cust_cte
● We join this CTE table with the orders on the customerNumber.
Example 2: Writing multiple CTE’s
In this example:
● The CTE retrieves the customerNumber, customerName and country
columns from the customers table where the customers are from the USA.
the name of the CTE is cust_cte.
● We created another CTE called ords with columns orderNumber, status,
and customerNumber from the orders table.
● We join this CTE table another called cust_cte with the orders CTE on the
customerNumber.
Inbuilt Functions
String Functions
String Functions
Name Description
ASCII Returns the equivalent ASCII value for a specific character.
CHAR_LENGTH Returns the character length of a string.
CHARACTER_
Same as CHAR_LENGTH.
LENGTH
CONCAT Adds expressions together, with a minimum of 2.
CONCAT_WS Adds expressions together, but with a separator between each
value.
Returns an index value relative to the position of a value within
FIELD
a list of values.
FIND IN SET Returns the position of a string in a list of strings.
When passed a number, returns that number formatted to
FORMAT
include commas (eg 3,400,000).
Allows you to insert one string into another at a certain
INSERT
point, for a certain number of characters.
INSTR Returns the position of the first time one string appears within
another.
LCASE Convert a string to lowercase.
Starting from the left, extract the given number of
LEFT
characters from a string and return them as another.
LENGTH Returns the length of a string, but in bytes.
LOCATE Returns the first occurrence of one string within another,
LOWER Same as LCASE.
LPAD Left pads one string with another, to a specific length.
LTRIM Remove any leading spaces from the given string.
MID Extracts one string from another, starting from any position.
Returns the position of the first time one substring appears
POSITION
within another.
REPEAT Allows you to repeat a string
Allows you to replace any instances of a substring within a
REPLACE
string, with a new substring.
REVERSE Reverses the string.
Starting from the right, extract the given number of
RIGHT
characters from a string and return them as another.
RPAD Right pads one string with another, to a specific length.
RTRIM Removes any trailing spaces from the given string.
SPACE Returns a string full of spaces equal to the amount you pass it.
STRCMP Compares 2 strings for differences
SUBSTR Extracts one substring from another, starting from any position.
SUBSTRING Same as SUBSTR
SUBSTRING_ Returns a substring from a string before the passed
INDEX substring is found the number of times equals to the passed
number.
Removes trailing and leading spaces from the given string.
TRIM
Same as if you were to run LTRIM and RTRIM together.
UCASE Convert a string to uppercase.
UPPER Same as UCASE.
Numeric Functions
Numeric Functions
Name Description
ABS Returns the absolute value of the given number.
ACOS Returns the arc cosine of the given number.
ASIN Returns the arc sine of the given number.
ATAN Returns the arc tangent of one or 2 given numbers.
ATAN2 Return the arc tangent of 2 given numbers.
AVG Returns the average value of the given expression.
Returns the closest whole number (integer) upwards from a
CEIL
given decimal point number.
CEILING Same as CEIL.
COS Returns the cosine of a given number.
COT Returns the cotangent of a given number.
COUNT Returns the amount of records that are returned by a SELECT
query.
DEGREES Converts a radians value to degrees.
DIV Allows you to divide integers.
EXP Returns e to the power of the given number.
Returns the closest whole number (integer) downwards from a
FLOOR
given decimal point number.
GREATEST Returns the highest value in a list of arguments.
LEAST Returns the smallest value in a list of arguments.
LN Returns the natural logarithm of the given number
Returns the natural logarithm of the given number, or the
LOG
logarithm of the given number to the given base
LOG10 Does the same as LOG, but to base 10.
LOG2 Does the same as LOG, but to base 2.
MAX Returns the highest value from a set of values.
MIN Returns the lowest value from a set of values.
Returns the remainder of the given number divided by the other
MOD
given number.
PI Returns PI.
Returns the value of the given number raised to the power of the
POW
other given number.
POWER Same as POW.
RADIANS Converts a degrees value to radians.
RAND Returns a random number.
ROUND Round the given number to the given amount of decimal places.
SIGN Returns the sign of the given number.
SIN Returns the sine of the given number.
SQRT Returns the square root of the given number.
SUM Returns the value of the given set of values combined.
TAN Returns the tangent of the given number.
TRUNCATE Returns a number truncated to the given number of decimal
places.
Date Functions
Date Functions
Name Description
Add a date interval (eg: 10 DAY) to a date (eg: 20/01/20)
ADDDATE
and return the result (eg: 20/01/30).
Add a time interval (eg: 02:00) to a time or datetime
ADDTIME
(05:00) and return the result (07:00).
CURDATE Get the current date.
CURRENT_DATE Same as CURDATE.
CURRENT_TIME Get the current time.
CURRENT_
Get the current date and time.
TIMESTAMP
CURTIME Same as CURRENT_TIME.
DATE Extracts the date from a datetime expression.
DATEDIFF Returns the number of days between the 2 given dates.
DATE_ADD Same as ADDDATE.
DATE_FORMAT Formats the date to the given pattern.
Subtract a date interval (eg: 10 DAY) to a date (eg:
DATE_SUB
20/01/20) and return the result (eg: 20/01/10).
DAY Returns the day for the given date.
DAYNAME Returns the weekday name for the given date.
DAYOFWEEK Returns the index for the weekday for the given date.
DAYOFYEAR Returns the day of the year for the given date.
EXTRACT Extract from the date the given part (eg MONTH for 20/01/20
= 01).
FROM DAYS Return the date from the given numeric date value.
HOUR Return the hour from the given date.
LAST DAY Get the last day of the month for the given date.
LOCALTIME Gets the current local date and time.
LOCALTIMESTAMP Same as LOCALTIME.
Creates a date and returns it, based on the given year and
MAKEDATE
number of days values.
Creates a time and returns it, based on the given hour, minute
MAKETIME
and second values.
MICROSECOND Returns the microsecond of a given time or datetime.
MINUTE Returns the minute of the given time or datetime.
MONTH Returns the month of the given date.
MONTHNAME Returns the name of the month of the given date.
NOW Same as LOCALTIME.
PERIOD_ADD Adds the given number of months to the given period.
PERIOD_DIFF Returns the difference between 2 given periods.
QUARTER Returns the year quarter for the given date.
SECOND Returns the second of a given time or datetime.
SEC_TO_TIME Returns a time based on the given seconds.
STR_TO_DATE Creates a date and returns it based on the given string and
format.
SUBDATE Same as DATE_SUB.
Subtracts a time interval (eg: 02:00) to a time or datetime
SUBTIME
(05:00) and return the result (03:00).
SYSDATE Same as LOCALTIME.
TIME Returns the time from a given time or datetime.
TIME_FORMAT Returns the given time in the given format.
TIME_TO_SEC Converts and returns a time into seconds.
TIMEDIFF Returns the difference between 2 given time/datetime
expressions.
TIMESTAMP Returns the datetime value of the given date or datetime.
Returns the total number of days that have passed from
TO_DAYS
‘00-00- 0000’ to the given date.
WEEK Returns the week number for the given date.
WEEKDAY Returns the weekday number for the given date.
WEEKOFYEAR Returns the week number for the given date.
YEAR Returns the year from the given date.
YEARWEEK Returns the year and week number for the given date.
Misc Functions
Numeric
Functions
Name Description
IN Returns the given number in binary.
BINARY Returns the given value as a binary string.
CAST Convert one type into another.
COALESCE From a list of values, return the first non-null value.
CONNECTION_ID For the current connection, return the unique connection ID.
Convert the given number from one numeric base system
CONV
into another.
CONVERT Convert the given value into the given datatype or character
set.
Return the user and hostname which was used to
CURRENT_USER
authenticate with the server.
DATABASE Get the name of the current database.
Used alongside aggregate functions (COUNT, MAX, MIN,
SUM, AVG) to group the results.
Example: Lists the number of users with active orders.
GROUP BY
SELECT COUNT(user_id), active_orders
FROM users
GROUP BY active_orders;
It’s used in the place of WHERE with aggregate functions.
Example: Lists the number of users with active orders, but only
include users with more than 3 active orders.
HAVING
SELECT COUNT(user_id), active_orders
FROM users
GROUP BY active_orders
HAVING COUNT(user_id) >
3;
IF If the condition is true return a value, otherwise return another
value.
IFNULL If the given expression equates to null, return the given value.
ISNULL If the expression is null, return 1, otherwise return 0.
For the last row which was added or updated in a table,
LAST_INSERT_ID
return the auto increment ID.
Compares the 2 given expressions. If they are equal, NULL is
NULLIF
returned, otherwise the first expression is returned.
SESSION_USER Return the current user and hostnames.
SYSTEM_USER Same as SESSION_USER.
USER Same as SESSION_USER.
VERSION Returns the current version of the MySQL powering the
database.
SQL Commands
1. DDL (Data Definition Language):
● DDL deals with defining and managing the database structure.
● It includes commands to create, modify, and delete database objects (tables, indexes,
views, etc.), but not data.
● Common DDL commands:
○ CREATE: Used to create database objects (e.g., tables, indexes, views).
○ DROP: Deletes objects from the database.
○ ALTER: Modifies the structure of the database (e.g., adding or deleting columns).
○ TRUNCATE: Removes all records from a table.
○ COMMENT: Adds comments to the data dictionary.
2. DQL (Data Query Language):
● DQL is used for querying data within schema objects.
● It includes the SELECT statement.
● DQL retrieves data from the database and imposes order upon it.
3. DML (Data Manipulation Language):
● DML deals with manipulating data within the database.
● It includes commands to insert, update, and delete data.
● Common DML commands:
○ INSERT: Adds new rows to a table.
○ UPDATE: Modifies existing data in a table.
○ DELETE: Removes data from a table.
4. DCL (Data Control Language):
● DCL controls access and security.
● It includes commands to grant or revoke permissions.
● Common DCL commands:
○ GRANT: Provides specific privileges to users or roles.
○ REVOKE: Removes specific privileges from users or roles.
5. TCL (Transaction Control Language):
● TCL manages transactions and data integrity.
● It includes commands to control the flow of transactions.
● Common TCL commands:
○ COMMIT: Saves changes to the database.
○ ROLLBACK: Undoes changes made during the current transaction.
○ SAVEPOINT: Sets a point within a transaction to which you can later roll back.
Data Types
String Data Types
String Data Types
Data Type Description
Fixed length string which can contain letters, numbers and
CHAR(size) special characters. The size parameter sets the maximum
string length, from 0 – 255 with a default of 1.
Variable length string similar to CHAR(), but with a
VARCHAR(size)
maximum string length range from 0 to 65535.
BINARY(size) Similar to CHAR() but stores binary byte strings.
VARBINARY(size) Similar to VARCHAR() but for binary byte strings.
TINYBLOB Holds Binary Large Objects (BLOBs) with a max length of 255
bytes.
Holds a string with a maximum length of 255 characters.
TINYTEXT
Use VARCHAR() instead, as it’s fetched much faster.
Holds a string with a maximum length of 65535 bytes. Again,
TEXT(size)
better to use VARCHAR().
Holds Binary Large Objects (BLOBs) with a max length of
BLOB(size)
65535 bytes.
MEDIUMTEXT Holds a string with a maximum length of 16,777,215 characters.
Holds Binary Large Objects (BLOBs) with a max length of
MEDIUMBLOB
16,777,215 bytes.
LONGTEXT Holds a string with a maximum length of 4,294,967,295
characters.
Holds Binary Large Objects (BLOBs) with a max length of
LONGBLOB
4,294,967,295 bytes.
A string object that only has one value, which is chosen from
a list of values which you define, up to a maximum of 65535
values. If a value is added which isn’t on this list, it’s replaced
ENUM(a, b, c,
with a blank value instead. Think of ENUM being similar to
etc…) HTML radio boxes in this regard.
CREATE TABLE tshirts (color ENUM(‘red’, ‘green’, ‘blue’, ‘yellow’,
‘purple’));
A string object that can have 0 or more values, which is
SET(a, b, c, etc…) chosen from a list of values which you define, up to a
maximum of 64 values. Think of SET being similar to HTML
checkboxs in this regard.
Numeric Data Types
Numeric Data Types
Data Type Description
A bit-value type with a default of 1. The allowed number of
BIT(size)
bits in a value is set via the size parameter, which can hold
values from 1 to 64.
A very small integer with a signed range of -128 to 127, and
TINYINT(size) an unsigned range of 0 to 255. Here, the size parameter
specifies the maximum allowed display width, which is 255.
Essentially a quick way of setting the column to TINYINT with a
BOOL
size of
1. 0 is considered false, whilst 1 is considered true.
BOOLEAN Same as BOOL.
A small integer with a signed range of -32768 to 32767, and
SMALLINT(size) an unsigned range from 0 to 65535. Here, the size
parameter specifies the maximum allowed display width,
which is 255.
A medium integer with a signed range of -8388608 to
MEDIUMINT(size) 8388607, and an unsigned range from 0 to 16777215. Here,
the size parameter specifies the maximum allowed display
width, which is 255.
A medium integer with a signed range of -2147483648 to
2147483647, and an unsigned range from 0 to 4294967295.
INT(size)
Here, the size parameter specifies the maximum allowed display
width, which is 255.
INTEGER(size) Same as INT.
A medium integer with a signed range of
-9223372036854775808 to 9223372036854775807, and an
BIGINT(size)
unsigned range from 0 to 18446744073709551615. Here,
the size parameter specifies the maximum allowed display
width, which is 255.
A floating point number value. If the precision (p) parameter is
between 0 to 24, then the data type is set to FLOAT(), whilst if
FLOAT(p)
its from 25 to 53, the data type is set to DOUBLE(). This
behaviour is to make the storage of values more efficient.
A floating point number value where the total digits are set by
DOUBLE(size, d) the size parameter, and the number of digits after the decimal
point is set by the d parameter.
An exact fixed point number where the total number of digits
DECIMAL(size, d) is set by the size parameters, and the total number of digits
after the decimal point is set by the d parameter.
DEC(size, d) Same as DECIMAL.
Date / Time Data Types
Date / Time Data
Types
Data Type Description
A simple date in YYYY-MM–DD format, with a supported
DATE
range from ‘1000-01-01’ to ‘9999-12-31’.
A date time in YYYY-MM-DD hh:mm:ss format,
DATETIME(fsp) By adding DEFAULT and ON UPDATE to the column definition, it
automatically sets to the current date/time.
By adding DEFAULT CURRENT_TIMESTAMP and ON UPDATE
TIMESTAMP(fsp)
CURRENT TIMESTAMP to the column definition, it
automatically sets to current date/time.
A time in hh:mm:ss format, with a supported range from
TIME(fsp)
‘-838:59:59’ to ‘838:59:59’.
YEAR A year, with a supported range of ‘1901’ to ‘2155’.
SQL Keywords
SQL Keywords
Keyword Description
Creates a new database.
CREATE
DATABASE Example: Creates a new database named innmatics.
CREATE DATABASE innomatics;
Creates a new table .
Example: Creates a new table called ‘users’ in the ‘innomatics’.
CREATE TABLE
CREATE TABLE users (
id int,
first_name varchar(255), surname varchar(255), address
varchar(255), contact_number int);
Creates a new table .
Example: Creates a new table called ‘users’ in the ‘innomatics’
database.
CREATE TABLE
CREATE TABLE users (
id int,
first_name varchar(255), surname varchar(255), address
varchar(255), contact_number int);
Adds, deletes or edits columns in a table. It can also be used to add
and delete constraints in a table, as per the above.
Example: Adds a new boolean column called ‘approved’ to a table
named ‘deals’.
ALTER TABLE
ALTER TABLE deals
ADD approved boolean;
Example 2: Deletes the ‘approved’ column from the ‘deals’ table
ALTER TABLE deals DROP COLUMN approved;
Sets a default value for a column;
Example 1 (MySQL): Creates a new table called Products which has
a name column with a default value of ‘Placeholder Name’ and an
available_ from column with a default value of today’s date.
DEFAULT CREATE TABLE products (
id int,
name varchar(255) DEFAULT 'Placeholder Name',
available_from date DEFAULT GETDATE()
);
Example 2 (MySQL): The same as above, but editing an existing
table.
ALTER TABLE products
ALTER name SET DEFAULT 'Placeholder Name', ALTER
available_from SET DEFAULT GETDATE();
Delete data from a table.
DELETE
Example: Removes a user with a user_id of 674.
DELETE FROM users WHERE user_id = 674;
Deletes a column from a table.
DROP COLUMN Example: Removes the first_name column from the users table.
ALTER TABLE users
DROP COLUMN first_name
Deletes the entire database.
DROP
DATABASE Example: Deletes a database named ‘websitesetup’.
DROP DATABASE websitesetup;
Removes a default value for a column.
DROP DEFAULT Example (MySQL): Removes the default value from the ‘name’
column in the ‘products’ table.
ALTER TABLE products
ALTER COLUMN name DROP DEFAULT;
Deletes a table from a database.
DROP TABLE
Example: Removes the users table.
DROP TABLE users;
Add new rows to a table.
INSERT INTO Example: Adds a new vehicle.
INSERT INTO cars (make, model, mileage, year) VALUES ('Audi',
'A3', 30000, 2016);
Updates existing data in a table.
Example: Updates the mileage and serviceDue values for a vehicle
UPDATE with an id of 45 in the cars table.
UPDATE cars
SET mileage = 230, serviceDue = 0 WHERE id = 45;
Delete data from a table.
DELETE
Example: Removes a user with a user_id of 675.
DELETE FROM users WHERE user_id = 675;
Used alongside UPDATE to update existing data in a table.
Example: Updates the value and quantity values for an order with
SET an id of 642 in the orders table.
UPDATE orders
SET value = 19.49, quantity = 2 WHERE id = 642;
SOME
Identical to ANY.
Used alongside SELECT to return a set number of records from a
TOP
table.
Example: Returns the top 5 users from the users table.
SELECT TOP 5 * FROM users;
Similar to DROP, but instead of deleting the table and its data, this
TRUNCATE
deletes only the data.
TABLE
Example: Empties the sessions table, but leaves the table itself
intact.
TRUNCATE TABLE sessions;
This constraint ensures all values in a column are unique.
Example 1 (MySQL): Adds a unique constraint to the id column
when creating a new users table.
UNIQUE CREATE TABLE users (
id int NOT NULL,
name varchar(255) NOT NULL, UNIQUE (id)
);
Example 2 (MySQL): Alters an existing column to add a UNIQUE
constraint.
ALTER TABLE users ADD UNIQUE (id);
Updates existing data in a table.
Example: Updates the mileage and serviceDue values for a vehicle
UPDATE with an id of 45 in the cars table.
UPDATE cars
SET mileage = 23500, serviceDue = 0 WHERE id = 45;
Used alongside the INSERT INTO keyword to add new values to a
table.
VALUES
Example: Adds a new car to the cars table.
INSERT INTO cars (name, model, year) VALUES ('Ford', 'Fiesta',
2010);