CHAPTER FIVE: STRUCTURED QUERY LANGUAGE (SQL)
1. Design Considerations for Data and Constraints
1.1. Design Representation of Derived Data
During the requirement analysis phase, database designers often identify attributes that contain
"derived data." This refers to information that can be calculated or figured out based on other data
already present in the database. A clear strategy for handling these attributes is essential for
database integrity.
Logical Model vs. Data Dictionary: In most professional settings, derived attributes are
excluded from the logical model to avoid clutter. However, they must be meticulously
recorded in the data dictionary so that developers know they exist and understand how they
are calculated.
Performance vs. Storage Decisions: The designer must make a pivotal choice: should the
derived value be stored physically in a table (occupying disk space), or should it be
calculated by the computer "on the fly" whenever a user asks for it? Storing it makes
retrieval faster (better performance) but requires the database to update the value whenever
the source data changes.
Documentation: Every decision regarding derived data must be documented to ensure
future maintenance is possible.
Beginner Example:
Imagine a "Sales" table with a Quantity column and a UnitPrice column. A TotalPrice column
would be "derived" (Quantity × UnitPrice).
Option A (Store it): You save the total. It’s fast to read, but if you change the price later,
you must remember to update the total too.
Option B (Calculate it): You don't save the total. You just tell the computer: "Whenever
I ask for the total, multiply Quantity by Price." This ensures the data is always 100%
accurate.
1.2. Design Enterprise Constraints
A database must follow rules. Some rules come from the software itself, but others come from
the business or "enterprise" using the database.
DBMS Dependency: The way you write these rules depends entirely on which Database
Management System (DBMS) you are using (e.g., MySQL, PostgreSQL, or SQL Server).
Implementation: These rules must be documented so that anyone managing the database
knows exactly which business policies are being enforced by the code.
2. Designing Physical Representation
Physical design is about how the data is actually organized on the computer's hard drive or SSD
(secondary storage). The goal is to make the database run as fast as possible.
Key Steps in Physical Design:
1. Analyze Transactions: Look at how often users will be reading, writing, or deleting
data.
2. Choose File Organization: Decide how the computer should physically stack the data
rows on the disk.
3. Choose Indexes: Create "search shortcuts" (indexes) so the database doesn't have to look
through millions of rows to find one piece of information.
4. Estimate System Requirements: Calculate how much disk space and RAM the database
will need to grow over the next few years.
3. Additional Design Phases
Beyond the basic structure, a complete database design includes:
User Views: Designing specific "windows" into the data so users only see what they
need.
Security Mechanisms: Deciding who has permission to see or change specific tables.
Controlled Redundancy: Sometimes, it is actually helpful to store the same piece of data
in two places to make the system faster, but this must be done very carefully.
Monitoring and Tuning: After the database is running, the designer must check its
speed and "tune" it like an engine to keep it efficient.
4. Understanding SQL (Structured Query Language)
4.1. What is SQL?
SQL (often pronounced "sequel") is a specialized language designed specifically for interacting
with Relational Database Management Systems (RDBMS). It is not a general programming
language like Python or Java; it is a "declarative" language, meaning you tell the computer what
you want, and the computer figures out how to get it.
Summary of SQL Capabilities:
Access: It acts as the key to enter and talk to the database.
Execution: It runs "queries" (questions) to find data.
Management: It allows you to add (Insert), change (Update), and remove (Delete)
records.
Usability: SQL uses English-like words, making it accessible for beginners.
4.2. Interacting with the DBMS
While many beginners use a Graphical User Interface (GUI)—which is like using a folder-and-
mouse system—this study focuses on writing raw SQL commands. Commands are more
powerful, more precise, and can be automated.
5. SQL Command Categories and Rules
5.1. The Three Primary Command Groups
SQL is organized into functional groups based on what the command does:
1. Data Definition Language (DDL): These commands build the "house" (the tables,
columns, and structures).
2. Data Manipulation Language (DML): These commands handle the "furniture" inside
the house (the actual data records).
3. Transaction Control: These manage the flow of work, ensuring that if a process starts, it
finishes correctly or cancels entirely if there is an error.
5.2. Basic Rules for Writing SQL
Case Sensitivity: SQL doesn't care if you write SELECT or select. However, most
professionals write keywords in ALL CAPS to make them stand out.
White Space: You can write a whole command on one line, or spread it across ten lines.
SQL ignores extra spaces and new lines.
The Semi-colon: Always end your SQL statements with a semi-colon (;). This tells the
database, "I am done with this command; please run it now."
6. Data Definition Language (DDL)
DDL is used to define the "schema" or structure of your database.
6.1. Core DDL Statements
CREATE TABLE: Used to build a new table from scratch.
ALTER TABLE: Used if you realize you forgot a column and need to add it later.
DROP TABLE: This deletes the entire table and all the data inside it forever. Use with
caution!
6.2. Creating Databases and Tables
To start, you create a container for your work:
CREATE DATABASE GJUniv;
Then, you define a table. You must give the table a name, name the columns, and define what
kind of data (datatype) goes into those columns.
Beginner Example:
Let’s build a table to keep track of employees.
CREATE TABLE User (
FirstName TEXT,
LastName TEXT,
UserID TEXT,
Dept TEXT,
EmpNo INTEGER,
PCType TEXT
);
6.3. Understanding Datatypes
Choosing the right datatype is crucial for saving space and preventing errors.
TEXT (or VARCHAR): Use this for names, addresses, or any words.
INTEGER: Use this for whole numbers like ID numbers or quantities.
BOOLEAN: Use this for "Yes/No" or "True/False" data.
6.4. Modifying Tables
If you need to add a "Work From Home" status to your table later:
ALTER TABLE User ADD COLUMN RemoteWorker BOOLEAN;
If you want to delete a column:
ALTER TABLE User DROP COLUMN RemoteWorker;
7. Data Manipulation Language (DML)
Once the table is built, DML is used to manage the information inside.
7.1. Inserting Data
The INSERT INTO command adds a new row to your table.
Beginner Example:
INSERT INTO User (FirstName, LastName, UserID, Dept, EmpNo, PCType)
VALUES ("Jim", "Jones", "Jjones", "Finance", 9, "Dell-Laptop");
7.2. Updating Data
The UPDATE command changes data that is already there. It almost always uses a WHERE clause.
Beginner Example:
Suppose Jim Jones moves from the Finance department to the Marketing department.
UPDATE User SET Dept = "Marketing" WHERE EmpNo = 9;
Critically, we use EmpNo = 9 because it is a unique number. If we said WHERE LastName =
"Jones", we might accidentally move every "Jones" in the company to Marketing!
7.3. Deleting Data
To remove a specific person from the table:
DELETE FROM User WHERE EmpNo = 99;
Note: This removes the person's info but leaves the table structure intact.
8. Advanced Constraints in SQL
Constraints are "guardrails" that prevent bad data from entering the system.
8.1. Check Constraints
This ensures a value falls within a specific range.
Example: CHECK (price > 0) ensures you don't accidentally list a product with a
negative price.
8.2. Not-Null Constraints
This forces a column to have data. It cannot be left blank.
Example: You wouldn't want an employee record without an EmpNo.
8.3. Unique Constraints
This ensures that no two rows have the exact same value in a column.
Example: Two different employees should never have the same UserID.
8.4. Primary Key Constraints
The Primary Key is the most important constraint. It is a column (or group of columns) that
uniquely identifies every single row. It is a combination of Unique and Not-Null.
8.5. Foreign Key Constraints
This is how tables "talk" to each other. A Foreign Key ensures that data in one table matches data
in another, maintaining what we call "Referential Integrity."