MySQL Lab Manual: SQL Basics & Joins
An INNER JOIN returns only the rows that have a match in both tables, so records without a common value in the join columns are filtered out. It is useful when a result is meaningful only if both sides are present, as in 'SELECT CUSTOMER.CUS_LNAME, INVOICE.INV_DATE FROM CUSTOMER INNER JOIN INVOICE ON CUSTOMER.CUS_CODE = INVOICE.CUS_CODE;', which lists only customers that have at least one invoice. In contrast, a LEFT JOIN returns all rows from the left table together with any matching rows from the right table, filling the right-side columns with NULL where no match exists. It is useful when every record from the main table must be preserved even if the joined table has no match, as in 'SELECT CUSTOMER.CUS_CODE, INVOICE.INV_DATE FROM CUSTOMER LEFT JOIN INVOICE ON CUSTOMER.CUS_CODE = INVOICE.CUS_CODE;', which also lists customers without invoices.
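The difference between the two joins is easiest to see on a tiny dataset. The following is a minimal sketch, assuming an in-memory SQLite database (via Python's sqlite3 module) as a stand-in for MySQL; the sample rows and the INV_NUMBER column are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT);
    CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER, INV_DATE TEXT);
    -- Hypothetical sample rows: customer 10012 has no invoice.
    INSERT INTO CUSTOMER VALUES (10011, 'Dunne'), (10012, 'Smith');
    INSERT INTO INVOICE  VALUES (1001, 10011, '2024-01-16');
""")

# INNER JOIN: only customers that have a matching invoice.
inner_rows = conn.execute("""
    SELECT CUSTOMER.CUS_LNAME, INVOICE.INV_DATE
    FROM CUSTOMER INNER JOIN INVOICE ON CUSTOMER.CUS_CODE = INVOICE.CUS_CODE
    ORDER BY CUSTOMER.CUS_CODE
""").fetchall()

# LEFT JOIN: every customer; INV_DATE is None (NULL) when there is no invoice.
left_rows = conn.execute("""
    SELECT CUSTOMER.CUS_CODE, INVOICE.INV_DATE
    FROM CUSTOMER LEFT JOIN INVOICE ON CUSTOMER.CUS_CODE = INVOICE.CUS_CODE
    ORDER BY CUSTOMER.CUS_CODE
""").fetchall()

print(inner_rows)
print(left_rows)
```

With this data the inner join yields one row and the left join yields two, the second carrying a NULL invoice date.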
The WHERE and HAVING clauses both filter records, but they act at different stages of a query. The WHERE clause filters individual rows before any grouping or aggregation is applied, so its conditions cannot refer to aggregate results. For example, 'SELECT * FROM CUSTOMER WHERE CUS_BALANCE > 200;' returns customers with a balance greater than 200. The HAVING clause filters groups after GROUP BY and the aggregate functions have been evaluated, so it is used when a condition applies to aggregated data. For example, 'SELECT EMP_TITLE, COUNT(*) FROM EMP GROUP BY EMP_TITLE HAVING COUNT(*) > 2;' counts employees per title and keeps only the titles that occur more than twice.
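The two filtering stages can be sketched as follows, again assuming in-memory SQLite in place of MySQL. The sample rows and the EMP_NUM column are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_BALANCE REAL);
    CREATE TABLE EMP (EMP_NUM INTEGER PRIMARY KEY, EMP_TITLE TEXT);
    -- Hypothetical sample rows.
    INSERT INTO CUSTOMER VALUES (1, 0.0), (2, 345.86), (3, 221.19);
    INSERT INTO EMP VALUES (1, 'Mr.'), (2, 'Mr.'), (3, 'Mr.'),
                           (4, 'Ms.'), (5, 'Ms.'), (6, 'Mrs.');
""")

# WHERE filters individual rows before any grouping happens.
where_rows = conn.execute(
    "SELECT CUS_CODE FROM CUSTOMER WHERE CUS_BALANCE > 200 ORDER BY CUS_CODE"
).fetchall()

# HAVING filters whole groups after COUNT(*) has been computed per title.
having_rows = conn.execute(
    "SELECT EMP_TITLE, COUNT(*) FROM EMP GROUP BY EMP_TITLE HAVING COUNT(*) > 2"
).fetchall()

print(where_rows)   # customers 2 and 3
print(having_rows)  # only the title that occurs three times
```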
Aggregate functions perform a calculation over a set of rows and return a single value. MAX and MIN return the highest and lowest values in a column, such as the price of the most expensive product with 'SELECT MAX(P_PRICE) FROM PRODUCT;'. AVG calculates the average of a numeric column, such as the average customer balance with 'SELECT AVG(CUS_BALANCE) FROM CUSTOMER;'. COUNT(*) counts rows, while COUNT(column) counts only the non-NULL values in that column; 'SELECT COUNT(*) FROM EMP;' returns the total number of employees. SUM adds up the values of a numeric column, as in 'SELECT SUM(LINE_UNITS) FROM LINE;', which totals the units across all rows of LINE. Apart from COUNT(*), these functions ignore NULL values. Together they condense large datasets into summaries that are easier to analyze.
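As a rough illustration, the sketch below runs all five aggregates against a single hypothetical PRODUCT table (the text above spreads them over several tables). It assumes in-memory SQLite in place of MySQL, and the V_CODE column is an assumption added to show how COUNT treats NULL.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL, V_CODE INTEGER);
    -- Hypothetical sample rows; product B has no vendor (NULL).
    INSERT INTO PRODUCT VALUES ('A', 10.0, 21344), ('B', 20.0, NULL), ('C', 30.0, 21344);
""")

summary = conn.execute("""
    SELECT MAX(P_PRICE), MIN(P_PRICE), AVG(P_PRICE), SUM(P_PRICE),
           COUNT(*),       -- counts every row
           COUNT(V_CODE)   -- counts only rows where V_CODE is not NULL
    FROM PRODUCT
""").fetchone()

print(summary)  # (30.0, 10.0, 20.0, 60.0, 3, 2)
```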
Primary keys are unique identifiers for table records. They enforce entity integrity: no two rows may share the same primary key value, and the key may not be NULL, as seen in 'CREATE TABLE student ( studentid INT PRIMARY KEY AUTO_INCREMENT, ... )'. Foreign keys link records between tables and enforce referential integrity: a foreign key value must match an existing key value in the referenced table, which is what makes joins between the tables reliable. For example, 'CREATE TABLE enrollment ( studentid INT, ... FOREIGN KEY(studentid) REFERENCES student(studentid) ON DELETE CASCADE, ... );' requires every enrollment to refer to an existing student and automatically removes a student's enrollment rows when that student record is deleted. Together, primary and foreign keys keep data accurate, reduce redundancy, and keep related records consistent with one another.
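The two integrity rules can be observed by attempting inserts that violate them. This is a minimal sketch assuming in-memory SQLite in place of MySQL: SQLite needs 'PRAGMA foreign_keys = ON' and uses INTEGER PRIMARY KEY where MySQL would use AUTO_INCREMENT. The name and course columns and the try_insert helper are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch to enforce foreign keys
conn.executescript("""
    CREATE TABLE student (studentid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrollment (
        studentid INTEGER,
        course TEXT,
        FOREIGN KEY(studentid) REFERENCES student(studentid) ON DELETE CASCADE
    );
    INSERT INTO student VALUES (1, 'Ada');
""")

def try_insert(sql, params):
    """Hypothetical helper: run an INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

dup_pk = try_insert("INSERT INTO student VALUES (?, ?)", (1, "Bob"))         # duplicate primary key
orphan = try_insert("INSERT INTO enrollment VALUES (?, ?)", (99, "SQL101"))  # no student 99
valid = try_insert("INSERT INTO enrollment VALUES (?, ?)", (1, "SQL101"))    # references student 1

print(dup_pk, orphan, valid)  # rejected rejected ok
```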
SQL JOINs retrieve data from several tables at once by combining related rows on common keys. INNER JOIN selects only the records with matching keys in both tables, such as 'SELECT CUSTOMER.CUS_LNAME, INVOICE.INV_DATE FROM CUSTOMER INNER JOIN INVOICE ON CUSTOMER.CUS_CODE = INVOICE.CUS_CODE;'. LEFT JOIN includes all records from the left table plus any corresponding matches from the right, which suits cases where the primary table is central and the supplementary data is optional. RIGHT JOIN is the mirror image: it keeps all records from the right table, for example preserving all vendor details while showing whatever products are available. FULL OUTER JOIN returns all records from both tables, with NULLs in place of the missing side, which makes gaps in the related data visible. MySQL does not support FULL OUTER JOIN directly; it can be emulated by combining a LEFT JOIN and a RIGHT JOIN with UNION. With these joins, queries can model real-world business relationships (such as customer orders or employee roles) and support integration, analysis, and reporting across interconnected tables.
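The UNION emulation of a full outer join can be sketched as follows, assuming in-memory SQLite in place of MySQL. Because older SQLite versions lack RIGHT JOIN, the second branch is written as a LEFT JOIN with the tables swapped, which returns the same rows a RIGHT JOIN would. The VENDOR and PRODUCT columns and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VENDOR  (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT);
    CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, V_CODE INTEGER);
    -- Hypothetical rows: 'Bolt Co' supplies nothing, and 'Saw' has no vendor.
    INSERT INTO VENDOR  VALUES (1, 'Acme'), (2, 'Bolt Co');
    INSERT INTO PRODUCT VALUES ('P1', 'Hammer', 1), ('P2', 'Saw', NULL);
""")

full_rows = conn.execute("""
    -- All vendors, with their products where they exist.
    SELECT V.V_NAME, P.P_DESCRIPT
    FROM VENDOR V LEFT JOIN PRODUCT P ON V.V_CODE = P.V_CODE
    UNION
    -- All products, with their vendors where they exist
    -- (equivalent to VENDOR RIGHT JOIN PRODUCT).
    SELECT V.V_NAME, P.P_DESCRIPT
    FROM PRODUCT P LEFT JOIN VENDOR V ON V.V_CODE = P.V_CODE
""").fetchall()

for row in full_rows:
    print(row)
```

The result contains the matched pair plus one unmatched row from each side. Note that UNION also collapses rows that are genuinely identical, which may or may not be wanted.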
The ALTER TABLE command modifies an existing table's structure so that the schema can follow changing requirements without the table being recreated. It can add new columns, as in 'ALTER TABLE student ADD date_of_birth DATE;', which extends the stored personal details. It can drop columns that are no longer needed, as in 'ALTER TABLE student DROP COLUMN county;'. ALTER TABLE can also add or drop constraints, such as PRIMARY KEY and FOREIGN KEY constraints, to adjust how rows are identified and how tables relate to each other. It can further change a column's data type or default value. These modifications let a database evolve alongside the application while its structural integrity is maintained.
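The add and drop operations can be tried out as in the sketch below, which assumes in-memory SQLite in place of MySQL. SQLite supports DROP COLUMN only from version 3.35, so that step is guarded; the name column and the column_names helper are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (studentid INTEGER PRIMARY KEY, name TEXT, county TEXT)")

def column_names():
    """Hypothetical helper: list the current column names of student."""
    return [row[1] for row in conn.execute("PRAGMA table_info(student)")]

# Add a column to the existing table.
conn.execute("ALTER TABLE student ADD date_of_birth DATE")
after_add = column_names()

# Drop a column; guarded because SQLite older than 3.35 lacks DROP COLUMN.
if sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE student DROP COLUMN county")
after_drop = column_names()

print(after_add)
print(after_drop)
```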
The LIKE operator provides simple pattern matching for text filtering. Unlike an equality test, LIKE accepts wildcards: '%' matches any sequence of characters (including none) and '_' matches exactly one character. For instance, 'SELECT * FROM CUSTOMER WHERE CUS_LNAME LIKE 'S%';' retrieves customers whose last names start with 'S'. Because it matches partial strings, LIKE suits queries built from incomplete input, such as search boxes, keyword detection, or filtering log entries, where the exact value is not known in advance. It is not a fuzzy match, however: the text must still fit the pattern exactly, so misspellings are not tolerated.
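The two wildcards behave as in this minimal sketch, which assumes in-memory SQLite in place of MySQL (case sensitivity of LIKE depends on the database and collation in use). The sample names are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT);
    -- Hypothetical sample rows.
    INSERT INTO CUSTOMER VALUES (1, 'Smith'), (2, 'Smyth'), (3, 'Stone'), (4, 'Olowski');
""")

def last_names(pattern):
    """Hypothetical helper: last names matching a LIKE pattern, sorted."""
    rows = conn.execute(
        "SELECT CUS_LNAME FROM CUSTOMER WHERE CUS_LNAME LIKE ? ORDER BY CUS_LNAME",
        (pattern,),
    )
    return [name for (name,) in rows]

starts_with_s = last_names("S%")   # '%' matches any sequence of characters
one_char_gap = last_names("Sm_th")  # '_' matches exactly one character

print(starts_with_s)  # ['Smith', 'Smyth', 'Stone']
print(one_char_gap)   # ['Smith', 'Smyth']
```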
The ORDER BY clause sorts the result set by one or more columns in ascending (ASC, the default) or descending (DESC) order. The LIMIT clause then restricts the number of rows returned, which reduces the amount of data sent to the client and is the usual way to ask for a "top N" result. For instance, 'SELECT * FROM PRODUCT ORDER BY P_PRICE DESC LIMIT 3;' sorts products by price in descending order and returns the three most expensive ones. Without ORDER BY, the rows that LIMIT returns are not guaranteed to be in any particular order.
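A top-three query of this kind can be sketched as follows, assuming in-memory SQLite in place of MySQL. The product codes and prices are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL)")

# Hypothetical sample rows.
products = [("P1", 109.99), ("P2", 14.99), ("P3", 39.95), ("P4", 256.99), ("P5", 5.87)]
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?)", products)

# Sort by price, highest first, then keep only the first three rows.
top_three = conn.execute(
    "SELECT P_CODE, P_PRICE FROM PRODUCT ORDER BY P_PRICE DESC LIMIT 3"
).fetchall()

for code, price in top_three:
    print(code, price)
```

With this data the query returns P4, P1, and P3, in that order.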
The SELECT statement retrieves data from tables, and the logical operators AND, OR, and NOT combine multiple conditions within a WHERE clause to filter records. AND requires all of the conditions to be true, OR requires at least one of them to be true, and NOT inverts a condition. For example, 'SELECT * FROM EMP WHERE EMP_TITLE = 'Mr.' AND EMP_AREACODE = '615';' returns employees with the title 'Mr.' in area code '615'. OR enables queries like 'SELECT * FROM CUSTOMER WHERE CUS_BALANCE > 300 OR CUS_LNAME = 'Smith';', which returns customers with a balance over 300 or the last name 'Smith'. NOT negates a condition, as in 'SELECT * FROM EMPLOYEE WHERE NOT EMP_YEARS < 10;', which returns employees with 10 or more years of service. Rows where EMP_YEARS is NULL are excluded, because a comparison with NULL is neither true nor false.
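The three operators, including the NULL behavior of NOT, can be checked with the sketch below. It assumes in-memory SQLite in place of MySQL and, for brevity, puts the employee columns on a single hypothetical EMP table; the sample rows and the EMP_NUM column are assumptions.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP (EMP_NUM INTEGER PRIMARY KEY, EMP_TITLE TEXT,
                      EMP_AREACODE TEXT, EMP_YEARS INTEGER);
    CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT, CUS_BALANCE REAL);
    -- Hypothetical sample rows; employee 3 has an unknown (NULL) EMP_YEARS.
    INSERT INTO EMP VALUES (1, 'Mr.', '615', 12), (2, 'Mr.', '901', 4),
                           (3, 'Ms.', '615', NULL), (4, 'Ms.', '615', 20);
    INSERT INTO CUSTOMER VALUES (1, 'Smith', 50.0), (2, 'Dunne', 345.86), (3, 'Orlando', 0.0);
""")

def ids(sql):
    """Hypothetical helper: first column of each result row as a list."""
    return [row[0] for row in conn.execute(sql)]

# AND: both conditions must hold.
and_ids = ids("SELECT EMP_NUM FROM EMP "
              "WHERE EMP_TITLE = 'Mr.' AND EMP_AREACODE = '615' ORDER BY EMP_NUM")
# OR: at least one condition must hold.
or_ids = ids("SELECT CUS_CODE FROM CUSTOMER "
             "WHERE CUS_BALANCE > 300 OR CUS_LNAME = 'Smith' ORDER BY CUS_CODE")
# NOT: inverts the condition; the NULL row matches neither the condition nor its negation.
not_ids = ids("SELECT EMP_NUM FROM EMP WHERE NOT EMP_YEARS < 10 ORDER BY EMP_NUM")

print(and_ids, or_ids, not_ids)  # [1] [1, 2] [1, 4]
```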
ON DELETE CASCADE is declared on a foreign key so that rows in the child table are deleted automatically when the parent row they reference is removed, which preserves referential integrity and prevents orphaned rows. It simplifies maintenance across tables that have parent-child relationships. For example, 'CREATE TABLE enrollment ( studentid INT, ... FOREIGN KEY(studentid) REFERENCES student(studentid) ON DELETE CASCADE, ... );' ensures that deleting a student's record also removes the enrollment records associated with that student. This avoids inconsistencies and manual clean-up, and it is especially useful where dependencies span several levels of tables. It must be used with care, however: a single DELETE on a parent row can silently remove large amounts of dependent data, so cascade rules should be planned during database design and applied only where losing the child rows is actually intended.
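The cascade effect can be observed directly, as in this minimal sketch. It assumes in-memory SQLite in place of MySQL (SQLite needs 'PRAGMA foreign_keys = ON' to enforce foreign keys); the name and course columns and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch to enforce foreign keys
conn.executescript("""
    CREATE TABLE student (studentid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrollment (
        studentid INTEGER,
        course TEXT,
        FOREIGN KEY(studentid) REFERENCES student(studentid) ON DELETE CASCADE
    );
    -- Hypothetical sample rows: student 1 has two enrollments, student 2 has one.
    INSERT INTO student VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO enrollment VALUES (1, 'SQL101'), (1, 'DB202'), (2, 'SQL101');
""")

before = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]

# Deleting the parent row also deletes the child rows that reference it.
conn.execute("DELETE FROM student WHERE studentid = 1")
remaining = conn.execute("SELECT studentid, course FROM enrollment").fetchall()

print(before)     # 3
print(remaining)  # [(2, 'SQL101')]
```

Only one DELETE statement is issued, yet two enrollment rows disappear with the student, which is exactly the behavior that calls for caution.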