ADBMS Notes 1
ADBMS Notes 1
• Data Storage & Retrieval: Efficiently stores and fetches data using queries.
• Backup & Recovery: Protects data from loss with automatic recovery mechanisms.
Page | 1
Problems with Traditional File-Based Systems
Before the advent of modern DBMS, data was managed using basic file systems on
hard drives. While this approach allowed users to store, retrieve and update files as needed, it
came with numerous challenges.
• Data Redundancy: Same data stored in multiple files, leading to duplication and
wasted space.
• Difficult Access: Locating specific data required knowing the exact file and manually
searching.
• Poor Security: No control over who could view or edit sensitive information.
• No Multi-user Access: Only one person could use a file at a time, slowing down
teamwork.
• No Backup & Recovery: Data loss was permanent if a file was deleted or corrupted.
Components of DBMS
A DBMS is made up of six key components that work together to handle data
effectively.
Page | 2
1. Hardware
• Stores and processes data; interfaces between real-world inputs and digital systems.
• Examples: Personal computer hard disk, RAM, network devices used for DBMS
operations.
2. Software
• Includes the database engine, OS, network software, and application tools.
3. Data
• Metadata: Data about data (e.g., storage time, size, data type).
4. Procedures
• Covers setup, login/logout, data validation, backup, access control, and report
generation.
• Used to interact with the database (create, read, update, delete data).
Page | 3
6. People
• End Users – Use applications to access the database (e.g., students, employees).
Types of DBMS
There are several types of Database Management Systems (DBMS), each tailored to
different data structures, scalability requirements and application needs. The most common
types are as follows:
• Uses primary keys to uniquely identify rows and foreign keys to establish
relationships between tables.
• Queries are written in SQL (Structured Query Language), which allows for
efficient data manipulation and retrieval.
2. NoSQL DBMS
• They are designed to handle large-scale data and provide high performance for
scenarios where relational models might be restrictive.
• These flexible data models enable rapid scaling and are well-suited for unstructured
or semi-structured data.
Page | 4
3. Object-Oriented DBMS (OODBMS)
• Supports complex data types and relationships, making it ideal for applications
requiring advanced data modeling and real-world simulations.
4. Hierarchical Database
• Organizes data in a tree-like structure, where each record (node) has a single parent
and have multiple children.
• It is efficient for storing data with a clear hierarchy, such as organizational charts or
file directories.
5. Network Database
• Unlike the hierarchical model, it permits each child to have multiple parents,
enabling many-to-many relationships.
• Data is represented using records and sets, where sets define the relationships.
• It is more flexible than the hierarchical model and better suited for applications with
complex data linkages.
6. Cloud-Based Database
• They are hosted on cloud computing platforms like AWS, Azure or Google Cloud.
Page | 5
• They offer on-demand scalability, high availability, automatic backups and remote
accessibility.
Example: Amazon RDS (for SQL), MongoDB Atlas (for NoSQL), Google
BigQuery.
Database Languages
Database languages are specialized sets of commands and instructions used to define,
manipulate and control data within a database. Each language type plays a distinct role in
database management, ensuring efficient storage, retrieval and security of data. The primary
database languages include:
DDL is the short name for Data Definition Language, which deals with database
schemas and descriptions, of how the data should reside in the database.
• CREATE: to create a database and its objects like (table, index, views, store procedure,
function and triggers)
Page | 6
• ALTER: alters the structure of the existing database
• TRUNCATE: remove all records from a table, including all spaces allocated for the
records are removed
DML focuses on manipulating the data stored in the database, enabling users to retrieve,
add, update and delete data.
DCL commands manage access permissions, ensuring data security by controlling who
can perform certain actions on the database.
Page | 7
4. Transaction Control Language (TCL)
• SAVEPOINT: Sets a point within a transaction to which one can later roll back.
• SELECT: The primary DQL command, used to query data from the database without
altering its structure or contents.
Page | 8
COMPARISON BETWEEN DIFFERENT DATABASES:
RELATIONAL VS NON-RELATIONAL
There are two types of DBMSs: relational and non-relational, also referred to as SQL
and NoSQL respectively. Before discussing the most popular database options, let’s take a
closer look at how relational and non-relational database systems differ, considering commonly
used data structures, performance, scalability, and security.
Page | 9
SIGNIFICANCE OF DATABASES
Databases are fundamental to Database Management Systems (DBMS) and hold
significant importance in managing and utilizing data effectively. Their significance can be
understood through several key aspects:
Page | 10
• Reduced Data Redundancy and Anomalies:
By centralizing data storage and enforcing integrity constraints, databases help reduce
data redundancy and eliminate data anomalies, which can lead to inconsistencies and errors in
information.
In the rail route reservation framework, the information base is needed to store the
record or information of ticket appointments, status of train's appearance, and flight.
Additionally, if trains get late, individuals become acquainted with it through the information
base update.
There are many books in the library so; it is difficult to store the record of the relative
multitude of books in a register or duplicate. Along these lines, the data set administration
Page | 11
framework (DBMS) is utilized to keep up all the data identified with the name of the book,
issue date, accessibility of the book, and its writer.
3. Banking
Database the executive's framework is utilized to store the exchange data of the client
in the information base.
4. Education Sector
Presently, assessments are led online by numerous schools and colleges. They deal with
all assessment information through the data set administration framework (DBMS). In spite of
that understudy's enlistments subtleties, grades, courses, expense, participation, results, and so
forth all the data is put away in the information base.
The database Management framework is utilized for buying on charge cards and age of
month-to-month proclamations.
We all utilization of online media sites to associate with companions and to impart our
perspectives to the world. Every day, many people group pursue these online media accounts
like Pinterest, Facebook, Twitter, and Google in addition to. By the utilization of the data set
administration framework, all the data of clients are put away in the information base and, we
become ready to interface with others.
7. Broadcast communications
Without DBMS any media transmission organization can't think. The Database the
executive's framework is fundamental for these organizations to store the call subtleties and
month to month postpaid bills in the information base.
The information base administration framework is utilized for putting away data about
deals, holding and acquisition of monetary instruments, for example, stocks and bonds in a data
set.
Page | 12
9. E-Commerce Websites
These days, web-based shopping has become a major pattern. Nobody needs to visit
the shop and burn through their time. Everybody needs to shop through web-based shopping
sites, (for example, Amazon, Flipkart, Snapdeal) from home. So, all the items are sold and
added uniquely with the assistance of the information base administration framework (DBMS).
Receipt charges, instalments, buy data these are finished with the assistance of DBMS.
11. Manufacturing
Manufacturing organizations make various kinds of items and deal them consistently.
To keep the data about their items like bills, acquisition of the item, amount, inventory network
the executives, information base administration framework (DBMS) is utilized.
DBMS is used in healthcare to manage patient data, medical records, and billing
information.
14. Security
DBMS provides security features to ensure that only authorized users have access to
the data.
15. Telecommunication
Page | 13
ADVANTAGES AND DISADVANTAGES OF DIFFERENT
DATABASE MANAGEMENT SYSTEMS
Advantages of different Database Management systems
In the File System, duplicate data is created in many places because all the programs
have their own files which create data redundancy resulting in wastage of memory. In DBMS,
all the files are integrated in a single database. So there is no chance of duplicate data.
For example: A student record in a library or examination can contain duplicate values,
but when they are converted into a single database, all the duplicate values are removed.
Data security level is high by protecting your precious data from unauthorized access.
Only authorized users should have the grant to access the database with the help of credentials.
Data integrity makes unification of so many files into a single file. DBMS allows data
integrity which makes it easy to decrease data duplicity Data integration and reduces
redundancy as well as data inconsistency.
DBMS allows multiple users to access the same database at a time without any
conflicts.
• Avoidance of inconsistency.
DBMS controls data redundancy and controls data consistency. Data consistency is
nothing but if you want to update data in any files then all the files should not be updated again.
Page | 14
• Shared data
Data can be shared between authorized users of the database in DBMS. All the users
have their own right to access the database. Admin has complete access to the database. He has
a right to assign users to access the database.
• Enforcement of standards
As DBMS have central control of the database. So, a DBA can ensure that all the
applications follow some standards such as format of data, document standards etc. These
standards help in data migrations or in interchanging the data.
Unauthorized persons are not allowed to access the database because of security
credentials.
Data loss is a big problem for all the organizations. In the file system users have to back
up the files in regular intervals which lead to waste of time and resources.
DBMS solves this problem of taking backup automatically and recovery of the
database.
• Tunability
Tuning means adjusting something to get a better performance. Same in the case of
DBMS, as it provides tunability to improve performance. DBA adjusts databases to get
effective results.
• Complexity
The provision of the functionality that is expected of a good DBMS makes the DBMS
an extremely complex piece of software. Database designers, developers, database
administrators and end-users must understand this functionality to take full advantage of it.
Page | 15
Failure to understand the system can lead to bad design decisions, which leads to a
serious consequence for an organization.
• Size
The functionality of DBMS makes use of a large piece of software which occupies
megabytes of disk space.
• Performance
The centralization of resources increases the vulnerability of the system because all
users and applications rely on the availability of DBMS, the failure of any component can bring
operation to halt.
• Cost of DBMS
The cost of DBMS varies significantly depending on the environment and functionality
provided. There is also the recurrent annual maintenance cost.
Page | 16
Examples File System, MySQL, Google Spanner, Legacy Oracle
XML-based PostgreSQL, Cassandra, Systems,
DBMS Oracle DB, SQL CockroachDB Central SQL
Server DBs
Scalability Limited Limited (can High (horizontal Low (single
scale vertically) scaling across point of
nodes) expansion)
Data Location Single system Single system Stored on multiple Stored in one
locations location
Fault Low Low High (replication Low (central
Tolerance across nodes) server failure
= downtime)
Cost Low Moderate High (complex Low to
setup, hardware) moderate
Concurrency Limited Strong Strong (depends Moderate
Support concurrency on
support implementation)
Security Basic Strong (granular Complex (due to Simple (single
access control) multiple nodes) point control)
Definitions:
Page | 17
• Can be homogeneous (same DBMS at all sites) or heterogeneous (different DBMS
at different sites).
• Fault-tolerant and scalable.
• Example: Google Spanner, Cassandra.
4. Centralized Database:
• All data resides in one location (one server/system).
• Simpler to manage and secure.
• Can be a bottleneck and a single point of failure.
• Example: Early enterprise databases, small-scale systems.
Page | 18
MODULE2: RELATIONAL QUERY LANGUAGES
SQL has its own querying methods to interact with the database. But how do these
queries work in the database? These queries work similarly to Relational Algebra that we study
in mathematics. In the database, we have tables participating in relational Algebra.
Relational Database systems are expected to be equipped with a query language that
assists users to query the database. Relational Query Language is used by the user to
communicate with the database user requests for the information from the database. Relational
algebra breaks the user requests and instructs the DBMS to execute the requests. It is the
language by which the user communicates with the database. They are generally on a higher
level than any other programming language. These relational query languages can be
Procedural and Non-Procedural.
In Procedural Language, the user instructs the system to perform a series of operations
on the database to produce the desired results. Users tell what data to be retrieved from the
database and how to retrieve it. Procedural Query Language performs a set of queries
instructing the DBMS to perform various transactions in sequence to meet user requests.
Relational Algebra could be defined as the set of operations on relations. There are a
few operators that are used in relational algebra -
1. Select (sigma): Returns rows of the input relation that satisfy the provided predicate. It
is unary Operator means requires only one operand.
2. Projection (ℼ): Show the list of those attributes which we desire to appear and rest
other attributes are eliminated from the table. It separates the table vertically.
Page | 19
3. Set Difference (-): It returns the difference between two relations. If we have two
relations R and S them R-S will return all the tuples (row) which are in relation R but
not in Relation S, It is binary operator.
4. Cartesian Product (X): Combines every tuple (row) of one table with every tuple
(row) in other table, also referred as cross Product. It is a binary operator.
5. Union (U): Outputs the union of tuples from both the relations. Duplicate tuples are
eliminated automatically. It is a binary operator means it require two operands.
2. Non-Procedural Language
For Example: get the name and the contact number of the student with a Particular ID
will have a single query on STUDENT table.
Page | 20
How Does SQL Work?
Below are the detailed steps involved in the SQL query execution.
1. Input: The user submits a query (e.g., SELECT, INSERT, UPDATE, DELETE) via an
application or interface.
2. Parsing: The query processor breaks the query into parts (tokens) and checks for syntax
and schema correctness.
Page | 21
3. Optimization: The optimizer finds the most efficient way to run the query using
indexes, statistics, and available resources.
4. Execution: The execution engine runs the query using the chosen plan, accessing or
modifying the database as needed.
5. Output: Results are returned to the user — either data (for SELECT) or a success
message (for other operations).
There are certain rules for SQL which would ensure consistency and functionality
across databases. By following these rules, queries will be well formed and well executed in
any database.
Page | 22
• End with Semicolon (;): Each SQL statement must end with a semicolon to execute
properly.
• Case Insensitivity: SQL keywords (e.g., SELECT, INSERT) are not case-sensitive.
However, table/column names may be case-sensitive depending on the DBMS.
• Whitespace Allowed: Queries can span multiple lines, but use spaces between
keywords and names.
• Reserved Words: Avoid using SQL keywords as names. If needed, wrap them in quotes
(" ") or backticks (`).
Comments
• Single-line: -- comment
• Multi-line: /* comment */
Data Constraints:
Use NOT NULL, UNIQUE, PRIMARY KEY, etc., to ensure data accuracy.
String Values:
Enclose strings in single quotes ('text').
Naming Rules:
Page | 23
1. Data Definition Language
These commands are used to define the structure of database objects by creating,
altering, and dropping the database objects. Based on the needs of the business, database
engineers create and modify database objects using DDL. The CREATE command, for
instance, is used by the database engineer to create database objects like tables, views, and
indexes.
Command Description
CREATE Creates a new table, a view on a table, or some other object in the database.
ALTER Modifies an existing database object, such as a table
DROP Deletes an entire table, a view of a table, or other objects in the database
Command Description
INSERT Creates a record.
UPDATE Modifies records.
DELETE Deletes records.
Page | 24
3. Data Query Language
Data retrieval instructions are written in the data query language (DQL), which is used
to access relational databases. The SELECT command is used by software programs to filter
and return particular results from a SQL table.
Command Description
GRANT Gives a privilege to the user.
REVOKE Takes back privileges granted by the user.
Command Description
COMMIT Saves all changes made during the current transaction on a permanent
basis. Some databases provide an auto-commit feature, which can be
configured using settings.
ROLLBACK Reverts changes made during the current transaction, ensuring no
unwanted changes are saved.
SAVEPOINT Sets a point within a transaction to which changes can be rolled back,
allowing partial rollbacks
Benefits of SQL
• Efficiency: SQL is designed to handle complex queries and large datasets with optimal
performance, making data retrieval and manipulation seamless.
• Standardization: As an ANSI and ISO standard language, SQL provides a universal
method to interact with relational databases across platforms.
Page | 25
• Scalability: SQL supports databases ranging from small-scale applications to
enterprise-level systems, ensuring smooth operations regardless of size.
• Flexibility: SQL can be extended with procedural programming (e.g., PL/SQL, T-SQL)
to build complex business logic and custom functions.
Limitations of SQL
Page | 26
QUERYING MULTIPLE RELATIONS
Multiple relation queries involve the execution of queries that involve multiple tables
in a database. JOIN operations are used to combine related tables into a single result set. A
JOIN operation combines columns from two or more tables into a single table, based on a
related column between the tables. The challenges associated with multiple relation queries
and JOIN ordering arise because there are different ways to execute the queries, and the
execution order of JOIN operations can significantly affect query performance. Query
optimization is a technique used to select the most efficient execution plan for a query.
Challenges
Executing multiple relation queries can be challenging because the query optimizer
must determine the most efficient execution plan for the query. The order of JOIN operations
can significantly impact query performance, and selecting the optimal order can be a complex
process. The join selectivity, data size, and distribution, and access path are some of the factors
that can affect the performance of multiple relation queries and JOIN ordering.
Optimizing multiple relation queries and join ordering is an essential task in relational
database management systems to ensure efficient and fast query processing. Here are some
techniques for optimizing multiple relation queries and join ordering?
• Indexing: Creating indexes on columns used in join conditions to retrieve data quickly.
• Statistics: Maintaining statistics on tables to estimate the number of rows retrieved and
choose the most efficient join order.
• Query Rewrite: Rewriting queries to eliminate unnecessary joins or reduce
intermediate results for improved performance.
• Join Ordering: Choosing the most efficient join order based on estimated costs of each
join.
• Materialized Views: Precomputed results stored in the database to reduce computation
at query time.
• Partitioning: Splitting large tables into smaller pieces to reduce the amount of data
processed for each query.
• Parallel Query Processing: Allowing multiple processors to work on a query
simultaneously for improved performance.
Page | 27
Factors that Affect JOIN Ordering
Join ordering is the process of selecting the order in which to join tables in a query to
minimize the overall cost of the query. Several factors affect join ordering, including?
• Join Types: The type of join used in the query can affect join ordering. For example, a
left outer join may be more efficient if it is performed before a right outer join.
• Table Size: The size of each table in the query can affect join ordering. Joining smaller
tables before larger tables can be more efficient.
• Join Cardinality: The number of rows that match in each table for a given join
condition can affect join ordering. Joining tables with a high cardinality first can be
more efficient.
• Join Selectivity: The selectivity of a join condition, or the number of rows that match
the condition, can also affect join ordering. Joining tables with more selective join
conditions first can be more efficient.
• Indexes: The existence of indexes on the join columns can affect join ordering. Joining
tables with indexed columns first can be more efficient.
• Hardware: The hardware on which the database is running can affect join ordering.
Different join orders may be more efficient on different hardware configurations.
• Query Optimization Goals: The goals of the query optimization process can affect
join ordering. For example, if the goal is to minimize disk I/O, join ordering may be
different than if the goal is to minimize CPU usage.
Page | 28
1. One-to-One Relationship
• Definition: Each record in Table A is associated with one and only one record in Table
B, and vice versa.
• Setup: Include a foreign key in one of the tables that references the primary key of the
other table.
• For example: Tables users and user_profiles, where each user has a single
corresponding profile.
Output:
one-to-one relatonship
Page | 29
2. One-to-Many Relationship
• Definition: Each record in Table A can be associated with multiple records in Table B,
but each record in Table B is associated with only one record in Table A.
• Setup: Include a foreign key in the "many" side table (Table B) that references the
primary key of the "one" side table (Table A).
• For example: Tables departments and employees, where each department can have
multiple employees, but each employee belongs to one department.
Output:
3. Many-to-Many Relationship
• Definition: Each record in Table A can be associated with multiple records in Table B,
and vice versa.
• Setup: Create an intermediate table (also known as a junction or linking table) that
contains foreign keys referencing both related tables.
• For example: Tables students and courses, where each student can enroll in multiple
courses, and each course can have multiple students.
Page | 30
Output:
4. Many-to-One Relationship
• Definition: Multiple records in table B can be associated with one record in table A.
• Setup: Crate a Foreign key in "Many Table" that references to Primary Key in "One
Table".
• Example: Table Teachers and Courses, many courses can be taught by single teacher.
Page | 31
Output:
5. Self-Referencing Relationship
• Definition: A table has a foreign key that references its primary key.
• Setup: Include a foreign key column in the same table that references its primary key.
• For example: A table employee with a column manager_id referencing the same table's
employee_id.
Output:
Page | 32
SQL Command:
Example:
• This will completely remove the Students table along with all its records.
2. Altering Relations
Examples:
• Add a new column:
• Delete a column:
Page | 33
Example:
2. Deleting Tuples
• Refers to removing specific rows (records) from a relation (table).
SQL Command:
Example:
Page | 34
Example: Making sure every customer has a valid email address & ensuring that an order in
the database is always linked to an existing customer.
Note: These rules prevent mistakes, such as adding incomplete or incorrect data, and make
sure the database is secure and well-organized.
Types of Integrity Constraints
There are Different types of Integrity Constraints used in DBMS, these are:
1. Domain Constraints
2. Entity Integrity Constraints
3. Key Constraints
4. Referential integrity constraints
5. Assertion
6. Triggers
1. Domain Constraints
Domain constraints are a type of integrity constraint that ensure the values stored in a
column (or attribute) of a database are valid and within a specific range or domain. In simple
terms, they define what type of data is allowed in a column and restrict invalid data entry. The
data type of domain include string, char, time, integer, date, currency etc. The value of the
attribute must be available in comparable domains.
Example: Below table demonstrates domain constraints in action by enforcing rules for each
column
Page | 35
• Semester: Allows specific values like 5th, 6th, etc., and ensures valid input (e.g., no
10th if not permitted).
• Age: Must be an integer within a reasonable range (e.g., 18-30) and cannot contain
invalid data like negative numbers or text.
Example: Let, the not-null constraint be specified on the "Semester" attribute in the
relation/table given below, then the data entry of 4th tuple will violate this integrity constraint,
because the "Semester" attribute in this tuple contains null value. To make this database
instance a legal instance, its entry must not be allowed by database management system.
Entity integrity constraints state that primary key can never contain null value because
primary key is used to determine individual rows in a relation uniquely, if primary key contains
null value, then we cannot identify those rows. A table can contain null value in it except
primary key field.
Page | 36
Key Features of Entity Integrity Constraints:
• Uniqueness: The primary key value must be unique for each row in the table. No
duplicate entries are allowed in the primary key column.
• NOT NULL: The primary key column cannot contain NULL values, as every row must
have a valid identifier.
• Essential for Table Design: Ensures that every record in the table can be uniquely
identified, preventing ambiguity.
Example: It is not allowed because it is containing primary key (Student_id) as NULL value.
Student_id Name Semester Age
21CSE101 Ramesh 5th 20
21CSE102 Kamlesh 5th 21
21CSE103 Aakash 5th 22
NULL Mukesh 5th 20
3. Key Constraints
Page | 37
3.1 Primary Key Constraints
It states that the primary key attributes are required to be unique and not null. That is,
primary key attributes of a relation must not have null values and primary key attributes of two
tuples must never be same. This constraint is specified on database schema to the primary key
attributes to ensure that no two tuples are same.
Example: Here, in the below example the Student_id is the primary key attribute. The data
entry of 4th tuple violates the primary key constraint that is specifies on the database schema
and therefore this instance of database is not a legal instance.
The Unique key constraint in DBMS ensures that all values in a specified column (or
group of columns) are distinct across the table. It prevents duplicate entries, maintaining data
integrity, but unlike the primary key, it allows one NULL value.
Example: Here, in the below example the Email column has NULL value in 2nd record.
Employee_ID Email Name
1 aniket@[Link] Aniket Kumar
2 NULL Shashwat Dubey
3 shashwat@[Link] Manvendra Sharma
• Unique Values: The email column must contain unique values. aniket@[Link]
and shashwat@[Link] are valid. Adding another row with aniket@[Link]
would result in an error.
Page | 38
• Allows One NULL: The email column can contain one NULL value.
• Valid: NULL in the second row.
• Invalid: Adding another row with NULL in email will be rejected.
5. Assertion
An assertion is a declarative mechanism in a database that ensures a specific condition
or rule is always satisfied across the entire database. It is a global integrity constraint, meaning
Page | 39
it applies to multiple tables or the entire database rather than being limited to a single table or
column. An assertion in SQL-92 takes the form:
When an assertion is made, the system tests it for validity. This testing may introduce a
significant amount of overhead; hence assertions should be used with great care.
Example of an Assertion:
Explanation:
The following SQL statement creates an assertion to ensure that the total loan amount
at each branch is always less than the total account balances at the same branch.
• Purpose: This assertion enforces a global business rule. The sum of all loan amounts
for a branch must always be less than the sum of all account balances in the same
branch. This prevents branches from issuing loans beyond their financial capacity.
• Subqueries: The first subquery (SELECT SUM(amount) FROM loan) calculates the
total loan amount for each branch. The second subquery (SELECT SUM(amount)
FROM account) calculates the total balance of accounts for the same branch.
• Condition: The NOT EXISTS clause ensures there is no branch where the loan amount
is greater than or equal to the account balance.
• Behavior: If a transaction (e.g., inserting a loan or updating an account) violates this
rule, the operation will be rejected by the database.
Page | 40
6. Triggers
A trigger is a procedural statement in a database that is automatically executed in
response to certain events such as INSERT, UPDATE, or DELETE. Triggers are often used to
enforce complex integrity constraints or implement business rules that cannot be captured using
standard constraints like primary keys or foreign keys.
Explanation:
• Trigger Type: The trigger runs after an update on the account table.
• Condition: It checks if the account balance is negative after the update.
• Actions: If the balance is negative, the account balance is reset to zero. A loan is created
with the same account number as the loan number, and the loan amount is the absolute
value of the negative balance. The ABS() function ensures the loan amount is positive.
Page | 41