NAME OF PROGRAM
[Link] (CSE) SEMESTER / YEAR: 5TH /3RD
SUBJECT NAME (SUBJECT CODE):
DBMS/BCS-501
Course Outcome:for example, only
SECTION-A (Very Short Answer Type
Questions) Note: 10 questions from each unit as per 2 marks.
UNIT-I
a) Distinguish between DBMS and RDBMS? CO1
b) What is data model? List the types of data model used? CO2
c) Write the difference between super key and candidate key CO2
d) Discuss three level of abstractions or schemas architecture of DBMS. CO1
e) Explain the difference between a weak and a strong entity set with example. CO2
f) Write the difference between DDL and DML. CO2
g) Why Network Data model and Hierarchical Data model are little used now? CO2
h) Name any five database systems? CO1
i) What is the significance of Physical Data Independence? CO3
j) List any four disadvantages of file system approach over database approach. CO2
a) Distinguish between DBMS and RDBMS
Aspect DBMS RDBMS
Database Management System, Relational Database Management System,
Definition
manages databases. manages relational databases.
Data Storage Stores data as files or structures. Stores data in tables with rows and columns.
Does not support relationships
Relationships Supports relationships using foreign keys.
between data.
Data Integrity constraints are not Ensures data integrity using primary and
Integrity enforced. foreign keys.
Examples XML, JSON. MySQL, PostgreSQL, Oracle.
b) What is a data model? List the types of data models used.
• A data model defines how data is stored, organized, and manipulated in a database. It
provides a framework for structuring and understanding data.
Types of Data Models:
1. Hierarchical Data Model
2. Network Data Model
3. Relational Data Model
4. Entity-Relationship (E-R) Model
5. Object-Oriented Data Model
c) Difference between Super Key and Candidate Key
Aspect Super Key Candidate Key
A set of attributes uniquely identifying A minimal super key (no redundant
Definition
a tuple. attributes).
Uniqueness Ensures tuple uniqueness. Ensures uniqueness and minimality.
Candidate keys are a subset of super
Subset Candidate keys are minimal super keys.
keys.
Example {ID, Name}, {ID}, {ID, Email} {ID}, {Email} (minimal).
d) Three Levels of Abstraction or Schemas Architecture of DBMS
1. Physical Level:
o Describes how data is physically stored in the database.
o Focuses on storage mechanisms (e.g., indexes, blocks).
2. Logical Level:
o Defines what data is stored and the relationships among the data.
o Deals with tables, fields, and records.
3. View Level:
o Represents how data is presented to users.
o Hides complexity from end-users by creating views or simplified interfaces.
e) Difference Between Weak and Strong Entity Set
Aspect Weak Entity Set Strong Entity Set
Dependency Depends on a strong entity set for existence. Independent of other entities.
Key Does not have a primary key. Has a primary key.
Symbol Represented by a double rectangle. Represented by a single rectangle.
Example Dependent (dependent on Employee). Employee (independent).
f) Difference Between DDL and DML
Aspect DDL (Data Definition Language) DML (Data Manipulation Language)
Defines structure/schema of the
Purpose Manipulates data within the database.
database.
SELECT, INSERT, UPDATE,
Commands CREATE, ALTER, DROP.
DELETE.
Modifies data without changing
Effect Changes the database schema.
schema.
Examples Create a table, delete a table. Insert or update records in a table.
g) Why Network Data Model and Hierarchical Data Model are Little Used Now?
1. Complex to design and manage compared to relational models.
2. Lack of flexibility; modifications require structural changes.
3. Querying is more difficult and less user-friendly.
4. Relational models (RDBMS) are easier to use and widely supported.
h) Name Any Five Database Systems
1. MySQL
2. PostgreSQL
3. Oracle Database
4. Microsoft SQL Server
5. MongoDB
i) Significance of Physical Data Independence
• Physical data independence allows changes in the physical storage of data without
affecting the logical structure or application programs.
• Example: Changing file format, indexing methods, or storage devices does not require
changes in queries or schema definitions.
j) Four Disadvantages of File System Approach Over Database Approach
1. Data Redundancy: Repetition of the same data across files.
2. Lack of Data Integrity: No mechanisms to enforce rules like constraints.
3. Poor Data Security: Difficult to restrict access to sensitive data.
4. Data Isolation: Data is scattered across files, making it hard to access and manage.
UNIT-II
a) What are different Integrity Constraints? CO1
b) Differentiate between SQL and PL/SQL? CO2
c) Differentiate between relational algebra and relational calculus? CO2
d) Explain different Features of SQL. CO2
e) Define aggregation with example. CO1
f) What are different relational algebra operations? CO2
g) When a relation set is called a recursive relationship set? CO2
h) What do you understand by Union Comparability? CO1
i) What do you mean by Cursor? CO3
j) What do you mean by index? CO3
a) What Are Different Integrity Constraints? (CO1)
Integrity constraints are rules that ensure the accuracy, consistency, and validity of
data stored in a database. They help maintain data integrity during updates, deletions,
or insertions.
1. Primary Key Constraint:
o Ensures that a column (or a combination of columns) uniquely identifies each
row in a table.
o A primary key cannot contain NULL values.
o Example: In a Student table, the Student_ID column is a primary key.
2. Foreign Key Constraint:
o Ensures referential integrity by linking two tables.
o The foreign key in one table refers to the primary key in another table.
o Example: In an Orders table, the Customer_ID column references the
Customer_ID in a Customers table.
3. Unique Constraint:
o Ensures that all values in a column are unique.
o Unlike the primary key, a table can have multiple unique constraints.
o Example: In an Employee table, the Email column must have unique values.
4. Not Null Constraint:
o Ensures that a column cannot contain NULL values.
o Example: In a User table, the Username column cannot be NULL.
5. Check Constraint:
o Ensures that all values in a column satisfy a specific condition.
o Example: In a Products table, the Price column must have values greater than
zero (Price > 0).
6. Default Constraint:
o Assigns a default value to a column if no value is provided during insertion.
o Example: In an Orders table, the Order_Status column can have a default
value of "Pending".
b) Differentiate Between SQL and PL/SQL (CO2)
Aspect SQL PL/SQL
SQL is a structured query PL/SQL is a procedural
language used to query language extension for
Definition
and manage relational SQL, used in Oracle
databases. databases.
To write and execute
To perform CRUD
procedural code (loops,
Purpose (Create, Read, Update,
conditions, etc.) for
Delete) operations.
complex logic.
Executes one query at a Executes blocks of code
Execution
time. with multiple statements.
Supports procedural
Procedural Lacks procedural constructs like loops,
Capability capabilities. conditions, and
exceptions.
Writing queries to Writing triggers,
Use Cases retrieve or manipulate functions, and stored
data. procedures.
Writing a stored
SELECT * FROM
Example procedure to calculate
Employees;
bonuses.
c) Differentiate Between Relational Algebra and Relational Calculus (CO2)
Aspect Relational Algebra Relational Calculus
A procedural query A non-procedural query
Definition language that specifies language that specifies
how to retrieve data. what data to retrieve.
Uses operations like
Uses logical predicates to
Syntax selection (σ), projection
describe queries.
(π), etc.
Focuses on step-by-step
Focuses on defining
Approach operations to manipulate
desired results, not steps.
data.
Includes operations like
Based on tuple or domain
Operations union, intersection, and
variables.
join.
More complex to use as it
Ease of Easier for defining
requires specifying the
Use complex queries logically.
process.
Example σ Age > 30 (Employee) `{t
d) Explain Different Features of SQL (CO2)
SQL (Structured Query Language) is a powerful language used for interacting with
relational databases. Its features include:
1. Data Definition Language (DDL):
o Used to define or modify database structure.
o Commands: CREATE, ALTER, DROP.
2. Data Manipulation Language (DML):
o Used to manipulate data in tables.
o Commands: INSERT, UPDATE, DELETE.
3. Data Query Language (DQL):
o Used to retrieve data from the database.
o Command: SELECT.
4. Transaction Control Language (TCL):
o Manages database transactions.
o Commands: COMMIT, ROLLBACK, SAVEPOINT.
5. Data Control Language (DCL):
o Manages access and permissions.
o Commands: GRANT, REVOKE.
6. Built-In Functions:
o Aggregates data using functions like SUM, AVG, COUNT.
7. Joins:
o Combines data from multiple tables.
o Types: Inner Join, Outer Join, Cross Join, etc.
e) Define Aggregation With Example (CO1)
Aggregation is a process in which relationships are treated as higher-level entities,
especially when they have attributes of their own.
Example:
• A Doctor entity is related to a Patient entity through a Treats relationship.
• If the Treats relationship has an attribute like "Treatment_Date", we use aggregation
to represent it as an entity.
f) What Are Different Relational Algebra Operations? (CO2)
Relational algebra consists of the following operations:
1. Selection (σ): Filters rows based on a condition.
Example: σ Salary > 50000 (Employee).
2. Projection (π): Selects specific columns from a table.
Example: π Name, Age (Employee).
3. Union (∪): Combines tuples from two relations.
Example: Employee ∪ Manager.
4. Set Difference (-): Finds tuples in one relation but not in the other.
Example: Employee - Retired.
5. Cartesian Product (×): Combines tuples from two relations.
Example: Employee × Department.
6. Join: Combines related tuples from two relations based on a condition.
Example: Employee ⋈ Department_ID = ID Department.
7. Intersection (∩): Finds common tuples in two relations.
8. Division (/): Divides one relation by another.
g) When Is a Relation Set Called a Recursive Relationship Set? (CO2)
A recursive relationship set occurs when an entity has a relationship with itself.
Example:
• In an Employee entity, "manages" is a recursive relationship, where one employee
manages another employee.
h) What Do You Understand by Union Comparability? (CO1)
Two relations are union-compatible if:
1. They have the same number of attributes.
2. The corresponding attributes have the same domain (data type).
Example:
• Relation A: (ID, Name)
• Relation B: (ID, Name)
These two relations are union-compatible.
i) What Do You Mean by Cursor? (CO3)
A cursor is a database object used to iterate through a result set row by row. It is often
used in stored procedures and triggers for row-level processing.
Example:
If a query retrieves multiple rows, a cursor can process each row one at a time.
j) What Do You Mean by Index? (CO3)
An index is a data structure that improves the speed of data retrieval in a table. It is
similar to the index in a book that helps locate information quickly.
Example:
Indexing a Student_Name column helps in faster searches when querying by name.
UNIT-III
a) Discuss the various anomalies associated with normalization? CO3
b) MVD is a special case of JD. Discuss? CO3
c) Why do we normalize database? CO2
d) Write different Inference Rule for Functional Dependency? CO2
e) What is Boyce-Codd Normal Form in the DBMS? CO2
f) Explain MVD with the help of suitable example. CO2
g) Define the closure of an attribute set. CO1
h) List all prime and non-prime attributes In Relation R(A,B,C,D,E) with FD set CO3
F = {AB→C, B→E, C→D}.
i) What is Functional Dependency Form in the DBMS? CO2
j) What are different between Lossless and Lossy Join Decomposition? CO3
a) Discuss the Various Anomalies Associated with Normalization?
Anomalies in databases occur when redundancy exists in the data. These anomalies are
resolved through normalization. The main anomalies are:
1. Insertion Anomaly:
o Occurs when we cannot insert data into the database without adding irrelevant
or incomplete information.
o Example: In a Student-Course table, we cannot add a new course unless a
student is also enrolled in it.
2. Deletion Anomaly:
o Occurs when deleting a piece of information also unintentionally removes other
valuable data.
o Example: If a student drops their last course, deleting that entry may also
remove information about the student.
3. Update Anomaly:
o Occurs when redundant data exists, and updating one instance requires multiple
updates in the database.
o Example: Changing a professor's name in a Course table with redundant entries
requires multiple updates.
b) MVD Is a Special Case of JD. Discuss?
A Multivalued Dependency (MVD) is a special case of a Join Dependency (JD):
• Join Dependency (JD): A relation is said to have a JD if it can be decomposed into two
or more relations such that the original relation can be reconstructed by joining them.
• MVD: A special type of JD where a non-trivial dependency exists between a set of
attributes and multiple values of another attribute.
Example:
For a relation R(A, B, C), if A →→ B holds, then each value of A is associated with multiple
values of B.
c) Why Do We Normalize a Database?
Normalization is performed to:
1. Reduce Redundancy: Eliminates duplicate data, reducing storage requirements.
2. Prevent Anomalies: Avoids insertion, deletion, and update anomalies.
3. Improve Data Integrity: Ensures data consistency by organizing it into logical
structures.
4. Simplify Queries: Improves query performance by structuring data efficiently.
5. Improve Scalability: Makes it easier to modify the database structure in the future.
d) Write Different Inference Rules for Functional Dependency?
The Armstrong's Axioms are the inference rules for functional dependencies:
1. Reflexivity Rule:
If Y ⊆ X, then X → Y.
2. Augmentation Rule:
If X → Y, then XZ → YZ for any attribute Z.
3. Transitivity Rule:
If X → Y and Y → Z, then X → Z.
4. Union Rule:
If X → Y and X → Z, then X → YZ.
5. Decomposition Rule:
If X → YZ, then X → Y and X → Z.
6. Pseudotransitivity Rule:
If X → Y and WY → Z, then WX → Z.
e) What Is Boyce-Codd Normal Form in the DBMS?
Boyce-Codd Normal Form (BCNF) is an advanced version of the Third Normal Form (3NF).
Definition: A relation is in BCNF if, for every functional dependency X → Y, X is a
superkey (i.e., it uniquely identifies all attributes in the table).
Example:
• Relation: R(A, B, C)
• FD: A → B
If A is not a superkey, the relation violates BCNF.
f) Explain MVD With the Help of a Suitable Example.
Multivalued Dependency (MVD) occurs when an attribute in a table is dependent on another
attribute, but this dependency is independent of other attributes.
Example:
Consider a relation R(Student, Hobby, Skill):
• Student →→ Hobby (A student can have multiple hobbies).
• Student →→ Skill (A student can have multiple skills).
This implies two independent multivalued dependencies.
g) Define the Closure of an Attribute Set.
The closure of an attribute set X (denoted as X⁺) is the set of all attributes that can be
functionally determined by X using a given set of functional dependencies.
Steps to Find Closure:
1. Start with the attributes in X.
2. Add attributes determined by X using the functional dependencies.
3. Repeat until no new attributes can be added.
h) List All Prime and Non-Prime Attributes in Relation R(A, B, C, D, E) with FD Set F =
{AB→C, B→E, C→D}.
• Prime Attributes: Attributes that are part of at least one candidate key.
• Non-Prime Attributes: Attributes that are not part of any candidate key.
Candidate Keys:
1. AB (determines all attributes).
Prime Attributes: A, B.
Non-Prime Attributes: C, D, E.
i) What Is Functional Dependency Form in the DBMS?
A functional dependency occurs when one attribute uniquely determines another attribute in
a relation.
Notation: X → Y
• Means: If two tuples have the same value for attribute X, they must also have the same
value for attribute Y.
Example:
In a Student table, Roll_No → Name (Roll Number uniquely determines the Name).
j) What Are the Differences Between Lossless and Lossy Join Decomposition?
Aspect Lossless Join Decomposition Lossy Join Decomposition
Ensures no data is lost when Results in extra tuples or loss of original
Definition
relations are joined. data when relations are joined.
Desirable property in database Undesirable as it compromises data
Importance
design. integrity.
A decomposition is lossless if: (R1
Condition No such condition is satisfied.
∩ R2) → R1 or R2.
Relations can be perfectly Joining relations results in incorrect or
Example
reconstructed. redundant data.
UNIT-IV
a) Define the term ACID properties? CO2
b) What are various reasons for transaction failure? CO1
c) What are serial, non-serial schedules? CO3
d) When is a transaction Rolled Back? CO2
e) Draw a state diagram and discuss the typical states that a transaction goes through during CO4
execution.
f) Discuss Consistency and Isolation property of a transaction. CO3
g) Describe how view serializability is related to conflict serializability. CO3
h) What is deadlock? CO2
i) What are distributed databases? CO2
j) What is log file? Write the steps for log based recovery of a system. CO3
a) Define the Term ACID Properties?
ACID properties are a set of rules that guarantee reliable transaction processing in a database
system. They ensure data integrity during failures or concurrent access.
1. Atomicity: Ensures that a transaction is treated as a single unit. It either completes
fully or has no effect at all.
o Example: In a banking system, transferring money from one account to another
involves two steps: debiting one account and crediting another. If one step fails,
the other must also fail.
2. Consistency: Ensures that a transaction moves the database from one valid state to
another.
o Example: If a withdrawal transaction violates the rule that an account balance
cannot go below zero, the transaction is aborted.
3. Isolation: Ensures that transactions do not interfere with each other. Intermediate
states of a transaction are invisible to others.
o Example: If two transactions are updating a record simultaneously, isolation
ensures the updates do not conflict.
4. Durability: Ensures that once a transaction is committed, its effects are permanent,
even in the event of a system failure.
o Example: After transferring funds, the changes remain even if the system
crashes immediately afterward.
b) What Are Various Reasons for Transaction Failure?
Transactions may fail due to several reasons:
1. System Errors:
o Hardware malfunctions, software bugs, or power failures cause the transaction
to stop midway.
2. Deadlocks:
o Transactions waiting for each other to release resources create a deadlock,
halting their progress.
3. Concurrency Issues:
o Conflicting operations from multiple transactions accessing the same data may
cause inconsistencies.
4. Invalid Input or Violations:
o Transactions trying to update the database with invalid or restricted data are
aborted.
5. Disk Failures:
o Data stored on a faulty disk might lead to transaction failures.
c) What Are Serial and Non-Serial Schedules?
1. Serial Schedule:
o Transactions are executed sequentially, one after the other.
o Ensures consistency but reduces concurrency.
o Example:
mathematica
Copy code
T1: Read(A), Write(A), Read(B), Write(B)
T2: Read(C), Write(C), Read(D), Write(D)
2. Non-Serial Schedule:
o Transactions are interleaved, improving concurrency.
o May lead to inconsistencies if not carefully managed.
o Example:
mathematica
Copy code
T1: Read(A), Write(A), T2: Read(C), Write(C), T1: Read(B), Write(B), T2: Read(D),
Write(D)
d) When Is a Transaction Rolled Back?
A transaction is rolled back in the following cases:
1. System Failure: Power outages or crashes occur during transaction execution.
2. Constraint Violations: The transaction violates integrity rules, such as foreign key
constraints.
3. Deadlock Detection: The system detects a deadlock and aborts one or more
transactions.
4. Explicit User Action: The user or application explicitly issues a rollback command.
Rollback ensures that the database reverts to its previous consistent state.
e) State Diagram of Transaction States and Discussion
States of a Transaction:
1. Active: The transaction is in progress and executing operations.
2. Partially Committed: The final operation of the transaction is executed, but changes
are not yet permanent.
3. Committed: All changes are permanent in the database.
4. Failed: An error or failure prevents the transaction from completing.
5. Aborted: The transaction is terminated, and changes are rolled back.
State Diagram:
css
Copy code
[Active] → [Partially Committed] → [Committed]
↓ ↑
[Failed] ← [Aborted]
Description:
• Transactions begin in the Active state.
• If completed without errors, they move to Partially Committed and then Committed.
• If a failure occurs, they move to Failed or Aborted.
f) Consistency and Isolation Property of a Transaction
1. Consistency:
o Maintains database integrity by ensuring transactions leave the database in a
valid state.
o Example: When transferring money, the total amount in the system remains
constant.
2. Isolation:
o Ensures transactions are executed independently.
o Example: If two users simultaneously update an account, isolation ensures each
transaction sees consistent data without interference.
g) Relation Between View Serializability and Conflict Serializability
1. Conflict Serializability:
o A schedule is conflict-serializable if it can be rearranged into a serial schedule
by swapping non-conflicting operations.
2. View Serializability:
o A schedule is view-serializable if it produces the same result as a serial
schedule, even if conflicting operations exist.
Relation:
• Every conflict-serializable schedule is view-serializable.
• Not all view-serializable schedules are conflict-serializable.
h) What Is Deadlock?
A deadlock occurs when two or more transactions are waiting for resources held by each
other, and none can proceed.
Example:
• Transaction T1 holds resource R1 and needs R2 to proceed.
• Transaction T2 holds resource R2 and needs R1.
This circular dependency causes both transactions to wait indefinitely.
i) What Are Distributed Databases?
A distributed database is a collection of databases located at different physical sites but
connected by a network. It provides a unified interface to users as if it were a single
database.
Advantages:
1. Improved performance through parallel query processing.
2. Higher availability and fault tolerance.
3. Scalability for growing data needs.
Examples: Google Spanner, Amazon DynamoDB.
j) What Is Log File? Steps for Log-Based Recovery
A log file is a record of all transactions and changes made to the database. It is used for
recovery in case of failure.
Steps for Log-Based Recovery:
1. Analyze Log: Identify committed and uncommitted transactions.
2. Redo Committed Transactions: Apply changes from committed transactions to ensure
durability.
3. Undo Uncommitted Transactions: Rollback changes made by uncommitted
transactions to restore consistency.
Example:
• <T1, Start>: Transaction T1 started.
• <T1, A, 100, 200>: Transaction T1 changed A from 100 to 200.
• <T1, Commit>: Transaction T1 committed.
The log helps replay or rollback transactions as needed.
UNIT-V
a) Distinguish between Shared and Exclusive Locks? CO2
b) What are Concurrent Transactions? CO3
c) Discuss Conservative 2PL and Strict 2PL CO4
d) What are various reasons for transaction failure? CO5
e) What is Lock in Transaction Management? CO1
f) List the various levels of locking? CO2
g) What is the Dirty Read Problems? CO3
h) What is Lock Compatibility Table? CO4
i) Explain two phase locking protocol? CO2
j) What is the Lost Update Problems? CO1
a) Distinguish Between Shared and Exclusive Locks?
Locks are used in transaction management to ensure data consistency during concurrent
transactions.
Shared Lock (S) Exclusive Lock (X)
Allows multiple transactions to read the Allows only one transaction to both read and
data but prevents any writing. write the data.
Non-conflicting; multiple shared locks Conflicting; no other lock can exist on the same
can coexist on the same data item. data item while an exclusive lock is held.
Example: Reading account balance. Example: Updating account balance.
b) What Are Concurrent Transactions?
Concurrent transactions refer to multiple transactions executed simultaneously in a database
system.
Advantages:
1. Improved Performance: Utilizes CPU and I/O efficiently by overlapping operations.
2. Resource Sharing: Enables multiple users to access the database concurrently.
Problems:
1. Lost Updates: One transaction overwrites changes made by another.
2. Dirty Reads: Reading uncommitted changes from another transaction.
3. Uncommitted Dependency: Changes are made visible before being finalized.
c) Discuss Conservative 2PL and Strict 2PL
1. Conservative Two-Phase Locking (C2PL):
o Prevents deadlocks by acquiring all required locks before the transaction starts.
o If all locks are not available, the transaction waits without holding any locks.
o Ensures no deadlocks but can lead to waiting delays.
2. Strict Two-Phase Locking (S2PL):
o Requires that all exclusive locks are held until the transaction commits or
aborts.
o Ensures strict serialization by avoiding cascading rollbacks.
o Guarantees strict schedules but increases locking time, reducing concurrency.
d) What Are Various Reasons for Transaction Failure?
Reasons for Transaction Failure:
1. System Errors:
o Hardware malfunctions or software bugs cause transaction termination.
2. Concurrency Issues:
o Conflicts between multiple transactions accessing the same data.
3. Deadlocks:
o Transactions waiting for each other indefinitely to release resources.
4. Invalid Operations:
o Violation of integrity constraints, such as primary key duplication.
5. Disk Failures:
o Physical damage to storage media disrupts transaction processing.
e) What Is Lock in Transaction Management?
A lock is a mechanism to control access to data during concurrent transactions, ensuring data
integrity and preventing conflicts.
Types of Locks:
1. Shared Lock: Allows reading but not writing.
2. Exclusive Lock: Allows both reading and writing.
Locks prevent anomalies like dirty reads, uncommitted dependencies, and lost updates.
f) List the Various Levels of Locking?
1. Row-Level Locking:
o Locks a specific row in a table.
o High concurrency but increases overhead.
2. Page-Level Locking:
o Locks an entire page of rows in the database.
o Balances overhead and concurrency.
3. Table-Level Locking:
o Locks the entire table.
o Simple to implement but reduces concurrency.
4. Database-Level Locking:
o Locks the entire database.
o Used for administrative tasks but not for regular transactions.
g) What Is the Dirty Read Problem?
A dirty read occurs when a transaction reads data that another transaction has modified but
not yet committed.
Example:
• Transaction T1 updates a record but has not committed.
• Transaction T2 reads this updated value.
• If T1 rolls back, the value read by T2 becomes invalid.
This leads to inconsistent data being propagated.
h) What Is Lock Compatibility Table?
A lock compatibility table specifies whether two locks can coexist on the same data item.
Table:
Lock Request Shared (S) Exclusive (X)
Shared (S) Compatible Not Compatible
Exclusive (X) Not Compatible Not Compatible
Usage:
• Helps determine if a lock can be granted based on existing locks on the data item.
i) Explain Two-Phase Locking Protocol?
The Two-Phase Locking (2PL) Protocol ensures conflict-serializability by dividing the
locking process into two distinct phases:
1. Growing Phase:
o A transaction acquires locks but does not release any.
o Continues until the transaction obtains all required locks.
2. Shrinking Phase:
o A transaction releases locks but does not acquire any new locks.
Example:
• Transaction acquires locks for rows A and B during the growing phase.
• Updates the rows and releases locks during the shrinking phase.
2PL ensures that no transaction can access inconsistent data.
j) What Is the Lost Update Problem?
The lost update problem occurs when two transactions simultaneously update the same data,
and one overwrites the other without proper synchronization.
Example:
1. Transaction T1 reads the value of a balance (1000).
2. Transaction T2 also reads the same value (1000).
3. T1 adds 500 and writes back 1500.
4. T2 subtracts 200 and writes back 800, overwriting T1’s update.
This results in the update by T1 being lost, leading to incorrect data.
SECTION-B&C
Note:10 questions from each unit as per 10 marks or 5marks.
UNIT-I
a) Compare and contrast the differences between File Processing System and DBMS? Also discuss the CO3
terms Generalization, Specialization and Aggregation with suitable example?
b) Distinguish the terms: super key, candidate key, primary key, Unique Key and foreign key with CO3
example?
c) Draw overall structure of DBMS and explain its components in brief. CO2
d) Differentiate the file system and database Management system by 10 key Factors. CO3
e) What do you mean by offset in database structure? Define the role of it in database development CO4
process.
f) What are the various types of users involved in DBMS operations? Explain each. CO3
g) State the procedural DML and non procedural DML with their differences. CO4
h) Describe the three-schema architecture. Why do we need mappings between schema levels? How do CO3
different schema definition languages support this
architecture?
i) What are the different types of Data Models in DBMS? Explain them. CO3
j) Compare Generalization, Specialization and aggregation with suitable examples. CO3
a) Compare and Contrast the Differences Between File Processing System and DBMS?
File Processing System (FPS):
1. Definition:
A traditional approach where data is stored and managed using separate files for each
application, without a centralized system.
2. Limitations:
o Redundancy: Duplicate data is stored across multiple files, wasting storage.
o Inconsistency: Difficult to maintain uniform data values due to redundancy.
o Poor Security: Limited control over access to data.
o Complex Queries: Requires manual effort to extract and process data.
Database Management System (DBMS):
1. Definition:
A software system that provides a centralized approach to manage, store, and retrieve
data efficiently.
2. Features:
o Reduced Redundancy: Centralized data storage eliminates duplication.
o Improved Consistency: Ensures uniformity through data constraints.
o Enhanced Security: Provides controlled access to data.
o Efficient Querying: Allows complex data retrieval through SQL.
Comparison Table
Feature File Processing System Database Management System
Data Redundancy High Low
Data Sharing Limited Easy and controlled
Security Minimal High
Query Language Not Available SQL or other query languages
Concurrency Not Supported Supported
Recovery Manual or absent Automatic
Generalization, Specialization, and Aggregation
1. Generalization:
o Definition: Combining two or more similar entities into a generalized entity.
o Example: Entities Car and Bike can be generalized into Vehicle.
2. Specialization:
o Definition: Dividing a generalized entity into more specific entities.
o Example: The entity Employee can be specialized into Manager and Technician.
3. Aggregation:
o Definition: Representing a relationship as a higher-level entity.
o Example: A Project involves Employees. This relationship can be aggregated
into a new entity, Team.
b) Distinguish Between Super Key, Candidate Key, Primary Key, Unique Key, and Foreign
Key
Key Type Definition Example
A set of one or more attributes that uniquely {ID}, {ID, Name} are super keys
Super Key
identify a record in a table. in a table with ID and Name.
Candidate A minimal super key; no proper subset can {ID} is a candidate key in a table
Key uniquely identify records. with ID and Name.
Primary A candidate key chosen to uniquely identify
ID is the primary key.
Key records.
Unique Ensures values are unique but allows NULL
Email column in a user table.
Key values.
An attribute linking one table to the primary OrderID in Orders table
Foreign
key of another table, maintaining referential referencing CustomerID in
Key
integrity. Customers table.
c) Overall Structure of DBMS and Its Components
Diagram (Textual Representation):
diff
Copy code
+-----------------------------------------+
| Users |
+-----------------------------------------+
| Query Processor |
+-----------------------------------------+
| Storage & Transaction Manager |
+-----------------------------------------+
| Database |
+-----------------------------------------+
Components:
1. Users:
o End-Users: Interact with the database through applications.
o Database Administrators: Manage, monitor, and maintain the database.
2. Query Processor:
o Translates user queries into database instructions.
o Includes DML Compiler (processes data manipulation commands).
3. Storage Manager:
o Manages physical storage of data.
o Components include Buffer Manager and File Manager.
4. Transaction Manager:
o Ensures data integrity during concurrent transactions.
d) File System vs. DBMS by 10 Key Factors
Feature File System Database Management System
Data Redundancy High Low
Data Integrity Weak Strong (constraints ensure integrity).
Concurrency Not Supported Supported (ensures consistency).
Security Limited Robust
Query Language Absent SQL provides flexible querying.
Backup Manual Automated
Recovery Difficult or manual Automatic (log-based recovery).
Data Consistency Hard to maintain Ensured through constraints.
Storage Management Decentralized Centralized and efficient.
Flexibility Minimal High (supports multiple data models).
e) What Do You Mean by Offset in Database Structure? Define Its Role in Database
Development Process
1. Definition of Offset:
An offset is the position or location of a data record or field relative to the start of a
data structure or file.
2. Role in Database Development:
o Efficient Access: Offset helps to directly locate a record or field in large
datasets.
o Indexing: Used in index structures to point to the exact memory location of
data.
o Database Design: Helps in organizing data for optimized access patterns.
Example:
In a fixed-length record structure, if each record is 100 bytes and the required record is the
5th one, its offset is 4×100=4004 \times 100 = 4004×100=400.
f) What are the various types of users involved in DBMS operations? Explain each.
There are several types of users involved in DBMS operations, each with a specific role:
1. Database Administrators (DBAs):
o DBAs are responsible for the overall management of the database system. They
handle tasks such as creating and maintaining the database, configuring security
policies, managing user access, ensuring backup and recovery, and optimizing
performance.
2. End Users:
o End users interact with the database to perform operations like inserting,
updating, or querying data. There are two types of end users:
▪ Casual Users: These users access the database occasionally, often
through an interface that doesn't require much technical knowledge (e.g.,
web forms).
▪ Naive Users: These users interact with the database through a predefined
interface (like an application) and typically don't require much knowledge
of how the system works.
▪ Power Users: These users can use more advanced database
functionalities like running complex queries or using advanced features in
the database system.
3. Application Programmers:
o These users write programs to interact with the database system. They use
programming languages like Java, Python, or SQL to develop software
applications that rely on data stored in the DBMS.
4. System Analysts:
o System analysts analyze the system's requirements and design the database
structure. They act as intermediaries between users and the DBMS, ensuring
that user needs are translated into database requirements.
5. Database Designers:
o Database designers are responsible for designing the schema of the database,
determining how data will be stored and related within the system, and ensuring
that the structure meets performance and integrity requirements.
g) State the procedural DML and non-procedural DML with their differences.
Procedural DML:
• In procedural DML (Data Manipulation Language), the user specifies how to retrieve
or modify data. The user provides a detailed set of operations to be performed on the
database.
• Example: SQL SELECT queries using a cursor or iterating through rows with specific
commands.
• Example: A user might write a sequence of instructions to retrieve data from a
database based on certain conditions.
Non-Procedural DML:
• In non-procedural DML, the user specifies what data is needed without having to
describe the specific procedures to retrieve or modify that data. The DBMS decides
how to execute the query efficiently.
• Example: SQL queries like SELECT, INSERT, UPDATE, where you just specify
what you want, and the DBMS determines how to achieve it.
• Example: "SELECT * FROM Employees WHERE age > 30" simply tells the DBMS
to retrieve all employees over 30 years old, without specifying how to do it.
Differences:
1. Nature of Query: Procedural DML focuses on how to perform operations, while non-
procedural DML focuses on what operations to perform.
2. Complexity: Procedural DML is more complex, requiring step-by-step instructions,
whereas non-procedural DML is simpler and more user-friendly.
3. Efficiency: Non-procedural DML is generally more efficient, as the DBMS optimizes
the execution of the query.
h) Describe the three-schema architecture. Why do we need mappings between schema
levels? How do different schema definition languages support this architecture?
Three-Schema Architecture:
• The three-schema architecture is a framework that separates the user views, logical
structure, and physical storage of data. It consists of three levels:
1. Internal Schema (Physical Schema):
o This level defines how the data is physically stored in the database. It is
concerned with the storage of data on hardware and involves concepts like file
organization and indexing.
2. Conceptual Schema (Logical Schema):
o This level defines the logical structure of the entire database, including the
relationships between different data entities. It is independent of the physical
storage and deals with the semantics of the data.
3. External Schema (View Level):
o This level defines how the data is viewed by different users. It includes multiple
user views, which can vary depending on the role of the user. For example, a
manager might have a different view of the data than a regular employee.
Need for Mappings Between Schema Levels:
• Mappings are necessary to maintain independence between the three schemas. This
allows:
o Data Independence: Changes at one level do not affect other levels. For
instance, changing the physical storage in the internal schema doesn't affect the
conceptual schema or external schemas.
o Flexibility: Users can interact with the database without worrying about its
internal structure.
o Security and Optimization: Different user views (external schemas) can be
optimized without altering the underlying database structure.
Schema Definition Languages:
• Data Definition Language (DDL): Used to define the internal, conceptual, and
external schemas. DDL allows the creation, alteration, and deletion of database
structures.
• Mapping Language: Used to specify mappings between the schemas. The DBMS
uses these mappings to translate between user views (external schema), the logical
model (conceptual schema), and physical storage (internal schema).
i) What are the different types of Data Models in DBMS? Explain them.
1. Hierarchical Model:
• The data is organized in a tree-like structure where each record has a single parent and
possibly multiple children. This model is efficient for representing data with a clear
hierarchy, such as an organizational structure.
• Example: A company's organizational chart.
2. Network Model:
• Similar to the hierarchical model but allows more complex relationships with multiple
parent nodes. This model uses pointers to create more flexible relationships between
records.
• Example: A network of cities connected by various routes.
3. Relational Model:
• The data is stored in tables (called relations), where each table consists of rows
(tuples) and columns (attributes). It is the most widely used data model due to its
simplicity and ease of use.
• Example: A table storing student information with columns for name, age, and
address.
4. Object-Oriented Model:
• The data is represented as objects, similar to how data is represented in object-oriented
programming. Each object contains data and methods that operate on the data.
• Example: A database for managing shapes, where each shape is an object with
attributes (like color) and methods (like calculating area).
5. Entity-Relationship Model (ER Model):
• This model represents data using entities (objects) and relationships between them. It
is often used for database design and conceptual schema representation.
• Example: An entity for students, with relationships to courses they are enrolled in.
6. Document Model:
• Data is stored in the form of documents, typically in JSON or XML format. This
model is commonly used in NoSQL databases.
• Example: A collection of blog posts, where each post is a document containing text,
images, and comments.
7. Key-Value Model:
• In this model, data is stored as a collection of key-value pairs, with a unique key used
to retrieve a corresponding value. It is widely used in NoSQL databases.
• Example: A cache system where each key is a URL and the value is the cached page
content.
j) Compare Generalization, Specialization, and Aggregation with suitable examples.
Generalization:
• Generalization is the process of abstracting common characteristics from multiple
entities to create a higher-level, generalized entity.
• Example: In a university database, "Professor" and "Lecturer" can be generalized into
a "Teacher" entity because both share common attributes like name and employee ID.
Specialization:
• Specialization is the process of dividing a higher-level entity into lower-level entities
based on certain attributes.
• Example: A "Person" entity can be specialized into "Employee" and "Student"
entities, where each entity has attributes specific to its type (e.g., Employee has salary,
Student has grades).
Aggregation:
• Aggregation is the process of treating a relationship set as an entity in itself. It is used
when there are complex relationships between entities that need to be represented as a
single unit.
• Example: In a company database, an "Employee" is related to a "Department," and
the relationship between "Employee" and "Department" can be treated as a new entity
called "DepartmentAssignment" to handle the assignment of employees to
departments.
Comparison:
• Generalization is about abstracting common features, Specialization is about
creating sub-entities, and Aggregation is about treating relationships as entities.
UNIT-II
a) Consider the following Scheme: CO6
SUPPLIER (SUPPLIER ID, SUPPLIER_NAME, SUPPLIER_ADDRESS)
PARTS (PART ID, PART_NAME, COLOR)
CATALOG (SUPPLIER ID, PART ID, COST)
Write the following queries in Relational Algebra and in SQL:
(i) Find the name of the suppliers who supply Black Parts.
(ii) Find the name of suppliers who supply both Blue and Black Parts.
(iii) Find the name of suppliers who supply all Parts.
b) Student (RollNo, Name, Father_ Name, Branch) CO6
Book (ISBN, Title, Author, Publisher)
Issue (RollNo, ISBN, Date-of –Issue)
Write the following queries in SQL and relational algebra:
I. List roll number and name of all students of the branch ‘CSE’.
II. Find the name of student who has issued a book published by ‘ABC’publisher
III. List title of all books and their authors issued to a student ‘RAM’.
IV. List title of all books issued on or before December 1, 2020.
V. List all books published by publisher ‘ABC’.
c) What are the symbols used in E-R diagram? Construct an E-R diagram for a car Insurance company CO6
whose customers own one or more cars each. Each car has Associated with it zero to any number of
recorded accidents? Also convert the E-R Diagram into tables?
d) Explain the concepts of natural join? Also discuss the types of Outer join with suitable example? CO4
e) Write difference between Cross Join, Natural Join, left outer join and right outer join with suitable CO3
example.
f) Explain different types of Triggers in SQL/PL SQL. CO4
g) A database is being constructed to keep track of the teams and games of a sportleague. A team has a CO5
number of players, not all of whom participate in each game. It is desired to keep track of players
participating in each game for each team, the positions they play in that game and the result of the
game.
(i) Design an E-R schema diagram for this application.
(ii) Map the E-R diagram into relational model
h) Discuss the concept of Trigger with a suitable example? Also differentiate between Views and CO3
Indexes?
i) What is Relational Algebra? Explain Different Operations of Relational Algebra with Example. CO3
j) What are the different types of domains used in relational model? Discuss the referential integrity CO3
constraint with suitable example.
l What do you mean by view? Explain it with an example. How are they Implemented in DBMS? CO4
m What is Aggregate Function in SQL? Write SQL query for different Aggregate Function. CO3
n Explain the following with example CO2
(i) indexes
(ii) Views
a) Queries in Relational Algebra and SQL
Given Schema:
• SUPPLIER (SUPPLIER_ID, SUPPLIER_NAME, SUPPLIER_ADDRESS)
• PARTS (PART_ID, PART_NAME, COLOR)
• CATALOG (SUPPLIER_ID, PART_ID, COST)
i) Find the name of the suppliers who supply Black Parts.
Relational Algebra:
objectivec
Copy code
π SUPPLIER_NAME (σ COLOR='Black' (PARTS) ⨝ CATALOG ⨝ SUPPLIER)
• Explanation: Select parts that are black, join with the catalog to get the supplier information, and
then project the supplier names.
SQL:
sql
Copy code
SELECT SUPPLIER_NAME
FROM SUPPLIER S, PARTS P, CATALOG C
WHERE S.SUPPLIER_ID = C.SUPPLIER_ID
AND P.PART_ID = C.PART_ID
AND [Link] = 'Black';
ii) Find the name of suppliers who supply both Blue and Black Parts.
Relational Algebra:
less
Copy code
π SUPPLIER_NAME ((σ COLOR='Blue' (PARTS) ⨝ CATALOG ⨝ SUPPLIER) ∩ (σ COLOR='Black'
(PARTS) ⨝ CATALOG ⨝ SUPPLIER))
• Explanation: Find suppliers who supply both blue and black parts by intersecting the results of two
selections.
SQL:
sql
Copy code
SELECT SUPPLIER_NAME
FROM SUPPLIER S
WHERE S.SUPPLIER_ID IN (
SELECT C.SUPPLIER_ID
FROM CATALOG C, PARTS P
WHERE P.PART_ID = C.PART_ID AND [Link] = 'Blue'
)
AND S.SUPPLIER_ID IN (
SELECT C.SUPPLIER_ID
FROM CATALOG C, PARTS P
WHERE P.PART_ID = C.PART_ID AND [Link] = 'Black'
);
iii) Find the name of suppliers who supply all Parts.
Relational Algebra:
scss
Copy code
π SUPPLIER_NAME (SUPPLIER) - π SUPPLIER_NAME ( (SUPPLIER ⨝ CATALOG ⨝ PARTS) - (π
PART_ID (PARTS) - π PART_ID (SUPPLIER ⨝ CATALOG)))
• Explanation: Select suppliers who are not missing any part by subtracting the suppliers who don't
supply all parts from the total list.
SQL:
sql
Copy code
SELECT SUPPLIER_NAME
FROM SUPPLIER S
WHERE NOT EXISTS (
SELECT PART_ID
FROM PARTS P
WHERE NOT EXISTS (
SELECT 1
FROM CATALOG C
WHERE C.PART_ID = P.PART_ID AND C.SUPPLIER_ID = S.SUPPLIER_ID
)
);
b) Student, Book, and Issue Queries in SQL and Relational Algebra
Given Schema:
• Student (RollNo, Name, Father_Name, Branch)
• Book (ISBN, Title, Author, Publisher)
• Issue (RollNo, ISBN, Date-of-Issue)
I. List roll number and name of all students of the branch ‘CSE’.
Relational Algebra:
arduino
Copy code
π RollNo, Name (σ Branch='CSE' (Student))
SQL:
sql
Copy code
SELECT RollNo, Name
FROM Student
WHERE Branch = 'CSE';
II. Find the name of the student who has issued a book published by ‘ABC’ publisher.
Relational Algebra:
arduino
Copy code
π Name (σ Publisher='ABC' (Book) ⨝ Issue ⨝ Student)
SQL:
sql
Copy code
SELECT DISTINCT [Link]
FROM Student S, Book B, Issue I
WHERE [Link] = [Link]
AND [Link] = [Link]
AND [Link] = 'ABC';
III. List title of all books and their authors issued to a student ‘RAM’.
Relational Algebra:
arduino
Copy code
π Title, Author (σ Name='RAM' (Student) ⨝ Issue ⨝ Book)
SQL:
sql
Copy code
SELECT [Link], [Link]
FROM Book B, Issue I, Student S
WHERE [Link] = 'RAM'
AND [Link] = [Link]
AND [Link] = [Link];
IV. List title of all books issued on or before December 1, 2020.
Relational Algebra:
javascript
Copy code
π Title (σ Date-of-Issue <= '2020-12-01' (Issue ⨝ Book))
SQL:
sql
Copy code
SELECT [Link]
FROM Book B, Issue I
WHERE [Link] = [Link]
AND I.Date_of_Issue <= '2020-12-01';
V. List all books published by publisher ‘ABC’.
Relational Algebra:
arduino
Copy code
π Title, Author (σ Publisher='ABC' (Book))
SQL:
sql
Copy code
SELECT Title, Author
FROM Book
WHERE Publisher = 'ABC';
c) Symbols Used in E-R Diagram & Construct an E-R Diagram for Car Insurance
Symbols in E-R Diagram:
• Entity: Represented by a rectangle.
• Attribute: Represented by an oval.
• Relationship: Represented by a diamond.
• Primary Key: Underlined attribute.
• Weak Entity: Double rectangle.
• Multi-valued Attribute: Double oval.
• Derived Attribute: Dashed oval.
E-R Diagram for Car Insurance:
• Entities:
o Customer (CustomerID, Name, Address)
o Car (CarID, Model, RegistrationNo)
o Accident (AccidentID, Date, Description)
• Relationships:
o Owns (Customer → Car)
o AssociatedWith (Car → Accident)
Tables (Relational Model):
1. Customer (CustomerID, Name, Address)
2. Car (CarID, Model, RegistrationNo, CustomerID)
3. Accident (AccidentID, Date, Description, CarID)
d) Natural Join
• A Natural Join automatically joins two tables based on all columns with the same name and
compatible data types.
• Example: Joining Employee and Department tables based on DeptID.
sql
Copy code
SELECT *
FROM Employee NATURAL JOIN Department;
Outer Joins:
1. Left Outer Join: Includes all records from the left table and matching records from the right table.
Non-matching rows will have NULLs for the right table.
2. Right Outer Join: Includes all records from the right table and matching records from the left table.
Non-matching rows will have NULLs for the left table.
3. Full Outer Join: Includes all records from both tables. Non-matching rows will have NULLs where
there is no match.
e) Difference Between Cross Join, Natural Join, Left Outer Join, and Right Outer Join
Join Type Description Example Query
Returns the Cartesian product of two tables
Cross Join SELECT * FROM A CROSS JOIN B;
(every combination of rows).
Automatically joins tables based on common SELECT * FROM A NATURAL JOIN
Natural Join
columns with the same name. B;
Left Outer Returns all rows from the left table and matching SELECT * FROM A LEFT OUTER
Join rows from the right table. JOIN B ON [Link] = [Link];
Right Outer Returns all rows from the right table and SELECT * FROM A RIGHT OUTER
Join matching rows from the left table. JOIN B ON [Link] = [Link];
f) Types of Triggers in SQL/PL SQL
• DML Triggers: Triggered by Data Manipulation Language (INSERT, UPDATE, DELETE)
operations.
o Before Trigger: Fired before a DML operation is executed.
o After Trigger: Fired after a DML operation is executed.
• DDL Triggers: Triggered by Data Definition Language (CREATE, ALTER, DROP) operations.
• LOGON/LOGOFF Triggers: Triggered when a user logs in or logs out.
• Compound Triggers: Combination of multiple trigger actions (before and after) in a single trigger.
g) Team and Game E-R Schema
(i) E-R Diagram for Sport League:
• Entities:
o Team (TeamID, TeamName)
o Player (PlayerID, Name, Position)
o Game (GameID, Date, Result)
• Relationships:
o Participates (Team → Player)
o Plays (Player → Game)
o GameResult (Game → Team)
(ii) Relational Model:
1. Team (TeamID, TeamName)
2. Player (PlayerID, Name, TeamID)
3. Game (GameID, Date, Result)
4. Participates (PlayerID, GameID, Position)
5. GameResult (GameID, TeamID, Result)
h) Concept of Trigger
A trigger is a set of SQL statements that automatically execute in response to a specific event (INSERT,
UPDATE, DELETE) on a table or view.
Example:
sql
Copy code
CREATE TRIGGER UpdateSalary
AFTER INSERT ON Employee
FOR EACH ROW
BEGIN
UPDATE Employee
SET Salary = Salary + 1000
WHERE EmployeeID = [Link];
END;
Difference Between Views and Indexes:
• View: A virtual table created by a query that represents data from one or more tables.
• Index: A database object that improves query performance by providing quick access to rows in a
table.
i) Relational Algebra
Relational Algebra is a procedural query language used to query relational databases. It consists of a set of
operations to retrieve data from relations (tables).
Operations:
1. Selection (σ): Selects rows that satisfy a condition.
2. Projection (π): Selects specific columns.
3. Union (∪): Combines results from two relations.
4. Difference (−): Returns rows from one relation that are not in another.
5. Cartesian Product (×): Combines every row of one relation with every row of another.
6. Join (⨝): Combines rows from two relations based on a condition.
j) Types of Domains and Referential Integrity
• Types of Domains:
o Integer, String, Date, Boolean, etc.
• Referential Integrity Constraint: Ensures that foreign keys in a table match primary keys in
another table.
o Example: If Order has a CustomerID as a foreign key, it must refer to an existing
CustomerID in the Customer table.
l) View and Implementation
• A View is a virtual table created by a query on one or more tables. It can simplify complex queries
and restrict access to certain data.
SQL to Create a View:
sql
Copy code
CREATE VIEW EmployeeView AS
SELECT Name, Salary
FROM Employee
WHERE Salary > 50000;
m) Aggregate Functions in SQL
• Aggregate Functions perform calculations on multiple rows to return a single value. Examples
include:
o COUNT(): Counts the number of rows.
o SUM(): Adds up values.
o AVG(): Computes the average.
o MAX(): Returns the maximum value.
o MIN(): Returns the minimum value.
sql
Copy code
SELECT COUNT(*), AVG(Salary)
FROM Employee;
n) Indexes and Views
1. Indexes:
o Index is a database object that speeds up retrieval of rows by providing efficient search
mechanisms.
o Example: Index on the EmployeeID column to speed up searches by EmployeeID.
2. Views:
o A View is a virtual table based on the result of a SELECT query. It simplifies complex
queries and can provide data security by restricting access to certain columns.
o Example: A view showing employee names and salaries for HR use only.
4o mini
UNIT-III
a) Define partial functional dependency. Consider the following two sets of functional dependencies F= CO4
{A ->C, AC ->D, E ->AD, E ->H} and G = {A ->CD, E ->AH}. Check whether or not they are
equivalent.
b) Define Minimal Cover. Suppose a relation R (A,B,C) has FD set F = {A→B, B→C, A→C, AB→B, CO3
AB→C, AC→B} convert this FD set into minimal cover.
c) A set of FDs for the relation R{A, B, C, D, E, F} is {AB →C, C → DE, E →F,F → a} Find What is CO3
highest normal form
d) What is Functional Dependency? Explain the procedure of calculating the Canonical Cover of a given CO5
Functional Dependency Set with suitable example
e) Consider the relation R(a,b,c,d) with Set F={a→c,b→d}. Decompose this relation in 2 NF. CO4
f) Explain the Loss Less Decomposition with example. CO3
g) What are RAT axioms? Also discuss the algorithm for finding the closure of functional dependency CO3
with a suitable example?
h) What do you mean by Loss-Less Join Decomposition? Explain with suitable example that how CO2
functional dependency can be used to show that decompositions are loss-less?
i) De Given the following set of FDs on schema R (V,W,X,Y,Z) CO5
{Z→V, W→Y, XY→Z, V→WX}State whether the following decomposition are loss-less-join
decompositions or not.
(i) R1=(V,W,X) , R2=(V,Y,Z)
(ii) R1=(V,W,X), R2=(X,Y,Z)
j) What is the purpose of Normalization? Explain 1NF, 2NF, 3NF and BCNF with suitable example? CO3
a) Define Partial Functional Dependency. Consider the following two sets of functional
dependencies F= {A -> C, AC -> D, E -> AD, E -> H} and G = {A -> CD, E -> AH}. Check
whether or not they are equivalent.
Answer:
Partial Functional Dependency:
Partial functional dependency occurs when a non-prime attribute is functionally dependent
on only part of a candidate key, rather than the whole candidate key. In simpler terms, it
happens when a non-prime attribute is dependent on a subset of the candidate key instead of
the entire key.
Checking Equivalence of F and G:
• F = {A -> C, AC -> D, E -> AD, E -> H}
• G = {A -> CD, E -> AH}
To check if two sets of functional dependencies are equivalent, we need to see if each set
can be derived from the other.
From F to G:
1. From E -> AD and E -> H in F, we can derive E -> AH (which is part of G).
2. From A -> C and AC -> D, we can derive A -> CD.
From G to F:
1. From A -> CD, we can break it into A -> C and A -> D (both of which are in F).
2. From E -> AH, we can split it into E -> A and E -> H.
Since both sets can derive each other, F and G are equivalent.
b) Define Minimal Cover. Suppose a relation R (A,B,C) has FD set F = {A→B, B→C,
A→C, AB→B, AB→C, AC→B} convert this FD set into minimal cover.
Answer:
Minimal Cover (Canonical Cover):
A minimal cover (or canonical cover) of a set of functional dependencies is a simplified
version of the set that contains no redundant dependencies. To convert a set of functional
dependencies to its minimal cover, follow these steps:
1. Remove extraneous attributes from the left-hand side of each functional dependency.
2. Remove redundant functional dependencies.
Given the FD set:
F = {A → B, B → C, A → C, AB → B, AB → C, AC → B}
Step 1: Remove extraneous attributes
• AB → B: Since B → C is in the set, AB → B is redundant, so remove it.
• AB → C: Since A → C is already in the set, AB → C is redundant, so remove it.
• AC → B: Since A → B is already in the set, AC → B is redundant, so remove it.
Step 2: Final Minimal Cover
After removing redundant dependencies, the minimal cover for F is:
• A→B
• B→C
• A→C
Thus, the minimal cover is: {A → B, B → C, A → C}
c) A set of FDs for the relation R{A, B, C, D, E, F} is {AB →C, C → DE, E →F, F → a}
Find What is the highest normal form.
Answer:
To find the highest normal form (HNFs) of a relation, we need to check for violations of
each normal form (1NF, 2NF, 3NF, BCNF).
Given the set of functional dependencies:
• AB → C
• C → DE
• E→F
• F→a
Let's analyze step by step.
1. 1NF (First Normal Form):
o The relation is in 1NF if all attributes contain atomic values (i.e., no multi-
valued attributes).
o Assuming all attributes are atomic, the relation is in 1NF.
2. 2NF (Second Normal Form):
o The relation is in 2NF if it is in 1NF and there is no partial dependency on a
composite key.
o Here, AB → C is a functional dependency where AB is a composite key, and
there are no partial dependencies. So, the relation is in 2NF.
3. 3NF (Third Normal Form):
o The relation is in 3NF if it is in 2NF and there are no transitive dependencies
between non-prime attributes.
o C → DE and E → F are transitive dependencies involving non-prime attributes
(C, E). Therefore, the relation is not in 3NF.
4. BCNF (Boyce-Codd Normal Form):
o The relation is in BCNF if for every functional dependency X → Y, X is a
superkey.
o F → a violates BCNF because F is not a superkey. Hence, the relation is not in
BCNF.
The highest normal form is 2NF.
d) What is Functional Dependency? Explain the procedure of calculating the Canonical
Cover of a given Functional Dependency Set with a suitable example.
Answer:
Functional Dependency (FD):
A functional dependency (FD) is a constraint between two sets of attributes in a relation,
stating that the value of one attribute (or a set of attributes) determines the value of another
attribute. Specifically, for a functional dependency X → Y, if two tuples have the same
value for attribute(s) in X, they must have the same value for attribute(s) in Y.
Procedure for Calculating the Canonical Cover:
To calculate the canonical cover of a given set of functional dependencies, follow these
steps:
1. Remove extraneous attributes from the left-hand side of each FD.
2. Remove redundant functional dependencies.
3. Decompose the dependencies so that the right-hand side contains only a single
attribute.
Example: Given the FD set:
• F = {A → B, B → C, A → C, AB → B, AB → C, AC → B}
Step 1: Remove extraneous attributes:
• AB → B: Since B → C is already in the set, AB → B is redundant, so remove it.
• AB → C: Since A → C is already in the set, AB → C is redundant, so remove it.
• AC → B: Since A → B is already in the set, AC → B is redundant, so remove it.
Step 2: Minimal Cover: After removing redundant dependencies, the minimal cover is:
• A→B
• B→C
• A→C
Thus, the canonical cover is: {A → B, B → C, A → C}.
e) Consider the relation R(a,b,c,d) with Set F={a→c,b→d}. Decompose this relation in 2NF.
Answer:
To decompose a relation into 2NF, it must first satisfy 1NF (atomic values). Then, there
should be no partial dependencies of non-prime attributes on part of a candidate key.
Given:
• Relation R(a, b, c, d)
• FD set F = {a → c, b → d}
Step 1: Identify the candidate key:
• The candidate key in this case is {a, b}, since a and b together determine c and d.
Step 2: Check for partial dependencies:
• a → c: This is a partial dependency because a is part of the candidate key.
• b → d: This is also a partial dependency because b is part of the candidate key.
Step 3: Decompose into 2NF:
To eliminate partial dependencies, we decompose the relation into two relations:
• R1(a, c) with FD a → c.
• R2(b, d) with FD b → d.
• R3(a, b) with FD a, b as a candidate key.
Thus, the decomposition is:
• R1(a, c)
• R2(b, d)
• R3(a, b)
f) Explain the Lossless Decomposition with Example.
Answer:
Lossless Decomposition:
A decomposition of a relation is lossless if we can recreate the original relation by joining
the decomposed relations. The key to ensuring lossless decomposition is that the intersection
of the decomposed relations must be a key for at least one of the relations.
Example: Consider the relation R(A, B, C) with the following functional dependencies:
• A→B
• B→C
We decompose R into two relations:
• R1(A, B)
• R2(B, C)
The intersection of R1 and R2 is B, which is a key in both relations. Therefore, the
decomposition is lossless, meaning that we can join R1 and R2 to recover the original
relation R(A, B, C).
g) What are RAT Axioms? Also, discuss the algorithm for finding the closure of functional
dependency with a suitable example?
Answer:
RAT Axioms (Rules of Attribute Closure):
RAT axioms are used to infer functional dependencies in a relation. The axioms are:
1. Reflexivity: If Y ⊆ X, then X → Y.
2. Augmentation: If X → Y, then XZ → YZ for any Z.
3. Transitivity: If X → Y and Y → Z, then X → Z.
Algorithm for Finding the Closure of a Set of Attributes (X+):
1. Start with the set of attributes X+ = X.
2. Apply the functional dependencies in the set to X+ to add any new attributes.
3. Repeat this process until no new attributes can be added.
Example: Given the set of functional dependencies:
• A →B
• B → C
• A →C
To find the closure of A (A+):
• Start with A+ = {A}.
• A → B: Add B to A+, so now A+ = {A, B}.
• B → C: Add C to A+, so now A+ = {A, B, C}.
• A → C is already included, so the closure of A is {A, B, C}.
4o mini
UNIT-IV
a) What is deadlock? Discuss about the deadlock detection? Also mention the CO3
steps for recovery from deadlock?
b) What are distributed databases? Discuss the various concurrency protocol used in distributed database CO3
in detail?
c) Which of the following schedules are conflicts serializable? For each serializable CO4
schedule find the equivalent schedule.
S1: r1(x); r3(x); w3(x); w1(x); r2(x)
S2: r3(x); r2(x); w3(x); r1(x); w1(x)
S3: r1(x); r2(x); r3(y); w1(x); r2(z); r2(y); w2(y)
d) Explain the method of testing the serializability. Consider the schedule CO4
S1 and S2 given below
S1: R1(A),R2(B),W1(A),W2(B)
S2: R2(B),R1(A),W2(B), W1(A)
Check whether the given schedules are conflict equivalent or not?
e) What is Conflict Serializable Schedule? Check the given Schedule S1 is Conflict Serializable or not? CO2
S1: R1(X), R2(X),R2(Y),W2(Y),R1(Y),W1(X)
f) Explain Deadlock Handling with Suitable Example CO3
g) Define the precedence graph. What is the role of it in transaction processing? Discuss in detail. CO3
h) What do you understand by interleaving of transaction? How is it differ from the serial execution? CO2
i) Describe serializable schedule. Discuss conflict serializability with suitable example. CO5
j) Consider schedules S1, S2, and S3 below. Determine whether each schedule is strict, cascade less, CO4
recoverable, or non recoverable. (Determine the strictest recoverability condition that each schedule
satisfies.)
S1: r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); c1; w3 (Y); c3; r2 (Y); w2 (Z); w2(Y); c2;
S2: r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); w3 (Y); r2 (Y); w2 (Z); w2 (Y);c1;c2; c3;
S3: r1 (X); r2 (Z); r3 (X); r1 (Z); r2 (Y); r3 (Y); w1 (X); c1; w2 (Z); w3 (Y); w2(Y);c3; c2;
k) Consider the three transactions T1, T2, and T3, and the schedules S1 and S2given below. State CO4
whether each schedule is serializable or not. If a schedule is serializable, write down the equivalent
serial schedule(s).
T1: r1 (X); r1 (Z); w1 (X);
T2: r2 (Z); r2 (Y); w2 (Z); w2 (Y);
T3: r3 (X); r3 (Y); w3 (Y);
S1: r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); w3 (Y); r2 (Y); w2 (Z); w2 (Y);
S2: r1 (X); r2 (Z); r3 (X); r1 (Z); r2 (Y); r3 (Y); w1 (X); w2 (Z); w3 (Y); w2 (Y);
l What is a schedule? Define the concepts of recoverable, cascade less and strict schedules, and CO4
compare them in terms of their recoverability
Hhh
a) What is Deadlock? Discuss Deadlock Detection and Steps for Recovery from
Deadlock.
Deadlock:
Deadlock in computer science, particularly in the context of databases and operating
systems, occurs when two or more transactions are blocked forever because each one
is waiting for the other to release a resource, leading to a state where none of the
processes can proceed.
Deadlock Detection:
Deadlock detection refers to the process of identifying whether a deadlock has
occurred. The system continuously checks the resource allocation graph (RAG) or
wait-for graph to detect cycles, as cycles indicate deadlock.
Steps for Recovery from Deadlock:
1. Process Termination: Kill one or more processes involved in the deadlock to break
the cycle.
o Abort all deadlocked processes.
o Abort one process at a time and check if the deadlock is resolved.
2. Resource Preemption: Take resources away from some processes and assign them
to others to resolve deadlock.
o Rollback transactions to a safe state.
o Preempt resources held by one process and assign them to other processes
involved in the deadlock.
b) What are Distributed Databases? Discuss Various Concurrency Protocols
Used in Distributed Databases.
Distributed Databases:
A distributed database is a collection of databases that are spread across multiple sites
connected by a network. The data in distributed databases is often replicated or
partitioned across different nodes to improve availability, reliability, and performance.
Concurrency Protocols in Distributed Databases:
1. Two-Phase Locking (2PL):
In 2PL, transactions first acquire all locks and then release them. This ensures
serializability but can cause deadlocks.
2. Timestamp Ordering:
Each transaction is assigned a unique timestamp. Transactions are executed based on
their timestamps to maintain consistency.
3. Optimistic Concurrency Control (OCC):
OCC allows transactions to execute without locks and checks for conflicts during
commit. If conflicts occur, transactions are rolled back.
4. Multi-Version Concurrency Control (MVCC):
MVCC uses multiple versions of data items to allow transactions to access different
versions without conflict, improving concurrency.
c) Which of the Following Schedules are Conflicts Serializable? For Each
Serializable Schedule, Find the Equivalent Schedule.
Schedules:
1. S1: r1(x); r3(x); w3(x); w1(x); r2(x)
2. S2: r3(x); r2(x); w3(x); r1(x); w1(x)
3. S3: r1(x); r2(x); r3(y); w1(x); r2(z); r2(y); w2(y)
Checking Conflict Serializability:
1. S1:
o The transactions r1(x), r3(x), w3(x), w1(x), r2(x) can be reordered to form an
equivalent serial schedule.
o Equivalent serial schedule: T1 → T3 → T2
2. S2:
o The transactions r3(x), r2(x), w3(x), r1(x), w1(x) can be reordered as well.
o Equivalent serial schedule: T3 → T2 → T1
3. S3:
o r1(x), r2(x), r3(y), w1(x), r2(z), r2(y), w2(y): There are conflicts between
operations, but this can be reordered to form a serial schedule.
o Equivalent serial schedule: T1 → T2 → T3
d) Explain the Method of Testing the Serializability. Consider the Schedules S1
and S2 Given Below. Check Whether the Given Schedules Are Conflict
Equivalent or Not.
Schedules:
1. S1: R1(A), R2(B), W1(A), W2(B)
2. S2: R2(B), R1(A), W2(B), W1(A)
Conflict Equivalence Testing:
1. Step 1: Identify the conflicting operations in both schedules (same data item and
different operations).
2. Step 2: Construct a precedence graph to check if the schedules can be reordered
without violating the dependency.
• In S1, the sequence of operations is T1 → T2.
• In S2, the sequence of operations is T2 → T1.
• Since there are no conflicting cycles in the precedence graph, S1 and S2 are conflict
equivalent.
e) What is a Conflict Serializable Schedule? Check if the Given Schedule S1 is
Conflict Serializable or Not.
Schedule:
• S1: R1(X), R2(X), R2(Y), W2(Y), R1(Y), W1(X)
Conflict Serializable Schedule:
A schedule is conflict serializable if its precedence graph has no cycles, indicating
that the schedule can be transformed into a serial schedule by reordering non-
conflicting transactions.
• In S1, the precedence graph can be constructed based on conflicting operations:
o T1 → T2 (due to R1(X) and R2(X))
o T2 → T1 (due to W2(Y) and R1(Y))
• There is a cycle in the precedence graph, so S1 is not conflict serializable.
f) Explain Deadlock Handling with a Suitable Example.
Deadlock Handling:
Deadlock handling is the process of detecting, preventing, and recovering from
deadlock situations.
Example: Consider two processes P1 and P2 that are accessing resources R1 and R2:
• P1 holds R1 and waits for R2.
• P2 holds R2 and waits for R1. This is a classic deadlock.
Handling Deadlock:
1. Prevention: Ensure that one of the necessary conditions (like holding and waiting) is
avoided.
2. Detection: Use a wait-for graph to detect cycles, indicating deadlock.
3. Recovery: Terminate or roll back one of the processes to break the deadlock.
g) Define the Precedence Graph. What is its Role in Transaction Processing?
Discuss in Detail.
Precedence Graph:
A precedence graph is a directed graph where:
• Each node represents a transaction.
• A directed edge from Ti to Tj means that transaction Ti must precede transaction Tj
in the schedule.
Role in Transaction Processing:
• It is used to check conflict serializability. If there is a cycle in the precedence graph,
the schedule is not serializable.
• It helps in determining dependencies and conflicts between transactions.
h) What Do You Understand by Interleaving of Transactions? How Is It
Different from Serial Execution?
Interleaving of Transactions:
Interleaving is when multiple transactions are executed concurrently, with their
operations being interwoven in such a way that they appear as a single sequence of
operations.
Difference from Serial Execution:
• Serial Execution: Transactions are executed one after the other without any overlap.
• Interleaving: Operations of different transactions can be interwoven, leading to
increased concurrency and potentially more efficient use of resources.
i) Describe Serializable Schedule. Discuss Conflict Serializability with a Suitable
Example.
Serializable Schedule:
A schedule is serializable if its outcome is equivalent to the outcome of some serial
execution of the transactions. It is the highest level of correctness for a schedule.
Conflict Serializability:
Conflict serializability means that the schedule can be rearranged (by swapping non-
conflicting operations) into a serial schedule.
Example: Schedule: S: R1(X), W2(X), R2(Y), W1(Y)
• The precedence graph can be constructed, and if it is acyclic, the schedule is conflict
serializable.
j) Consider the Given Schedules. Determine Whether Each Schedule is Strict,
Cascade Less, Recoverable, or Non-Recoverable.
1. S1:
o Strict: No, because T1 reads a value written by T3 without committing.
o Cascade Less: Yes, no transaction reads uncommitted values.
o Recoverable: Yes, as there is no case where a transaction reads from another
transaction and then does not commit.
2. S2:
o Strict: No, similar issue as in S1.
o Cascade Less: Yes.
o Recoverable: Yes.
3. S3:
o Strict: Yes.
o Cascade Less: Yes.
o Recoverable: Yes.
k) State Whether Each Schedule is Serializable or Not. If Serializable, Write
Down the Equivalent Serial Schedule(s).
Given the schedules, the analysis involves checking for conflicts and determining if
the schedule is serializable by constructing a precedence graph. If a cycle exists in the
graph, the schedule is not serializable.
l) What is a Schedule? Define the Concepts of Recoverable, Cascade Less, and
Strict Schedules, and Compare Them in Terms of Their Recoverability.
Schedule:
A schedule is a sequence of operations from multiple transactions arranged in a
manner where their execution order is consistent with the underlying data.
Definitions:
1. Recoverable Schedule: A schedule is recoverable if, for every pair of transactions
Ti and Tj, if Ti reads the data written by Tj, then Tj must commit before Ti does.
2. Cascade Less Schedule: A schedule is cascade less if no transaction reads
uncommitted data from another transaction.
3. Strict Schedule: A strict schedule is one in which no transaction can read or write
data written by another transaction until that transaction commits.
Comparison in Terms of Recoverability:
• Strict schedules are always recoverable and cascade less.
• Cascade less schedules are recoverable, but not necessarily strict.
• Recoverable schedules are the least restrictive in terms of cascading behavior.
UNIT-V
a) Discuss the concept of two phase locking protocol? Also differentiate between strict two phase & CO4
rigorous two phase locking protocol?
b) Explain the concept of Multiple Granularity? Also discuss Multi-version Schemes?. CO3
c) Compare and contrast the differences between time stamp protocol and validation based protocol CO3
for concurrency control?
d) Explain two phase locking protocol with suitable example. CO3
e) Discuss 2 phase commit (2PC) protocol and time stamp based protocol withsuitable example. How CO2
the validation based protocols differ from 2PC?
f) Explain the Validation Based protocol for concurrency control. CO3
g) Explain Time Stamp Based Concurrency Control technique. CO3
h) Explain Recovery from Concurrent Transaction. CO2
i) Discuss the timestamp ordering protocol for concurrency control. How does strict timestamp CO3
ordering differ from basic timestamp ordering?
j) Explain the concept of Multiple Granularity? Also discuss Multi-version Schemes? CO3
k What do you mean by time stamping protocol for concurrency controlling? Discuss multi version CO3
scheme of concurrency control.
That
a) Discuss the Concept of Two-Phase Locking Protocol? Also Differentiate Between Strict Two-Phase &
Rigorous Two-Phase Locking Protocol?
Two-Phase Locking Protocol (2PL):
Two-Phase Locking (2PL) is a concurrency control protocol that guarantees serializability in transaction processing
systems. It ensures that the execution of transactions follows a two-phase process:
1. Growing Phase: In this phase, a transaction can acquire locks but cannot release any locks. The transaction
keeps acquiring locks on resources it needs until it has obtained all the necessary locks.
2. Shrinking Phase: Once the transaction begins releasing locks, it enters the shrinking phase. During this
phase, the transaction can no longer acquire any new locks. This ensures that no other transaction can access
the locked data and causes a conflict.
The 2PL protocol ensures serializability because it prevents circular waits by ensuring that all transactions acquire
their locks before releasing any of them.
Differences Between Strict Two-Phase & Rigorous Two-Phase Locking Protocol:
Aspect Strict Two-Phase Locking Rigorous Two-Phase Locking
Locks can be released after the transaction
Lock Release Locks are held until the transaction commits.
aborts.
Commit Locks are released after the transaction
Locks are held until the transaction commits.
Handling commits or aborts.
Ensures recoverability by ensuring that no Stronger recoverability by holding all locks until the
Recoverability
transaction releases locks until it commits. transaction commits, making it more restrictive.
Transaction Conflicts can still arise between transactions Conflicts are less likely because no locks are
Conflicts in the shrinking phase. released until commit.
b) Explain the Concept of Multiple Granularity? Also Discuss Multi-Version Schemes?
Multiple Granularity:
Multiple Granularity refers to a technique in database management where data is locked at different levels of
granularity (e.g., at the database level, table level, row level, and field level). It allows transactions to lock only the
portions of the data that they need, improving performance and concurrency. For example, a transaction may lock a
table at a coarser granularity (table level) if it needs to access many rows, or it may lock specific rows or even
individual fields at a finer granularity if it only needs to access a few pieces of data.
Advantages of Multiple Granularity:
• Improved Concurrency: By allowing finer granularity locks, more transactions can execute concurrently.
• Reduced Locking Overhead: Locking only necessary parts of the data reduces overhead compared to
locking the entire data structure.
• Hierarchical Locking: The system may support hierarchical locking, where a transaction can lock a higher-
level entity like a table and then lock more specific items like rows within that table.
Multi-Version Schemes:
Multi-Version Concurrency Control (MVCC) is a technique used to handle concurrency by maintaining multiple
versions of a data item. When a transaction reads data, it sees a snapshot of the data from the point in time when it
started, ensuring consistency even if other transactions are writing to the data. MVCC prevents read-write conflicts by
allowing transactions to read older versions of data while new versions are being written.
• Write Operation: When a transaction writes to a data item, a new version of the data is created, and the
original version is retained.
• Read Operation: When a transaction reads a data item, it reads the most recent version that was valid at the
time the transaction started.
• Advantages:
o MVCC reduces the need for locking, leading to higher concurrency.
o It allows transactions to proceed without blocking other transactions, thus improving performance.
c) Compare and Contrast the Differences Between Timestamp Protocol and Validation-Based Protocol for
Concurrency Control?
Timestamp Protocol:
Timestamp-based protocols assign a unique timestamp to each transaction when it starts. The basic idea is that the
transactions should follow the order of their timestamps to ensure serializability. A transaction can only read or write a
data item if it respects the timestamps of conflicting transactions.
• Mechanism:
o Each transaction is assigned a global timestamp when it starts.
o The protocol checks if a transaction’s operation conflicts with the timestamp of other transactions.
o If conflicts arise, one of the transactions is aborted to maintain serializability.
• Advantages:
o Simple to implement.
o Transactions do not need to lock data.
• Disadvantages:
o Conflicts can result in frequent rollbacks, which can reduce system throughput.
Validation-Based Protocol:
Validation-based protocols allow transactions to execute without locking data, but before committing, they validate
that no conflicts have occurred. Validation takes place in three phases: Read, Validate, and Write.
• Mechanism:
o Read Phase: The transaction reads the data and executes its operations.
o Validate Phase: The system checks whether the transaction conflicts with any other concurrent
transactions.
o Write Phase: If validation is successful, the transaction writes its results to the database; otherwise, it
is aborted.
• Advantages:
o Higher concurrency compared to other protocols.
o No need for locking data during execution.
• Disadvantages:
o Validation phase can add overhead.
o More complex to implement compared to timestamp-based protocols.
Differences Between Timestamp and Validation-Based Protocols:
Aspect Timestamp Protocol Validation-Based Protocol
Detects conflicts based on transaction Detects conflicts during the validation phase before
Conflict Detection
timestamps. committing.
Concurrency Ensures transactions are executed in Transactions can execute without locks, but they must be
Control timestamp order. validated before committing.
Rollback Rollbacks are performed when Rollbacks occur if a transaction fails during the validation
Mechanism timestamp conflicts occur. phase.
May lead to frequent rollbacks, More efficient in terms of concurrency, but validation
Performance
reducing throughput. adds overhead.
d) Explain Two-Phase Locking Protocol with Suitable Example.
Two-Phase Locking Protocol (2PL):
The Two-Phase Locking (2PL) protocol ensures serializability by controlling the acquisition and release of locks in
two phases:
1. Growing Phase: The transaction can acquire locks but cannot release any locks.
2. Shrinking Phase: Once the transaction releases any lock, it can no longer acquire new locks.
Example:
Consider two transactions, T1 and T2, operating on a database with two items: X and Y.
• Transaction T1:
o Phase 1: T1 locks X.
o Phase 2: T1 locks Y and then releases X.
o T1 commits after processing.
• Transaction T2:
o Phase 1: T2 locks X and Y.
o Phase 2: T2 releases the locks after processing.
By using 2PL, the system ensures that no other transactions can access X or Y until the transactions commit,
preventing conflicts between concurrent transactions.
e) Discuss 2-Phase Commit (2PC) Protocol and Timestamp-Based Protocol with Suitable Example. How Does
the Validation-Based Protocol Differ from 2PC?
2-Phase Commit (2PC):
The 2PC protocol is a distributed transaction protocol that ensures atomicity by coordinating the commit or abort
decision across multiple participants (databases). It consists of two phases:
1. Prepare Phase: The coordinator sends a "prepare" message to all participants. Each participant votes either
"commit" or "abort" based on whether it can successfully complete the transaction.
2. Commit Phase: If all participants vote to commit, the coordinator sends a "commit" message to all
participants. If any participant votes to abort, the coordinator sends an "abort" message to all participants.
Example:
• Coordinator sends "Prepare" message to all participants.
• Participants vote: If all participants respond with "commit", the coordinator sends the "commit" message.
• If any participant votes to abort, the coordinator sends the "abort" message.
Differences Between Validation-Based Protocol and 2PC:
Aspect 2-Phase Commit Protocol Validation-Based Protocol
Protocol Ensures serializability and concurrency in
Ensures atomicity in distributed systems.
Purpose transaction systems.
Execution
Prepare and Commit phases. Read, Validate, and Write phases.
Phases
Conflict Conflicts are resolved by aborting transactions Conflicts are handled during the validation phase,
Handling in case of failure. not during execution.
Rollback Participants roll back the transaction if any
Transaction is rolled back if validation fails.
Handling participant votes "abort."
f) Explain the Validation-Based Protocol for Concurrency Control.
(Already covered in c above.)
g) Explain Time Stamp-Based Concurrency Control Technique.
(Already covered in c above.)
h) Explain Recovery from Concurrent Transaction.
Recovery from concurrent transactions involves mechanisms to ensure the consistency of the database after a failure.
Key methods include:
• Undo Operations: If a transaction fails, any changes made by it are undone using the log.
• Redo Operations: If a transaction committed before failure, its changes are reapplied using the log.
• Logging Mechanism: The system uses a log to record all operations performed by transactions so that they
can be rolled back or reapplied as necessary.
i) Discuss the Timestamp Ordering Protocol for Concurrency Control. How Does Strict Timestamp Ordering
Differ from Basic Timestamp Ordering?
(Already covered in c above.)
j) Explain the Concept of Multiple Granularity? Also Discuss Multi-Version Schemes?
(Already covered in b above.)
k) What Do You Mean by Time Stamping Protocol for Concurrency Controlling? Discuss Multi-Version
Scheme of Concurrency Control.
(Already covered in c and b above.)