0% found this document useful (0 votes)
9 views11 pages

Database Management System Overview

The document provides an overview of databases, including types such as relational and NoSQL databases, and discusses key concepts like DBMS, DDL, DML, cardinality, and keys. It explains the architecture of databases, transaction control, and various relational operations such as selection, projection, and joins. Additionally, it covers the importance of ACID properties and the role of keys in uniquely identifying records within a database.

Uploaded by

Aryan Raj
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
9 views11 pages

Database Management System Overview

The document provides an overview of databases, including types such as relational and NoSQL databases, and discusses key concepts like DBMS, DDL, DML, cardinality, and keys. It explains the architecture of databases, transaction control, and various relational operations such as selection, projection, and joins. Additionally, it covers the importance of ACID properties and the role of keys in uniquely identifying records within a database.

Uploaded by

Aryan Raj
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Database DBMS Languages Cardinality Keys ER Diagram example.

The database is a collection of inter-related data which is used 1. Data Definition Language (DDL) Cardinality means how the entities are arranged to each other 1. Keys play an important role in the relational database.
to retrieve, insert, and delete the data efficiently. It is also DDL stands for Data Definition Language. It is used to define or what is the relationship structure between entities in a 2. It is used to uniquely identify any record or row of data
used to organize the data in the form of a table, schema, database structure or pattern. relationship set. In a Database Management System, from the table. It is also used to establish and identify
views, and reports, etc. It is used to create schema, tables, indexes, constraints, etc. in Cardinality represents a number that denotes how many relationships between tables.
DBMS the database. times an entity is participating with another entity in a For example, ID is used as a key in the Student table because
Database management system is a software which is used to Using the DDL statements, you can create the skeleton of the relationship set. The Cardinality of DBMS is a very important it is unique for each student. In the PERSON table,
manage the database. For example: MySQL, Oracle, etc are a database. attribute in representing the structure of a Database. In a passport_number, license_number, SSN are keys since they
very popular commercial database which is used in different Data definition language is used to store the information of table, the number of rows or tuples represents the are unique for each person.
applications. metadata like the number of tables and schemas, their names, Cardinality.
DBMS provides an interface to perform various operations like indexes, columns in each table, constraints, etc. Cardinality Ratio
database creation, storing data in it, updating data, creating a tasks under DDL: Cardinality ratio is also called Cardinality Mapping, which
table in the database and a lot more. Create: It is used to create objects in the database. represents the mapping of one entity set to another entity set
It provides protection and security to the database. In the Alter: It is used to alter the structure of the database. in a relationship set. We generally take the example of a
case of multiple users, it also maintains data consistency. Drop: It is used to delete objects from the database. binary relationship set where two entities are mapped to each 1. Primary key
NoSQL Database Truncate: It is used to remove all records from a table. other. It is the first key used to identify one and only one instance of
A NoSQL database is an approach to design such databases Rename: It is used to rename an object. Cardinality is very important in the Database of various an entity uniquely. An entity can contain multiple keys, as we
that can accommodate a wide variety of data models. NoSQL Comment: It is used to comment on the data dictionary. businesses. For example, if we want to track the purchase saw in the PERSON table. The key which is most suitable from
stands for "not only SQL." It is an alternative to traditional 2. Data Manipulation Language (DML) history of each customer then we can use the one-to-many those lists becomes a primary key.
relational databases in which data is placed in tables, and data DML stands for Data Manipulation Language. It is used for cardinality to find the data of a specific customer. The In the EMPLOYEE table, ID can be the primary key since it is
schema is perfectly designed before the database is built. accessing and manipulating data in a database. It handles user Cardinality model can be used in Databases by Database unique for each employee. In the EMPLOYEE table, we can
NoSQL databases are useful for a large set of distributed data. requests. Managers for a variety of purposes, but corporations often even select License_Number and Passport_Number as
Some examples: MongoDB, CouchDB, Cloudant. tasks under DML: use it to evaluate customer or inventory data. primary keys since they are also unique.
Relational Database Select: It is used to retrieve data from a database. For each entity, the primary key selection is based on
This database is based on the relational data model, which Insert: It is used to insert data into a table. There are four types of Cardinality Mapping in Database requirements and developers.
stores data in the form of rows(tuple) and Update: It is used to update existing data within a table. Management Systems:
columns(attributes), and together forms a table(relation). A Delete: It is used to delete all records from a table.
relational database uses SQL for storing, manipulating, as well Merge: It performs UPSERT operation, i.e., insert or update 1. One to one
as maintaining the data. E.F. Codd invented the database in operations. 2. Many to one
1970. Each table in the database carries a key that makes the Call: It is used to call a structured query language or a Java 3. One to many
data unique from others. Examples of Relational databases subprogram. 4. Many to many
are MySQL, Microsoft SQL Server, Oracle, etc. Explain Plan: It has the parameter of explaining data. One to One cardinality
Relational Algebra
Degree: Lock Table: It controls concurrency. One to one cardinality is represented by a 1:1 symbol. In this,
Relational algebra is a procedural query language. It gives a
The total number of attributes that comprise a relation is 3. Data Control Language (DCL) there is at most one relationship from one entity to another
step-by-step process to obtain the result of the query. It uses
known as the degree of the table. DCL stands for Data Control Language. It is used to retrieve entity. There are a lot of examples of one-to-one cardinality in 2. Candidate key operators to perform queries.
Domain: the stored or saved data. real life databases. A candidate key is an attribute or set of attributes that can
The domain refers to the possible values each attribute can The DCL execution is transactional. It also has rollback For example, one student can have only one student id, and uniquely identify a tuple. Types of Relational operation
contain. It can be specified using standard data types such as parameters. one student id can belong to only one student. So, the Except for the primary key, the remaining attributes are
integers, floating numbers, etc. For example, An attribute tasks under DCL: relationship mapping between student and student id will be considered a candidate key. The candidate keys are as strong
entitled Marital_Status may be limited to married or Grant: It is used to give user access privileges to a database. one to one cardinality mapping. as the primary key.
unmarried values. Revoke: It is used to take back permissions from the user. For example: In the EMPLOYEE table, id is best suited for the
Relational instance: In the relational database system, the 4. Transaction Control Language (TCL) Another example is the relationship between the director of primary key. The rest of the attributes, like SSN,
relational instance is represented by a finite set of tuples. TCL is used to run the changes made by the DML statement. the school and the school because one school can have a Passport_Number, License_Number, etc., are considered a
Relation instances do not have duplicate tuples. TCL can be grouped into a logical transaction. maximum of one director, and one director can belong to only candidate key.
Relational schema: A relational schema contains the name of tasks under TCL: one school.
the relation and name of all columns or attributes. Commit: It is used to save the transaction on the database. Many to One Cardinality: 1. Select Operation:
Relational key: In the relational key, each row has one or Rollback: It is used to restore the database to original since In many to one cardinality mapping, from set 1, there can be The select operation selects tuples that satisfy a given
more attributes. It can identify the row in the relation the last Commit. multiple sets that can make relationships with a single entity predicate.
uniquely. ACID Properties of set 2. Or we can also describe it as from set 2, and one It is denoted by sigma (σ).
1-Tier Architecture 1) Atomicity entity can make a relationship with more than one entity of Notation: σ p(r)
In this architecture, the database is directly available to the The term atomicity defines that the data remains atomic. It set 1. Where:
user. It means the user can directly sit on the DBMS and uses means if any operation is performed on the data, either it
it. Any changes done here will directly be done on the should be performed or executed completely or should not be One to one Cardinality is the subset of Many to one σ is used for selection prediction
database itself. It doesn't provide a handy tool for end users. executed at all. It further means that the operation should not Cardinality. It can be represented by M:1. r is used for relation
The 1-Tier architecture is used for development of the local break in between or execute partially. In the case of executing For example, there are multiple patients in a hospital who are p is used as a propositional logic formula which may use
application, where programmers can directly communicate operations on the transaction, the operation should be served by a single doctor, so the relationship between 3. Super Key
Super key is an attribute set that can uniquely identify a tuple. connectors like: AND OR and NOT. These relational can use as
with the database for the quick response. completely executed and not partially. patients and doctors can be represented by Many to one relational operators like =, ≠, ≥, <, >, ≤.
2-Tier Architecture 2) Consistency Cardinality. A super key is a superset of a candidate key.
Example: σ BRANCH_NAME="perryride" (LOAN)
The 2-Tier architecture is same as basic client-server. In the The word consistency means that the value should remain One to Many Cardinalities: 2. Project Operation:
two-tier architecture, applications on the client end can preserved always. In DBMS, the integrity of the data should be In One-to-many cardinality mapping, from set 1, there can be This operation shows the list of those attributes that we wish
directly communicate with the database at the server side. maintained, which means if a change in the database is made, a maximum single set that can make relationships with a to appear in the result. Rest of the attributes are eliminated
For this interaction, API's like: ODBC, JDBC are used. it should remain preserved always. In the case of transactions, single or more than one entity of set 2. Or we can also from the table.
The user interfaces and application programs are run on the the integrity of the data is very essential so that the database describe it as from set 2, more than one entity can make a It is denoted by ∏.
client-side. remains consistent before and after the transaction. The data relationship with only one entity of set 1.
The server side is responsible to provide the functionalities should always be correct. Notation: ∏ A1, A2, An (r)
like query processing and transaction management. 3) Isolation One to one cardinality is the subset of One-to-many Where:
To communicate with the DBMS, client-side application The term 'isolation' means separation. In DBMS, Isolation is Cardinality. It can be represented by 1: M.
establishes a connection with the server side. the property of a database where no data should affect the For Example, in a hospital, there can be various compounders, A1, A2, A3 is used as an attribute name of relation r.
For example: In the above EMPLOYEE table, for(EMPLOEE_ID,
other one and may occur concurrently. In short, the operation so the relationship between the hospital and compounders Example: ∏ NAME, CITY (CUSTOMER)
EMPLOYEE_NAME), the name of two employees can be the
on one database should begin when the operation on the first can be mapped through One-to-many Cardinality. 3. Union Operation:
same, but their EMPLYEE_ID can't be the same. Hence, this
database gets complete. It means if two operations are being Many to Many Cardinalities: Suppose there are two tuples R and S. The union operation
combination can also be a key.
performed on two different databases, they may not affect In many, many cardinalities mapping, there can be one or contains all the tuples that are either in R or S or both in R & S.
The super key would be EMPLOYEE-ID (EMPLOYEE_ID,
the value of one another. In the case of transactions, when more than one entity that can associate with one or more It eliminates the duplicate tuples. It is denoted by ∪.
EMPLOYEE-NAME), etc.
two or more transactions occur simultaneously, the than one entity of set 2. In the same way from the end of set Notation: R ∪ S
4. Foreign key
consistency should remain maintained. Any changes that 2, one or more than one entity can make a relation with one A union operation must hold the following condition:
Foreign keys are the column of the table used to point to the
occur in any particular transaction will not be seen by other or more than one entity of set 1.
primary key of another table.
transactions until the change is not committed in the memory. R and S must have the attribute of the same number.
Every employee works in a specific department in a company,
It is represented by M: N or N: M. Duplicate tuples are eliminated automatically.
and employee and department are two different entities. So
4) Durability One to one cardinality, One to many cardinalities, and Many we can't store the department's information in the employee Example: ∏ CUSTOMER_NAME (BORROW) ∪ ∏
Durability ensures the permanency of something. In DBMS, to one cardinality is the subset of the many to many table. That's why we link these two tables through the CUSTOMER_NAME (DEPOSITOR)
the term durability ensures that the data after the successful cardinalities. primary key of one table. 4. Set Intersection:
execution of the operation becomes permanent in the For Example, in a college, multiple students can work on a We add the primary key of the DEPARTMENT table, Suppose there are two tuples R and S. The set intersection
database. The durability of the data should be so perfect that single project, and a single student can also work on multiple Department_Id, as a new attribute in the EMPLOYEE table. operation contains all tuples that are in both R & S.
even if the system fails or leads to a crash, the database still projects. So, the relationship between the project and the In the EMPLOYEE table, Department_Id is the foreign key, and It is denoted by intersection ∩.
survives. However, if gets lost, it becomes the responsibility of student can be represented by many to many cardinalities. both the tables are related. Notation: R ∩ S
the recovery manager for ensuring the durability of the Generalization Example: ∏ CUSTOMER_NAME (BORROW) ∩ ∏
database. For committing the values, the COMMIT command Generalization is like a bottom-up approach in which two or CUSTOMER_NAME (DEPOSITOR)
must be used every time we make changes. more entities of lower level combine to form a higher level 5. Set Difference:
entity if they have some attributes in common. Suppose there are two tuples R and S. The set intersection
Join Operations: In generalization, an entity of a higher level can also combine operation contains all tuples that are in R but not in S.
A Join operation combines related tuples from different with the entities of the lower level to form a further higher It is denoted by intersection minus (-).
relations, if and only if a given join condition is satisfied. It is level entity. Notation: R – S.
denoted by ⋈. Generalization is more like subclass and superclass system, Example: ∏ CUSTOMER_NAME (BORROW) - ∏
3-Tier Architecture Types of Join operations: but the only difference is the approach. Generalization uses CUSTOMER_NAME (DEPOSITOR)
The 3-Tier architecture contains another layer between the the bottom-up approach. 6. Cartesian product
client and server. In this architecture, client can't directly In generalization, entities are combined to form a more The Cartesian product is used to combine each row in one
communicate with the server. generalized entity, i.e., subclasses are combined to make a 5. Alternate key table with each row in the other table. It is also known as a
The application on the client-end interacts with an application superclass. There may be one or more attributes or a combination of cross product.
server which further communicates with the database system. For example, Faculty and Student entities can be generalized attributes that uniquely identify each tuple in a relation. These It is denoted by X.
End user has no idea about the existence of the database and create a higher level entity Person. attributes or combinations of the attributes are called the Notation: E X D
beyond the application server. The database also has no idea candidate keys. One key is chosen as the primary key from Example: EMPLOYEE X DEPARTMENT
about any other user beyond the application. these candidate keys, and the remaining candidate key, if it 7. Rename Operation:
The 3-Tier architecture is used in case of large web exists, is termed the alternate key. In other words, the total The rename operation is used to rename the output relation.
application. number of the alternate keys is the total number of candidate It is denoted by rho (ρ).
keys minus the primary key. The alternate key may or may not Example: We can use the rename operator to rename
exist. If there is only one candidate key in a relation, it does STUDENT relation to STUDENT1.
not have an alternate key.
For example, employee relation has two attributes, ρ(STUDENT1, STUDENT)
Employee_Id and PAN_No, that act as candidate keys. In this
relation, Employee_Id is chosen as the primary key, so the Functional Dependency
other candidate key, PAN_No, acts as the Alternate key. The functional dependency is a relationship that exists
Specialization between two attributes. It typically exists between the
1. Natural Join: Specialization is a top-down approach, and it is opposite to primary key and non-key attribute within a table.
A natural join is the set of tuples of all combinations in R and S Generalization. In specialization, one higher level entity can be
that are equal on their common attribute names. broken down into two lower level entities. X → Y
It is denoted by ⋈. Specialization is used to identify the subset of an entity set The left side of FD is known as a determinant, the right side of
Example: that shares some distinguishing characteristics. the production is known as a dependent.
Three schema Architecture ∏EMP_NAME, SALARY (EMPLOYEE ⋈ SALARY) Normally, the superclass is defined first, the subclass and its
The three-schema architecture is also called ANSI/SPARC 2. Outer Join: related attributes are defined next, and relationship set are 6. Composite key For example:
architecture or three-level architecture. The outer join operation is an extension of the join operation. then added. Whenever a primary key consists of more than one attribute, Assume we have an employee table with attributes: Emp_Id,
This framework is used to describe the structure of a specific It is used to deal with missing information. For example: In an Employee management system, EMPLOYEE it is known as a composite key. This key is also known as Emp_Name, Emp_Address.
database system. Example: entity can be specialized as TESTER or DEVELOPER based on Concatenated Key.
The three-schema architecture is also used to separate the (EMPLOYEE ⋈ FACT_WORKERS) what role they play in the company. Here Emp_Id attribute can uniquely identify the Emp_Name
user applications and physical database. An outer join is basically of three types: attribute of employee table because if we know the Emp_Id,
The three-schema architecture contains three-levels. It breaks 1. Left outer join we can tell that employee name associated with it.
the database down into three different categories. 2. Right outer join
The three-schema architecture is as follows: 3. Full outer join Functional dependency can be written as:
a. Left outer join: Emp_Id → Emp_Name
Left outer join contains the set of tuples of all combinations in We can say that Emp_Name is functionally dependent on
R and S that are equal on their common attribute names. Emp_Id.
For example, in employee relations, we assume that an
In the left outer join, tuples in R have no matching tuples in S.
employee may be assigned multiple roles, and an employee
It is denoted by ⟕. Types of Functional dependency
may work on multiple projects simultaneously. So the primary
Example: EMPLOYEE ⟕ FACT_WORKERS 1. Trivial functional dependency
b. Right outer join: key will be composed of all three attributes, namely Emp_ID, 2. Non-trivial functional dependency
Right outer join contains the set of tuples of all combinations Emp_role, and Proj_ID in combination. So these attributes act
in R and S that are equal on their common attribute names. Aggregation as a composite key since the primary key comprises more 1. Trivial functional dependency
In right outer join, tuples in S have no matching tuples in R. In aggregation, the relation between two entities is treated as than one attribute. A → B has trivial functional dependency if B is a subset of A.
It is denoted by ⟖. a single entity. In aggregation, relationship with its 7. Artificial key The following dependencies are also trivial like: A → A, B → B
Example: corresponding entities is aggregated into a higher level entity. The key created using arbitrarily assigned data are known as Example:
EMPLOYEE ⟖ FACT_WORKERS For example: Center entity offers the Course entity act as a artificial keys. These keys are created when a primary key is Consider a table with two columns Employee_Id and
c. Full outer join: single entity in the relationship which is in a relationship with large and complex and has no relationship with many other Employee_Name.
Full outer join is like a left or right join except that it contains another entity visitor. In the real world, if a visitor visits a relations. The data values of the artificial keys are usually {Employee_id, Employee_Name} → Employee_Id is a trivial
all rows from both tables. coaching center then he will never enquiry about the Course numbered in a serial order. functional dependency as
In the above diagram: In full outer join, tuples in R that have no matching tuples in S only or just about the Center instead he will ask the enquiry For example, the primary key, which is composed of Emp_ID, Employee_Id is a subset of {Employee_Id, Employee_Name}.
1. It shows the DBMS architecture. and tuples in S that have no matching tuples in R in their about both. Emp_role, and Proj_ID, is large in employee relations. So it Also, Employee_Id → Employee_Id and Employee_Name →
2. Mapping is used to transform the request and response common attribute name. would be better to add a new virtual attribute to identify each Employee_Name are trivial dependencies too.
between various database levels of architecture. It is denoted by ⟗. tuple in the relation uniquely.
Mapping is not good for small DBMS because it takes more Example: EMPLOYEE ⟗ FACT_WORKERS 2. Non-trivial functional dependency
time. 3. Equi join: A → B has a non-trivial functional dependency if B is not a
3. In External / Conceptual mapping, it is necessary to It is also known as an inner join. It is the most common join. It subset of A.
transform the request from external level to conceptual is based on matched data as per the equality condition. The When A intersection B is NULL, then A → B is called as
schema. equi join uses the comparison operator(=). complete non-trivial.
4. In Conceptual / Internal mapping, DBMS transform the Example:
request from the conceptual to internal level. Example: CUSTOMER ⋈ PRODUCT
ID → Name,
Name → DOB
1. Entity: Integrity Constraints What is Normalization? Normal Form Description Boyce Codd normal form (BCNF)
An entity may be any object, class, person or place. In the ER Integrity constraints are a set of rules. It is used to maintain Normalization is the process of organizing the data in the First Normal Form (1NF) BCNF is the advance version of 3NF. It is stricter than 3NF.
diagram, an entity can be represented as rectangles. the quality of information. database. A relation will be 1NF if it contains an atomic value. A table is in BCNF if every functional dependency X → Y, X is
Consider an organization as an example- manager, product, Integrity constraints ensure that the data insertion, updating, Normalization is used to minimize the redundancy from a It states that an attribute of a table cannot hold multiple the super key of the table.
employee, department etc. can be taken as an entity. and other processes have to be performed in such a way that relation or set of relations. It is also used to eliminate values. It must hold only single-valued attribute. For BCNF, the table should be in 3NF, and for every FD, LHS is
data integrity is not affected. undesirable characteristics like Insertion, Update, and First normal form disallows the multi-valued attribute, super key.
Thus, integrity constraint is used to guard against accidental Deletion Anomalies. composite attribute, and their combinations. Example: Let's assume there is a company where employees
damage to the database. Normalization divides the larger table into smaller and links Example: Relation EMPLOYEE is not in 1NF because of multi- work in more than one department.
a. Weak Entity Types of Integrity Constraint them using relationships. valued attribute EMP_PHONE.
An entity that depends on another entity called a weak entity. The normal form is used to reduce redundancy from the EMPLOYEE table: EMPLOYEE table:
The weak entity doesn't contain any key attribute of its own. database table. EMP_ID EMP_NAME EMP_PHONE EMP_STATE EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE
The weak entity is represented by a double rectangle. The main reason for normalizing the relations is removing 14 John 7272826385, EMP_DEPT_NO
these anomalies. Failure to eliminate anomalies leads to data 9064738238 UP 264 India Designing D394 283
redundancy and can cause data integrity and other problems 20 Harry 8574783832 Bihar 264 India Testing D394 300
as the database grows. Normalization consists of a series of 12 Sam 7390372389, 364 UK Stores D283 232
2. Attribute guidelines that helps to guide you in creating a good database 8589830302 Punjab 364 UK Developing D283 549
The attribute is used to describe the property of an entity. structure.
Eclipse is used to represent an attribute. Data modification anomalies can be categorized into three The decomposition of the EMPLOYEE table into 1NF has been In the above table Functional dependencies are as follows:
For example, id, age, contact number, name, etc. can be types: shown below:
attributes of a student. 1. Domain constraints Insertion Anomaly: Insertion Anomaly refers to when one EMP_ID → EMP_COUNTRY
Domain constraints can be defined as the definition of a valid cannot insert a new tuple into a relationship due to lack of EMP_ID EMP_NAME EMP_PHONE EMP_STATE EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
set of values for an attribute. data. 14 John 7272826385 UP Candidate key: {EMP-ID, EMP-DEPT}
The data type of domain includes string, character, integer, Deletion Anomaly: The delete anomaly refers to the situation 14 John 9064738238 UP
time, date, currency, etc. The value of the attribute must be where the deletion of data results in the unintended loss of 20 Harry 8574783832 Bihar The table is not in BCNF because neither EMP_DEPT nor
available in the corresponding domain. some other important data. 12 Sam 7390372389 Punjab EMP_ID alone are keys.
Example: Updatation Anomaly: The update anomaly is when an update 12 Sam 8589830302 Punjab
of a single data value requires multiple rows of data to be To convert the given table into BCNF, we decompose it into
updated. Second Normal Form (2NF) three tables:
Types of Normal Forms: In the 2NF, relational must be in 1NF.
Normalization works through a series of stages called Normal In the second normal form, all non-key attributes are fully EMP_COUNTRY table:
forms. The normal forms apply to individual relations. The functional dependent on the primary key
a. Key Attribute relation is said to be in particular normal form if it satisfies Example: Let's assume, a school can store the data of teachers EMP_ID EMP_COUNTRY
The key attribute is used to represent the main characteristics constraints. and the subjects they teach. In a school, a teacher can teach 264 India
of an entity. It represents a primary key. The key attribute is 2. Entity integrity constraints more than one subject. 264 India
represented by an ellipse with the text underlined. The entity integrity constraint states that primary key value EMP_DEPT table:
can't be null. TEACHER table
This is because the primary key value is used to identify EMP_DEPT DEPT_TYPE EMP_DEPT_NO
individual rows in relation and if the primary key has a null TEACHER_ID SUBJECT TEACHER_AGE Designing D394 283
value, then we can't identify those rows. 25 Chemistry 30 Testing D394 300
A table can contain a null value other than the primary key 25 Biology 30 Stores D283 232
field. 47 English 35 Developing D283 549
Example: 83 Math 38 EMP_DEPT_MAPPING table:
83 Computer 38
In the given table, non-prime attribute TEACHER_AGE is EMP_ID EMP_DEPT
dependent on TEACHER_ID which is a proper subset of a D394 283
candidate key. That's why it violates the rule for 2NF. D394 300
b. Composite Attribute D283 232
An attribute that composed of many other attributes is known To convert the given table into 2NF, we decompose it into two D283 549
as a composite attribute. The composite attribute is Normal Form overview in short tables: Functional dependencies:
represented by an ellipse, and those ellipses are connected 1NF: A relation is in 1NF if it contains an atomic value.
with an ellipse. 2NF: A relation will be in 2NF if it is in 1NF and all non-key TEACHER_DETAIL table: EMP_ID → EMP_COUNTRY
3. Referential Integrity Constraints attributes are fully functional dependent on the primary key. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
A referential integrity constraint is specified between two 3NF: A relation will be in 3NF if it is in 2NF and no transition TEACHER_ID TEACHER_AGE Candidate keys:
tables. dependency exists. 25 30
In the Referential integrity constraints, if a foreign key in Table BCNF: A stronger definition of 3NF is known as Boyce Codd's 47 35 For the first table: EMP_ID
1 refers to the Primary Key of Table 2, then every value of the normal form. 83 38 For the second table: EMP_DEPT
Foreign Key in Table 1 must be null or be available in Table 2. 4NF: A relation will be in 4NF if it is in Boyce Codd's normal TEACHER_SUBJECT table: For the third table: {EMP_ID, EMP_DEPT}
Example: form and has no multi-valued dependency.
5NF: A relation is in 5NF. If it is in 4NF and does not contain TEACHER_ID SUBJECT Now, this is in BCNF because left side part of both the
any join dependency, joining should be lossless. 25 Chemistry functional dependencies is a key.
Advantages of Normalization 25 Biology
• Normalization helps to minimize data 47 English Fourth normal form (4NF)
c. Multivalued Attribute redundancy. 83 Math A relation will be in 4NF if it is in Boyce Codd normal form and
An attribute can have more than one value. These attributes • Greater overall database organization. 83 Computer has no multi-valued dependency.
are known as a multivalued attribute. The double oval is used • Data consistency within the database. For a dependency A → B, if for a single value of A, multiple
to represent multivalued attribute. • Much more flexible database design. Third Normal Form (3NF) values of B exists, then the relation will be a multi-valued
For example, a student can have more than one phone • Enforces the concept of relational integrity. A relation will be in 3NF if it is in 2NF and not contain any dependency.
number. Disadvantages of Normalization transitive partial dependency. Example
• You cannot start building the database before 3NF is used to reduce the data duplication. It is also used to STUDENT
knowing what the user needs. achieve the data integrity.
• The performance degrades when normalizing If there is no transitive dependency for non-prime attributes, STU_ID COURSE HOBBY
the relations to higher normal forms, i.e., 4NF, then the relation must be in third normal form. 21 Computer Dancing
5NF. A relation is in third normal form if it holds atleast one of the 21 Math Singing
• It is very time-consuming and difficult to following conditions for every non-trivial function dependency 34 Chemistry Dancing
4. Key constraints X → Y. 74 Biology Cricket
normalize relations of a higher degree.
Keys are the entity set that is used to identify an entity within 59 Physics Hockey
• Careless decomposition may lead to a bad
its entity set uniquely. X is a super key. The given STUDENT table is in 3NF, but the COURSE and
database design, leading to serious problems.
d. Derived Attribute An entity set can have multiple keys, but out of which one key Y is a prime attribute, i.e., each element of Y is part of some HOBBY are two independent entity. Hence, there is no
An attribute that can be derived from other attribute is known will be the primary key. A primary key can contain a unique candidate key. relationship between COURSE and HOBBY.
as a derived attribute. It can be represented by a dashed and null value in the relational table. Relational Decomposition Example:
ellipse. Example: When a relation in the relational model is not in appropriate In the STUDENT relation, a student with STU_ID, 21 contains
normal form then the decomposition of a relation is required. EMPLOYEE_DETAIL table: two courses, Computer and Math and two hobbies, Dancing
In a database, it breaks the table into multiple tables. and Singing. So there is a Multi-valued dependency on
If the relation has no proper decomposition, then it may lead EMP_ID EMP_NAME EMP_ZIP EMP_STATE EMP_CITY STU_ID, which leads to unnecessary repetition of data.
to problems like loss of information. 222 Harry 201010 UP Noida
Decomposition is used to eliminate some of the problems of 333 Stephan 02228 US Boston So to make the above table into 4NF, we can decompose it
bad design like anomalies, inconsistencies, and redundancy. 444 Lan 60007 US Chicago into two tables:
Types of Decomposition 555 Katharine 06389 UK Norwich
666 John 462007 MPBhopal STUDENT_COURSE
Super key in the table above:
Relational Calculus STU_ID COURSE
There is an alternate way of formulating queries known as {EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, 21 Computer
Relational Calculus. Relational calculus is a non-procedural EMP_ZIP}....so on 21 Math
query language. In the non-procedural query language, the Candidate key: {EMP_ID} 34 Chemistry
For example, A person's age changes over time and can be user is concerned with the details of how to obtain the end 74 Biology
derived from another attribute like Date of birth. results. The relational calculus tells what to do but never Non-prime attributes: In the given table, all attributes except 59 Physics
3. Relationship explains how to do. Most commercial relational languages are EMP_ID are non-prime. STUDENT_HOBBY
A relationship is used to describe the relation between based on aspects of relational calculus including SQL-QBE and
entities. Diamond or rhombus is used to represent the QUEL. Lossless Decomposition
If the information is not lost from the relation that is Here, EMP_STATE & EMP_CITY dependent on EMP_ZIP and STU_ID HOBBY
relationship. Types of Relational calculus:
decomposed, then the decomposition will be lossless. EMP_ZIP dependent on EMP_ID. The non-prime attributes 21 Dancing
The lossless decomposition guarantees that the join of (EMP_STATE, EMP_CITY) transitively dependent on super 21 Singing
relations will result in the same relation as it was key(EMP_ID). It violates the rule of third normal form. 34 Dancing
decomposed. 74 Cricket
Types of relationship are as follows: The relation is said to be lossless decomposition if natural That's why we need to move the EMP_CITY and EMP_STATE 59 Hockey
a. One-to-One Relationship joins of all the decomposition give the original relation. to the new <EMPLOYEE_ZIP> table, with EMP_ZIP as a Primary Fifth normal form (5NF)
When only one instance of an entity is associated with the Example: key. A relation is in 5NF if it is in 4NF and not contains any join
relationship, then it is known as one to one relationship. EMPLOYEE_DEPARTMENT table: dependency and joining should be lossless.
For example, A female can marry to one male, and a male can EMPLOYEE table: 5NF is satisfied when all the tables are broken into as many
marry to one female. EMP_ID EMP_NAME EMP_AGE EMP_CITY tables as possible in order to avoid redundancy.
1. Tuple Relational Calculus (TRC) DEPT_ID DEPT_NAME EMP_ID EMP_NAME EMP_ZIP 5NF is also known as Project-join normal form (PJ/NF).
It is a non-procedural query language which is based on 22 Denim 28 Mumbai 222 Harry 201010 Example
finding a number of tuple variables also known as range 827 Sales 333 Stephan 02228 SUBJECT LECTURER SEMESTER
variable for which predicate holds true. It describes the 33 Alina 25 Delhi 444 Lan 60007 Computer Anshika Semester 1
b. One-to-many relationship
desired information without giving a specific procedure for 438 Marketing 555 Katharine 06389 Computer John Semester 1
When only one instance of the entity on the left, and more
obtaining that information. The tuple relational calculus is 46 Stephan 30 Bangalore 666 John 462007 Math John Semester 1
than one instance of an entity on the right associates with the
specified to select the tuples in a relation. In TRC, filtering 869 Finance EMPLOYEE_ZIP table: Math Akash Semester 2
relationship then this is known as a one-to-many relationship.
variable uses the tuples of a relation. The result of the relation 52 Katherine 36 Mumbai Chemistry Praveen Semester 1
For example, Scientist can invent many inventions, but the
can have one or more tuples. 575 Production EMP_ZIP EMP_STATE EMP_CITY In the above table, John takes both Computer and Math class
invention is done by the only specific scientist.
60 Jack 40 Noida 201010 UP Noida for Semester 1 but he doesn't take Math class for Semester 2.
Notation: 678 Testing 02228 US Boston In this case, combination of all these fields required to identify
A Query in the tuple relational calculus is expressed as The above relation is decomposed into two relations 60007 US Chicago a valid data.
following notation. EMPLOYEE and DEPARTMENT 06389 UK Norwich Suppose we add a new Semester as Semester 3 but do not
c. Many-to-one relationship {T | P (T)} or {T | Condition (T)} 462007 MP Bhopal know about the subject and who will be taking that subject so
When more than one instance of the entity on the left, and Where, EMPLOYEE table: we leave Lecturer and Subject as NULL. But all three columns
only one instance of an entity on the right associates with the T is the resulting tuples together acts as a primary key, so we can't leave other two
relationship then it is known as a many-to-one relationship. P(T) is the condition used to fetch T. EMP_ID EMP_NAME EMP_AGE EMP_CITY columns blank.
For example, Student enrolls for only one course, but a course For example: 22 Denim 28 Mumbai So to make the above table into 5NF, we can decompose it
can have many students. { [Link] | Author(T) AND [Link] = 'database' } 33 Alina 25 Delhi into three relations P1, P2 & P3:
Output: This query selects the tuples from the AUTHOR 46 Stephan 30 Bangalore
relation. It returns a tuple with 'name' from Author who has 52 Katherine 36 Mumbai P1
written an article on 'database'. 60 Jack 40 Noida
TRC (tuple relation calculus) can be quantified. In TRC, we can DEPARTMENT table SEMESTER SUBJECT
d. Many-to-many relationship
use Existential (∃) and Universal Quantifiers (∀). Semester 1 Computer
When more than one instance of the entity on the left, and
2. Domain Relational Calculus (DRC) DEPT_ID EMP_ID DEPT_NAME Semester 1 Math
more than one instance of an entity on the right associates
The second form of relation is known as Domain relational 827 22 Sales Semester 1 Chemistry
with the relationship then it is known as a many-to-many
calculus. In domain relational calculus, filtering variable uses 438 33 Marketing Semester 2 Math
relationship.
the domain of attributes. Domain relational calculus uses the 869 46 Finance P2
same operators as tuple calculus. It uses logical connectives ∧ 575 52 Production
(and), ∨ (or) and ┓ (not). It uses Existential (∃) and Universal 678 60 Testing SUBJECT LECTURER
Quantifiers (∀) to bind the variable. The QBE or Query by Now, when these two relations are joined on the common Computer Anshika
For example, Employee can assign by many projects and example is a query language related to domain relational column "EMP_ID", then the resultant relation will look like: Computer John
project can have many employees. calculus. Math John
Notation: Employee ⋈ Department Math Akash
{ a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)} Chemistry Praveen
Where, EMP_ID EMP_NAME EMP_AGE EMP_CITY P3
a1, a2 are attributes DEPT_ID DEPT_NAME
P stands for formula built by inner attributes 22 Denim 28 Mumbai SEMSTER LECTURER
827 Sales Semester 1 Anshika
For example: 33 Alina 25 Delhi Semester 1 John
{< article, page, subject > | ∈ javatpoint ∧ subject = 438 Marketing Semester 1 John
'database'} 46 Stephan 30 Bangalore Semester 2 Akash
Output: This query will yield the article, page, and subject 869 Finance Semester 1 Praveen
from the relational javatpoint, where the subject is a 52 Katherine 36 Mumbai
database. 575 Production
60 Jack 40 Noida
678 Testing
Hence, the decomposition is Lossless join decomposition.
Multivalued Dependency SQL pg-1 Pg-2 1. Cursor in PL/SQL : Transaction
Multivalued dependency occurs when two attributes in a SQL stands for Structured Query Language. It is used for 2. Data Manipulation Language commands: A cursor can be basically referred to as a pointer to the The transaction is a set of logically related operation. It
table are independent of each other but, both depend on a storing and managing data in relational database a. INSERT: The INSERT statement is a SQL query. It is used to context [Link] area is a memory area that is created by contains a group of tasks.
third attribute. management system (RDMS). insert data into the row of a table. Oracle when SQL statement is [Link] cursor is thus A transaction is an action or series of actions. It is performed
A multivalued dependency consists of at least two attributes It is a standard language for Relational Database System. It responsible for holding the rows that have been returned by a by a single user to perform operations for accessing the
that are dependent on a third attribute that's why it always enables a user to create, read, update and delete relational Syntax: SQL [Link] the PL/SQL controls the context area by contents of the database.
requires at least three attributes. databases and tables. the help of [Link] Active set is basically the set of rows Example: Suppose an employee of bank transfers Rs 800 from
Example: Suppose there is a bike manufacturer company All the RDBMS like MySQL, Informix, Oracle, MS Access and INSERT INTO TABLE_NAME that the cursor holds. The cursor can be of two types: Implicit X's account to Y's account. This small transaction contains
which produces two colors(white and black) of each model SQL Server use SQL as their standard database language. (col1, col2, col3,.... col N) Cursor, and Explicit Cursor. several low-level tasks:
every year. SQL allows users to query the database in a number of ways, VALUES (value1, value2, value3, .... valueN); Advantages of Cursor: X's Account
using English-like statements. Or They are helpful in performing the row by row processing and Open_Account(X)
BIKE_MODEL MANUF_YEAR COLOR Rules: also row wise validation on each row. Old_Balance = [Link]
M2011 2008 White SQL follows the following rules: INSERT INTO TABLE_NAME Better concurrency control can be achieved by using cursors. New_Balance = Old_Balance - 800
M2001 2008 Black VALUES (value1, value2, value3, .... valueN); Cursors are faster than while loops. [Link] = New_Balance
M3001 2013 White Structure query language is not case sensitive. Generally, For example: Disadvantages of Cursor: Close_Account(X)
M3001 2013 Black keywords of SQL are written in uppercase. They use more resources each time and thus may result in Y's Account
M4006 2017 White Statements of SQL are dependent on text lines. We can use a INSERT INTO javatpoint (Author, Subject) VALUES ("Sonoo", network round trip. Open_Account(Y)
M4006 2017 Black single SQL statement on one or multiple text line. "DBMS"); More number of network round trips can degrade the Old_Balance = [Link]
Here columns COLOR and MANUF_YEAR are dependent on Using the SQL statements, you can perform most of the b. UPDATE: This command is used to update or modify the performance and reduce the speed. New_Balance = Old_Balance + 800
BIKE_MODEL and independent of each other. actions in a database. value of a column in the table. 2. Trigger in PL/SQL : [Link] = New_Balance
SQL depends on tuple relational calculus and relational A Trigger is basically a program which gets automatically Close_Account(Y)
In this case, these two columns can be called as multivalued algebra. Syntax: executed in response to some events such as modification in Operations of Transaction:
dependent on BIKE_MODEL. The representation of these the [Link] of the events for their execution are DDL Following are the main operations of transaction:
dependencies is shown below: SQL process: UPDATE table_name SET [column_name1= statement, DML statement or any Database Read(X): Read operation is used to read the value of X from
When an SQL command is executing for any RDBMS, then the value1,...column_nameN = valueN] [WHERE CONDITION] [Link] are thus stored within the database and the database and stores it in a buffer in main memory.
BIKE_MODEL → → MANUF_YEAR system figure out the best way to carry out the request and For example: come into action when specific conditions [Link], they Write(X): Write operation is used to write the value back to
BIKE_MODEL → → COLOR the SQL engine determines that how to interpret the task. can be defined on any schema, table, view etc. There are six the database from the buffer.
This can be read as "BIKE_MODEL multidetermined In the process, various components are included. These UPDATE students types of triggers: BEFORE INSERT, AFTER INSERT, BEFORE Let's take an example to debit transaction from an account
MANUF_YEAR" and "BIKE_MODEL multidetermined COLOR". components can be optimization Engine, Query engine, Query SET User_Name = 'Sonoo' UPDATE, AFTER UPDATE, BEFORE DELETE, and AFTER DELETE. which consists of following operations:
Join Dependency dispatcher, classic, etc. WHERE Student_Id = '3' Example of Trigger in SQL Server 1. R(X);
Whenever we can recreate a table by simply joining various All the non-SQL queries are handled by the classic query c. DELETE: It is used to remove one or more row from a table. Let us understand how we can work with triggers in the SQL 2. X = X - 500;
tables where each of these tables consists of a subset of the engine, but SQL query engine won't handle logical files. Server. We can do this by first creating a table named 3. W(X);
table’s attribute, then this table is known as a Join Syntax: 'Employee' using the below statements: Let's assume the value of X before starting of the transaction
Dependency. is 4000.
Join decomposition is a further generalization of Multivalued DELETE FROM table_name [WHERE condition]; CREATE TABLE Employee The first operation reads X's value from database and stores it
dependencies. For example: ( in a buffer.
If the join of R1 and R2 over C is equal to relation R, then we Id INT PRIMARY KEY, The second operation will decrease the value of X by 500. So
can say that a join dependency (JD) exists. DELETE FROM javatpoint Name VARCHAR(45), buffer will contain 3500.
Where R1 and R2 are the decompositions R1(A, B, C) and WHERE Author="Sonoo"; Salary INT, The third operation will write the buffer's value to the
R2(C, D) of a given relations R (A, B, C, D). 3. Data Control Language commands: Gender VARCHAR(12), database. So X's final value will be 3500.
Alternatively, R1 and R2 are a lossless decomposition of R. a. Grant: It is used to give user access privileges to a database. DepartmentId INT But it may be possible that because of the failure of hardware,
A JD ⋈ {R1, R2,..., Rn} is said to hold over a relation R if R1, ) software or power, etc. that transaction may fail before
R2,....., Rn is a lossless-join decomposition. Example Advantages of Trigger: finished all the operations in the set.
The *(A, B, C, D), (C, D) will be a JD of R if the join of join's They are helpful in keeping the track of all the changes within For example: If in the above transaction, the debit transaction
attribute is equal to the relation R. GRANT SELECT, UPDATE ON MY_TABLE TO SOME_USER, the database. fails after executing operation 2 then X's value will remain
Here, *(R1, R2, R3) is used to indicate that relation R1, R2, R3 ANOTHER_USER; They also help in maintaining the integrity constraints. 4000 in the database which is not acceptable by the bank.
and so on are a JD of R. b. Revoke: It is used to take back permissions from the user. Disadvantages of Trigger: To solve this problem, we have two important operations:
Inclusion Dependency They are very difficult to view which makes the debugging also Commit: It is used to save the work done permanently.
An inclusion dependency specifies that one table's attribute Example difficult. Rollback: It is used to undo the work done.
values must be a subset of another table's attribute values. Characteristics of SQL Too much use of the triggers or writing complex codes within Transaction property
In other words, it enforces a constraint that values in one SQL is easy to learn. REVOKE SELECT, UPDATE ON MY_TABLE FROM USER1, USER2; a trigger can slow down the performance. The transaction has the four properties. These are used to
table must exist in another table. This is often used in SQL is used to access data from relational database 4. Transaction Control Language commands: maintain consistency in a database, before and after the
situations where one table represents a subset or management systems. Difference between Cursor and Trigger: transaction.
specialization of another table. SQL can execute queries against the database. a. Commit: Commit command is used to save all the Property of Transaction
Inclusion dependencies are quite common. They typically SQL is used to describe the data. transactions to the database. Atomicity
[Link] Cursor Trigger
show little influence on designing of the database. SQL is used to define the data in the database and manipulate Consistency
The inclusion dependency is a statement in which some it when needed. Syntax: Isolation
columns of a relation are contained in other columns. SQL is used to create and drop the database and table. Durability
SQL is used to create a view, stored procedure, function in a It is a pointer
The example of inclusion dependency is a foreign key. In one COMMIT;
database. which is used to It is a program
relation, the referring relation is contained in the primary key Example:
SQL allows users to set permissions on tables, procedures, and control the which gets
column(s) of the referenced relation.
views. context area and executed in
Suppose we have two relations R and S which was obtained by DELETE FROM CUSTOMERS 1.
Advantages of SQL also to go response to
translating two entity sets such that every R entity is also an S WHERE AGE = 25;
There are the following advantages of SQL: through the occurrence of
entity. COMMIT;
High speed: Using the SQL queries, the user can quickly and records in the some events.
Inclusion dependency would be happen if projecting R on its
efficiently retrieve a large amount of records from a database. database.
key attributes yields a relation that is contained in the relation b. Rollback: Rollback command is used to undo transactions
obtained by projecting S on its key attributes. No coding needed: In the standard SQL, it is very easy to that have not already been saved to the database.
In inclusion dependency, we should not split groups of manage the database system. It doesn't require a substantial Syntax:
attributes that participate in an inclusion dependency. amount of code to manage the database system. ROLLBACK;
Well defined standards: Long established are used by the SQL A cursor can be
In practice, most inclusion dependencies are key-based that is Example:
databases that are being used by ISO and ANSI. created within a
involved only keys. A trigger cannot
Portability: SQL can be used in laptop, PCs, server and even trigger by writing Atomicity
Dependency Preserving DELETE FROM CUSTOMERS 2. be created within
some mobile phones. the declare It states that all operations of the transaction take place at
It is an important constraint of the database. WHERE AGE = 25; a cursor.
Interactive language: SQL is a domain language used to statement inside once if not, the transaction is aborted.
In the dependency preservation, at least one decomposed ROLLBACK;
communicate with the database. It is also used to receive the trigger. There is no midway, i.e., the transaction cannot occur
table must satisfy every dependency. c. SAVEPOINT: It is used to roll the transaction back to a
If a relation R is decomposed into relation R1 and R2, then the answers to the complex questions in seconds. certain point without rolling back the entire transaction. partially. Each transaction is treated as one unit and either run
dependencies of R either must be a part of R1 or R2 or must Multiple data view: Using the SQL language, the users can Syntax: to completion or is not executed at all.
be derivable from the combination of functional make different views of the database structure. SAVEPOINT SAVEPOINT_NAME; Atomicity involves the following two operations:
dependencies of R1 and R2. Abort: If a transaction aborts then all the changes made are
For example, suppose there is a relation R (A, B, C, D) with SQL Datatype 5. Data Query Language commands: not visible.
functional dependency set (A->BC). The relational R is SQL Datatype is used to define the values that a column can DQL is used to fetch the data from the database. Commit: If a transaction commits then all the changes made
It gets created in
decomposed into R1(ABC) and R2(AD) which is dependency contain. are visible.
response to
preserving because FD A->BC is a part of relation R1(ABC). Every column is required to have a name and data type in the It uses only one command: Example: Let's assume that following transaction T consisting
execution of SQL It is a previously
database table. • SELECT 3. of T1 and T2. A consists of Rs 600 and B consists of Rs 300.
statement thus it stored program.
Checkpoint Datatype of SQL: a. SELECT: This is the same as the projection operation of Transfer Rs 100 from account A to account B.
is not previously
The checkpoint is a type of mechanism where all the previous relational algebra. It is used to select the attribute based on
stored.
logs are removed from the system and permanently stored in the condition described by WHERE clause. T1 T2
the storage disk.
The checkpoint is like a bookmark. While the execution of the Syntax:
transaction, such checkpoints are marked, and the transaction The main
The main Read(A) Read(B)
is executed then using the steps of the transaction, the log SELECT expressions function of the
function of A:= A-100 Y:= Y+100
files will be created. FROM TABLES cursor is retrieval
trigger is to Write(A) Write(B)
When it reaches to the checkpoint, then the transaction will WHERE conditions; 4. of rows from the
be updated into the database, and till that point, the entire For example: maintain the
result set one at After completion of the transaction, A consists of Rs 500 and B
log file will be removed from the file. Then the log file is integrity of the
SQL Commands a time (row by consists of Rs 400.
updated with the new step of transaction till next checkpoint SELECT emp_name database.
SQL commands are instructions. It is used to communicate row). If the transaction T fails after the completion of transaction T1
and so on. FROM employee but before completion of transaction T2, then the amount will
The checkpoint is used to declare a point before which the with the database. It is also used to perform specific tasks, WHERE age > 20; be deducted from A but not added to B. This shows the
DBMS was in the consistent state, and all transactions were functions, and queries of data.
SQL can perform various tasks like create a table, add data to inconsistent database state. In order to ensure correctness of
committed. Aggregate functions A trigger is database state, the transaction must be executed in entirety.
Recovery using Checkpoint tables, drop the table, modify the table, set permission for
Aggregate functions in a Database Management System A cursor is executed in Consistency
In the following manner, a recovery system recovers the users.
(DBMS) are a set of functions that operate on a group of rows activated and response to a The integrity constraints are maintained so that the database
database from this failure: Types of SQL Commands
or records from a database table and return a single value as a 5. thus created in DDL statement, is consistent before and after the transaction.
There are five types of SQL commands: DDL, DML, DCL, TCL,
result. These functions are often used to perform calculations response to any DML statement The execution of a transaction will leave a database in either
and DQL.
on data within a specified subset of records. Aggregate SQL statement. or any database its prior stable state or a new stable state.
functions are essential for generating summary statistics, operation. The consistent property of database states that every
performing data analysis, and summarizing data in various transaction sees a consistent database instance.
ways. Common aggregate functions in most relational DBMSs The transaction is used to transform the database from one
include: consistent state to another consistent state.
The main For example: The total amount must be maintained before or
COUNT: This function returns the number of rows in a The main
disadvantage of after the transaction.
specified group. It can be used to count all rows or rows that disadvantage of
cursor is that it Total before T occurs = 600+300=900
meet a specific condition. trigger is that
uses more Total after T occurs= 500+400=900
Example: they are hard to
6. resources each Therefore, the database is consistent. In the case when T1 is
SELECT COUNT(*) FROM employees; view which
time and thus completed but T2 fails, then inconsistency will occur.
The recovery system reads log files from the end to start. It makes the
results in Isolation
reads log files from T4 to T1. SUM: This function adds up the values of a numeric column debugging really
network round It shows that the data which is used at the time of execution
Recovery system maintains two lists, a redo-list, and an undo- within a specified group of rows. difficult.
trip. of a transaction cannot be used by the second transaction
list. Example:
The transaction is put into redo state if the recovery system 1. Data Definition Language (DDL) commands until the first one is completed.
a. CREATE It is used to create a new table in the database. SELECT SUM(salary) FROM employees WHERE department = In isolation, if the transaction T1 is being executed and using
sees a log with <Tn, Start> and <Tn, Commit> or just <Tn, 'Sales';
Commit>. In the redo-list and their previous list, all the Syntax: the data item X, then that data item can't be accessed by any
transactions are removed and then redone before saving their CREATE TABLE TABLE_NAME (COLUMN_NAME other transaction T2 until the transaction T1 ends.
DATATYPES[,....]); AVG: This function calculates the average (mean) value of a The concurrency control subsystem of the DBMS enforced the
logs. numeric column within a specified group.
For example: In the log file, transaction T2 and T3 will have Example: isolation property.
CREATE TABLE EMPLOYEE(Name VARCHAR2(20), Email Example: Durability
<Tn, Start> and <Tn, Commit>. The T1 transaction will have SELECT AVG(age) FROM students;
only <Tn, commit> in the log file. That's why the transaction is VARCHAR2(100), DOB DATE); The durability property is used to indicate the performance of
committed after the checkpoint is crossed. Hence it puts T1, b. DROP: It is used to delete both the structure and record the database's consistent state. It states that the transaction
stored in the table. MIN: This function returns the minimum value of a column made the permanent changes.
T2 and T3 transaction into redo list. within a specified group.
The transaction is put into undo state if the recovery system Syntax They cannot be lost by the erroneous operation of a faulty
DROP TABLE table_name; Example: transaction or by the system failure. When a transaction is
sees a log with <Tn, Start> but no commit or abort log found. SELECT MIN(price) FROM products;
In the undo-list, all the transactions are undone, and their logs Example completed, then the database reaches a state known as the
are removed. DROP TABLE EMPLOYEE; consistent state. That consistent state cannot be lost, even in
c. ALTER: It is used to alter the structure of the database. This MAX: This function returns the maximum value of a column the event of a system's failure.
For example: Transaction T4 will have <Tn, Start>. So T4 will within a specified group.
be put into undo list since this transaction is not yet complete change could be either to modify the characteristics of an The recovery subsystem of the DBMS has the responsibility of
existing attribute or probably to add a new attribute. Example: Durability property.
and failed amid. SELECT MAX(score) FROM exam_results;
Syntax:
To add a new column in the table GROUP_CONCAT (or STRING_AGG in some DBMSs): This
ALTER TABLE table_name ADD column_name COLUMN- function concatenates values from multiple rows into a single
definition; string, with an optional delimiter, within a specified group.
To modify existing column in the table: Example (MySQL):
ALTER TABLE table_name MODIFY(column_definitions....); SELECT GROUP_CONCAT(name) FROM students;
EXAMPLE:
ALTER TABLE STU_DETAILS ADD(ADDRESS VARCHAR2(20)); Variance and Standard Deviation: Some DBMSs provide
ALTER TABLE STU_DETAILS MODIFY (NAME VARCHAR2(20)); functions to calculate the variance and standard deviation of a
d. TRUNCATE: It is used to delete all the rows from the table numeric column within a specified group. These functions
and free the space containing the table. help in analyzing the spread of data.
Syntax: Example (PostgreSQL):
TRUNCATE TABLE table_name; SELECT VARIANCE(sales) FROM monthly_data;
Example:
TRUNCATE TABLE EMPLOYEE; These aggregate functions are often used in conjunction with
the GROUP BY clause, which groups rows based on one or
more columns, allowing you to perform aggregate calculations
on each group independently.

Example (GROUP BY with COUNT):


SELECT department, COUNT(*) FROM employees GROUP BY
department;
Concurrency Control Timestamp Ordering Protocol for Concurrency Control States of Transaction Deadlock in DBMS Schedule
Concurrency Control is the working concept that is required In a database, the transaction can be in one of the following A deadlock is a condition where two or more transactions are A series of operation from one transaction to another
for controlling and managing the concurrent execution of The Timestamp Ordering Protocol is used to order the states - waiting indefinitely for one another to give up locks. Deadlock transaction is known as schedule. It is used to preserve the
database operations and thus avoiding the inconsistencies in transactions based on their Timestamps. The order of is said to be one of the most feared complications in DBMS as order of the operation in each of the individual transaction.
the database. Thus, for maintaining the concurrency of the transaction is nothing but the ascending order of the no task ever gets finished and is in waiting state forever.
database, we have the concurrency control protocols. transaction creation. For example: In the student table, transaction T1 holds a lock
Concurrency Control Protocols The priority of the older transaction is higher that's why it on some rows and needs to update some rows in the grade
The concurrency control protocols ensure the atomicity, executes first. To determine the timestamp of the transaction, table. Simultaneously, transaction T2 holds locks on some
consistency, isolation, durability and serializability of the this protocol uses system time or logical counter. rows in the grade table and needs to update the rows in the
concurrent execution of the database transactions. Therefore, The lock-based protocol is used to manage the order between Student table held by Transaction T1.
these protocols are categorized as: conflicting pairs among transactions at the execution time. Now, the main problem arises. Now Transaction T1 is waiting
• Lock Based Concurrency Control Protocol But Timestamp based protocols start working as soon as a for T2 to release its lock and similarly, transaction T2 is waiting
• Time Stamp Concurrency Control Protocol transaction is created. for T1 to release its lock. All activities come to a halt state and
• Validation Based Concurrency Control Protocol Let's assume there are two transactions T1 and T2. Suppose remain at a standstill. It will remain in a standstill until the
We will understand and discuss each protocol one by one in the transaction T1 has entered the system at 007 times and DBMS detects the deadlock and aborts one of the
our next sections. transaction T2 has entered the system at 009 times. T1 has Active state transactions.
Recovery with Concurrent Transaction the higher priority, so it executes first as it is entered the The active state is the first state of every transaction. In this
Whenever more than one transaction is being executed, then system first. state, the transaction is being executed.
the interleaved of logs occur. During recovery, it would The timestamp ordering protocol also maintains the For example: Insertion or deletion or updating a record is
1. Serial Schedule
become difficult for the recovery system to backtrack all logs timestamp of last 'read' and 'write' operation on a data. done here. But all the records are still not saved to the
The serial schedule is a type of schedule where one
and then start recovering. Basic Timestamp ordering protocol works as follows: database.
transaction is executed completely before starting another
To ease this situation, 'checkpoint' concept is used by most 1. Check the following condition whenever a transaction Ti Partially committed
transaction. In the serial schedule, when the first transaction
DBMS. issues a Read (X) operation: In the partially committed state, a transaction executes its
completes its cycle, then the next transaction is executed.
As we have discussed checkpoint in Transaction Processing If W_TS(X) >TS(Ti) then the operation is rejected. final operation, but the data is still not saved to the database.
For example: Suppose there are two transactions T1 and T2
Concept of this tutorial, so you can go through the concepts If W_TS(X) <= TS(Ti) then the operation is executed. In the total mark calculation example, a final display of the
which have some operations. If it has no interleaving of
again to make things more clear. Timestamps of all the data items are updated. total marks step is executed in this state.
operations, then there are the following two possible
Lock-Based Protocol 2. Check the following condition whenever a transaction Ti Committed
outcomes:
In this type of protocol, any transaction cannot read or write issues a Write(X) operation: A transaction is said to be in a committed state if it executes Deadlock Avoidance Execute all the operations of T1 which was followed by all the
data until it acquires an appropriate lock on it. There are two If TS(Ti) < R_TS(X) then the operation is rejected. If TS(Ti) < all its operations successfully. In this state, all the effects are When a database is stuck in a deadlock state, then it is better operations of T2.
types of lock: W_TS(X) then the operation is rejected and Ti is rolled back now permanently saved on the database system. to avoid the database rather than aborting or restating the Execute all the operations of T1 which was followed by all the
1. Shared lock: otherwise the operation is executed. Failed state database. This is a waste of time and resource. operations of T2.
It is also known as a Read-only lock. In a shared lock, the data Where, TS(TI) denotes the timestamp of the transaction Ti. If any of the checks made by the database recovery system Deadlock avoidance mechanism is used to detect any In the given (a) figure, Schedule A shows the serial schedule
item can only read by the transaction. R_TS(X) denotes the Read time-stamp of data-item X. fails, then the transaction is said to be in the failed state. deadlock situation in advance. A method like "wait for graph" where T1 followed by T2.
It can be shared between the transactions because when the W_TS(X) denotes the Write time-stamp of data-item X. In the example of total mark calculation, if the database is not is used for detecting the deadlock situation but this method is In the given (b) figure, Schedule B shows the serial schedule
transaction holds a lock, then it can't update the data on the able to fire a query to fetch the marks, then the transaction suitable only for the smaller database. For the larger where T2 followed by T1.
data item. Validation Based Protocol for Concurrency Control will fail to execute. database, deadlock prevention method can be used. 2. Non-serial Schedule
2. Exclusive lock: Validation phase is also known as optimistic concurrency Aborted Deadlock Detection If interleaving of operations is allowed, then there will be non-
In the exclusive lock, the data item can be both reads as well control technique. In the validation based protocol, the If any of the checks fail and the transaction has reached a In a database, when a transaction waits indefinitely to obtain serial schedule.
as written by the transaction. transaction is executed in the following three phases: failed state then the database recovery system will make sure a lock, then the DBMS should detect whether the transaction It contains many possible orders in which the system can
This lock is exclusive, and in this lock, multiple transactions do Read phase: In this phase, the transaction T is read and that the database is involved in a deadlock or not. The lock manager maintains a execute the individual operations of the transactions.
not modify the same data simultaneously. executed. It is used to read the value of various data items Wait for the graph to detect the deadlock cycle in the In the given figure (c) and (d), Schedule C and Schedule D are
There are four types of lock protocols available: and stores them in temporary local variables. It can perform Thomas write Rule database. the non-serial schedules. It has interleaving of operations.
1. Simplistic lock protocol all the write operations on temporary variables without an Thomas Write Rule provides the guarantee of serializability Wait for Graph 3. Serializable schedule
It is the simplest way of locking the data while transaction. update to the actual database. order for the protocol. It improves the Basic Timestamp This is the suitable method for deadlock detection. In this The serializability of schedules is used to find non-serial
Simplistic lock-based protocols allow all the transactions to Validation phase: In this phase, the temporary variable value Ordering Algorithm. method, a graph is created based on the transaction and their schedules that allow the transaction to execute concurrently
get the lock on the data before insert or delete or update on will be validated against the actual data to see if it violates the The basic Thomas write rules are as follows: lock. If the created graph has a cycle or closed loop, then without interfering with one another.
it. It will unlock the data item after completing the serializability. If TS(T) < R_TS(X) then transaction T is aborted and rolled there is a deadlock. It identifies which schedules are correct when executions of
transaction. Write phase: If the validation of the transaction is validated, back, and operation is rejected. The wait for the graph is maintained by the system for every the transaction have interleaving of their operations.
2. Pre-claiming Lock Protocol then the temporary results are written to the database or If TS(T) < W_TS(X) then don't execute the W_item(X) transaction which is waiting for some data held by the others. A non-serial schedule will be serializable if its result is equal to
Pre-claiming Lock Protocols evaluate the transaction to list all system otherwise the transaction is rolled back. operation of the transaction and continue processing. The system keeps checking the graph if there is any cycle in the result of its transactions executed serially.
the data items on which they need locks. Here each phase has the following different timestamps: If neither condition 1 nor condition 2 occurs, then allowed to the graph.
Before initiating an execution of the transaction, it requests Start(Ti): It contains the time when Ti started its execution. execute the WRITE operation by transaction Ti and set The wait for a graph for the above scenario is shown below:
DBMS for all the lock on all those data items. Validation (Ti): It contains the time when Ti finishes its read W_TS(X) to TS(T).
If all the locks are granted then this protocol allows the phase and starts its validation phase. If we use the Thomas write rule then some serializable
transaction to begin. When the transaction is completed then Finish(Ti): It contains the time when Ti finishes its write phase. schedule can be permitted that does not conflict serializable
it releases all the lock. This protocol is used to determine the time stamp for the as illustrate by the schedule in a given figure:
If all the locks are not granted then this protocol allows the transaction for serialization using the time stamp of the
transaction to rolls back and waits until all the locks are validation phase, as it is the actual phase which determines if
granted. the transaction will commit or rollback.
Hence TS(T) = validation(T).
The serializability is determined during the validation process.
It can't be decided in advance.
While executing the transaction, it ensures a greater degree of
concurrency and also less number of conflicts. Figure: A Serializable Schedule that is not Conflict Serializable
Thus it contains transactions which have less number of In the above figure, T1's read and precedes T1's write of the
rollbacks. same data item. This schedule does not conflict serializable.
Granularity: Thomas write rule checks that T2's write is never seen by any
It is the size of data item allowed to lock. transaction. If we delete the write operation in transaction T2,
Multiple Granularity: then conflict serializable schedule can be obtained which is
shown in below figure. Deadlock Prevention
It can be defined as hierarchically breaking up the database
3. Two-phase locking (2PL) Deadlock prevention method is suitable for a large database.
into blocks which can be locked.
The two-phase locking protocol divides the execution phase of If the resources are allocated in such a way that deadlock
The Multiple Granularity protocol enhances concurrency and
the transaction into three parts. never occurs, then the deadlock can be prevented.
reduces lock overhead.
In the first part, when the execution of the transaction starts, The Database management system analyzes the operations of
It maintains the track of what to lock and how to lock.
it seeks permission for the lock it requires. the transaction whether they can create a deadlock situation
It makes easy to decide either to lock a data item or to unlock
In the second part, the transaction acquires all the locks. The or not. If they do, then the DBMS never allowed that
a data item. This type of hierarchy can be graphically
third phase is started as soon as the transaction releases its transaction to be executed.
represented as a tree. Figure: A Conflict Serializable Schedule
first lock. Wait-Die scheme
For example: Consider a tree which has four levels of nodes.
In the third phase, the transaction cannot demand any new In this scheme, if a transaction requests for a resource which
The first level or higher level shows the entire database.
locks. It only releases the acquired locks. The second level represents a node of type area. The higher Recoverability of Schedule is already held with a conflicting lock by another transaction
level database consists of exactly these areas. Sometimes a transaction may not execute completely due to a then the DBMS simply checks the timestamp of both
The area consists of children nodes which are known as files. software issue, system crash or hardware failure. In that case, transactions. It allows the older transaction to wait until the
No file can be present in more than one area. the failed transaction has to be rollback. But some other resource is available for execution.
Finally, each file contains child nodes known as records. The transaction may also have used value produced by the failed Let's assume there are two transactions Ti and Tj and let TS(T)
file has exactly those records that are its child nodes. No transaction. So we also have to rollback those transactions. is a timestamp of any transaction T. If T2 holds a lock by some
records represent in more than one file. other transaction and T1 is requesting for resources held by
Hence, the levels of the tree starting from the top level are as T2 then the following actions are performed by DBMS:
follows: Check if TS(Ti) < TS(Tj) - If Ti is the older transaction and Tj has
Database held some resource, then Ti is allowed to wait until the data-
There are two phases of 2PL: Area item is available for execution. That means if the older
Growing phase: In the growing phase, a new lock on the data File transaction is waiting for a resource which is locked by the
item may be acquired by the transaction, but none can be Record younger transaction, then the older transaction is allowed to
released. wait for resource until it is available.
Shrinking phase: In the shrinking phase, existing lock held by Check if TS(Ti) < TS(Tj) - If Ti is older transaction and has held
the transaction may be released, but no new locks can be some resource and if Tj is waiting for it, then Tj is killed and
acquired. restarted later with the random delay but with the same
In the below example, if lock conversion is allowed then the timestamp.
following phase can happen: Wound wait scheme
Upgrading of lock (from S(a) to X (a)) is allowed in growing In wound wait scheme, if the older transaction requests for a
The above table 1 shows a schedule which has two resource which is held by the younger transaction, then older
phase.
transactions. T1 reads and writes the value of A and that value transaction forces younger one to kill the transaction and
Downgrading of lock (from X(a) to S(a)) must be done in
is read and written by T2. T2 commits but later on, T1 fails. release the resource. After the minute delay, the younger
shrinking phase.
Due to the failure, we have to rollback T1. T2 should also be transaction is restarted but with the same timestamp.
Example:
rollback because it reads the value written by T1, but T2 can't If the older transaction has held a resource which is requested
be rollback because it already committed. So this type of by the Younger transaction, then the younger transaction is
In this example, the highest level shows the entire database. Here,
schedule is known as irrecoverable schedule. asked to wait until older releases it.
The levels below are file, record, and fields. Schedule A and Schedule B are serial schedule.
Irrecoverable schedule: The schedule will be irrecoverable if DBMS Concurrency Control
There are three additional lock modes with multiple Schedule C and Schedule D are Non-serial schedule.
Tj reads the updated value of Ti and Tj committed before Ti Concurrency Control is the management procedure that is
granularity:
commit. required for controlling concurrent execution of the
Intention Mode Lock
Intention-shared (IS): It contains explicit locking at a lower operations that take place on a database.
level of the tree but only with shared locks. But before knowing about concurrency control, we should
Intention-Exclusive (IX): It contains explicit locking at a lower know about concurrent execution.
level with exclusive or shared locks. Concurrent Execution in DBMS
Shared & Intention-Exclusive (SIX): In this lock, the node is In a multi-user system, multiple users can access and use the
locked in shared mode, and some node is locked in exclusive same database at one time, which is known as the concurrent
mode by the same transaction. execution of the database. It means that the same database is
Compatibility Matrix with Intention Lock Modes: The below executed simultaneously on a multi-user system by different
The following way shows how unlocking and locking work with table describes the compatibility matrix for these lock modes: users.
2-PL. While working on the database transactions, there occurs the
Transaction T1: requirement of using the database by multiple users for
Growing phase: from step 1-3 performing different operations, and in that case, concurrent
Shrinking phase: from step 5-7 execution of the database is performed.
Lock point: at 3 The thing is that the simultaneous execution that is performed
The above table 2 shows a schedule with two transactions. should be done in an interleaved manner, and no operation
Transaction T2: Transaction T1 reads and writes A, and that value is read and
Growing phase: from step 2-6 should affect the other executing operations, thus maintaining
written by transaction T2. But later on, T1 fails. Due to this, the consistency of the database. Thus, on making the
Shrinking phase: from step 8-9 we have to rollback T1. T2 should be rollback because T2 has
Lock point: at 6 concurrent execution of the transaction operations, there
read the value written by T1. As it has not committed before occur several challenging problems that need to be solved.
4. Strict Two-phase locking (Strict-2PL) It uses the intention lock modes to ensure serializability. It T1 commits so we can rollback transaction T2 as well. So it is
The first phase of Strict-2PL is similar to 2PL. In the first phase, requires that if a transaction attempts to lock a node, then Problems with Concurrent Execution
recoverable with cascade rollback. In a database transaction, the two main operations
after acquiring all the locks, the transaction continues to that node must follow these protocols: Recoverable with cascading rollback: The schedule will be
execute normally. Transaction T1 should follow the lock-compatibility matrix. are READ and WRITE operations. So, there is a need to
recoverable with cascading rollback if Tj reads the updated manage these two operations in the concurrent execution of
The only difference between 2PL and strict 2PL is that Strict- Transaction T1 firstly locks the root of the tree. It can lock it in value of Ti. Commit of Tj is delayed till commit of Ti.
2PL does not release a lock after using it. any mode. the transactions as if these operations are not performed in
Strict-2PL waits until the whole transaction to commit, and If T1 currently has the parent of the node locked in either IX or an interleaved manner, and the data may become
then it releases all the locks at a time. IS mode, then the transaction T1 will lock a node in S or IS inconsistent. So, the following problems occur with the
Strict-2PL protocol does not have shrinking phase of lock mode only. Concurrent Execution of the operations:
release. If T1 currently has the parent of the node locked in either IX or Problem 1: Lost Update Problems (W - W Conflict)
SIX modes, then the transaction T1 will lock a node in X, SIX, The problem occurs when two different database transactions
or IX mode only. perform the read/write operations on the same database
If T1 has not previously unlocked any node only, then the items in an interleaved manner (i.e., concurrent execution)
Transaction T1 can lock a node. that makes the values of the items incorrect hence making the
If T1 currently has none of the children of the node-locked database inconsistent.
only, then Transaction T1 will unlock a node. Dirty Read Problems (W-R Conflict)
Observe that in multiple-granularity, the locks are acquired in The dirty read problem occurs when one transaction updates
top-down order, and locks must be released in bottom-up an item of the database, and somehow the transaction fails,
The above Table 3 shows a schedule with two transactions. and before the data gets rollback, the updated database item
order.
Transaction T1 reads and write A and commits, and that value is accessed by another transaction. There comes the Read-
If transaction T1 reads record Ra9 in file Fa, then transaction T1
is read and written by T2. So this is a cascade less recoverable Write Conflict between both transactions.
needs to lock the database, area A1 and file Fa in IX mode.
schedule. Unrepeatable Read Problem (W-R Conflict)
Finally, it needs to lock Ra2 in S mode.
It does not have cascading abort as 2PL does. If transaction T2 modifies record Ra9 in file Fa, then it can do so Also known as Inconsistent Retrievals Problem that occurs
after locking the database, area A1 and file Fa in IX mode. when in a transaction, two different values are read for the
Finally, it needs to lock the Ra9 in X mode. same database item.
If transaction T3 reads all the records in file Fa, then
transaction T3 needs to lock the database, and area A in IS
mode. At last, it needs to lock Fa in S mode.
If transaction T4 reads the entire database, then T4 needs to
lock the database in S mode.
Testing of Serializability Conflict Serializable Schedule Failure Classification
Serialization Graph is used to test the Serializability of a A schedule is called conflict serializability if after swapping of Difference Between DROP DELETE and TRUNCATE To find that where the problem has occurred, we generalize a
schedule. non-conflicting operations, it can transform into a serial failure into the following categories:
Assume a schedule S. For S, we construct a graph known as schedule. • Transaction failure
precedence graph. This graph has a pair G = (V, E), where V The schedule will be a conflict serializable if it is conflict • System crash
consists a set of vertices, and E consists a set of edges. The set equivalent to a serial schedule. TRUNCA TRUN • Disk failure
DROP
of vertices is used to contain all the transactions participating Conflicting Operations TE CATE 1. Transaction failure
in the schedule. The set of edges is used to contain all edges Ti The two operations become conflicting if all conditions satisfy: The transaction failure occurs when it fails to execute or when
->Tj for which one of the three conditions holds: Both belong to separate transactions. it reaches a point from where it can't go any further. If a few
Create a node Ti → Tj if Ti executes write (Q) before Tj They have the same data item. transaction or process is hurt, then this is called as transaction
executes read (Q). They contain at least one write operation. failure.
Create a node Ti → Tj if Ti executes read (Q) before Tj Example: It It Reasons for a transaction failure could be -
executes write (Q). Swapping is possible only if S1 and S2 are logically equal. compl It remov Logical errors: If a transaction cannot complete due to some
Create a node Ti → Tj if Ti executes write (Q) before Tj etely removes es all code error or an internal error condition, then the logical error
executes write (Q). remov one or the occurs.
es the more rows Syntax error: It occurs where the DBMS itself terminates an
table records from active transaction because the database system is not able to
from from the execute it. For example, The system aborts an active
the the existin transaction, in case of deadlock or resource unavailability.
databa table. g 2. System Crash
se. table System failure can occur due to power failure or other
If a precedence graph contains a single edge Ti → Tj, then all
the instructions of Ti are executed before the first instruction hardware or software failure. Example: Operating system
of Tj is executed. error.
If a precedence graph for schedule S contains a cycle, then S is Fail-stop assumption: In the system crash, non-volatile storage
Here, S1 = S2. That means it is non-conflict.
non-serializable. If the precedence graph has no cycle, then S It is a It is a It is a is assumed not to be corrupted.
is known as serializable. DDL DML DDL 3. Disk Failure
For example: comm comman comm It occurs where hard-disk drives or storage drives used to fail
and d and frequently. It was a common problem in the early days of
technology evolution.
Disk failure occurs due to the formation of bad sectors, disk
head crash, and unreachability to the disk or any other failure,
which destroy all or part of disk storage.
DELETE
FROM TRUN
DROP Log-Based Recovery
tble_na CATE
TABLE The log is a sequence of records. Log of each transaction is
meWHE TABLE
table_ maintained in some stable storage so that if any failure
RE table_
Here, S1 ≠ S2. That means it is conflict. name; occurs, then it can be recovered from there.
conditio name;
Conflict Equivalent If any operation is performed on the database, then it will be
ns;
In the conflict equivalent, one can be transformed to another recorded in the log.
by swapping non-conflicting operations. In the given example, But the process of storing the logs should be done before the
S2 is conflict equivalent to S1 (S1 can be converted to S2 by actual transaction is applied in the database.
Explanation: swapping non-conflicting operations). Let's assume there is a transaction to modify the City of a
Read(A): In T1, no subsequent writes to A, so no new edges Two schedules are said to be conflict equivalent if and only if: It student. The following logs are written for this transaction.
Read(B): In T2, no subsequent writes to B, so no new edges They contain the same set of the transaction. compl When the transaction is initiated, then it writes 'start' log.
Read(C): In T3, no subsequent writes to C, so no new edges If each pair of conflict operations are ordered in the same etely It <Tn, Start>
Write(B): B is subsequently read by T3, so add edge T2 → T3 way. remov It doesn When the transaction modifies the City from 'Noida' to
Example: es the doesn’t ’t free 'Bangalore', then another log is written to the file.
Write(C): C is subsequently read by T1, so add edge T3 → T1
allocat free the the <Tn, City, 'Noida', 'Bangalore' >
Write(A): A is subsequently read by T2, so add edge T1 → T2
ed allocate alloca When the transaction is finished, then it writes another log to
Write(A): In T2, no subsequent reads to A, so no new edges
space d space ted indicate the end of the transaction.
Write(C): In T1, no subsequent reads to C, so no new edges
for the of the space <Tn, Commit>
Write(B): In T3, no subsequent reads to B, so no new edges
table table. of the There are two approaches to modify the database:
Precedence graph for schedule S1:
from table. 1. Deferred database modification:
memo The deferred modification technique occurs if the transaction
ry. does not modify the database until it has committed.
In this method, all the logs are created and stored in the
stable storage, and the database is updated when a
transaction commits.
Doesn 2. Immediate database modification:
Remov Doesn’t The Immediate modification technique occurs if database
’t
Schedule S2 is a serial schedule because, in this, all operations es the affect modification occurs while the transaction is still active.
The precedence graph for schedule S1 contains a cycle that's affect
of T1 are performed before starting any operation of T2. entire the In this technique, the database is modified immediately after
why Schedule S1 is non-serializable the
Schedule S1 can be transformed into a serial schedule by table table every operation. It follows an actual database modification.
table
swapping non-conflicting operations of S1. struct structur Recovery using Log records
struct
After swapping of non-conflict operations, the schedule S1 ure. e When the system is crashed, then the system consults the log
ure
becomes: to find which transactions need to be undone and which need
T1 T2 to be redone.
If the log contains the record <Ti, Start> and <Ti, Commit> or
Read(A)
<Ti, Commit>, then the Transaction Ti needs to be redone. If
It is It is It is
Write(A) log contains record<Tn, Start> but does not contain the record
faster slower faster
Read(B) either <Ti, commit> or <Ti, abort>, then the Transaction Ti
Write(B)
than than the than
needs to be undone
Read(A) DELET DROP both
Write(A) E but and the
Read(B) slower TRUNCA DELET
Write(B)
than TE E and
TRUN comman DROP
Since, S1 is conflict serializable.
View Serializability
Explanation: A schedule will view serializable if it is view equivalent to a DBA
Read(A): In T4,no subsequent writes to A, so no new edges serial schedule. CATE ds as it comm DBA stands for Database Administrator. A Database
Read(C): In T4, no subsequent writes to C, so no new edges If a schedule is conflict serializable, then it will be view as it deletes ands Administrator is a professional responsible for managing and
Write(A): A is subsequently read by T5, so add edge T4 → T5 serializable. firstly one row as it maintaining databases within an organization. Their primary
Read(B): In T5,no subsequent writes to B, so no new edges The view serializable which does not conflict serializable delete at a delete responsibilities revolve around ensuring the reliability,
Write(C): C is subsequently read by T6, so add edge T4 → T6 contains blind writes. s the time s all security, and performance of a database system.
Write(B): A is subsequently read by T6, so add edge T5 → T6 View Equivalent rows base on the Responsilbilities of a Database Administrator:
Write(C): In T6, no subsequent reads to C, so no new edges Two schedules S1 and S2 are said to be view equivalent if they and the record 2. Database installation and configuration.
Write(A): In T5, no subsequent reads to A, so no new edges satisfy the following conditions: then specifie s at a 3. Database design and optimization.
Write(B): In T6, no subsequent reads to B, so no new edges 1. Initial Read the d time 4. Data security and access control.
Precedence graph for schedule S2: An initial read of both schedules must be the same. Suppose table conditio witho
5. Backup and recovery planning.
two schedule S1 and S2. In schedule S1, if a transaction T1 is from ns. ut any
6. Performance tuning.
reading the data item A, then in S2, transaction T1 should also the condit
databa ion. 7. Patch management and updates.
read A.
se. 8. Data migration and ETL processes.
9. Capacity planning.
10. Troubleshooting and problem resolution.
11. Documentation and compliance.

Not Can’t 12. Disaster recovery planning.


applic be
able as used
The precedence graph for schedule S2 contains no cycle that's it as it is
why ScheduleS2 is serializable. Can be
Above two schedules are view equivalent because Initial read operat applic
used
operation in S1 is done by T1 and in S2 it is also done by T1. es on able
2. Updated Read the on the
In schedule S1, if Ti is reading A which is updated by Tj then in entire entire
S2 also, Ti should read A which is updated by Tj. table table

1.

Above two schedules are not view equal because, in S1, T3 is


reading A updated by T2 and in S2, T3 is reading A updated by
T1.
3. Final Write
A final write must be the same between both the schedules.
In schedule S1, if a transaction T1 updates A at last then in S2,
final writes operations should also be done by T1.

Above two schedules is view equal because Final write


operation in S1 is done by T3 and in S2, the final write
operation is also done by T3.

You might also like