PL/SQL Cursors and Triggers Explained
PL/SQL Cursors and Triggers Explained
A Cursor in PL/SQL is a pointer to a context area that stores the result set of a query.
PL/SQL Cursors
The cursor is used to retrieve data one row at a time from the results set, unlike other SQL
commands that operate on all rows at once.
The Data that is stored in the Cursor is called the Active Data Set. Oracle DBMS has another
predefined area in the main memory Set, within which the cursors are opened. Hence the size of
the cursor is limited by the size of this pre-defined area.
Cursor Actions
Key actions involved in working with cursors in PL/SQL are:
1. Declare Cursor: A cursor is declared by defining the SQL statement that returns a result
set.
2. Open: A Cursor is opened and populated by executing the SQL statement defined by the
cursor.
3. Fetch: When the cursor is opened, rows can be fetched from the cursor one by one or in a
block to perform data manipulation.
4. Close: After data manipulation, close the cursor explicitly.
5. Deallocate: Finally, delete the cursor definition and release all the system resources
associated with the cursor.
Types of Cursors in PL/SQL
Cursors are classified depending on the circumstances in which they are opened.
• Implicit Cursor: If the Oracle engine opened a cursor for its internal processing it is
known as an Implicit Cursor. It is created “automatically” for the user by Oracle when a
query is executed and is simpler to code.
• Explicit Cursor: A Cursor can also be opened for processing data through a PL/SQL
block, on demand. Such a user-defined cursor is known as an Explicit Cursor.
Explicit cursor
An explicit cursor is defined in the declaration section of the PL/SQL Block. It is created on a
SELECT Statement which returns more than one row.
Where,
Syntax
DECLARE
variables;
records;
CURSOR cursor_name IS select_statement;
BEGIN
OPEN cursor_name;
LOOP
FETCH cursor_name INTO variables OR records;
EXIT WHEN cursor_name%NOTFOUND;
It automatically runs some code when something happens to a table — like when you insert,
update, or delete a record.
You don't have to call it manually — the database "triggers" it by itself when the event happens.
2. Simple Example
Instead of manually writing code every time, you can create a trigger that says:
"Hey database, whenever someone updates the Bank_Account table, automatically save the old
and new balance in a history table."
The trigger will do it every time without you worrying about it!
3. Real Life Analogy
sql
CopyEdit
CREATE OR REPLACE TRIGGER trigger_name
BEFORE INSERT ON table_name
FOR EACH ROW
BEGIN
-- Code to run automatically
END;
Meaning:
Fourth Normal Form (4NF): A relation said to be in 4NF if it is in Boyce Codd normal
form and should have no multi-valued dependency.
✓ For a dependency A→ B, if for a single value of A, multiple value of B exists then the
relation will be multi-valued dependency.
✓ Note: Multi Valued Dependency: A table is said to have multi-valued dependency, if the
following conditions are true,
1. For a dependency A → B, if for a single value of A, multiple value of B exists, then the
table may have multi-valued dependency.
2. Also, a table should have at-least 3 columns for it to have a multi-valued dependency.
3. And, for a relation R (A, B, C), if there is a multi-valued dependency between, A and
B, then B and C should be independent of each other.
■ If all these conditions are true for any relation (table), it is said to have multi-valued
dependency.
Example
➢ The given STUDENT table is in 3NF but the COURSE and HOBBY are two independent
entity. Hence, there is no relationship between COURSE and HOBBY. In the STUDENT
relation, student with STU_ID, 21 contains two courses, Computer and Math and two
hobbies, Dancing and Singing. So there is a Multi-valued dependency on STU_ID,
which leads to un-necessary repetition of data.
➢ So to make the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE
STUDENT_HOBBY
A join dependency happens when you can split a table into two or more smaller tables,
and then join (combine) them back together to get the original table — exactly the same,
without any missing or wrong data.
Think of it like breaking a big LEGO house into smaller sections and being able to
rebuild the same house perfectly.
But if splitting messes up the original information when rejoining, it’s a problem!
Join dependency ensures that after splitting and rejoining, nothing is lost or wrongly
added.
3. Formal Definition
In formal terms:
A relation (a table) R has a join dependency if it can be reconstructed exactly by joining
multiple projections (smaller tables with selected columns) without any extra or missing
rows.
4. Simple Example
Imagine a table:
Student Course
John Math
Scienc
John
e
Mary Math
Course Teacher
Math Mr. A
Scienc Mr. B
e
Now, if you join Table 1 and Table 2 on the Course column, you get:
A table is in 5NF if it has been broken down (split into smaller tables) as much as possible
without losing any data, and you can join those smaller tables back to get the original
table exactly.
5NF is all about removing very tricky, hidden repetition and making sure the table is
perfectly organized.
Sometimes, even after splitting tables into smaller parts (like in 3NF or 4NF), there are still
complex hidden connections between the columns.
5NF ensures:
Original Table:
You notice:
But there's no direct link between Employee and Machine, only through the Project.
• Employee-Project Table
• Project-Machine Table
Employee-Project:
Employee Project
Alice P1
Bob P2
Project-Machine:
Project Machine
P1 M1
P1 M2
P2 M1
P2 M2
If you join these two tables on Project, you can recreate the original table!
No extra information.
No missing information.
4. Real-Life Analogy
• Buns (Employee)
• Patty (Project)
• Sauce (Machine)
Then, you can create every correct burger combination just by joining the two lists.
No wasted ingredients, no wrong burgers.
That’s 5NF!
Domain-Key Normal Form (DKNF) is the highest and most perfect level of
normalization in a database.
A table is in DKNF if every rule (constraint) about the data is based only on domains (the
allowed types of values) and keys (unique identifiers).
In DKNF:
• No other strange or hidden rules are needed to keep the data correct.
• Only the meaning (domain) and uniqueness (key) of data control the table.
DKNF says:
"If your table is correct just because of domain rules and key rules, and you don’t need any
extra weird conditions, then your table is perfect — it's in DKNF."
• Domain rules:
o Student_ID must be a number.
o Grade must be A, B, C, D, or F.
• Key rules:
o (Student_ID, Course) together must be unique.
If all the data rules are based only on these domain rules and key rules,
and there are no extra rules like "if a student is taking Math, grade must be A or B only",
then the table is in DKNF!
If these two rules are enough to make the form valid, and you don't need extra manual
checks, then your form is like DKNF — clean, perfect, self-managed!
Point Meaning
Domain Type or range of allowed values
Key Unique identifier for rows
Only domain and key rules control data
DKNF
correctness
Result No anomalies, pure clean database
Difficulty Hard to achieve in practice
Query Processing in DBMS
Query Processing is the activity performed in extracting data from the
database. In query processing, it takes various steps for fetching the data
from the database. The steps involved are:
2. Optimization
3. Evaluation
As query processing includes certain activities for data retrieval. Initially, the
given user queries get translated in high-level database languages such as
SQL. It gets translated into expressions that can be further used at the
physical level of the fle system. After this, the actual evaluation of the
queries and a variety of query -optimizing transformations and takes place.
Thus before processing a query, a computer system needs to translate the
query into a human-readable and understandable language. Consequently,
SQL or Structured Query Language is the best suitable choice for humans.
But, it is not perfectly suitable for the internal representation of the query to
the system. Relational algebra is well suited for the internal representation of
a query. The translation process in query processing is similar to the parser
of a query. When a user executes any query, for generating the internal form
of the query, the parser in the system checks the syntax of the query,
verifes the name of the relation in the database, the tuple, and fnally the
required attribute value. The parser creates a tree of the query, known as
'parse-tree.' Further, translate it into the form of relational algebra. With this,
it evenly replaces all the use of the views when used in the query.
Thus, we can understand the working of a query processing in the below-
described diagram:
Suppose a user executes a query. As we have learned that there are various
methods of extracting the data from the database. In SQL, a user wants to
fetch the records of the employees whose salary is greater than or equal to
10000. For doing this, the following query is undertaken:
Evaluation
o The annotations in the evaluation plan may refer to the algorithms to be used
for the particular index or the specifc operations.
Optimization
o The cost of the query evaluation can vary for diferent types of queries.
Although the system is responsible for constructing the evaluation plan, the
user does need not to write their query efciently.
o Usually, a database system generates an efcient query evaluation plan,
which minimizes its cost. This type of task performed by the database system
and is known as Query Optimization.
o For optimizing a query, the query optimizer should have an estimated cost
analysis of each operation. It is because the overall operation cost depends
on the memory allocations to several operations, execution costs, and so on.
Finally, after selecting an evaluation plan, the system evaluates the query
and produces the output of the query.
ADBMS
Unit 2
Transaction processing and Concurrency control
Transaction
X's Account
1. Open_Account(X)
2. Old_Balance = [Link]
3. New_Balance = Old_Balance - 800
4. [Link] = New_Balance
5. Close_Account(X)
Y's Account
1. Open_Account(Y)
2. Old_Balance = [Link]
3. New_Balance = Old_Balance + 800
4. [Link] = New_Balance
5. Close_Account(Y)
Operations of Transaction:
Read(X): Read operation is used to read the value of X from the database and
stores it in a buffer in main memory.
Write(X): Write operation is used to write the value back to the database from
the buffer.
Let's take an example to debit transaction from an account which consists of
following operations:
1. 1. R(X);
2. 2. X = X - 500;
3. 3. W(X);
o he first operation reads X's value from database and stores it in a buffer.
o The second operation will decrease the value of X by 500. So buffer will
contain 3500.
o The third operation will write the buffer's value to the database. So X's
final value will be 3500.
For example: If in the above transaction, the debit transaction fails after
executing operation 2 then X's value will remain 4000 in the database which is
not acceptable by the bank.
ACID Properties
The transaction has the four properties. These are used to maintain consistency
in a database, before and after the transaction.
Property of Transaction
1. Atomicity
2. Consistency
3. Isolation
4. Durability
Atomicity
o It states that all operations of the transaction take place at once if not, the
transaction is aborted.
o There is no midway, i.e., the transaction cannot occur partially. Each
transaction is treated as one unit and either run to completion or is not
executed at all.
Abort: If a transaction aborts then all the changes made are not visible.
Commit: If a transaction commits then all the changes made are visible.
T1 T2
Read(A) Read(B)
A:= A-100 Y:=
Write(A) Write(B)
Consistency
o The integrity constraints are maintained so that the database is consistent
before and after the transaction.
o The execution of a transaction will leave a database in either its prior
stable state or a new stable state.
o The consistent property of database states that every transaction sees a
consistent database instance.
o The transaction is used to transform the database from one consistent
state to another consistent state.
For example: The total amount must be maintained before or after the
transaction.
1. Total before T occurs = 600+300=900
2. Total after T occurs= 500+400=900
Therefore, the database is consistent. In the case when T1 is completed but T2
fails, then inconsistency will occur.
Isolation
o It shows that the data which is used at the time of execution of a
transaction cannot be used by the second transaction until the first one is
completed.
o In isolation, if the transaction T1 is being executed and using the data
item X, then that data item can't be accessed by any other transaction T2
until the transaction T1 ends.
o The concurrency control subsystem of the DBMS enforced the isolation
property.
Durability
o The durability property is used to indicate the performance of the
database's consistent state. It states that the transaction made the
permanent changes.
o They cannot be lost by the erroneous operation of a faulty transaction or
by the system failure. When a transaction is completed, then the database
reaches a state known as the consistent state. That consistent state cannot
be lost, even in the event of a system's failure.
o The recovery subsystem of the DBMS has the responsibility of Durability
property.
States of Transaction
The different stages a transaction goes through during its lifecycle are known as
the transaction states. The following is a diagrammatic representation of the
different stages of a transaction.
Active state
o The active state is the first state of every transaction. In this state, the
transaction is being executed.
o For example: Insertion or deletion or updating a record is done here. But
all the records are still not saved to the database.
Partially committed
Committed
o If any of the checks made by the database recovery system fails, then the
transaction is said to be in the failed state.
o In the example of total mark calculation, if the database is not able to fire
a query to fetch the marks, then the transaction will fail to execute.
Aborted
o If any of the checks fail and the transaction has reached a failed state then
the database recovery system will make sure that the database is in its
previous consistent state. If not then it will abort or roll back the
transaction to bring the database into a consistent state.
o If the transaction fails in the middle of the transaction then before
executing the transaction, all the executed transactions are rolled back to
its consistent state.
o After aborting the transaction, the database recovery module will select
one of the two operations:
1. Re-start the transaction
2. Kill the transaction
Example
Let us take a very simple example of Railway ticket booking. Can you think of
the things that need to be retrieved from the database when you initiate the
booking process?
You will need the train details, the already booked ticket details, the platform
details, and many more such things. Now, once these details are retrieved the
transaction of booking a ticket enters the active state.
After the user has completed the entire process of booking a ticket from their
end, the transaction enters the partially committed state. In case any error
occurred during the process, then the transaction will enter the failed state.
Now, say the process was successful and the transaction entered the partially
committed state, now if the saving in the database is completed successfully
then the transaction enters the committed state. In case there is any error while
saving in the database then it enters the failed state.
Anything from the failed state enters the aborted state so that rollbacks can
take place and the database consistency is maintained.
Now, let’s talk about the terminated state. If the booking is permanently saved
in the database, or it has been aborted due to some unforeseen reasons then the
transaction enters the terminated state.
Serializability
In the field of computer science, serializability is a term that is a property
of the system that describes how the different process operates the shared
data.
If the result given by the system is similar to the operation performed by
the system, then in this situation, we call that system serializable.
Here the cooperation of the system means there is no overlapping in the
execution of the data. In DBMS, when the data is being written or read
then, the DBMS can stop all the other processes from accessing the data.
A schedule is serialized if it is equivalent to a serial schedule. A
concurrent schedule must ensure it is the same as if executed serially
means one after another. It refers to the sequence of actions such as read,
write, abort, commit are performed in a serial manner.
Example:
Transaction-1 Transaction-2
R(a)
W(a)
R(b)
W(b)
R(b)
W(b)
R(a)
W(a)
2. Here, we can see that Transaction-2 starts its execution after the
completion of Transaction-1.
Example:
Transaction-1 Transaction-2
R(a)
W(a)
R(b)
W(b)
R(b)
R(a)
W(b)
W(a)
We can see that Transaction-2 starts its execution before the completion of
Transaction-1, and they are interchangeably working on the same data, i.e., "a"
and "b".
Types of serializability
1. Conflict serializability
Conflict serializability is a type of conflict operation in serializability that
operates the same data item that should be executed in a particular order and
maintains the consistency of the database. In DBMS, each transaction has some
unique value, and every transaction of the database is based on that unique
value of the database.
This unique value ensures that no two operations having the same conflict value
are executed concurrently. For example, let's consider two examples, i.e., the
order table and the customer table. One customer can have multiple orders, but
each order only belongs to one customer. There is some condition for the
conflict serializability of the database. These are as below.
If there are two transactions that are executed concurrently, one operation has to
add the transaction of the first customer, and another operation has added by the
second operation. This process ensures that there would be no inconsistency in
the database.
1. READ(a) - WRITE(a)
2. WRITE(a) - WRITE(a)
3. WRITE(a) - READ(a)
2. View serializability
If a non-serial schedule is view equivalent to some other serial schedule then
the schedule is called View Serializable Schedule. It is needed to ensure the
consistency of a schedule.
The two conditions needed by schedules(S1 and S2) to be view equivalent are:
1. Initial read must be on the same piece of data.
Example: If transaction t1 is reading "A" from database in schedule S1, then in
schedule S2, t1 must read A.
2. Final write must be on the same piece of data.
Example: If a transaction t1 updated A at last in S1, then in S2, t1 should
perform final write as well.
3. The mid sequence should also be in the same order.
Example: If t1 is reading A which is updated by t2 in S1, then in S2, t1 should
read A which should be updated by t2.
Example: We have a schedule "S" having two transactions t1, and t2 working
simultaneously.
S:
t1 t2
R(x)
W(x)
t1 t2
R(x)
W(x)
R(y)
W(y)
R(y)
W(y)
t1 t2
R(x)
W(x)
R(y)
W(y)
R(x)
W(x)
R(y)
W(y)
Prioritization
Prioritization is useful for browsing tasks, and tasks that use a lot of processor
time. Input/Output bound tasks can take the required amount of CPU, and move
on to the next read/write wait. CPU-intensive tasks take higher priority over the
less intensive tasks. Prioritization can be implemented in all CICS® systems. It
is more important in a high-activity system than in a low-activity system. With
careful priority selection, you can improve overall throughput and response
time. Prioritization can minimize resource usage of certain resource-bound
transactions. Prioritization increases the response time for lower-priority tasks,
and can distort the regulating effects of MXT and the MAXACTIVE attribute of
the transaction class definition.
Priorities do not affect the order of servicing terminal input messages and,
therefore, the time they wait to be attached to the transaction manager. Because
prioritization is determined in three sets of definitions (terminal, transaction,
and operator), it can be a time-consuming process for you to track many
transactions in a system. CICS prioritization is not interrupt-driven as is the case
with operating system prioritization, but determines the position on a ready
queue. This means that, after a task is given control of the processor, the task
does not relinquish that control until it issues a CICS command that calls the
CICS dispatcher. After the dispatch of a processor-bound task, CICS can be tied
up for long periods if CICS requests are infrequent. For that reason,
prioritization should be implemented only if MXT and the MAXACTIVE
attribute of the transaction class definition adjustments have proved to be
insufficient.
You should use prioritization sparingly, if at all, and only after you have already
adjusted task levels using MXT and the MAXACTIVE attribute of the
transaction class definition. It is probably best to set all tasks to the same
priority, and then prioritize some transactions either higher or lower on an
exception basis, and according to the specific constraints in a system. Do not
prioritize against slow tasks unless you can accept the longer task life and
greater dispatch overhead; these tasks are slow, in any case, and give up control
each time they have to wait for I/O. Use small priority values and differences
and concentrate on transaction priority. Give priority to control operator tasks
rather than the person, or at least to the control operator's signon ID rather than
to a specific physical terminal (the control operator may move around).
Consider for high priority a task that uses large resources. However, the effects
of this on the overall system need careful monitoring to ensure that loading a
large transaction of this type does not lock out other transactions. Also
consider for high priority those transactions that cause enqueues to system
resources, thus locking out other transactions. As a result, these can process
quickly and then release resources. Here are some examples:
Traditional databases, on the other hand, focus on providing centralized, controlled access to data.
• There are multiple computers, called sites or nodes. These sites must be connected by an
underlying communication network to transmit data and commands among sites, as shown
• ■ Logical interrelation of the connected databases- It is essential that the information in the
databases be logically related.
• ■ Absence of homogeneity constraint among connected nodes. -It is not necessary that all
nodes be identical in terms of data, hardware, and software.
Transparency
• The internal details of the distribution are hidden from the users(hiding implementation details
from end users.)
• Location transparency refers to the fact that the command used to perform a task is
independent of the location of the data and the location of the node where the command was
issued. (user need not be aware about physical location of database)
• Naming transparency implies that once a name is associated with an object, the named objects
can be accessed unambiguously without additional specification as to where the data is located.
(user need not be provide any additional information about name of database)
Replication transparency.
• copies of the same data objects may be stored at multiple sites for better availability,
performance, and reliability.
Fragmentation transparency.
• Horizontal fragmentation-- distributes a relation (table) into subrelations that are subsets of
the tuples (rows) in the original relation.
• referring to freedom from knowing how the distributed database is designed and where a
transaction executes.
Autonomy
• Autonomy determines the extent to which individual nodes or DBs in a connected DDB can
operate independently
• Design autonomy refers to independence of data model usage and transaction management
techniques among nodes.
• Communication autonomy determines the extent to which each node can decide on sharing of
information with other nodes.
• Reliability is broadly defined as the probability that a system is running (not down) at a certain
time point,
• Availability is the probability that the system is continuously available during a time interval
• Improved performance- Data localization reduces the contention for CPU and I/O services and
simultaneously reduces access delays involved in wide area networks.
Distributed Data Storage
Consider a relation r that is to be stored in the database.
There are two approaches to storing this relation in the distributed database:
Replication.
The system maintains several identical replicas (copies) of the relation, and stores each replica at a
different site.
Fragmentation.
The system partitions the relation into several fragments, and stores each fragment at a different site.
Data Replication
1 )Availability
2)Increased parallelism :
• The more replicas of r there are, the greater the chance that the needed data will be found in
the site where the transaction is executing.
• The system must ensure that all replicas of a relation r are consistent; otherwise, erroneous
computations may result.
• Thus, whenever r is updated, the update must be propagated to all sites containing replicas.
The result is increased overhead.
Data Fragmentation
r1, r2, . . . , r n.
• These fragments contain sufficient information to allow reconstruction of the original relation r.
Horizontal fragmentation,
Each tuple of relation r must belong to at least one of the fragments, so that the original relation can be
reconstructed, if needed.
keep tuples at the sites where they are used the most, to minimize data transfer.
ri = σPi (r)
• If the banking system has only two branches - Hillside and Valley view, then there are two
different fragments :
Vertical fragmentation
• Vertical fragmentation split the relation by decomposing the scheme R of relation 'r'.
• Vertical fragmentation of r(R) involves the definition of several subset of attributes R1, R2, ........
Rn, of the scheme R so that
UNIT - II
Parallel Databases
A parallel database system seeks to improve performance through parallellization of various operations,
such as loading data, building indexes and evaluating queries. Although data may be stored in a
distributed fashion such a system, the distribution is governed solely by performance considerations.
Scaleup :
Scaleup relates to the ability to process larger tasks in the same amount of time by providing more
resources.
Let Q be a task and QN be a task that is N times bigger than Q. Suppose execution time of task Q on
machine MS is TS and the execution time of task QN on parallel machine ML which is N times larger
than MS is TL.
Scaleup is defined as TS / TL.
Where,
TL : Execution time of a task on the larger machine
TS : The execution, time of the same task on the smaller machine
The parallel system ML is said to demonstrate linear scaleup on task Q if.
TL = TS.
If TL > TS the system is said to demonstrate sublinear scaleup.
o Shared disk : All the processors share a common set of disk (Fig. 1.3(b)). Shared-disk are
sometimes called clusters.
o Shared nothing : The processors share neither a common memory nor common disk (Fig.
1.3(c)).
Fig. 1.3(c) : Shared nothing
In shared memory architecture, the processors and disks have access to a common memory,
typically via a bus or through an interconnection network.
Advantages :
The benefit shared memory is extremely efficient communication between processors. Data in
shared memory can be accessed by any processor without being moved with software.
A processor can send messages to other processors much faster by using memory writes (which
usually rake less than a microsecond) than by sending a message through communication
mechanism.
Disadvantages :
The downside of shared-memory that the architecture is not scalable beyond 32 or 64 processors
because the bus or interconnection network becomes a bottleneck (since it is shared by all
processors).
Adding more number of processors should be avoided as they most of the time in waiting for their
turn on the bus to access memory.
Shared-memory architectures usually have large memory caches at each processor so that
referencing of the shared memory is avoided whenever possible.
However, at least some of the data will not be in the cache and accesses will have to go to the
shared memory. Moreover, the caches need to be kept coherent.
Maintaining cache-coherency becomes an increasing overhead with increasing overhead with
increasing number of processors.
Consequently, shared memory machines are not capable of scaling up beyond a point; current
shared-memory machines cannot support more than 64 processors.
In the shared-disk model, all processors can access all disks directly via an interconnection
network, but the processors have private memories.
Advantages :
Since each processor has its own memory, the memory bus is not a bottleneck.
It offers a cheap way to provide a degree of fault tolerance.
If a processor (or its memory) fails, the other processor can take over its tasks, since the database is
resident on disks that are accessible from all processors.
We can make the disk subsystem itself fault tolerant by using RAID architecture,
The shared-disk architecture has found acceptance in many applications.
Disadvantages :
Example :
DEC clusters running Rdb were One of the early commercial users of the shared disk database
architecture. (Rdb is now owned by Oracle, and is tailed Oracle Rdb. Digital Equipment Corporation
(DEC) is now owned by Compaq.)
In a shared-nothing system, each node of the machine consists of a processor, memory, and one or
more disks.
The processors at one node may communicate with one another processor at another node by a
high-speed interconnection network.
A node functions as the server for the data on the disk or disks that the node owns. Since local disk
references are serviced by local disks at each processor.
Advantages :
The shared-nothing model overcomes the disadvantage of requiring all I/O to go through a singly
interconnection network; only queries, accesses to non local disks, and result relations pass through
the network.
Moreover, the interconnection networks for shared nothing systems are usually designed to be
scalable, so that their transmission capacity increases as more nodes are added.
Consequently, shared-nothing architectures are more scalable, and can easily support a large
number of processors.
Disadvantage :
The main drawback of shared nothing systems is the costs of communication and of nonlocal disk
access, which are higher than in a shared memory or shared-disk architecture since sending data involves
software interaction at both ends.
Applications :
The Teradata database machine was among (the earliest commercial systems to use the shared-
nothing database architecture.
The Grace and the Gamma research prototypes also used shared-nothing architectures.
1.3.4 Hierarchical :
Fig. 1.3(d)
Definition : I/O parallelism refers to reducing the time required to retrieve relations from disk by
partitioning the relations on multiple disks. The most common form of data partitioning in a parallel
database environment is horizontal partitioning.
In horizontal partitioning, the tuples of a relation are divided (or declustered) among many disks,
so that each tuple resides on one disk. Several partitioning strategies have been proposed.
UNIT-IV
Features of XML
XML is heavily used as a format for document storage and processing, both online and offline.
Enhances search ability, making it possible for search engines to categorize data instead of
wasting processing power on context-based full-text searches.
XML does not allow References to external data entities. Named character references are not
allowed in XML.
XML is extensible, because it only specifies the structural rules of tags. No specification on tags
them self.
XML data is stored in plain text format. This provides a software- and hardware-independent
way of storing data.
This makes it much easier to create data that can be shared by different applications.
XML data is stored in text format. This makes it easier to expand or upgrade to new operating
systems, new applications, or new browsers, without losing data.
• A Document Type Definition (DTD) defines the legal building blocks of an XML document.
• It defines the document structure with a list of legal elements and attributes.
• If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the
following syntax:
• !DOCTYPE note defines that the root element of this document is note
• !ELEMENT note defines that the note element contains four elements: "to, from, heading,
body"
• If the DTD is declared in an external file, it should be wrapped in a DOCTYPE definition with the
following syntax:
• This is the same XML document as above, but with an external DTD
Why Use a DTD?
• With a DTD, each of your XML files can carry a description of its own format.
• With a DTD, independent groups of people can agree to use a standard DTD for interchanging
data.
• Your application can use a standard DTD to verify that the data you receive from the outside
world is valid.
XML Schema:
• XML Schema is an XML-based alternative to DTD.
• The XML Schema language is also referred to as XML Schema Definition (XSD).
• The purpose of an XML Schema is to define the legal building blocks of an XML document, just
like a DTD.
An XML Schema:
• We think that very soon XML Schemas will be used in most Web applications as a replacement
for DTDs.
One of the greatest strength of XML Schemas is the support for data types.
• Another great strength about XML Schemas is that they are written in XML.
Some benefits of that XML Schemas are written in XML:
• You can use your XML editor to edit your Schema files
• You can use your XML parser to parse your Schema files
• When sending data from a sender to a receiver, it is essential that both parts have the same
"expectations" about the content.
• With XML Schemas, the sender can describe the data in a way that the receiver will understand.
• A date like: "03-11-2004" will, in some countries, be interpreted as [Link] and in other
countries as [Link].
<date type="date">2004-03-11</date>
• ensures a mutual understanding of the content, because the XML data type "date" requires the
format "YYYY-MM-DD".
• Create your own data types derived from the standard types
• However, the "only text" restriction is quite misleading. The text can be of many different types.
• It can be one of the types included in the XML Schema definition (boolean, string, date, etc.), or
it can be a custom type that you can define yourself.
• You can also add restrictions (facets) to a data type in order to limit its content, or you can
require the data to match a specific pattern.
where xxx is the name of the element and yyy is the data type of the element.
• XML Schema has a lot of built-in data types. The most common types are:
DOM
• document object model (DOM), treats XML content as a tree, with each element
represented by a node, called a DOMNode.
• Programs may access parts of the document in a navigational fashion, beginning with the
root.
• DOM libraries are available for most common programming langauges and are even
present in Web browsers, where it may be used to manipulate the document displayed to
the user.
• The Java DOM API provides an interface called Node, and interfaces Element and
Attribute, which inherit from the Node interface.
• individual members of the list can be accessed by the method item(i), which returns the
ith element in the list.
• Attribute values of an element can be accessed by name, using the method getAttribute(
name). The text value of an element is modeled as a Text node.
• The method getData() on the Text node returns the text contents.
• DOM also provides a variety of functions for updating the document by adding and
deleting attribute and element children of a node, setting node values, and so on.
• DOM can be used to access XML data stored in databases, and an XML database can be
built using DOM as its primary interface for accessing and modifying data.
SAX (Simple API for XML) is an event sequential access parser API developed by the XML-DEV mailing
list for XML documents.
• SAX provides a mechanism for reading data from an XML document that is an alternative to that
provided by the Document Object Model (DOM).
Where the DOM operates on the document as a whole, SAX parsers operate on each piece of
the XML document sequentially.
• Client just overrides the methods of the API and place his own code inside there
• When the parser encounters start-tag, end-tag,etc., it thinks of them as events
• When such an event occurs, the handler automatically calls back to a particular method
overridden by the client, and feeds as arguments the method what it sees
Advantage:
(1) It is simple
Disadvantage:
• The data is broken into pieces and clients never have all the information as a whole unless they
create their own data structure
Just as the output of a relational query is a relation, the output of an XML query can be an XML
document. As a result, querying and transformation can be combined into a single tool.
• XPath is a language for path expressions, and is actually a building block for the remaining two
query languages.
• XSLT was designed to be a transformation language, as part of the XSL style sheet system,
which is used to control the formatting of XML data into HTML or other print or display
languages. Although designed for formatting, XSLT can generate XML as output, and can
express many interesting queries.
XPath
XPath addresses parts of an XML document by means of path expressions.
The language can be viewed as an extension of the simple path expressions in object-oriented
and object-relational databases
A path expression in XPath is a sequence of location steps separated by ―/‖
The result of a path expression is a set of values.
<name>Joe</name>
<name>Lisa</name>
<name>Mary</name>
The expression
/bank-2/customer/name/text()
would return the same names, but without the enclosing tags.
X-Query
• The best way to explain XQuery is to say that XQuery is to XML what SQL is to database
tables.
• XQuery is a language for finding and extracting elements and attributes from XML documents.
• XQuery is designed to query XML data – not just XML files, but anything that can appear as
XML.
Uses of XQuery
FLWOR
• For – binds a variable to each item returned by the in expression
General expression: FLWOR expression FOR < for-variable > IN < in-expression >
LET < let-variable > := < let-expression> [ WHERE < filter-expression> ]
[ ORDER BY < order-specification > ] RETURN <expression>
Example: retrieve the name of instructors who have a salary that is higher than 30000
• for $x in doc(”[Link]")/university/instructor
where $x/salary>30000
return <instr> {$x/name} </instr>
For/Let Clause :
• for <variable> in <expression>, . .
• LET clause values to a single or multiple variables as FOR clause does but without iteration
Where Clause
Where clause is optional. It is used to specify one or more conditions as per the requirement.
It is used to restrict the number of nodes returned by expression
Return Clause :
For each query return clause is evaluated.
The result produced are concatenated & return to users
XSL Transformations:
XSLT is a language for transforming XML documents into XHTML documents or to other XML documents.
With XSLT you can add/remove elements and attributes to or from the output file.
You can also rearrange and sort elements, perform tests and make decisions about which elements to
hide and display.
B2B Exchange :XML provides necessary standards which is required to exchange B2B data amongst
the organizations.
XML is less expensive & flexible
• Legacy System Integration :XML provides the facility to integrate legacy system data with modern
e-commerce System
• Database Support :A DBMS which supports XML exchange creates new system by enabling
integration with external system
On the other side database in its native format stores the XML data.
Database Meta-dictionaires: for databases XML can also facilitate creation of meta dictionaries &
vocabularies.
Meta dictionary created are not dependent on the DBMS type, it uses common language for data
description.
• XML Databases : To handle vast XML data exchange, to manage & utilize data efficiently many
XML software are available in the market.
Such products also includes OODBMS with XML interfaces to full XML database engines & servers.
XML provides advanced features than provided by traditional DBMS it used to handle complex
relationship.
XML can store the contents of book which include chapters, paragraphs, headers etc.
UNIT-V
Why Have a Separate DataWarehouse?
Because operational databases store huge amounts of data, you may wonder, ―why not perform
on-line analytical processing directly on such databases instead of spending additional time and
resources to construct a separate data warehouse?‖
1. What is Parallel Database?
A parallel database uses multiple processors (or computers) to work together and
process data faster.
It splits the workload among processors so tasks like query execution, data retrieval, and
storage happen simultaneously.
Shared Nothing Architecture (SN) means each processor (also called a node) has:
Its own CPU
Its own memory
Its own disk/storage
Nothing is shared between processors — hence, "shared nothing."
Each node is completely independent and communicates with other nodes only through
a network when necessary.
3. Visual Representation:
No memory or disk is shared between nodes — only network connects them if needed.
6. Disadvantages:
Disadvantage Explanation
7. Real-world Examples:
Google BigQuery
Amazon Redshift
Apache Cassandra (for some workloads)
Teradata
All of these rely on shared nothing design principles.
9. Summary Sentence:
In Shared Nothing Architecture, every processor-node is fully independent with its own CPU,
memory, and disk, enabling high scalability, fault tolerance, and parallel performance in a
distributed database system.
Active Databases
What is an Active Database?
An active database enhances the traditional passive database by allowing automatic reactions
to events. Instead of waiting for external programs or users to initiate actions, it responds based
on predefined rules.
Explanation:
• Event: DELETE on Orders.
• Action: Log the deletion to Audit_Log.
• Condition: Not explicitly stated but implicitly always true.
Limitations of Triggers
Issue Explanation
Complex Debugging Trigger chains can make flow hard to trace.
Performance Overhead Too many triggers slow down transactions.
Portability Issues Syntax varies across DBMSs (e.g., MySQL vs. Oracle).
Recursive/Infinite Loops If not properly handled, triggers can cause infinite rule firing.
Summary Table
Feature Description
Active Database DB that can respond to events via rules
ECA Rule Event → Condition → Action execution model
Trigger Database code that automatically executes on certain operations
Use Cases Automation, alerts, logging, policy enforcement
Challenge Debugging, recursion, DBMS-specific syntax differences
1. Introduction
A Temporal Database is a special type of database that manages time-varying data. Unlike
conventional databases that store only the current data, temporal databases store historical,
current, and sometimes future data with associated timestamps. This is useful for applications
where it is necessary to know the state of the data at any point in time.
4. Example
| Emp_ID | Dept | Valid_From | Valid_To | Tx_Start | Tx_End |
| 101 | HR | 2020-01-01 | 2022-12-31 | 2021-03-10| 9999-12-31|
This shows that the employee was in HR (valid time), and the info was stored in March 2021
(transaction time).
5. Advantages
• Allows historical tracking and time-based queries.
• Useful for auditing, legal compliance, and trend analysis.
• Ensures data integrity over time.
• Supports “time-travel” queries (e.g., "What was true on Jan 1, 2022?").
6. Applications
Area Use Case
HR Employee role changes
Finance Account balance history
Healthcare Medical record history
E-commerce Product price changes over time
8. Use Cases :
• HR systems (track employee roles over time)
• Banking (transaction history)
• Healthcare (track patient diagnosis history)
• Legal (audit trail for case files)
• E-commerce (track price changes of products)
9. Conclusion
Temporal databases are essential for systems where data evolves over time and where historical
accuracy is important. They offer enhanced functionality for real-world applications that
demand time-based reasoning.
A. SPATIAL DATABASES
3. Spatial Relationships
Spatial databases support spatial predicates such as:
• Contains – A region contains a point.
• Intersects – A road intersects a river.
• Within – A point lies within a city boundary.
• Nearest Neighbor – Find the nearest hospital.
6. Real-World Applications
Domain Example
Navigation Google Maps, GPS routing
Urban Planning Zoning, traffic simulation
Agriculture Precision farming (satellite-based field mapping)
Defense Target tracking, geospatial intelligence
B. MULTIMEDIA DATABASES
3. Storage Methods
Multimedia objects are stored using:
• BLOB (Binary Large Objects) for raw media.
• CLOB (Character Large Objects) for text/media descriptions.
a. Keyword-based Retrieval
• Search using tags/metadata:
"Find videos with keyword 'sunset'"
b. Content-Based Retrieval (CBIR/CBVR)
• Based on media features (color, shape, texture, etc.)
• Example:
Find images visually similar to this one.
c. Semantic Retrieval
• Higher-level understanding:
"Find videos of cats playing with yarn"
7. Real-World Applications
Field Example
Social Media Facebook, Instagram image tagging
Healthcare Medical image analysis (X-rays, MRIs)
Education E-learning platforms (videos, documents)
Law Enforcement Facial recognition from surveillance feeds
Conclusion
Spatial and Multimedia Databases represent the evolution of DBMS to handle complex, real-
world data. Spatial DBs empower GIS and location-aware services, while Multimedia DBs
enable storage and intelligent retrieval of rich media content. These systems are crucial for next-
generation applications across almost every domain, from defense to healthcare and
entertainment.
DEDUCTIVE DATABASE
A deductive database is a type of database that combines a traditional relational database with
logic-based reasoning, particularly using deductive logic (such as rules and facts) to derive new
information from stored data.
In a deductive database, you not only store data but also have the ability to define logical rules
and facts that can be applied to the data to infer new knowledge or derive conclusions. This
system uses a set of rules (often in logic programming languages like Prolog) to infer
relationships and new facts from existing ones, making it a powerful tool for complex querying
and problem-solving.
Here are key characteristics of a deductive database:
1. Rules and Facts: You define facts (basic information) and rules (logical inferences) that
the system uses to generate conclusions. For example, if "A is a parent of B" and "B is a
parent of C," the system can deduce that "A is a grandparent of C."
2. Recursive Queries: Deductive databases often support recursive queries, which means
they can ask about relationships that involve multiple levels, such as a family tree or a
network of connections.
3. Inferences: Deductive databases allow for the automatic inference of new facts based on
the existing data and the defined rules. This helps in scenarios like data analysis,
decision-making, and complex problem-solving.
4. Declarative Nature: The user defines what they want to know without necessarily
specifying how to compute it. The database engine then determines the most efficient
way to derive the answer.
5. Logic Programming: Deductive databases often employ a logic programming language
such as Datalog, which is a declarative language closely related to Prolog, to express
rules and queries.
Deductive databases are a sophisticated extension of traditional relational databases, leveraging
logic programming and reasoning to perform complex queries and inference over data. In more
advanced applications, they can provide significant benefits by allowing data-driven conclusions
to be drawn automatically, enhancing decision-making, analysis, and automation processes in
various fields.
Let’s dive into some advanced applications of deductive databases and how they are used in
real-world systems:
1. Knowledge Representation and Expert Systems
• Domain: Artificial Intelligence, Decision Support
• Application: Deductive databases are ideal for expert systems, which simulate human
expertise in specialized domains (e.g., medical diagnosis, financial analysis). In these
systems, the knowledge about the domain is stored as facts and rules.
• Example: In a medical expert system, rules like "If a patient has a fever and cough, then
they might have a respiratory infection" are stored. Given facts such as "This patient has
a fever and cough," the system can infer potential diagnoses.
• Benefits: The ability to deduce new facts based on existing knowledge, making the
system adaptable and capable of answering complex, context-sensitive queries.
2. Data Integration and Warehousing
• Domain: Big Data, Data Warehousing, Enterprise Integration
• Application: Deductive databases are crucial in integrating data from multiple
heterogeneous sources. By using logical rules, you can transform data from different
formats into a unified schema, infer missing data, or apply business rules for consistency
across systems.
• Example: In a business intelligence system, you can integrate data from different
departments (sales, inventory, HR) and use logical rules to deduce overall company
performance or future trends based on historical data.
• Benefits: The ability to automatically handle inconsistencies, apply complex business
rules, and unify disparate data sources.
3. Recursive Queries and Complex Graph Traversals
• Domain: Network Analysis, Social Networks, Supply Chains
• Application: Deductive databases excel in recursive queries and graph-based problems.
You can query hierarchical data (such as organizational charts, family trees, or network
topologies) and traverse graphs to find relationships or patterns.
• Example: In a social network analysis, rules can be defined to deduce indirect
relationships, such as friends of friends. "If A is friends with B, and B is friends with C,
then A and C are indirectly connected."
• Benefits: Recursive queries allow you to analyze multi-level relationships (e.g.,
grandparent relationships or supply chain dependencies) effortlessly.
4. Semantic Web and Linked Data
• Domain: Web Technologies, Data Interoperability
• Application: The Semantic Web aims to make internet data machine-readable by
structuring data using ontologies. Deductive databases are used to manage these
ontologies, infer new facts based on linked data, and support reasoning about
relationships between data points on the web.
• Example: In a linked-data context, if one dataset contains information about authors and
another about books, a deductive database could infer which author wrote which books,
even if the relationship is not explicitly stored.
• Benefits: Facilitates automated reasoning across interconnected data sources and
supports more advanced querying on the web.
5. Automated Planning and Decision Making
• Domain: Robotics, Process Management, Supply Chain Optimization
• Application: Deductive databases can be used in automated planning systems to
generate step-by-step procedures for achieving specific goals based on available
resources and constraints.
• Example: In a robotics system, rules about the robot's capabilities (e.g., "If the robot has
a gripper, it can pick up objects") and the environment (e.g., "If there is an object in front
of the robot, it can move towards it") allow the system to plan actions dynamically.
• Benefits: Supports intelligent decision-making based on real-time data and predefined
knowledge.
6. Legal and Compliance Systems
• Domain: Law, Compliance, Regulatory Systems
• Application: Deductive databases are used to model and analyze complex legal rules and
regulations, automatically verifying if an action or scenario complies with legal
requirements.
• Example: In a compliance system for financial services, you can define rules about
acceptable financial transactions (e.g., "A transaction over $10,000 must be reported")
and infer whether transactions are in compliance with regulations.
• Benefits: Helps organizations ensure they comply with legal requirements by
automatically detecting potential violations.
7. Complex Event Processing (CEP)
• Domain: Real-Time Systems, Monitoring, and Alerting
• Application: Deductive databases can be used to detect patterns in real-time data streams
by defining rules that trigger actions or alerts when certain conditions are met.
• Example: In a stock market monitoring system, you could define a rule: "If the price of
stock X exceeds a certain threshold and the volume traded increases by 20%, trigger an
alert for potential insider trading."
• Benefits: Provides real-time insights and alerts based on complex patterns and
relationships in dynamic data.
8. Database Query Optimization
• Domain: Database Management Systems (DBMS), Query Processing
• Application: Deductive databases can assist in optimizing queries by leveraging logical
inference to reduce the complexity of queries and avoid redundant computations. The
system can reason about how to combine facts and rules in an optimized way.
• Example: A complex query can be simplified using logical inference rules to avoid
unnecessary table joins or data retrieval steps.
• Benefits: Improves the performance of complex queries by reasoning about relationships
between data and applying optimization techniques.
Technologies Supporting Deductive Databases:
• Datalog: A declarative logic programming language, often used in deductive databases. It
is a subset of Prolog and is commonly used for querying and reasoning over logical data.
• Logic Programming: Systems like Prolog provide the logical framework for
implementing deductive databases, enabling automatic inference.
• Ontology Languages: Languages like RDF (Resource Description Framework) and
OWL (Web Ontology Language) are frequently used to define the rules and relationships
in a semantic web-based deductive database.
• Graph Databases: Technologies like Neo4j or GraphDB, though not strictly deductive
databases, support similar use cases involving complex relationships, recursion, and
pattern recognition.
Challenges and Considerations:
• Complexity: Deductive databases can be computationally expensive, especially with
large datasets and complex rules.
• Scalability: While they provide powerful inference capabilities, scaling deductive
databases for massive amounts of data and complex rules can be challenging.
• Performance: Recursive queries and logical inferences can slow down response times,
especially if the underlying database system is not optimized for such operations.
In conclusion, deductive databases offer a robust framework for solving complex problems
involving logic, reasoning, and inference across large datasets. They enable more advanced
applications such as expert systems, real-time event processing, legal compliance, and more,
offering greater flexibility and intelligence in decision-making and data analysis.
OBJECT-BASED DATABASES
Object-based databases (OBD) are a type of database that integrate object-oriented programming
concepts into the database management system (DBMS). Unlike traditional relational databases,
which use tables to store data in rows and columns, object-based databases store data as objects,
similar to how data is represented in object-oriented programming languages (like Java, C++, or
Python).
Here are some advanced concepts related to object-based databases in the context of database
management systems:
1. Object-Oriented Concepts
Object-based databases implement the following core object-oriented concepts:
• Encapsulation: Data and operations (methods) are bundled together in a single unit
(object).
• Inheritance: Objects can inherit properties and methods from other objects.
• Polymorphism: Objects can be treated as instances of their parent class, and their
methods can be called without knowing the specific type of object.
• Abstraction: Only essential data is exposed, hiding internal complexity.
2. Objects and Object Identity
• Object: In object-based databases, an object represents a real-world entity, such as a
customer or an order. Each object is self-contained with its own properties (attributes)
and methods (functions).
• Object Identity: Every object in an object database has a unique identifier (OID) that
distinguishes it from other objects.
3. Persistent Objects
In an object-based DBMS, objects that are created during the execution of a program can be
stored persistently in the database. These objects maintain their state even after the program that
created them has terminated.
4. Complex Data Types
Object-based databases allow complex data types, such as arrays, lists, sets, and even other
objects, to be stored directly in the database, unlike relational databases where data must be
broken down into primitive data types (e.g., integers, strings).
5. Object Query Languages (OQL)
Object-based databases use object query languages like OQL, which is a query language
designed to work with objects and their relationships. OQL allows querying and retrieving
objects using object-oriented concepts like classes and inheritance.
6. Mapping Object-Oriented Models to Relational Models
One of the challenges of object-based databases is mapping complex object-oriented models to
relational databases, which is often done using Object-Relational Mapping (ORM) tools. Some
object-based databases provide a seamless way to integrate with relational databases, allowing
developers to work in an object-oriented manner without having to worry about relational
schemas.
7. Advantages of Object-Based Databases
• Natural representation: Real-world entities can be represented directly as objects,
leading to a more intuitive mapping between the application and the database.
• Support for complex data types: Objects can contain multiple types of data, including
other objects, which makes them suitable for applications that require handling complex
data (e.g., CAD systems, multimedia databases).
• Inheritance and Reusability: The use of inheritance in object-based databases allows
for code and data reuse, making the system more flexible.
8. Disadvantages of Object-Based Databases
• Complexity: Object-based databases can be more complex to design and maintain,
especially if developers are not familiar with object-oriented programming principles.
• Performance: Object databases can sometimes have performance issues due to the need
to map objects to database records, particularly for large or complex datasets.
• Lack of standardization: Unlike relational databases, object-based databases do not
have a widely adopted standard, leading to potential portability issues.
9. Examples of Object-Based Databases
• ObjectDB: A Java-based object database that provides an object-oriented interface for
storing Java objects directly.
• db4o: A database for Java and .NET applications that allows developers to store complex
objects without needing to translate them into relational models.
• Versant Object Database: A database that supports object-oriented programming
concepts, optimized for handling large-scale, complex applications.
10. Object-Relational Hybrid Databases
Some modern databases combine object-oriented features with relational database features,
known as Object-Relational Databases (ORDs). These systems attempt to bridge the gap
between relational and object-oriented databases by supporting both types of data models.
Examples include PostgreSQL (with support for user-defined types and inheritance) and Oracle
Database (with object-relational features).
11. Usage in Real-World Applications
Object-based databases are particularly well-suited for applications that involve complex data
models, such as:
• CAD/CAM systems: These systems require the ability to model objects that have both
attributes and behaviors.
• Multimedia databases: Object databases can store multimedia content (images, video,
audio) more effectively than relational databases.
• Telecommunications: Telecommunications systems involve complex objects, such as
calls and customer profiles, which are naturally represented in object-oriented models.
Isolation ensures that transactions do not interfere with each other when executed concurrently, which is essential for maintaining data consistency. It prevents other transactions from accessing data that is being modified by a current transaction until the modification is complete, often enforced by the database's concurrency control subsystem. This helps in maintaining integrity and consistency within the database .
XML's flexibility and platform-independent nature make it ideal for data exchange between disparate systems and for integrating legacy data. However, while XML supports complex data models better than traditional relational databases, it can be verbose and may require significant storage space. Furthermore, processing XML to extract information can be computationally intensive, necessitating powerful systems for efficient management .
The ACID properties—Atomicity, Consistency, Isolation, and Durability—ensure that database transactions are processed reliably. Atomicity guarantees that a transaction is either fully completed or not at all. Consistency ensures that transactions transform the database from one valid state to another. Isolation ensures that concurrent execution of transactions results in a system state that would be obtained if transactions were executed serially. Durability ensures that once a transaction has been committed, it remains so, even in the case of a system failure .
The 'Failed' state in transaction processing indicates that an error occurred, preventing the transaction from completing successfully. In such a case, to maintain data consistency and integrity, the system often rolls back to revert the database to its prior state or perform necessary corrections. This prevents the database from ending up in an inconsistent state caused by partial completion of transactions .
A partially committed state occurs when a transaction has executed its final operation but has not yet saved the data permanently to the database. It represents a transitional phase before the database reflects the changes made by the transaction. In contrast, a committed state signifies that all operations have been successfully executed, and all changes are now permanently recorded in the database system .
The Rollback operation supports transaction management by undoing all operations of a transaction when it fails, which ensures that the database is returned to its previous consistent state. This is critical to maintaining database integrity, as it prevents the application of incomplete or erroneous changes caused by transaction failures .
Serializability ensures that the outcome of executing transactions concurrently is the same as if transactions were executed serially. It is a fundamental mechanism in database systems to maintain consistency in concurrent transactions by preventing anomalies and ensuring that concurrent execution does not lead to conflicts or inconsistent system states .
'Shared Nothing Architecture' in parallel databases implies that each node operates independently with its own resources, including CPU, memory, and storage. This architecture facilitates high scalability and fault isolation, as nodes do not share resources and communicate only via a network when needed. It supports efficient query execution by distributing processing loads and mitigates bottlenecks, improving the overall performance .
Atomicity ensures that all operations within a transaction are completed; if even a single operation fails, the entire transaction is aborted, thereby maintaining the database's consistency. Without Atomicity, partial transactions would lead to data inconsistencies, such as the case where funds are deducted from one account but not credited to another, which would leave the system in an incorrect state after a failure .
Deductive databases face challenges in scalability due to the computational intensity of logical inferences and complex rule processing across large datasets. This can lead to performance bottlenecks, especially with recursive queries and slow response times if not optimized. These limitations may hinder the adoption of deductive databases in real-world scenarios where data volumes are vast and real-time processing is required .