0% found this document useful (0 votes)

9 views32 pages

Chapter 6-Distributed Database System

Chapter 6 discusses distributed databases and client-server architectures, covering key concepts such as data fragmentation, replication, and allocation. It outlines the advantages and disadvantages of distributed databases, types of distributed database systems, and the complexities involved in query processing and concurrency control. The chapter emphasizes the importance of optimizing data transfer and ensuring data integrity across multiple sites in a distributed environment.

Uploaded by

abduseidmehammed

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

9 views32 pages

Chapter 6-Distributed Database System

Uploaded by

abduseidmehammed

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

Chapter 6

Distributed Databases
and
Client-Server Architectures

1
Outline

1. Distributed Database Concepts

2. Data Fragmentation, Replication and Allocation
3. Types of Distributed Database Systems
4. Query Processing
5. Concurrency Control and Recovery
6. 3-Tier Client-Server Architecture

2
1. Distributed Database Concepts
▪ A transaction can be executed by multiple networked
computers in a unified manner.
▪ A distributed database (DDB) processes Unit of execution (a
transaction) in a distributed manner.
▪ A distributed database (DDB) can be defined as :
– A collection of multiple logically related database
distributed over a computer network, and a distributed
database management system as a software system that
manages a distributed database while making the
distribution transparent to the user.
– The physical placement of data (files, relations, etc.) which
is not known to the user (distribution transparency).

3
• Example : EMPLOYEE, PROJECT, and WORKS_ON
tables may be fragmented horizontally and stored with
possible replication as shown below.

4
❑ Advantages of DDB :
i. Distribution and Network transparency:
▪ Users do not have to worry about operational details of
the network.
– There is Location transparency, which refers to
freedom of issuing command from any location
without affecting its working.
– Then there is Naming transparency, which allows
access to any names object (files, relations, etc.) from
any location.
ii. Replication transparency:
▪ It allows to store copies of a data at multiple sites as
shown in the above diagram.
▪ This is done to minimize access time to the required
data.
iii. Fragmentation transparency:
• Allows to fragment a relation horizontally (create a
subset of rows of a relation) or vertically (create a subset
5
of columns of a relation).
iv. Increased reliability and availability:
▪ Reliability refers to system live time, that is, system is
running efficiently most of the time. Availability is the
probability that the system is continuously available
(usable or accessible) during a time interval.
▪ A distributed database system has multiple nodes
(computers) and if one fails then others are available to
do the job.
v. Improved performance:
▪ A distributed DBMS fragments the database to keep data
closer to where it is needed most.
▪ This reduces data management (access and
modification) time significantly.
vi. Easier expansion (scalability):
▪ Allows new nodes (computers) to be added anytime
without changing the entire configuration. 6
❑ Disadvantages of Distributed Database

i. Complexity- The data replication , failure recovery , network

management …make the system more complex than the central
DBMSs
ii. Cost- Since DDBMS needs more people and more hardware ,
maintaining and running the system can be more expensive
than the centralized system .
[Link] of connecting Dissimilar Machine- Additional layers
of operation system software are needed to translate and
coordinate the flow of data between machines.
[Link] integrity and security problem - Because data maintained
by distributed systems can be accessed at locations in the
network, controlling the integrity of a database can be
difficult.
7
2. Data Replication and Fragmentation: Distributed data storage
▪ There are two approaches to store the relation in the distributed
database : Replication and Fragmentation
I. Data Replication
▪ The system maintain several identical copies of the relation & store
each copy at a different site
▪ In general it enhance the performance of read operation and
increase the availability of data to read only transaction.
Advantages :
▪ Increased availability (data can be accessed even if one site fails).
▪ Reduced latency (data is closer to the user).
Challenges :
▪ Ensuring consistency across replicas.
▪ Managing update propagation and synchronization.
▪

8
I. Data Fragmentation
– To divide data into smaller, manageable pieces for efficient querying
and storage.
–Types :
▪ Horizontal Fragmentation : Divides a table into subsets of rows.
Each fragment contains a subset of the rows but all columns.
Example: Dividing an employee table into two fragments
one for employees in Department A and another for Department
B.
▪ Vertical Fragmentation : Divides a table into subsets of columns.
Each fragment contains a subset of columns but all rows.
Example: Splitting an employee table into one fragment with
personal details (e.g., name, address) and another with
employment details (e.g., salary, department).
▪ Hybrid Fragmentation : Combines horizontal and vertical
fragmentation for more granular control.
9
– Consider the Employee relation. A vertical fragment of can be
created by keeping the values of Name, Bdate, Sex, and
Address.
– Because there is no condition for creating a vertical fragment,
each fragment must include the primary key attribute of the
parent relation Employee. In this way all vertical fragments of
a relation are connected.
▪ Representation
▪ There three rules that must be followed during fragmentation
▪ Completeness – if a relation r is decomposed into
fragments r1, r2… rn , each data item that can be found in r
must appear in at least one fragment
▪ Reconstruction – it must be possible to define a relation
operation that will reconstruct the relation r from fragments
▪ Disjointness –if a data item di appears in fragment ri ,
then it shouldn’t appear in any other fragment
10
–The main reasons for fragmenting a relation are

• Efficiency- data that is not needed by the local applications is

not stored

• Parallelism – a transaction can be divided

• into several subqueries that operate on fragments which will

increase the degree of concurrency

–but reconstruction of the whole relation will require accessing

data from all sites containing part of the relation

11
3. Data Allocation
– Purpose : To decide where to place fragments or replicas in
the distributed system.

Strategies :

– Centralized Allocation : All data is stored at a single site.

– Partitioned Allocation : Data is divided and stored

across multiple sites without replication.

– Replicated Allocation : Data is replicated across multiple

sites for redundancy.

12
3. Types of Distributed Database Systems

• Homogeneous Window
– All sites of the database Site 5 Unix
Oracle Site 1
system have identical Oracle
setup, i.e., same database Window
system software. Site 4 Communications
– The system may have network
little or no local
autonomy Oracle
– The underlying operating Site 3 Site 2
systems can be a mixture Linux Oracle Linux Oracle
of Linux, Window, Unix,
etc.
– Example: Multiple
branches of a bank using
the same DBMS software. 13
• Heterogeneous
– Definition : A distributed database system where at least one
database is from a different vendor (e.g., Oracle, MySQL,
PostgreSQL).
– Key Characteristics :
– Each site may use a different DBMS with varying schemas,
query languages, and data models.
– Translation mechanisms are required to enable
communication between heterogeneous systems. Object Unix Relational
Oriented Site 5 Unix
– Data integration is achieved Site 1
Hierarchical
through middleware or gateways Window
Communications
Site 4
that translate queries and data formats. network

– Example : Object
Network
DBMS
– Site A uses Oracle for its relational database. Site 3
Linux
Oriented Site 2
Linux
Relational

– Site B uses MongoDB for its NoSQL database.

– Middleware translates SQL queries into
14
MongoDB-compatible queries when accessing data from Site B.
–Federated: A distributed database system where each site may run a different DBMS, but data
access is managed through a single conceptual schema (global schema).
–Key Characteristics :
– Global Schema : A unified view of all databases in the system,
allowing users to query data as if it were stored in a single database.
– Local Autonomy : Sites retain some degree of autonomy, but they
must adhere to a centralized access policy defined by the global schema.
– Integration : The global schema integrates local schemas, providing a
consistent interface for users.
Implications : -The degree of local autonomy is minimum because each
site must conform to the centralized access policy.
– Query processing involves mapping global queries to local queries
using the global schema.
Example : Site A uses PostgreSQL and Site B uses MySQL .
– A federated system defines a global schema that combines customer
data from both sites.
– Users can execute a query like SELECT * FROM Customers without
worrying about which database stores the data. 15
▪ Federated Database Management Systems Issues
– Differences in data models:
• Relational, Objected oriented, hierarchical, network, etc.
– Differences in constraints:
• Each site may have their own data accessing and
processing constraints.
– Differences in query language:
• Some site may use SQL, some may use SQL-89, some
may use SQL-92, and so on.

16
–Multidatabase: A distributed database system where there is no
single conceptual global schema . Instead, a schema is constructed
dynamically as needed by the application software.
• Key Characteristics : Dynamic Schema Construction : Applications
construct a schema on-the-fly based on the specific data access
requirements.
• No Centralized Policy : There is no global schema or centralized
access policy. Each site maintains full autonomy over its data.
• Loose Coupling : Sites are loosely coupled, and data integration
is handled at the application level.
• Implications : The degree of local autonomy is maximum because
each site operates independently.
• Query processing is more complex, as applications must handle
schema mismatches and data translation manually.
• Example : Site A uses Oracle , Site B uses MongoDB , and Site C uses
Cassandra . 17
4. Query Processing in Distributed Databases
▪ Query processing in distributed databases involves breaking down a user query
into subqueries, executing them on relevant nodes, and combining the results. Key
steps include:
Query Decomposition :
▪ Parsing, validating, and transforming the query into a relational algebra
expression.
Example: Breaking a SQL query into smaller operations like selection, projection,
and join.
Data Localization :
▪ Identifying which fragments or nodes contain the required data.
▪ Example: Mapping a query to specific horizontal or vertical fragments.
Global Query Optimization :
▪ Deciding the most efficient way to execute the query, considering factors like
network latency, data transfer costs, and local processing costs.
e.g: Choosing between retrieving data from a remote site or performing a local join.
▪ Execution and Result Integration :
▪ Executing subqueries on relevant nodes and combining the results to produce the
18
final output.
4. Query Processing in Distributed Databases
▪ Issues
– Cost of transferring data (files and results) over the network.
• This cost is usually high, so some optimization is necessary.
• Example: suppose there are three sites. Where the relation Employee
at site 1, Department at Site 2 and no relation at site 3
– Employee at site 1. 10,000 rows. Row size = 100 bytes. Table
size = 106 bytes.
– Department at Site 2. 100 rows. Row size = 35 bytes. Table size
= 3,500 bytes.
– And a query is initiated from S3 to retrieve employees [First Name (15
byte long), Last name (15 byte long) and Department name (10 byte long)
total of 40 bytes]
• Q: For each employee, retrieve employee Fname, Lname, and
department name
• Q: Fname,Lname,Dname (Employee Dno = Dnumber Department)
Fname Minit Lname SSN Bdate Address Sex Salary Superssn Dno

Dname Dnumber Mgrssn Mgrstartdate 19

▪ Assumption
– The result of this query will have 10,000 rows, assuming
that every employee is related to a department.
– Suppose each result row 40 bytes long. The query is
submitted at site 3 and the result is sent to this site.
– Problem: Employee and Department relations are not
present at site 3.

• what is your best strategy that can optimize data

transportation cost?

20
• Strategies : Minimizing data transfer.
1. Transfer Employee and Department to site 3.
• Total transfer bytes = 1,000,000 + 3500 = 1,003,500 bytes.
2. Transfer Employee to site 2, execute join at site 2 and send
the result to site 3.
• Transferring employees data from site 1 to site 2: 1,000,000 bytes
• Query result size = 40 * 10,000 = 400,000 bytes.
• Total transfer size = 1,000,000 + 400,000 = 1,400,000 bytes.
3. Transfer Department relation to site 1, execute the join at site
1, and send the result to site 3.
• Data Transfer from site 2 to site 1: 3500 bytes
• Query result size = 40 * 10,000 = 400,000 bytes
• Total bytes transferred = 3500+ 400,000 = 403,500 bytes.
– Preferred approach: strategy 3.

21
Example 2 : Consider the query
– Q’: For each department, retrieve the department name ,Fname
and LName of the department manager
• Relational Algebra expression:
– Fname,Lname,Dname (Employee Mgrssn = SSN Department)
• The result of this query will have 100 tuples, assuming that every
department has a manager, the execution strategies are:
1. Transfer Employee and Department to the result site and
perform the join at site 3.
• Total bytes transferred = 1,000,000 + 3500 = 1,003,500
bytes.
2. Transfer Employee to site 2, execute join at site 2 and send the
result to site 3.
• data ransfer from site 1 to site 2: 1,000,000
• Query result size = 40 * 100 = 4000 bytes.
• Total transfer size = 4000 + 1,000,000 = 1,004,000 bytes.
3. Transfer Department relation to site 1, execute join at site 1 and
send the result to site 3.
• Total transfer size = 4000 + 3500 = 7500 bytes. 22
Preferred strategy: Choose strategy 3.
Example 3: Now suppose the result is needed at site2. Possible strategies
:
1. Transfer Employee relation to site 2, execute the query and
present the result to the user at site 2.
• Total transfer size = 1,000,000 bytes for both queries Q
and Q’.
2. Transfer Department relation to site 1, execute join at site 1
and send the result back to site 2.
• Total transfer size for
– Q = 400,000 + 3500 = 403,500 bytes
– Q’ = 4000 + 3500 = 7500 bytes.

❑ Preferred strategy: Choose strategy 2.

23
5. Concurrency Control and Recovery
▪ Distributed Databases encounter a number of concurrency
control and recovery problems which are not present in
centralized databases. Some of them are listed below.

– Dealing with multiple copies of data items:

➢The concurrency control must maintain global
consistency. Likewise the recovery mechanism must
recover all copies and maintain consistency after
recovery.
– Failure of individual sites:
• Database availability must not be affected due to the
failure of one or two sites and the recovery scheme must
recover them before they are available for use.

24
– Communication link failure:

• This failure may create network partition which would affect

database availability even though all database sites may be
running.

– Distributed commit:

• A transaction may be fragmented and they may be executed by a

number of sites. This require a two or three-phase commit
approach for transaction commit.

– Distributed deadlock:

• Since transactions are processed at multiple sites, two or more

sites may get involved in deadlock. This must be resolved in a
distributed manner. 25
5. 1 Distributed Concurrency control
i. Primary site technique: A single site is designated as a
primary site which serves as a coordinator for transaction
management.

Primary site
Site 5
Site 1

Site 4 Communications neteork

Site 3 Site 2

26
• Transaction management:
– Concurrency control and commit are managed by this site.
– In two phase locking, this site manages locking and
releasing data items. If all transactions follow two-phase
policy at all sites, then serializability is guaranteed.

– Advantages:
• An extension to the centralized two phase locking so
implementation and management is simple.
• Data items are locked only at one site but they can be
accessed at any site.
– Disadvantages:
• All transaction management activities go to primary site
which is likely to overload the site.
• If the primary site fails, the entire system is inaccessible.
– To aid recovery a backup site is designated which behaves as
a shadow of primary site. In case of primary site failure,
backup site can act as primary site. 27
i. Primary Copy Technique:

– In this approach, instead of a site, a data item partition is

designated as primary copy. To lock a data item just the
primary copy of the data item is locked.

• Advantages:

– Since primary copies are distributed at various sites, a

single site is not overloaded with locking and unlocking
requests.

• Disadvantages:

– Identification of a primary copy is complex. A distributed

directory must be maintained, possibly at all sites. 28
Recovery from a coordinator failure

• In both approaches a coordinator site or copy may become unavailable. This

will require the selection of a new coordinator.

– Primary site approach with no backup site:

• Aborts and restarts all active transactions at all sites. Elects a new
coordinator and initiates transaction processing.

– Primary site approach with backup site:

• Suspends all active transactions, designates the backup site as the

primary site and identifies a new back up site.

• Primary site receives all transaction management information to

resume processing.

– Primary and backup sites fail or no backup site:

• Use election process to select a new coordinator site. 29

i. Concurrency control based on voting:

– There is no primary copy of coordinator.

– Send lock request to sites that have data item.

– If majority of sites grant lock then the requesting transaction

gets the data item.

– Locking information (grant or denied) is sent to all these

sites.

– To avoid unacceptably long wait, a time-out period is

defined. If the requesting transaction does not get any vote
information then the transaction is aborted.
30
Client-Server Database Architecture

• It consists of clients running client software, a set of servers

which provide all database functionalities and a reliable
communication infrastructure.

Server 1 Client 1

Client 2

Server 2 Client 3

Server n Client n

31
• Clients reach server for desired service, but server does reach
clients.
• The server software is responsible for local data management
at a site, much like centralized DBMS software.
• The client software is responsible for most of the distribution
function.
• The communication software manages communication among
clients and servers.
• The processing of a SQL queries goes as follows:
– Client parses a user query and decomposes it into a number
of independent sub-queries. Each subquery is sent to
appropriate site for execution.
– Each server processes its query and sends the result to the
client.
– The client combines the results of subqueries and produces
the final result. 32

Distributed Databases Overview and Architecture
No ratings yet
Distributed Databases Overview and Architecture
29 pages
Understanding Distributed Database Systems
No ratings yet
Understanding Distributed Database Systems
42 pages
Data Allocation in Distributed Databases
No ratings yet
Data Allocation in Distributed Databases
45 pages
Overview of Distributed Database Systems
No ratings yet
Overview of Distributed Database Systems
26 pages
Dbms 4th 4
No ratings yet
Dbms 4th 4
73 pages
Understanding Distributed Database Systems
No ratings yet
Understanding Distributed Database Systems
27 pages
Overview of Distributed Database Concepts
No ratings yet
Overview of Distributed Database Concepts
52 pages
Distributed Databases and Architectures
No ratings yet
Distributed Databases and Architectures
27 pages
Distributed Database Management Overview
No ratings yet
Distributed Database Management Overview
36 pages
Overview of Distributed Database Systems
No ratings yet
Overview of Distributed Database Systems
66 pages
Distributed Database Systems Overview
No ratings yet
Distributed Database Systems Overview
35 pages
Understanding Distributed Databases
No ratings yet
Understanding Distributed Databases
15 pages
Distributed Database System
No ratings yet
Distributed Database System
13 pages
Centralized vs Distributed Databases
No ratings yet
Centralized vs Distributed Databases
29 pages
Lecture 5-1
No ratings yet
Lecture 5-1
31 pages
Overview of Distributed Database Systems
No ratings yet
Overview of Distributed Database Systems
16 pages
Understanding Distributed Databases
No ratings yet
Understanding Distributed Databases
36 pages
Distributed Database Architecture Overview
No ratings yet
Distributed Database Architecture Overview
27 pages
Unit 5 - 1
No ratings yet
Unit 5 - 1
7 pages
UC Irvine's Distributed Database Systems
No ratings yet
UC Irvine's Distributed Database Systems
67 pages
Understanding Distributed Databases
No ratings yet
Understanding Distributed Databases
14 pages
Understanding Distributed Databases
No ratings yet
Understanding Distributed Databases
30 pages
DBMS Unit - 4 DDB
No ratings yet
DBMS Unit - 4 DDB
12 pages
Understanding Distributed Databases
No ratings yet
Understanding Distributed Databases
95 pages
Distributed Databases & Client-Server Systems
No ratings yet
Distributed Databases & Client-Server Systems
25 pages
Overview of Distributed Database Systems
100% (1)
Overview of Distributed Database Systems
24 pages
Advanced Distributed Database Concepts
No ratings yet
Advanced Distributed Database Concepts
59 pages
Understanding Distributed Databases
No ratings yet
Understanding Distributed Databases
9 pages
Understanding DDB Replication and Fragmentation
No ratings yet
Understanding DDB Replication and Fragmentation
67 pages
Distributed Database Systems Overview
No ratings yet
Distributed Database Systems Overview
41 pages
Understanding Distributed Databases
No ratings yet
Understanding Distributed Databases
22 pages
Distributed Database System Overview
No ratings yet
Distributed Database System Overview
46 pages
Advantages of Distributed Databases
No ratings yet
Advantages of Distributed Databases
16 pages
Understanding Distributed Databases
No ratings yet
Understanding Distributed Databases
24 pages
Distributed Database (Mtcs041)
No ratings yet
Distributed Database (Mtcs041)
65 pages
Overview of Distributed Databases
No ratings yet
Overview of Distributed Databases
160 pages
Overview of Distributed Database Systems
No ratings yet
Overview of Distributed Database Systems
52 pages
Overview of Distributed Databases
No ratings yet
Overview of Distributed Databases
11 pages
Overview of Distributed Database Systems
No ratings yet
Overview of Distributed Database Systems
12 pages
Advanced Distributed Database Concepts
No ratings yet
Advanced Distributed Database Concepts
20 pages
Overview of Distributed Database Systems
No ratings yet
Overview of Distributed Database Systems
25 pages
Overview of Distributed Databases
No ratings yet
Overview of Distributed Databases
19 pages
Understanding Distributed Databases
No ratings yet
Understanding Distributed Databases
36 pages
Overview of Distributed Database Systems
No ratings yet
Overview of Distributed Database Systems
14 pages
Types and Allocation in DDBMS
No ratings yet
Types and Allocation in DDBMS
56 pages
Understanding Distributed Databases
No ratings yet
Understanding Distributed Databases
9 pages
Client-Server Architecture in DDBs
No ratings yet
Client-Server Architecture in DDBs
60 pages
Advanced Distributed Databases Overview
No ratings yet
Advanced Distributed Databases Overview
22 pages
Types and Design of Distributed Databases
No ratings yet
Types and Design of Distributed Databases
22 pages
Distributed Databases and Client-Server Architectures
No ratings yet
Distributed Databases and Client-Server Architectures
41 pages
Unit I
No ratings yet
Unit I
6 pages
Understanding Distributed Databases
No ratings yet
Understanding Distributed Databases
42 pages
Overview of Distributed Database Systems
No ratings yet
Overview of Distributed Database Systems
44 pages
Understanding MySQL Distributed Databases
No ratings yet
Understanding MySQL Distributed Databases
22 pages
Understanding Distributed Database Systems
No ratings yet
Understanding Distributed Database Systems
7 pages
El Clon Episode Download Links
No ratings yet
El Clon Episode Download Links
3 pages
RDBMS Using MySQL Syllabus
No ratings yet
RDBMS Using MySQL Syllabus
2 pages
AI's Impact on Library User Services
No ratings yet
AI's Impact on Library User Services
20 pages
NoSQL Databases: Types and Benefits
No ratings yet
NoSQL Databases: Types and Benefits
2 pages
Step-By-Step Tutorial On Google Data Studio
No ratings yet
Step-By-Step Tutorial On Google Data Studio
8 pages
Business Analyst Resume Overview
No ratings yet
Business Analyst Resume Overview
1 page
Wos 05 06 2020
No ratings yet
Wos 05 06 2020
19 pages
Edp CH-6
No ratings yet
Edp CH-6
29 pages
Sequence Alignment and BLAST Overview
No ratings yet
Sequence Alignment and BLAST Overview
21 pages
Mobile Device Security in Remote Work
No ratings yet
Mobile Device Security in Remote Work
12 pages
Data-Driven Decision-Making Explained
No ratings yet
Data-Driven Decision-Making Explained
3 pages
SLNX Administrators Guide PDF
No ratings yet
SLNX Administrators Guide PDF
876 pages
Plant Reproduction Presentation Slides
No ratings yet
Plant Reproduction Presentation Slides
25 pages
Components of HRIS in HR Management
No ratings yet
Components of HRIS in HR Management
6 pages
DW Migraiton Method
No ratings yet
DW Migraiton Method
8 pages
Merise Data Modeling for Schools
No ratings yet
Merise Data Modeling for Schools
8 pages
Understanding Database Keys in DBMS
No ratings yet
Understanding Database Keys in DBMS
19 pages
Bulba Code ICE Model Response Testing
No ratings yet
Bulba Code ICE Model Response Testing
8 pages
MasterCard's Advanced Fraud Solutions with Cloudera
No ratings yet
MasterCard's Advanced Fraud Solutions with Cloudera
2 pages
LinkedIn's Big Data Ecosystem Overview
No ratings yet
LinkedIn's Big Data Ecosystem Overview
22 pages
Project Staffing and Management Overview
No ratings yet
Project Staffing and Management Overview
11 pages
Database Connection Essentials in JDBC
No ratings yet
Database Connection Essentials in JDBC
10 pages
NetBackup83 9x Tuning Guide
No ratings yet
NetBackup83 9x Tuning Guide
165 pages
NLP Techniques in Fraud Detection
No ratings yet
NLP Techniques in Fraud Detection
26 pages
Nitin Gupta: Data Scientist Profile
No ratings yet
Nitin Gupta: Data Scientist Profile
2 pages
Netapp Fas Storage Arrays: The #1 Customer-Trusted, Cloud-Optimized Unified Storage Platform
No ratings yet
Netapp Fas Storage Arrays: The #1 Customer-Trusted, Cloud-Optimized Unified Storage Platform
5 pages
Systematic Searching
100% (4)
Systematic Searching
353 pages
Fixing Informatica -1101 Address Error
No ratings yet
Fixing Informatica -1101 Address Error
2 pages
Class 10 Data Science Notes
No ratings yet
Class 10 Data Science Notes
3 pages
Technical SEO Basics Guide
No ratings yet
Technical SEO Basics Guide
3 pages

Chapter 6-Distributed Database System

Uploaded by

Chapter 6-Distributed Database System

Uploaded by

Chapter 6

1. Distributed Database Concepts

i. Complexity- The data replication , failure recovery , network

• Efficiency- data that is not needed by the local applications is

• Parallelism – a transaction can be divided

• into several subqueries that operate on fragments which will

–but reconstruction of the whole relation will require accessing

– Centralized Allocation : All data is stored at a single site.

– Partitioned Allocation : Data is divided and stored

– Replicated Allocation : Data is replicated across multiple

– Site B uses MongoDB for its NoSQL database.

Dname Dnumber Mgrssn Mgrstartdate 19

• what is your best strategy that can optimize data

❑ Preferred strategy: Choose strategy 2.

– Dealing with multiple copies of data items:

• This failure may create network partition which would affect

• A transaction may be fragmented and they may be executed by a

• Since transactions are processed at multiple sites, two or more

Site 4 Communications neteork

– In this approach, instead of a site, a data item partition is

– Since primary copies are distributed at various sites, a

– Identification of a primary copy is complex. A distributed

• In both approaches a coordinator site or copy may become unavailable. This

– Primary site approach with no backup site:

– Primary site approach with backup site:

• Suspends all active transactions, designates the backup site as the

• Primary site receives all transaction management information to

– Primary and backup sites fail or no backup site:

• Use election process to select a new coordinator site. 29

– There is no primary copy of coordinator.

– Send lock request to sites that have data item.

– If majority of sites grant lock then the requesting transaction

– Locking information (grant or denied) is sent to all these

– To avoid unacceptably long wait, a time-out period is

• It consists of clients running client software, a set of servers

You might also like