DBMS Unit - 4 DDB

The document discusses distributed databases (DDB), detailing their concepts, architectures, and techniques such as fragmentation, replication, and allocation. It outlines the advantages and disadvantages of DDBs, types of distributed database systems (homogeneous and heterogeneous), and various architectures including centralized, pure distributed, federated, and peer-to-peer systems. Additionally, it compares distributed computing with parallel computing and highlights the differences between centralized and distributed databases.

Uploaded by

ranjithkumar312003

MCA – DBS: UNIT-4: DDB (CH-25.1, 25.2, 25.3, 25.

Distributed Databases: Distributed Database Concepts, Types of Distributed Database
Systems, Distributed Database Architectures, Data Fragmentation, Replication, and
Allocation Techniques for Distributed Database Design.

Distributed Database
Distributed databases bring the advantages of distributed computing to the database
management domain.

DDB technology resulted from a merger of two technologies: database technology, and
network and data communication technology.

A DDB is a collection of multiple, logically interrelated databases distributed over a computer
network.

DDBMS: A software system that manages a distributed database while making distribution
transparent to the user.

For a database to be called distributed, these minimum conditions should be satisfied:

Network Connection: All database sites (computers) must be connected via a communication
network to share data and commands, as shown later in Figure(c).

Logical Relationship:
The data stored in different sites must be logically related.

No Uniformity Required:
The sites can differ in their data, hardware, or software; they do not have to be the same.

Distributed Database Concepts:
i) Fragmentation:

The process of dividing the database into multiple smaller parts is called fragmentation.

These fragments are stored at different locations.

Fragmentation should be carried out in such a way that the original database can be
reconstructed from the fragments.

The system partitions a relation into several fragments and stores each fragment at a
different site.

Horizontal Data Fragmentation

• It breaks relation R by assigning each tuple of R to one or more fragments.

• Each fragment is a subset of the tuples in the original relation R.

• Reconstruction uses the union operation: if R is split into ⟨R₁, R₂⟩, then R₁ ∪ R₂ = R.
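As a small illustration of the union rule above, the sketch below splits a relation by a selection predicate and rebuilds it. The EMPLOYEE relation, its attribute names, and the helper functions are hypothetical and chosen only for this example; a real DDBMS would do this inside its query processor.

```python
# Hypothetical EMPLOYEE relation: each tuple is a dict keyed by attribute name.
employee = [
    {"emp_id": 1, "name": "Asha", "dept": "Sales", "salary": 52000},
    {"emp_id": 2, "name": "Ravi", "dept": "HR", "salary": 48000},
    {"emp_id": 3, "name": "Meena", "dept": "Sales", "salary": 61000},
]


def horizontal_fragment(relation, predicate):
    """Return the subset of tuples that satisfy the selection predicate."""
    return [row for row in relation if predicate(row)]


def union(*fragments):
    """Rebuild a relation as the union of its fragments.

    Tuples are identified by emp_id (an assumption of this sketch), so a tuple
    that appears in more than one fragment is kept only once.
    """
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged[row["emp_id"]] = row
    return sorted(merged.values(), key=lambda row: row["emp_id"])


# R1 holds the Sales tuples, R2 holds every other tuple.
r1 = horizontal_fragment(employee, lambda row: row["dept"] == "Sales")
r2 = horizontal_fragment(employee, lambda row: row["dept"] != "Sales")

reconstructed = union(r1, r2)  # R1 ∪ R2
```

Because every tuple satisfies exactly one of the two predicates, no tuple is lost and the union gives back the original relation.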

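Vertical Data Fragmentation

• It breaks relation R by decomposing its schema.

• Each fragment is a subset of the attributes of the original relation R.

• Each fragment must also carry the primary key (or a tuple identifier) so that the fragments
can be joined back without loss.

• Reconstruction uses the join operation: if R is split into ⟨R₁, R₂⟩, then R₁ ⨝ R₂ = R.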
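The join rule can be tried out in the same way. This is only a sketch: the EMPLOYEE relation, the choice of emp_id as the shared key, and the helper names are assumptions made for the example.

```python
# Hypothetical EMPLOYEE relation: each tuple is a dict keyed by attribute name.
employee = [
    {"emp_id": 1, "name": "Asha", "dept": "Sales", "salary": 52000},
    {"emp_id": 2, "name": "Ravi", "dept": "HR", "salary": 48000},
    {"emp_id": 3, "name": "Meena", "dept": "Sales", "salary": 61000},
]


def project(relation, attributes):
    """Keep only the listed attributes of every tuple (relational projection)."""
    return [{attr: row[attr] for attr in attributes} for row in relation]


def natural_join(left, right, key):
    """Join two fragments on a shared key attribute."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**left_row, **right_by_key[left_row[key]]}
        for left_row in left
        if left_row[key] in right_by_key
    ]


# Both fragments keep emp_id so that the join can match tuples again.
r1 = project(employee, ["emp_id", "name"])
r2 = project(employee, ["emp_id", "dept", "salary"])

reconstructed = natural_join(r1, r2, "emp_id")  # R1 ⨝ R2
```

If emp_id were left out of either fragment, there would be no way to tell which name belongs to which salary, which is why the key is repeated in every vertical fragment.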

Mixed Fragmentation:

It is a combination of both horizontal and vertical fragmentation.

The original relation is obtained by a combination of join and union operations.
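A possible way to picture mixed fragmentation: first split the relation horizontally, then split one of the horizontal fragments vertically. The relation, attribute names, and fragment names below are hypothetical.

```python
# Hypothetical EMPLOYEE relation: each tuple is a dict keyed by attribute name.
employee = [
    {"emp_id": 1, "name": "Asha", "dept": "Sales", "salary": 52000},
    {"emp_id": 2, "name": "Ravi", "dept": "HR", "salary": 48000},
    {"emp_id": 3, "name": "Meena", "dept": "Sales", "salary": 61000},
]

# Step 1: horizontal split by department.
sales = [row for row in employee if row["dept"] == "Sales"]
others = [row for row in employee if row["dept"] != "Sales"]

# Step 2: vertical split of the Sales fragment; both pieces keep the key emp_id.
sales_public = [
    {"emp_id": row["emp_id"], "name": row["name"], "dept": row["dept"]}
    for row in sales
]
sales_pay = [{"emp_id": row["emp_id"], "salary": row["salary"]} for row in sales]

# Reconstruction: join the vertical pieces first, then union with the rest.
pay_by_id = {row["emp_id"]: row for row in sales_pay}
sales_rebuilt = [{**row, **pay_by_id[row["emp_id"]]} for row in sales_public]
reconstructed = sorted(sales_rebuilt + others, key=lambda row: row["emp_id"])
```

The order of reconstruction mirrors the order of fragmentation in reverse: the last split applied (vertical) is undone first with a join, and the first split (horizontal) is undone last with a union.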

ii) Replication:

It means storing a copy (or replica) of a relation or relation fragments in two or more sites.

Full Replication:

A copy of the entire relation is stored at every site.

Partial Replication:

Only some fragments of a relation are replicated; the others are stored at a single site.

Why Replication is Desirable:

i) Increased availability of data

ii) Better performance
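The availability gain can be estimated with a simple calculation. The sketch below assumes that sites fail independently and that each site is reachable with the same probability; both are simplifying assumptions, and the function name is made up for the example. Under these assumptions, a data item stored at n sites can be read as long as at least one copy is reachable, which happens with probability 1 - (1 - p)^n.

```python
def read_availability(site_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` copies is reachable.

    Assumes every site is up with probability `site_availability` and that
    site failures are independent of each other.
    """
    if not 0.0 <= site_availability <= 1.0:
        raise ValueError("site_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - site_availability) ** replicas


# Each extra copy of the data shrinks the chance that no copy is reachable.
for n in (1, 2, 3):
    print(n, round(read_availability(0.9, n), 4))
```

The same copies that help reads make updates more expensive, because every replica has to be kept consistent when the data changes.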

iii) Transparency:

In a distributed system, the user should be able to access the database exactly as if all the
data were local.
Hiding details such as where data is stored and how it is accessed is called data transparency.

Types of Transparency:

i) Location transparency: refers to the fact that the command used to perform a task is
independent of the location of the data and the location of the node where the command
was issued.
ii) Fragmentation transparency: makes the user unaware of the existence of fragments.
iii) Replication transparency: copies of the same data objects may be stored at multiple sites
for better availability, performance, and reliability; replication transparency makes the user
unaware of the existence of these copies.
iv) Naming transparency: implies that once a name is associated with an object, the named
objects can be accessed unambiguously without additional specification as to where the
data is located.
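A rough sketch of how these kinds of transparency might look from the user's side is given below. The sites, fragment names, catalog layout, and the select function are all hypothetical; the point is only that the caller names the logical table and never mentions a site, a fragment, or a copy.

```python
# Hypothetical sites, each holding named fragments of the EMPLOYEE relation.
sales_rows = [{"emp_id": 1, "dept": "Sales"}, {"emp_id": 3, "dept": "Sales"}]
sites = {
    "site_a": {"emp_sales": list(sales_rows)},
    "site_b": {"emp_sales": list(sales_rows), "emp_other": [{"emp_id": 2, "dept": "HR"}]},
}

# Hypothetical catalog: logical table -> list of (fragment, sites holding a copy).
catalog = {
    "EMPLOYEE": [("emp_sales", ["site_a", "site_b"]), ("emp_other", ["site_b"])],
}


def select(table, predicate=lambda row: True, down=frozenset()):
    """Answer a query on a logical table, hiding fragments, copies, and sites.

    `down` lists sites treated as unreachable, to show that another replica
    is used without the caller changing the query.
    """
    rows = []
    for fragment, replicas in catalog[table]:
        site = next((s for s in replicas if s not in down), None)
        if site is None:
            raise RuntimeError(f"no reachable replica of {fragment}")
        rows.extend(row for row in sites[site][fragment] if predicate(row))
    return sorted(rows, key=lambda row: row["emp_id"])


all_rows = select("EMPLOYEE")
same_rows = select("EMPLOYEE", down={"site_a"})  # served from site_b instead
```

The caller writes the same query in both cases; which site answers and how many fragments are combined is decided by the lookup in the catalog.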

iv) Autonomy:

Autonomy determines the extent to which individual nodes or DBs in a connected DDB can
operate independently.

Autonomy refers to the degree of independence each site (or node) in a distributed
database has over its own operations, such as managing data, running queries, or
handling users.

v) Reliability and Availability:

Reliability is broadly defined as the probability that a system is running (not down) at a
certain time point.

Availability is the probability that the system is continuously available during a time interval.

Features of Distributed Database:

i) Data is stored at a number of sites.
ii) Sites are interconnected by a network.
iii) The DDB is logically a single database.
iv) The DDBMS has the full functionality of a DBMS.

Advantages of Distributed Database:

i) Sharing of Data
ii) Improved Availability and Reliability
iii) Autonomy
iv) Easier expansion
v) Reduced operating cost

Disadvantages of Distributed Database:

i) Complexity of management and control
ii) Deadlock handling
iii) Security
iv) Lack of standards

Types of Distributed Databases:

Homogeneous

i) All sites share a common global schema.
ii) All sites run identical DBMS software.
iii) Each site gives up part of its autonomy in terms of its right to change the schema or software.
iv) Same software: no problem in transaction processing.
v) Same schema: no problem in query processing.

Heterogeneous

i) Different sites can have different schemas.
ii) Different sites can run different DBMS software.
iii) Each site retains its own right to change the schema or software.
iv) Different software: major problem in transaction processing.
v) Different schemas: problem in query processing.

Classification of Distributed Database Systems:

A: Centralized Database System

• No distribution, no heterogeneity, high autonomy
  ▪ One site handles everything.
Example: A clinic management system in a small private hospital where patient records,
billing, and appointments are stored on a single local server using MySQL. All operations are
performed on one machine; there is no need for distribution.

B: Pure Distributed Database System

• Fully distributed, homogeneous, zero local autonomy
  ▪ Looks like a single centralized DB to the user.
  ▪ All data access is through a common interface.
  ▪ Single global schema.
  ▪ Sites do not act independently.
Example: Google Spanner, used by Google for managing distributed data across its global
data centers. It appears as a single unified database to users despite being distributed, and it
is fully homogeneous with a single global schema. Spanner is used for mission-critical
applications that require high availability, global scale, and strong consistency, such as
financial services, gaming, and e-commerce platforms.

C: Federated Database System (FDBS)

• Some distribution, some heterogeneity, moderate autonomy
  ▪ Sites have local users and local DBAs.
  ▪ There is a global schema shared across sites.
  ▪ Sites can run independently, but participate in a shared federation.
Example: Healthcare information systems connecting various hospitals that use different
local databases (Oracle, SQL Server, etc.) but share patient data under a unified health
program. Each hospital retains control over its local database but can participate in a shared
health data ecosystem.

D: Peer-to-Peer System

• High distribution, high heterogeneity, full local autonomy
  ▪ No global schema exists.
  ▪ Each site constructs the necessary schemas only when needed.
  ▪ Sites can run on different DBMS models (relational, object, hierarchical, etc.).
Example: University Collaboration System
• Different universities maintain their own local databases (student info, courses,
results), each built using a different DBMS.
• They occasionally share data for student exchange programs or research
collaboration, but there is no unified global schema.
• Each university decides what to share, when, and how, often using custom-built APIs
or schema mappings.

Concepts/Techniques in Distributed Database Design

1. Fragmentation

o The process of breaking up the database into logical units called fragments.
o Each fragment can represent a portion of a table (horizontal or vertical).
o Purpose: to improve locality of access and efficiency.
o Types of fragmentation:
  ▪ Horizontal fragmentation: rows are divided across sites.
  ▪ Vertical fragmentation: columns (attributes) are divided.
  ▪ Mixed/hybrid fragmentation: combination of both horizontal and vertical.

2. Replication

o The technique of storing copies of data (or fragments) at multiple sites.
o Increases data availability and fault tolerance.
o Comes at the cost of maintaining data consistency during updates.

3. Allocation

o The process of assigning fragments or replicas to various sites in the distributed system.
o Allocation strategies:
  ▪ Centralized: all data is stored at one site.
  ▪ Partitioned: fragments are stored at different sites.
  ▪ Replicated: multiple copies of fragments are stored at several sites.

4. Global Directory

o Stores metadata about the fragmentation, replication, and allocation of data.
o Acts as a catalog used by the Distributed Database System (DDBS) to locate
and access data.
o Must be efficiently maintained and accessible to all DDBS applications.
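The three allocation strategies and the role of the directory can be sketched as follows. This is a toy model: the function names, fragment names, and site names are invented for the example, "partitioned" is done round-robin, and "replicated" is modelled as full replication for simplicity.

```python
def allocate(fragments, sites, strategy):
    """Return a directory mapping each fragment to the sites that store it."""
    if strategy == "centralized":
        # Every fragment lives at the first site.
        return {fragment: [sites[0]] for fragment in fragments}
    if strategy == "partitioned":
        # Each fragment lives at exactly one site, assigned round-robin.
        return {
            fragment: [sites[index % len(sites)]]
            for index, fragment in enumerate(fragments)
        }
    if strategy == "replicated":
        # Every fragment is copied to every site (full replication).
        return {fragment: list(sites) for fragment in fragments}
    raise ValueError(f"unknown strategy: {strategy}")


def locate(directory, fragment):
    """Look up which sites hold a fragment, as the global directory would."""
    return directory[fragment]


fragments = ["emp_sales", "emp_hr", "emp_finance"]
sites = ["site_a", "site_b"]

directory = allocate(fragments, sites, "partitioned")
print(locate(directory, "emp_hr"))
```

In this model the directory is just the dictionary returned by allocate; a real system stores the same kind of metadata in its catalog and must keep it up to date as fragments are moved or copied.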

5. Purpose of These Techniques

o Improve performance, reliability, scalability, and availability of the distributed
database system.
o These decisions are made during the design phase of a DDBS.

Q. Describe any two distributed database architectures with diagrams.

1. A three-tier Client-Server Architecture [TB page - 921]

• Clients (users or applications) request services from servers.
• Database servers manage the data and respond to queries.

Key Components:

• Client Tier: User interfaces or front-end applications.
• Application Server: Handles business logic.
• Database Server: Manages storage, query processing, and transaction management.

Advantages:

• Clear separation of concerns.
• Centralized control over data.
• Easy to scale and maintain.
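The separation between the three tiers can be illustrated with a toy example. Everything here (class names, the account data, the withdrawal rule) is hypothetical; in a real deployment the tiers run on separate machines and talk over a network rather than through direct method calls.

```python
class DatabaseServer:
    """Data tier: stores data and answers simple read/write requests."""

    def __init__(self):
        self._accounts = {"A-1": 100}

    def get_balance(self, account_id):
        return self._accounts[account_id]

    def set_balance(self, account_id, amount):
        self._accounts[account_id] = amount


class ApplicationServer:
    """Middle tier: holds the business rules and talks to the database tier."""

    def __init__(self, database):
        self._database = database

    def withdraw(self, account_id, amount):
        balance = self._database.get_balance(account_id)
        if amount <= 0 or amount > balance:
            raise ValueError("invalid withdrawal amount")
        self._database.set_balance(account_id, balance - amount)
        return balance - amount


def client_withdraw(app_server, account_id, amount):
    """Client tier: formats the request and the reply; no business logic here."""
    try:
        return f"New balance: {app_server.withdraw(account_id, amount)}"
    except ValueError as error:
        return f"Rejected: {error}"


app = ApplicationServer(DatabaseServer())
print(client_withdraw(app, "A-1", 30))
print(client_withdraw(app, "A-1", 500))
```

The client never touches the stored data directly, and the database tier knows nothing about the withdrawal rule, which is the separation of concerns listed among the advantages.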

Diagram:

Figure: The three-tier client-server architecture.

2. Peer-to-Peer (P2P) or Fully Distributed Architecture

• All sites (or nodes) in the network function as peers.
• Each site has equal responsibility and autonomy.
• No central server or controller.

Key Features:

• High local autonomy.
• Sites may run different DBMSs (heterogeneous).
• Data can be fragmented and replicated across sites.
• No global schema needed; sites interact only when necessary.

Advantages:

• Highly scalable and fault-tolerant.
• Flexible and decentralized.
• Supports dynamic and evolving environments.

Diagram:

These two architectures, Client-Server and Peer-to-Peer (P2P), are the ones most commonly
described.

Extra:

Advances in Database Management Systems

Coursework syllabus:

Distributed Database Concepts: Distributed Database Concepts, Data Fragmentation,
Replication, and Allocation Techniques for Distributed Database Design

Overview of Concurrency Control and Recovery in Distributed Databases

Overview of Transaction Management in Distributed Databases

Query Processing and Optimization in Distributed Databases

Types of Distributed Database Systems, Distributed Database Architectures, Distributed
Catalogue Management.

Parallel computing utilizes multiple processors within a single machine, while distributed
computing uses multiple, independent computers connected over a network.

Parallel Computing:

• Focus: Executes multiple parts of a single task simultaneously on different processors
within the same machine (e.g., multi-core CPUs, GPUs).

• Communication: Processors share memory and communicate through shared
resources, typically with low latency.

• Goal: To speed up the execution of a single task by breaking it down into smaller,
parallelizable parts.

• Example: Using a multi-core processor to render a complex 3D scene in a video
game, where each core handles a portion of the image.

Distributed Computing:

• Focus: Uses multiple independent computers (nodes) connected over a network to
work together on a task.

• Communication: Nodes communicate by sending messages over the network, which
can have higher latency than shared-memory communication.

• Goal: To handle large workloads or solve complex problems that are too large or
resource-intensive for a single machine.

• Example: A search engine distributing a query across many servers to find results
from a massive database.

A centralized database stores all data in a single location, while a distributed database stores
data across multiple locations.

Centralized Database:

• Location: Data is stored on a single server or site.
• Management: Easier to manage and maintain due to the single location.
• Backup: Backups are simpler and more straightforward.
• Performance: Can experience performance bottlenecks if many users access it
simultaneously.
• Reliability: A single point of failure, meaning if the central server goes down, the
entire system is affected.
• Scalability: Scaling is often limited by the capabilities of the single server.

Distributed Database:

• Location: Data is spread across multiple servers or sites.
• Management: More complex to manage and synchronize data across different
locations.
• Backup: Backups require coordination across multiple sites.
• Performance: Can offer better performance due to data distribution.
• Reliability: More resilient to failures as data can be accessed from other locations.
• Scalability: Horizontal scalability, meaning it can handle larger workloads by
adding more nodes.

In essence, a centralized database is like having all your books in one library, while a
distributed database is like having multiple library branches with some books at each
location. This difference in data storage location leads to varying implications for
management, reliability, and performance.

