Distributed Systems
1 Introduction
▶Introduction
▶Processes
▶Communication
▶Coordination
▶Distributed programming
▶Python basics
Introduction
Distributed systems
• Definitions
• Design goals
• Classification
• Pitfalls
Basic Concepts
Definition 1:
A distributed system is a collection of autonomous computing elements
that appears to its users as a single coherent system.
Realization of distributed systems
Basic Concepts
Definition 2:
A distributed system is a networked computer system in which processes and
resources are sufficiently spread across multiple computers ⇐ Expansive view
Examples
• Google (Search, Mail, etc.)
• Finance and commerce (Banks, Amazon, eBay, etc.)
• Content Delivery Network (CDN)
• Telecommunication
Decentralized systems: processes and resources are necessarily spread across multiple
computers. E.g., federated learning, distributed ledgers (blockchain), etc. ⇐ Integrative view
Design Goals
A distributed system should:
• make resources easily accessible
• hide the fact that resources are distributed across a network
• be open, and
• be scalable
Distribution transparency
Design Goals
The distribution of processes and resources is transparent, that is, invisible, to end users
and applications.
Distribution transparency
Transparency  Description
Access        Hide differences in data representation and how an object is accessed
Location      Hide where an object is located
Relocation    Hide that an object may be moved to another location while in use
Migration     Hide that an object may move to another location
Replication   Hide that an object is replicated
Concurrency   Hide that an object may be shared by several independent users
Failure       Hide the failure and recovery of an object
Degree of distribution transparency
Distribution transparency
Trade-off between a high degree of transparency and the performance of a system.
Example
Video streams (failure to access server)
• How to hide transmission delays for wide-area distributed systems?
• How to distinguish a slow system from a failing one?
Note:
Distribution transparency is a nice goal, but achieving it is a different story.
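Why a slow system is hard to tell apart from a failing one can be illustrated with a small simulation. This is only a sketch: the function `probe` and its parameters are hypothetical names introduced for illustration, and delays are simulated numbers rather than real network calls.

```python
from typing import Optional


def probe(response_delay: Optional[float], timeout: float) -> str:
    """Classify a server from the client's point of view.

    response_delay is the simulated time until the reply arrives,
    or None if the server has crashed and will never reply.
    """
    if response_delay is not None and response_delay <= timeout:
        return "alive"
    # No reply within the timeout: the client cannot tell why.
    return "suspected"


healthy = probe(response_delay=0.05, timeout=1.0)
slow = probe(response_delay=3.0, timeout=1.0)
crashed = probe(response_delay=None, timeout=1.0)
print(healthy, slow, crashed)
```

From the client's side, the slow server and the crashed server produce the same observation (no reply before the timeout), so fully hiding failures would require guessing.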
Openness
Design Goals
The system communicates with services of other systems irrespective of the underlying
environment.
Openness
• Interoperability – components from different manufacturers can co-exist and work
together by relying on a common standard.
• Portability – an application can be executed on different distributed systems
without modification.
• Extensibility – new components can be added, or existing ones replaced, without
affecting the other components.
Lack of openness ⇒ open-source
Separating policy from mechanism
Openness
Implementing openness: policies
• What level of consistency do we require for client-cached data?
• Which QoS requirements do we adjust in the face of varying bandwidth?
• What level of secrecy do we require for communication?
Implementing openness: mechanisms
• Allow (dynamic) setting of caching policies
• Provide adjustable QoS parameters per data stream
• Offer different encryption algorithms
Large distributed systems ⇒ reasonable defaults and self-configurable systems
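The separation of policy from mechanism can be sketched with a small cache whose eviction policy is set dynamically. All names here (`Cache`, `evict_oldest`, `evict_newest`, `set_policy`) are hypothetical and chosen only for illustration; a real system would offer richer policies.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable

# A policy decides WHICH entry to evict; the cache below only knows HOW to evict.
EvictionPolicy = Callable[[OrderedDict], Hashable]


def evict_oldest(entries: OrderedDict) -> Hashable:
    """Policy: drop the entry that was inserted first."""
    return next(iter(entries))


def evict_newest(entries: OrderedDict) -> Hashable:
    """Policy: drop the entry that was inserted last."""
    return next(reversed(entries))


class Cache:
    """Mechanism: bounded storage whose policy can be swapped at run time."""

    def __init__(self, capacity: int, policy: EvictionPolicy) -> None:
        self.capacity = capacity
        self._policy = policy
        self._entries: OrderedDict = OrderedDict()

    def set_policy(self, policy: EvictionPolicy) -> None:
        self._policy = policy

    def put(self, key: Hashable, value: Any) -> None:
        # Evict only when a new key would exceed the capacity.
        if key not in self._entries and len(self._entries) >= self.capacity:
            del self._entries[self._policy(self._entries)]
        self._entries[key] = value

    def keys(self) -> list:
        return list(self._entries)


cache = Cache(capacity=2, policy=evict_oldest)
for key in ("a", "b", "c"):
    cache.put(key, key.upper())
keys_fifo = cache.keys()          # "a" was evicted by evict_oldest

cache.set_policy(evict_newest)    # change the policy, keep the mechanism
cache.put("d", "D")
keys_after_switch = cache.keys()  # "c" was evicted by evict_newest
```

The cache code never changes when the policy does, which is the point of keeping the two apart.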
Dependability
Design Goals
The system operates as expected.
Partial failures and fault tolerance
Requirements
• Availability – the probability of operating correctly at any given moment.
• Reliability – the ability to run continuously without interruption.
• Safety – nothing catastrophic happens when the system temporarily fails.
• Maintainability – how easily a failed system can be repaired.
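Availability as a probability can be made concrete with the standard steady-state formula MTTF / (MTTF + MTTR), which is not stated on the slides and is added here as an assumption about how one might quantify it. The numbers below are hypothetical.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time to failure and mean time to repair."""
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_hours_per_year(avail: float) -> float:
    """Expected downtime in a 365-day year for a given availability."""
    return (1.0 - avail) * 365 * 24


# Hypothetical system: fails on average every 999 hours, takes 1 hour to repair.
a = availability(mttf_hours=999.0, mttr_hours=1.0)
yearly_downtime = downtime_hours_per_year(a)
print(f"availability={a:.3f}, downtime per year={yearly_downtime:.2f} h")
```

The sketch also shows how maintainability feeds into availability: shortening the repair time raises availability even if failures are just as frequent.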
Scalability
Design Goals
Scalability
Three components:
• Size scalability
– add users/resources without noticeable loss of performance.
• Geographical scalability
– users and resources may lie far apart, but communication
delay is hardly noticed.
• Administrative scalability
– can be easily managed even if it spans many
independent administrative organizations.
Scaling techniques
Scalability
Performance problems caused by limited capacity of servers and network?
• Scale up – increasing memory, upgrading CPUs, or replacing network modules.
• Scale out – expanding the distributed system (more machines)
Scaling out
• Hiding communication latencies – avoid waiting for responses to remote-service
requests as much as possible. However, this approach does not suit every
application.
• Partitioning and distribution – partition data and computations across multiple
machines. Example: World Wide Web documents
• Replication – improves availability, load balancing, and latency. Problem: keeping replicas consistent
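Partitioning and distribution can be sketched as hash-based placement of documents on machines. The node names, document names, and the helper `node_for` are hypothetical; a stable hash from `hashlib` is used because Python's built-in `hash()` for strings differs between runs.

```python
import hashlib
from typing import Dict, List, Sequence


def node_for(key: str, nodes: Sequence[str]) -> str:
    """Map a key to one node using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["node-0", "node-1", "node-2"]                    # hypothetical machines
documents = ["index.html", "about.html", "logo.png", "paper.pdf"]

# Group documents by the node responsible for them.
placement: Dict[str, List[str]] = {}
for doc in documents:
    placement.setdefault(node_for(doc, nodes), []).append(doc)

for node, docs in sorted(placement.items()):
    print(node, docs)
```

Every client that uses the same function and node list finds a document without asking a central directory, which is what lets the data set grow with the number of machines.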
Classification of distributed systems
• Distributed computing systems
— Cluster computing
— Grid computing
— Cloud computing
• Distributed information systems
• Distributed pervasive systems
Cluster computing
Distributed computing systems
Distributed computing systems are used for high-performance computing tasks.
Cluster computing
• Homogeneous: same OS, near-identical hardware
• Single managing node
Grid computing
Distributed computing systems
Grid computing
• Fabric layer – shared heterogeneous resources
• Connectivity layer – communication and security protocols to authenticate and
transfer data
• Resource layer – protocols for operating on a single shared resource. E.g., get
configuration, create process
• Collective layer – coordinates sharing of resources.
E.g., allocation and scheduling of tasks onto multiple
resources
• Application layer – applications running on the grid.
Cloud computing
Distributed computing systems
Cloud computing
Four layers
• Hardware – physical storage, processors, network
devices, etc.
• Infrastructure – provides virtually unlimited “raw”
computing, storage, and network resources
• Platform – set of tools or middleware that are used to
develop or deploy applications on the cloud
• Application – running applications.
Distributed information systems
Classification
• The traditional environments, where databases play an important role.
• The database and the processing logic of the system are separate components.
— Transaction processing systems
Transaction primitives
Primitive          Description
BEGIN_TRANSACTION  Mark the start of a transaction
END_TRANSACTION    Terminate the transaction and try to commit
ABORT_TRANSACTION  Kill the transaction and restore the old values
READ               Read data from a file, a table, or otherwise
WRITE              Write data to a file, a table, or otherwise
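The five primitives can be sketched with a toy in-memory store that buffers writes in a private workspace. This is an illustrative, single-client sketch under stated assumptions: the class `TransactionalStore`, its method names, and the seat-booking data are hypothetical, and there is no concurrency control or persistence.

```python
from typing import Any, Dict, Optional


class TransactionalStore:
    """Toy key-value store: writes go to a private workspace until commit."""

    def __init__(self, data: Optional[Dict[str, Any]] = None) -> None:
        self._data: Dict[str, Any] = dict(data or {})
        self._workspace: Optional[Dict[str, Any]] = None

    def begin_transaction(self) -> None:
        if self._workspace is not None:
            raise RuntimeError("transaction already in progress")
        self._workspace = dict(self._data)   # private copy of the old values

    def read(self, key: str) -> Any:
        source = self._workspace if self._workspace is not None else self._data
        return source[key]

    def write(self, key: str, value: Any) -> None:
        if self._workspace is None:
            raise RuntimeError("no transaction in progress")
        self._workspace[key] = value

    def end_transaction(self) -> None:
        if self._workspace is None:
            raise RuntimeError("no transaction in progress")
        self._data = self._workspace         # commit: publish all writes at once
        self._workspace = None

    def abort_transaction(self) -> None:
        self._workspace = None               # discard writes, old values remain

    def snapshot(self) -> Dict[str, Any]:
        return dict(self._data)


store = TransactionalStore({"free_seats": 2, "booked_seats": 0})

store.begin_transaction()
store.write("free_seats", store.read("free_seats") - 1)
store.write("booked_seats", store.read("booked_seats") + 1)
store.end_transaction()                      # both writes become visible

store.begin_transaction()
store.write("free_seats", store.read("free_seats") - 5)
store.abort_transaction()                    # nothing from this attempt survives

snapshot = store.snapshot()
```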
Transaction processing
Distributed information systems
Properties of transactions
Transactions adhere to the so-called ACID properties:
• Atomicity – Either all operations succeed, or all of them fail. When the transaction
fails, the state of the object remains unaffected by the transaction.
• Consistency – A transaction establishes a valid (predefined) state transition, i.e.,
corruption or errors in your data do not create unintended consequences for the
integrity of your table.
• Isolation – Concurrent transactions do not interfere with each other. Each
transaction executes as though transactions were run one at a time.
• Durability – After the execution of a transaction, its effects are made permanent,
i.e., changes to the state survive failures.
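Atomicity and consistency can be observed with Python's standard `sqlite3` module. This is a sketch under stated assumptions: the `account` table, the names, and the `transfer` helper are hypothetical, and an in-memory database is used, so durability is not demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(db: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money in one transaction; returns False if it was rolled back."""
    try:
        with db:  # commits if the block succeeds, rolls back if it raises
            db.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            db.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)        # valid state transition
rejected = transfer(conn, "alice", "bob", 500)  # would make alice negative
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, rejected, balances)
```

In the rejected transfer the credit to `bob` has already been executed when the debit violates the `CHECK` constraint; the rollback undoes it, so no partial effect remains.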
Distributed pervasive systems
Classification
An emerging next generation of distributed systems in which nodes are small, mobile, and
often embedded in a larger system, characterized by the fact that the system naturally
blends into the user’s environment.
Pervasive systems
Types (with overlapping characteristics):
• Ubiquitous computing systems
• Mobile computing systems
• Sensor networks
Ubiquitous computing systems
Pervasive systems
Pervasive and continuously present, i.e., the user interacts with the system
continuously.
Core requirements
• Distribution – Devices are networked, distributed, and accessible transparently
(hidden from view)
• Interaction – Interaction between users and devices is highly unobtrusive (implicit)
• Context awareness – The system is aware of a user’s context to optimize
interaction
• Autonomy – Devices operate autonomously without human intervention, and are
thus highly self-managed
• Intelligence – The system as a whole can handle a wide range of dynamic actions
and interactions (AI)
Mobile computing systems
Pervasive systems
Pervasive, but emphasis is on the fact that devices are inherently mobile.
Mobile computing systems
Characteristics of mobile computing systems:
• Wireless communication – smart phones, remote controls, pagers, active badges,
car equipment, various GPS-enabled devices, and so on.
• Mobility – devices change location over time.
• MEC – supported by Mobile Edge Computing
Sensor networks
Pervasive systems
Pervasive, with emphasis on the actual (collaborative) sensing and actuation of the
environment.
Sensor networks
Characteristics of sensor networks:
• Tens to thousands of small nodes, each equipped with one or more sensing devices.
• Wireless and often battery powered
• Limited resources, i.e., small memory, compute, and communication capacity, which
keeps power consumption low.
Sensor networks as distributed system
Pitfalls
Developing a distributed system is a formidable task.
False assumptions
• The network is reliable
• The network is secure
• The network is homogeneous
• The topology does not change
• Bandwidth is infinite
• There is one administrator
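Dropping the first assumption (the network is reliable) usually means planning for lost messages. The following is a minimal sketch of bounded retries with exponential backoff; `FlakyLink` simulates a lossy link, and all names and parameters are hypothetical. The `sleep` function is injectable so the example runs without actually waiting.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry an operation on ConnectionError, doubling the wait each time."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise                        # give up: report the failure
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


class FlakyLink:
    """Simulated link that loses the first few messages."""

    def __init__(self, failures_before_success: int) -> None:
        self.remaining_failures = failures_before_success
        self.calls = 0

    def send(self) -> str:
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("message lost")
        return "ack"


delays: List[float] = []                     # record waits instead of sleeping
link = FlakyLink(failures_before_success=2)
result = call_with_retries(link.send, attempts=4, base_delay=0.1, sleep=delays.append)
print(result, link.calls, delays)
```

Blind retries are only safe when the operation is idempotent, since the sender cannot know whether a lost message was the request or the reply.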
Summary
• A distributed system spreads processes and resources across multiple computers
because doing so is sufficient to meet its goals, not because it is necessary (as in
a decentralized system).
• Design goals for distributed systems include sharing resources, ensuring openness,
distribution transparency, and scalability.
• Different types of distributed systems exist which can be classified as being oriented
toward supporting computations, information processing, and pervasiveness.