[Link] is a distributed system?
Discuss the goals that should be met to build an efficient
distributed system.
A distributed system is "a networked computer system in which processes and resources are
sufficiently spread across multiple computers"
Goals of an efficient distributed system include:
1. Resource Sharing: Distributed systems should allow various devices to share resources
effectively. Examples include cloud-based storage and peer-to-peer networks.
2. Distribution Transparency: The system should hide the fact that resources and processes
are distributed, providing a seamless experience. This includes access, location, and
replication transparency.
3. Openness: A distributed system should be open, meaning it can interact with other systems
and use components from different environments.
4. Scalability: The system should scale effectively in terms of users, geographical distances, and
administrative domains. This includes being able to handle increasing loads without
significant performance loss.
5. Fault Tolerance: The system should continue to function even in the event of partial failures,
ensuring reliability and availability.
6. Security: The system must be secure, ensuring confidentiality, integrity, and proper
authorization mechanisms
2. Explain what is meant by (distribution) transparency, and give examples of different types of
transparencies.
Distribution transparency refers to the ability of a distributed system to conceal the fact that its
processes and resources are physically distributed across multiple computers, possibly separated by
large distances. The goal is to make the system appear as a single, unified entity to the users.
Different types of transparency include:
1. Access Transparency: Hides the differences in data representation and how an object is
accessed.
Ex: s. For example, a distributed system may have computer systems that run different
operating systems, each having their own file-naming conventions. Differences in naming
conventions, as well as how files can be manipulated, should all be hidden from users and
applications.
2. Location Transparency: Hides the location of an object in the system.
Ex: d. An example of a such a name is the URL [Link] which
gives no clue about the location of Prentice Hall's main Web server
3. Relocation Transparency: Hides the fact that an object may be moved to another location
while in use.
EX: An example of relocation transparency is when mobile users can continue to use their
wireless laptops while moving from place to place without ever being (temporarily)
disconnected
4. Migration Transparency: Hides that an object may move from one location to another.
5. Replication Transparency: Hides that an object is replicated across multiple servers.
Ex: For example, resources may be replicated to increase availability or to improve
performance by placing a copy close to the place where it is accessed. Replication
transparency deals with hiding the fact that several copies of a resource exist
[Link] Transparency: Hides the fact that multiple users may share and interact with
an object simultaneously.
Ex: For example, two independent users may each have stored their files on the same file
server or may be accessing the same tables in a shared database. In such cases, it is
important that each user does not notice that the other is making use of the same resource.
This phenomenon is called concurrency transparency
[Link] Transparency: Hides the failure and recovery of an object in the system
Ex: . For example, when contacting a busy Web server, a browser will eventually time out
and report that the Web page is unavailable. .At that point, the user cannot conclude that
the server is really down.
3. Categorize and explain the types of distributed systems
Distributed systems can be categorized into several types, including:
1. Distributed Computing Systems: These attempt to provide a shared-memory abstraction
across multiple computers. However, due to performance issues, this approach is less
popular(Chapter-1 Introduction).
i] Cluster Computing Systems: Cluster computing involves a group of high-performance
computers connected through a Local Area Network (LAN), often with similar hardware and
a single managing node(Chapter-1 Introduction).
One well-known example of a cluster computer is formed by Linux-based Beowulf clusters,
of which the general configuration is shown in Fig. 1-6. Each cluster consists of a collection of
compute nodes that are controlled and accessed by means of a single master node. The
master typically handles the allocation of nodes to a particular parallel program, maintains a
batch queue of submitted jobs, and provides an interface for the users of the system. As
such, the master actually runs the middleware needed for the execution of programs and
management of the cluster, while the compute nodes often need nothing else but a
standard operating system
ii] Grid Computing Systems: Grid systems connect heterogeneous computers that are
geographically dispersed across multiple organizations. These systems often involve virtual
organizations, enabling resource sharing.
The architecture consists of four layers. The lowest fabric layer provides interfaces to local
resources at a specific site.
The connectivity layer consists of communication protocols for supporting grid transactions
that span the usage of multiple resources.
The resource layer is responsible for managing a single resource. It uses the functions
provided by the connectivity layer and calls directly the interfaces made available by the
fabric layer.
The next layer in the hierarchy is the collective layer. It deals with handling access to
multiple resources and typically consists of services for resource discovery, allocation and
scheduling of tasks onto multiple resources, data replication, and so on
Finally, the application layer consists of the applications that operate within a virtual
organization and which make use of the grid computing environment
2. Distributed Information System:
In the following, we concentrate on these two forms of distributed systems.
i] Transaction Processing Systems:
In practice, operations on a database are usually carried out in the form of transactions.
Typical examples of transaction primitives are shown in Fig.
This all-or-nothing property of transactions is one of the four characteristic properties that
transactions have. More specifically, transactions are:
1. Atomic: To the outside world, the transaction happens indivisibly.
2. Consistent: The transaction does not violate system invariants.
3. Isolated: Concurrent transactions do not interfere with each other.
4. Durable: Once a transaction commits, the changes are permanent.
ii] Enterprise Application Integration:
Middleware offers communication facilities for integration
With remote procedure calls (RPC), an application component can effectively send a request
to another application component by doing a local procedure call, which results in the
request being packaged as a message and sent to the callee. Likewise, the result will be sent
back and returned to the application as the result of the procedure call.
[ Remote Procedure Call (RPC): Requests are sent through local procedure call, packaged as
message, processed, responded through message, and result returned as return from call. ]
RPC and RMI have the disadvantage that the caller and callee both need to be up and
running at the time of communication. In addition, they need to know exactly how to refer
to each other. This tight coupling is often experienced as a serious drawback, and has led to
what is known as message-oriented middleware, or simply MOM.
[Message Oriented Middleware (MOM): Messages are sent to logical contact point
(published), and forwarded to subscribed applications.]
3] Distributed Pervasive Systems: These systems blend into the user’s environment and include:
o Ubiquitous Computing Systems: Continuously present systems that interact with
the user seamlessly.
o Mobile Computing Systems: Systems where devices are mobile and need to manage
changing locations.
o Sensor Networks: Systems where numerous, simple sensor nodes collaboratively
sense and act upon the environment.
4. Explain the challenges associated with scalability in distributed systems.
• Number of users or processes (size scalability)
• Maximum distance between nodes (geographical scalability)
• Number of administrative domains (administrative scalability)
1. Size Scalability: Observation Most systems account only, to a certain extent, for size scalability.
Often a solution: multiple powerful servers operating independently in parallel.
Today, the challenge still lies in geographical and administrative scalability.
Scalability Introduction Design goals Size scalability Root causes for scalability problems with
centralized solutions
• The computational capacity, limited by the CPUs
• The storage capacity, including the transfer rate between CPUs and disks
• The network between the user and the centralized service2.
2. Geographical Scalability: Transitioning from Local Area Networks (LAN) to Wide Area Networks
(WAN) is challenging because:
- Latency issues in client-server interactions over WAN links can be prohibitive.
- WAN links tend to be unreliable for tasks like streaming
3. Administrative Scalability: Managing distributed systems across multiple administrative domains
introduces conflicts related to usage policies, management, and security. For example:
- Computational grids sharing expensive resources between different domains can face policy
conflicts.
- Shared equipment, like large-scale sensor networks, needs complex management and control.
[Link] how scalability can be achieved by applying different scaling techniques.
Scalability in distributed systems can be achieved through the following scaling techniques:
1. Hiding Communication Latencies: By using asynchronous communication, systems can
hide the delays inherent in message-passing across distributed systems. Applications can
proceed without waiting for a response, improving scalability.
2. Moving Computations to Clients: Instead of relying solely on the server, computations can
be distributed to the clients (e.g., using Java applets or scripts). This reduces the load on the
server and helps with scalability.
3. Partitioning Data and Computations: By distributing data and computations across
multiple machines, systems can avoid bottlenecks. Examples include decentralized naming
services (like DNS) and decentralized information systems (like the World Wide Web)【
4. Replication and Caching: Making copies of data available on different machines helps
reduce bottlenecks and improve performance. Techniques such as using replicated file
servers, mirrored websites, web caches, and file caching at both server and client levels
enhance scalability.
4. Handling Inconsistencies in Replication: Replication can introduce inconsistencies, but
systems can achieve scalability by tolerating some level of inconsistency, depending on the
application. Global synchronization, which ensures consistency, is typically avoided to enable
large-scale solutions.
6. With neat diagrams, explain various architectural styles of distributed systems.
Several styles have by now been identified, of which the most important ones for distributed
systems are:
1. Layered architectures
The basic idea for the layered style is simple: components are organized in a layered fashion where a
component at layer L; is allowed to call components at the underlying layer Li: but not the other way
around, as shown in Fig.(a). This model has been widely adopted by the networking community.
A key observation is that control generally flows from layer to layer: requests go down the hierarchy
whereas the results flow upward.
2. Object-based architectures
A far looser organization is followed in object-based architectures. In essence, each object
corresponds to what we have defined as a component, and these components are connected
through a (remote) procedure call mechanism. Not surprisingly, this software architecture matches
the client-server system architecture.
3. Data-centered architectures
Data-centered architectures evolve around the idea that processes communicate through a common
(passive or active) repository. It can be argued that for distributed systems these architectures are as
important as the layered and object based architectures. Web-based distributed systems are largely
data-centric: processes communicate through the use of shared Web-based data services.
4. Event-based architectures
In event-based architectures, processes essentially communicate through the propagation of events,
which optionally also carry data. For distributed systems, event propagation has generally been
associated with what are known as publish/subscribe systems. The basic idea is that processes
publish events after which the middleware ensures that only those processes that subscribed to
those events will receive them. The main advantage of event-based systems is that processes are
loosely coupled.
7. Explain multi-tiered architectures with neat block diagrams.
The simplest organization is to have only two types of machines:
1. A client machine containing only the programs implementing (part of) the user-interface level
2. A server machine containing the rest, that is the programs implementing the processing and data
level.
In this organization everything is handled by the server while the client is essentially no more than a
dumb terminal, possibly with a pretty graphical interface. One approach for organizing the clients
and servers is to distribute the programs in the application layers of the previous section across
different machines, as shown.
As a first step, we make a distinction between only two kinds of machines: client machines and
server machines, leading to what is also referred to as a two tiered architecture.
One possible organization is to have only the terminal-dependent part of the user interface on the
client machine, as shown in Fig. 2-5(a), and give the applications remote control over the
presentation of their data. An alternative is to place the entire user-interface software on the client
side, as shown in Fig. 2-5(b). Continuing along this line of reasoning, we may also move part of the
application to the front end, as shown in Fig. 2-5(c). In many client-server environments, the
organizations shown in Fig. 2-5(d) and Fig. 2-5(e) are particularly popular. These organizations are
used where the client machine is a PC or workstation, connected through a network to a distributed
file system or database. Essentially, most of the application is running on the client machine, but all
operations on files or database entries go to the server.
In particular, when distinguishing only client and server machines as we have done so far, we miss
the point that a server may sometimes need to act as a client, as shown in Fig. 2-6, leading to a
three-tiered architecture. In this architecture, programs that form part of the processing level reside
on a separate server, but may additionally be partly distributed across the client and server
machines. A typical example of where a three-tiered architecture is used is in transaction processing.
Another, but very different example where we often see a three-tiered architecture is in the
organization of Web sites.
8. With a neat diagram, explain the principal working of BitTorrent collaborative distributed
system.
BitTorrent is a peer-to-peer file downloading system. Its principal working is shown in Fig. The basic
idea is that when an end user is looking for a file, he downloads chunks of the file from other users
until the downloaded chunks can be assembled together yielding the complete file. An important
design goal was to ensure collaboration. In most file-sharing systems, a significant fraction of
participants merely download files but otherwise contribute close to nothing. To this end, a file can
be downloaded only when the downloading client is providing content to someone else. We will
return to this "tit-for-tat" behavior.
To download, a user needs to access a global directory, which is just one of a few well-known Web
sites. Such a directory contains references to what are called .torrent files. A .torrent file contains
the information that is needed to download a specific file. In particular, it refers to what is known as
a tracker, which is a server that is keeping an accurate account of active nodes that have (chunks) of
the requested file. An active node is one that is currently downloading another file. Obviously, there
will be many different trackers, although (there will generally be only a single tracker per file (or
collection of files). Once the nodes have been identified from where chunks can be downloaded, the
downloading node effectively becomes active. At that point, it will be forced to help others.
9. Explain the role of virtualization in distributed systems.
In practice, every (distributed) computer system offers a programming interface to higher level
software, as shown in Fig. 3-5(a).
Virtualization deals with extending or replacing an existing interface so as to mimic the behavior of
another system, as shown in Fig.3-5(b).
• One of the most important reasons for introducing virtualization in the 1970s, was to allow
legacy software to run on expensive mainframe hardware.
• Virtualization allows porting legacy interfaces to new platforms.
• Diversity of platforms and machines can be reduced by letting each application run on its
own virtual machine.
• It provides high degree of portability and flexibility.
10. With neat diagrams, explain the architectures of virtual machines.
The computer systems generally offer four different types of interfaces, at four different levels:
1. An interface between the hardware and software, consisting of ma chine instructions that can be
invoked by any program.
2. An interface between the hardware and software, consisting of ma chine instructions that can be
invoked only by privileged programs, such as an operating system.
3. An interface consisting of system calls as offered by an operating system.
4. An interface consisting of library calls, generally forming what is known as an application
programming interface (API). In many cases, the aforementioned system calls are hidden by an API.
These different types are shown in Fig. 3-6.
Virtualization can take place in two different ways:
1] Build a runtime system that provides abstract instruction set that is to be used for executing
applications
• This is called as process virtual machine
• Ex: emulators, Windows running on UNIX (Wine), JRE, etc
• Not-trivial
2] Provide a system that is implemented as a layer completely shielding the original hardware, but
offering complete instruction set of the same as an interface
• This layer is called as Virtual Machine Monitor (VMM)
• Ex: VMware, Xen, VirtualBox, etc
11. What are the reasons for migrating code? Explain the various code migration models.
• Code migration can help improve performance by exploiting parallelism. Ex: implement a
search query in Web using mobile agent
• Code migration provides flexibility. Ex: multi-tiered client-server architecture.
• Code migration leads to dynamically configured distributed systems. May require proprietary
protocol
• Advantage: Clients need not have all the software pre-installed to talk to servers
• Disadvantage: Security → downloading code without trusting the interface may compromise
the system.
A process consists of:
• Code segment: contains set of instructions that make up the program that is being executed
• Resource segment: contains references to external resources needed by the process such as
files, printers, devices, other processes, etc
• Execution segment: is used to store current execution state of a process, consisting of private
data, stack and program counter.
Weak Mobility
• Transfer only code segment along with some initialization data
• Transferred program is always started from one of predefined starting positions
• Simple and easy to implement
• Ex: Applets start from executing from beginning
Strong Mobility
• Execution segment can also be transferred
• Running processes can be stopped, moved to another machine, resume execution where it let
off
• General and harder to implement
Sender-initiated
• Migration is initiated at the machine where code currently resides or is being executed
• Used when uploading programs to compute server
• Ex: sending search program across Internet to web database server to perform queries at that
server
Receiver-initiated
• Migration is initiated by the target machine
• Ex: Java Applets
• Simple compared to sender-initiated
12. What is RPC? Explain the steps involved in performing RPC.
Information can be transported from the caller to the callee in the parameters and can come
back in the procedure result. No message passing at all is visible to the programmer. This
method is known as Remote Procedure Call, or often just RPC.
A remote procedure call occurs in the following steps:
1. The client procedure calls the client stub in the normal way.
2. The client stub builds a message and calls the local operating system.
3. The client’s OS sends the message to the remote OS.
4. The remote OS gives the message to the server stub.
5. The server stub unpacks the parameters and calls the server
6. The server does the work and returns the result to the stub.
7. The server stub packs it in a message and calls its local OS.
8. The server’s OS sends the message to the client’s OS.
9. The client’s OS gives the message to the client stub.
10. The stub unpacks the result and returns to the client.
13. Discuss the issues of parameter passing while performing RPC in distributed system.
When performing Remote Procedure Calls (RPC) in a distributed system, several issues arise with
parameter passing due to differences in machine architectures. These issues include:
1. Data Representation: Different machines may use different character encodings (e.g., EBCDIC
vs. ASCII) or integer representations (e.g., one's complement vs. two's complement). For
example, a character parameter passed from an IBM PC (ASCII) to an IBM mainframe (EBCDIC)
might be misinterpreted.
2. Endianness: Different systems may use different byte ordering schemes. For instance, Intel
processors use little-endian (low-order byte first), while SPARC processors use big-endian
(high-order byte first). This can lead to incorrect data interpretation unless the system
accounts for these differences.
3. Inconsistent Parameter Unpacking: Without knowing the exact type of each parameter (e.g.,
integer vs. string), incorrect byte inversion may occur, which can distort the data further.
Thus, handling these issues requires careful attention to the format and encoding of data during the
RPC process.
14. Explain the following terms:
a) Persistent communication:
Message submitted for transmission is stored by the communication middleware till it gets delivered
to the receiver.
• Middleware stores the message at one or more storage facilities
• Sending application need not continue execution after submitting the message
• Receiving application need not be executing when the message is submitted
b) Transient communication:
Message is stored by the communication system only till sending and receiving application are
executing.
• Message will be discarded if middleware cannot deliver a message due to transmission interrupt or
recipient is currently inactive
• All transport-level communication services offer only transient communication
• Store-and-forward routers
c) Asynchronous communication:
Sender continues immediately after it has submitted its message for transmission
• Message is (temporarily) stored immediately by the middleware upon submission
d) Synchronous communication:
Sender is blocked until its request is known to be accepted.
• Three points where synchronization takes place:
• Sender may be blocked until middleware notifies that it will take over transmission of the request
• Sender may synchronize until its request has been delivered to the intended recipient
• Synchronization may take place by letting sender wait until its request has been fully processed (i.e.
up to the time that recipient returns a response)
e) Deferred synchronous RPC:
The client first calls the server to hand over a list of host names that should be looked up, and
continues when the server has acknowledged the receipt of that list. The second call is done by the
server, who calls the client to hand over the addresses it found. Combining two asynchronous RPCs is
sometimes also referred to as a deferred synchronous RPC.
15. With neat block diagram, explain the general architecture of a message-queueing system.
Messages can be put only into queues that are local to sender
• Called as Source Queue
• Messages can only be read from local queues
• Message put into a queue will contain specification of destination queue to which it should be
transferred
• Collection of queues is distributed across multiple machines
• System should maintain a database of queue names to network locations
• Analogous to DNS
Queues are managed by queue managers
• Queue manager interacts directly with application that is sending or receiving message
• Queue managers also act as routers or relays
• Forward incoming messages to other queue managers
• Relays are convenient as there is no naming service
• Relays allow for secondary processing of messages
However,
• Topology of queuing network is static
• Each queue manager needs a copy of queue-to-locating mapping
• Leads to network management problems in large scale queuing systems
• Possible solution: use routers that know the network topology.
16. With neat block diagram, explain the general organization of message broker in a message-
queueing system.
The general organization of a message broker includes several key components and functions:
1. Message Transformation: The primary role of a message broker is to convert incoming
messages into a format that can be understood by the destination application. This
transformation can range from simple reformatting to more complex conversions between
different application domains.
2. Mediation and Routing: In advanced message-queuing systems, a broker may also act as a
mediator, not just converting message formats but also routing messages between
applications based on predefined rules. This can include matching applications based on the
content or topic of the message in a publish/subscribe model. In this model, applications
publish messages on specific topics, and the broker delivers those messages to applications
that have subscribed to those topics.
3. Rule Repository: The broker uses a repository of rules and programs that define how
messages are to be transformed. These rules are typically designed by domain experts and
stored in the broker, allowing it to perform message conversions or routing as required. While
brokers come with development tools to simplify this process, the real "intelligence" lies in
the expertise of the users who define these rules.
4. Application Integration: In the context of enterprise application integration (EAI), the
message broker supports not only simple message conversions but also sophisticated
mediation of communications between different applications. It ensures that messages
exchanged between systems are compatible and can be properly interpreted by the receiving
application.