0% found this document useful (0 votes)
51 views9 pages

Topics in Distributed Computing

The document provides an overview of the topics and readings for a distributed computing course, including: 1) Presentation topics such as reliable group communication, distributed shared memory, consensus algorithms, and failure detectors. 2) Selected papers related to distributed spanning tree algorithms, fast mutual exclusion, detection of unstable predicates, vector clocks, and more. 3) A tentative course schedule covering communication mechanisms, remote method invocation, naming, clock synchronization, process synchronization, and distributed transactions. 4) Readings are suggested for each topic, such as papers on clock synchronization, mutual exclusion, distributed snapshots, and 2-phase commit protocols. Project timelines are also provided.

Uploaded by

tamilkudimagan
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
51 views9 pages

Topics in Distributed Computing

The document provides an overview of the topics and readings for a distributed computing course, including: 1) Presentation topics such as reliable group communication, distributed shared memory, consensus algorithms, and failure detectors. 2) Selected papers related to distributed spanning tree algorithms, fast mutual exclusion, detection of unstable predicates, vector clocks, and more. 3) A tentative course schedule covering communication mechanisms, remote method invocation, naming, clock synchronization, process synchronization, and distributed transactions. 4) Readings are suggested for each topic, such as papers on clock synchronization, mutual exclusion, distributed snapshots, and 2-phase commit protocols. Project timelines are also provided.

Uploaded by

tamilkudimagan
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd

[Link]

com/anna-university-cse-6th-sem-notes-pdf-reg17-download/29/11/

Presentation Topics for Distributed Computing

1. Reliable group communication, Managing group views


2. I/O automata; Timed and untimed automata
3. Distributed shared memory - models and implementations
4. Synchronous network algorithms
5. Asynchronous shared memory and network algorithms
6. Atomic registers, Composite registers, Atomic objects
7. Distributed predicates and predicate detection; Distributed debugging
8. Check-pointing and rollback recovery
9. Counting networks
10. Agreement and consensus; randomized consensus; Survivable consensus objects
11. Failure detectors
12. Wait-free synchronization, Wait-free hierarchy
13. Network synchronizers
14. Distributed spanning tree algorithms
15. Fast mutual exclusion algorithms for shared memory
16. Detection of unstable predicates
17. Dimension-bound analysis for efficient vector clocks
18. Inhibition spectrum
19. Wait-free synchronization
20. Failure detectors (2 presentations)
21. Multiprocessor synchronization
22. Implementations of distributed shared memory
23. Remote Method Invocation: Mechanisms
24. Naming
25. Process Synchronization
26. Distributed Data: Replication
27. Processes, code migration
28. Distributed Object systems: CORBA (OMG DCOM
29. Distributed Coordination
30. Fault Tolerance
Selected papers from the literature.

1. Distributed spanning tree algorithms


1. R.G. Gallager, P.A. Humblet, and P.M. Spira. "A Distributed Algorithm
for Minimum-Weight Spanning Trees," ACM Transactions on
Programming Languages and Systems, 5(1):66-77, January 1983.
2. B. Awerbuch. "Optimal Distributed Algorithms for Minimum Weight
Spanning Tree, Counting, Leader Election, and Related Problems," Proc.
19th Symp. on Theory of Computing, pp. 230-240, May 1987.
3. M. Faloutsos, and M. Molle. "Optimal Distributed Algorithms for
Minimum Spanning Trees Revisited," 14th ACM Symp. on Principles of
Distributed Computing (PODC'95). Ottawa, Ontario. August 1995.
2. Fast mutual exclusion algorithms for shared memory
1. R. Alur, G. Taubenfeld, "Fast timing based algorithms," Distributed
Computing, 10(1): 1-10, 1996. Follow this link
2. L. Lamport, "A fast mutual exclusion algorithm," ACM Transactions on
Computer Systems, 5(1): 1-11, 1987.
3. Detection of unstable predicates
1. V. Garg, B. Waldecker, "Detection of weak unstable predicates in
distributed programs", IEEE Transactions on Parallel and Distributed
Systems, 5(3):299-307, 1994. Follow this link
2. V. K. Garg, B. Waldecker, ``Detection of Strong Unstable Predicates in
Distributed Programs,'' IEEE Transactions on Parallel and Distributed
Systems, 7(12):1323-1333, 1996. Follow this link
3. C. Chase, V. Garg, "Detection of global predicates: Techniques and their
limitations," Distributed Computing, 11(4): 191-201, 1998. Follow this
link
4. S. Stoller, L. Unnikrishnan, Y. Liu, "Efficient detection of global
properties in distributed systems using partial-order methods," Tech
Report 523, Indiana University, Sept. 1999.
4. Dimension-bound analysis for efficient vector clocks
1. P. Ward, "An online algorithm for dimension-bound analysis,"
Proceedings of Euro-Par'99, 144-153, LNCS 1685, September 1999.
2. P. Ward, "An offline algorithm for dimension-bound analysis,"
Proceedings of ICPP'99, September 1999.
5. Inhibition spectrum
1. C. Critchlow, K. Taylor, The inhibition spectrum and the achievement of
causal consistency, Distributed Computing, 10(1): 11-27, 1996. Follow
this link
6. Wait-free synchronization
1. M. Herlihy, "Wait-free synchronization," ACM Transactions on
Programming Languages and Systems, 11(1): 124-149, Jan 1991.
2. P. Jayanti, "On the robustness of Herlihy's hierarchy," 12th ACM Symp.
on Principles of Distributed Computing, 145-157, 1993.
7. Failure detectors (2 presentations)
1. E. Fromentin, M. Raynal, F. Tronel, "About Classes of Problems in
Asynchronous Distributed Systems with Process Crashes," 19th IEEE
International Conference on Distributed Computing Systems, 470-477,
May 1999. Follow this link
2. A. Schiper, Early consensus in an asynchronous system with a weak
failure detector, Distributed Computing, 10(3): 149-157, 1997. Follow this
link
8. Multiprocessor synchronization (3 presentations)
1. J. Anderson and J.-H. Yang, "Time/Contention Tradeoffs for
Multiprocessor Synchronization", Information and Computation , Vol.
124, No. 1, pp. 68-84, January 1996. Follow this link
2. James H. Anderson and Mark Moir, "Using Local-Spin k-Exclusion
Algorithms to Improve Wait-Free Object Implementations," Distributed
Computing 11(1), 1997. Follow this link
3. Mark Moir, "Efficient Object Sharing in Shared-Memory
Multiprocessors," Ph.D. Thesis, University of North Carolina at Chapel
Hill, 1996. Follow this link
4. J. Anderson and M. Moir, "Universal Constructions for Large Objects," to
appear in IEEE Transactions on Parallel and Distributed Systems. Follow
this link
9. Composite registers
1. J. Anderson, "Composite Registers", Distributed Computing, 6(3):141-
154, April 1993. Follow this link
10. Implementations of distributed shared memory
1. Check out web pages such as the following. link, link, link, link

Project start/due dates are tentative!

1. Course overview, Components of a distributed system


2. Communication Mechanisms
o Message Passing
o Stream-oriented communications

o Remote Procedure Call

o Remote Method Invocation

3. Remote Method Invocation: Mechanisms


o DCE RPC (reading)

First project starts January 23

o Java RMI (reading)
o SOAP (Reading: SOAP 1.1 spec, XML Protocol Working Group, Apache SOAP)

4. Naming
o Overview

o X.500/LDAP

o Active Directory (reading)

First project design due January 30

5. Clock Synchronization
o What is clock synchronization?
Leslie Lamport, "Time, clocks, and the ordering of events in a distributed
system", Communications of the ACM 21(7) (July 1978).
o Possibility and impossibility
Lundelius, J. and Lynch, N., "An Upper and Lower Bound for Clock
Synchronization," Information and Control, Vol. 62, Nos. 2/3, pp. 190-204, 1984.
Danny Dolev, Joe Halpern, and H. Raymond Strong, "On the possibility and
impossibility of achieving clock synchronization", Journal of Computer and
System Sciences 32(3) 230-250. April 1986.
Michael J. Fischer, Nancy A. Lynch, and Michael Merritt, "Easy impossibility
proofs for distributed consensus problems" Proceedings of the fourth annual
symposium on Principles of distributed computing 1985 , Minaki, Ontario,
Canada.
o Practical solution: NTP (Reading)

Other Reading:
Leslie Lamport and P. M. Melliar-Smith, "Synchronizing clocks in the presence of faults"
Journal of the ACM 32(1) (January 1985).
Jennifer Lundelius and Nancy Lynch, "A new fault-tolerant algorithm for clock
synchronization, Proceedings of the third annual ACM symposium on Principles of
distributed computing 1984 , Vancouver, British Columbia, Canada.
First project due February 11.

6. Process Synchronization
o Overview: Global State, Mutual Exclusion
Leslie Lamport, ``The Mutual Exclusion Problem'', Journal of the ACM 33(2)
(April 1986). Read Part II section 2 - the rest is optional.
Leslie Lamport, ``1983 Invited address: Solved problems, unsolved problems and
non-problems in concurrency, Proceedings of the third annual ACM symposium
on Principles of distributed computing, 1984, Vancouver, British Columbia,
Canada.
Optional - Global State:
K. Mani Chandy and Leslie Lamport, ``Distributed Snapshots: Determining
Global States of Distributed Sytems'', ACM Transactions on Computer
Systems 3(1) (February 1985) 63-75.
o Fault Tolerant Solutions
Michael J. Fischer, Nancy A. Lynch, James E. Burns and Allan Borodin,
``Distributed FIFO allocation of identical resources using small shared
space'' ACM Transactions on Programming Languages and Systems 11(1) (1989)
pp. 90-114.
o Multiple resources Requirements
Please don't check these out - others may want to read them.
Dijkstra, E. ``Hierarchical Ordering of Sequential Processes'', ACTA
Informatica 1 (1971), 115-138.
M. Rabin and D. Lehmann, ``On the Advantages of Free Choice: A Symmetric
and Fully Distributed Solution to the Dining Philosophers Problem'', Proceedings
of the 8th Symposium on Principles of Programming Languagues (1981) pp. 133-
138.

Second project starts February 15.

7. Distributed Transactions
o 2-Phase Commit

o Formal Models for failure and recovery

o 3-Phase Commit

Reading:
Skeen, Dale, ``A Formal Model of Crash Recovery in a Distributed System,'' IEEE
Transactions on Software Engineering 9(3), May 1983, pp.219-228. (preliminary on-line
version from SIGMOD'81)
Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman, Concurrency Control and
Recovery in Database Systems, Chapter 7: Distributed Recovery, Addison Wesley, 1987.

8. Distributed Data: Replication


o Basics
Reading:
Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman, Concurrency Control
and Recovery in Database Systems, Chapter 8: Replicated Data, Addison Wesley,
1987.
o Example: Replication in Oracle

o Advanced Techniques: Quasi-Copies
Reading:
Rafael Alonso, Daniel Barbará, and Hector Garcia-Molina, ``Data caching issues
in an information retrieval system'', ACM Transactions on Database Systems
(TODS) 15(3), September 1990.

Second project due March 1.

9. Mid-Semester Review
March 8, in class: Midterm on material from weeks 1-7.
10. Processes, code migration
Third project starts March 20.
o Threads vs. Processes, Code migration basics

o Mobile Agents

o Mobile Agents example: D'Agents


Reading: D'Agents web site, position paper.
11. Distributed Object systems: CORBA (OMG)
Reading: CORBA Overview from The Common Object Request Broker: Architecture and
Specification, OMG group, 2001.
CORBA Security Service (reading).
Third project due April 3, fourth project starts.
12. Distributed Object Systems:
o DCOM
Reading: DCOM vs. .NET
o .NET

13. Distributed Coordination: Jini. Further reading: Jan Newmarch's Guide to JINI


Technologies.
14. Fault Tolerance
o Failure models. Reading:
Dr. Flaviu Cristian, Understanding Fault-Tolerant Distributed Systems,
Communications of the ACM 34(2) February 1991.
o Fault Tolerance Reading:
Felix C. Gärtner, Fundamentals of Fault-Tolerant Distributed Computing in
Asynchronous Environments ACM Computing Surveys 31(1), March 1999.
o Reliable communication

o Recovery Optional reading:
Richard Golding and Elizabeth Borowsky, Fault-Tolerant Replication
Management in Large-Scale Distributed Storage Systems, in Proceedings of the
18th IEEE Symposium on Reliable Distributed Systems 18-21 October, 1999,
Lausanne, Switzerland.
Hector Garcia-Molina, Christos A. Polyzois and Robert B. Hagmann, Two Epoch
Algorithms for Disaster Recovery, in Proceedings of the 1990 conference on Very
Large Data Bases, Brisbane, Australia, August 13-16 1990.

Fourth project due April 19.

1. Wireless Ad-hoc Distributed File System (Download PDF)


2. Security Framework for Distributed Database System (Download PDF)
3. Distributed Files Sharing Management: A File Sharing Application Using
Distributed Computing Concepts (PDF)
Solar Power Plant with Distributed System of PV Panels (Download PDF)
The Distributed Computing Model Based on The Capabilities of The Internet
(PDF)
Analyzing the Impact of Operating System Activity of different Linux Distributions
in a Distributed Environment (Download PDF)
Fuzzy Based Grid Voltage Synchronization For Distributed Generated System
(Download PDF)
Crowdlicit: A System for Conducting Distributed End-User Elicitation and
Identification Studies (Download PDF)

4. Optimal allocation of distributed generation for a hybrid system using Particle


Swarm Optimization (Download PDF)
A High-Performance Distributed Relational Database System for Scalable OLAP
Processing (Download PDF)
RFDIDS: Radio Frequency-based Distributed Intrusion Detection System for the
Power Grid (Download PDF)
Distributed Intrusion Detection System for Cognitive Radio Networks Based on
Weighted Fair Queuing Algorithm (Download PDF)
The Evolution of the Hadoop Distributed File (Download PDF)
Data Storage Security in Cloud Computing For Ensuring Effective And Flexible
Distributed System (Download PDF)
Introduction To Query Processing And Optimization In Distributed Database
System (Download PDF)
A New Collaborative Trust Enhanced Security Model for Distributed System
(Download PDF)
Inferring and Asserting Distributed System Invariants (Download PDF)

Common questions

Powered by AI

Predicate detection in distributed systems is challenging due to the lack of global states, meaning the system can only infer a state from partial observations distributed across nodes. This makes it difficult to determine whether a given global state satisfies certain conditions or predicates because message delays and asynchronous operations can obscure the true state of the system. Techniques like vector clocks and snapshot algorithms address these challenges by reconstructing consistent views of the global state, allowing for the detection of predicates despite the concurrent, decentralized nature of these systems .

Remote Method Invocation (RMI) significantly impacts distributed object systems by facilitating seamless method calls between objects located on different network nodes, thereby enhancing system scalability and resource utilization. RMI abstracts the underlying network communication complexities, enabling developers to build applications that scale out across multiple servers effortlessly. However, RMI can also lead to increased network and CPU overhead due to serialization and deserialization processes, potentially affecting resource utilization and requiring optimization techniques such as caching, load balancing, and connection pooling to mitigate these impacts .

Managing replication in distributed systems involves techniques like quorum-based approaches, which dictate that a majority of nodes approve before a write operation can be considered successful, ensuring consistency despite node failures. Consistency protocols like Paxos and Raft are often implemented to maintain a coherent state across replicas by ensuring agreed-upon sequences of operations. Techniques like eventual consistency models may also be employed, allowing temporary inconsistencies that eventually converge as updates propagate across replicas, thus balancing consistency and availability needs .

Reliable group communication is integral to ensuring consistent message delivery among a set of processes in distributed computing. It involves reliable broadcast protocols that guarantee that messages are delivered without loss and in the correct order, handling issues like network failures and partitioning. Managing group views, on the other hand, pertains to maintaining a consistent view of the membership of a process group, especially when processes join or leave the group. This management is crucial for ensuring that all nodes in the system have a consistent understanding of the group structure, which is necessary for coordinating actions across the distributed system .

Vector clock-based dimension-bound analysis enhances the efficiency of global state monitoring by confining the dimensional complexity inherent in tracking causality among distributed processes. By reducing the vector clock size needed to capture necessary causal information, systems can utilize limited resources more effectively while maintaining accurate state tracking. This technique thus reduces overheads associated with large-scale monitoring operations and supports timely detection of global properties, which is essential for maintaining consistency and coordination across distributed systems .

Wait-free synchronization offers significant improvements over other methods by ensuring that every process in the system can complete an operation within a bounded number of steps, regardless of the execution speeds of other processes. This eliminates issues like deadlock and priority inversion that are prevalent in other synchronization methods. Wait-free algorithms guarantee that a process does not have to wait indefinitely for others to complete, which is particularly beneficial in systems where process execution times can be unpredictable, allowing for higher concurrency and system throughput .

Atomic registers are designed to provide a single-writer/single-reader CAP-consistent data service, ensuring linearizability and making them easier to understand and implement in a distributed context. Composite registers extend this concept to handle multiple writers and readers, requiring additional synchronization mechanisms to maintain consistency across these operations. The application difference lies in their ability to handle complex read/write patterns; atomic registers are simpler and more reliable but limited in concurrency management, while composite registers support greater concurrency at the cost of increased complexity .

Synchronous network algorithms operate in environments where there is a known upper bound on the message delivery time and process execution speed, allowing them to proceed in lock step. This predictability enables simplifying assumptions in algorithm design, such as handling global states and coordination. Asynchronous network algorithms, in contrast, do not assume any upper bound on message delivery time or process execution time, requiring them to handle more uncertainties, like variable network delays and process speeds. This necessitates more sophisticated techniques for ensuring coordination and reliability in the face of asynchrony and unpredictable network conditions .

Distributed spanning tree algorithms help optimize network resources by constructing a subset of the network graph that connects all nodes with the minimum total edge weight or cost. This not only reduces the overhead associated with managing redundant links but also ensures efficient data distribution paths, minimizing latency and enhancing the overall performance of the network. The ability to dynamically adapt the spanning tree as network topology changes is crucial for maintaining optimal resource use in diverse and evolving network environments .

Fault detectors play a crucial role in identifying which components in a distributed system have failed, enabling the system to take corrective actions to maintain consistency and availability. They help facilitate consensus, checkpoint recovery, and data replication processes. However, fault detectors face limitations such as imperfect accuracy; they can produce false positives or false negatives due to network partitions or delays. This necessitates designing multi-level detection mechanisms and integrating redundancy to ensure effective fault tolerance .

You might also like