Distributed Computing Course Overview
Physical clocks attempt to stay synchronized with an external time standard such as UTC, typically through a protocol like NTP, so that event timestamps align with real-world time across a distributed system. Logical clocks, such as Lamport's logical clocks, do not track real time; they order events so that causality is respected (if one event happened before another, it receives the smaller timestamp). This is sufficient in distributed systems where the sequence of events matters more than the exact time.
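The update rules of a Lamport logical clock can be sketched in a few lines. This is a minimal illustration, not a full implementation; the class name LamportClock and its method names are chosen here for the example and do not come from the course material.

```python
class LamportClock:
    """Minimal Lamport logical clock for a single process (illustrative names)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending counts as an event; the returned timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Move past the sender's timestamp, then count the receive event itself.
        self.time = max(self.time, msg_time) + 1
        return self.time


p, q = LamportClock(), LamportClock()
p.tick()                        # p's clock is now 1
stamp = p.send()                # p's clock is now 2; the message carries 2
q.tick()                        # q's clock is now 1
received_at = q.receive(stamp)  # q's clock becomes max(1, 2) + 1 = 3
```

The receive event always gets a larger timestamp than the matching send, which is the property that lets timestamps respect causality without any reference to real time.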
Virtualization enables scalability and elasticity in cloud computing by abstracting physical hardware, allowing dynamic allocation of resources and efficient load balancing. This flexibility means that computing power and storage can be adjusted rapidly in response to demand spikes, offering users a seamless experience with minimal infrastructure waste.
Synchronous executions rely on coordinated actions with predictable timing between processes, which makes concurrency more straightforward to reason about but requires timing guarantees such as clock synchronization. Asynchronous executions offer no timing guarantees, so they need mechanisms that maintain data consistency and event order independently of real time. These differences affect design decisions about fault tolerance, system responsiveness, and synchronization complexity, so the approach has to be chosen according to each system's requirements.
Snapshot algorithms capture a consistent global state of a distributed system: the recorded local state of every process together with the messages that were in transit on each channel. Because communication is asynchronous, the local states cannot all be recorded at the same instant, so the algorithm must ensure that the recorded pieces still fit together into a coherent global state. This capability is crucial for debugging, reliability, and recovery, since it provides a baseline for analyzing system behavior.
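As an illustration, the sketch below simulates the marker rule of the Chandy-Lamport algorithm, one well-known snapshot algorithm (the text above does not name a specific one). It assumes reliable FIFO channels, modeled as in-memory deques, and two processes that exchange money transfers. The Process class and all names are hypothetical and exist only for this example.

```python
from collections import deque

MARKER = "MARKER"


class Process:
    def __init__(self, name, balance, peers):
        self.name = name
        self.balance = balance
        self.peers = peers
        self.saved_balance = None   # local state recorded for the snapshot
        self.recording = set()      # incoming channels still being recorded
        self.in_flight = {}         # peer -> messages caught in transit

    def transfer(self, channels, dst, amount):
        self.balance -= amount
        channels[(self.name, dst)].append(amount)

    def start_snapshot(self, channels):
        # Record local state, then send a marker on every outgoing channel.
        self.saved_balance = self.balance
        self.recording = set(self.peers)
        self.in_flight = {peer: [] for peer in self.peers}
        for peer in self.peers:
            channels[(self.name, peer)].append(MARKER)

    def receive(self, channels, src):
        msg = channels[(src, self.name)].popleft()
        if msg == MARKER:
            if self.saved_balance is None:
                self.start_snapshot(channels)  # first marker seen: record now
            self.recording.discard(src)        # stop recording this channel
        else:
            self.balance += msg
            if src in self.recording:
                # Arrived after our state was recorded but before the
                # sender's marker, so it belongs to the channel state.
                self.in_flight[src].append(msg)


channels = {("P", "Q"): deque(), ("Q", "P"): deque()}
p = Process("P", 100, ["Q"])
q = Process("Q", 50, ["P"])

p.transfer(channels, "Q", 10)   # P holds 90, 10 is in transit
q.transfer(channels, "P", 5)    # Q holds 45, 5 is in transit
p.start_snapshot(channels)      # P records 90 and sends a marker
q.receive(channels, "P")        # Q receives 10 and holds 55
q.receive(channels, "P")        # marker: Q records 55 and sends its own marker
p.receive(channels, "Q")        # P receives 5 and logs it as in flight
p.receive(channels, "Q")        # marker: P stops recording

snapshot_total = (p.saved_balance + q.saved_balance
                  + sum(p.in_flight["Q"]) + sum(q.in_flight["P"]))
```

The recorded states (90 and 55) plus the one recorded in-transit message (5) add up to the 150 units that exist in the system, even though the two local states were recorded at different moments.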
The characteristics of cloud computing, such as on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service, influence deployment models (public, private, hybrid, and community clouds) and service models (IaaS, PaaS, SaaS). These attributes enable diverse deployment strategies and flexible service offerings tailored to varying organizational needs and scale.
Achieving consensus in distributed systems, particularly asynchronous ones, involves challenges such as network delays, message loss, and node failures. In asynchronous environments there is no common clock and no bound on message delivery time, so a slow node cannot be reliably distinguished from a failed one. This complicates agreement algorithms and calls for fault tolerance techniques that keep the decision consistent among nodes despite failures or message discrepancies.
Message-passing systems let distributed components communicate by exchanging messages, which suits loosely coupled systems whose nodes communicate over a network. In contrast, shared-memory systems allow multiple processes to access a common memory space and are typically used in tightly coupled systems, where access to shared data must be synchronized. The two are distinguished by their communication models and by the design challenges they pose for synchronization and data consistency.
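The difference between the two communication models can be illustrated on a single machine. In this sketch, which is only an analogy, Python threads stand in for processes and a queue.Queue stands in for a network channel; the function names are made up for the example.

```python
import queue
import threading


def sum_shared_memory(chunks):
    """Workers update one shared variable, so access must be locked."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        for value in chunk:
            with lock:              # synchronization guards the shared state
                total += value

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def sum_message_passing(chunks):
    """Workers keep private state and send their results as messages."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))     # no shared variable, only a message

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(mailbox.get() for _ in chunks)


chunks = [[1, 2, 3], [4, 5], [6]]
shared_result = sum_shared_memory(chunks)
message_result = sum_message_passing(chunks)
```

Both versions compute the same total. The shared-memory version needs a lock around every update, while the message-passing version avoids shared state entirely and pays instead with explicit communication.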
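Suzuki-Kasami’s Broadcast Algorithm is token-based: a process broadcasts its request to all other processes and enters the critical section only while it holds the unique token. An entry therefore costs N messages (N - 1 requests plus one token transfer), or none if the requester already holds the token. The scheme is simple, but under heavy contention processes may wait while the token moves between requesters. Ricart-Agrawala’s Algorithm, by contrast, is permission-based and uses no token: each request carries a timestamp (sequence number) that orders competing requests, and a process enters the critical section only after receiving a reply from every other process. This avoids token handling but costs 2(N - 1) messages per entry.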
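The core decision in Ricart-Agrawala, whether to reply to an incoming request immediately or defer the reply, can be sketched as below, alongside the standard per-entry message counts of both algorithms. The state labels and function names are assumptions made for this example; requests are modeled as (timestamp, process id) tuples so that ties on the timestamp are broken by process id.

```python
def should_defer(my_state, my_request, incoming_request):
    """Decide whether to postpone the reply to an incoming request.

    my_state is 'idle', 'wanting', or 'in_cs' (labels chosen for this sketch).
    Requests are (timestamp, pid) tuples; the smaller tuple has priority.
    """
    if my_state == "in_cs":
        return True                           # reply after leaving the critical section
    if my_state == "wanting":
        return my_request < incoming_request  # our own request has priority
    return False                              # idle: reply immediately


def ricart_agrawala_messages(n):
    # n - 1 requests plus n - 1 replies per critical-section entry.
    return 2 * (n - 1)


def suzuki_kasami_messages(n, holds_token):
    # n - 1 broadcast requests plus one token transfer, or nothing at all
    # if the requester already holds the token.
    return 0 if holds_token else n
```

For example, with five processes a Ricart-Agrawala entry costs 8 messages, while a Suzuki-Kasami entry costs 5, or 0 when the token is already local.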
Monitoring is crucial in cloud computing to ensure optimal performance, security, and resource utilization. It primarily focuses on tracking compute and storage services, application performance, load balancing, and system health to detect and rectify failures, optimize costs, and ensure compliance with service level agreements.
The Chandy-Misra-Haas algorithms detect deadlock without requiring a global view of the system state, which is essential in distributed systems. For the AND model of resource requests, an edge-chasing scheme sends probe messages along chains of waiting processes; if a probe returns to its initiator, the wait-for graph contains a cycle and the initiator is deadlocked. For the OR model, where a cycle alone does not imply deadlock, the corresponding algorithm uses a diffusion computation of query and reply messages instead of probes.
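The probe mechanism for the AND model can be sketched as a small simulation. For brevity the wait-for relation is held in one dictionary and the probes in one queue; in a real system each process would know only its own dependencies and the probes would be network messages. The function name and data layout are assumptions for this example.

```python
from collections import deque


def probe_detects_deadlock(wait_for, initiator):
    """Simulate edge-chasing probes (initiator, sender, receiver), AND model.

    wait_for maps each blocked process to the set of processes it waits on.
    Returns True if some probe comes back to the initiator.
    """
    probes = deque((initiator, initiator, target)
                   for target in wait_for.get(initiator, ()))
    forwarded = set()   # processes that already forwarded this initiator's probe
    while probes:
        init, _sender, receiver = probes.popleft()
        if receiver == init:
            return True                 # the probe returned: a cycle exists
        if receiver in forwarded:
            continue                    # forward at most once per initiator
        forwarded.add(receiver)
        # Only a blocked process forwards; an active one has no entries
        # in wait_for and simply discards the probe.
        for nxt in wait_for.get(receiver, ()):
            probes.append((init, receiver, nxt))
    return False


cycle = {1: {2}, 2: {3}, 3: {1}}        # 1 -> 2 -> 3 -> 1
chain = {1: {2}, 2: {3}, 3: set()}      # 3 is active, so no deadlock
cycle_found = probe_detects_deadlock(cycle, 1)
chain_found = probe_detects_deadlock(chain, 1)
```

In the first graph the probe started by process 1 travels through 2 and 3 and returns to 1, so a deadlock is reported; in the second it is discarded at process 3, which is not waiting on anyone.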