Cloud Computing Networking Overview
SYLLABUS (UNIT - IV): Networking for Cloud Computing: Introduction, Overview of Data
Center Environment, Networking Issues in Data Centers, Transport Layer Issues in DCNs, Cloud
Service Providers.
1) Introduction:
What is Cloud Computing?
Cost-Effectiveness: Efficient network design can optimize data transfer costs and overall
cloud expenditure.
Future Trends:
Serverless Networking: Networks becoming even more abstracted and managed by the
cloud provider.
Network as Code (NaC): Automating network provisioning and management through code
(e.g., using Infrastructure as Code tools like Terraform, CloudFormation).
AI/ML for Network Operations (AIOps): Leveraging AI and machine learning for
predictive analytics, anomaly detection, and automated remediation in cloud networks.
Edge Computing: Extending cloud networking capabilities to the "edge" of the network for
lower latency and improved performance for certain applications.
5G Integration: The synergy between 5G and cloud computing for enhanced mobile
applications and IoT.
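The Network as Code idea above can be sketched as follows. This is a minimal illustration, assuming a hypothetical model in which each subnet is just a name mapped to a CIDR string. The function `plan_changes` and the sample data are assumptions for illustration and are not part of Terraform or CloudFormation; real tools compute a broadly similar plan by comparing declared state with actual state.

```python
def plan_changes(desired, actual):
    """Compare desired and actual subnet maps and return the actions needed."""
    plan = []
    for name, cidr in sorted(desired.items()):
        if name not in actual:
            plan.append(("create", name, cidr))
        elif actual[name] != cidr:
            plan.append(("update", name, cidr))
    for name in sorted(actual):
        if name not in desired:
            plan.append(("delete", name, actual[name]))
    return plan


# Hypothetical desired state (what the code declares) and actual state.
desired = {"web": "10.0.1.0/24", "db": "10.0.2.0/24"}
actual = {"web": "10.0.1.0/24", "db": "10.0.9.0/24", "old": "10.0.3.0/24"}

for action in plan_changes(desired, actual):
    print(action)
```

Because the network is described as data, the same comparison can be rerun at any time to detect drift between what was declared and what is deployed.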
A cloud data center is a massive, purpose-built facility that houses the physical infrastructure
(servers, storage, networking equipment) required to provide cloud computing services on a
large scale. Unlike traditional on-premises data centers which are typically owned and operated by
a single organization for its own use, cloud data centers are operated by third-party cloud
providers (e.g., AWS, Azure, Google Cloud) and are designed to serve multiple customers
(multi-tenancy) over the internet.
PREPARED BY
[Link] KUMAR ([Link])
VEC-KHAMMAM.
CLOUD COMPUTING MATERIAL (UNIT -4)
Hyperscale: Cloud data centers, especially those of major providers, are often
"hyperscale," meaning they are massive in size, containing hundreds of thousands of servers and
supporting millions of users and applications.
Geographic Distribution: Cloud providers operate data centers in multiple regions and
availability zones around the world. This geographical distribution offers:
o Low Latency: Resources are closer to users, reducing communication delays.
o Disaster Recovery & Business Continuity: Data and applications can be replicated
across different locations, ensuring resilience in case of regional outages or disasters.
o Compliance: Meeting data residency requirements for different regulations.
Virtualization: This is the core technology enabling cloud computing. Virtualization
software (hypervisors) allows physical servers to be divided into multiple virtual machines
(VMs), maximizing hardware utilization and providing flexibility for users to provision
resources on demand.
Software-Defined Everything (SDx): Cloud data centers heavily rely on software-defined
networking (SDN), software-defined storage (SDS), and software-defined data centers
(SDDC) principles. This allows for programmatic control and automation of infrastructure,
rather than manual configuration of physical devices.
Automation: Extensive automation is used for provisioning, scaling, monitoring, and
managing resources, reducing manual intervention and enabling rapid deployment.
Energy Efficiency: Given their massive scale and continuous operation, cloud data centers
are designed with advanced cooling systems (e.g., liquid cooling, hot/cold aisle
containment) and energy-efficient hardware to minimize power consumption and
environmental impact.
Robust Security: While users are responsible for security in the cloud, the cloud provider is
responsible for security of the cloud data center. This includes physical security (access
controls, surveillance), network security (firewalls, DDoS protection), and compliance with
various industry standards and regulations.
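The low-latency benefit of geographic distribution listed above can be made concrete with a rough calculation. The sketch below assumes a signal speed in optical fiber of about 200,000 km/s (roughly two thirds of the speed of light in vacuum) and uses hypothetical distances and region names. It gives only a lower bound, since it ignores routing detours, queuing, and processing time.

```python
FIBER_SPEED_KM_PER_S = 200_000  # approximate signal speed in optical fiber


def round_trip_ms(distance_km):
    """Lower bound on round-trip time over fiber, in milliseconds."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


# Hypothetical user-to-region distances in kilometres.
regions = {"nearby-region": 500, "distant-region": 8000}
for name, km in regions.items():
    print(f"{name}: at least {round_trip_ms(km):.1f} ms round trip")
```

Even before any congestion, the distant region in this example cannot respond faster than tens of milliseconds, which is why providers place regions close to their users.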
The cloud data center environment can be broadly divided into physical and logical infrastructure:
A. Physical Infrastructure:
Virtualization Layer (Hypervisors): Software that creates and manages virtual machines
(VMs) on physical servers (e.g., VMware ESXi, KVM, Xen).
Cloud Management Platform (CMP): The software that cloud providers use to manage
and orchestrate their vast infrastructure, enabling users to provision, monitor, and scale
resources through web interfaces and APIs.
Software-Defined Networking (SDN) & Network Virtualization: Software that abstracts
and controls network resources, allowing for dynamic network configurations, creation of
virtual networks (VPCs/VNets), and automated network services.
Orchestration and Automation Tools: Tools that automate the deployment, scaling, and
management of applications and infrastructure (e.g., Kubernetes for containers, Terraform
for Infrastructure as Code).
Monitoring and Logging Systems: Collect data on performance, health, and security
events across the entire data center, providing insights and enabling proactive management.
Accessibility: Traditional on-premises data center: limited to the internal network, possibly
with VPN for remote access. Cloud data center: broad network access via the internet,
anywhere, anytime.
Data centers are the backbone of modern digital infrastructure, housing the computing, storage, and
networking equipment necessary to operate applications, store data, and deliver services. Their
design is crucial for performance, reliability, and scalability.
1. Servers (Compute):
o Description: The workhorses of the data center, responsible for processing data and
running applications. They vary in form factor and power.
o Types:
Rack Servers: Standard servers designed to be mounted in equipment racks,
typically 1U (one rack unit) or 2U tall.
Blade Servers: Highly compact, modular servers that slide into a chassis,
sharing power, cooling, and network connections. They offer high density.
Tower Servers: Standalone servers resembling desktop PCs, often used in
smaller deployments.
Mainframes: High-performance computers capable of processing billions of
calculations, often used for mission-critical applications and large-scale
transaction processing.
o Virtualization/Containers: Modern data centers heavily rely on virtualization (e.g.,
VMware, Hyper-V) and containerization (e.g., Docker, Kubernetes) to maximize
server utilization by running multiple virtual machines or containers on a single
physical server.
2. Storage Systems:
o Description: Devices and software used to store and manage vast amounts of data.
o Types:
Direct-Attached Storage (DAS): Storage directly connected to a single
server.
Network-Attached Storage (NAS): Dedicated storage devices connected to
a network, allowing multiple servers to access file-level data.
Storage Area Network (SAN): A high-speed network dedicated to block-
level storage, allowing servers to access storage as if it were locally attached.
Object Storage: A scalable storage architecture that manages data as
objects, popular in cloud environments for unstructured data.
o Media: Hard Disk Drives (HDDs), Solid State Drives (SSDs), Tape Libraries (for
archives), Optical Discs.
3. Networking Equipment:
o Description: The communication backbone that connects all components within the
data center and links it to external networks (e.g., the Internet).
o Components:
Switches: Enable communication between devices within the same network
segment, forwarding data based on MAC addresses.
Routers: Connect different networks and direct traffic between them based
on IP addresses.
Firewalls: Network security devices that monitor and filter incoming and
outgoing network traffic based on predefined security rules.
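The firewall behaviour described above, filtering traffic against predefined rules, can be sketched with Python's standard `ipaddress` module. The rule format, the `evaluate` function, and the addresses are assumptions made for illustration. Real firewalls match on many more fields (protocol, destination address, connection state) and are configured in vendor-specific ways.

```python
import ipaddress

# Hypothetical rule set: (action, source network, destination port or None for any).
RULES = [
    ("allow", ipaddress.ip_network("10.0.0.0/8"), 22),
    ("allow", ipaddress.ip_network("0.0.0.0/0"), 443),
    ("deny", ipaddress.ip_network("0.0.0.0/0"), None),
]


def evaluate(src_ip, dst_port, rules=RULES):
    """Return the action of the first rule that matches the packet."""
    src = ipaddress.ip_address(src_ip)
    for action, network, port in rules:
        if src in network and (port is None or port == dst_port):
            return action
    return "deny"  # default deny when no rule matches


print(evaluate("10.1.2.3", 22))      # internal host using SSH
print(evaluate("203.0.113.7", 22))   # external host using SSH
print(evaluate("203.0.113.7", 443))  # external host using HTTPS
```

Rules are checked in order and the first match wins, so the position of a rule in the list is part of the policy.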
1. Power Systems:
o Description: Ensure a continuous and stable power supply to all IT equipment.
o Components:
Uninterruptible Power Supplies (UPS): Provide temporary power during
outages and protect against power fluctuations, allowing graceful shutdown
or transition to backup generators.
Backup Generators: Diesel or natural gas generators that provide long-term
power during extended utility outages.
Power Distribution Units (PDUs): Distribute power from the
UPS/generators to individual racks and IT equipment.
Switchgear & Electrical Panels: Manage and distribute electricity
throughout the facility.
Redundant Power Feeds: Multiple power sources to eliminate single points
of failure.
2. Cooling Systems:
o Description: Maintain optimal temperature and humidity levels to prevent
overheating of IT equipment, which generates significant heat.
o Components:
Computer Room Air Conditioners (CRACs) / Computer Room Air
Handlers (CRAHs): Units that cool and dehumidify the air in the data
center.
Chillers & Cooling Towers: Used in larger facilities for water-based
cooling systems.
Hot/Cold Aisle Containment: Physical barriers that separate hot exhaust air
from cold intake air, improving cooling efficiency.
In-row/Rack Cooling Units: Targeted cooling systems placed directly
within server rows or racks.
3. Physical Security Systems:
o Description: Protect the data center from unauthorized access, theft, and physical
damage.
o Components:
Access Control: Biometric scanners, keycard systems, and security
personnel to control entry.
Video Surveillance (CCTV): Cameras to monitor all areas of the facility.
Intrusion Detection Systems: Sensors and alarms to detect unauthorized
entry.
Core Layer: High-speed routers and switches forming the backbone, connecting to external
networks and acting as the central aggregation point.
Distribution (Aggregation) Layer: Connects the core layer to the access layer, providing
routing, filtering, and QoS (Quality of Service) functions.
Access Layer: Connects servers and other end devices to the network via switches (often
Top-of-Rack - ToR switches).
High Oversubscription: The design assumes that traffic flows mostly north-south (client to
server and vice versa). East-west (server-to-server) traffic, which is dominant in virtualized
and cloud environments, has to traverse up to the distribution or even core layer, leading to
bottlenecks and high latency.
Scalability Challenges: Adding capacity often means adding more tiers or larger, more
expensive core switches.
Complexity: Managing VLANs and Spanning Tree Protocol (STP) for redundancy can be
complex.
Single Points of Failure: Without redundancy, core and distribution devices can become single
points of failure as well as bottlenecks.
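Oversubscription in the three-tier design can be expressed as a simple ratio of server-facing capacity to uplink capacity on a switch. The port counts and speeds below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def oversubscription_ratio(downlinks, downlink_gbps, uplinks, uplink_gbps):
    """Ratio of server-facing capacity to uplink capacity on one switch."""
    return (downlinks * downlink_gbps) / (uplinks * uplink_gbps)


# Hypothetical access switch: 48 x 10 Gbps server ports, 4 x 40 Gbps uplinks.
ratio = oversubscription_ratio(48, 10, 4, 40)
print(f"{ratio:.0f}:1")
```

A ratio above 1:1 means that if all servers send traffic upward at full rate at the same time, the uplinks cannot carry it all. This is why heavy east-west traffic suffers in this design.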
The Spine-Leaf architecture, based on a Clos network topology, has become the de-facto
standard for modern DCNs, especially for cloud and hyperscale environments.
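A minimal sketch of the Spine-Leaf idea is shown below, assuming a two-tier fabric in which every leaf switch connects to every spine switch. The switch names and counts are hypothetical. The point of the sketch is that any two leaves are always exactly two hops apart, and the number of equal-length paths between them equals the number of spines.

```python
from itertools import product


def build_leaf_spine(num_leaves, num_spines):
    """Return leaves, spines, and the links of a fabric where every leaf connects to every spine."""
    leaves = [f"leaf{i}" for i in range(num_leaves)]
    spines = [f"spine{j}" for j in range(num_spines)]
    links = {(leaf, spine) for leaf, spine in product(leaves, spines)}
    return leaves, spines, links


def paths_between(leaf_a, leaf_b, spines, links):
    """List the two-hop leaf-spine-leaf paths between two different leaves."""
    return [
        (leaf_a, spine, leaf_b)
        for spine in spines
        if (leaf_a, spine) in links and (leaf_b, spine) in links
    ]


leaves, spines, links = build_leaf_spine(num_leaves=4, num_spines=2)
for path in paths_between("leaf0", "leaf3", spines, links):
    print(" -> ".join(path))
```

Adding a spine adds one more path between every pair of leaves, which is how this topology scales east-west bandwidth without adding tiers.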
By leveraging these sophisticated components and network architectures, modern data centers
provide the robust, scalable, and high-performance foundation required for cloud computing, big
data, AI, and other demanding digital workloads.
Data centers are complex ecosystems designed to house, power, and connect the IT infrastructure
that drives modern digital services. Understanding their core components, particularly storage and
compute, is fundamental to grasping how these facilities operate.
Storage systems in data centers are responsible for the persistent retention, management, and
retrieval of vast quantities of digital information. The choice of storage technology depends heavily
on factors like performance requirements (speed of access), capacity needs, cost, and the type of
data being stored (structured vs. unstructured).
1. Based on Connectivity/Architecture:
A NAS device is essentially a specialized server with optimized storage hardware and
software.
o Pros: Centralized storage, easily shareable by multiple users/servers, relatively
simple to set up and manage, good for unstructured data.
o Cons: Performance can be affected by network congestion, generally higher latency
than SAN for block-level access.
o Use Cases: File sharing, departmental storage, backups, home directories, content
repositories (e.g., media files, documents).
Storage Area Network (SAN):
o Description: A high-speed, dedicated network (separate from the main LAN)
designed specifically for block-level data access. Servers connect to the SAN and
perceive the storage as if it were locally attached disks. SANs typically use Fibre
Channel (FC) for high performance or iSCSI (Internet Small Computer System
Interface) over Ethernet for cost-effectiveness.
o Pros: High performance and low latency (especially FC SAN), highly scalable,
centralized storage, supports advanced features like snapshots, replication, and data
deduplication, ideal for structured data.
o Cons: More complex and expensive to set up and manage than NAS/DAS, requires
specialized hardware and expertise.
o Use Cases: Databases, virtualized server environments (VMware, Hyper-V where
multiple VMs need shared block storage), high-performance applications, enterprise-
level storage.
File Storage: Data organized in a hierarchical structure of files and folders (e.g.,
documents, images). Accessed via NAS.
Block Storage: Data broken into fixed-size blocks, each with a unique address, without
metadata. Provides raw storage that operating systems can format and use as disks.
Accessed via SAN or DAS.
Object Storage: Data stored as self-contained "objects" with unique identifiers and rich
metadata (not in a hierarchy). Accessed via APIs (e.g., S3-compatible APIs). Highly
scalable and cost-effective for vast amounts of unstructured data.
o Use Cases: Cloud storage (Amazon S3, Azure Blob Storage, Google Cloud
Storage), data lakes, backups, archives, web content.
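The object storage model described above (a flat namespace of keys, each holding data plus metadata) can be illustrated with a toy in-memory class. `ToyObjectStore` and its methods are hypothetical and are not the S3 API. The sketch only shows that a key such as "logs/app.log" is a single flat identifier rather than a path through folders.

```python
import hashlib


class ToyObjectStore:
    """In-memory stand-in for an object store: flat keys, data plus metadata."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        """Store bytes under a key and return a content hash as an identifier."""
        etag = hashlib.sha256(data).hexdigest()
        self._objects[key] = {"data": data, "metadata": metadata, "etag": etag}
        return etag

    def get(self, key):
        """Return the stored bytes for a key."""
        return self._objects[key]["data"]

    def head(self, key):
        """Return only the metadata, without the data itself."""
        return self._objects[key]["metadata"]


store = ToyObjectStore()
store.put("logs/app.log", b"hello", content_type="text/plain", owner="team-a")
print(store.get("logs/app.log"))
print(store.head("logs/app.log"))
```

Block storage, by contrast, would expose only numbered blocks with no metadata, and file storage would require walking a directory hierarchy.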
o Pros: Significantly faster read/write speeds, lower latency, more durable, lower
power consumption.
o Cons: Higher cost per gigabyte, though prices are decreasing.
o Types: SATA SSDs, SAS SSDs, NVMe SSDs (NVMe over PCIe offers the highest
performance).
o Use Cases: Databases, virtualization, high-performance applications, caching.
Tape Libraries: Magnetic tapes stored in automated libraries.
o Pros: Extremely high capacity, lowest cost per gigabyte for cold data, very long
shelf life, air-gapped security for ransomware protection.
o Cons: Sequential access (slow for retrieval), requires dedicated hardware.
o Use Cases: Long-term archives, disaster recovery, regulatory compliance.
Compute infrastructure refers to the processing power and memory resources required to run
applications, execute code, and perform calculations within a data center. It's the "brain" of the
operation.
Here are some common networking issues in cloud computing data centers, along with their causes
and potential impacts:
I. Performance-Related Issues:
Latency:
o Description: The time delay for data to travel from source to destination, commonly
measured as round-trip time (RTT), the time to the destination and back. High latency leads
to slow response times and a poor user experience, especially for real-time applications.
o Causes:
Inefficient routing (data taking unnecessarily long paths).
Physical distance between users and data centers.
Network congestion due to bandwidth constraints.
Misconfigured network devices (routers, switches).
Insufficient receive/transmit queues on NICs for high packet rates.
o Impact: Frustration for users, delays in critical business operations (e.g., financial
trading, video conferencing), degraded application performance.
Bandwidth Bottlenecks:
o Description: Occurs when the demand for network capacity exceeds the available
bandwidth.
o Causes:
Networks not designed with scalability for increased traffic.
High bandwidth consumption from applications (e.g., video streaming, large
file transfers).
Sudden spikes in network usage, particularly during peak hours.
Using lower-grade network connections (e.g., copper instead of fiber optic).
o Impact: Slower data transfer rates, increased latency, poor application performance,
congestion, reduced service quality.
Packet Loss:
o Description: Data packets fail to reach their destination.
o Causes:
Network congestion (too much traffic overloading the network).
Unstable or low-quality network connections.
Hardware failures (e.g., malfunctioning cables, network adapters).
o Impact: Incomplete data transfers, retransmissions, increased latency, degraded
application performance (e.g., choppy VoIP calls, video artifacts).
Jitter:
o Description: Variation in the delay of received packets, especially problematic for
real-time applications.
o Causes: Network congestion, varying traffic priorities.
o Impact: Audio and video distortions in real-time communication (e.g., video
conferencing), poor user experience.
Underutilization/Overutilization of Resources:
o Description: Network resources are either not fully used (wasting capacity) or are
consistently overloaded (leading to bottlenecks).
o Causes: Poor capacity planning, lack of real-time monitoring, inefficient resource
allocation.
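The performance metrics above (latency, packet loss, and jitter) can be computed from a simple set of probe results. The sketch below uses hypothetical round-trip times and a simplified definition of jitter, the mean absolute difference between consecutive measurements. Production tools often use smoothed estimators instead.

```python
from statistics import mean


def summarize(sent, rtts_ms):
    """Summarize probe results; rtts_ms holds one RTT per reply received."""
    loss_pct = 100 * (sent - len(rtts_ms)) / sent
    # Jitter here is the mean absolute difference between consecutive RTTs.
    diffs = [abs(b - a) for a, b in zip(rtts_ms, rtts_ms[1:])]
    return {
        "avg_latency_ms": mean(rtts_ms),
        "jitter_ms": mean(diffs) if diffs else 0.0,
        "loss_pct": loss_pct,
    }


# Hypothetical probe run: 10 probes sent, 8 replies received.
print(summarize(10, [10.0, 12.0, 11.0, 15.0, 10.0, 10.0, 13.0, 11.0]))
```

Two paths with the same average latency can differ greatly in jitter, which is why real-time applications track both figures.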
DDoS Attacks:
o Description: Distributed Denial of Service attacks overwhelm network resources
with a flood of traffic.
o Causes: Malicious actors.
o Impact: Network outages, service disruption, reputational damage.
Unauthorized Access/Data Breaches:
o Description: Unapproved access to the network or sensitive data.
o Causes: Weak security policies, misconfigured firewalls, malware, phishing, lack of
multi-factor authentication, unpatched vulnerabilities.
o Impact: Data loss, financial losses, reputational damage, legal consequences.
Inadequate Security Controls:
o Description: Insufficient measures to protect the network from cyber threats.
Network Monitoring and Analytics: Implement robust monitoring tools for real-time
visibility into network traffic, performance metrics (latency, bandwidth, packet loss), and
resource utilization. This helps in proactive identification and troubleshooting.
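One simple way monitoring data can support proactive identification is to flag samples that deviate strongly from the rest. The sketch below is a basic z-score check over hypothetical latency samples. The threshold of 2.0 standard deviations is an assumption chosen for this small sample, and real monitoring systems typically use more robust methods.

```python
from statistics import mean, stdev


def find_anomalies(samples, threshold=2.0):
    """Return (index, value) pairs more than `threshold` standard deviations from the mean."""
    mu = mean(samples)
    sigma = stdev(samples)
    if sigma == 0:
        return []
    return [(i, x) for i, x in enumerate(samples) if abs(x - mu) / sigma > threshold]


# Hypothetical latency samples in milliseconds with one spike.
latency_ms = [10, 11, 10, 12, 11, 10, 95, 11, 10, 12]
print(find_anomalies(latency_ms))
```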
Example:
1. TCP Incast
2. TCP Outcast
Description: Even with low packet loss, DCN switches can experience significant queue
build-up due to bursty traffic and the latency of TCP's congestion control. This leads to
increased latency for all traffic passing through the congested buffer.
Cause: Traditional TCP relies on packet loss as the primary signal for congestion. In DCNs,
where link speeds are high and buffers are often deep to absorb bursts, congestion might
build up in queues for a considerable time before packet loss occurs. This "hidden"
congestion increases latency.
Example: A cloud database cluster might have many concurrent transactions, generating
bursty traffic. While no packets are being explicitly dropped, the packets might sit in switch
buffers for longer than desired, increasing the transaction latency and affecting application
performance.
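The extra latency from queue build-up can be estimated directly: a packet arriving behind a backlog must wait for that backlog to drain at the link rate. The queue size and link speed below are hypothetical.

```python
def queueing_delay_us(queued_bytes, link_gbps):
    """Time for the data already queued to drain at the link rate, in microseconds."""
    bits = queued_bytes * 8
    return bits / (link_gbps * 1e9) * 1e6


# Hypothetical case: 1 MB already queued on a 10 Gbps port.
print(f"{queueing_delay_us(1_000_000, 10):.0f} microseconds")
```

In a network where the unloaded round-trip time is often well under a millisecond, a delay of this size on a single hop is significant even though no packet has been dropped.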
Description: In lossless Ethernet DCNs (which use mechanisms like Priority-based Flow
Control - PFC to prevent packet loss by pausing senders), a paused flow can block other
flows that share the same output port, even if those other flows are destined for uncongested
paths.
Cause: PFC operates at the link layer and pauses traffic per priority class, not per flow. If
one flow experiences congestion and triggers a pause frame, every flow in that priority class
on the link is paused, holding back traffic for other destinations.
Example: In a DCN using PFC, if a storage array experiences a momentary slowdown and
its incoming queue fills up, it might send a pause frame to the switch. If other unrelated
traffic flows through the same switch port to different destinations, they will also be paused
until the congestion on the storage array link clears.
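The head-of-line blocking effect described above can be shown with a deliberately simplified model, in which each packet is represented only by its destination and a paused destination cannot receive. The function and destination names are hypothetical, and the model ignores priority classes and timing. It contrasts a single shared FIFO with separate per-destination queues.

```python
from collections import deque


def drain_shared_fifo(packets, paused):
    """Single FIFO: sending stops at the first packet whose destination is paused."""
    queue, sent = deque(packets), []
    while queue and queue[0] not in paused:
        sent.append(queue.popleft())
    return sent


def drain_per_destination(packets, paused):
    """Separate queue per destination: only paused destinations wait."""
    return [dst for dst in packets if dst not in paused]


# Each packet is represented only by its destination (hypothetical names).
packets = ["storage", "web", "web", "storage", "cache"]
paused = {"storage"}
print(drain_shared_fifo(packets, paused))
print(drain_per_destination(packets, paused))
```

With the shared queue, the packet for the paused storage array sits at the head and nothing behind it can leave, even though the web and cache destinations are idle.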
Description: While many new TCP congestion control algorithms (e.g., DCTCP, TIMELY,
BBR, XCP, DCN-TCP) have been proposed to address DCN-specific issues, their
deployment and interoperability can be complex.
Cause: These variants often require modifications to network devices (switches, NICs) or
operating systems, making them challenging to deploy in heterogeneous or multi-vendor
environments. Some protocols rely on explicit congestion notification (ECN) or in-band
telemetry, requiring careful configuration across the network.
Example: A large cloud provider might develop and deploy a specialized TCP variant like
DCTCP to optimize performance within its data centers. However, ensuring its
compatibility and optimal performance when interacting with older hardware or external
networks can be a significant challenge.
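To show how an ECN-based variant such as DCTCP reacts to the extent of congestion rather than to loss, the sketch below implements a simplified per-window update. The sender keeps a running estimate `alpha` of the fraction of marked packets and reduces its window in proportion to it. The function name, the one-packet additive increase, and the gain `g = 1/16` are simplifying assumptions for illustration, not a complete implementation.

```python
def dctcp_update(cwnd, alpha, marked, total, g=1 / 16):
    """One per-window DCTCP-style update.

    marked/total is the fraction of acknowledged packets that carried an
    ECN congestion mark during the last window.
    """
    fraction = marked / total
    alpha = (1 - g) * alpha + g * fraction
    if marked > 0:
        # Cut the window in proportion to the estimated congestion level.
        cwnd = max(1.0, cwnd * (1 - alpha / 2))
    else:
        cwnd += 1  # simplified additive increase, one packet per window
    return cwnd, alpha


cwnd, alpha = 100.0, 0.0
for marked in [0, 10, 50, 100, 0]:
    cwnd, alpha = dctcp_update(cwnd, alpha, marked, total=100)
    print(f"marked={marked:3d}  alpha={alpha:.3f}  cwnd={cwnd:.1f}")
```

When few packets are marked the window shrinks only slightly, while sustained marking of every packet drives `alpha` toward 1 and the reduction toward the halving used by loss-based TCP.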
7. UDP-Specific Considerations
While TCP carries most DCN traffic, UDP is used for latency-sensitive applications that can
tolerate some loss (e.g., real-time monitoring, some gaming, DNS).
Lack of Congestion Control: UDP offers no inherent congestion control, flow control, or
reliability. In a DCN, an uncontrolled UDP flow can easily flood links and cause severe
congestion for other TCP flows.
Packet Loss Management: Applications using UDP must implement their own reliability
mechanisms if needed, or be designed to gracefully handle packet loss. In a DCN,
unexpected UDP packet loss needs to be investigated as it often points to an underlying
network bottleneck or misconfiguration.
Example: A real-time telemetry system within a data center might use UDP to send metrics
from thousands of servers to a central collector. If the collector or the network path to it
becomes congested, UDP packets will be dropped without any notification to the senders,
leading to incomplete or inaccurate data.
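Because UDP itself gives no notification of loss, applications such as the telemetry collector above often add their own sequence numbers. The sketch below assumes each sender numbers its datagrams consecutively. The function name and the sample data are hypothetical.

```python
def missing_sequences(received):
    """Return sequence numbers absent between the lowest and highest seen."""
    seen = set(received)
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


# Hypothetical sequence numbers that reached the collector from one sender.
arrived = [1, 2, 3, 5, 6, 9, 10]
lost = missing_sequences(arrived)
print(f"lost {len(lost)} of {max(arrived) - min(arrived) + 1}: {lost}")
```

The collector can then report a loss rate per sender, which turns silent drops into a visible signal of congestion on the path.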
Specialized TCP Variants: Use DCN-optimized TCP congestion control algorithms (like
DCTCP, TIMELY, L2DCTCP, etc.) that leverage explicit congestion notification (ECN) or
RTT measurements to react faster and more precisely to congestion.
Network Buffering: Fine-tune switch buffer sizes. While deep buffers can mask
congestion, shallow buffers can lead to premature packet loss and inefficient TCP
performance.
Traffic Management: Implement QoS (Quality of Service) and traffic shaping to prioritize
critical applications and prevent elephant flows from starving mice flows.
Load Balancing: Distribute traffic evenly across multiple paths and servers to avoid single
points of congestion.
In-Network Telemetry: Utilize network monitoring tools and in-band telemetry to gain
granular visibility into network state, queue depths, and RTTs, enabling proactive
identification and mitigation of congestion.
Flow Control Mechanisms: For lossless Ethernet, carefully configure and monitor
Priority-based Flow Control (PFC) to mitigate head-of-line blocking while preserving the
lossless property.
UDP Management: For UDP traffic, employ mechanisms like rate limiting, intelligent load
balancing, and application-level congestion awareness to prevent network saturation.
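Rate limiting, mentioned above for traffic shaping and UDP management, is commonly built on a token bucket. The sketch below is a minimal version in which the current time is passed in explicitly so that the behaviour is deterministic. The class name and parameters are assumptions for illustration.

```python
class TokenBucket:
    """Token-bucket limiter; time is passed in explicitly to keep the sketch deterministic."""

    def __init__(self, rate_per_s, capacity):
        self.rate = rate_per_s      # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now, cost=1):
        """Return True if a packet costing `cost` tokens may be sent at time `now`."""
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate_per_s=2, capacity=3)
# Five packets arrive at t=0, then one more at t=1 second.
decisions = [bucket.allow(0.0) for _ in range(5)] + [bucket.allow(1.0)]
print(decisions)
```

The capacity sets how large a burst is tolerated, while the rate sets the long-term average, so short bursts pass but a sustained flood is throttled.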
Addressing these transport layer challenges is critical for designing and operating
high-performance, low-latency, and reliable Data Center Networks that can effectively support
the demanding workloads of cloud computing.
Scalability refers to a system's ability to handle increasing workload or its potential to be enlarged
to accommodate such growth. In data centers and cloud environments, it's about expanding IT
resources (compute, storage, network) to meet growing demand without compromising
performance or incurring disproportionate costs.
1. Architectural Limitations:
o Traditional Three-Tier Networks: As discussed previously, the traditional
hierarchical (core-distribution-access) network architecture, while suitable for client-
server traffic, becomes a bottleneck for the dominant "east-west" (server-to-server)
traffic in virtualized and cloud-native environments. Oversubscription at higher tiers
limits horizontal scalability.
o Monolithic Systems: Relying on single, large, "scale-up" servers or storage arrays
eventually hits physical or cost limits. These systems are difficult to expand
incrementally and can become single points of failure.
o Legacy Systems Integration: Migrating and scaling older, on-premises applications
to the cloud can be challenging due to architectural incompatibilities, requiring
significant refactoring or complex integration solutions.
2. Increased Complexity:
o Management Overhead: As the number of virtual machines, containers, services,
and network devices grows, manual configuration and management become
unsustainable. Complexity grows rapidly with scale, leading to higher operational
costs and increased risk of human error.
o Monitoring and Troubleshooting: Identifying performance bottlenecks, security
threats, or failures in a massively scaled, distributed environment is extremely
challenging without sophisticated monitoring, logging, and analytics tools.
3. Resource Contention:
o "Noisy Neighbor" Syndrome: In multi-tenant cloud environments or highly
virtualized data centers, diverse workloads share underlying physical resources. A
resource-intensive application from one tenant (or department) can consume
excessive CPU, memory, or network I/O, negatively impacting the performance of
other co-located workloads.
o I/O Bottlenecks: Storage I/O, measured in input/output operations per second (IOPS) and
throughput, can become a bottleneck if the storage system cannot keep pace with the
demands of numerous concurrent applications.
4. Data Management and Consistency:
o Distributed Data Challenges: As applications scale horizontally across many
nodes, maintaining data consistency, managing distributed transactions, and
ensuring data integrity across multiple data stores (databases, caches, file systems)
becomes a significant architectural and operational challenge.
o Data Locality: Ensuring that compute resources are physically close to the data they
need to access is crucial for performance at scale, especially for big data analytics.
Poor data locality can lead to high network latency.
5. Cost Management:
o Unforeseen Cloud Costs: While cloud offers elasticity, improper resource
provisioning, lack of optimization, and "cloud sprawl" (unused or over-provisioned
resources) can lead to rapidly escalating and unpredictable monthly bills.
o Hardware Refresh Cycles: For on-premises data centers, scaling often means
significant capital expenditure on new hardware, which then requires power,
cooling, and space.
6. Human Expertise and Skills Gap:
o Talent Scarcity: Scaling modern, software-defined, and cloud-native infrastructure
requires specialized skills in areas like DevOps, SRE, network automation, and
cloud security, which can be hard to find and retain.
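One way to reason about the "noisy neighbor" problem from item 3 is max-min fair sharing. No tenant receives more than it asks for, and capacity left over by small tenants is divided evenly among those still wanting more. The sketch below is a minimal illustration with hypothetical tenant names and demands. It is not a description of how any particular cloud provider enforces isolation.

```python
def max_min_fair_share(capacity, demands):
    """Allocate capacity so no tenant exceeds its demand and leftover
    capacity is split evenly among tenants still wanting more."""
    allocation = {tenant: 0.0 for tenant in demands}
    remaining = dict(demands)
    while remaining and capacity > 1e-9:
        share = capacity / len(remaining)
        satisfied = {t: d for t, d in remaining.items() if d <= share}
        if not satisfied:
            # Everyone left wants more than an equal share: split evenly and stop.
            for tenant in remaining:
                allocation[tenant] += share
            break
        for tenant, demand in satisfied.items():
            allocation[tenant] += demand
            capacity -= demand
            del remaining[tenant]
    return allocation


# Hypothetical demands in Gbps on a shared 10 Gbps link.
print(max_min_fair_share(10, {"tenant-a": 2, "tenant-b": 3, "noisy": 20}))
```

The small tenants receive everything they asked for, and the heavy tenant is limited to what is left, so its demand cannot degrade the others.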
Platform as a Service (PaaS): This service provides an on-demand environment for developing,
testing, delivering, and managing software applications. The developer is responsible for the
application, and the PaaS vendor provides the ability to deploy and run it. With PaaS,
flexibility is reduced, but the management of the environment is taken care of by the cloud
vendor.
Software as a Service (SaaS): This service provides centrally hosted and managed software to
end users. It delivers software over the internet, on demand, and typically on a subscription
basis. Examples include Microsoft OneDrive, Dropbox, WordPress, Office 365, and Amazon Kindle.
SaaS minimizes the customer's operational cost because the provider runs and maintains the
software.
The first and most basic layer of cloud computing is Infrastructure as a Service (IaaS).
Infrastructure as a Service means that you rent IT infrastructure from a cloud provider, such as
Microsoft Azure or Amazon Web Services. This happens on a pay-as-you-go basis, meaning you
only pay for what you use.
Examples: Amazon Web Services (AWS) EC2, Google Compute Engine (GCE), Cisco Metapod,
GoGrid, and Rackspace.
It is a cloud computing offering where a vendor provides users access to resources such as storage,
data servers, and networking. This means organisations don’t need to handle that in-house.
Infrastructure as a Service consists of both hardware and network resources, such as servers and
storage, networking firewalls and security, and data centres. Organisations and businesses can
therefore run their own applications and platforms on the infrastructure delivered by a service
provider.
The second layer of the cloud is the platform – the PaaS (Platform as a service). This layer is a
development and deployment environment in the cloud and provides the resources to actually build
applications.
Just like IaaS, PaaS includes infrastructure, but it also includes development tools, database
management systems, middleware, business intelligence, and more. It is designed to support the
entire web application lifecycle—from building and testing to deployment, management and
updating.
The third cloud layer is the actual Software – the SaaS (Software as a service). This is the layer that
provides a complete software solution. Organisations rent the use of an app, and the users connect
to it via the internet, usually with a web browser.
In a cloud setting, SaaS is therefore the layer where the user consumes the offering from the service
provider. It must be web-based and accessible from everywhere and preferably on any device. The
service provider manages the hardware and software.
One type of SaaS is web-based email services such as Outlook, Gmail, and Hotmail. Here, the
email software is located on the service provider’s network–together with your messages.
The top layer of the cloud is BPO (Business Process Outsourcing). BPO refers to the process
in which a company outsources standard business functions to a third-party provider. This is often
done to save time and money by removing in-house administrative tasks.
This can be business functions such as accounting and payroll, customer service, and human
resource management. More and more companies are looking to outsource their non-core activities
to third-party service providers to save time and money using the cloud.
Microsoft Azure:
Microsoft Azure, often simply called Azure, is a cloud computing platform and online portal
provided by Microsoft. Launched in 2010, it's one of the leading cloud providers globally,
competing directly with Amazon Web Services (AWS) and Google Cloud Platform (GCP).
In brief, Azure offers a vast collection of on-demand cloud services that allow individuals,
businesses, and governments to build, deploy, manage, and scale applications and services without
having to buy and maintain their own physical hardware and data centers.
o Compute: Virtual Machines (Windows, Linux), Azure App Service (for web apps),
Azure Functions (serverless computing).
o Storage: Blob Storage (for unstructured data), Disk Storage (for VMs), File Storage
(shared file storage).
o Databases: Azure SQL Database (managed relational database), Azure Cosmos DB
(NoSQL database), Azure Database for MySQL/PostgreSQL.
o Networking: Virtual Network (VNet), Load Balancers, VPN Gateway, Azure DNS.
o AI + Machine Learning: Azure Machine Learning, Azure AI Services (pre-built AI
capabilities for vision, speech, language).
o Analytics: Azure Synapse Analytics, Azure Stream Analytics.
o IoT (Internet of Things): Azure IoT Hub.
o Developer Tools: Azure DevOps.
o Security & Identity: Microsoft Entra ID (formerly Azure Active Directory), Microsoft Defender for Cloud (formerly Azure Security Center).
Global Infrastructure: Azure has a vast global network of data centers, organized into "regions"
and "availability zones," providing high availability, disaster recovery capabilities, and low latency
for users worldwide.
Hybrid Cloud Capabilities: Azure is well-known for its strong support for hybrid cloud
environments, allowing businesses to seamlessly integrate their on-premises infrastructure with
Azure cloud services. This is particularly appealing to enterprises already heavily invested in
Microsoft technologies.
Pay-as-you-go Pricing: Users only pay for the services they consume, eliminating large upfront
costs and allowing for flexible scaling of resources based on demand.
Integration with Microsoft Ecosystem: A significant advantage for businesses already using
Microsoft products (Windows Server, SQL Server, Active Directory, .NET, etc.), as Azure offers
deep integration and familiar tools.
In essence, Microsoft Azure offers a powerful, flexible, and scalable set of cloud services that
enable organizations to move their IT infrastructure and applications to the cloud, innovate faster,
reduce operational costs, and enhance their global reach.
Security and compliance: IBM Cloud is known for its strong security features
and compliance certifications, particularly important for regulated industries.
Scalability and resilience: IBM Cloud offers scalability and resilience through its
hybrid cloud approach.
Services: These are intangible actions or activities performed for a customer, often involving
human interaction or expertise. They can be things like customer support, consulting, or
maintenance.
Features: These are specific characteristics or capabilities of a product or service that provide
value to the customer. They can be tangible, like the memory capacity of a computer, or
intangible, like the speed of a website.
Relationship: Features contribute to the overall value proposition of a service or product. A
service like a software subscription might have features like automatic updates, customer support,
and specific functionalities that enhance the user experience.
Pricing and Service-Level Agreements (SLAs): An SLA is a contract outlining the
specific services a provider will deliver and the standards they must meet. SLAs and
pricing are intrinsically linked, especially in service-based contracts.
An SLA defines the level of service a provider commits to deliver, and this commitment directly
impacts the pricing structure. Higher service levels (e.g., faster response times, greater uptime)
typically come with a higher price tag.
Conversely, lower service levels may result in lower costs, but with increased risk for the customer
regarding service quality and potential downtime.
Key Components: SLAs typically include metrics like uptime, response times, resolution
times, and other performance indicators.
Impact on Pricing: The level of service defined in the SLA directly influences the
price. For example, a service with guaranteed 99.999% uptime will likely cost more than
one with 99.9% uptime.
Example: If a cloud service provider offers different tiers of service with varying uptime
guarantees, the pricing will reflect those differences.
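To make the gap between uptime tiers concrete, the following minimal sketch converts an uptime percentage into the downtime it permits per year. The function name and the 365-day year are assumptions made for illustration; real SLAs usually measure uptime per month and define "downtime" in their own terms.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(uptime_percent, period_minutes=MINUTES_PER_YEAR):
    """Return the maximum downtime (in minutes) permitted by an uptime guarantee."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_minutes * (100.0 - uptime_percent) / 100.0


# Compare the tiers mentioned in the text (99.99% added for reference).
for tier in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(tier)
    print(f"{tier}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, 99.9% uptime allows roughly 8.8 hours of downtime per year, while 99.999% allows only about 5.3 minutes, which is why the higher tier needs far more redundancy and costs more.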
How SLAs Influence Pricing:
Response Time: Faster response times to support requests often mean more staff and
resources dedicated to support, which translates to a higher price.
Uptime Guarantees: Higher uptime percentages require robust infrastructure and
redundancy, increasing costs for the provider and, consequently, the price for the
customer.
Resolution Time: Guaranteed faster resolution times for issues can also lead to higher
pricing.
Scalability: SLAs might define how quickly the service can scale up or down, which can
affect pricing depending on the resources required.
Escalation Management: SLAs often include escalation procedures for when service
levels are not met. This ensures that issues are addressed promptly, but also adds
complexity and potential cost to the service.
Tiered Pricing: Many providers offer different pricing tiers based on the level of service
defined in the SLA. Customers choose the tier that best suits their needs and budget.
Usage-Based Pricing: Some SLAs might incorporate usage-based pricing, where the cost
is determined by how much of the service the customer uses. However, even in these
models, the SLA dictates the quality of the service provided at each usage level.
Performance-Based Pricing: In some cases, SLAs might include performance-based
pricing, where the price is adjusted based on whether the provider meets the agreed-upon
service levels. If the provider falls short, the customer might receive credits or discounts.
In essence, the SLA acts as a roadmap for both the provider and the customer, defining the
scope of the service, the expected performance, and the consequences of failing to meet those
expectations. This directly impacts the pricing structure, making it crucial to understand the
relationship between SLAs and pricing when negotiating a service contract.
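The performance-based pricing idea above can be sketched as a simple service-credit calculation. The credit schedule, thresholds, and function names below are hypothetical values chosen for illustration, not the terms of any real provider.

```python
# Hypothetical credit schedule: (minimum measured uptime %, credit as % of monthly fee).
# Ordered from best to worst; the first threshold that is met decides the credit.
CREDIT_SCHEDULE = [
    (99.9, 0.0),   # SLA met: no credit
    (99.0, 10.0),  # minor breach
    (95.0, 25.0),  # major breach
]
WORST_CASE_CREDIT = 50.0  # hypothetical credit when uptime is below every threshold


def measured_uptime_percent(total_minutes, downtime_minutes):
    """Uptime achieved in the billing period, as a percentage."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def service_credit(monthly_fee, uptime_percent):
    """Return the credit owed to the customer for the measured uptime."""
    for threshold, credit_percent in CREDIT_SCHEDULE:
        if uptime_percent >= threshold:
            return monthly_fee * credit_percent / 100.0
    return monthly_fee * WORST_CASE_CREDIT / 100.0


# Example: 90 minutes of downtime in a 30-day month on a 1000-unit plan.
uptime = measured_uptime_percent(30 * 24 * 60, 90)
print(f"measured uptime: {uptime:.3f}%")
print(f"credit owed: {service_credit(1000.0, uptime):.2f}")
```

In this example the measured uptime is about 99.79%, which misses the 99.9% target but stays above 99.0%, so the customer receives the 10% credit.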
Traditional data centers typically operate on a high capital expenditure (CapEx) model, requiring significant upfront investments for hardware and infrastructure setup, followed by ongoing operational costs (OpEx) for maintenance. In contrast, cloud data centers use a primarily operational expenditure (OpEx) model with a pay-as-you-go approach. This allows customers to scale resources on demand without large initial investments, as costs are tied to actual usage rather than to owning physical infrastructure. Cloud data centers therefore provide financial flexibility and can be more cost-effective for businesses with variable demand.
Challenges in implementing congestion control algorithms in DCNs include the complexity of deployment and interoperability. Many newer congestion control algorithms require modifications to network devices or operating systems, complicating their deployment in heterogeneous or multi-vendor environments. These algorithms often rely on Explicit Congestion Notification (ECN) or in-band telemetry, requiring coordinated configuration across the network. To address these challenges, cloud providers can adopt well-documented schemes such as DCTCP or TIMELY and verify their compatibility with existing infrastructure. Network buffering parameters should be tuned carefully, and comprehensive monitoring needs to be in place to manage network performance proactively.
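As an illustration of how an ECN-based scheme reacts to congestion, here is a simplified sketch of the DCTCP sender logic: the sender keeps a running estimate alpha of the fraction of ECN-marked packets and shrinks its window in proportion to it. The class and method names are hypothetical, windows are counted in packets, and slow start, loss handling, and timers are deliberately left out, so this is not a complete implementation.

```python
class DctcpSender:
    """Simplified DCTCP window logic, updated once per window of ACKs."""

    def __init__(self, cwnd=10.0, alpha=1.0, g=1.0 / 16.0):
        self.cwnd = cwnd    # congestion window, in packets
        self.alpha = alpha  # running estimate of the marked fraction
        self.g = g          # estimation gain (1/16 is a commonly used value)

    def on_window_acked(self, acked_packets, marked_packets):
        """Process one window's worth of ACKs and return the new cwnd."""
        fraction_marked = marked_packets / acked_packets
        # Exponentially weighted moving average of the marked fraction.
        self.alpha = (1.0 - self.g) * self.alpha + self.g * fraction_marked
        if marked_packets > 0:
            # Reduce in proportion to the extent of congestion,
            # instead of halving as classic TCP does.
            self.cwnd = max(1.0, self.cwnd * (1.0 - self.alpha / 2.0))
        else:
            # Simplified additive increase: one packet per round trip.
            self.cwnd += 1.0
        return self.cwnd


sender = DctcpSender(cwnd=100.0, alpha=0.0)
print(sender.on_window_acked(acked_packets=100, marked_packets=100))
print(sender.on_window_acked(acked_packets=100, marked_packets=0))
```

Because the reduction depends on alpha, a few marked packets cause only a small window cut, which helps keep switch queues short without sacrificing throughput. It also shows why deployment needs coordination: the switches must be configured to set ECN marks for the sender to have anything to react to.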
The main hardware components in a cloud data center's physical infrastructure include servers, storage systems, networking equipment, power infrastructure, and cooling systems. Servers are high-performance computers responsible for processing data and running applications, often equipped with powerful CPUs, RAM, and sometimes GPUs or specialized accelerators. Storage systems provide various solutions, including Direct-Attached Storage (DAS), Network-Attached Storage (NAS), Storage Area Networks (SAN), and Object Storage, each catering to different performance and cost requirements. Networking equipment, including routers, switches, and load balancers, ensures connectivity and performance across the infrastructure. Power infrastructure, such as Uninterruptible Power Supplies (UPS), generators, and Power Distribution Units (PDUs), provides reliable power. Finally, cooling systems, including HVAC and CRAC/CRAH units, maintain optimal temperatures for efficient operation.
Virtualization significantly improves resource utilization in cloud data centers by enabling the creation and management of multiple virtual machines (VMs) on a single physical server using hypervisor software. This allows hardware resources to be divided and allocated more efficiently, maximizing server capacity while offering flexibility for users to provision resources as needed. Virtualization facilitates multi-tenancy, which means multiple customers can share the same physical hardware while their workloads remain logically isolated from one another. This leads to higher utilization rates compared to traditional setups, where hardware may remain underutilized.
Spine-Leaf network architectures enhance scalability and performance in modern data centers by providing a two-tier design where every leaf switch is connected to every spine switch, so traffic between any two leaf switches crosses exactly one spine. This ensures predictable low latency and provides substantial east-west bandwidth, essential for the high communication demands of data center environments. The architecture supports Equal-Cost Multi-Path (ECMP) routing, which spreads flows across the multiple equal-cost paths, enhancing redundancy and fault tolerance. It allows for efficient traffic distribution and scaling, improving overall network throughput and reliability.
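The way ECMP spreads traffic can be sketched as hashing each flow's 5-tuple and using the result to choose one of the spine switches. The spine names, addresses, and the use of SHA-256 are assumptions for illustration only; real switches apply their own hardware hash functions.

```python
import hashlib
from collections import Counter


def ecmp_pick_spine(flow, spines):
    """Pick one spine for a flow by hashing its 5-tuple.

    Every packet of the same flow hashes to the same spine, so packets
    within a flow stay in order while different flows are spread out.
    """
    key = "|".join(str(field) for field in flow).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(spines)
    return spines[index]


# Hypothetical fabric with four spines; every leaf has an uplink to each one.
spines = ["spine1", "spine2", "spine3", "spine4"]

# 1000 flows between the same two hosts, differing only in source port.
flows = [("10.0.1.5", "10.0.2.9", 40000 + i, 443, "tcp") for i in range(1000)]
usage = Counter(ecmp_pick_spine(flow, spines) for flow in flows)
for spine in spines:
    print(spine, usage[spine])
```

The split is statistical, not exact: with many small flows the load is close to even, but a few very large flows can still land on the same path and congest it.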
Software-defined data centers (SDDCs) differ from traditional data centers by leveraging virtualization at every layer of the data center infrastructure, including computing, storage, and networking. In an SDDC, resources are abstracted and delivered as a service through software, enabling programmatic control and automation rather than relying on manual configuration. This provides numerous advantages, such as increased flexibility and scalability, faster deployment times, and reduced operational costs. SDDCs offer enhanced automation, allowing for dynamic provisioning and efficient resource management, which improves agility and responsiveness to changing business demands.
Geographic distribution of cloud data centers plays a crucial role in enhancing business continuity and compliance: providers place data centers in multiple regions and availability zones worldwide. This distribution lowers latency, as resources are closer to end-users, reducing communication delays. Moreover, it facilitates disaster recovery and ensures business continuity by enabling data and applications to be replicated across different locations, maintaining operations despite regional outages. Geographic distribution also supports compliance with data residency regulations by storing data within specific jurisdictions, thereby meeting various legal requirements.
The 'Noisy Neighbor' syndrome in cloud environments occurs when a resource-intensive tenant negatively affects the performance of other co-located workloads by consuming excessive shared resources such as CPU, memory, or network I/O. This leads to performance degradation for neighboring applications that share the same physical infrastructure. Mitigating this impact involves implementing strategies like resource isolation through advanced virtualization techniques, setting I/O bandwidth limits, ensuring fair resource allocation, and utilizing management tools to dynamically redistribute resources based on workload demands. Using containerization or microservices can also help by isolating application components from each other.
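One common way to enforce the I/O bandwidth limits mentioned above is a token bucket per tenant. The sketch below is a minimal, single-threaded illustration; the class name, the rates, and the explicit `now` timestamp argument are assumptions made so the example is deterministic, not a description of how any particular hypervisor implements limits.

```python
class TokenBucket:
    """Per-tenant rate limiter: sustained rate plus a bounded burst."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s  # long-term allowed throughput
        self.capacity = burst_bytes   # largest burst allowed at once
        self.tokens = burst_bytes     # start with a full bucket
        self.last = 0.0               # timestamp of the last refill (seconds)

    def allow(self, nbytes, now):
        """Return True if the tenant may send nbytes at time `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False  # over the limit: the caller would queue or drop the request


# Hypothetical tenant limited to 1000 bytes/s with a 2000-byte burst.
bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=2000)
print(bucket.allow(2000, now=0.0))  # burst fits in the full bucket
print(bucket.allow(1, now=0.0))     # bucket is empty, request is refused
print(bucket.allow(500, now=0.5))   # half a second refills 500 bytes
```

A noisy tenant can still use its burst allowance, but its sustained throughput is capped at the configured rate, leaving the remaining capacity for its neighbors.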
In cloud data centers, security responsibilities are shared between cloud providers and users. Cloud providers are responsible for securing the infrastructure of the cloud itself, which includes physical security, network security (e.g., firewalls, DDoS protection), and ensuring compliance with industry standards. Users, on the other hand, are responsible for securing the applications, data, and configurations they run in the cloud. This includes implementing access controls, encrypting data, and managing identity and access management (IAM). Both parties must collaborate to ensure robust security practices are followed.
Automation plays a central role in cloud data centers by streamlining the provisioning, scaling, monitoring, and management of resources. It reduces the need for manual intervention, allowing for rapid deployment of services and dynamic resource management through orchestration tools and APIs. This leads to increased efficiency, as resources can be automatically adjusted based on demand, ensuring optimal performance and utilization. Automation also minimizes human error, enhances consistency in operations, and allows cloud providers to handle large-scale environments with minimal staffing, thereby reducing operational costs.