CCS 335 CLOUD COMPUTING
UNIT IV
CLOUD DEPLOYMENT ENVIRONMENT
Q.1 What is Amazon Web Services?
Ans.: Amazon Web Services (AWS) is a collection of remote computing services
(web services) that together make up a cloud computing platform, offered over the Internet
by Amazon.com. It provides customers with a wide array of cloud services.
Q.2 What is the AWS ecosystem?
Ans.: The AWS ecosystem is made up of three subsystems:
1. AWS computing services provided by Amazon.
2. Computing services provided by third parties that operate on AWS.
3. Complete applications offered by third parties that run on AWS.
Q.3 What do you understand by third party cloud services?
Ans.: Third-party cloud services are built by composing services that belong to different
vendors, or by integrating those services into existing software systems. The service-oriented
model, which is the basis of cloud computing, facilitates this approach and provides the
opportunity to develop a new class of services, called third-party cloud services.
Q.4 What is Eucalyptus?
Ans. :
• Eucalyptus stands for Elastic Utility Computing Architecture for Linking Your
Programs to Useful Systems.
• It is an open-source software framework that provides the platform for private cloud
computing implementation on computer clusters.
• Eucalyptus implements Infrastructure as a Service (IaaS) methodology for solutions
in private and hybrid clouds.
• Eucalyptus provides a single interface through which users can work with both the
resources available in private clouds and the resources available externally in public cloud
services.
Q.5 List the features of Eucalyptus.
Ans.: Features include:
1. Supports both Linux and Windows Virtual Machines (VMs).
2. Application program interface (API) compatible with Amazon EC2.
3. Compatible with Amazon Web Services (AWS) and Simple Storage Service (S3).
4. Works with multiple hypervisors including VMware, Xen and KVM.
5. Can be installed and deployed from source code or from DEB and RPM packages.
Q.6 How is virtualization employed in Azure?
Ans.: An Azure virtual machine gives you the flexibility of virtualization without
having to buy and maintain the physical hardware that runs it. However, you still need to
maintain the virtual machine by performing tasks, such as configuring, patching, and
installing the software that runs on it.
Q.7 What is the AWS ecosystem?
Ans.: Amazon Web Services is a cloud computing service that makes it easy to build
scalable and reliable applications, websites, and services. It makes it easy for businesses to
develop, deploy and extend their software, as well as store data.
Q.8 What is Amazon Web Services?
Ans: Amazon Web Services (AWS) is a comprehensive and widely adopted cloud
computing platform offered by Amazon. It provides a broad range of on-demand
services over the internet, spanning:
Compute
Storage
Databases
Networking
Analytics
Security
Management Tools
Enterprise Applications
Q.9 What is the AWS ecosystem?
Ans: The AWS ecosystem encompasses a vast network of elements beyond just the
individual services offered by AWS itself. It's a dynamic and interconnected space made up
of several key components:
1. AWS Services
2. AWS Marketplace
3. Independent Software Vendors (ISVs)
4. Technology Partners
5. Consulting Partners
6. Developer Community
7. Open Source Projects
Q.10 What do you understand by third party cloud services?
Ans: Third-party cloud services refer to cloud computing resources and services
offered by companies other than major hyperscalers like Amazon Web Services (AWS),
Microsoft Azure, and Google Cloud Platform (GCP). These companies provide a diverse
range of cloud offerings, often specializing in specific areas or catering to niche markets.
Key characteristics:
Focus on specific areas
Vertical solutions
Platform-as-a-Service (PaaS)
Software-as-a-Service (SaaS)
Flexibility and agility
Competitive pricing
Security and compliance
Limited reach
Q.11 List the issues in parallel and distributed paradigms.
Ans: Both parallel and distributed paradigms offer compelling advantages for
large-scale computing, but they are not without their challenges. Some key issues in each
paradigm are:
Parallel Programming Issues:
1. Shared Memory Complexity
2. Debugging Difficulties
3. Limited Scalability
4. Algorithmic Suitability
Distributed Programming Issues:
1. Increased Communication Overhead
2. Fault Tolerance Complexity
3. Network Latency and Bandwidth
4. Security Concerns
5. Software and Hardware Heterogeneity
Q.12 Define SQL Azure.
Ans: SQL Azure, now known as Azure SQL Database, is a fully managed relational
database service provided by Microsoft Azure. It offers a cloud-based platform for hosting
and managing SQL Server databases, eliminating the need to set up and maintain physical
infrastructure.
Key Features:
Managed Service
Scalability
High Availability
Security
Compatibility
Multiple Deployment Options
Q.13 What is an Azure queue?
Ans: In Azure, a queue refers to the Azure Queue Storage service, which provides a
reliable and scalable way to store and retrieve messages asynchronously. Think of it as a
temporary holding area for messages exchanged between different parts of your application
or even different applications.
What Azure queues offer:
Store and retrieve messages
Decoupling applications
Handling workload spikes
Reliable message delivery
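The decoupling idea can be illustrated with a minimal sketch. It does not use the Azure SDK: it uses Python's standard in-process `queue.Queue` as a stand-in for a storage queue, and the function name `run_pipeline` and the `None` end-of-work marker are assumptions made for this example. The producer only enqueues messages; a separate consumer processes them at its own pace.

```python
import queue
import threading

def run_pipeline(messages):
    """Enqueue messages from a producer and process them in a consumer thread."""
    q = queue.Queue()   # in-process stand-in for a cloud message queue
    processed = []

    def consumer():
        while True:
            msg = q.get()          # blocks until a message is available
            if msg is None:        # hypothetical end-of-work marker
                break
            processed.append(msg.upper())

    worker = threading.Thread(target=consumer)
    worker.start()

    # The producer never calls the consumer directly; it only enqueues.
    for m in messages:
        q.put(m)
    q.put(None)

    worker.join()
    return processed

if __name__ == "__main__":
    print(run_pipeline(["order-1", "order-2", "order-3"]))
```

A real storage queue additionally persists messages outside the application's memory, so producer and consumer can run on different machines and survive restarts; this sketch shows only the asynchronous hand-off.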
Q.14 How is virtualization employed in Azure?
Ans: Virtualization plays a crucial role in Microsoft Azure, enabling its vast array of
cloud computing services. Here's a breakdown of how Azure employs virtualization across
different levels:
1. Server Virtualization
2. Network Virtualization
3. Storage Virtualization
4. Desktop Virtualization
Q.15 List the major features of Google App Engine. What kinds of problems can be
solved using GAE?
Ans:
Major Features of Google App Engine (GAE):
Automatic Scaling
Pay-per-use Pricing
Global and Highly Available
Secure and Reliable
Multiple Languages and Frameworks
Simple Deployment
Built-in Services
Serverless Options
Extensive Documentation and Support
Q.17 What is cloud analytics?
Ans: Cloud analytics refers to the process of storing, analyzing, and extracting
actionable insights from data using cloud computing technologies. Essentially, it leverages
the scalability, flexibility, and powerful processing capabilities of cloud platforms to unlock
the potential of your data.
Key features:
Data Storage
Data Processing
Benefits of Cloud Analytics:
Scalability and flexibility
Cost-effectiveness
Accessibility
Faster insights
Collaboration
Security
Q.18 Define MapReduce.
Ans: MapReduce is a programming model and framework for processing large
datasets in a distributed and parallel fashion. It's designed to handle massive computations
across clusters of computers, making it highly scalable and efficient.
How it works:
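Map Phase: each input split is processed independently by a map function, which emits
intermediate key-value pairs.
Shuffle Phase: the intermediate pairs are grouped by key, so that all values for a given key
reach the same reducer.
Reduce Phase: a reduce function combines the values for each key into the final output.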
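The three phases can be sketched with the classic word-count example. This is a single-process illustration in plain Python, not Hadoop code; the function names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are chosen for this example only. In a real cluster the map and reduce calls would run in parallel on different machines.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    pairs = []
    for doc in documents:
        for word in doc.lower().split():
            pairs.append((word, 1))
    return pairs

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)

def reduce_phase(groups):
    """Reduce: combine the values of each key, here by summing them."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(documents):
    return reduce_phase(shuffle_phase(map_phase(documents)))

if __name__ == "__main__":
    print(word_count(["the cat sat", "the dog sat"]))
```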
Q.19 Define iterative MapReduce.
Ans: Iterative MapReduce refers to a technique where multiple MapReduce jobs are
chained together in a sequence, with the output of one job becoming the input for the next
job. This approach allows for more complex data processing tasks that require multiple stages
of analysis or refinement.
How it works:
Initial MapReduce Job
Intermediate Results
Subsequent MapReduce Jobs
Each Iteration
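As an illustration of chaining jobs, the sketch below labels the connected components of a small graph. Each call to `one_job` is one map, shuffle and reduce pass, and its output is fed back in as the next job's input until the labels stop changing. The example problem, the function names and the `max_iterations` cap are assumptions for this sketch; it runs in a single process and only mimics the structure of an iterative job.

```python
from collections import defaultdict

def one_job(labels, edges):
    """One MapReduce pass: every node adopts the smallest label it can see."""
    # Map: each node emits its own label, and each edge sends labels both ways.
    emitted = [(node, label) for node, label in labels.items()]
    for a, b in edges:
        emitted.append((b, labels[a]))
        emitted.append((a, labels[b]))
    # Shuffle: group the emitted labels by destination node.
    groups = defaultdict(list)
    for node, label in emitted:
        groups[node].append(label)
    # Reduce: keep the smallest label for each node.
    return {node: min(candidates) for node, candidates in groups.items()}

def connected_components(nodes, edges, max_iterations=100):
    """Repeat the job until the output equals the input (convergence)."""
    labels = {n: n for n in nodes}          # initial job input
    for _ in range(max_iterations):
        new_labels = one_job(labels, edges)  # output of this iteration
        if new_labels == labels:
            break
        labels = new_labels                  # becomes input of the next job
    return labels

if __name__ == "__main__":
    print(connected_components([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 5)]))
```

The stopping test is the part that plain MapReduce lacks: a driver outside the jobs has to compare successive outputs and decide whether to launch another iteration.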
Q.20 Define HDFS.
Ans: HDFS (Hadoop Distributed File System) is a distributed file system designed to
run on commodity hardware. It excels at storing and managing large datasets across clusters
of computers, making it a vital component of the Apache Hadoop ecosystem for big data
processing.
Key features:
Architecture
Scalability and Performance
Q.21 List the characteristics of HDFS?
Ans:
1. Scalability: HDFS can be easily scaled horizontally by adding more nodes to the
cluster. This makes it ideal for storing and processing large datasets.
2. Fault Tolerance: HDFS replicates data across multiple nodes in the cluster. This
ensures that data is still available even if some nodes fail.
3. High Throughput: HDFS is designed for high-throughput data access. It favors
sustained streaming reads and writes of large files over low-latency access to individual
records.
4. Cost-Effectiveness: HDFS is designed to run on commodity hardware. This makes
it a cost-effective solution for storing and processing large datasets.
5. Large Files: HDFS is optimized for storing and processing large files. This makes
it a good choice for applications that deal with large datasets, such as log analysis, scientific
computing, and social media analysis.
6. Not Suitable for Small Files: HDFS is not well-suited for storing and processing
small files. This is because the overhead of storing and managing small files can be
significant.
7. Limited Support for Random Access: HDFS is designed for write-once,
read-many use cases. Files are not modified in place once written, so it is not well-suited
for applications that require random writes or low-latency random access to data.
8. Evolving Landscape: HDFS is a mature technology, but it is not the only option
for storing and processing large datasets. Newer tools and frameworks, such as Spark, offer
alternative paradigms that may be better suited for some applications.
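The small-files point (item 6) can be made concrete with a little arithmetic. The sketch below assumes a 128 MB block size, which is an assumption for this example (the value is configurable per cluster), and counts one namespace entry per file plus one per block. The function names are hypothetical.

```python
import math

BLOCK_SIZE_MB = 128  # assumed block size; configurable in a real cluster

def block_count(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Number of blocks needed to hold a file of the given size."""
    return math.ceil(file_size_mb / block_size_mb)

def namespace_objects(file_sizes_mb):
    """Entries the master must track: one per file plus one per block."""
    return sum(1 + block_count(size) for size in file_sizes_mb)

if __name__ == "__main__":
    one_large_file = [1024]           # a single 1 GB file
    many_small_files = [1] * 1024     # the same 1 GB as 1024 files of 1 MB
    print(namespace_objects(one_large_file))
    print(namespace_objects(many_small_files))
```

The same amount of data costs far more metadata when it is split into many small files, which is why the per-file overhead becomes significant.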
Q.22 Explain the HDFS operations.
Ans: Common HDFS operations:
1. File Creation
2. File Reading
3. File Writing
4. File Deletion
5. Replication
6. Block Management
7. Data Integrity
8. Namespace Management
9. Security
10. High Availability
11. Federation
Q.23 Define block replication.
Ans: Block replication is a fundamental concept in distributed file systems like HDFS
(Hadoop Distributed File System) that ensures data redundancy and fault tolerance. It
involves creating multiple copies of each data block and storing them on different nodes
within the cluster.
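A minimal sketch of the idea follows. It places each block on several distinct nodes and then checks which blocks remain readable after some nodes fail. The round-robin placement, the function names and the node names are assumptions for illustration; real HDFS uses a rack-aware placement policy, which this sketch does not attempt to reproduce.

```python
def place_replicas(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, block in enumerate(block_ids):
        # Start at a different node for each block and take the next ones in order.
        placement[block] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement

def readable_blocks(placement, failed_nodes):
    """Blocks that still have at least one replica on a live node."""
    return [
        block
        for block, holders in placement.items()
        if any(node not in failed_nodes for node in holders)
    ]

if __name__ == "__main__":
    nodes = ["n1", "n2", "n3", "n4"]
    placement = place_replicas(["b0", "b1"], nodes)
    print(placement)
    print(readable_blocks(placement, failed_nodes={"n1", "n2"}))
```

With three replicas per block, any two node failures leave every block readable; a block is lost only if all of its holders fail.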
Q.24 Define heartbeat in Hadoop. What are the advantages of heartbeat?
Ans: In the context of Hadoop, particularly the Hadoop Distributed File System (HDFS), a
heartbeat refers to a signal sent by DataNodes (worker nodes) to the NameNode (master
node) at regular intervals. This signal acts as a "pulse check" to verify the liveness and health
of the DataNodes within the cluster.
Key characteristics:
Frequency
Content
Purpose:
Monitor DataNode health
Update block locations
Trigger DataNode actions
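The liveness check can be sketched from the master's side. This is an illustrative Python sketch, not Hadoop code: the class `HeartbeatMonitor`, its method names and the 30-second timeout are assumptions for the example, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time of each worker node."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}   # node id -> time of most recent heartbeat

    def heartbeat(self, node_id, now):
        """Record that a node reported in at time `now` (seconds)."""
        self.last_seen[node_id] = now

    def dead_nodes(self, now):
        """Nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )

if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=30)  # illustrative timeout
    monitor.heartbeat("dn1", now=0)
    monitor.heartbeat("dn2", now=0)
    monitor.heartbeat("dn1", now=25)   # dn1 keeps reporting, dn2 goes silent
    print(monitor.dead_nodes(now=40))
```

Once a node is reported dead, the master can stop routing reads to it and schedule new copies of the blocks it held.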
Q.25 Define GFS.
Ans: GFS (Google File System) is a distributed file system designed by Google to
handle massive datasets across large clusters of commodity hardware. It was developed to
address the unique challenges of storing and processing huge amounts of data within
Google's search infrastructure.
Key features of GFS:
Scalability
Fault tolerance
High throughput
Append-only writes
Single master
Large blocks
Q.26 Define BigTable.
Ans: BigTable is a fully managed, wide-column, key-value NoSQL database
service offered by Google Cloud Platform. It is designed for scalability and high
performance, handling large amounts of data with low latency and high throughput.
Key features of BigTable:
Scalability
Flexibility
High Performance
Durability and Reliability
Cost-Effectiveness
Q.27 What is meant by NoSQL?
Ans: NoSQL, which stands for "not only SQL" or "non-relational", refers to a group
of database management systems that depart from the traditional relational model used in
widely deployed relational database management systems (RDBMS) like MySQL or Oracle.
While RDBMS excel at structured data and complex queries, NoSQL databases offer
different data storage and retrieval mechanisms tailored for specific needs.
Key characteristics of NoSQL:
Data Models:
Flexibility
Schema-less or Schema-flexible
Scalability and Performance:
Horizontal scaling
High Availability and Fault Tolerance
Q.28 Explain Google's distributed lock service.
Ans: Google's best-known distributed lock service is Chubby, described publicly in
2006. Google may also operate other internal lock services for specific purposes.
Key points about Chubby:
Purpose: Chubby provides coarse-grained locking and reliable storage for small files
in a loosely-coupled distributed system. It's not focused on high performance or frequent
locking scenarios.
Architecture: Chubby operates as a replicated service. A Chubby cell is a small set of
servers (typically five replicas) that elect a master among themselves, which keeps the
service available when individual servers fail.
Functionality:
Locking: Chubby allows clients to acquire and release locks on files, ensuring only
one client can modify a specific resource at a time.
File Storage: Chubby offers limited but reliable storage for small files, often used for
configuration information or event notifications.
Event Notification: Clients can register for notifications when data or locks
change, enabling coordination between distributed processes.
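The three functions above can be sketched with a toy, single-process lock service. This is not Chubby's API: the class `ToyLockService`, its methods, the event names and the example path are all assumptions for illustration, and the sketch leaves out replication entirely. It also requires the lock holder for writes, which is a simplification chosen for this example.

```python
class ToyLockService:
    """In-memory stand-in for a coarse-grained lock service."""

    def __init__(self):
        self.owners = {}     # path -> client currently holding the lock
        self.files = {}      # path -> small file contents
        self.watchers = {}   # path -> callbacks to notify on changes

    def acquire(self, path, client):
        """Grant the lock if it is free; return whether it was granted."""
        if path in self.owners:
            return False
        self.owners[path] = client
        return True

    def release(self, path, client):
        """Release the lock if `client` holds it, then notify watchers."""
        if self.owners.get(path) != client:
            return False
        del self.owners[path]
        self._notify(path, "lock-released")
        return True

    def write(self, path, client, data):
        """Store a small file; only the lock holder may write in this sketch."""
        if self.owners.get(path) != client:
            raise PermissionError("client does not hold the lock")
        self.files[path] = data
        self._notify(path, "file-modified")

    def subscribe(self, path, callback):
        """Register a callback for events on `path`."""
        self.watchers.setdefault(path, []).append(callback)

    def _notify(self, path, event):
        for callback in self.watchers.get(path, []):
            callback(path, event)

if __name__ == "__main__":
    service = ToyLockService()
    path = "/example/master"   # hypothetical lock file name
    service.subscribe(path, lambda p, e: print(p, e))
    print(service.acquire(path, "server-a"))   # first client wins the lock
    print(service.acquire(path, "server-b"))   # second client is refused
    service.write(path, "server-a", "server-a")
    service.release(path, "server-a")
```

A typical use of this pattern is electing a primary: whichever server acquires the lock writes its identity into the file, and the others watch the file for changes.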
Q.29 Define Simple Storage Service (S3).
Ans: Amazon Simple Storage Service (S3) is a cloud storage service offered by
Amazon Web Services (AWS) that provides object storage through a web service interface. It
serves as a reliable and scalable platform for storing and managing data of any type and size,
from a few bytes to petabytes.
Key features of S3:
Scalability
Durability and Reliability
Security
Cost-effectiveness
Flexibility
Simplicity
Q.30 Define Elastic Block Store (EBS).
Ans: Amazon Elastic Block Store (EBS) is a block storage service offered by Amazon
Web Services (AWS) that provides persistent block-level storage for use with Amazon
Elastic Compute Cloud (EC2) instances. Essentially, it acts as a virtual hard drive for your
cloud-based servers, allowing you to store data independently of the running instance itself.
Key features of EBS:
Persistent Storage
Scalability
High Performance
Durability and Reliability
Flexibility
Security