CC - Module-1 Question With Answers

The document discusses the evolution of computing paradigms, focusing on distributed systems, cloud computing, and high-throughput computing (HTC). It highlights the transition from centralized computing to parallel and distributed computing, emphasizing the importance of scalability, efficiency, and reliability in modern computing architectures. Additionally, it covers the architecture of multicore processors, clusters, and computational grids, illustrating how these technologies enhance computational performance and address the demands of the Internet of Things (IoT).

Uploaded by

tejashwini.genai

Cloud Computing and Security Question Bank (BIS613D)

Module 1: Distributed System Models and Enabling Technologies

1. Describe the evolution of parallel, distributed, and cloud computing over the
past 30 years with labeled diagram

The general computing trend is to leverage shared web resources and massive amounts of data
over the Internet. The figure illustrates the evolution of HPC and HTC systems. On the HPC
side, supercomputers (massively parallel processors, or MPPs) are gradually being replaced by clusters of
cooperative computers out of a desire to share computing resources.
The cluster is often a collection of homogeneous compute nodes that are physically connected in
close range to one another. On the HTC side, peer-to-peer (P2P) networks are formed for
distributed file sharing and content delivery applications.
A P2P system is built over many client machines. Peer machines are globally distributed in nature.
P2P, cloud computing, and web service platforms are more focused on HTC applications than on
HPC applications.
Clustering and P2P technologies lead to the development of computational grids or data grids.
High-Throughput Computing
The development of market-oriented high-end computing systems is undergoing a strategic change
from an HPC paradigm to an HTC paradigm. This HTC paradigm pays more attention to high-flux
computing. The main application for high-flux computing is in Internet searches and web services
by millions or more users simultaneously. The performance goal thus shifts to measure high
throughput or the number of tasks completed per unit of time. HTC technology needs to not only
improve in terms of batch processing speed, but also address the acute problems of cost, energy
savings, security, and reliability at many data and enterprise computing centers.
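The throughput goal described above (tasks completed per unit of time) can be made concrete with a small worked calculation. This is a minimal sketch: the function name `throughput` and all the numbers are assumptions chosen for illustration, not measurements from any real system.

```python
def throughput(tasks_completed: int, elapsed_seconds: float) -> float:
    """Return tasks completed per second over a measurement window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return tasks_completed / elapsed_seconds


# Hypothetical numbers: one fast machine versus a farm of slower machines.
single_fast_node = throughput(tasks_completed=1_000, elapsed_seconds=10.0)
server_farm = throughput(tasks_completed=50_000, elapsed_seconds=100.0)

print(f"single node: {single_fast_node} tasks/s")
print(f"server farm: {server_farm} tasks/s")
```

In this made-up case the farm takes longer to finish its batch, yet it completes more tasks per second, which is the quantity an HTC system is designed to maximize.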
Three New Computing Paradigms
With the introduction of SOA, Web 2.0 services have become available. Advances in virtualization make
it possible to see the growth of Internet clouds as a new computing paradigm.
The maturity of radio-frequency identification (RFID), Global Positioning System (GPS), and sensor
technologies has triggered the development of the Internet of Things (IoT).
• Centralized computing: This is a computing paradigm by which all computer resources are
centralized in one physical system. All resources (processors, memory, and storage) are fully
shared and tightly coupled within one integrated OS. Many data centers and supercomputers are
centralized systems, but they are used in parallel, distributed, and cloud computing applications.

prepared by Dr. Rekha P M, Professor,Dept. of ISE,JSS

• Parallel computing: In parallel computing, all processors are either tightly coupled with
centralized shared memory or loosely coupled with distributed memory. Some authors refer to this
discipline as parallel processing. Interprocessor communication is accomplished through shared
memory or via message passing. A computer system capable of parallel computing is commonly
known as a parallel computer. Programs running in a parallel computer are called parallel
programs. The process of writing parallel programs is often referred to as parallel programming.
• Distributed computing: This is a field of computer science/engineering that studies distributed
systems. A distributed system consists of multiple autonomous computers, each having its own
private memory, communicating through a computer network. Information exchange in a
distributed system is accomplished through message passing. A computer program that runs in a
distributed system is known as a distributed program. The process of writing distributed programs
is referred to as distributed programming.
• Cloud computing: An Internet cloud of resources can be either a centralized or a distributed
computing system. The cloud applies parallel or distributed computing, or both. Clouds can be built
with physical or virtualized resources over large data centers that are centralized or distributed.
Some authors consider cloud computing to be a form of utility computing or service computing.
As an alternative to the preceding terms, some in the high-tech community prefer the term
concurrent computing or concurrent programming.
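The two communication styles named in the definitions above, shared memory and message passing, can be contrasted in a few lines. The sketch below is an illustration only: it uses Python threads on one machine as stand-ins for processors or nodes, and the function names `sum_shared_memory` and `sum_message_passing` are hypothetical.

```python
import queue
import threading


def sum_shared_memory(chunks):
    """Workers add into one shared total, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:  # shared state needs mutual exclusion
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def sum_message_passing(chunks):
    """Workers keep private state and send partial sums as messages."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))  # the only interaction is a sent message

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The coordinator receives one message per worker and combines them.
    return sum(mailbox.get() for _ in chunks)


chunks = [[1, 2, 3], [4, 5], [6]]
print(sum_shared_memory(chunks), sum_message_passing(chunks))
```

In a real distributed system the queue would be replaced by network messages between machines with private memory; the structure of the program (no shared variables, explicit send and receive) stays the same.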
Distributed System Families
Since the mid-1990s, technologies for building P2P networks and networks of clusters have been
consolidated into many national projects designed to establish wide area computing
infrastructures, known as computational grids or data grids.
Future HPC and HTC systems must be able to satisfy this huge demand in computing power in
terms of throughput, efficiency, scalability, and reliability.

Meeting these goals requires the following design objectives:


• Efficiency measures the utilization rate of resources in an execution model by exploiting massive
parallelism in HPC. For HTC, efficiency is more closely related to job throughput, data access,
storage, and power efficiency.
• Dependability measures the reliability and self-management from the chip to the system and
application levels. The purpose is to provide high-throughput service with Quality of Service (QoS)
assurance, even under failure conditions.
• Adaptation in the programming model measures the ability to support billions of job requests
over massive data sets and virtualized cloud resources under various workload and service models.
• Flexibility in application deployment measures the ability of distributed systems to run well in
both HPC (science and engineering) and HTC (business) applications.

2. Explain scalable computing over the internet and how it enhances computational
performance. Discuss key technologies supporting scalability

Degrees of Parallelism
In this scenario, bit-level parallelism (BLP) converts bit-serial processing to word-level processing
gradually. Over the years, users graduated from 4-bit microprocessors to 8-, 16-, 32-, and 64-bit
CPUs. This led us to the next wave of improvement, known as instruction-level parallelism (ILP), in
which the processor executes multiple instructions simultaneously rather than only one instruction
at a time. For the past 30 years, we have practiced ILP through pipelining, superscalar computing,
VLIW (very long instruction word) architectures, and multithreading. ILP requires branch
prediction, dynamic scheduling, speculation, and compiler support to work efficiently.
Data-level parallelism (DLP) was made popular through SIMD (single instruction, multiple data)
and vector machines using vector or array types of instructions. DLP requires even more hardware
support and compiler assistance to work properly. Ever since the introduction of multicore
processors and chip multiprocessors (CMPs), we have been exploring task-level parallelism (TLP).
A modern processor explores all of the aforementioned parallelism types.
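The difference between the data-level and task-level patterns described above can be sketched at the program level. This is only an analogy under stated assumptions: a Python list comprehension is not a SIMD instruction, and the helper names `scale` and `run_tasks` are hypothetical. The point is the shape of the work, one operation over many data items versus several independent tasks.

```python
from concurrent.futures import ThreadPoolExecutor


def scale(values, factor):
    """Data-level pattern: the same operation applied to every element."""
    return [v * factor for v in values]


def run_tasks(tasks):
    """Task-level pattern: independent, possibly different, tasks run concurrently."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(fn, *args) for fn, args in tasks]
        return [f.result() for f in futures]  # results in submission order


data = [1, 2, 3, 4]
scaled = scale(data, 10)
results = run_tasks([(sum, (data,)), (max, (data,)), (sorted, (data,))])
print(scaled, results)
```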
Innovative Applications
Both HPC and HTC systems desire transparency in many application aspects. For example, data
access, resource allocation, process location, concurrency in execution, job replication, and failure
recovery should be made transparent to both users and system management.

The Trend toward Utility Computing


The figure identifies major computing paradigms to facilitate the study of distributed systems and
their applications. These paradigms share some common characteristics.

• First, they are all ubiquitous in daily life. Reliability and scalability are two major design
objectives in these computing models.
• Second, they are aimed at autonomic operations that can be self-organized to support
dynamic discovery.
• Finally, these paradigms are composable with QoS and SLAs (service-level agreements).
• Utility computing focuses on a business model in which customers receive computing
resources from a paid service provider. All grid/cloud platforms are regarded as utility
service providers.
The Hype Cycle of New Technologies
Any new and emerging computing and information technology may go through a hype cycle, as
illustrated in Figure 1.3. This cycle shows the expectations for the technology at five different
stages.

• The expectations rise sharply from the trigger period to a high peak of inflated expectations.
Through a short period of disillusionment, the expectation may drop to a valley and then
increase steadily over a long enlightenment period to a plateau of productivity.
• The number of years for an emerging technology to reach a certain stage is marked by
special symbols.
• The hollow circles indicate technologies that will reach mainstream adoption in two years.
The gray circles represent technologies that will reach mainstream adoption in two to five
years.
• The solid circles represent those that require five to 10 years to reach mainstream adoption,
and the triangles denote those that require more than 10 years.
• The crossed circles represent technologies that will become obsolete before they reach the
plateau.
The hype cycle in Figure 1.3 shows the technology status as of August 2010.

The Internet of Things

• The IoT refers to the networked interconnection of everyday objects, tools, devices, or
computers. One can view the IoT as a wireless network of sensors that interconnect all
things in our daily life. These things can be large or small and they vary with respect to time
and place. The idea is to tag every object using RFID or a related sensor or electronic
technology such as GPS.
• With the introduction of the IPv6 protocol, 2^128 IP addresses are available to distinguish all
the objects on Earth, including all computers and pervasive devices.
• The IoT demands universal addressability of all of the objects or things.
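IPv6 addresses are 128 bits long, so the address count referred to above is 2 raised to the power 128. A short check of the arithmetic, using only Python's standard `ipaddress` module (the variable names are chosen for illustration):

```python
import ipaddress

ipv4_total = 2 ** 32    # 32-bit IPv4 address space
ipv6_total = 2 ** 128   # 128-bit IPv6 address space

# The standard library agrees: the all-addresses networks have these sizes.
assert ipaddress.ip_network("0.0.0.0/0").num_addresses == ipv4_total
assert ipaddress.ip_network("::/0").num_addresses == ipv6_total

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:,}")
print(f"IPv6 is 2**{128 - 32} times larger than IPv4")
```

The size of this space is what makes it plausible to give every tagged object its own address.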

Cyber-Physical Systems

• A cyber-physical system (CPS) is the result of interaction between computational processes
and the physical world.
• A CPS integrates “cyber” (heterogeneous, asynchronous) with “physical” (concurrent and
information-dense) objects.
• A CPS merges the “3C” technologies of computation, communication, and control into an
intelligent closed feedback system between the physical world and the information world, a
concept which is actively explored in the United States.

3. Discuss the fundamental components and working principles of a modern
multicore processor with a neat diagram

Both multi-core CPU and many-core GPU processors can handle multiple instruction threads at
different magnitudes today.

The figure shows the architecture of a typical multicore processor. Each core is essentially a processor
with its own private cache (L1 cache).

• Multiple cores are housed in the same chip with an L2 cache that is shared by all cores. In
the future, multiple CMPs could be built on the same CPU chip with even the L3 cache on the
chip. Many high-end processors are multicore and multithreaded,
including the Intel i7, Xeon, AMD Opteron, Sun Niagara, IBM Power 6, and X cell processors.
• Each core could also be multithreaded.


Multicore CPU and Many-Core GPU Architectures
Multicore CPUs may increase from the tens of cores to hundreds or more in the future. But the CPU
has reached its limit in terms of exploiting massive DLP due to the memory wall problem.
This has triggered the development of many-core GPUs with hundreds or more thin cores. Both
IA-32 and IA-64 instruction set architectures are built into commercial CPUs. x-86
processors have been extended to serve HPC and HTC systems in some high-end server processors.
Many RISC processors have been replaced with multicore x-86 processors and many-core GPUs in
the Top 500 systems.
This trend indicates that x-86 upgrades will dominate in data centers and supercomputers. The
GPU also has been applied in large clusters to build supercomputers in MPPs.
The processor industry is also keen to develop asymmetric or heterogeneous chip
multiprocessors that can house both fat CPU cores and thin GPU cores on the same chip.
Multithreading Technology
The figure shows the dispatch of five independent threads of instructions to four pipelined data paths
(functional units) in each of the following five processor categories, from left to right: a four-issue
superscalar processor, a fine-grain multithreaded processor, a coarse-grain multithreaded
processor, a two-core CMP, and a simultaneous multithreaded (SMT) processor.

• The superscalar processor is single-threaded with four functional units. Each of the three
multithreaded processors is four-way multithreaded over four functional data paths.
• In the dual-core processor, assume two processing cores, each a single-threaded two-way
superscalar processor.
• Instructions from the five independent threads are distinguished by specific shading patterns.
• Typical instruction scheduling patterns are shown here. Only instructions from the same
thread are executed in a superscalar processor.
• Fine-grain multithreading switches the execution of instructions from different threads per
cycle.
• Coarse-grain multithreading executes many instructions from the same thread for quite a
few cycles before switching to another thread.
• The multicore CMP executes instructions from different threads on separate cores.
• The SMT allows simultaneous scheduling of instructions from different threads in the same
cycle.
• These execution patterns closely mimic an ordinary program. The blank squares correspond
to no available instructions for an instruction data path at a particular processor cycle. More
blank cells imply lower scheduling efficiency.
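The scheduling patterns described above can be compared with a toy model that counts blank issue slots. This is a sketch under stated assumptions, not a processor simulator: the `ready` table (how many instructions each thread could issue in each cycle), the policy names, and the two-cycle `quantum` are all hypothetical, and instructions that are not issued in a cycle are simply ignored rather than carried over.

```python
WIDTH = 4  # issue slots (functional units) per cycle, as in the description above


def issued(policy, ready, quantum=2):
    """Count issue slots filled over all cycles under a toy scheduling policy.

    ready[t][c] is a made-up count of instructions thread t could issue in cycle c.
    """
    n_threads, n_cycles = len(ready), len(ready[0])
    total = 0
    for c in range(n_cycles):
        if policy == "superscalar":    # single-threaded: only thread 0 issues
            total += min(WIDTH, ready[0][c])
        elif policy == "fine":         # switch to the next thread every cycle
            total += min(WIDTH, ready[c % n_threads][c])
        elif policy == "coarse":       # stay on one thread for `quantum` cycles
            total += min(WIDTH, ready[(c // quantum) % n_threads][c])
        elif policy == "cmp":          # two 2-wide cores, one thread per core
            half = WIDTH // 2
            total += min(half, ready[0][c]) + min(half, ready[1][c])
        elif policy == "smt":          # mix several threads within one cycle
            total += min(WIDTH, sum(ready[t][c] for t in range(n_threads)))
        else:
            raise ValueError(f"unknown policy: {policy}")
    return total


def blank_slots(policy, ready, quantum=2):
    """Blank cells = total slots minus filled slots; more blanks, lower efficiency."""
    return WIDTH * len(ready[0]) - issued(policy, ready, quantum)


ready = [
    [2, 1, 0, 3],  # thread 0
    [1, 2, 2, 0],  # thread 1
    [0, 1, 3, 1],  # thread 2
    [3, 0, 1, 2],  # thread 3
]
for policy in ("superscalar", "fine", "coarse", "cmp", "smt"):
    print(policy, blank_slots(policy, ready))
```

With this particular table the SMT policy leaves the fewest blank slots because it can draw on every thread in every cycle; a different `ready` table would give different counts.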

The Cloud Landscape


Traditionally, a distributed computing system tends to be owned and operated by an autonomous
administrative domain (e.g., a research laboratory or company) for on-premises computing needs.
However, these traditional systems have encountered several performance bottlenecks: constant
system maintenance, poor utilization, and increasing costs associated with hardware/software
upgrades.
Cloud computing as an on-demand computing paradigm resolves or relieves us from these
problems. The figure depicts the cloud landscape and major cloud players, based on three cloud
service models.

The Internet of Things (IoT) refers to a network of physical devices embedded with sensors,
software, and connectivity, allowing them to collect and exchange data over the Internet.
Its key features include:
• Sensor-based data collection
• Real-time monitoring
• Device connectivity and communication
• Automation and intelligent decision-making

4. Explain the basic cluster architecture with neat diagram

A computing cluster consists of interconnected stand-alone computers which work cooperatively as
a single integrated computing resource. In the past, clustered computer systems have
demonstrated impressive results in handling heavy workloads with large data sets.
The figure shows the architecture of a typical server cluster built around a low-latency, high-bandwidth
interconnection network. This network can be as simple as a SAN (e.g., Myrinet) or a LAN (e.g.,
Ethernet).
To build a larger cluster with more nodes, the interconnection network can be built with multiple
levels of Gigabit Ethernet, Myrinet, or InfiniBand switches. Through hierarchical construction using
a SAN, LAN, or WAN, one can build scalable clusters with an increasing number of nodes.
The cluster is connected to the Internet via a virtual private network (VPN) gateway. The gateway
IP address locates the cluster.
The system image of a computer is decided by the way the OS manages the shared cluster
resources. Most clusters have loosely coupled node computers. All resources of a server node are
managed by its own OS. Thus, most clusters have multiple system images as a result of having
many autonomous nodes under different OS control.

Cluster designers desire a cluster operating system or some middleware to support a single-system image (SSI) at various
levels, including the sharing of CPUs, memory, and I/O across all cluster nodes. An SSI is an illusion
created by software or hardware that presents a collection of resources as one integrated, powerful
resource. SSI makes the cluster appear like a single machine to the user. A cluster with multiple
system images is nothing but a collection of independent computers.
Special cluster middleware support is needed to create SSI or high availability (HA). Both
sequential and parallel applications can run on the cluster, and special parallel environments are
needed to facilitate use of the cluster resources. For example, distributed memory has multiple
images. Users may want all distributed memory to be shared by all servers by forming distributed
shared memory (DSM).


5. Illustrate and explain the architecture of a computational grid. How does it
improve efficiency in large-scale computations?

Like an electric utility power grid, a computing grid offers an infrastructure that couples
computers, software/middleware, special instruments, and people and sensors together. The grid
is often constructed across LAN, WAN, or Internet backbone networks at a regional, national, or
global scale.
Enterprises or organizations present grids as integrated computing resources. They can also be
viewed as virtual platforms to support virtual organizations. The computers used in a grid are
primarily workstations, servers, clusters, and supercomputers. Personal computers, laptops, and
PDAs can be used as access devices to a grid system.
Figure shows an example computational grid built over multiple resource sites owned by
different organizations. The resource sites offer complementary computing resources, including
workstations, large servers, a mesh of processors, and Linux clusters to satisfy a chain of
computational needs. The grid is built across various IP broadband networks including LANs and
WANs already used by enterprises or organizations over the Internet. The grid is presented to
users as an integrated resource pool as shown in the upper half of the figure.

6. Discuss system attacks and threats to cyberspace resulting in 4 types of losses
with neat diagram

Threats to Systems and Networks


Network viruses have threatened many users in widespread attacks. These incidents have created
a worm epidemic by pulling down many routers and servers, and are responsible for the loss of
billions of dollars in business, government, and services.

Figure summarizes various attack types and their potential damage to users. As the figure shows,
information leaks lead to a loss of confidentiality.
Loss of data integrity may be caused by user alteration, Trojan horses, and service spoofing attacks.
A denial of service (DoS) results in a loss of system operation and Internet connections.
Lack of authentication or authorization leads to attackers’ illegitimate use of computing resources.
Open resources such as data centers, P2P networks, and grid and cloud infrastructures could
become the next targets. Users need to protect clusters, grids, clouds, and P2P systems.
Otherwise, users should not use or trust them for outsourced work.
Malicious intrusions to these systems may destroy valuable hosts, as well as network and storage
resources. Internet anomalies found in routers, gateways, and distributed hosts may hinder the
acceptance of these public-resource computing services.
Security Responsibilities
Three security requirements are often considered: confidentiality, integrity, and availability for
most Internet service providers and cloud users.
In the order of SaaS, PaaS, and IaaS, the providers gradually release the responsibility of security
control to the cloud users.
Copyright Protection
Collusive piracy is the main source of intellectual property violations within the boundary of a P2P
network. Paid clients (colluders) may illegally share copyrighted content files with unpaid clients
(pirates). Online piracy has hindered the use of open P2P networks for commercial content
delivery. One can develop a proactive content poisoning scheme to stop colluders and pirates from
alleged copyright infringements in P2P file sharing. Pirates are detected in a timely manner with
identity-based signatures and timestamped tokens. This scheme stops collusive piracy from
occurring without hurting legitimate P2P clients.
System Defense Technologies

Three generations of network defense technologies have appeared in the past. In the first
generation, tools were designed to prevent or avoid intrusions. These tools usually manifested
themselves as access control policies or tokens, cryptographic systems, and so forth.
However, an intruder could always penetrate a secure system because there is always a weak link
in the security provisioning process. The second generation detected intrusions in a timely manner
to exercise remedial actions.
These techniques included firewalls, intrusion detection systems (IDSes), PKI services, reputation
systems, and so on. The third generation provides more intelligent responses to intrusions.
Data Protection Infrastructure
Security infrastructure is required to safeguard web and cloud services. At the user level, one needs
to perform trust negotiation and reputation aggregation over all users.
Security responsibilities are divided between cloud providers and users differently for the three
cloud service models. The providers are totally responsible for platform availability. The IaaS users
are more responsible for the confidentiality issue. The IaaS providers are more responsible for data
integrity. In PaaS and SaaS services, providers and users are equally responsible for preserving
data integrity and confidentiality.

7. Write short notes on peer-to-peer network families.

An example of a well-established distributed system is the client-server architecture. In this
scenario, client machines (PCs and workstations) are connected to a central server for compute,
e-mail, file access, and database applications. The P2P architecture offers a distributed model of
networked systems. A P2P network is client-oriented instead of server-oriented. In this
section, P2P systems are introduced at the physical level and overlay networks at the logical level.
P2P Systems
In a P2P system, every node acts as both a client and a server, providing part of the system
resources. Peer machines are simply client computers connected to the Internet. All client
machines act autonomously to join or leave the system freely. This implies that no master-slave
relationship exists among the peers. No central coordination or central database is needed. In other
words, no peer machine has a global view of the entire P2P system. The system is self-organizing
with distributed control. The figure shows the architecture of a P2P network at two abstraction levels.
Initially, the peers are totally unrelated. Each peer machine joins or leaves the P2P network
voluntarily. Only the participating peers form the physical network at any time. Unlike the cluster
or grid, a P2P network does not use a dedicated interconnection network. The physical network is
simply an ad hoc network formed at various Internet domains randomly using the TCP/IP and NAI
protocols. Thus, the physical network varies in size and topology dynamically due to the free
membership in the P2P network.
Overlay Networks
Data items or files are distributed in the participating peers. Based on communication or
file-sharing needs, the peer IDs form an overlay network at the logical level. This overlay is a virtual
network formed by mapping each physical machine with its ID, logically, through a virtual mapping
as shown in the figure.
When a new peer joins the system, its peer ID is added as a node in the overlay network.
When an existing peer leaves the system, its peer ID is removed from the overlay network
automatically. Therefore, it is the P2P overlay network that characterizes the logical connectivity
among the peers.
There are two types of overlay networks: unstructured and structured. An unstructured overlay
network is characterized by a random graph. There is no fixed route to send messages or files
among the nodes. Often, flooding is applied to send a query to all nodes in an unstructured overlay,
thus resulting in heavy network traffic and nondeterministic search results. Structured overlay
networks follow certain connectivity topology and rules for inserting and removing nodes (peer
IDs) from the overlay graph. Routing mechanisms are developed to take advantage of the
structured overlays.
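The contrast between the two overlay types can be sketched as follows. Both halves are illustrations under stated assumptions: the four-peer graph, the peer names, and the function names are hypothetical, and the structured half uses a consistent-hashing-style ring as one possible connectivity rule, which the text above does not name specifically.

```python
import hashlib
from bisect import bisect_left
from collections import deque


def flood_search(graph, start, has_item, ttl):
    """Unstructured overlay: flood a query up to `ttl` hops from `start`.

    Returns (peers that hold the item, number of messages sent).
    """
    seen = {start}
    frontier = deque([(start, ttl)])
    hits, messages = [], 0
    while frontier:
        peer, hops_left = frontier.popleft()
        if has_item(peer):
            hits.append(peer)
        if hops_left == 0:
            continue
        for neighbour in graph[peer]:
            messages += 1  # every forwarded copy costs a message
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, hops_left - 1))
    return hits, messages


def ring_id(name, bits=16):
    """Map a peer name or a key onto a small identifier ring."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest[:4], "big") % (2 ** bits)


def responsible_peer(peer_names, key):
    """Structured overlay: the key belongs to the first peer ID at or after it."""
    ring = sorted((ring_id(p), p) for p in peer_names)
    ids = [i for i, _ in ring]
    idx = bisect_left(ids, ring_id(key)) % len(ring)  # wrap around the ring
    return ring[idx][1]


graph = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B", "D"], "D": ["B", "C"]}
print(flood_search(graph, "A", lambda p: p == "D", ttl=2))
print(responsible_peer(list(graph), "song.mp3"))
```

Flooding finds the item only if it lies within the hop limit and sends many redundant messages along the way, whereas the ring rule sends every query for the same key to the same peer regardless of where the query starts.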
P2P Application Families
Based on application, P2P networks are classified into four groups, as shown in Table 1.5. The first
family is for distributed file sharing of digital contents (music, videos, etc.) on the P2P network.
P2P Computing Challenges
P2P computing faces three types of heterogeneity problems in hardware, software, and network
requirements.


8. Explore Parallel and Distributed Programming Models and Tool Sets for
distributed computing

Message-Passing Interface (MPI)

• This is the primary programming standard used to develop parallel and concurrent
programs to run on a distributed system.
• MPI is essentially a library of subprograms that can be called from C or FORTRAN to write
parallel programs running on a distributed system. The idea is to embody
clusters, grid systems, and P2P systems with upgraded web services and utility computing
applications.
MapReduce

• This is a web programming model for scalable data processing on large clusters over large
data sets. The model is applied mainly in web-scale search and cloud computing applications.
• The user specifies a Map function to generate a set of intermediate key/value pairs. Then the
user applies a Reduce function to merge all intermediate values with the same intermediate
key.
• MapReduce is highly scalable to explore high degrees of parallelism at different job levels. A
typical MapReduce computation process can handle terabytes of data on tens of thousands
or more client machines.
• Hundreds of MapReduce programs can be executed simultaneously; in fact, thousands of
MapReduce jobs are executed on Google’s clusters every day.
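The Map and Reduce roles described above can be illustrated with a single-process word count. This is a minimal sketch of the programming model only: the names `map_fn`, `reduce_fn`, and `map_reduce` are hypothetical, and nothing here is distributed, fault tolerant, or taken from Google's or Hadoop's actual API.

```python
from collections import defaultdict


def map_fn(document):
    """Map: emit an intermediate (key, value) pair for every word."""
    for word in document.lower().split():
        yield word, 1


def reduce_fn(key, values):
    """Reduce: merge all intermediate values that share the same key."""
    return key, sum(values)


def map_reduce(documents, mapper, reducer):
    """Run mapper over every input, group by key, then run reducer per key."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in mapper(doc):
            groups[key].append(value)  # the "shuffle" step: group by key
    return dict(reducer(k, vs) for k, vs in groups.items())


counts = map_reduce(["the cloud", "the grid and the cloud"], map_fn, reduce_fn)
print(counts)
```

Because each call to `map_fn` depends only on its own document and each call to `reduce_fn` only on one key's values, a real framework can spread both phases across many machines.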
Hadoop Library

• Hadoop offers a software platform that was originally developed by a Yahoo! group.
• The package enables users to write and run applications over vast amounts of distributed
data.
• Users can easily scale Hadoop to store and process petabytes of data in the web space.
• Hadoop is economical in that it comes with an open source version of MapReduce that
minimizes overhead in task spawning and massive data communication. It is efficient, as it
processes data with a high degree of parallelism across a large number of commodity nodes,
and it is reliable in that it automatically keeps multiple data copies to facilitate
redeployment of computing tasks upon unexpected system failures.
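The reliability point above, keeping multiple data copies so that work can be redeployed after a failure, can be sketched with a toy placement table. This is an assumption-laden illustration, not Hadoop's actual placement policy or API: the round-robin rule, the default of three copies, and the names `place_replicas` and `surviving_copy` are all hypothetical.

```python
def place_replicas(blocks, nodes, copies=3):
    """Assign each block to `copies` distinct nodes, round-robin (toy policy)."""
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    placement = {}
    for i, block in enumerate(blocks):
        placement[block] = [nodes[(i + k) % len(nodes)] for k in range(copies)]
    return placement


def surviving_copy(placement, block, failed):
    """Return a node that still holds the block, or None if all copies are lost."""
    for node in placement[block]:
        if node not in failed:
            return node
    return None


placement = place_replicas(["b0", "b1"], ["n0", "n1", "n2", "n3"], copies=3)
print(placement)
# If n0 and n1 fail, a task that needs b0 can be redeployed next to another copy.
print(surviving_copy(placement, "b0", failed={"n0", "n1"}))
```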

9. Explain the cloud landscape based on three cloud service models

Figure depicts the cloud landscape and major cloud players, based on three cloud service models.
• Infrastructure as a Service (IaaS): This model puts together infrastructures demanded by
users, namely servers, storage, networks, and the data center fabric. The user can deploy and
run specific applications on multiple VMs running guest OSes. The user does not manage or
control the underlying cloud infrastructure, but can specify when to request and release the
needed resources.
• Platform as a Service (PaaS): This model enables the user to deploy user-built applications
onto a virtualized cloud platform. PaaS includes middleware, databases, development tools, and
some runtime support such as Web 2.0 and Java. The platform includes both hardware and
software integrated with specific programming interfaces. The provider supplies the API and
software tools (e.g., Java, Python, Web 2.0, .NET). The user is freed from managing the cloud
infrastructure.
• Software as a Service (SaaS): This refers to browser-initiated application software over
thousands of paid cloud customers. The SaaS model applies to business processes, industry
applications, customer relationship management (CRM), enterprise resource planning (ERP),
human resources (HR), and collaborative applications. On the customer side, there is no upfront
investment in servers or software licensing. On the provider side, costs are rather low, compared
with conventional hosting of user applications.

Internet clouds offer four deployment modes: private, public, managed, and hybrid. These modes
carry different security implications. The different SLAs imply that the security
responsibility is shared among all the cloud providers, the cloud resource consumers, and the
third-party cloud-enabled software providers. Advantages of cloud computing have been advocated by
many IT experts, industry leaders, and computer science researchers.

The following list highlights eight reasons to adopt the cloud for upgraded Internet applications
and web services:
1. Desired location in areas with protected space and higher energy efficiency
2. Sharing of peak-load capacity among a large pool of users, improving overall utilization
3. Separation of infrastructure maintenance duties from domain-specific application development
4. Significant reduction in cloud computing cost, compared with traditional computing paradigms
5. Cloud computing programming and application development
6. Service and data discovery and content/service distribution
7. Privacy, security, copyright, and reliability issues
8. Service agreements, business models, and pricing policies

10. Define and analyze the impact of three modern computing paradigms: Service-
Oriented Architecture (SOA), cloud computing, and Internet of Things (IoT)

• SOA applies to building grids, clouds, grids of clouds, clouds of grids, clouds of clouds (also
known as interclouds), and systems of systems in general. A large number of sensors
provide data-collection services, denoted in the figure as SS (sensor service).
• A sensor can be a ZigBee device, a Bluetooth device, a WiFi access point, a personal
computer, a GPS device, or a wireless phone, among other things. Raw data is collected by sensor
services. All the SS devices interact with large or small computers, many forms of grids,
databases, the compute cloud, the storage cloud, the filter cloud, the discovery cloud, and so
on.
• Filter services are used to eliminate unwanted raw data, in order to respond to specific
requests from the web, the grid, or web services.
• A collection of filter services forms a filter cloud. SOA aims to search for, or sort out, the
useful data from the massive amounts of raw data items. Processing this data generates
useful information and, subsequently, the knowledge for our daily use.
• Finally, we make intelligent decisions based on both biological and machine wisdom. For
raw data collected by a large number of sensors to be transformed into useful information
or knowledge, the data stream may go through a sequence of compute, storage, filter, and
discovery clouds. The inter-service messages then converge at the portal, which is
accessed by all users.
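The flow from raw sensor data to information and knowledge described above can be sketched as a chain of small services. This is a minimal illustration, not an implementation from the source: the `Reading` type, the three service functions, the sensor names, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Reading:
    sensor_id: str  # hypothetical name of a sensor service (SS)
    value: float    # raw measurement


def filter_service(readings: Iterable[Reading],
                   keep: Callable[[Reading], bool]) -> List[Reading]:
    """Filter cloud: eliminate unwanted raw data."""
    return [r for r in readings if keep(r)]


def compute_service(readings: List[Reading]) -> Dict[str, float]:
    """Compute cloud: turn filtered data into information (mean per sensor)."""
    sums: Dict[str, float] = {}
    counts: Dict[str, int] = {}
    for r in readings:
        sums[r.sensor_id] = sums.get(r.sensor_id, 0.0) + r.value
        counts[r.sensor_id] = counts.get(r.sensor_id, 0) + 1
    return {sid: sums[sid] / counts[sid] for sid in sums}


def discovery_service(info: Dict[str, float], threshold: float) -> List[str]:
    """Discovery cloud: derive knowledge (which sensors exceed a threshold)."""
    return sorted(sid for sid, mean in info.items() if mean > threshold)


# Raw data as it might arrive from the sensor services; -999.0 marks a faulty sample.
raw = [Reading("ss1", 21.0), Reading("ss1", 23.0), Reading("ss2", 35.0),
       Reading("ss2", -999.0), Reading("ss3", 41.0)]

valid = filter_service(raw, keep=lambda r: r.value > -100.0)
info = compute_service(valid)
hot = discovery_service(info, threshold=30.0)
print(info, hot)
```

Each function stands in for one cloud in the chain; in a real SOA deployment these would be separate network services exchanging messages rather than local function calls.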


Grids versus Clouds
The boundary between grids and clouds has been getting blurred in recent years. For web services,
workflow technologies are used to coordinate or orchestrate services with certain specifications
used to define critical business process models such as two-phase transactions. In all approaches,
one is building a collection of services which together tackle all or part of a distributed computing
problem.
In general, a grid system applies static resources, while a cloud emphasizes elastic resources. For
some researchers, the differences between grids and clouds are limited to dynamic resource
allocation based on virtualization and autonomic computing. One can build a grid out of multiple
clouds. This type of grid can do a better job than a pure cloud, because it can explicitly support
negotiated resource allocation. Thus one may end up building a system of systems, such as a
cloud of clouds, a grid of clouds, a cloud of grids, or inter-clouds, as a basic SOA architecture.
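The contrast between static (grid-style) and elastic (cloud-style) resources can be made concrete with a toy model. The sketch below is an assumption-laden illustration: the demand series, the per-server capacity, and the function names are hypothetical and only serve to show the difference in behavior.

```python
import math

PER_SERVER = 100  # hypothetical number of requests one server handles per period


def static_allocation(demand, servers):
    """Static resources: the same number of servers in every period."""
    return [servers for _ in demand]


def elastic_allocation(demand, per_server=PER_SERVER):
    """Elastic resources: just enough servers to cover each period's demand."""
    return [math.ceil(d / per_server) for d in demand]


def unmet_demand(demand, allocation, per_server=PER_SERVER):
    """Requests that cannot be served with the given allocation."""
    return sum(max(0, d - n * per_server) for d, n in zip(demand, allocation))


demand = [120, 80, 400, 950, 300]  # hypothetical requests per period

static = static_allocation(demand, servers=5)
elastic = elastic_allocation(demand)

print("static :", static, "server-periods:", sum(static),
      "unmet:", unmet_demand(demand, static))
print("elastic:", elastic, "server-periods:", sum(elastic),
      "unmet:", unmet_demand(demand, elastic))
```

In this toy run the static allocation pays for idle servers in quiet periods and still drops requests at the peak, while the elastic allocation follows the demand curve.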

11. Explain energy efficiency in distributed systems.


Primary performance goals in conventional parallel and distributed computing systems are high
performance and high throughput, considering some form of performance reliability (e.g., fault
tolerance and security). However, these systems recently encountered new challenging issues
including energy efficiency, and workload and resource outsourcing. These emerging issues are
crucial not only on their own, but also for the sustainability of large-scale computing systems in
general. This section reviews energy consumption issues in servers and HPC systems, an area
known as distributed power management (DPM).
Protection of data centers demands integrated solutions.

Energy consumption in parallel and distributed computing systems raises various monetary,
environmental, and system performance issues.
Energy Consumption of Unused Servers
To run a server farm (data center), a company has to spend a huge amount of money on hardware,
software, operational support, and energy every year. Therefore, companies should thoroughly
identify whether their installed server farm (more specifically, the volume of provisioned
resources) is at an appropriate level, particularly in terms of utilization.
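One way to check whether provisioned resources match actual use is to flag servers whose utilization stays below a threshold and estimate what they cost while idle. The following is a minimal sketch under stated assumptions: the server names, utilization figures, the 10% threshold, the 150 W idle power draw, and the electricity price are all hypothetical.

```python
def underutilized(utilization, threshold=0.10):
    """Return the names of servers whose average utilization is below the threshold."""
    return sorted(name for name, u in utilization.items() if u < threshold)


def idle_energy_cost(server_count, idle_watts, hours, price_per_kwh):
    """Energy cost of keeping the given number of servers powered on but idle."""
    kwh = server_count * idle_watts * hours / 1000.0
    return kwh * price_per_kwh


# Hypothetical average CPU utilization per server over the measurement period.
utilization = {"web-1": 0.62, "web-2": 0.04, "db-1": 0.35, "batch-7": 0.00}

idle_servers = underutilized(utilization)
yearly_cost = idle_energy_cost(len(idle_servers), idle_watts=150,
                               hours=24 * 365, price_per_kwh=0.10)

print(idle_servers)
print(round(yearly_cost, 2))
```

Servers flagged this way are candidates for consolidation or shutdown; the threshold would have to be tuned to the workload in practice.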

Reducing Energy in Active Servers


In addition to identifying unused/underutilized servers for energy savings, it is also necessary to
apply appropriate techniques to decrease energy consumption in active distributed systems with
negligible influence on their performance. Power management issues in distributed computing
platforms can be categorized into four layers: the application layer, the middleware layer, the
resource layer, and the network layer.

Application Layer
Until now, most user applications in science, business, engineering, and financial areas have
aimed to increase a system’s speed or quality. By introducing energy-aware applications, the
challenge is to design sophisticated multilevel and multi-domain energy management applications
without hurting performance.
Middleware Layer
The middleware layer acts as a bridge between the application layer and the resource layer. This
layer provides resource broker, communication service, task analyzer, task scheduler, security
access, reliability control, and information service capabilities. It is also responsible for applying
energy-efficient techniques, particularly in task scheduling.
Resource Layer
The resource layer consists of a wide range of resources including computing nodes and storage
units. This layer generally interacts with hardware devices and the operating system; therefore, it
is responsible for controlling all distributed resources in distributed computing systems. In the
recent past, several mechanisms have been developed for more efficient power management of
hardware and operating systems. The majority of them are hardware approaches particularly for
processors.
Dynamic power management (DPM) and dynamic voltage-frequency scaling (DVFS) are two
popular methods incorporated into recent computer hardware systems.

Network Layer
Routing and transferring packets and enabling network services to the resource layer are the main
responsibilities of the network layer in distributed computing systems. The major challenge in
building energy-efficient networks is, again, determining how to measure, predict, and create a
balance between energy consumption and performance. Two major challenges in designing
energy-efficient networks are:
• The models should represent the networks comprehensively, as they should give a full
understanding of interactions among time, space, and energy.
• New, energy-efficient routing algorithms need to be developed. New, energy-efficient protocols
should also be developed to defend against network attacks.
DVFS Method for Energy Efficiency
The DVFS method exploits the slack time (idle time) typically incurred by intertask
relationships. Specifically, the slack time associated with a task is used to execute the
task at a lower voltage and frequency. The relationship between energy and voltage frequency in
CMOS circuits is given by:

\[ E = C_{\mathrm{eff}}\, f\, v^{2}\, t, \qquad f = K\,\frac{(v - v_t)^{2}}{v} \]

where v, Ceff, K, and vt are the voltage, circuit switching capacity, a technology dependent factor,
and threshold voltage, respectively, and the parameter t is the execution time of the task under
clock frequency f. By reducing voltage and frequency, the device’s energy consumption can also be
reduced.
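A short numerical sketch can show the effect of lowering the supply voltage. It assumes the commonly cited CMOS model E = Ceff · f · v² · t with f = K · (v − vt)² / v, and it further assumes that a task needs a fixed number of clock cycles, so that t = cycles / f. All numeric values (Ceff, K, vt, the cycle count, and the deadline) are hypothetical and chosen only for readability.

```python
def frequency(v, K, vt):
    """Clock frequency reachable at supply voltage v (assumed CMOS model)."""
    return K * (v - vt) ** 2 / v


def run_task(v, ceff, K, vt, cycles):
    """Return (energy, execution_time) for a task of `cycles` clock cycles at voltage v."""
    f = frequency(v, K, vt)
    t = cycles / f                 # execution time under clock frequency f
    energy = ceff * f * v ** 2 * t
    return energy, t


# Hypothetical parameters, not measured values.
CEFF = 1e-9      # effective switching capacitance
K = 1e9          # technology-dependent factor
VT = 0.3         # threshold voltage (volts)
CYCLES = 1e9     # work in the task, in clock cycles
DEADLINE = 3.0   # seconds available, including the task's slack time

e_high, t_high = run_task(1.2, CEFF, K, VT, CYCLES)
e_low, t_low = run_task(0.9, CEFF, K, VT, CYCLES)

for label, e, t in (("1.2 V", e_high, t_high), ("0.9 V", e_low, t_low)):
    fits = t <= DEADLINE
    print(f"{label}: energy={e:.3f} J, time={t:.3f} s, meets deadline={fits}")
```

With these numbers the lower voltage makes the task run longer but still within the deadline, so the slack time is traded for lower energy. Once the task would miss its deadline, the voltage cannot be reduced further.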
