What is Cloud Storage?
Cloud Storage is a mode of computer data storage in which digital data is
stored on servers in off-site locations. The servers are maintained by a third-
party provider who is responsible for hosting, managing, and securing data
stored on its infrastructure. The provider ensures that data on its servers is
always accessible via public or private internet connections.
Cloud Storage enables organizations to store, access, and maintain data so
that they do not need to own and operate their own data centers, moving
expenses from a capital expenditure model to an operational expenditure
model. Cloud Storage is scalable, allowing organizations to expand or reduce
their data footprint depending on need.
Google Cloud provides a variety of scalable options for organizations to store
their data in the cloud.
How does Cloud Storage work?
Cloud Storage uses remote servers to save data, such as files, business data,
videos, or images. Users upload data to servers via an internet connection,
where it is saved on a virtual machine on a physical server. To maintain
availability and provide redundancy, cloud providers will often spread data to
multiple virtual machines in data centers located across the world. If storage
needs increase, the cloud provider will spin up more virtual machines to
handle the load. Users can access data in Cloud Storage through an internet
connection and software such as a web portal, browser, or mobile app via an
application programming interface (API).
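The redundancy idea described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: each "server" is an in-memory dictionary, and the class name ReplicatedStore and the location names are hypothetical, not any provider's actual API.

```python
class ReplicatedStore:
    """Toy model: every write is copied to several independent 'servers'."""

    def __init__(self, server_names):
        # Each server is a plain dict here; a real provider would use
        # machines in separate data centers.
        self.servers = {name: {} for name in server_names}
        self.offline = set()  # names of servers that are currently down

    def put(self, key, data):
        # Write the same data to every server that is reachable.
        for name, disk in self.servers.items():
            if name not in self.offline:
                disk[key] = data

    def get(self, key):
        # Read from the first reachable server that holds the key.
        for name, disk in self.servers.items():
            if name not in self.offline and key in disk:
                return disk[key]
        raise KeyError(key)


# Hypothetical location names, for illustration only.
store = ReplicatedStore(["us-east", "eu-west", "asia-south"])
store.put("report.pdf", b"quarterly numbers")

store.offline.add("us-east")         # simulate an outage in one location
recovered = store.get("report.pdf")  # still served by another replica
```

Because the data was written to three places, losing one location does not make it unavailable.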
Cloud Storage is available in four different models:
1. Public
Public Cloud Storage is a model where an organization stores data in a
service provider’s data centers that are also utilized by other companies. Data
in public Cloud Storage is spread across multiple regions and is often offered
on a subscription or pay-as-you-go basis. Public Cloud Storage is considered
to be “elastic” which means that the data stored can be scaled up or down
depending on the needs of the organization. Public cloud providers typically
make data available from any device such as a smartphone or web portal.
2. Private
Private Cloud Storage is a model where an organization utilizes its own
servers and data centers to store data within its own network. Alternatively,
organizations can contract with cloud service providers to provide dedicated
servers and private connections that are not shared by any other organization.
Private clouds are typically utilized by organizations that require more control
over their data and have stringent compliance and security requirements.
3. Hybrid
A hybrid cloud model is a mix of private and public cloud storage models. A
hybrid cloud storage model allows organizations to decide which data they want
to store in which cloud. Sensitive data and data that must meet strict
compliance requirements may be stored in a private cloud while less sensitive
data is stored in the public cloud. A hybrid cloud storage model typically has a
layer of orchestration to integrate between the two clouds. A hybrid cloud
offers flexibility and allows organizations to still scale up with the public cloud
if the need arises.
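The placement decision in a hybrid model can be pictured as a simple rule. The sketch below is an illustration, not a real orchestration layer: the Dataset fields, the choose_cloud function, and the dataset names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    sensitive: bool   # e.g. personal or confidential data
    regulated: bool   # subject to strict compliance requirements


def choose_cloud(dataset: Dataset) -> str:
    """Return 'private' for sensitive or regulated data, else 'public'."""
    if dataset.sensitive or dataset.regulated:
        return "private"
    return "public"


# Hypothetical datasets, for illustration only.
datasets = [
    Dataset("patient-records", sensitive=True, regulated=True),
    Dataset("marketing-videos", sensitive=False, regulated=False),
]
placement = {d.name: choose_cloud(d) for d in datasets}
```

A real orchestration layer would also move the data and keep both sides in sync; this only shows the routing rule.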
4. Multicloud
A multicloud storage model is when an organization sets up more than one
cloud model from more than one cloud service provider (public or private).
Organizations might choose a multicloud model if one cloud vendor offers
certain proprietary apps, an organization requires data to be stored in a
specific country, various teams are trained on different clouds, or the
organization needs to serve different requirements that are not stated in the
providers’ Service Level Agreements. A multicloud model offers organizations
flexibility and redundancy.
Advantages of Cloud Storage
Total cost of ownership
Cloud Storage enables organizations to move from a capital expenditure to an
operational expenditure model, allowing them to adjust budgets and resources
quickly.
Elasticity
Cloud Storage is elastic and scalable, meaning that it can be scaled up (more
storage added) or down (less storage needed) depending on the
organization’s needs.
Flexibility
Cloud Storage offers organizations flexibility on how to store and access data,
deploy and budget resources, and architect their IT infrastructure.
Security
Most cloud providers offer robust security, including physical security at data
centers and cutting-edge security at the software and application levels. The
best cloud providers offer zero trust architecture, identity and access management,
and encryption.
Sustainability
One of the greatest costs when operating on-premises data centers is the
overhead of energy consumption. The best cloud providers operate on
sustainable energy through renewable resources.
Redundancy
Redundancy (replicating data on multiple servers in different locations) is an
inherent trait in public clouds, allowing organizations to recover from disasters
while maintaining business continuity.
Disadvantages of Cloud Storage
Compliance
Certain industries such as finance and healthcare have stringent requirements
about how data is stored and accessed. Some public cloud providers offer tools
to maintain compliance with applicable rules and regulations.
Latency
Traffic to and from the cloud can be delayed because of network traffic
congestion or slow internet connections.
Control
Storing data in public clouds relinquishes some control over access and
management of that data, entrusting that the cloud service provider will
always be able to make that data available and maintain its systems and
security.
Outages
While public cloud providers aim to ensure continuous availability, outages
sometimes do occur, making stored data unavailable.
How to use Cloud Storage
Cloud Storage provides several use cases that can benefit individuals and
organizations. Whether a person is storing their family budget on a spreadsheet, or a
massive organization is saving years of financial data in a highly secure database,
Cloud Storage can be used for saving digital data of all kinds for as long as needed.
Backup
Data backup is one of the simplest and most prominent uses of Cloud Storage.
Production data can be separated from backup data, creating a gap between the two
that protects organizations in the case of a cyber threat such as ransomware. Data
backup through Cloud Storage can be as simple as saving files to a digital folder such
as Google Drive or using block storage to maintain gigabytes or more of important
business data.
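A minimal sketch of the backup idea follows: copy a file to a separate location and confirm with a checksum that the copy is intact. Local temporary folders stand in for production and backup storage, and the file name and the back_up helper are assumptions made for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and verify the copy by checksum."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup of {source.name} failed verification")
    return target


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical production file, for illustration only.
    production = Path(tmp) / "production" / "ledger.csv"
    production.parent.mkdir()
    production.write_text("id,amount\n1,100\n")

    copy = back_up(production, Path(tmp) / "backup")

    # Simulate the production copy being damaged afterwards.
    production.write_text("encrypted by ransomware")
    backup_survived = copy.read_text() == "id,amount\n1,100\n"
```

Because the backup lives in a separate location, damage to the production file does not touch it.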
Archiving
The ability to archive old data has become an important aspect of Cloud Storage, as
organizations move to digitize decades of old records, as well as hold on to records for
governance and compliance purposes. Google Cloud offers several tiers of storage for
archiving data, including Coldline storage and Archive storage, that can be accessed
whenever an organization needs them.
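As an illustration of tiering, the sketch below suggests a storage class from how long ago data was last accessed. The cutoffs of 30, 90, and 365 days mirror the minimum storage durations Google Cloud documents for its Nearline, Coldline, and Archive classes; using them as access-age thresholds is an assumption of this example, and suggest_storage_class is a hypothetical helper, not a Google Cloud API.

```python
def suggest_storage_class(days_since_last_access: int) -> str:
    """Suggest a storage class name from the age of the last access.

    The thresholds are illustrative assumptions, not an official policy.
    """
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    if days_since_last_access >= 365:
        return "ARCHIVE"
    if days_since_last_access >= 90:
        return "COLDLINE"
    if days_since_last_access >= 30:
        return "NEARLINE"
    return "STANDARD"


# Records untouched for about two years would be suggested for Archive.
old_records_class = suggest_storage_class(730)
```

In practice, the right class also depends on retrieval costs and how quickly the data must be available.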
Disaster recovery
A disaster (natural or otherwise) that wipes out a data center or old physical records
need not be the business-crippling event that it was in the past. Cloud Storage allows
for disaster recovery so that organizations can continue with their business, even
when times are tough.
Data processing
As Cloud Storage makes digital data immediately available, data becomes much more
useful on an ongoing basis. Data processing, such as analyzing data for business
intelligence or applying machine learning and artificial intelligence to large datasets, is
possible because of Cloud Storage.
Content delivery
With the ability to save copies of media data, such as large audio and video files, on
servers dispersed across the globe, media and entertainment companies can serve
their audience low-latency, always available content from wherever they reside.
Types of Cloud Storage
Cloud Storage comes in three different types: object, file, and block.
Object
Object storage is a data storage architecture for large stores of unstructured
data. It designates each piece of data as an object, keeps it in a separate
storehouse, and bundles it with metadata and a unique identifier for easy
access and retrieval.
File
File storage organizes data in a hierarchical format of files and folders. File
storage is common in personal computing where data is saved as files and
those files are organized in folders. File storage makes it easy to locate and
retrieve individual data items when they are needed. File storage is most often
used in directories and data repositories.
Block
Block storage breaks data into blocks, each with a unique identifier, and then
stores those blocks as separate pieces on the server. The cloud network
stores those blocks wherever it is most efficient for the system. Block storage
is best used for large volumes of data that require low latency such as
workloads that require high performance or databases.
Object Storage
Object storage is a type of data storage that designates each individual
piece of data as an object. The object can then be stored on your
computer or in the Cloud. There is no limit to the number of objects that
can be stored, as long as you have the capacity. Additionally, the object
is bundled with customizable metadata to describe the contents. This
ability to tag files with metadata allows you to index files easily.
Example:
Using object storage in the Cloud makes data incredibly scalable and
flexible, with easy access from anywhere. This makes object storage
ideal for many different applications.
Customizable metadata supports functionality including advanced
search, management, and analytics. Because object storage allows you to
manage and tag files with metadata, it is ideal for uses where tracking
and indexing are necessary. This is why object storage is used for storing
photos, songs, and files on online platforms, such as Facebook, Spotify,
or Dropbox.
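To make the idea concrete, here is a minimal in-memory sketch of an object store: each object gets a unique identifier and is bundled with free-form metadata that can be searched. The ObjectStore class and the metadata fields are hypothetical and do not reflect how any of the platforms named above are implemented.

```python
import uuid


class ObjectStore:
    """Toy object store: data + metadata, addressed by a unique ID."""

    def __init__(self):
        self._objects = {}  # object_id -> (data, metadata)

    def put(self, data: bytes, **metadata) -> str:
        # Every object receives its own identifier on upload.
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id][0]

    def find(self, **criteria) -> list:
        # Return the IDs of objects whose metadata matches every criterion.
        return [
            object_id
            for object_id, (_, meta) in self._objects.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]


store = ObjectStore()
photo_id = store.put(b"...jpeg bytes...", kind="photo", owner="alice")
song_id = store.put(b"...mp3 bytes...", kind="song", owner="alice")

photos = store.find(kind="photo")    # search by metadata, not by folder
alices = store.find(owner="alice")
```

There are no folders here: objects are found either by their ID or by querying the metadata attached to them.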
Used for:
Large Sets of Historical Data
Archived Files
Cloud Storage
Unstructured Data Storage (documents, images, video, etc.)
Big Data Storage
Backup and Recovery
File Storage
File storage (or a file system) is a type of data storage that allows for
shared access to file data. It is a hierarchical system where users can
gain access to files and create, modify, read, or organize them in a
directory. This shared file access is often offered through a network or
Network Attached Storage (NAS). NAS is a dedicated file storage
device that enables users to retrieve data that is stored within. It can be
accessed over Ethernet or through the Cloud, which allows for more
flexibility. File storage used in this way will typically contain basic
metadata, such as name, file type, and date of creation.
Example:
Most business employees need access to shared files and
therefore need access to a file system. File storage is ideal for large
content repositories, home directories, and development environments.
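The hierarchical layout of file storage can be shown with Python's standard pathlib module. In this sketch a temporary local directory stands in for a shared file system, and the folder and file names are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    share = Path(root)  # stands in for the root of a shared file system

    # Folders nest inside folders, and files live inside folders.
    (share / "home" / "alice").mkdir(parents=True)
    (share / "projects" / "website").mkdir(parents=True)
    (share / "home" / "alice" / "notes.txt").write_text("todo")
    (share / "projects" / "website" / "index.html").write_text("<h1>Hi</h1>")

    # A file is located by its path through the hierarchy.
    listing = sorted(
        p.relative_to(share).as_posix()
        for p in share.rglob("*")
        if p.is_file()
    )

    # Basic metadata, such as the size, comes from the file system itself.
    size_bytes = (share / "home" / "alice" / "notes.txt").stat().st_size
```

Here listing holds the two file paths relative to the share root, and size_bytes is the size of notes.txt.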
Used for:
Web Applications
Content Management Systems (CMS)
Home Directories
Big Data Analytics
Media Streaming
Database Back-up
Block Storage
Block storage is the oldest type of data storage where files are separated
into evenly sized blocks of data. It is usually managed by software that
handles retrieval, using addresses to identify blocks and assemble them
into files when necessary.
Each block has a unique address, but no metadata. This lack of metadata
can affect performance in metadata critical operations, such as search
and retrieval. However, operating systems can access block storage
directly as attached disks, which allows users to edit one part of a file
and makes it ideal for database servers.
Example:
Block storage is a high-performance, low latency alternative, making it
ideal for transactional or database applications. Block storage makes the
most recent data available quickly allowing for better performance for
transactions. It also functions well for enterprise IT environments
because it supports different workloads, such as virtual machines,
databases, and business-critical applications.
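The block model described in this section can be sketched as follows: data is cut into evenly sized blocks, each block has an address, and a single block can be overwritten without touching the rest. The 4-byte block size and the helper names are assumptions chosen to keep the example small; real systems use much larger blocks.

```python
BLOCK_SIZE = 4  # bytes; deliberately tiny for illustration


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Split data into evenly sized blocks keyed by address (0, 1, 2, ...)."""
    blocks = {}
    for address, start in enumerate(range(0, len(data), block_size)):
        chunk = data[start:start + block_size]
        # Pad the final block with zero bytes so all blocks are the same size.
        blocks[address] = chunk.ljust(block_size, b"\x00")
    return blocks


def assemble(blocks: dict, length: int) -> bytes:
    """Join blocks in address order and trim the padding."""
    return b"".join(blocks[address] for address in sorted(blocks))[:length]


original = b"hello block storage"
blocks = split_into_blocks(original)

# Edit one part of the "file" by overwriting a single block in place.
blocks[0] = b"HELL"
edited = assemble(blocks, len(original))
```

Only block 0 was rewritten, which is why this model suits workloads such as databases that make many small updates.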
Used for:
Transactional Data
Databases
Business Applications
Enterprise IT Environments
Low-latency Storage
---------------------------------------------------------------------------------------
1. EBS: high-performance, per-instance block storage
Amazon Elastic Block Store (EBS) used to be accessible to a single Amazon EC2 instance only,
making it most like your physical hard drive.
That’s still generally how it’s used (as per-instance storage), but in special cases,
Amazon EBS Multi-Attach can turn EBS into multi-instance storage, like EFS.
SSD-backed EBS volumes can be either General Purpose SSD (for general use) or Provisioned
IOPS SSD (for mission-critical workloads).
What kind of storage is EBS?
EBS is a block storage service, which means all data within EBS is stored in equally
sized blocks. This system offers some performance advantages over traditional storage,
and generally boasts lower latency, too.
EBS’s key benefits
Within its role as a per-EC2 instance service, EBS has a range of benefits:
Low-latency performance – up to 16,000 IOPS for General Purpose SSDs and up to
256,000 IOPS for the new Provisioned IOPS SSD
Easy data backup and restoration – via snapshots that can be taken at hourly intervals,
EBS ensures all your data is well protected
Highly durable – volumes are designed for 99.8% to 99.9% durability for General Purpose SSDs
and 99.999% for the Provisioned IOPS SSD
EBS encryption – there’s no need to worry about key management, as EBS handles
that for you
When to use EBS?
EBS’s use case is more easily understood than the other two. It must be paired with an
EC2 instance. So when you need a high-performance storage service for a single
instance, use EBS.
2. EFS: scalable file storage for multiple EC2 instances
Unlike EBS, Amazon Elastic File System (EFS) can be mounted by
multiple EC2 instances, meaning many virtual machines may store files within a single EFS
file system.
But its main feature is its scalability. EFS can grow or shrink according to demand, with
more and more files being added without disturbing your application or having to
provision new infrastructure.
What kind of storage is EFS?
EFS is a file storage system. File storage is the system you’ll likely be most familiar
with, as it’s how files are stored in the hard drive on your computer. File storage is fast
and accessible, but it doesn’t offer the increased potential for complex queries that
object storage does (more on that in the S3 section).
EFS’s key benefits
Within its role as a shared file storage service for multiple EC2 instances, EFS provides
many benefits:
Adaptive throughput – EFS’s performance can scale in line with its storage, operating at
a higher throughput for sudden, high-volume file dumps, reaching up to 500,000 IOPS
or 10 GB per second
Totally elastic – once you’ve created an EFS file system, you can add files without
worrying about provisioning or disturbing your application’s performance
Additional accessibility – EFS can be mounted from different EC2 instances, but it can
also cross the AWS region boundary via the use of VPC peering
When to use EFS?
EFS may be used whenever you need a shared file storage option for multiple EC2
instances with automatic, high-performance scaling.
This makes it a great candidate for content management systems; for lift
and shift operations, as its autoscaling potential means you do not need to re-architect;
and for application development, as EFS’s shareable file storage is ideal for storing code
and media files.
3. S3: object storage for complex queries and archived data
Amazon Simple Storage Service (S3) is scalable, like EFS, and can be accessed by multiple EC2
instances. However, it can also be accessed by other cloud services, and its object
storage system makes it ideal for handling large volumes of static data as well as
complex queries.
What kind of storage is S3?
S3 is an object storage service. Unlike file storage – in which all data is organised
hierarchically in a top-down network of folders – data in S3 is contained on the same flat
plane, with more comprehensive metadata (labels) to make it manageable.
Think of the difference between a family tree, and a family party at which each family
member is wearing a name tag. In the first scenario, people exist in hierarchical relation
to one another; in the second all are milling about on equal footing.
Having each object marked like this makes it easier to run complex queries on each
object without reference to an existing hierarchy.
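The flat layout can be sketched with a plain Python dictionary. In this toy model a "bucket" maps full key strings to metadata, and what look like folders are only shared key prefixes. The keys, metadata fields, and helper functions are hypothetical and are not the S3 API.

```python
# Every object sits on the same flat plane, identified by its full key.
bucket = {
    "photos/2023/beach.jpg": {"content_type": "image/jpeg", "size": 2048},
    "photos/2024/city.jpg": {"content_type": "image/jpeg", "size": 4096},
    "reports/q1.pdf": {"content_type": "application/pdf", "size": 512},
}


def list_keys(bucket: dict, prefix: str = "") -> list:
    """Emulate a folder listing by filtering on a key prefix."""
    return sorted(key for key in bucket if key.startswith(prefix))


def query(bucket: dict, predicate) -> list:
    """Select keys by metadata alone, regardless of where they 'live'."""
    return sorted(key for key, meta in bucket.items() if predicate(meta))


photo_keys = list_keys(bucket, "photos/")
large_objects = query(bucket, lambda meta: meta["size"] >= 2048)
```

The second query never looks at the key structure at all, which is the "name tag" idea from the analogy above.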
S3’s key benefits
Within its role as an object storage system, S3 offers many benefits:
Running analytics – because S3 can interface with other services like AWS Lake
Formation and analytics tools, it can be used as a data lake, with other services running
complex queries on its data to draw insights
Data archiving – S3 offers archival storage classes, such as S3 Glacier, meaning rarely
accessed data can be stored at a lower cost than frequently accessed data
Incredibly durable – Amazon S3 Standard, S3 Standard–IA, S3 Intelligent-Tiering, S3
One Zone-IA, S3 Glacier, and S3 Glacier Deep Archive are all designed to provide
99.999999999% (11 9’s) of data durability of objects over a given year. This durability
level corresponds to an average annual expected loss of 0.000000001% of objects. If
you store 10,000,000 objects with Amazon S3, you can on average expect to incur a
loss of a single object once every 10,000 years
Highly available – S3 boasts 99.99% + availability
Flexible – S3 can be mounted on an application to act as a shared drive, making files
shareable across multiple instances running the web application
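The durability and availability figures quoted above can be checked with a little arithmetic. The sketch below uses exact fractions to avoid floating-point rounding; the 365-day year and the reading of 99.99% as a fixed availability level are simplifying assumptions.

```python
from fractions import Fraction

# Durability: 11 nines means an expected annual loss rate of 1 in 10^11.
annual_loss_rate = Fraction(1, 10**11)
objects_stored = 10_000_000

expected_losses_per_year = objects_stored * annual_loss_rate
years_per_single_loss = 1 / expected_losses_per_year

# Availability: 99.99% leaves 0.01% of the year as potential downtime.
availability = Fraction(9999, 10000)
minutes_per_year = 365 * 24 * 60
max_downtime_minutes = (1 - availability) * minutes_per_year

print(years_per_single_loss)        # 10000
print(float(max_downtime_minutes))  # 52.56
```

So 10,000,000 objects at 11 nines of durability works out to one expected loss every 10,000 years, matching the figure above, while 99.99% availability allows roughly 52.56 minutes of downtime per year.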
When to use S3?
S3 is good at storing long-term data due to its archiving system. Things like reports and
records, which may go unused for years, can be stored on S3 at a lower cost than the
other two storage services discussed.
As already stated, S3 is also useful for storing data on which complex queries may be
run. This makes it useful for data related to customer purchases, behaviour or profiles,
because that data can be easily queried and fed into analytics tools.
This capacity for interfacing with other tools also makes S3 great for back-up and
restoration, as it can be paired with Amazon Glacier for even more secure backing up.
S3 also supports static websites, so if you need to host a static HTML page, S3 is a
good choice.
AWS Storage Summed Up
S3 is for object storage. Think photos, videos, files, and simple web
pages.
EBS is for EC2 block storage. Think of a computer’s hard drive.
EFS is a file system for many EC2 instances. Think multiple EC2
instances and lots of data.