Overview of AWS Database Services

This document provides an overview of relational and non-relational databases available on AWS, including DynamoDB, RDS, Redshift, and Aurora. It describes key features like automated backups, read replicas, multi-AZ support, and encryption. Relational databases store data in tables with rows and columns, while non-relational databases can vary in data structure.
  • AWS Databases Introduction
  • Relational Databases
  • Relational Database Services (RDS) Features
  • Data Warehousing
  • OLTP vs OLAP
  • RDS Backups and Availability
  • Security and Encryption
  • Redshift
  • Aurora
  • ElastiCache

AWS DATABASES

Relational Databases
Relational databases are what most of us are used to. They have been around since the 1970s.
Think of a traditional spreadsheet:

• Database
• Tables
• Row
• Fields
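
To make the spreadsheet analogy concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite only stands in for a relational engine here, and the table and column names (customers, firstname, and so on) are hypothetical, chosen purely for illustration.

```python
import sqlite3

# An in-memory database holds one or more tables.
conn = sqlite3.connect(":memory:")

# A table defines a fixed set of fields (columns) that every row follows.
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, firstname TEXT, surname TEXT, age INTEGER)"
)

# Each inserted tuple becomes one row with the same fields.
conn.executemany(
    "INSERT INTO customers (firstname, surname, age) VALUES (?, ?, ?)",
    [("John", "Smith", 23), ("Jane", "Doe", 31)],
)

rows = conn.execute(
    "SELECT id, firstname, surname, age FROM customers ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

Every row comes back with the same four fields, which is the consistency that a relational schema enforces.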


AWS supports the following relational database engines:

• Microsoft SQL Server
• Oracle
• MySQL
• PostgreSQL
• Aurora
• MariaDB

Relational Database Service (RDS) Features
RDS has two key features:

• Multi-AZ – for disaster recovery
• Read Replicas – for performance

Non-relational databases are structured as follows, with each term's relational equivalent on the right:

• Collection = Table
• Document = Row
• Key-Value Pairs = Fields

Example of a document in a non-relational database:

{
  "_id": "51262c865caasdsadfbe0545435",
  "firstname": "John",
  "surname": "Smith",
  "Age": "23",
  "address": [
    {
      "street": "21 Jump Street",
      "suburb": "Richmond"
    }
  ]
}

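Relational Database vs Non-Relational Database

In a non-relational database, each document can hold a different set of fields, and fields can be
added to one document without affecting the others. In a relational database, every row in a
table must follow the same structure, so the data has to be kept consistent.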
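
The difference can be sketched in a few lines of Python. The "collection" below is just a list of dictionaries and the field names are made up; it only illustrates that documents in one collection need not share a structure.

```python
import json

# Two documents in the same (hypothetical) collection.
# Unlike rows in a relational table, they do not share the same fields.
collection = [
    {
        "firstname": "John",
        "surname": "Smith",
        "address": [{"street": "21 Jump Street", "suburb": "Richmond"}],
    },
    {"firstname": "Jane", "email": "jane@example.com"},
]


def field_names(document):
    """Return the sorted top-level keys of a document."""
    return sorted(document.keys())


for doc in collection:
    print(field_names(doc), json.dumps(doc))
```

In a relational table the second record would need NULLs for the missing columns and a schema change to add the new email field.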

DynamoDB (NoSQL)

DynamoDB is Amazon's NoSQL solution. Amazon DynamoDB is a fast and flexible NoSQL database service
for all applications that need consistent, single-digit millisecond latency at any scale. It is a fully
managed database and supports both document and key-value data models. Its flexible data
model and reliable performance make it a great fit for mobile, web, gaming, ad-tech, IoT, and
many other applications.

The basics of DynamoDB are as follows:

• Stored on SSD storage.
• Spread across 3 geographically distinct data centers.
• Eventually Consistent Reads (default)
Consistency across all copies of data is usually reached within a second. Repeating a
read after a short time should return the updated data. (Best read performance.)
• Strongly Consistent Reads
A strongly consistent read returns a result that reflects all writes that received a
successful response prior to the read.
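
The two read modes can be illustrated with a toy model. This is not how DynamoDB is implemented; the class, its leader/follower layout, and the replicate() step are assumptions made only to show why an eventually consistent read can briefly return stale data.

```python
class ToyReplicatedStore:
    """Toy model: one leader copy plus two followers that lag behind it."""

    def __init__(self):
        self.leader = {}
        self.followers = [{}, {}]
        self.pending = []  # writes not yet copied to the followers

    def write(self, key, value):
        # A write succeeds once the leader has it; followers catch up later.
        self.leader[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Copy all pending writes to every follower.
        for key, value in self.pending:
            for follower in self.followers:
                follower[key] = value
        self.pending.clear()

    def read(self, key, consistent_read=False):
        # Strongly consistent reads go to the leader; eventually consistent
        # reads may hit a follower that has not caught up yet.
        source = self.leader if consistent_read else self.followers[0]
        return source.get(key)


store = ToyReplicatedStore()
store.write("order-1", "shipped")
stale = store.read("order-1")                         # follower not updated yet
strong = store.read("order-1", consistent_read=True)  # reflects the write
store.replicate()
fresh = store.read("order-1")                         # same read, repeated later
print(stale, strong, fresh)
```

In the real service the choice is made per request: the DynamoDB read operations accept a ConsistentRead flag, and leaving it unset gives the default eventually consistent read.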

Data Warehousing
Data warehouses are used for business intelligence, with tools like Cognos, Jaspersoft, SQL Server
Reporting Services, Oracle Hyperion, and SAP NetWeaver.
Data warehousing databases use a different type of architecture, both from a database
perspective and at the infrastructure layer.
Amazon's data warehouse solution is called Redshift (mainly for OLAP).
It is used to pull in very large and complex data sets, usually by management to run queries on
data (such as current performance vs targets).

OLTP vs OLAP
Online Transaction Processing (OLTP) differs from Online Analytical Processing (OLAP) in
terms of the types of queries you will run.

OLTP Example:
Order number 212002
Pulls up a row of data such as Name, Date, Address to Deliver to, Delivery Status, etc.

OLAP Example:
Net profit for EMEA and Pacific for the Digital Radio product. Pulls in large numbers of records:
• Sum of radios sold in EMEA
• Sum of radios sold in Pacific
• Unit cost of radio in each region
• Sales price of each radio
• Sales price – unit cost
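
The two query styles can be sketched against the same small table. SQLite is used only so the example runs anywhere; the table layout, customer names, and prices are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_number INTEGER PRIMARY KEY, customer TEXT, "
    "region TEXT, product TEXT, unit_cost REAL, sale_price REAL)"
)
# Made-up sample data.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        (212001, "Alice", "EMEA", "Digital Radio", 20.0, 50.0),
        (212002, "Bob", "EMEA", "Digital Radio", 20.0, 55.0),
        (212003, "Carol", "Pacific", "Digital Radio", 25.0, 60.0),
    ],
)

# OLTP: fetch a single row by its key.
oltp_row = conn.execute(
    "SELECT customer, region FROM orders WHERE order_number = ?", (212002,)
).fetchone()

# OLAP: scan many rows and aggregate (units sold and net profit per region).
olap_rows = conn.execute(
    "SELECT region, COUNT(*), SUM(sale_price - unit_cost) FROM orders "
    "WHERE product = 'Digital Radio' GROUP BY region ORDER BY region"
).fetchall()

print(oltp_row)
print(olap_rows)
```

The OLTP query touches one row through its key, while the OLAP query has to read every matching row before it can return a handful of totals. That access pattern is why analytical workloads are usually moved to a separate data warehouse.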

Additional Points:
• RDS runs on virtual machines.
• You cannot log in to these operating systems, however.
• Patching of the RDS operating system and DB is Amazon's responsibility.
• RDS is NOT serverless.
• Aurora Serverless IS serverless.

RDS – Backups, Multi-AZ & Read Replicas

There are two different types of backups for RDS:
• Automated Backups
Automated backups allow you to recover your database to any point in time within a
“retention period”. The retention period can be between one and 35 days. Automated
backups take a full daily snapshot and also store transaction logs throughout the
day. When you perform a recovery, AWS first chooses the most recent daily backup and
then applies the transaction logs relevant to that day. This allows you to do a point-in-time
recovery down to a second, within the retention period.

Automated backups are enabled by default. The backup data is stored in S3 and you get
free storage space equal to the size of your database. So if you have an RDS instance of
10 GB, you will get 10 GB worth of storage.
Backups are taken within a defined window. During the backup window, storage I/O
may be suspended while your data is being backed up and you may experience elevated
latency.

• Database Snapshots
DB snapshots are done manually (i.e. they are user initiated). They are stored even after
you delete the original RDS instance, unlike automated backups.
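
The point-in-time recovery described for automated backups (latest daily snapshot, then transaction logs) can be sketched as follows. This is an illustrative model only: the function name, the data layout, and the sample values are assumptions, not the RDS implementation.

```python
def restore_to_point_in_time(snapshots, transaction_log, target_time):
    """Rebuild state as of target_time from a snapshot plus logged writes.

    snapshots: list of (taken_at, state_dict), sorted by taken_at
    transaction_log: list of (timestamp, key, value), sorted by timestamp
    """
    # 1. Pick the most recent snapshot taken at or before the target time.
    candidates = [s for s in snapshots if s[0] <= target_time]
    if not candidates:
        raise ValueError("no snapshot available before the target time")
    taken_at, state = candidates[-1]
    restored = dict(state)  # copy so the snapshot itself is untouched

    # 2. Replay logged writes made after the snapshot, up to the target time.
    for timestamp, key, value in transaction_log:
        if taken_at < timestamp <= target_time:
            restored[key] = value
    return restored


# Times are seconds since an arbitrary start; all data is made up.
snapshots = [(0, {"balance": 100}), (86400, {"balance": 150})]
transaction_log = [
    (3600, "balance", 120),
    (50000, "balance", 150),
    (90000, "balance", 175),
    (95000, "balance", 90),
]
print(restore_to_point_in_time(snapshots, transaction_log, 92000))
```

Restoring to second 92000 starts from the day-two snapshot and replays only the write at 90000; the later write at 95000 is ignored because it happened after the requested time.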

Restoring Backups
Whenever you restore either an automated backup or a manual snapshot, the restored version
of the database will be a new RDS instance with a new DNS endpoint.

Encryption At Rest
Encryption at rest is supported for MySQL, Oracle, SQL Server, PostgreSQL, MariaDB & Aurora.
Encryption is done using the AWS Key Management Service (KMS). Once your RDS
instance is encrypted, the data stored at rest in the underlying storage is encrypted, as are its
automated backups, read replicas, and snapshots.

Multi-AZ
Multi-AZ allows you to have an exact copy of your production database in another Availability
Zone. AWS handles the replication for you, so when your production database is written to, the
write is automatically synchronized to the standby database. It is used for disaster recovery (DR).

In the event of planned database maintenance, DB instance failure, or an Availability Zone
failure, Amazon RDS will automatically fail over to the standby so that database operations can
resume quickly without administrative intervention.
Multi-AZ is available for the following databases:
• SQL Server
• Oracle
• MySQL
• PostgreSQL
• MariaDB

Read Replica
Read replicas allow you to have a read-only copy of your production database. This is achieved
by using asynchronous replication from the primary RDS instance to the read replica. You use
read replicas primarily for very read-heavy database workloads.

• It is used for scaling, not for DR!
• You must have automatic backups turned on in order to deploy a read replica.
• You can have up to 5 read replica copies of any database.
• You can have read replicas of read replicas (but watch out for latency).
• Each read replica will have its own DNS endpoint.
• You can have read replicas that have Multi-AZ.
• You can create read replicas of Multi-AZ source databases.
• Read replicas can be promoted to be their own databases. This breaks the replication.
• You can have a read replica in a second region.
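
Because each read replica has its own DNS endpoint, the application decides where each query goes. The sketch below shows one simple way to do that. The ReadWriteRouter class and the endpoint names are hypothetical, and the read/write check is deliberately naive.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary endpoint and spread reads over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary if no replicas exist.
        self._read_cycle = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Naive classification: only plain SELECT statements count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_cycle) if is_read else self.primary


# Hypothetical endpoint names; real RDS endpoints look different.
router = ReadWriteRouter(
    primary="primary.example.internal",
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))
```

Since replication is asynchronous, a read sent to a replica may briefly lag behind the primary, so reads that must see the latest write should still go to the primary.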

Read replicas are available for the following databases:
• MySQL
• PostgreSQL
• MariaDB
• Oracle
• Aurora

Redshift

Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service
in the cloud. Customers can start small for just $0.25 per hour with no commitments or upfront
costs and scale to a petabyte or more for $1,000 per terabyte per year, less than a tenth of the
cost of most other data warehousing solutions.

• It is used for business intelligence.
• Backups are enabled by default with a 1-day retention period.
• The maximum retention period is 35 days.
• Redshift always attempts to maintain at least 3 copies of your data (the original and a
replica on the compute nodes, and a backup in Amazon S3).
• Redshift can also asynchronously replicate your snapshots to S3 in another region for
disaster recovery.
Advanced Compression: Columnar data stores can be compressed much more than row-based
data stores because similar data is stored sequentially on disk. Amazon Redshift employs
multiple compression techniques and can often achieve significant compression relative to
traditional relational data stores. In addition, Amazon Redshift doesn’t require indexes or
materialized views, and so uses less space than traditional relational database systems. When
loading data into an empty table, Amazon Redshift automatically samples your data and selects
the most appropriate compression scheme.
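
Why storing similar data sequentially helps can be seen with a small sketch. Run-length encoding is used here purely as an illustrative technique; the table contents are made up, and this is not a description of how Redshift chooses or applies its encodings.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Made-up table of (region, product) rows.
rows = [
    ("EMEA", "Radio"),
    ("EMEA", "Radio"),
    ("EMEA", "Radio"),
    ("Pacific", "Radio"),
    ("Pacific", "Radio"),
]

# Row-based layout: values from different columns are interleaved.
row_layout = [value for row in rows for value in row]

# Columnar layout: each column's values are stored together.
column_layout = [row[0] for row in rows] + [row[1] for row in rows]

row_runs = run_length_encode(row_layout)
column_runs = run_length_encode(column_layout)
print(len(row_layout), len(row_runs), len(column_runs))
```

In the row layout neighbouring values never repeat, so the encoding saves nothing. In the columnar layout the ten values collapse into three runs.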
Massively Parallel Processing (MPP): Amazon Redshift automatically distributes data and query
load across all nodes. Amazon Redshift makes it easy to add nodes to your data warehouse and
enables you to maintain fast query performance as your data warehouse grows.

Redshift is priced as follows:
• Compute Node Hours (the total number of hours you run across all your compute nodes for
the billing period. You are billed for 1 unit per node per hour, so a 3-node data
warehouse cluster running persistently for an entire 30-day month would incur 2,160 instance
hours. You will not be charged for leader node hours; only compute nodes incur
charges.)
• Backup
• Data Transfer (only within a VPC, not outside it)
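
The compute node hours figure is simple arithmetic, shown here as a short sketch. The 30-day month is an assumption, and the hourly rate reuses the "$0.25 per hour" starting price quoted in the Redshift introduction purely as an example value.

```python
def compute_node_hours(compute_nodes, hours_per_day=24, days=30):
    """Billable hours: one unit per compute node per hour (leader node excluded)."""
    return compute_nodes * hours_per_day * days


hours = compute_node_hours(3)
print(hours)

# Example rate only; actual pricing depends on node type and region.
estimated_cost = hours * 0.25
print(estimated_cost)
```

Three compute nodes for 24 hours over 30 days gives 3 x 24 x 30 = 2,160 node hours, matching the figure in the pricing list.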

Security Considerations (Redshift):

• Encrypted in transit using SSL.
• Encrypted at rest using AES-256 encryption.
• By default, Redshift takes care of key management. Alternatively, you can:
o Manage your own keys through an HSM (hardware security module)
o Use the AWS Key Management Service (KMS)

Redshift Availability
• Currently only available in 1 AZ
• Can restore snapshots to new AZs in the event of an outage.

AURORA

Amazon Aurora is a MySQL-compatible, relational database engine that combines the speed and
availability of high-end commercial databases with the simplicity and cost-effectiveness of open source
databases. Amazon Aurora provides up to five times better performance than MySQL at a price point
one tenth that of a commercial database while delivering similar performance and availability.

• It starts with 10 GB and scales in 10 GB increments to 64 TB (storage autoscaling).
• Compute resources can scale up to 32 vCPUs and 244 GB of memory.
• 2 copies of your data are contained in each Availability Zone, with a minimum of 3 Availability
Zones, giving 6 copies of your data.

Scaling Aurora
• Aurora is designed to transparently handle the loss of up to 2 copies of data without affecting
database write availability and up to three copies without affecting read availability.
• Aurora storage is also self-healing. Data blocks and disks are continuously scanned for errors and
repaired automatically.
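
The fault-tolerance thresholds above can be written down as a small check. The constants come straight from these notes (6 copies, writes survive losing 2, reads survive losing 3); the function itself is only an illustrative sketch, not Aurora's storage protocol.

```python
TOTAL_COPIES = 6          # 2 copies in each of 3 Availability Zones
MAX_LOST_FOR_WRITES = 2   # writes survive losing up to 2 copies
MAX_LOST_FOR_READS = 3    # reads survive losing up to 3 copies


def availability(copies_lost):
    """Return (writes_available, reads_available) for a number of lost copies."""
    if not 0 <= copies_lost <= TOTAL_COPIES:
        raise ValueError("copies_lost must be between 0 and 6")
    return copies_lost <= MAX_LOST_FOR_WRITES, copies_lost <= MAX_LOST_FOR_READS


for lost in range(TOTAL_COPIES + 1):
    print(lost, availability(lost))
```

Losing a whole Availability Zone removes 2 copies, so the database stays writable. Losing one further copy leaves it readable but not writable.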

Two types of replicas are available for Aurora:

• Aurora Replicas (currently up to 15)
• MySQL Read Replicas (currently up to 5)

Backups With Aurora

• Automated backups are always enabled on Amazon Aurora DB instances. Backups do not impact
database performance.
• You can also take snapshots with Aurora. This also does not impact performance.
• You can share Aurora snapshots with other AWS accounts.

ElastiCache

ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory
cache in the cloud. The service improves the performance of web applications by allowing you
to retrieve information from fast, managed, in-memory caches, instead of relying entirely on
slower disk-based databases.

ElastiCache supports two open-source in-memory caching engines:


• Memcached
• Redis
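
A common way to use an in-memory cache in front of a database is the cache-aside pattern, sketched below. A plain dictionary stands in for Memcached or Redis, and the CacheAside class and slow_lookup function are hypothetical names used only for illustration.

```python
class CacheAside:
    """Check the cache first; on a miss, load from the database and cache it."""

    def __init__(self, load_from_database):
        self._load = load_from_database
        self._cache = {}        # stands in for Memcached or Redis
        self.database_calls = 0

    def get(self, key):
        if key in self._cache:
            return self._cache[key]   # cache hit: no database access
        self.database_calls += 1
        value = self._load(key)       # cache miss: slow path
        self._cache[key] = value
        return value


def slow_lookup(user_id):
    # Placeholder for a query against a disk-based database.
    return {"id": user_id, "name": f"user-{user_id}"}


cache = CacheAside(slow_lookup)
first = cache.get(42)    # miss: goes to the database
second = cache.get(42)   # hit: served from memory
print(first, second, cache.database_calls)
```

Only the first request reaches the database; repeated reads of the same key are served from memory. A real deployment would also need expiry or invalidation so that cached values do not go stale.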


