EC2 Durable Storage Solutions Explained

Chapter 5 of AWS Cloud Practitioner Essentials covers various storage and database solutions offered by AWS, including Amazon EBS for block storage, Amazon S3 for object storage, and Amazon RDS for relational databases. It highlights the differences between these services and their ideal use cases, such as EBS for active read/write operations and S3 for static file storage. Additionally, it discusses other database services like DynamoDB, Redshift, and specialized databases, along with migration services and caching solutions.

Uploaded by

lo.m.are.spi.o

AWS Cloud Practitioner Essentials

Chapter 5 : Storage and Databases


Video 1 : Introduction
Now that your AWS environment is scalable, secure, and global, it's time to store data properly and keep
track of customer activity, for example to run a loyalty program.

This requires two key components:

1. Storage: For files, logs, images, backups, etc.

2. Databases: For customer data, orders, and transactional records.

Since different data types and usage scenarios exist, AWS provides many purpose-built storage and
database services to help you architect the best solution for each use case.

Video 2 : Instance Stores and Amazon Elastic Block Store (Amazon EBS)
EC2 Instance Store:

• Temporary storage physically attached to the host machine.

• Fast, but non-persistent: Data is lost if the instance is stopped or terminated.

• Use case: temporary files, caches, scratch data.

Amazon Elastic Block Store (EBS):

• Persistent block-level storage for EC2.

• EBS volumes are independent of EC2 lifecycle.

• Data remains intact after stopping/starting instances.

• Types & sizes can be chosen based on workload.

• Supports snapshots (incremental backups) for recovery.

Use EBS when you need data durability for applications, databases, or OS storage.
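
The difference between the two storage types can be sketched with a toy model. This is only an illustration of the lifecycle rules described above, not AWS code: the `Instance` and `EbsVolume` classes, the paths, and the stored values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EbsVolume:
    """Toy stand-in for an EBS volume: its data outlives any instance."""
    blocks: dict = field(default_factory=dict)


@dataclass
class Instance:
    """Toy stand-in for an EC2 instance with both kinds of storage."""
    instance_store: dict = field(default_factory=dict)  # tied to the physical host
    ebs: EbsVolume = field(default_factory=EbsVolume)   # separate, attached resource
    running: bool = True

    def stop(self) -> None:
        # Stopping releases the host, so instance store contents are lost.
        self.running = False
        self.instance_store = {}

    def start(self) -> None:
        # The instance may start on a different host; the EBS volume is unchanged.
        self.running = True


server = Instance()
server.instance_store["/tmp/cache"] = "scratch data"
server.ebs.blocks["/var/db"] = "customer orders"

server.stop()
server.start()

print(server.instance_store)  # {}
print(server.ebs.blocks)      # {'/var/db': 'customer orders'}
```

After the stop/start cycle only the data written to the modeled EBS volume is still there, which is why scratch data belongs on the instance store and anything you need to keep belongs on EBS.
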
Video 3 : Amazon Simple Storage Service (Amazon S3)
Overview:

• Object storage for virtually unlimited data.

• Store files like images, videos, logs, documents.

• Data stored as objects in buckets.

• Max object size = 5 TB.

Features:

• Versioning: Retain older versions of files.

• Access Control: Define who can read/write objects.

• Storage Classes:

o S3 Standard: General purpose, frequent access. Offers 11 9s (99.999999999%) durability.

o S3 Standard-IA: Infrequent access, cost-optimized.

o S3 Glacier Flexible Retrieval: Archival data, multiple retrieval speeds.

o S3 Glacier Deep Archive: Cheapest, for long-term storage.

o S3 One Zone-IA: Cheaper, but stored in one AZ.

Lifecycle Policies:

• Automate data transition between storage classes.

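• Example: Keep an object in S3 Standard for 90 days, move it to S3 Standard-IA for the next 30 days, then transition it to S3 Glacier Flexible Retrieval.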
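
As a rough sketch, the example rule above could be written as an S3 lifecycle configuration like the one below. The rule ID and the helper name `build_lifecycle_rule` are made up for illustration, and the empty prefix (apply to every object) is an assumption. Transition days count from object creation, so "30 more days in IA" becomes day 120.

```python
import json


def build_lifecycle_rule(days_in_standard: int, days_in_ia: int) -> dict:
    """Build a lifecycle configuration: Standard -> Standard-IA -> Glacier."""
    return {
        "Rules": [
            {
                "ID": "archive-old-objects",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},     # assumption: rule covers every object
                "Transitions": [
                    # Days are measured from the object's creation date.
                    {"Days": days_in_standard, "StorageClass": "STANDARD_IA"},
                    {"Days": days_in_standard + days_in_ia, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


lifecycle = build_lifecycle_rule(days_in_standard=90, days_in_ia=30)
print(json.dumps(lifecycle, indent=2))
```

The sketch only builds and prints the configuration. With boto3 installed and credentials configured, a dictionary of this shape could be passed as `LifecycleConfiguration` to the S3 client's `put_bucket_lifecycle_configuration` call.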

Bonus:

• Host static websites directly from S3 buckets.

Comparing Amazon EBS and Amazon S3:

Amazon EBS is block storage attached to EC2. It is durable and well suited to active read/write use cases,
such as editing large files. It supports delta updates: only the changed blocks are rewritten.

Amazon S3 is object storage with virtually unlimited capacity. It is well suited to data that changes
infrequently, backups, and static hosting. Each object is stored whole; changing any part of it means
uploading the whole object again.

Round 1 (S3 Wins): A photo analysis website with millions of viewable images. S3 offers web-enabled URLs,
cost savings, no EC2 needed, and 11 9s durability.

Round 2 (EBS Wins): Editing an 80 GB video file. EBS updates only the changed blocks. In S3, the whole file
would need reuploading each time.

Conclusion: Use S3 for static, infrequently changed files. Use EBS for complex read/write operations. Your
use case decides the winner!
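
A back-of-the-envelope calculation makes Round 2 concrete. The 4 KiB block size and the 10 MiB edit are assumptions picked for illustration; only the 80 GB file size comes from the notes (treated here as 80 GiB).

```python
BLOCK_SIZE = 4 * 1024  # assumed block size in bytes (illustrative)
GIB = 1024 ** 3


def bytes_written_block_storage(changed_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Block storage rewrites only the blocks touched by the edit (best case: aligned)."""
    blocks_touched = (changed_bytes + block_size - 1) // block_size  # integer ceiling
    return blocks_touched * block_size


def bytes_written_object_storage(object_size: int) -> int:
    """Object storage replaces the object as a whole, whatever the edit size."""
    return object_size


video_size = 80 * GIB
edit_size = 10 * 1024 * 1024  # hypothetical 10 MiB edit

ebs_bytes = bytes_written_block_storage(edit_size)
s3_bytes = bytes_written_object_storage(video_size)

print(ebs_bytes)              # 10485760
print(s3_bytes)               # 85899345920
print(s3_bytes // ebs_bytes)  # 8192
```

Under these assumptions each small edit costs about 8,000 times more data transfer on object storage than on block storage, which is the reason EBS wins for files that are edited in place.
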
Video 4 : Amazon Elastic File System (Amazon EFS)
Shared File System:

• Fully managed, scalable file system.

• Multiple EC2 instances can read/write simultaneously.

• Ideal for shared storage and Linux-based workloads.

Differences from EBS:

• EBS: A volume attaches to one EC2 instance at a time (by default), and the instance must be in the same AZ as the volume.

• EFS: Multiple EC2 instances in any AZ of the Region can connect at the same time.

• Scales automatically as you write more data.

• More suitable for workloads needing parallel access.

Video 5 : Amazon Relational Database Service (Amazon RDS)


Relational Databases:

• Store structured data using tables.

• Supports relationships between data (e.g., customers & orders).

• Use SQL to query data.
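
A small sketch of the customers-and-orders relationship, using Python's built-in SQLite module as a stand-in for a relational engine. The table layout, names, and amounts are invented for illustration; on RDS the same SQL ideas apply to MySQL, PostgreSQL, and the other supported engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 4.50), (11, 1, 3.25), (12, 2, 6.00);
""")

# The JOIN follows the relationship from each order back to its customer.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
""").fetchall()
conn.close()

print(rows)  # [('Ana', 2, 7.75), ('Ben', 1, 6.0)]
```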

Amazon RDS:

• Managed service for relational databases.

• Supports: MySQL, PostgreSQL, SQL Server, Oracle, MariaDB.

• Handles patching, backups, scaling, failover automatically.

Amazon Aurora:

• High-performance relational database engine offered through Amazon RDS.

• Compatible with MySQL and PostgreSQL.

• Priced at about 1/10th the cost of commercial-grade databases.

• Keeps 6 copies of your data across three AZs.

• Supports up to 15 read replicas for scalability.

• Continuous backups to S3 + point-in-time recovery.


Video 6 : Amazon DynamoDB
NoSQL Database:

• Fully managed, serverless, and non-relational.

• Data stored as items with attributes in a table.

• Schema-less: Each item can have different attributes.
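
The item-and-attribute model can be sketched with plain Python dictionaries. This is a toy stand-in, not the DynamoDB API: `put_item` and `get_item` here are local helper functions, and the `employee_id` key and sample attributes are invented.

```python
from typing import Optional

# A dict keyed by the primary key stands in for a table.
table = {}


def put_item(item: dict) -> None:
    """Store an item under its key; no other attribute is required."""
    table[item["employee_id"]] = item


def get_item(employee_id: str) -> Optional[dict]:
    """Look an item up directly by key, without scanning or joining."""
    return table.get(employee_id)


# Two items in the same table with different sets of attributes.
put_item({"employee_id": "e-100", "name": "Ana", "office": "Lisbon"})
put_item({"employee_id": "e-101", "name": "Ben", "skills": ["sql", "go"], "remote": True})

print(sorted(get_item("e-100")))    # ['employee_id', 'name', 'office']
print(get_item("e-101")["skills"])  # ['sql', 'go']
```

Only the key is mandatory on every item. Attributes can be added to new items at any time without changing a table-wide schema.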

High Performance:

• Single-digit millisecond latency, even at massive scale.

• Built-in redundancy across AZs.

• Automatically scales.

Use Cases:

• Real-time apps, gaming, mobile apps.

• Where relational DBs struggle with flexibility/performance.

• Example: Prime Day 2019 saw 7.11 trillion API calls to DynamoDB.

Comparing Amazon RDS and Amazon DynamoDB:

Amazon RDS is ideal for structured, relational data that requires SQL queries and relationships across
tables. Great for analytics and business logic.

DynamoDB is great for speed and flexibility. Handles large-scale, single-table access patterns where
relationships aren't critical.

Round 1 (RDS Wins): Sales supply chain analysis with complex joins. RDS excels here.

Round 2 (DynamoDB Wins): Employee directory with flat data. RDS features add overhead, while
DynamoDB offers speed and simplicity.

Conclusion: Choose RDS for relationship-heavy apps. Choose DynamoDB for simple, high-throughput, and
flexible workloads.
Video 7 : Amazon Redshift
Data Warehousing:

• For analyzing large volumes of historical data.

• Ideal for Business Intelligence (BI) and analytics.

Features:

• Supports petabyte-scale data.

• Run SQL queries on structured and semi-structured data.

• Redshift Spectrum allows querying directly from S3 data lakes.

• Up to 10x the performance of traditional data warehouses.

• Fully managed: AWS operates the underlying infrastructure for you.

Use When:

• You need to analyze trends, create reports, or perform large-scale data analysis.

Video 8 : AWS Database Migration Service


Migrate Databases Easily:

• Supports migration from on-premises or cloud to AWS.

• Minimal downtime: Source stays operational during migration.

Types of Migrations:

1. Homogeneous: Same DB engines (e.g., MySQL → RDS MySQL).

2. Heterogeneous: Different engines (e.g., Oracle → Aurora). This is a two-step process: the AWS Schema Conversion Tool first converts the schema and code, then DMS migrates the data.
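
The two migration types can be told apart with a simple check, sketched below. The function names and the engine strings are illustrative assumptions, not DMS API calls. The sketch only models the decision of whether a schema conversion step is needed first.

```python
def migration_type(source_engine: str, target_engine: str) -> str:
    """Classify a migration by comparing source and target engines."""
    same = source_engine.strip().lower() == target_engine.strip().lower()
    return "homogeneous" if same else "heterogeneous"


def migration_steps(source_engine: str, target_engine: str) -> list:
    """List the high-level steps for a migration between two engines."""
    steps = []
    if migration_type(source_engine, target_engine) == "heterogeneous":
        # Different engines: the schema and code must be converted first.
        steps.append("convert schema with AWS Schema Conversion Tool")
    steps.append("migrate data with AWS DMS")
    return steps


print(migration_steps("MySQL", "mysql"))
# ['migrate data with AWS DMS']
print(migration_steps("Oracle", "aurora-postgresql"))
# ['convert schema with AWS Schema Conversion Tool', 'migrate data with AWS DMS']
```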

Other Use Cases:

• Test/dev migrations

• Database consolidation

• Continuous replication for DR or cross-region availability


Video 9 : Additional Database Services
DocumentDB:

• For document-oriented data (e.g., content, profiles).

• Great for content management systems.

Neptune:

• A graph database.

• Good for social networks, recommendations, fraud detection.

Amazon QLDB:

• Immutable ledger database.

• Tamper-proof, append-only record system.

• Ideal for audits, banking, compliance.
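
The idea behind an append-only, tamper-evident record can be illustrated with a hash chain, where each entry's hash covers the previous entry's hash. This is a toy sketch of the general technique and not QLDB's actual implementation or API; the `Ledger` class and the sample records are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class Ledger:
    def __init__(self) -> None:
        self._entries = []

    def append(self, record: dict) -> str:
        """Add a record to the end; existing entries are never rewritten."""
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        digest = entry_hash(prev, record)
        self._entries.append({"record": record, "prev": prev, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute every hash; any edit to history breaks the chain."""
        prev = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
                return False
            prev = entry["hash"]
        return True


ledger = Ledger()
ledger.append({"account": "A", "amount": 100})
ledger.append({"account": "A", "amount": -40})
print(ledger.verify())  # True

# Simulate tampering with an old record behind the ledger's back.
ledger._entries[0]["record"]["amount"] = 1_000_000
print(ledger.verify())  # False
```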

Amazon Managed Blockchain:

• Managed service for creating and running blockchain networks, which keep a decentralized, shared ledger.

• Use cases: multi-party business networks, supply chains.

Performance Boosters:

• ElastiCache: In-memory cache (Redis/Memcached) for faster reads.

• DAX (DynamoDB Accelerator): Caching for DynamoDB.

Video 10 : Summary
Key Takeaways:

• Amazon EBS: Persistent, block-level storage for EC2.

• Amazon S3: Object storage with tiers and website hosting.

• Amazon EFS: Shared, scalable file system for Linux EC2s.

• Amazon RDS/Aurora: Managed relational databases.

• Amazon DynamoDB: Serverless, NoSQL database.

• Amazon Redshift: Data warehouse for analytics.

• AWS DMS: Easy, low-downtime database migration.

• Specialty Databases: DocumentDB, Neptune, QLDB, Managed Blockchain.

• Caching: ElastiCache, DAX.
