AWS Quick Notes
Implement Elasticity
The cloud brings a new concept of elasticity in your applications. Elasticity can be
implemented in 3 ways:
1. Proactive Cyclic Scaling: Periodic scaling that occurs at a fixed interval (daily,
weekly, monthly, quarterly)
2. Proactive Event-based Scaling: Scaling just when you are expecting a big surge
of traffic requests due to a scheduled business event (new product launch,
marketing campaigns)
3. Auto-scaling based on demand: By using a monitoring service, your system can
send triggers to take appropriate actions so that it scales up or down based
on metrics (utilization of the servers or network I/O)
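As a rough illustration of the third option, the sketch below turns a single CPU metric into a scaling decision. The function name, the thresholds and the size limits are all hypothetical; a real setup would express this as a monitoring alarm plus a scaling policy rather than hand-written code.

```python
def desired_capacity(current, cpu_percent, low=30.0, high=70.0,
                     min_size=1, max_size=10):
    """Return the new instance count for a simple threshold policy.

    All thresholds and limits are illustrative assumptions.
    """
    if cpu_percent > high:
        target = current + 1   # scale out under load
    elif cpu_percent < low:
        target = current - 1   # scale in when mostly idle
    else:
        target = current       # inside the comfort band: do nothing
    # Never leave the configured group boundaries.
    return max(min_size, min(max_size, target))


print(desired_capacity(2, 85.0))
print(desired_capacity(2, 10.0))
```

The clamp at the end is what keeps a runaway metric from scaling the group past its limits.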
Secure your applications
Only permit what is needed, nothing else.
Shut down services you don't use.
Five pillars:
Operational excellence
Security
Reliability
Performance efficiency
Cost optimization
Pillar 1 Security
Design Principles
Apply security at all layers.
Enable traceability.
Automate responses to security events.
Focus on securing your system.
Automate security best practices.
Definition
Data Protection
Before architecting any system, foundational practices that influence security should be in
place. For example, data classification provides a way to categorize organizational data based
on levels of sensitivity, and encryption protects data by way of rendering it unintelligible to
unauthorized access. These methods are important because they support objectives such as
preventing financial loss or complying with regulatory obligations.
In AWS the following practices help to protect your data:
Privilege Management
Privilege Management ensures that only authorized and authenticated users are able to access
your resources, and only in a manner that is intended. It can include:
How are you protecting access to and use of the AWS root account credentials?
How are you defining roles and responsibilities of system users to control human
access to the AWS Management Console and APIs?
How are you limiting automated access (such as from applications, scripts, or 3rd
party tools or services) to AWS resources?
How are you managing keys and credentials?
Infrastructure Protection
Outside of the cloud, this is how you protect your data centre: RFID controls, security,
lockable cabinets, CCTV, etc. Within AWS, Amazon handles this for you, so infrastructure
protection exists at a VPC level.
You can use detective controls to detect or identify a security breach. AWS services to
achieve this include:
AWS CloudTrail
AWS CloudWatch
AWS Config
AWS S3
AWS Glacier
Pillar 2 Reliability
The reliability pillar encompasses the ability of a system to recover from infrastructure or
service disruptions, dynamically acquire computing resources to meet demand, and mitigate
disruptions such as misconfigurations or transient network issues.
Design Principles
Test recovery procedures.
Automatically recover from failure (KPI).
Scale horizontally to increase aggregate system availability.
Stop guessing capacity.
Definition
Before architecting any system, you need to make sure you have the prerequisite foundations.
With AWS, most of the foundations are handled for you. The cloud is designed to be
essentially limitless, meaning that AWS handles the networking and compute requirements
itself. However, AWS does set service limits to stop customers from accidentally
over-provisioning resources.
How are you managing AWS service limits for your account?
How are you planning your network topology on AWS?
Do you have an escalation path to deal with technical issues?
Change Management
You need to be aware of how change affects a system so that you can plan proactively around
it. Monitoring allows you to detect any changes to your environment and react.
With AWS, you can use CloudWatch to monitor your environment and services such as
autoscaling to automate change in response to your production environment.
Failure Management
With cloud, you should always architect your systems on the assumption that failure will
occur. You should become aware of these failures, how they occurred and how to respond to
them, and then plan how to prevent them from happening again.
Pillar 3 Performance Efficiency
Design Principles
Democratize advanced technologies.
Go global in minutes.
Use server-less architectures.
Experiment more often.
Definition
Compute
When architecting your use of compute, you should take advantage of elasticity mechanisms
that can ensure that you have sufficient capacity to sustain performance as demand changes.
You should also make sure to choose the right kind of server for your needs.
How do you select the appropriate instance type for your system?
How do you ensure that you continue to have the most appropriate instance type as
new instance types and features are introduced?
How do you monitor your instances post-launch to ensure they are performing as
expected?
How do you ensure that the number of your instances matches demand?
Storage
The optimal storage solution for a particular system varies based on the kind of access
method (block, file, or object) you use, patterns of access (random or sequential), throughput
required, frequency of access (online, offline, archival), frequency of update (WORM,
dynamic), and availability and durability constraints.
In AWS storage is virtualized, and there are a number of different storage types. This makes
it easier to match your storage methods more closely with your needs, and it also offers
storage options that are not easily achievable with on-premises infrastructure.
How do you select the appropriate storage solution for your system?
How do you ensure that you continue to have the most appropriate storage solution
as new storage solution features are launched?
How do you monitor your storage solution to ensure it is performing as expected?
How do you ensure that the capacity and throughput of your storage solutions
match demand?
Database
The optimal database solution for a particular system can vary based on your requirements
for availability, consistency, partition tolerance, latency, durability, scalability, and query
capability. Many systems use different database solutions for different subsystems and enable
different features to improve performance.
How do you select the appropriate database solution for your system?
How do you ensure that you continue to have the most appropriate database
solution and features as new database solutions and features are launched?
How do you monitor your databases to ensure performance is as expected?
How do you ensure the capacity and throughput of your databases match demand?
Space-Time trade-off
Using AWS you can use services such as RDS to add read replicas, reducing the load on your
database and creating multiple copies of the database. This helps to lower latency.
You can use Direct Connect to provide predictable latency between your HQ and AWS.
You can use the global infrastructure to have multiple copies of your environment, in regions
that are closest to your customer base. You can also use caching services such as ElastiCache
or CloudFront to reduce latency.
How do you select the appropriate proximity and caching solutions for your system?
How do you ensure that you continue to have the most appropriate proximity and
caching solutions as new solutions are launched?
How do you monitor your proximity and caching solutions to ensure performance is
as expected?
How do you ensure that the proximity and caching solutions you have match
demand?
Pillar 4 Cost Optimization
Design Principles
Transparently attribute expenditure.
Use managed services to reduce the cost of ownership.
Trade capital expense for operating expense.
Benefit from economies of scale.
Stop spending money on data centre operations.
Definition
Try to optimally align supply with demand. Don't over-provision or under-provision; instead,
as demand grows, so should your supply of computing resources. Think of things like
Auto Scaling, which scales with demand.
Similarly in a server-less context, use services such as Lambda that only execute when a
request comes in.
Services such as CloudWatch can also help you keep track of what your demand is.
How do you make sure your capacity matches but does not substantially exceed
what you need?
How are you optimizing your usage of AWS services?
Cost-effective resources
Using the correct instance type can be key to cost savings. For example, you might have a
reporting process that is running on a t2.micro and it takes 7 hours to complete. That same
process could be run on an m4.2xlarge in a matter of minutes. The result remains the same
but the t2.micro is more expensive because it ran for longer.
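The arithmetic behind the reporting-job example can be sketched as below. The hourly prices and the 5-minute runtime are illustrative assumptions, not current AWS pricing; the point is only that total cost is price multiplied by runtime, so a cheaper instance that runs much longer can still cost more.

```python
def run_cost(hourly_price_usd, runtime_hours):
    """Total cost of one job run: price per hour times hours used."""
    return hourly_price_usd * runtime_hours


# Assumed figures for illustration only.
small_instance_cost = run_cost(hourly_price_usd=0.0116, runtime_hours=7.0)
large_instance_cost = run_cost(hourly_price_usd=0.40, runtime_hours=5 / 60)

print(f"small instance, 7 hours:   ${small_instance_cost:.4f}")
print(f"large instance, 5 minutes: ${large_instance_cost:.4f}")
```

With these assumed numbers the small instance ends up the more expensive run, which is the effect the example describes.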
A well-architected system will use the most cost-efficient resources to reach the end business
goal.
Have you selected the appropriate resource types to meet your cost targets?
Have you selected the appropriate pricing model to meet your cost targets?
Are there managed services (higher-level services than Amazon EC2, Amazon EBS,
and Amazon S3) that you can use to improve your ROI (return on investment)?
Expenditure Awareness
With cloud, you no longer have to go out and get quotes on physical servers, choose a
supplier, have those resources delivered, installed, made available etc. You can provision
things within seconds, however, this comes with its own issues.
Many organizations have different teams, each with their own AWS accounts. Being aware
of what each team is spending and where is crucial to any well-architected system.
You can track this with cost allocation tags, billing alerts and consolidated billing.
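To make the idea of cost allocation tags concrete, here is a minimal sketch that groups spend by a tag. The line items, tag keys and amounts are invented for illustration; real figures would come from your billing reports.

```python
from collections import defaultdict

# Hypothetical billing line items: (resource tags, monthly cost in USD).
line_items = [
    ({"team": "search", "env": "prod"}, 120.0),
    ({"team": "search", "env": "dev"}, 30.0),
    ({"team": "billing", "env": "prod"}, 75.5),
    ({}, 12.0),  # an untagged resource
]


def spend_by_tag(items, tag_key):
    """Sum cost per value of tag_key; resources without the tag are grouped as 'untagged'."""
    totals = defaultdict(float)
    for tags, cost in items:
        totals[tags.get(tag_key, "untagged")] += cost
    return dict(totals)


by_team = spend_by_tag(line_items, "team")
print(by_team)
```

The "untagged" bucket is worth watching: spend that lands there cannot be attributed to any team.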
What access control and procedures do you have in place to govern your AWS costs?
How are you monitoring usage and spending?
How do you decommission resources that you no longer need, or stop resources that
are temporarily not needed?
How do you consider data-transfer charges when designing your architecture?
AWS moves fast. There are hundreds of new services (and potentially 1000 new services
this year). A service that you chose yesterday may not be the best service to be using today.
For example, consider MySQL on RDS: Aurora was launched at re:Invent 2014 and is now out
of preview. Aurora may be a better option now for your business because of its performance
and redundancy.
You should keep track of the changes made to AWS and constantly re-evaluate your existing
architecture. You can do this by subscribing to the AWS blog and by using services such as
Trusted Advisor.
Pillar 5 Operational Excellence
Operational excellence covers how planned changes are executed, as well as responses to
unexpected operational events.
Change execution and responses should be automated. All processes and procedures of
operational excellence should be documented, tested and regularly reviewed.
Design Principles
Perform operations with code.
Align operations processes to business objectives.
Make regular, small, incremental changes.
Test for responses to unexpected events.
Learn from operational events and failures.
Keep operations procedures current.
Definition
There are three best practice areas of Operational Excellence in the cloud:
Preparation
Runbooks: operations guidance that operations teams can refer to so they can perform
normal daily tasks.
In AWS there are several methods, services and features that can be used to support
operational readiness and the ability to prepare for normal day-to-day operations as well as
unexpected operational events:
It is also important to use features like tagging to make sure all resources in a
workload can be easily identified when needed during operations and responses.
Document everything:
Be sure that documentation doesn't become stale or out of date as procedures change.
If documentation is not updated and tested regularly, it will not be useful when
unexpected operational events occur. If workloads are not reviewed before
production, operations will be affected when undetected issues occur.
If resources are not documented, when operational events occur, determining how to
respond will be more difficult while the correct resources are identified.
Operation
Operations should be standardized and manageable on a routine basis. The focus should be
on automation, small frequent changes, regular QA testing, and defined mechanisms to track,
audit, roll back and review changes. Changes should not be large and infrequent, they should
not require scheduled downtime, and they should not require manual execution. A wide range
of logs and metrics that are based on key operational indicators for a workload should be
collected and reviewed to ensure continuous operations.
In AWS you can set up a continuous integration / continuous deployment (CI/CD) pipeline.
Release management processes, whether manual or automated, should be tested and be based
on small incremental changes, and tracked versions. You should be able to revert changes
that introduce operational issues without causing any operational impact.
Rollbacks are more difficult in large changes, and failing to have a rollback plan or the ability
to mitigate failure impacts will prevent continuity of operations.
Align monitoring to business needs, so that the responses are effective at maintaining
business continuity. Monitoring that is ad hoc and not centralized, with responses that are
manual, will cause more impact to operations during unexpected events.
How are you evolving your workload while minimizing the impact of change?
How do you monitor your workload to ensure it is operating as expected?
Response
Responses to unexpected operational events should be automated. This is not just for alerting
but also for mitigation, remediation, rollback and recovery.
Alerts should be timely and should invoke escalations when responses are not adequate to
mitigate the impact of operational events.
Responses should follow a pre-defined playbook that involves stakeholders, the escalation
process and procedures. Escalation paths should be defined and include both functional and
hierarchical escalation capabilities. Hierarchical escalation should be automated and escalated
priority should result in stakeholder notifications.
SQS
Amazon Simple Queue Service (SQS) is a fully managed message queuing service that
enables you to decouple and scale microservices, distributed systems, and serverless
applications.
It's basically a message queue service, like Kafka for example. Producers add messages to
the queue, and consumers then poll the queue for them.
Retention can go up to 14 days.
It's a pull-based system.
Types of queues:
o Standard (default): All queues are standard by default. Ordering is best-effort,
so standard queues don't guarantee ordering.
o FIFO (first in, first out): These queues guarantee ordering.
You can get duplicates if you don't manage your visibility timeouts well.
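The link between visibility timeouts and duplicates can be shown with a small in-memory model. This is a toy stand-in, not the SQS API: the class, its methods and the explicit `now` argument are hypothetical and exist only to make the timing visible.

```python
class TinyQueue:
    """Toy model of a pull-based queue with a visibility timeout (not the SQS API)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}          # message id -> body
        self._invisible_until = {}   # message id -> time it becomes visible again
        self._next_id = 0

    def send(self, body):
        self._messages[self._next_id] = body
        self._invisible_until[self._next_id] = 0.0
        self._next_id += 1

    def receive(self, now):
        """Return (id, body) of one visible message and hide it, or None."""
        for message_id, body in self._messages.items():
            if self._invisible_until[message_id] <= now:
                self._invisible_until[message_id] = now + self.visibility_timeout
                return message_id, body
        return None

    def delete(self, message_id):
        """Consumers must delete a message once they have finished processing it."""
        self._messages.pop(message_id, None)
        self._invisible_until.pop(message_id, None)


queue = TinyQueue(visibility_timeout=30)
queue.send("resize image 42")

first = queue.receive(now=0)     # consumer A picks the message up
during = queue.receive(now=10)   # still hidden: nobody else can see it
second = queue.receive(now=45)   # A never deleted it, so consumer B gets the same message
queue.delete(second[0])
after = queue.receive(now=100)   # gone for good once deleted
```

If consumer A needs more than 30 time units here, the message reappears and is processed twice, so the timeout should comfortably exceed the expected processing time.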
SWF
Amazon SWF (Simple Workflow Service) is an Amazon Web Services tool that helps
developers coordinate, track and audit multi-step, multi-machine application jobs.
SNS
Amazon SNS enables message filtering and fan out to a large number of subscribers,
including serverless functions, queues, and distributed systems. Additionally, Amazon SNS
fans out notifications to end users via mobile push messages, SMS, and email.
Elastic Transcoder
Amazon Elastic Transcoder is a media transcoding service.
API Gateway
Amazon API Gateway is a fully managed service that makes it easy for developers to create,
publish, maintain, monitor, and secure APIs at any scale.
Kinesis
Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so
you can get timely insights and react quickly to new information.
Kinesis Streams: A stream consists of shards, and the data in it is stored for 24h by
default, up to 7 days. You can have multiple shards in each stream. Consumers then read
from the shards and process the data.
Kinesis Firehose: It doesn't have shards, so data simply traverses the Firehose. Data has to
be processed immediately with Lambda or stored in S3, for example.
o It can stream data to Amazon S3, Amazon Redshift, Amazon Elasticsearch
Service, and Splunk.
Kinesis Analytics: It allows you to analyze the data that exists in Kinesis Firehose or
Streams.
Databases 101
You won't be questioned in the exam about the next sections (up to AWS Databases). It's just
for your knowledge and very good for better understanding the course.
A relational database is a type of database. It uses a structure that allows us to identify and
access data in relation to another piece of data in the database. Often, data in a relational
database is organized into tables.
A non-relational database is any database that does not follow the relational model provided
by traditional relational database management systems. This category of databases, also
referred to as NoSQL databases, has seen steady adoption growth in recent years with the rise
of Big Data applications.
What is data warehousing
A data warehouse is a system used for reporting and data analysis, and is considered a core
component of business intelligence.
We can divide IT systems into transactional (OLTP) and analytical (OLAP). In general, we
can assume that OLTP systems provide source data to data warehouses, whereas OLAP
systems help to analyze it.
AWS Databases
Relational ones
SQL Server
MySQL
PostgreSQL
Oracle
Aurora
MariaDB
DynamoDB - NoSQL
Redshift - OLAP
Automated Backups
When you restore a backup or a snapshot, the restored version of the database will be in a
new RDS instance, with a new DNS name.
Encryption: You can encrypt your Amazon RDS DB instances and snapshots at rest by
enabling the encryption option on your Amazon RDS DB instances.
RDS Multi-AZ
When you provision a Multi-AZ DB Instance, Amazon RDS automatically creates a primary
DB Instance and synchronously replicates the data to a standby instance in a different
Availability Zone.
Read replicas make it easy to elastically scale out beyond the capacity constraints of a single
DB Instance for read-heavy database workloads. A read replica operates as a DB instance
that allows only read-only connections.
DynamoDB
Redshift
Amazon Redshift is a fast, scalable data warehouse that makes it simple and cost-effective to
analyze all your data across your data warehouse.
Redshift stores data by columns. By storing data in columns rather than rows, the database can
more precisely access the data it needs to answer a query rather than scanning and discarding
unwanted data in rows. Query performance is increased for certain workloads.
Columnar data stores can be compressed much more than row-based data.
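A tiny sketch of the row versus column layout may help here. The table and its values are made up; the only point is that an aggregate over one field needs a single contiguous column in the columnar layout, whereas the row layout touches every record.

```python
# A made-up orders table in row layout: one dict per record.
rows = [
    {"order_id": 1, "region": "eu", "amount": 20.0},
    {"order_id": 2, "region": "us", "amount": 35.5},
    {"order_id": 3, "region": "eu", "amount": 12.5},
]

# Row layout: every whole record is visited just to read one field.
total_from_rows = sum(row["amount"] for row in rows)

# Column layout: each column is stored as its own contiguous list.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# The query only needs the "amount" column; the others are never read.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Values within one column share a type and often repeat (see `columns["region"]`), which is also why columnar data tends to compress well.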
Single node: You can start with a single node (Up to 160Gb of ram in one node)
Multi-Node:
o You are charged for the total number of hours you run across all your
compute nodes.
o You are charged for backups
o You are also charged for data transfer within a VPC
Security:
o Is available only in 1 AZ
o Can restore the snapshot to new AZ's in the event of an outage.
ElastiCache
Types of ElastiCache:
o Memcached
o Redis: In memory key-value store
RDS Aurora
Amazon Aurora is a MySQL and PostgreSQL-compatible relational database built for the
cloud, that combines the performance and availability of traditional enterprise databases with
the simplicity and cost-effectiveness of open source databases.
Scaling:
o Scales in 10GB Increments up to 64TB
o Compute resources can scale up to 32vCPUs and 244GB of Memory.
o 2 Copies of your data are contained in each availability zone, with a minimum
of 3 availability zones.
o Aurora handles the loss of up to two copies of data without affecting database
write capability.
o Aurora handles the loss of up to three copies of data without affecting database
read capability.
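The copy counts above can be turned into a quick check. The sketch below only encodes the figures stated in these notes (2 copies in each of 3 AZs, writes survive the loss of 2 copies, reads survive the loss of 3); the function name is hypothetical.

```python
COPIES_PER_AZ = 2
AZ_COUNT = 3
TOTAL_COPIES = COPIES_PER_AZ * AZ_COUNT   # 6 copies in total

MAX_LOST_FOR_WRITES = 2   # figure taken from the notes
MAX_LOST_FOR_READS = 3    # figure taken from the notes


def availability(copies_lost):
    """Report whether writes and reads are still possible after losing copies."""
    if not 0 <= copies_lost <= TOTAL_COPIES:
        raise ValueError("copies_lost must be between 0 and 6")
    return {
        "writes": copies_lost <= MAX_LOST_FOR_WRITES,
        "reads": copies_lost <= MAX_LOST_FOR_READS,
    }


for lost in range(5):
    print(lost, availability(lost))
```

One consequence: losing a whole AZ removes exactly 2 copies, so by these numbers both reads and writes keep working.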
Replicas:
EC2
What's EC2
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure,
resizable compute capacity in the cloud. It is designed to make web-scale cloud computing
easier for developers.
EC2 Options
On demand: You pay for computing capacity per hour or per second, depending on
which instances you run.
Reserved Instance (RI): Provide a significant discount (up to 75%) compared to On-
Demand pricing and provide a capacity reservation when used in a specific
Availability Zone. You have to enter a contract.
Spot: Amazon EC2 Spot instances allow you to request spare Amazon EC2
computing capacity for up to 90% off the On-Demand price.
o If you terminate an instance, you will pay for the complete hour.
o If Amazon terminates the instance, you won't pay for the complete hour.
o The following are the possible reasons that Amazon EC2 might interrupt your
Spot Instances:
Price – The Spot price is greater than your maximum price.
Capacity – If there are not enough unused EC2 instances to meet the
demand for Spot Instances, Amazon EC2 interrupts Spot Instances.
The order in which the instances are interrupted is determined by
Amazon EC2.
Constraints – If your request includes a constraint such as a launch
group or an Availability Zone group, these Spot Instances are
terminated as a group when the constraint can no longer be met.
Dedicated Hosts: A physical server with EC2 instance capacity fully dedicated to
your use.
What's EBS
Amazon EBS provides persistent block storage volumes for use with Amazon EC2 instances in
the AWS Cloud. Each EBS volume is automatically replicated within its Availability Zone.
SSD
General Purpose SSD (GP2): General purpose SSD volume that balances price and
performance. 3 IOPS per GB, up to 10k IOPS
Provisioned IOPS SSD (io1): Designed for I/O-intensive workloads; use it if you need more
than 10k IOPS.
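The GP2 rule of thumb can be written as a one-line formula. The sketch uses only the figures quoted in these notes (3 IOPS per GB, capped at 10k); AWS limits change over time, so treat the constants as assumptions and check the current documentation.

```python
GP2_IOPS_PER_GB = 3      # figure from the notes
GP2_IOPS_CAP = 10_000    # figure from the notes; may be outdated


def gp2_baseline_iops(size_gb):
    """Baseline IOPS for a GP2 volume of the given size, per the notes' figures."""
    return min(size_gb * GP2_IOPS_PER_GB, GP2_IOPS_CAP)


def needs_provisioned_iops(required_iops):
    """True when the requirement exceeds what GP2 can offer under these figures."""
    return required_iops > GP2_IOPS_CAP


print(gp2_baseline_iops(100))
print(gp2_baseline_iops(5000))
print(needs_provisioned_iops(20_000))
```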
HDD (Magnetic)
HDD volumes are great for sequential access (for example, processing log files or big data workflows).
Throughput Optimized HDD (st1): Magnetic disk, this can't be a boot volume (root
volume).
Cold HDD (sc1): Lower cost storage, like file servers, can't be a boot volume.
Magnetic (standard): Lowest cost per gigabyte of all EBS. It's bootable and it's from
the previous storage generation.
As you would do in a bare-metal server, you can also create RAID arrays within AWS EC2
boxes using EBS volumes.
o RAID 0 – splits ("stripes") data evenly across two or more disks. Use it when I/O
performance is more important than fault tolerance; for example, in a
heavily used database where data replication is already set up separately.
o RAID 1 – consists of an exact copy (or mirror) of a set of data on two or more
disks. Use it when fault tolerance is more important than I/O performance; for
example, in a critical application. With RAID 1 you get more data durability
in addition to the replication features of the AWS cloud.
o RAID 5 and RAID 6 are not recommended for Amazon EBS because the parity
write operations of these RAID modes consume some of the IOPS available to
your volumes. Depending on the configuration of your RAID array, these RAID
modes provide 20-30% fewer usable IOPS than a RAID 0 configuration.
Increased cost is a factor with these RAID modes as well; when using identical
volume sizes and speeds, a 2-volume RAID 0 array can outperform a 4-
volume RAID 6 array that costs twice as much.
Remember also that you can create snapshots of your RAID arrays
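The capacity and fault-tolerance trade-off between RAID 0 and RAID 1 can be sketched with two small helpers. The function names are hypothetical, and the model assumes identical volumes and ignores file system overhead.

```python
def raid_usable_gb(level, volume_gb, volumes):
    """Usable capacity of an array of identical volumes (RAID 0 or RAID 1 only)."""
    if volumes < 2:
        raise ValueError("an array needs at least two volumes")
    if level == 0:
        return volume_gb * volumes   # striping: capacities add up
    if level == 1:
        return volume_gb             # mirroring: every volume holds the same data
    raise ValueError("only RAID 0 and RAID 1 are modelled here")


def raid_tolerated_failures(level, volumes):
    """How many volumes can fail before data is lost."""
    if level == 0:
        return 0                     # losing any stripe loses the array
    if level == 1:
        return volumes - 1           # one surviving mirror is enough
    raise ValueError("only RAID 0 and RAID 1 are modelled here")


print(raid_usable_gb(0, 500, 2), raid_tolerated_failures(0, 2))
print(raid_usable_gb(1, 500, 2), raid_tolerated_failures(1, 2))
```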
A security group acts as a virtual firewall for your instance to control inbound and outbound
traffic.
AMI Types
EBS: Amazon EBS provides durable, block-level storage volumes that you can attach
to a running instance
o EBS takes less time to provision.
o EBS volumes can be kept once the instance is terminated.
Instance Store / Ephemeral storage: This storage is located on disks that are physically
attached to the host computer
o Instances backed by instance store can't be stopped; if the host fails, you lose data.
o You can reboot the instance without losing data.
o You cannot detach Instance Store Volumes.
o Instance store volumes cannot be kept once the instance is terminated.
Elastic Load Balancing automatically distributes incoming application traffic across multiple
targets, such as Amazon EC2 instances, and others.
Types of load balancers
Application Load Balancer: Best for load balancing HTTP & HTTPS traffic. It
operates at layer 7.
Network Load Balancer: Best for load balancing TCP traffic where extreme performance is
needed. It operates at layer 4.
Classic Load Balancer: Legacy ELB; mostly operates at layer 4, but can go up to layer 7.
X-Forwarded-For
The X-Forwarded-For (XFF) HTTP header field is a common method for identifying the
originating IP address of a client connecting to a web server through an HTTP proxy or load
balancer.
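On the web server side, reading the client address out of X-Forwarded-For might look like the sketch below. The helper name is hypothetical. By convention each proxy appends the address it received the request from, so the left-most entry is the original client; because the client can set the header itself, it should only be trusted when it comes through a load balancer or proxy you control.

```python
def client_ip(x_forwarded_for):
    """Return the left-most address from an X-Forwarded-For header value, or None."""
    parts = [part.strip() for part in x_forwarded_for.split(",") if part.strip()]
    return parts[0] if parts else None


# Example header as a backend might see it: client first, then one proxy hop.
print(client_ip("203.0.113.7, 10.0.0.12"))
```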
At the time of writing, the lab instructs you to create one EC2 instance and configure the load
balancer to simply point at it. My suggestion is to create 2 instances instead and change
the [Link] to something like instance_1 and instance_2 so you can see which box the
load balancer decides to send your requests to. Doing so also allows you to confirm that if an
instance is down, the load balancer automatically forwards traffic to the remaining one still
online.
CloudWatch - Lab
Standard monitoring is 5 min and detailed monitoring is 1 min (you will be charged
for it).
Dashboard used to visualize what's happening with your AWS environment.
Alarms can be set to notify when a specific threshold is hit
Events can be used to perform actions when state changes happen in your AWS
resources.
Logs can be aggregated in a single place to better troubleshoot. Remember that you
need to install an agent on the EC2 instance.
By default, metrics on EC2 instances are: CPU related, Disk related, Network related
and Status check related.
Remember that:
[Link]
This will give you the top level metadata items, something like this.
If you want to access a specific item, simply add it at the end of your request:
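The link itself is missing from these notes. As a sketch only, the code below assumes the commonly documented instance metadata endpoint and shows how an item name is appended to the base URL. The function names are hypothetical, the request only works from inside an EC2 instance, and instances configured to require session tokens (IMDSv2) would reject this plain GET.

```python
from urllib.request import urlopen

# Assumption: the well-known link-local metadata address; not taken from the notes.
METADATA_BASE = "http://169.254.169.254/latest/meta-data/"


def metadata_url(item=""):
    """Build the URL for the top-level listing or for one specific item."""
    return METADATA_BASE + item.lstrip("/")


def fetch_metadata(item="", timeout=2):
    """Fetch a metadata item as text. Only reachable from within an instance."""
    with urlopen(metadata_url(item), timeout=timeout) as response:
        return response.read().decode()


print(metadata_url())
print(metadata_url("public-ipv4"))
```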
Once configured, an Auto Scaling group automatically spreads its instances evenly across
the AZs you selected. So 3 AZs with a group size of 3 means 1 box in each AZ.
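The even spread can be sketched as simple integer division. The function and AZ names are hypothetical, and which AZ receives the leftover instance when the group size does not divide evenly is an assumption of this sketch, not documented behaviour.

```python
def spread_across_azs(group_size, azs):
    """Distribute group_size instances as evenly as possible over the given AZs."""
    base, extra = divmod(group_size, len(azs))
    # Assumption: any leftover instances go to the first AZs in the list.
    return {az: base + (1 if index < extra else 0) for index, az in enumerate(azs)}


zones = ["az-a", "az-b", "az-c"]
print(spread_across_azs(3, zones))
print(spread_across_azs(4, zones))
```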
Placement Groups
You can launch or start instances in a placement group, which determines how instances are
placed on the underlying hardware. When you create a placement group, you specify one of
the following strategies for the group:
The name you specify for a placement group must be unique within your AWS
account.
Only specific types of instances can be launched in a placement group.
You can't merge placement groups.
You can't move an existing instance into a placement group.
If the exam refers to placement groups, without mentioning which type, it's most
probably talking about the Cluster ones since those are the old ones.
EFS
Amazon EFS provides scalable file storage for use with Amazon EC2. You can create an EFS
file system and configure your instances to mount the file system
Supports NFSv4.
You only pay for the storage you use.
Scale up to petabytes.
You need to make sure that the EC2 instance that needs to connect with the EFS
volume, is associated with the same security group you have on the EFS volume.
You can assign permissions at the file level and at the folder level.
Lambda
AWS Lambda lets you run code without provisioning or managing servers. You pay only for
the compute time you consume - there is no charge when your code is not running. Lambda
is basically driven by triggers.
Event-driven based: Lambda runs your code based on events, such as a new file on S3 or
a new alarm on CloudWatch.
Compute-service based: Lambda runs your code based on HTTP requests using an
API Gateway or API calls made using AWS SDKs.
First 1M requests per month are free. $0.20 per 1M requests thereafter.
Duration: You are charged for the amount of memory you allocate on your functions.
First 400,000 GB-seconds per month, up to 3.2M seconds of computing time, are
free.
Your functions can't go over 5 minutes in run-time.
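The pricing figures above can be combined into a rough monthly estimate. The free tier and the request price come from these notes; the price per GB-second is not stated there and is an assumption here, so check current pricing before relying on the result. The last line shows where the "3.2M seconds" figure comes from: it is the free allowance divided by a 128 MB (0.125 GB) memory setting.

```python
FREE_REQUESTS = 1_000_000            # from the notes
PRICE_PER_MILLION_REQUESTS = 0.20    # from the notes
FREE_GB_SECONDS = 400_000            # from the notes
PRICE_PER_GB_SECOND = 0.00001667     # assumption: not stated in the notes


def monthly_cost(requests, avg_duration_s, memory_mb):
    """Rough monthly Lambda bill in USD after the free tier."""
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    billable_requests = max(0, requests - FREE_REQUESTS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    duration_cost = billable_gb_seconds * PRICE_PER_GB_SECOND
    return request_cost + duration_cost


# 400,000 GB-seconds at 128 MB (0.125 GB) is 3.2M seconds of run time.
free_seconds_at_128mb = FREE_GB_SECONDS / (128 / 1024)

print(free_seconds_at_128mb)
print(monthly_cost(requests=3_000_000, avg_duration_s=0.1, memory_mb=128))
```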
Big Data
Kinesis
If in the exam you get a question about consuming social media
feeds, or a way to consume big data into the cloud, chances are it is talking
about Kinesis.
Redshift
If the question uses language like business intelligence, or applying business
intelligence to big data, think about Redshift.
Elastic MapReduce
If the question refers to big data processing, think about Elastic MapReduce.
EC2
EBS
EBS backed volumes are persistent
Instance Store backed volumes are not persistent (ephemeral).
EBS Volumes can be detached and reattached to other EC2 instances.
Instance store volume cannot be detached and reattached to other instances - they
exist only for the life of that instance.
EBS volumes can be stopped; data will persist.
Instance store volumes cannot be stopped - if you stop them, data will be lost.
EBS Backed = Store Data for Long term.
Instance Store = You should not use it for long-term data storage.
OpsWorks
If you get a question that talks about Chef, recipes or cookbooks, think about
OpsWorks.
SWF Actors
Workflow Starter: An application that can initiate a workflow.
Decider: Controls the flow of activity tasks in a workflow execution.
Activity Workers: Carry out activity tasks.
AWS Organizations
Using AWS Organizations, you can manage multiple AWS accounts at once. With
organizations, you can create groups of accounts and then apply policies to those groups.
Cross-Account Access
Many AWS customers use separate AWS accounts for their development and production
resources. This separation allows them to cleanly separate different types of resources and
can also provide some security benefits.
Cross-account access makes it easier for you to work productively within a multi-account (or
multi-role) AWS environment by making it easy for you to switch roles within the AWS
Management Console. You can now sign in to the console using your IAM user name then
switch the console to manage another account without having to enter or remember another
user name and password.
Resource groups make it easy to group your resources using the tags that are assigned to
them. You can group resources that share one or more tags.
Region
Name
Health Checks
VPC Peering
VPC Peering is simply a connection between two VPCs that enables you to route traffic
between them using private IP addresses. Instances in either VPC can communicate with each
other as if they are within the same network. You can create a VPC peering connection
between your own VPCs, or with a VPC in another AWS account within a single region.
AWS uses the existing infrastructure of a VPC to create a VPC peering connection; it's
neither a gateway nor a VPN connection and does not rely on a separate piece of physical
hardware. There is no single point of failure for communication or a bandwidth bottleneck.
VPC Limitations:
You cannot create a VPC peering connection between VPCs that have matching or
overlapping CIDR blocks.
You cannot create a VPC peering connection between VPCs in different regions.
VPC peering does not support transitive peering relationships.
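The CIDR restriction is easy to check up front with Python's standard `ipaddress` module. The helper name and the example ranges are hypothetical; the check covers only the overlap rule, not the other limitations listed above.

```python
import ipaddress


def cidrs_allow_peering(cidr_a, cidr_b):
    """True when the two VPC CIDR blocks neither match nor overlap."""
    network_a = ipaddress.ip_network(cidr_a)
    network_b = ipaddress.ip_network(cidr_b)
    return not network_a.overlaps(network_b)


print(cidrs_allow_peering("10.0.0.0/16", "10.1.0.0/16"))   # disjoint ranges
print(cidrs_allow_peering("10.0.0.0/16", "10.0.1.0/24"))   # second range sits inside the first
```

Planning non-overlapping ranges before creating the VPCs avoids having to re-address one of them later.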
Direct Connect
AWS Direct Connect is a cloud service solution that makes it easy to establish a dedicated
network connection from your premises to AWS. Using AWS Direct Connect, you can
establish private connectivity between AWS and your datacenter, office, or colocation
environment.
VPN connections are not Direct Connect! Direct Connect needs a physical cable
(dedicated line) between your facility and the AWS facility.
STS
AWS Security Token Service (STS) grants users limited and temporary access to AWS
resources. Users can come from three sources:
Key Terms
Federation: Combining or joining a list of users in one domain (such as IAM) with a list
of users in another domain (such as Active Directory, Facebook etc)
Identity Broker: A service that allows you to take identity from point A and join it
(federate it) with point B
Identity Store: Services like Active Directory, Facebook, Google etc
Identities: A user of a service like Facebook etc.
Scenario
4. The identity Broker calls the new GetFederationToken function using IAM
Credentials. The call must include an IAM policy and a duration (1 to 36 hours),
along with a policy that specifies the permissions to be granted to the temporary
security credentials.
5. The Security Token Service confirms that the policy of the IAM user making the call
to GetFederationToken gives permission to create new tokens and then returns four
values to the application:
o Access key
o Secret access key
o Token
o Duration (the token's lifetime)
6. The Identity Broker returns the temporary security credentials to the reporting
application.
7. The data storage application uses the temporary security credentials (including the
token) to make a request to Amazon S3.
8. Amazon S3 uses IAM to verify that the credentials allow the requested operation on
the given S3 bucket and key.
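Step 4 of the scenario can be sketched as a small helper that assembles the parameters the Identity Broker sends. This is only an illustration: the helper name, the user name and the example policy are assumptions, and the 1 to 36 hour range is simply the one quoted in these notes. Name, Policy and DurationSeconds are the parameter names used by the GetFederationToken API; no call to AWS is made here.

```python
import json

MIN_HOURS, MAX_HOURS = 1, 36  # duration range as quoted in these notes


def build_federation_request(name, policy, duration_hours):
    """Assemble the parameters a broker would pass to GetFederationToken."""
    if not MIN_HOURS <= duration_hours <= MAX_HOURS:
        raise ValueError(f"duration must be {MIN_HOURS} to {MAX_HOURS} hours")
    return {
        "Name": name,
        "Policy": json.dumps(policy),          # the policy travels as a JSON string
        "DurationSeconds": duration_hours * 3600,
    }


# Hypothetical policy: read-only access to one example bucket.
read_only_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": ["arn:aws:s3:::example-bucket/*"],
    }],
}

request = build_federation_request("report-user", read_only_policy, 12)
print(request["DurationSeconds"])  # 43200
```

The temporary credentials can never grant more than the calling IAM user already has; the policy passed in only narrows the permissions.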
In the Exam
Scenario 1
Scenario 2
Quick Facts:
Docker is used to run software packages called containers. Containers are isolated from each
other and bundle their own application, tools, libraries and configuration files; they can
communicate with each other through well-defined channels. All containers are run by a
single operating system kernel and are thus more lightweight than virtual machines.
Containers are created from images that specify their precise contents.
About ECS
ECS is a regional service that you can use in one or more AZs across a new or
existing VPC to schedule the placement of containers across your cluster based on
your resource needs, isolation policies, and availability requirements.
ECS eliminates the need for you to operate your own cluster management and config
management systems, or to worry about scaling your management infrastructure.
ECS can also be used to create a consistent deployment and build experience,
manage and scale batch and ETL workloads, and build sophisticated application
architectures on a microservice level.
About Containers
Containers are a method of operating system virtualization that allows you to run
an application and its dependencies in resource-isolated processes.
Containers have everything the software needs to run (libraries, system tools, code,
and runtime).
Containers are created from a read-only template called an image.
Task Definitions are text files in JSON format that describe one or more containers
that form your application.
Some of the parameters you can specify in a task definition include:
o Which Docker images to use with the containers in your task
o How much CPU and memory to use with each container
o Whether containers are linked together in a task
o The Docker networking mode to use for the containers in your task
o What (if any) ports from the container are mapped to the host container
instance
o Whether the task should continue to run if the container finishes or fails
o The command the container should run when it is started
o What (if any) env variables should be passed to the container when it starts.
o Any data volumes that should be used with containers in the task
o What (if any) IAM role your tasks should use for permissions
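To make the parameters above concrete, here is a minimal sketch of a task definition built as a Python dictionary and printed as JSON. The family name, image, role ARN, account ID, volume and port numbers are all made-up placeholders; the field names follow the ECS task definition format as far as I know, so check them against the ECS documentation before relying on them.

```python
import json

# Hypothetical single-container task definition.
task_definition = {
    "family": "web-app",
    "networkMode": "bridge",  # Docker networking mode
    "taskRoleArn": "arn:aws:iam::123456789012:role/example-task-role",
    "volumes": [{"name": "app-data"}],
    "containerDefinitions": [{
        "name": "web",
        "image": "nginx:latest",   # Docker image to use
        "cpu": 256,                # CPU units for this container
        "memory": 512,             # memory in MiB
        "essential": True,         # stop the task if this container stops
        "command": ["nginx", "-g", "daemon off;"],
        "portMappings": [{"containerPort": 80, "hostPort": 8080}],
        "environment": [{"name": "APP_ENV", "value": "production"}],
        "mountPoints": [{"sourceVolume": "app-data", "containerPath": "/data"}],
    }],
}

print(json.dumps(task_definition, indent=2))
```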
ECS Services
An ECS service allows you to run and maintain a specified number (the "desired
count") of instances of a task definition simultaneously in an ECS cluster.
Think of services like Auto-Scaling groups for ECS
If a task should fail or stop, the ECS service scheduler launches another instance of
your task definition to replace it and maintain the desired count of tasks in the
service.
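The "desired count" behaviour can be pictured as a reconcile loop. The sketch below is a toy model, not how the ECS service scheduler is actually implemented; the task dictionaries and the launch_task helper are invented for the illustration.

```python
import itertools


def reconcile(tasks, desired_count, launch_task):
    """Keep running tasks and launch replacements up to desired_count."""
    running = [t for t in tasks if t["status"] == "RUNNING"]
    while len(running) < desired_count:
        running.append(launch_task())
    return running


ids = itertools.count(1)


def launch_task():
    # Hypothetical stand-in for starting a new instance of the task definition.
    return {"id": f"task-{next(ids)}", "status": "RUNNING"}


tasks = [launch_task() for _ in range(3)]
tasks[1]["status"] = "STOPPED"  # simulate a failed task
tasks = reconcile(tasks, desired_count=3, launch_task=launch_task)
print([t["id"] for t in tasks])  # ['task-1', 'task-3', 'task-4']
```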
ECS Clusters
An ECS cluster is a logical grouping of container instances that you can place tasks
on. When you first use the Amazon ECS service, a default cluster is created for you,
but you can create multiple clusters in an account to keep your resources separate.
Concepts:
o Clusters can contain multiple different container instance types
o Clusters are region-specific
o Container instances can only be part of one cluster at a time.
o You can create IAM policies for your clusters to allow or restrict users' access
to specific clusters
ECS Scheduling
Service Scheduler:
o Ensures that the specific number of tasks are constantly running and
reschedules tasks when a task fails (for example, if the underlying container
instance fails for some reason).
o Can ensure tasks are registered against an ELB.
Custom Scheduler:
o You can create your own schedulers that meet your business needs.
o Leverage third-party schedulers such as Blox.
The Amazon ECS schedulers leverage the same cluster state information provided by
the ECS API to make appropriate placement decisions.
The ECS container agent allows container instances to connect to your cluster. The ECS
container agent is included in the ECS-optimized AMI, but you can also install it on any EC2
instance that supports the ECS specifications. The Amazon ECS container agent is only
supported on EC2 instances.
ECS Security
IAM Roles:
o EC2 instances use an IAM role to access ECS.
o ECS tasks use an IAM role to access services and resources.
Security Groups attach at the instance-level (i.e. the host...not the task or container)
You can access and configure the OS of the EC2 instances in your ECS cluster
ECS Limits
Soft Limits:
o Clusters per Region (default = 1000)
o Instances per Cluster (default = 1000)
o Services per Cluster (default = 500)
Hard Limits:
o One Load Balancer per Service
o 1000 Tasks per Service (the "desired count")
o Max 10 Containers per Task Definition
o Max 10 Tasks per Instance (host)
Recap
ECS - Amazon's managed EC2 container service. Allows you to manage Docker
containers on a cluster of EC2 instances.
Containers are a method of operating system virtualization that allows you to run an
application and its dependencies in resource-isolated processes.
Containers are created from a read-only template called an image.
An image is a read-only template with instructions for creating a Docker container.
Images are stored in a registry.
Amazon EC2 Container Registry (Amazon ECR) is a managed AWS Docker registry
service.
A task definition is required to run Docker containers in Amazon ECS.
Task Definitions are text files in JSON format that describe one or more containers
that form your application.
Think of a task definition as a CloudFormation template, but for Docker.
An Amazon ECS service allows you to run and maintain a specified number of
instances of a task definition simultaneously in an ECS cluster.
Think of Services like Auto-Scaling groups for ECS.
An Amazon ECS cluster is a logical grouping of container instances that you can
place tasks on.
Clusters can contain multiple different container instance types.
Clusters are region-specific.
Container instances can only be part of one cluster at a time.
You can create IAM policies for your clusters to allow or restrict users' access to
specific clusters.
You can schedule ECS in two ways:
o Service scheduler
o Custom scheduler
The ECS agent connects EC2 instances to your ECS cluster. Linux only.
Use IAM with ECS to restrict access.
Security groups operate at the instance level, not at the task or container level.
What's IAM
Allows you to manage users and their level of access to the AWS console.
Features
Centralised control of your AWS account.
Shared Access to your AWS account.
Gives you granular permissions.
Does Identity Federation (Active directory, Facebook, etc)
MultiFactor Authentication.
Provides temporary access for users/devices etc.
Allows you to set up your own password rotation policy.
Supports PCI DSS compliance.
Integrated with a lot of AWS services.
From Console
IAM users sign-in link: You can customize the URL that users use to sign in.
Billing Alarm
You can set a billing alarm to get notified when your spending for the current
month reaches your threshold. Before doing so, you will probably have to go to
[Link] and enable Receive Billing Alerts.
After that, you can use CloudWatch to set a new billing alarm.
CHAPTER 5 | Route53
DNS
The exam won't ask about the next sections (up to the ALIAS record). They are background
knowledge that helps in understanding the rest of the course.
What's DNS
The Domain Name System (DNS) is the phonebook of the Internet. Humans access
information online through domain names, like [Link] or [Link]. Web browsers
interact through Internet Protocol (IP) addresses. DNS translates domain names to IP
addresses so browsers can load Internet resources.
IPv4 vs IPv6
IPv4 is a 32-bit IP address.
IPv6 is a 128-bit IP address and was created to fulfil the need for more Internet
addresses.
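The difference in address width is easy to check with Python's standard ipaddress module. The two addresses below come from the ranges reserved for documentation and are only placeholders.

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.1")    # documentation-range IPv4 address
v6 = ipaddress.ip_address("2001:db8::1")  # documentation-range IPv6 address

print(v4.version, v4.max_prefixlen)  # 4 32
print(v6.version, v6.max_prefixlen)  # 6 128

# Size of each address space: 2 to the power of the address width.
ipv4_space = 2 ** v4.max_prefixlen
ipv6_space = 2 ** v6.max_prefixlen
print(ipv4_space)  # 4294967296
```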
A top-level domain is one of the domains at the highest level in the hierarchical Domain
Name System of the Internet (.com .net .org as an example).
In the case of .[Link], .co is the second-level domain and .uk is the top-level domain.
SOA Record
A Start of Authority record (abbreviated as SOA record) is a type of resource record in the
Domain Name System (DNS) containing administrative information about the zone,
especially regarding zone transfers.
[...]
;; ANSWER SECTION:
[Link]. 55 IN SOA [Link]. [Link]. (
238640061 ; serial
900 ; refresh (15 minutes)
900 ; retry (15 minutes)
1800 ; expire (30 minutes)
60 ; minimum (1 minute)
)
[...]
NS Record
NS stands for 'name server' and this record indicates which DNS server is authoritative for
that domain (which server contains the actual DNS records)
$ dig NS [Link]
[...]
;; ANSWER SECTION:
[Link]. 2073 IN NS [Link].
[Link]. 2073 IN NS [Link].
[Link]. 2073 IN NS [Link].
[Link]. 2073 IN NS [Link].
[...]
A Record
The ‘A’ stands for ‘address’. This is the most fundamental type of DNS record: it indicates
the IP address of a given domain.
$ dig A [Link]
[...]
;; ANSWER SECTION:
[Link]. 62 IN A [Link]
[...]
When a caching (recursive) nameserver queries the authoritative nameserver for a resource
record, it will cache that record for the time (in seconds) specified by the TTL.
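The caching behaviour can be sketched as a tiny TTL cache. This is a simplified model and not how any real resolver is written; the class name, the injected clock and the record values are all assumptions made for the example.

```python
import time


class TtlCache:
    """Keep each record until its TTL (in seconds) has elapsed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._records = {}  # name -> (value, expiry time)

    def put(self, name, value, ttl):
        self._records[name] = (value, self._clock() + ttl)

    def get(self, name):
        entry = self._records.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._records[name]  # expired: the resolver must ask again
            return None
        return value


now = [0.0]  # fake clock so the example does not need to sleep
cache = TtlCache(clock=lambda: now[0])
cache.put("example.com.", "192.0.2.1", ttl=62)
print(cache.get("example.com."))  # 192.0.2.1
now[0] = 62.0
print(cache.get("example.com."))  # None
```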
CNAME Record
CNAME records can be used to alias one name to another. CNAME stands for Canonical
Name.
A common example is when you have both [Link] and [Link] pointing to
the same application and hosted by the same server.
$ dig [Link]
[...]
;; ANSWER SECTION:
[Link]. 3595 IN CNAME [Link].
[Link]. 55 IN A [Link]
[...]
CNAME can't be used on the root domain. (This is a contractual limitation imposed by the
RFC 1912 and RFC 2181, not a technical one.)
ALIAS record
In AWS you have to use ALIAS records to point your root domain to other DNS records such
as your ELB.
DNS IN AWS
Routing policies available in AWS
Simple routing policy: Use for a single resource that performs a given function for
your domain. You can have 1 record with multiple addresses.
Weighted routing policy: Use to route traffic to multiple resources in proportions
that you specify. You can send 40% of the traffic to one IP and 60% to another IP.
Latency routing policy: Use when you have resources in multiple AWS Regions and
you want to route traffic to the region that provides the best latency.
Failover routing policy: Use when you want to configure active-passive failover. You
need to create a health check first.
Geolocation routing policy: Use when you want to route traffic based on the location
of your users.
Multivalue answer routing policy: Use when you want Route 53 to respond to DNS
queries with up to eight healthy records selected at random.
Geoproximity routing policy: Use when you want to route traffic based on the
location of your resources and, optionally, shift traffic from resources in one location
to resources in another.
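As an illustration of the weighted policy, the sketch below simulates many DNS answers for two records weighted 40 and 60. It only models the idea that each record's share is its weight divided by the sum of all weights; the IP addresses, the seed and the helper name are assumptions, and nothing here talks to Route 53.

```python
import random
from collections import Counter


def pick_endpoint(weighted_records, rng):
    """Choose one endpoint with probability weight / sum of weights."""
    endpoints = [endpoint for endpoint, _ in weighted_records]
    weights = [weight for _, weight in weighted_records]
    return rng.choices(endpoints, weights=weights, k=1)[0]


records = [("192.0.2.10", 40), ("192.0.2.20", 60)]  # placeholder IPs
rng = random.Random(0)  # fixed seed so runs are repeatable

queries = 10_000
counts = Counter(pick_endpoint(records, rng) for _ in range(queries))
share_of_second = counts["192.0.2.20"] / queries
print(round(share_of_second, 2))  # close to 0.6
```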
S3
What's S3
Object-based storage: you can store only objects (files). You can't, for example, install
an OS on it (for that you need block-based storage).
Files can be from 0 bytes to 5 TB in size.
No storage limits.
Files are stored in buckets (like a folder in the cloud).
S3 uses a universal namespace: bucket names must be globally unique, so you cannot have
the same name as someone else.
Sample of an S3 URL: [Link]
When you upload an object in S3 you get an HTTP 200 OK code back.
Components
S3 is object-based. Objects consist of:
o Key (name of the object)
o Value (data)
o Version ID (Used on versioning)
o Metadata (a set of data that describes and gives information about the object
data.)
o Subresources:
Access Control List (Decide who can access files)
Torrent (Not an exam topic)
Basics
S3 SLA: 99.9% availability
S3 is built for 99.99% availability
S3 is designed for 11x9s (99.999999999%) durability of the data stored in it.
Tiered Storage (classes) available
You can have lifecycle management
Versioning
Supports multi-part upload
Encryption
Access control (permissions on single files) and bucket policies (permissions on
buckets)
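To get a feel for what eleven nines of durability means, here is a short worked calculation. It assumes the figure is read as a per-object, per-year probability, which is how AWS usually explains it; the count of ten million objects is just an example. Fractions are used so floating-point rounding does not blur the result.

```python
from fractions import Fraction

durability = Fraction(99_999_999_999, 100_000_000_000)  # 99.999999999%
annual_loss_rate = 1 - durability                       # 1 in 100 billion

objects_stored = 10_000_000                             # example bucket size
expected_losses_per_year = objects_stored * annual_loss_rate
years_per_lost_object = 1 / expected_losses_per_year

print(expected_losses_per_year)  # 1/10000
print(years_per_lost_object)     # 10000
```

In other words, with ten million objects you would expect to lose about one object every 10,000 years.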
S3 Storage Tiers
S3 standard: 99.99% availability 11x9s durability (it sustains the loss of 2 facilities
concurrently)
S3 IA (Infrequent Access): For data that is accessed less frequently, but needs
rapid access. You are charged a retrieval fee per GB retrieved.
S3 One Zone IA: Like S3 IA, but data is stored in only one AZ.
Glacier: The cheapest tier, used for archival only.
o Expedited: a few minutes for retrieval
o Standard: 3-5 hours for retrieval
o Bulk: 5-12 hours for retrieval
o It encrypts data by default
o Regional availability
o Designed with 11x9s durability, like S3
Charges
S3 is charged for:
Storage
Requests
Storage management pricing
Data Transfer Pricing
Transfer Acceleration (it uses the edge locations of CloudFront, the AWS CDN)
S3 Version Control
Once you enable versioning, you can't disable it; you can only suspend it. The only way to
disable it is to delete the bucket and re-create it.
Every time you update an object, it will become private by default.
Delete an object:
Once you delete a file inside a versioned bucket, you don't delete the file; you simply
add a Delete Marker (this creates a new version of the object). If you delete
the version that is the Delete Marker, you restore the object.
If you want to permanently delete the object, you have to delete all the versions of the
object.
You can optionally add another layer of security by configuring a bucket to enable
MFA Delete
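The delete-marker behaviour can be modelled in a few lines. This is a toy in-memory model, not the S3 API: the class and method names are invented, and version IDs are left out to keep it short.

```python
class VersionedBucket:
    """Toy model of a versioned bucket: each key holds a stack of versions."""

    DELETE_MARKER = object()

    def __init__(self):
        self._versions = {}  # key -> list of versions, newest last

    def put(self, key, data):
        self._versions.setdefault(key, []).append(data)

    def delete(self, key):
        # A plain delete removes nothing; it stacks a delete marker on top.
        self._versions.setdefault(key, []).append(self.DELETE_MARKER)

    def delete_latest_version(self, key):
        # Deleting a specific version really removes it (here: the newest one).
        self._versions[key].pop()

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1] is self.DELETE_MARKER:
            return None  # object looks deleted
        return versions[-1]


bucket = VersionedBucket()
bucket.put("notes.txt", "v1")
bucket.put("notes.txt", "v2")
bucket.delete("notes.txt")
print(bucket.get("notes.txt"))  # None
bucket.delete_latest_version("notes.txt")  # removes the delete marker
print(bucket.get("notes.txt"))  # v2
```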
In order to replicate the existing objects, you need to do a cp using the AWS CLI:
If you delete an object in the primary bucket, the delete action and its markers are not
replicated to your remote bucket; this is a security feature. Only creations
and modifications are replicated to the bucket in the other region, NOT deletes.
You can't replicate to multiple buckets; the mapping is always 1-to-1.
CloudFront
What's a CDN
o Edge Location: The location where the content is cached (separate from
AWS AZs or regions). Be aware that you can also write to edge locations; they are
not read-only.
o Invalidating (erasing) the cache costs money.
o Origin: The source of the files the CDN will distribute. An origin can be an
EC2 instance, an S3 bucket, an Elastic Load Balancer or Route53. You can also
have your own origin; it does not have to be within AWS.
o Distribution: The name AWS gives to a CDN.
You can have two types: Web, for generic web content, and
RTMP, for video streaming.
TTL: time to live of the cached object.
You can configure S3 to create access logs for requests made to the S3 bucket
Access control for buckets:
Amazon Storage
Amazon Storage Gateway
File Gateway: For flat files, stored directly in S3. You can use NFS mount points.
Volume Gateway (iSCSI): Block-based storage
o Stored Volumes (you keep all your data on premises)
o Cached Volumes (you keep only the most recent data on premises)
Tape Gateway (VTL): Virtual tapes
Snowball
Import/Export is still available and was the first version of Snowball: you used to ship your
own drives to AWS.
Snowball is an appliance: a petabyte-scale data transport solution that uses devices designed
to be secure to transfer large amounts of data into and out of the AWS Cloud.
Snowball Edge is a 100 TB data transfer device with onboard storage and compute capabilities.
It's like an AWS DC in a box.
S3 Transfer Acceleration
Instead of uploading files directly to your S3 bucket, you can use the AWS edge network.
Using a specific URL, you upload the file to your nearest edge location, and the file is then
transferred to S3. An example of the URL: [Link]
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::example-bucket/*"]
    }
  ]
}
Amazon Virtual Private Cloud (Amazon VPC) lets you provision a logically isolated section
of the AWS Cloud where you can launch AWS resources in a virtual network that you define.
You have complete control over your virtual networking environment, including selection of
your own IP address range, creation of subnets, and configuration of route tables and network
gateways.
Default VPC
Amazon provides a default VPC to immediately deploy instances.
All Subnets in default VPC have a route out to the internet.
VPC Peering
You can peer one VPC to another VPC using private IP subnets.
You can peer VPCs with other AWS accounts as well as with other VPCs in the
same account.
Overlapping CIDR blocks are not supported: you can't connect two VPCs that have
the same or overlapping CIDR ranges.
Transitive Peering is not supported:
You have a VPC peering connection between VPC A and VPC B (pcx-aaaabbbb), and
between VPC A and VPC C (pcx-aaaacccc). There is no VPC peering connection
between VPC B and VPC C. You cannot route packets directly from VPC B to VPC
C through VPC A.
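Both peering rules can be checked with a short sketch. The overlap test uses Python's standard ipaddress module; the routing check is a toy model of the A, B, C example, with the VPC names and CIDR ranges chosen only for illustration.

```python
import ipaddress


def can_peer(cidr_a, cidr_b):
    """Peering requires CIDR blocks that neither match nor overlap."""
    net_a = ipaddress.ip_network(cidr_a)
    net_b = ipaddress.ip_network(cidr_b)
    return not net_a.overlaps(net_b)


# Peering connections from the example: A-B and A-C, but no B-C.
peerings = {frozenset({"A", "B"}), frozenset({"A", "C"})}


def can_route(src, dst):
    """Traffic flows only over a direct peering; there is no transit via A."""
    return frozenset({src, dst}) in peerings


print(can_peer("10.0.0.0/16", "10.0.128.0/17"))  # False
print(can_peer("10.0.0.0/16", "10.1.0.0/16"))    # True
print(can_route("B", "A"))  # True
print(can_route("B", "C"))  # False
```

To let B and C talk to each other, they need their own peering connection.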
VPC and Subnet Sizing: The first four IP addresses and the last IP address in each
subnet CIDR block are not available for you to use, and cannot be assigned to an
instance.
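The effect of the five reserved addresses on subnet capacity is simple arithmetic. The sketch below works it out for two example subnet sizes; the helper name and the CIDR blocks are placeholders.

```python
import ipaddress

RESERVED_PER_SUBNET = 5  # first four addresses plus the last one


def usable_addresses(cidr):
    """Addresses left for instances once the reserved ones are removed."""
    return ipaddress.ip_network(cidr).num_addresses - RESERVED_PER_SUBNET


print(usable_addresses("10.0.0.0/24"))  # 251
print(usable_addresses("10.0.0.0/28"))  # 11
```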
For NAT instances you have to disable the Source/Destination Check.
VPC Flow Logs is a feature that enables you to capture information about the IP traffic going
to and from network interfaces in your VPC. Flow log data can be published to Amazon
CloudWatch Logs and Amazon S3. After you've created a flow log, you can retrieve and
view its data in the chosen destination.
VPC Endpoints
A VPC endpoint enables you to privately connect your VPC to supported AWS services and
VPC endpoint services powered by PrivateLink without requiring an internet gateway, NAT
device, VPN connection, or AWS Direct Connect connection.