IAM
IAM Security Tools
IAM Credentials Report (account-level) – a report that lists all your account’s users and the
status of their various credentials.
IAM Access Advisor (user-level) – shows the service permissions granted to a user and when
those services were last accessed. You can use this information to revise your policies.
EBS
EBS Multi Attach – ability to attach the same EBS volume to multiple EC2 instances in the same
AZ
Each instance has full read & write permissions to the high-performance volume
Use case:
When you are trying to achieve higher application availability in clustered Linux applications (ex:
Teradata)
When applications must manage concurrent write operations
Up to 16 EC2 Instances at a time
Must use a file system that’s cluster-aware (not XFS, EXT4, etc…)
Elastic Load Balancing (ELB)
The EC2 instance's security group allows inbound traffic only from the load balancer's security
group, while the load balancer has its own security group for client traffic.
Application Load Balancer (ALB)
Provides fixed hostname.
The application servers don’t see the IP of the client directly. Instead, the true IP of the client is
inserted in the X-Forwarded-For header. We also get the port (X-Forwarded-Port) and protocol
(X-Forwarded-Proto).
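A minimal sketch of how a backend behind an ALB might read these headers. It assumes the headers arrive as a plain dict with exactly these names (real frameworks usually treat header names case-insensitively), and the function name client_info is made up for illustration.

```python
def client_info(headers):
    """Return (client_ip, port, proto) from the headers added by the ALB."""
    # X-Forwarded-For can hold a comma-separated chain of addresses;
    # the left-most entry is the original client.
    forwarded = headers.get("X-Forwarded-For", "")
    client_ip = forwarded.split(",")[0].strip() or None
    port = headers.get("X-Forwarded-Port")
    proto = headers.get("X-Forwarded-Proto")
    return client_ip, int(port) if port else None, proto


example = {
    "X-Forwarded-For": "203.0.113.7, 10.0.0.12",
    "X-Forwarded-Port": "443",
    "X-Forwarded-Proto": "https",
}
print(client_info(example))
```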
Support by protocol
Application Load Balancer - HTTP and HTTPS
Network Load Balancer - TCP, UDP, TLS; high performance, for when you need millions of requests
per second handled.
Gateway Load Balancer - analyzing network traffic for security.
Network Load Balancer (NLB)
NLB - one static IP per AZ. When the exam asks how an application can be reached through only
1, 2 or 3 IPs, NLB is the solution (question from exam)
Target groups can be:
EC2
Private IP
ALB
Important for exam - Health Checks support TCP, HTTP, HTTPS
Gateway Load Balancer
Used with 3rd party network virtual appliances. If you want to have your traffic inspected before
it is forwarded to your applications, use Gateway Load Balancer.
If you see the GENEVE protocol on port 6081 on the exam, it's Gateway Load Balancer.
Combines functions of Transparent Network Gateway - single entry/exit for all traffic and Load
Balancer - distributes traffic to virtual appliances.
Example: Firewalls, Intrusion Detection and Prevention Systems, Deep Packet Inspection
Systems, payload manipulation.
Target groups can be either EC2 instances or IP addresses
Elastic Load Balancer
Sticky sessions - used so that the user does not lose session data.
Important to know there are 2 types of cookies - application based and duration based.
Application-based Cookies
- Custom cookie: generated by the target, can include any custom attributes required by the
application, cookie name must be specified individually for each target group.
Don’t use the reserved names AWSALB, AWSALBAPP, AWSALBTG
- Application cookie: generated by the load balancer. Cookie name: AWSALBAPP
Duration-based cookies
Generated by load balancer. Reserved names: AWSALB for ALB, AWSELB for CLB.
Application Load Balancer - Cross-Zone is enabled by default and there are no additional charges.
Network Load Balancer and Gateway Load Balancer - Cross-Zone is disabled by default.
Server Name Indication (SNI) - lets you load multiple SSL certificates onto one web server to
serve multiple websites
Connection Draining (CLB), called Deregistration Delay for ALB and NLB
The concept is that it gives time to complete “in-flight” requests while the instance is
de-registering or unhealthy. While the instance is being drained, the ELB stops sending new
requests to it.
Process: Users who are already connected to the instance that is being drained are going to be
given enough time (which is draining period) to complete their transactions. While for the new
requests, ELB will be smart enough to forward these requests just to other instances.
1-3600 seconds (default is 300). Can be disabled (set value to 0). Set a low value if your requests
are short, because it allows EC2 instances to be drained quickly.
Auto Scaling groups
Scale out - adding new instances in the case of increased demand.
Scale in - removing instances (lowering number of instances)
Good metrics to scale on
CPUUtilization - Average CPU utilization across your instances
RequestCountPerTarget - to make sure the number of requests per EC2 instances is stable
Average Network In / Out (if your application is network bound)
Any custom metric (that you push using CloudWatch)
3 types of dynamic scaling
This is to tell Auto Scaling when to scale:
Target Tracking - Example: I want the average ASG CPU to stay at around 40%
Simple/Step – Example: When a CloudWatch alarm is triggered (example CPU > 70%), then add
2 units
Scheduled Actions - Anticipate a scaling based on known usage patterns
Example: increase the min capacity to 10 at 5 pm on Fridays
After a scaling activity happens, a cooldown period follows. Default cooldown period - 300
seconds.
RDS
RDS is a managed service:
• Automated provisioning, OS patching
• Continuous backups and restore to specific timestamp (Point in Time Restore)!
• Monitoring dashboards
• Read replicas for improved read performance
• Multi AZ setup for DR (Disaster Recovery)
• Maintenance windows for upgrades
• Scaling capability (vertical and horizontal)
• Storage backed by EBS (gp2 or io1)
• BUT you can’t SSH into your instances
Storage auto scaling
Maximum Storage Threshold
Will automatically scale when:
Free storage is less than 10% of allocated storage
Low-storage lasts at least 5 minutes
6 hours have passed since last modification
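The three conditions above can be sketched as a single check. This only illustrates the rule as stated in these notes, it is not how RDS implements it; the function and parameter names are made up, and the comparison against the Maximum Storage Threshold is an assumption about how that cap applies.

```python
def should_autoscale(free_gib, allocated_gib, low_storage_minutes,
                     hours_since_last_modification, max_threshold_gib):
    """Return True when all storage auto scaling conditions hold."""
    low_on_space = free_gib < 0.10 * allocated_gib       # free storage < 10%
    sustained = low_storage_minutes >= 5                 # lasts at least 5 minutes
    rested = hours_since_last_modification >= 6          # 6 hours since last change
    below_cap = allocated_gib < max_threshold_gib        # assumed: cap not reached yet
    return low_on_space and sustained and rested and below_cap


print(should_autoscale(free_gib=5, allocated_gib=100, low_storage_minutes=10,
                       hours_since_last_modification=7, max_threshold_gib=500))
```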
RDS Read Replicas
Up to 15 Read replicas
Within AZ, Cross AZ or Cross Region
Replication is Async
Replicas can be promoted to their own DB
Applications must update connection strings
RDS Read Replicas within same region – no network transfer fee for replication
Read replicas accept only SELECT statements – no writes (INSERT, UPDATE, DELETE)
RDS Multi AZ
Sync Replication
One DNS Name – no need to update connection strings.
Increase availability
Failover in case of loss of AZ, loss of network.
Good for DR
From single AZ to Multi AZ
Zero downtime operation – no need to stop the DB
Process:
Snapshot is taken
A new DB is restored from the snapshot in a new AZ
Synchronization is established between two databases.
Aurora
• 6 copies of your data across 3 AZ:
• 4 copies out of 6 needed for writes
• 3 copies out of 6 needed for reads
• Self healing with peer-to-peer replication
• Storage is striped across 100s of volumes
• One Aurora Instance takes writes (master)
• Automated failover for master in less than 30 seconds
• Master + up to 15 Aurora Read Replicas serve reads
• Support for Cross Region Replication
1 master, multiple replicas, self healing
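A small sketch of the quorum numbers above (4 of 6 for writes, 3 of 6 for reads). It assumes the 6 copies are spread evenly, 2 per AZ, which the notes imply but do not state; the function name is made up.

```python
COPIES = 6         # copies of the data across 3 AZs
WRITE_QUORUM = 4   # copies needed for writes
READ_QUORUM = 3    # copies needed for reads


def availability(healthy_copies):
    """Say whether reads and writes are still possible."""
    return {
        "writes": healthy_copies >= WRITE_QUORUM,
        "reads": healthy_copies >= READ_QUORUM,
    }


# Losing a whole AZ removes 2 copies (assumed 2 per AZ): 4 remain.
print(availability(COPIES - 2))
# Losing an AZ plus one more copy: 3 remain, so only reads continue.
print(availability(COPIES - 3))
```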
Features
• Automatic fail-over
• Backup and Recovery
• Isolation and security
• Industry compliance
• Push-button scaling
• Automated Patching with Zero Downtime
• Advanced Monitoring
• Routine Maintenance
• Backtrack: restore data at any point of time without using backups
RDS Proxy
Use RDS Proxy to improve database efficiency by reducing the stress on database resources and
minimize open connections (important for exam)
RDS proxy handles failover.
Allows apps to pool and share DB connections established with the database.
Supports both RDS and Aurora.
No code changes needed.
Never publicly accessible
Enforce IAM Authentication for DB, and securely store credentials in AWS Secrets Manager
ElastiCache
REDIS
Multi AZ with Auto-Failover
Read Replicas to scale reads and have high availability
Backup and restore features
Supports Sets and Sorted Sets – keywords for the exam
Memcached
Multi node for partitioning of data (sharding)
No high availability
Non-persistent
No back up and restore
Multi threaded architecture
REDIS – high availability , MEMCACHED – pure cache.
Lazy Loading / Cache-Aside / Lazy population
Application looks first in ElastiCache. If cache hit – OK. If cache miss, look for data in RDS.
Once retrieved from RDS, write that data to the cache
PROS:
Only the requested data will be cached (very efficient)
Node failures are not fatal
CONS:
In case of cache miss, read penalty is that 3 calls have to be made
It's possible to have stale data
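A minimal sketch of lazy loading, with plain dictionaries standing in for ElastiCache and RDS (an assumption for illustration, no real clients are used). The counter shows where the 3 calls on a cache miss come from.

```python
class LazyLoadingReader:
    def __init__(self, cache, database):
        self.cache = cache        # stand-in for ElastiCache
        self.database = database  # stand-in for RDS
        self.calls = 0            # round trips made so far

    def get(self, key):
        self.calls += 1                 # call 1: read from the cache
        if key in self.cache:
            return self.cache[key]      # cache hit
        self.calls += 1                 # call 2: read from the database
        value = self.database[key]
        self.calls += 1                 # call 3: write the value to the cache
        self.cache[key] = value
        return value


reader = LazyLoadingReader(cache={}, database={"user:1": "Alice"})
reader.get("user:1")   # cache miss: 3 calls
reader.get("user:1")   # cache hit: 1 more call
print(reader.calls)
```

If the database row changes after it was cached, the reader keeps returning the old value, which is the stale-data drawback listed above.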
Write through – cache is updated every time RDS is updated
PROS
Data is never stale
Write penalty instead of read penalty (each write requires 2 calls)
CONS
Data missing until data is written.
Cache churn – a lot of data will never be read.
So combine with lazy loading – try first write through, if data is not found, do lazy loading.
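A sketch of write through combined with lazy loading as a fallback, again with dictionaries standing in for RDS and ElastiCache (assumed for illustration; the class name is made up).

```python
class WriteThroughStore:
    def __init__(self):
        self.database = {}  # stand-in for RDS
        self.cache = {}     # stand-in for ElastiCache

    def put(self, key, value):
        self.database[key] = value  # call 1: write to the database
        self.cache[key] = value     # call 2: update the cache

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        # Not cached (e.g. written before caching was added): lazy load it.
        value = self.database[key]
        self.cache[key] = value
        return value


store = WriteThroughStore()
store.put("score:alice", 10)
store.put("score:alice", 12)
print(store.get("score:alice"))
```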
Cache evictions and time to live
3 ways to evict cache:
Delete item explicitly in the cache
Item is evicted because the memory is full and it’s not recently used (LRU)
Set TTL
TTL good for leaderboards, comments, activity streams.
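A rough sketch of TTL-based eviction, written as a tiny in-memory cache rather than a real ElastiCache client. The class name and the injectable clock parameter are assumptions made so the behaviour can be tried without waiting.

```python
import time


class TTLCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.items = {}  # key -> (value, expiry time)

    def set(self, key, value):
        self.items[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.items[key]  # TTL passed: evict the item
            return None
        return value

    def delete(self, key):
        self.items.pop(key, None)  # explicit delete
```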
Amazon MemoryDB for Redis
Route 53
A – maps hostname to IPv4
AAAA – maps hostname to IPv6
CNAME – maps a hostname to another hostname. Can’t create a CNAME for the top node of a DNS
namespace (zone apex), e.g. example.com
NS – Name Servers for the Hosted Zone
Public Hosted Zones
Whenever you buy a public domain – you can store it in public hosted zone
Private Hosted Zones
Internal, not publicly visible, for traffic within one or more VPCs.
TTL – High TTL vs Low TTL
The record tells the client to cache the response for a certain amount of time, defined by TTL
If High TTL – less traffic on Route 53, but records may be outdated.
If Low TTL – more traffic on Route 53, but records are more up to date and it’s easy to change
them.
TTL is mandatory for each DNS record except for Alias
Cname vs Alias
CNAME – points hostname to another hostname. Only for non root domains.
ALIAS – points hostname to aws service such as ALB, S3 etc. Works for ROOT. Can’t set TTL on it.
Alias is always type of A/AAAA. Automatically recognizes changes in the resource’s IP address
(for example if there is a change in IP for ALB, it will be automatically picked up)
What can Alias point to? Possible targets:
• Elastic Load Balancers
• CloudFront Distributions
• API Gateway
• Elastic Beanstalk environments
• S3 Websites
• VPC Interface Endpoints
• Global Accelerator accelerator
• Route 53 record in the same hosted zone
You cannot set an ALIAS record for an EC2 DNS name
Important to know for exam – ALIAS record can be set at both root and non root domains. When
creating alias records you can have the health of the target evaluated automatically. It’s always
an A/AAAA record.
Routing policies
Simple
Route traffic to a single resource.
It’s possible to specify multiple values for the same record. For example, you can add multiple A
records for the same domain. In that case, when queried, all values are returned and a random one
is chosen by the client.
When Alias is enabled, you can only specify one AWS resource
Can’t be associated with health checks.
Weighted
Control the percentage of the requests that go to each specific resource
Can be associated with health checks.
Use case: Load balancing between multiple regions, testing new application versions.
If you want to stop sending traffic to a resource, assign it a weight of 0. If all resources have a
weight of 0, all records are returned equally.
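The share of traffic each record gets is its weight divided by the sum of all weights. A small sketch of that arithmetic, with made-up record names and a made-up function name:

```python
def traffic_share(weights):
    """Map each record to the fraction of requests it should receive."""
    if not weights:
        return {}
    total = sum(weights.values())
    if total == 0:
        # All weights are 0: every record is returned equally.
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


print(traffic_share({"us-east-1": 70, "eu-west-1": 20, "ap-south-1": 10}))
print(traffic_share({"blue": 0, "green": 0}))
```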
Latency Based
Redirect to the resource that has the lowest latency for the user.
Super helpful when latency is a priority.
Latency is based on traffic between users and AWS Regions.
Can be associated with Health Checks
Failover
Failover can be primary or secondary. Primary is the one where traffic goes to when healthy.
Secondary is for the traffic failover when primary becomes unhealthy.
Geolocation
Routing based on user location.
You should have default location in case there is no match on location
Use cases: website localization, restrict content distribution, load balancing, …
Can be associated with Health Checks
Geoproximity
Bias is used to manipulate geoproximity.
To change the size of the geographic region, specify bias values:
• To expand (1 to 99) – more traffic to the resource
• To shrink (-1 to -99) – less traffic to the resource
Geoproximity is really helpful when you need to shift traffic from one region to another, by
increasing the bias - IMPORTANT FOR THE EXAM
Multivalue
Not a substitute for ELB.
Multivalue is client-side load balancing: Route 53 returns multiple values and the client picks one.
IP-based Routing
Routing is based on clients’ IP addresses
You provide a list of CIDRs for your clients and the corresponding endpoints/locations (user-IP-
to-endpoint mappings)
Use cases: Optimize performance, reduce network costs…
Example: route end users from a particular ISP to a specific endpoint
Health Checks
Monitor an endpoint: about 15 global health checkers, automated; supported protocols are HTTP,
HTTPS and TCP. If > 18% of health checkers report healthy, Route 53 considers the endpoint healthy.
A health check passes only when the endpoint responds with a 2xx or 3xx status code.
Health checks that monitor other health checks (calculated health checks): combines results of
multiple health checks into a single one.
Health checks that monitor CW Alarms (full control): You can create a CloudWatch Metric and
associate a CloudWatch Alarm, then create a Health Check that checks the alarm itself
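The endpoint rule above (more than 18% of checkers healthy, 2xx/3xx counts as a pass) can be sketched as below. Function names are made up, and the list of booleans is a simplified stand-in for what the health checkers report.

```python
def check_passes(status_code):
    """A single check passes on a 2xx or 3xx response."""
    return 200 <= status_code < 400


def endpoint_is_healthy(checker_reports):
    """checker_reports: one boolean per health checker (True = healthy)."""
    healthy = sum(1 for ok in checker_reports if ok)
    return healthy / len(checker_reports) > 0.18


# With 15 checkers, 3 healthy reports (20%) are enough, 2 (about 13%) are not.
print(endpoint_is_healthy([True] * 3 + [False] * 12))
print(endpoint_is_healthy([True] * 2 + [False] * 13))
```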
To visually represent traffic flow and maintain complex decision trees – use traffic flow diagram.
VPC
VPC: private network to deploy your resources (regional resource)
Subnets allow you to partition your network inside your VPC (Availability Zone resource)
A public subnet is a subnet that is accessible from the internet
A private subnet is a subnet that is not accessible from the internet
To define access to the internet and between subnets, we use Route Tables.
Internet gateway connects VPC to internet. Public subnets have route to the internet gateway.
NAT Gateways and instances allow instances in your private subnets to access the internet while
remaining private.
NACL and Security Groups
NACL: firewall that controls traffic from and to subnet.
Can have ALLOW and DENY rules.
Attached at subnet level
Rules only include IP addresses
Security Groups
Firewall that controls traffic to and from an ENI/EC2 instance
Can have Only ALLOW Rules
VPC Flow Logs data can go to S3, CloudWatch Logs and Kinesis Data Firehose.
VPC Endpoints
Any time the exam asks you to privately connect to an AWS service, VPC Endpoint is the
way.
S3
Not a global service – buckets are created in a region
Security
User-Based
IAM Policies – which API calls should be allowed for a specific user from IAM
Resource-Based
Bucket Policies – bucket wide rules from the S3 console - allows cross account
Object Access Control List (ACL) – finer grain (can be disabled)
Bucket Access Control List (ACL) – less common (can be disabled)
Encryption: encrypt objects in Amazon S3 using encryption keys
If you get a 403 Forbidden error on S3 website hosting, make sure the bucket policy allows public
read.
Versioning
Enabled on bucket level.
Replication
Cross-Region Replication and Same-Region Replication
Must enable Versioning for it to work
Copying is asynchronous
Use cases:
• CRR – compliance, lower latency access, replication across accounts
• SRR – log aggregation, live replication between production and test accounts
S3 Batch Replication – replicates existing objects and objects that failed replication.
Delete marker replication – important for exam!
Delete markers created by S3 delete operations can be replicated (optional setting). Delete markers
created by lifecycle rules are not replicated!
If I permanently delete an object (delete with a version ID) in the source bucket, it will not be
deleted in the destination (replicated) bucket
S3 Storage Classes
Amazon S3 Standard - General Purpose
Used for frequently accessed data. Low latency and high throughput.
Use Cases: Big data analytics, mobile and gaming applications, content distribution…
Amazon S3 Standard-Infrequent Access (IA)
For data that is less frequently accessed, but requires rapid access when needed.
Lower cost than S3 Standard
S3 Standard-IA: Use Case: Disaster recovery, backups
Amazon S3 One Zone-Infrequent Access
High durability within a single AZ; data is lost if the AZ is destroyed. Use case: storing
secondary backup copies or data that you can recreate.
Amazon S3 Glacier Instant Retrieval
Millisecond retrieval, great for data accessed once a quarter
Minimum storage duration of 90 days
Amazon S3 Glacier Flexible Retrieval
Expedited (1 to 5 minutes), Standard (3 to 5 hours), Bulk (5 to 12 hours) – bulk is free
Minimum storage duration of 90 days
Amazon S3 Glacier Deep Archive – long term storage
Standard (12 hours), Bulk (48 hours)
Minimum storage duration of 180 days
Amazon S3 Intelligent Tiering
Small monthly monitoring and auto-tiering fee
Moves objects automatically between Access Tiers based on usage
There are no retrieval charges in S3 Intelligent-Tiering
Transitioning
You can transition objects between storage classes. For infrequently accessed objects, move them
to Standard-IA. For archive objects, move them to Glacier or Glacier Deep Archive.
Transition Action – configure object to transition from one storage class to another
Expiration action – configure object to expire (be deleted) after certain time. Can be used to
delete old version of file or to delete incomplete multi-part uploads.
S3 event notifications can be sent to SQS, SNS and Lambda, or to Amazon EventBridge, which can
forward them to many more services.
S3 Performance
Multi-Part upload
Recommended for files greater than 100 MB, must be used for files greater than 5 GB. Can help
parallelize uploads
S3 Transfer Acceleration
Increase transfer speed by transferring the file to an AWS edge location, which forwards the data
to the S3 bucket. Compatible with multi-part upload.
S3 Byte-Range Fetches
Parallelizing gets to speed up downloads by requesting specific byte ranges.
Important for the exam is to know these performance options for speeding up download and
upload of the files
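To make byte-range fetches concrete, here is a sketch that splits an object of a known size into HTTP Range header values, one per parallel GET. The function name and the part size are arbitrary choices for illustration; no S3 call is made.

```python
def byte_ranges(object_size, part_size):
    """Return Range header values that together cover the whole object."""
    ranges = []
    for start in range(0, object_size, part_size):
        # HTTP byte ranges are inclusive on both ends.
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


print(byte_ranges(10, 4))
```

Each value could then be sent as the Range header of its own GET request, and the parts concatenated in order.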
S3 Select & Glacier Select
Retrieve less data using SQL by performing server-side filtering
S3 Object Tags and metadata
If you want to upload your own metadata when uploading file, metadata names must begin
with "x-amz-meta-"
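A tiny sketch of the naming rule: turning a dict of user metadata into header names that start with "x-amz-meta-". The helper name is hypothetical, and SDKs normally add the prefix for you.

```python
PREFIX = "x-amz-meta-"


def user_metadata_headers(metadata):
    """Prefix each metadata name unless it already carries the prefix."""
    headers = {}
    for name, value in metadata.items():
        key = name if name.startswith(PREFIX) else PREFIX + name
        headers[key] = str(value)
    return headers


print(user_metadata_headers({"author": "alice", "x-amz-meta-team": "docs"}))
```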
Tags are used usually for fine-grained permissions or analytics purposes.
Important to remember: you cannot search object metadata or object tags.
Common exam question - how would you search? Answer: use an external DB such as DynamoDB as a
search index. Save the object metadata and object tags in that DB, query it to get the object name,
then retrieve the object from S3.
S3 Object Encryption
Server-Side Encryption (SSE)
Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3) – Enabled by Default
Encrypts S3 objects using keys handled, managed, and owned by AWS
Must set header "x-amz-server-side-encryption": "AES256"
Encryption type is AES-256
Object encrypted server side
Server-Side Encryption with KMS Keys stored in AWS KMS (SSE-KMS)
Leverage AWS Key Management Service (AWS KMS) to manage encryption keys
object encrypted server side
Must set header "x-amz-server-side-encryption": "aws:kms"
IMPORTANT LIMITATION
If you use SSE-KMS, you may be impacted by the KMS limits
When you upload, it calls the GenerateDataKey KMS API
When you download, it calls the Decrypt KMS API
Count towards the KMS quota per second (5500, 10000, 30000 req/s based on region)
You can request a quota increase
Server-Side Encryption with Customer-Provided Keys (SSE-C)
When you want to manage your own encryption keys
HTTPS MUST be used.
Client-Side Encryption
Client must encrypt data themselves before sending to s3. Also decrypt when receiving file from
s3.
Encryption in transit (SSL/TLS)
S3 exposes 2 endpoints:
HTTP Endpoint – not encrypted
HTTPS Endpoint – encrypted. This one is recommended and mandatory for SSE-C
How to force Encryption in transit
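One common approach, not spelled out in these notes, is a bucket policy that denies every request where the aws:SecureTransport condition key is false. The sketch below only builds the policy document; the bucket name is a placeholder and the function name is made up.

```python
import json


def deny_insecure_transport_policy(bucket_name):
    """Build a bucket policy that rejects plain-HTTP requests."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{bucket_name}",
                f"arn:aws:s3:::{bucket_name}/*",
            ],
            # aws:SecureTransport is "false" when the request did not use HTTPS.
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }],
    }


print(json.dumps(deny_insecure_transport_policy("my-example-bucket"), indent=2))
```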
EC2 Instance Metadata
It allows EC2 instances to learn about themselves without using an IAM Role for that purpose.
The URL is http://169.254.169.254/latest/meta-data
You can retrieve the IAM Role name from the metadata, but you CANNOT retrieve the IAM
Policy.
Metadata = Info about the EC2 instance
Userdata = launch script of the EC2 instance
IMDSv1 vs IMDSv2
IMDSv1 accesses the metadata URL directly
IMDSv2 first gets a session token of limited validity, then uses it to make the metadata call
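A sketch of the two IMDSv2 steps using only the standard library. The helper names are made up; the token endpoint and the two header names are the ones IMDSv2 uses. fetch_metadata only works when run on an EC2 instance, so it is defined but not called here.

```python
import urllib.request

IMDS = "http://169.254.169.254/latest"


def token_request(ttl_seconds=21600):
    """Step 1: PUT request asking for a session token valid for ttl_seconds."""
    return urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path, token):
    """Step 2: GET request for a metadata path, carrying the token."""
    return urllib.request.Request(
        f"{IMDS}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path):
    """Run both steps; only reachable from inside an EC2 instance."""
    with urllib.request.urlopen(token_request(), timeout=2) as response:
        token = response.read().decode()
    with urllib.request.urlopen(metadata_request(path, token), timeout=2) as response:
        return response.read().decode()
```

With IMDSv1 the second request would be sent on its own, without the token header.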
Question at the exam: How to use MFA with CLI or SDK?
Answer: use the STS GetSessionToken API call.
AWS SDK
Official SDKs are…
• Java
• .NET
• Node.js
• PHP
• Python (named boto3 / botocore)
• Go
• Ruby
• C++
If you don’t specify region, it’s us-east-1 by default.
API Rate Limits
DescribeInstances API for EC2 has a limit of 100 calls per second
GetObject on S3 has a limit of 5500 GET per second per prefix
For Intermittent Errors: implement Exponential Backoff
For Consistent Errors: request an API throttling limit increase
Service Quotas (Service Limits)
Running On-Demand Standard Instances: 1152 vCPU
You can request a service limit increase by opening a ticket
You can request a service quota increase by using the Service Quotas API
Exponential backoff – use it if you get ThrottlingException intermittently.
Must only implement retries on 5xx server errors and throttling.
DO NOT IMPLEMENT on 4xx client errors.
EXAM QUESTION
Any time you get ThrottlingException because we did too many API calls - use exponential
backoff.
Which kind of errors should you retry with Exponential Backoff?
5xx server errors.
You SHOULD NOT IMPLEMENT RETRY ON 4XX CLIENT ERRORS.
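A minimal sketch of exponential backoff following the rules above. ApiError is a made-up stand-in for an SDK error, not a real AWS class, and the random jitter is an addition that the notes do not mention. AWS SDKs already ship their own retry logic, so this is only to show the idea.

```python
import random
import time


class ApiError(Exception):
    """Hypothetical error carrying an HTTP status and an error code."""

    def __init__(self, status_code, code=""):
        super().__init__(f"{status_code} {code}")
        self.status_code = status_code
        self.code = code


def is_retryable(error):
    # Retry on 5xx server errors and on throttling, never on other 4xx errors.
    return error.status_code >= 500 or error.code == "ThrottlingException"


def call_with_backoff(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    for attempt in range(max_attempts):
        try:
            return operation()
        except ApiError as error:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(error):
                raise
            # The delay ceiling doubles on every attempt: 0.1, 0.2, 0.4, ...
            delay = base_delay * (2 ** attempt)
            sleep(random.uniform(0, delay))
```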
Exam question!
The AWS CLI looks for credentials and settings in this order of priority:
1. Command line options – --region, --output, and --profile
2. Environment variables – AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY, and
AWS_SESSION_TOKEN
3. CLI credentials file – aws configure
~/.aws/credentials on Linux / Mac & C:\Users\user\.aws\credentials on Windows
4. CLI configuration file – aws configure
~/.aws/config on Linux / macOS & C:\Users\USERNAME\.aws\config on Windows
5. Container credentials – for ECS tasks
6. Instance profile credentials – for EC2 Instance Profiles
When you make an API request, you need to sign it, and you sign it with SigV4.
There are 2 ways to transmit the signature:
HTTP Authorization header
Query string (X-Amz-Signature)
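For a feel of what SigV4 does, here is a sketch of just the signing-key derivation and the final HMAC step. Building the canonical request and the string to sign is left out, and the credentials in the usage lines are dummy values. In practice the SDK or CLI signs requests for you.

```python
import hashlib
import hmac


def _hmac_sha256(key, message):
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def signing_key(secret_access_key, date_stamp, region, service):
    """Derive the SigV4 signing key by chaining HMAC-SHA256 four times."""
    k_date = _hmac_sha256(("AWS4" + secret_access_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(key, string_to_sign):
    """The signature is the hex HMAC-SHA256 of the string to sign."""
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Dummy inputs for illustration only.
key = signing_key("dummy-secret", "20240101", "us-east-1", "s3")
signature = sign(key, "placeholder string to sign")
print(len(signature))
```

The resulting value is what ends up in the Authorization header or in the X-Amz-Signature query parameter.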