DVA-C01 studying - SQS, SNS, SES, Kinesis, Elastic Beanstalk

Jul 13, 2022

DVA-C01 software-dev

SQS - Simple Queue Service

What it is

https://aws.amazon.com/sqs/
distributed message queue service (temp buffer to store messages waiting to be processed)
- https://www.youtube.com/watch?v=oUJbuFMyBDk <- good explanation of message queues
- helps decouple components of an application
- messages can contain up to 256KB of text in any format (plaintext, XML,JSON, etc)
pull based, not push based - application sevices are pulling the messages from the queue rather than them being pushed out
guarantees messages will be processed at least once
messages can be kept in queue from 1minute up to 14 days - default retention period is 4 days

Types of Queues

standard queues (default)
- best-effort ordering - messages are generally delivered in the order they were sent but will occasionally be delivered out of order or more than one copy of the message will be delivered
- nearly unlimited number of transactions per second
- guarantees that a message is delivered at least once
FIFO queues (first in first out)
- ordering is strictly preserved - messages delivered in the order they were sent
- no duplicate messages will ever be sent
- exactly-once processing - message is delivered once and remains available until a consumer processes and deletes it
- 300 transactions per second
- ^minus these, same capabilities as a standard queue
- good usecase - banking

Visibility Timeout (configuration)

ChangeMessageVisibility - API call to change the visibility timeout of a specified message in a queue
the amount of time a message is invisible in the queue after a reader reads the message for processing
- if the job is not processed within that time, the message will become visible again for another reader to read
- default is 30sec - need to increase if your task will take longer
- maximum is 12hrs

Polling

Short Polling
- returns a response immediately even if the message queue being polled is empty
- can result in a lot of empty responses
- pay per response (even empty ones)
Long Polling
- periodically polls the queue
- doesn’t return a response until a message arrives in the queue or the long poll times out
- generally preferable to short polling - also saves money

Delay Queues (configuration)

postpone the delivery of new messages (setting change)
- does not affect the delay of messages already in the queue, only new ones - standard queues
- does affect the delay of messages already in the queue - FIFO queues
messages in the Delay Queue remain invisible for the delay period (0-900sec)
When to use:
- large, distributed applications that need a delay in processing
- you need to apply a delay to an entire message queue

managing large messages

for messages 256kb up to 2GB in size
use S3 to store the messages
use Amazon SQS Extended Client Library to manage and the AWS SDK for S3 and object operations
- AWS docs default to Java for the library and SDK
- cannot use AWS CLI, AWS Management Console, SQS Console, SQS API

Exam Tips

separate queues can be used to prioritize work

SNS - Simple Notification Service

What it is

https://aws.amazon.com/sns/
web service to setup up, operate, and send push notifications from the cloud
- messages sent can immediately be delivered to subscribers or other applications
can send push notifications to a Apple, Google, Android, Windows, and other devices
can send SMS or email or send to SQS queues or HTTP endpoints
can trigger Lambda function
uses a pub-sub model
- https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern
- applications publish/push messages to a topic; subscribers recieve message from the topic
topic - a logical access point that acts as a communication channel for recipients to subscribe to and recieve identical copies o the same notification
includes durable storage - prevents messages from being lost
- all messages published to SNS are stored redundantly across multiple AZ

Pros

pay-as-you-go pricing model; no up-front costs
easy to configure from AWS management console
scalable and highly available
managed service with durable storage
supports multiple transport protocols - SMS, HTTP, HTTPS, SQS, email, etc
can trigger lambda functions
instantaneous, push notifications (no polling)
can fannout messages to a large number of recipients - can send a message to multiple SQS queues

SES - Simple Email Service

What it is

https://aws.amazon.com/ses/
scalable, highly available email service
pay-as-you-go model
can be used to send emails or recieve them to an S3 bucket
incoming emails can trigger Lambda functions and SNS notifications
useful for automated emails - changes in a post/group/forumn, purchase confirmations, order status and shipping updates, marketing/promotions
differences from SNS
- only email
- can trigger Lamdba functions or SNS notifications (as opposed to just Lamdba functions)
- supports incoming and outgoing email (SNS is push only)
- only requires knowing the email address to get started - not subscription based

Kinesis

What it is

https://aws.amazon.com/kinesis/
family of services that let you collect, process, and analyze streaming data in real-time and make decisions off them
- streaming data - small sized (in the kilobytes) data generated continuously and simulatanously by many data sources
Kinesis Streams

Kinesis Data Streams
- data comes from producers; data is sent to Kinesis Streams; data is stored in shards (sequence of data records); data in shards consumed by consumers
  - shards retain data for 24hrs (default) up to 7 days
- Kinesis Shards
  - only apply to Kinesis Streams
  - each shard is a sequence of one or more data records
  - provides a fixed unit of capacity - Stream’s capacity is determined by number of shards it has
    - 5 reads/s; max total read rate is 2MB/s
    - 1000 writes/s; max total write rate is 1MB/s
  - can increase a Stream’s capacity by increasing the number of shards (called resharding)
Kinesis Video Streams
- securely stream video (used for analytics or ML) from connected devices to AWS
Kinesis Data Firehose
- capture, transform, and load data streams into AWS data stores
- allows for near-real-time analytics
- no shards (no data retention) - all capacity and sizing is automated for you
- no need for consumers to consume data
- no BI tools
- data comes from producers; data is sent to Kinesis Firehose; Firehose stores transformed data in AWS data stores
Kinesis Data Analytics
- analyze, query, and transform streamed data (in real-time); storing results in an AWS data store
- uses standard SQL
- data comes from producers; data is sent to Kinesis Firehose or Kinesis Data Streams; Kinesis Data Analytics runs SQL on data; data stored in AWS data stores

Consumers

Kinesis Client Library runs on the consumer instance
- tracks the number of shards in your system
- discovers new shards after resharding
- ensures that for every shard, there is a record processor
- manages the number of record processors relative to the number of shards and consumers
  - will load balance if you have multiple consumers
generally number of consumer instances should not be higher than the number of shards
- exceptions being failure or standby purposes
you never need multiple consumers to handle the processing load one shard; one consumer can process multiple shards
- CPU utilization should be the metric for increasing/decreasing your consumer quantity - best practice use an Auto Scaling group based on CPU load

ElasticBeanstalk -

What it is

https://aws.amazon.com/elasticbeanstalk/
service to deploy and scale applications - “quickest and easiest way”
- supports go, ruby, node, java, .net, and php
- supports tomcat and docker
- allows developers to deploy code without worrying about the underlying infrastructure
handles
- infrastructure - provisioning AWS resources; load balancing; scaling; monitoring metrics and health (+ dashboarding)
- application platform - installing and managing the stack; patches/updates on the OS and applicaiton platform
gives you complete administrative control over the AWS resources it creates/manages
no additional charges for using ElasticBeanstalk - only pay for the resources you create/deploy

Updating Applications

All at once deployment - deploys to all hosts concurrently
- will result in a total outage (service interruption)
- not ideal for mission-critical production systems
- failure results in a rollback (another outage)
- useful for dev/testing
rolling update - deploys the new version in batches
- reduces your capacity during the update process
- not ideal for performance sensitive systems
rolling with additional batch - launches an additional batch of instances; then deploys the new version in batches
- maintains full capacity during the update
- failure results in doing another rolling update to undo the changes
immutable - deploys new version to a fresh group of instances; then deletes the old instances
- old instances deleted when new instances pass their healthchecks
- if deployment fails, delete the new instances - no need to rollback changes
- preferred for mission-critical systems
traffic splitting - deploys new version to a fresh group of instances; then forwards a percentage of incoming traffic to the new version for a specified evaluation period
- similar to immut
- enables canary testing during deployments

Customizing Elastic Beanstalk environment

Amazon Linux 1 - configuration files (old way but still supported)
- define packages to install; Linux users and groups to create; shell commands to run; services to enable; load balancer configurations
- YAML or JSON format
- file must be have a .config file extension
- must be inside a folder named .ebextensions at the root level of the application’s repo
Amazon Linux 2
- use Buildfile, Procfile, and platform hooks to configure
- Buildfile - commands that run for a short time, then exit on completion
  - Buildfile must be at root level in the application’s repo
  - process_name: command format
  - e.g. ‘make: ./build.sh’ (build.sh must be in the directory)
- Procfile - long running application processes (continuously running)
  - Procfile must be at root level in the application’s repo
  - process_name: command format
  - e.g. app: bin/app (bin must be in the directory, app must be in bin)
  - Elastic Beanstalk monitors and restarted terminated processes
- platform hooks - custom scripts or executable files that run at specified stages in the EC2 provisioning process
  - stored in dedicated folders in the application’s repo
  - .platform/hooks/prebuild - files that run before building, setting up, and configuring
  - .platform/hooks/predeploy - files that run after building, setting up, and configuring but before deploying
  - .platform/hooks/postdeploy - files that run after deploying

RDS and Elastic Beanstalk

2 options
launch RDS within Elastic Beanstalk
- launch RDS instance from Elastic Beanstalk console
- RDS is created within the Elastic Beanstalk environment
- terminating the environment terminates the database
- useful for dev/testing - inflexible
- no need to configure security groups
launch RDS outside Elastic Beanstalk
- launch RDS from RDS console or AWS CLI
- allows for tearing down the application environment without affecting the database
- preferred for prod - more flexible
- need to add a security group to your environment’s Auto Scaling group
- need to provide connection string information to your application servers (using Elastic Beanstalk environment properties)

Windows Web Application Migration Assistant

formerly known as .NET Migration Assistant
interactive powershell utility - enables migrating a .NET application or entire website from Windows servers to Elastic Beanstalk

Deploying Docker Containers using Elastic Beanstalk

can deploy a Docker container to a single EC2 instance
can deploy multiple Docker instances to an ECS cluster
to deploy a Docker application, upload your code bundle to Elastic Beanstalk
- code can be uploaded from local or an S3 bucket
upgrading - use console to upload your code and deploy a new version; bundle your Dockerfile and application code into a zip file and upload
- Elastic Beanstalk will also manage old versions
if using CodeCommit to store your code, you must use the Elastic Beanstalk CLI (out of scope for exam)

DVA-C01 studying - SQS, SNS, SES, Kinesis, Elastic Beanstalk

SQS - Simple Queue Service

What it is

Types of Queues

Visibility Timeout (configuration)

Polling

Delay Queues (configuration)

managing large messages

Exam Tips

SNS - Simple Notification Service

What it is

Pros

SES - Simple Email Service

What it is

Kinesis

What it is

Kinesis Data Streams

Kinesis Video Streams

Consumers

ElasticBeanstalk -

What it is

Updating Applications

Customizing Elastic Beanstalk environment

RDS and Elastic Beanstalk

Windows Web Application Migration Assistant

Deploying Docker Containers using Elastic Beanstalk