All-at-once deployments instantly shift traffic from the original (old) Lambda function to the updated (new) Lambda function, all at one time. All-at-once deployments can be beneficial when the speed of your deployments matters. In this strategy, the new version of your code is released quickly, and all your users get to access it immediately.
In a canary deployment, you deploy your new version of your application code and shift a small percentage of production traffic to point to that new version. After you have validated that this version is safe and not causing errors, you direct all traffic to the new version of your code.
A linear deployment is similar to canary deployment. In this strategy, you direct a small amount of traffic to your new version of code at first. After a specified period of time, you automatically increment the amount of traffic that you send to the new version until you’re sending 100% of production traffic.
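The difference between the two gradual strategies can be sketched as a schedule of traffic percentages over time. This is a minimal illustration, not how AWS CodeDeploy is implemented: the function names are hypothetical, and it assumes the first increment is applied at minute 0.

```python
def canary_schedule(initial_percent, wait_minutes):
    """Return (minute, percent_on_new_version) steps for a canary shift.

    A small slice of traffic moves first; the remainder moves in one step
    after the wait period.
    """
    return [(0, initial_percent), (wait_minutes, 100)]


def linear_schedule(step_percent, interval_minutes):
    """Return (minute, percent_on_new_version) steps for a linear shift.

    Traffic grows by the same increment at every interval until it
    reaches 100 percent.
    """
    steps = []
    percent = 0
    minute = 0
    while percent < 100:
        percent = min(100, percent + step_percent)
        steps.append((minute, percent))
        minute += interval_minutes
    return steps


print(canary_schedule(10, 5))
print(linear_schedule(10, 10))
```

Under these assumptions, a canary shift has exactly two steps regardless of the wait period, while a linear shift of 10 percent every 10 minutes has ten steps and reaches full traffic at minute 90.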
Comparing deployment strategies
To help you decide which deployment strategy to use for your application, you’ll need to consider each option’s consumer impact, rollback, event model factors, and deployment speed. The comparison table below illustrates these points.
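Comparison by consumer impact, rollback, event model factors, and deployment speed:
All-at-once. Consumer impact: all at once. Rollback: redeploy older version. Event model factors: any event model at low concurrency rate.
Canary or linear. Consumer impact: 1-10% typical initial traffic shift, then phased. Rollback: revert 100% of traffic to previous deployment. Event model factors: better for high-concurrency workloads. Deployment speed: minutes to hours.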
Deployment preferences with AWS SAM
Traffic shifting with aliases is directly integrated into AWS SAM. If you’d like to use all-at-once, canary, or linear deployments with your Lambda functions, you can embed that directly into your AWS SAM templates. You can do this in the deployment preferences section of the template. AWS CodeDeploy uses the deployment preferences section to manage the function rollout as part of the AWS CloudFormation stack update. SAM has several pre-built deployment preferences you can use to deploy your code. See the table below for examples.
Deployment Preferences Type: Description
Canary10Percent30Minutes: Shifts 10 percent of traffic in the first increment. The remaining 90 percent is deployed 30 minutes later.
Canary10Percent5Minutes: Shifts 10 percent of traffic in the first increment. The remaining 90 percent is deployed 5 minutes later.
Canary10Percent10Minutes: Shifts 10 percent of traffic in the first increment. The remaining 90 percent is deployed 10 minutes later.
Canary10Percent15Minutes: Shifts 10 percent of traffic in the first increment. The remaining 90 percent is deployed 15 minutes later.
Linear10PercentEvery10Minutes: Shifts 10 percent of traffic every 10 minutes until all traffic is shifted.
Linear10PercentEvery1Minute: Shifts 10 percent of traffic every minute until all traffic is shifted.
Linear10PercentEvery2Minutes: Shifts 10 percent of traffic every 2 minutes until all traffic is shifted.
Linear10PercentEvery3Minutes: Shifts 10 percent of traffic every 3 minutes until all traffic is shifted.
AllAtOnce: Shifts all traffic to the updated Lambda functions at once.
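As a rough sketch of where a deployment preference sits in a template, the snippet below builds a SAM template as a Python dictionary and prints it as JSON (SAM templates can be written in JSON as well as YAML). The resource name HelloFunction, the handler, the runtime, the code path, and the alias name live are placeholders chosen for illustration. Canary10Percent10Minutes is one of the pre-built SAM preference types.

```python
import json

# Hypothetical SAM template: only AutoPublishAlias and DeploymentPreference
# relate to traffic shifting; every other value is a placeholder.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Transform": "AWS::Serverless-2016-10-31",
    "Resources": {
        "HelloFunction": {
            "Type": "AWS::Serverless::Function",
            "Properties": {
                "Handler": "app.handler",
                "Runtime": "python3.12",
                "CodeUri": "src/",
                # Publishes a new version on each deploy and points this alias at it.
                "AutoPublishAlias": "live",
                # Shift 10 percent first, then the remaining 90 percent 10 minutes later.
                "DeploymentPreference": {"Type": "Canary10Percent10Minutes"},
            },
        }
    },
}

print(json.dumps(template, indent=2))
```

Switching strategy in this sketch only means changing the Type string, for example to a linear or all-at-once preference.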
Creating a deployment pipeline
When you check a piece of code into source control, you don’t want to wait for a human to manually approve it or have each piece of code run through different quality checks. Using a CI/CD pipeline can help automate the steps required to release your software deployment and standardize on a core set of quality checks.
Review the built-in Amazon CloudWatch metrics and their dimensions for each of the services you plan to use so that you can decide how to best leverage them vs. adding custom metrics. There are also many third-party tools that provide monitoring and metrics reporting from CloudWatch data.
Business Key Performance Indicators (KPIs) measure your application performance against business goals. It is extremely important to know when something is critically affecting your overall business (revenue-wise or not).
Customer experience data dictates not only the overall effectiveness of the UI/UX but also whether changes or anomalies are affecting the customer experience in a particular section of your application. These metrics are often measured in percentiles so that outliers do not distort your view of the impact over time and how widespread it is across your customer base.
Examples: Perceived latency, time it takes to add an item to a basket/to checkout, page load times
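To see why percentiles are preferred over averages for these metrics, consider the small sketch below. It uses the nearest-rank percentile method, which is one of several common definitions, and the page load samples are invented for illustration only.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p percent
    of all samples at or below it. p is an integer from 1 to 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Invented page load times in milliseconds, with one extreme outlier.
page_load_ms = [120, 130, 125, 140, 135, 128, 132, 138, 126, 4000]

print("mean:", sum(page_load_ms) / len(page_load_ms))
print("p50:", percentile(page_load_ms, 50))
print("p90:", percentile(page_load_ms, 90))
print("p99:", percentile(page_load_ms, 99))
```

In this made-up data set the single slow request pulls the mean far above what most users experienced, while p50 and p90 stay close to the typical value and p99 exposes the outlier.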
Vendor and application metrics are important to underpin root causes. System metrics also tell you if your systems are healthy, at risk, or already impacting your customers.
Examples: Percentage of HTTP errors/success, memory utilization, function duration/error/throttling, queue length, stream records length, integration latency
Ops metrics are important to understand sustainability and maintenance of a given system and crucial to pinpoint how stability has progressed/degraded over time.
Examples: Number of tickets ([un]successful resolutions, etc.), number of times people on-call were paged, availability, CI/CD pipeline stats (successful/failed deployments, feedback time, cycle and lead time)
Logs let you dig into specific issues, but you can also use log data to create business-level metrics via CloudWatch Logs metric filters. You can interact with logs via CloudWatch Logs to drill into any specific log entry or filter them based on a pattern to create your own metrics. See how the services listed below interact with CloudWatch Logs.
Lambda automatically logs all requests handled by your function and stores them in CloudWatch Logs. This gives you access to information about each invocation of your Lambda function.
You can log almost anything to CloudWatch Logs by using print or standard out statements in your functions. When you create custom logs, use a structured format like a JSON event to make it easier to report from them.
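A minimal sketch of structured logging from a Python function follows. The helper name log_event and the order_id field are hypothetical. The only behavior relied on is that anything written to standard out from a Lambda function ends up in CloudWatch Logs.

```python
import json


def log_event(level, message, **fields):
    """Write one JSON object per line to standard out and return the line."""
    record = {"level": level, "message": message, **fields}
    line = json.dumps(record, sort_keys=True)
    print(line)
    return line


def handler(event, context):
    """Hypothetical Lambda handler that logs a structured record per request."""
    order_id = event.get("order_id")
    log_event("INFO", "order received", order_id=order_id)
    return {"statusCode": 200}


handler({"order_id": "A-1001"}, None)
```

Keeping one JSON object per line means each field (here level, message, and order_id) can be queried or filtered individually instead of being parsed out of free text.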
API Gateway execution and access logs
API Gateway execution logs include information on errors as well as execution traces. Info like parameter values, payload, Lambda authorizers used, and API keys appear in the execution logs. You can log just errors or errors and info. Logging is set up per API stage. These logs are detailed, so you want to be thoughtful about what you need. Also, log groups don’t expire by default, so make sure to set retention values suitable to your workload.
You can also create custom access logs and send them to your preferred CloudWatch group to track who is accessing your APIs and how. You can specify the access details by selecting context variables and choosing the format you want to use.
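As an illustration, an access log format can be assembled as a JSON string whose values are API Gateway context variables. The selection of fields and the key names below are arbitrary choices for this sketch. You would paste the printed string into the access log format setting of the stage.

```python
import json

# Key names are arbitrary; the values are API Gateway $context variables.
ACCESS_LOG_FIELDS = {
    "requestId": "$context.requestId",
    "ip": "$context.identity.sourceIp",
    "requestTime": "$context.requestTime",
    "httpMethod": "$context.httpMethod",
    "resourcePath": "$context.resourcePath",
    "status": "$context.status",
    "responseLength": "$context.responseLength",
}

access_log_format = json.dumps(ACCESS_LOG_FIELDS)
print(access_log_format)
```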
CloudWatch Logs Insights
CloudWatch Logs Insights lets you use prebuilt or custom queries on your logs to provide aggregated views and reporting. If you’ve created structured custom logs, CloudWatch Logs Insights can automatically discover the fields in your logs to make it easy to query and group your log data.
When a transaction fails, or completes slower than expected, how do you figure out where in the flow of services it failed? X-Ray gives you a visual representation of your services—a service map—that illustrates each integration point, and gives you quick insight into successes and failures. Then, you can drill down into the details of each individual trace.
You can enable X-Ray with one click for Lambda, API Gateway, and Amazon SNS. You can also turn it on for SQS queues that are not Lambda event sources.
You can add custom instrumentation to your function using the X-Ray SDK to write your own code. X-Ray integrations support both active and passive instrumentation:
Active instrumentation: Samples and instruments incoming requests
Passive instrumentation: Instruments requests that have been sampled by another service
Writes traces to X-Ray
Can add information to traces
Amazon API Gateway
CloudWatch metrics – To view how resources are performing, CloudWatch metrics is the best solution. If a developer needs to check how many times a Lambda function has been invoked, CloudWatch metrics would be the best option.
CloudWatch Logs Insights – CloudWatch Logs Insights enables you to interactively query your log data in CloudWatch Logs. If a team wants to search and query their logs for their API, CloudWatch Logs Insights would be the best option.
CloudWatch Logs – You can insert logging statements into your code to help you validate that your code is working as expected. Lambda automatically integrates with CloudWatch Logs and pushes all logs from your code to CloudWatch. If an engineer wants to see what parameters are being passed into a function, they can insert logging statements in the code and check the response in CloudWatch Logs.
X-Ray – X-Ray provides a visual map of successes and failures and lets you drill into individual traces to see how long each leg of an execution took.
Records IAM user, IAM role, and AWS service API activity in your account.
Is enabled when you create an account.
Provides full details about the API action, like identity of the requestor, time of the API call, request parameters, and response elements returned by the service.
When activity occurs in your AWS account, that activity is recorded in a CloudTrail event, and you can see recent events in the event history.
The CloudTrail event history provides a viewable, searchable, and downloadable record of the past 90 days of CloudTrail events. Use this history to gain visibility into actions taken in your AWS account in the AWS Management Console, AWS SDKs, command line tools, and other AWS services.
A trail is a configuration that enables delivery of CloudTrail events to an Amazon S3 bucket, CloudWatch Logs, and CloudWatch Events. If you need to maintain a longer history of events, you can create your own trail. When you create a trail, it tracks events performed on or within resources in your AWS account and writes them to an S3 bucket you specify.
For example, a trail could capture modifications to your API Gateway APIs. You can optionally add data events to track S3 object-level API activity (like when someone uploads something to the bucket) or Lambda invoke API operations on one or all future Lambda functions in the account.
You can configure CloudTrail Insights on your trails to help you identify and respond to unusual activity associated with write API calls. CloudTrail Insights is a feature that tracks your normal patterns of API call volume and generates Insights events when the volume is outside normal patterns.
we’ll look at considerations for migrating existing applications to serverless and common ways for extending the serverless
At a high level, there are three migration patterns that you might follow to migrate your legacy applications to a serverless model.
As the name suggests, you bypass interim steps and go straight from an on-premises legacy architecture to a serverless cloud architecture.
You move on-premises applications to the cloud in more of a “lift and shift” model. In this model, existing applications are kept intact, either running on Amazon Elastic Compute Cloud (Amazon EC2) instances or with some limited rewrites to container services like Amazon Elastic Kubernetes Service (Amazon EKS)/Amazon Elastic Container Service (Amazon ECS) or AWS Fargate.
Developers experiment with Lambda in low-risk internal scenarios like log processing or cron jobs. As you gain more experience, you might use serverless components for tasks like data transformations and parallelization of processes.
At some point in the adoption curve, you take a more strategic look at how serverless and microservices might address business goals like market agility, developer innovation, and total cost of ownership.
You get buy-in for a more long-term commitment to invest in modernizing your applications and select a production workload as a pilot. With initial success and lessons learned, adoption accelerates, and more applications are migrated to microservices and serverless.
With the strangler pattern, an organization incrementally and systematically decomposes monolithic applications by creating APIs and building event-driven components that gradually replace components of the legacy application.
Distinct API endpoints can point to old vs. new components, and safe deployment options (like canary deployments) let you point back to the legacy version with very little risk.
New feature branches can be “serverless first,” and legacy components can be decommissioned as they are replaced. This pattern represents a more systematic approach to adopting serverless, allowing you to move to critical improvements where you see benefit quickly but with less risk and upheaval than the leapfrog pattern.
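The routing idea behind the strangler pattern can be sketched in a few lines. Everything here is hypothetical: the handler functions stand in for the legacy application and a new serverless component, and the /orders prefix is an arbitrary example of a migrated endpoint.

```python
def legacy_app(path):
    """Stand-in for the monolith that still serves unmigrated endpoints."""
    return f"legacy handled {path}"


def serverless_orders(path):
    """Stand-in for a new serverless component that replaced one endpoint."""
    return f"serverless handled {path}"


# Endpoints move into this table one at a time as they are replaced.
MIGRATED_PREFIXES = {"/orders": serverless_orders}


def route(path):
    """Send migrated endpoints to new components and everything else to legacy."""
    for prefix, handler in MIGRATED_PREFIXES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return handler(path)
    return legacy_app(path)


print(route("/orders/42"))
print(route("/invoices/7"))
```

Removing an entry from the table points that endpoint back at the legacy version, which is what keeps the risk of each individual step low.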
Migration questions to answer:
What does this application do, and how are its components organized?
How can you break your data needs up based on the command query responsibility segregation (CQRS) pattern?
How does the application scale, and what components drive the capacity you need?
Do you have schedule-based tasks?
Do you have workers listening to a queue?
Where can you refactor or enhance functionality without impacting the current implementation?
Application Load Balancer vs. API Gateway for directing traffic to serverless targets
Application Load Balancer
Easier to transition existing compute stack where you are already using an Application Load Balancer
Supports authorization via OIDC-capable providers, including Amazon Cognito user pools
Charged by the hour, based on Load Balancer Capacity Units
May be more cost-effective for a steady stream of traffic
Amazon API Gateway
Good for building REST APIs and integrating with other services and Lambda functions
Supports authorization via AWS Identity and Access Management (IAM), Amazon Cognito, and Lambda authorizers
Charged based on requests served
May be more cost-effective for spiky patterns
Additional features for API management: export SDK for clients, use throttling and usage plans to control access, maintain multiple versions of an API, and canary deployments
Consider three factors when comparing costs of ownership:
The infrastructure cost to run your workload (for example, the costs for your provisioned EC2 capacity vs. the per-invocation cost of your Lambda functions)
The development effort to plan, architect, and provision resources on which the application will run
The costs of your team’s time to maintain the application once it is in production
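One simple way to keep all three factors in view is to add them up per option before comparing. The sketch below does only that. The class name is hypothetical and every figure is a placeholder invented to show the arithmetic, not real pricing.

```python
from dataclasses import dataclass


@dataclass
class OwnershipCost:
    """Three cost factors for one deployment option, in the same currency and period."""
    infrastructure: float
    development: float
    maintenance: float

    def total(self) -> float:
        return self.infrastructure + self.development + self.maintenance


# Placeholder monthly figures, invented only to illustrate the comparison.
ec2_option = OwnershipCost(infrastructure=900.0, development=2000.0, maintenance=1500.0)
lambda_option = OwnershipCost(infrastructure=400.0, development=2500.0, maintenance=500.0)

cheaper = min(
    [("EC2", ec2_option), ("Lambda", lambda_option)],
    key=lambda pair: pair[1].total(),
)
print(cheaper[0], cheaper[1].total())
```

With these made-up numbers, the option with the higher development cost still comes out ahead once infrastructure and maintenance are included, which is why comparing infrastructure cost alone can mislead.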
AWS Transit Gateway is a highly available and scalable service that provides interconnectivity between VPCs and your on-premises network. Within a Region, AWS Transit Gateway provides a method for consolidating and centrally managing routing between VPCs with a hub-and-spoke network architecture.
Between Regions, AWS Transit Gateway supports inter-regional peering with other transit gateways. It does this to facilitate routing network traffic between VPCs of different Regions over the AWS global backbone. This removes the need to route traffic over the internet. AWS Transit Gateway also integrates with hybrid network configurations when a Direct Connect or AWS Site-to-Site VPN connection is connected to the transit gateway.
AWS Transit Gateway concepts
AWS Transit Gateway supports the following connections:
One or more VPCs
A compatible Software-Defined Wide Area Network (SD-WAN) appliance
A Direct Connect gateway
A peering connection with another transit gateway
A VPN connection to a transit gateway
AWS Transit Gateway MTU
AWS Transit Gateway supports an MTU of 8,500 bytes for:
Direct Connect connections
Connections to other transit gateways
AWS Transit Gateway supports an MTU of 1,500 bytes for VPN connections.
AWS Transit Gateway route table
A transit gateway has a default route table and can optionally have additional route tables. A route table includes dynamic and static routes that decide the next hop based on the destination IP address of the packet. The target of these routes can be any transit gateway attachment.
Each attachment is associated with exactly one route table. Each route table can be associated with zero to many attachments.
A VPC, VPN connection, or Direct Connect gateway can dynamically propagate routes to a transit gateway route table. With a Direct Connect attachment, the routes are propagated to a transit gateway route table by default.
With a VPC, you must create static routes to send traffic to the transit gateway.
With a VPN connection or a Direct Connect gateway, routes are propagated from the transit gateway to your on-premises router using BGP.
With a peering attachment, you must create a static route in the transit gateway route table to point to the peering attachment.
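The next-hop decision described above can be sketched as a lookup that prefers the most specific matching route. This is an illustration only: the attachment names and CIDR blocks are hypothetical, and real route evaluation also weighs factors such as static versus propagated routes, which this sketch ignores.

```python
import ipaddress


def next_hop(route_table, destination_ip):
    """Return the attachment for the most specific route that contains the
    destination address, or None if no route matches."""
    dest = ipaddress.ip_address(destination_ip)
    best = None
    for cidr, attachment in route_table.items():
        network = ipaddress.ip_network(cidr)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, attachment)
    return best[1] if best else None


# Hypothetical route table: destination CIDR -> transit gateway attachment.
routes = {
    "10.1.0.0/16": "vpc-attachment-a",
    "10.2.0.0/16": "vpc-attachment-b",
    "10.2.5.0/24": "peering-attachment",
    "0.0.0.0/0": "vpn-attachment",
}

print(next_hop(routes, "10.1.3.4"))
print(next_hop(routes, "10.2.5.7"))
print(next_hop(routes, "192.168.1.1"))
```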
AWS Transit Gateway inter-regional peering
AWS offers two types of peering connections for routing traffic between VPCs in different Regions: VPC peering and transit gateway peering. Both peering types are one-to-one, but transit gateway peering connections have a simpler network design and more consolidated management.
Suppose a customer has multiple VPCs in three different Regions. As the following diagram illustrates, permitting network traffic to route between every pair of VPCs requires creating 72 VPC peering connections. Each VPC needs 8 different routing configurations and security policies.
With AWS Transit Gateway, the same environment only needs three peering connections. The transit gateway in each Region facilitates routing network traffic to all the VPCs in its Region. Because all routing can be managed by the transit gateway, the customer only needs to maintain three routing configurations, simplifying management.
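The arithmetic behind this comparison can be worked through as follows. The text does not state how many VPCs the customer has, so the sketch assumes nine (three per Region), inferred from each VPC needing 8 routing configurations. Under that assumption a full mesh has 36 distinct VPC pairs and 72 connection ends to configure (each pair is configured on both sides), which may be how the figure of 72 is counted.

```python
def full_mesh_connections(node_count):
    """Number of distinct pairs in a full mesh of node_count nodes."""
    return node_count * (node_count - 1) // 2


vpc_count = 9      # assumption: three VPCs in each of three Regions
region_count = 3   # one transit gateway per Region

pairs = full_mesh_connections(vpc_count)
configs_per_vpc = vpc_count - 1
connection_ends = vpc_count * configs_per_vpc

tgw_peerings = full_mesh_connections(region_count)

print("VPC pairs in a full mesh:", pairs)
print("routing configurations per VPC:", configs_per_vpc)
print("connection ends to configure:", connection_ends)
print("transit gateway peering connections:", tgw_peerings)
```

However the VPC-level figure is counted, it grows with the square of the number of VPCs, while the transit gateway figure depends only on the number of Regions.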
AWS Site-to-Site VPN enables you to securely connect your on-premises network to Amazon VPC, for example your branch office site.
AWS Client VPN enables you to securely connect users to AWS or on-premises networks, for example remote employees.
AWS Site-to-Site VPN
Based on IPsec technology, AWS Site-to-Site VPN uses a VPN tunnel to pass data from the customer network to or from AWS.
One AWS Site-to-Site VPN connection consists of two tunnels. Each tunnel terminates in a different Availability Zone on the AWS side, but it must terminate on the same customer gateway on the customer side.
AWS Site-to-Site VPN components
A customer gateway is a resource you create and configure in AWS that represents your on-premises gateway device. The resource contains information about the type of routing used by the Site-to-Site VPN, the BGP ASN, and other optional configuration information.
Customer gateway device
A customer gateway device is a physical device or software application on your side of the AWS Site-to-Site VPN connection.
Virtual private gateway
A virtual private gateway is the VPN concentrator on the Amazon side of the AWS Site-to-Site VPN connection. You use a virtual private gateway or a transit gateway as the gateway for the Amazon side of the AWS Site-to-Site VPN connection.
A transit gateway is a transit hub that can be used to interconnect your VPCs and on-premises networks. You use a transit gateway or virtual private gateway as the gateway for the Amazon side of the AWS Site-to-Site VPN connection.
AWS Site-to-Site VPN limitations
IPv6 traffic is partially supported. AWS Site-to-Site VPN supports IPv4/IPv6 dual stack through separate tunnels for inner traffic. IPv6 is not supported for the outer tunnel connection.
AWS Site-to-Site VPN does not support Path MTU Discovery. The greatest Maximum Transmission Unit (MTU) available on the inside tunnel interface is 1,399 bytes.
Throughput of AWS Site-to-Site VPN connections is limited. When terminating on a virtual private gateway, only one tunnel out of the pair can be active and carry a maximum of 1.25 Gbps. However, real-life throughput will be about 1 Gbps. When terminating on AWS Transit Gateway, both tunnels in the pair can be active and carry an aggregate maximum of 2.5 Gbps. However, real-life throughput will be about 2 Gbps. Each flow (for example, a TCP stream) will still be limited to a maximum of 1.25 Gbps, with a real-life value of about 1 Gbps.
Maximum packets per second (PPS) per VPN tunnel is 140,000.
AWS Site-to-Site VPN terminating on AWS Transit Gateway supports equal-cost multi-path routing (ECMP) and multi-exit discriminator (MED) across tunnels in the same and different connections. ECMP is only supported for Site-to-Site VPN connections activated on an AWS Transit Gateway. MED is used to identify the primary tunnel for Site-to-Site VPN connections that use BGP. Note that BFD is not yet supported on AWS Site-to-Site VPN, though it is supported on Direct Connect.
AWS Site-to-Site VPN endpoints use public IPv4 addresses and therefore require a public virtual interface to transport traffic over Direct Connect. Support for AWS Site-to-Site VPN over private Direct Connect is not yet available.
For globally distributed applications, the accelerated Site-to-Site VPN option provides a connection to the global AWS backbone through AWS Global Accelerator. Because the Global Accelerator IP space is not announced over a Direct Connect public virtual interface, you cannot use accelerated Site-to-Site VPN with a Direct Connect public virtual interface.
In addition, when you connect your VPCs to a common on-premises network, it’s recommended that you use nonoverlapping CIDR blocks for your networks.
Based on OpenVPN technology, Client VPN is a managed client-based VPN service that lets you securely access your AWS resources and resources in your on-premises network. With Client VPN, you can access your resources from any location using an OpenVPN-based VPN client.
Client VPN components
Client VPN endpoint
Your Client VPN administrator creates and configures a Client VPN endpoint in AWS. Your administrator controls which networks and resources you can access when you establish a VPN connection.
VPN client application
This is the software application that you use to connect to the Client VPN endpoint and establish a secure VPN connection.
Client VPN endpoint configuration file
This is a configuration file that is provided to you by your Client VPN administrator. The file includes information about the Client VPN endpoint and the certificates required to establish a VPN connection. You load this file into your chosen VPN client application.
Client VPN limitations
Client VPN supports IPv4 traffic only. IPv6 is not supported.
Security Assertion Markup Language (SAML) 2.0-based federated authentication only works with an AWS-provided client v1.2.0 or later.
SAML integration with AWS Single Sign-On requires a workaround. Better integration is being worked on.
Client CIDR ranges must have a block size of at least /22 and must not be greater than /12.
A Client VPN endpoint does not support subnet associations in a dedicated tenancy VPC.
Client VPN is not compliant with Federal Information Processing Standards (FIPS).
Client CIDR ranges cannot overlap with the local CIDR of the VPC in which the associated subnet is located. It also cannot overlap any routes manually added to the Client VPN endpoint’s route table.
A portion of the addresses in the client CIDR range is used to support the availability model of the Client VPN endpoint and cannot be assigned to clients. Therefore, we recommend that you assign a CIDR block that contains twice the number of required IP addresses. This helps ensure that the maximum number of concurrent connections you plan to support is available on the Client VPN endpoint.
The client CIDR range cannot be changed after you create the Client VPN endpoint.
The subnets associated with a Client VPN endpoint must be in the same VPC.
You cannot associate multiple subnets from the same Availability Zone with a Client VPN endpoint.
AWS Certificate Manager (ACM) certificates are not supported with mutual authentication because you cannot extract the private key. You can use an ACM certificate as the server-side certificate. But, to add a client certificate to your customer configuration, you cannot use a general ACM certificate because you can’t extract the required private key details. So you must access the keys in one of two ways. Either generate your own certificate where you have the key or use AWS Certificate Manager Private Certificate Authority (ACM PCA), which gives you the private keys. If the customer is authenticating based on Active Directory or SAML, they can use a general ACM-generated certificate because only the server certificate is required.
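Several of the client CIDR limitations above can be checked before you create an endpoint. The sketch below covers three of them (block size between /12 and /22, no overlap with the VPC CIDR, and twice the required number of addresses). The function name and the sample CIDR blocks are hypothetical, and the check does not cover routes added manually to the endpoint’s route table.

```python
import ipaddress


def check_client_cidr(client_cidr, vpc_cidr, required_clients):
    """Return a list of problems with a proposed client CIDR range.

    An empty list means the range passed the three checks in this sketch.
    """
    client = ipaddress.ip_network(client_cidr)
    vpc = ipaddress.ip_network(vpc_cidr)
    problems = []
    if not 12 <= client.prefixlen <= 22:
        problems.append("prefix length must be between /12 and /22")
    if client.overlaps(vpc):
        problems.append("client CIDR overlaps the VPC CIDR")
    if client.num_addresses < 2 * required_clients:
        problems.append("block holds fewer than twice the required addresses")
    return problems


print(check_client_cidr("172.16.0.0/22", "10.0.0.0/16", 500))
print(check_client_cidr("10.0.4.0/24", "10.0.0.0/16", 100))
```

Because the client CIDR range cannot be changed after the endpoint is created, running a check like this up front is cheaper than recreating the endpoint later.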
Direct Connect provides a private, reliable connection to AWS from your physical facility, such as a data center or office. It is a fully integrated and redundant AWS service that provides complete control over the data exchanged between your AWS environment and the physical location of your choice.
Direct Connect offers consistent performance with reduced bandwidth cost, backed by a service-level agreement that guarantees 99.99 percent availability.
When choosing to implement a Direct Connect connection, you should first consider bandwidth, connection type, protocol configurations, and other network configuration specifications.
Direct Connect offers physical connections of 1, 10, and 100 Gbps to support your private connectivity needs to the cloud. Direct Connect supports the Link Aggregation Control Protocol (LACP), facilitating multiple dedicated physical connections to be grouped into link aggregation groups (LAGs). When you group connections into LAGs, you can treat the multiple connections as a single, managed connection.
Available only in select locations, the 100-Gbps connection is particularly beneficial for applications that transfer large-scale datasets. Such applications include broadcast media distribution, advanced driver assistance systems for autonomous vehicles, and financial services trading and market information systems.
Consider the following Direct Connect specifications:
All connections must be dedicated connections and have a port speed of 1 Gbps, 10 Gbps, or 100 Gbps.
All connections in the LAG must use the same bandwidth.
You can have a maximum of two 100-Gbps connections in a LAG, or four connections with a port speed less than 100 Gbps. Each connection in the LAG counts toward your overall connection limit for the Region.
All connections in the LAG must terminate at the same Direct Connect endpoint.
When you create a LAG, you can download the Letter of Authorization and Connecting Facility Assignment (LOA-CFA) for each new physical connection individually from the Direct Connect console.
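The port speed and connection count rules in the list above can be expressed as a small validation function. This sketch encodes only the limits as stated here (which may differ from current service quotas), and the function name is hypothetical.

```python
ALLOWED_PORT_SPEEDS_GBPS = (1, 10, 100)


def validate_lag(port_speeds_gbps):
    """Return a list of problems for a proposed LAG, given the port speed
    in Gbps of each dedicated connection. An empty list means no problem."""
    problems = []
    if not port_speeds_gbps:
        return ["a LAG needs at least one connection"]
    if any(speed not in ALLOWED_PORT_SPEEDS_GBPS for speed in port_speeds_gbps):
        problems.append("port speed must be 1, 10, or 100 Gbps")
    if len(set(port_speeds_gbps)) > 1:
        problems.append("all connections must use the same bandwidth")
    # Limits as stated in the text: two 100-Gbps connections, otherwise four.
    limit = 2 if 100 in port_speeds_gbps else 4
    if len(port_speeds_gbps) > limit:
        problems.append(f"too many connections (limit {limit})")
    return problems


print(validate_lag([10, 10, 10, 10]))
print(validate_lag([100, 100, 100]))
print(validate_lag([1, 10]))
```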
To use Direct Connect in a Direct Connect location, your network must meet one of the following conditions:
Your network is co-located with an existing Direct Connect location.
You are working with a Direct Connect Partner.
You are working with an independent service provider to connect to Direct Connect.
The two most common solutions are co-locating at a Direct Connect location or contracting with a Direct Connect Partner.
Co-locating at a Direct Connect location
You deploy a router and supporting network equipment to a location with a physical uplink to AWS. Your router at the Direct Connect location is connected to the AWS router using a cross connect. This establishes the physical link used by the Direct Connect service to connect your physical location with AWS.
Contracting with a Direct Connect Partner
The Direct Connect Partner provides you with the physical equipment necessary to connect to an AWS router at the Partner’s physical location. You use this physical link to configure the Direct Connect service to link your physical location with AWS.
Additionally, your network must meet the following conditions:
Your network must use single-mode fiber with one of the following:
1000BASE-LX (1,310 nm) transceiver for 1-gigabit Ethernet
10GBASE-LR (1,310 nm) transceiver for 10-gigabit Ethernet
100GBASE-LR4 for 100-gigabit Ethernet
Auto-negotiation for the port must be deactivated. Port speed and full-duplex mode must be configured manually.
802.1Q VLAN encapsulation must be supported across the entire connection, including intermediate devices.
Your device must support Border Gateway Protocol (BGP) and BGP MD5 authentication.
(Optional) You can configure Bidirectional Forwarding Detection (BFD) on your network. Asynchronous BFD is automatically activated for Direct Connect virtual interfaces, but does not take effect until you configure it on your router or customer gateway device.
When all the physical components are in place to create the Direct Connect connection, AWS will provide you with an LOA-CFA. The LOA-CFA lets you show the operator of the facility hosting the AWS router that AWS approves your request to connect to the AWS router. This connection will complete the last physical step in setting up the Direct Connect connection.
When this is done, you can complete the setup using the AWS Management Console. Here you can choose the virtual interface type your connection will use and configure the Direct Connect gateway.
Virtual interface types
Direct Connect supports three different virtual interfaces:
A private virtual interface permits traffic to be routed to any VPC resource in the same private IP space as the virtual interface.
A public virtual interface permits traffic to be routed to any VPC or AWS regional resource with a public IP address in the same Region.
A transit virtual interface permits traffic to be routed to any VPC or AWS regional resource routable through an AWS Transit Gateway in the same Region.
A VPC peering connection is a networking connection between two VPCs that lets you route traffic between them privately.
Benefits of VPC peering
A VPC peering connection is highly available. This is because it is neither a gateway nor a VPN connection and does not rely on a separate piece of physical hardware. There is no bandwidth bottleneck or single point of failure for communication. A VPC peering connection helps to facilitate the transfer of data.
You can establish peering relationships between VPCs across different AWS Regions. This is called inter-Region VPC peering. It permits VPC resources that run in different AWS Regions to communicate securely with each other. Examples of these resources include EC2 instances, Amazon Relational Database Service (Amazon RDS) databases, and AWS Lambda functions. This communication is accomplished using private IP addresses, without requiring gateways, VPN connections, or separate network appliances. All inter-Region traffic is encrypted with no single point of failure or bandwidth bottleneck. Traffic always stays on the global AWS backbone and never traverses the public internet, which reduces threats such as common exploits and distributed denial of service (DDoS) attacks. Inter-Region VPC peering provides an uncomplicated and cost-effective way to share resources between Regions or replicate data for geographic redundancy.
You can also create a VPC peering connection between VPCs in different AWS accounts.
Why you would set up a VPC peering connection
Full sharing of resources between all VPCs
Your organization has company services distributed across four VPCs and a single VPC dedicated to centralized IT services and logging. To facilitate data sharing, the IT department constructed a full mesh network design using VPC peering to connect each VPC to every other VPC in the organization.
Each VPC must have a one-to-one connection with each VPC it is approved to communicate with. This is because each VPC peering connection is nontransitive in nature and does not allow network traffic to pass from one peering connection to another.
For example, VPC 1 is peered with VPC 2, and VPC 2 is peered with VPC 4. You cannot route packets from VPC 1 to VPC 4 through VPC 2. To route packets directly between VPC 1 and VPC 4, you can create a separate VPC peering connection between them.
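The nontransitive rule can be modeled by treating each peering connection as an unordered pair and allowing traffic only when a direct pair exists. This is a toy model for reasoning about designs, and the function name is hypothetical. The VPC names mirror the example above.

```python
def can_route(peerings, source, destination):
    """Traffic can flow only over a direct peering connection; peering
    connections are never chained through an intermediate VPC."""
    return frozenset((source, destination)) in peerings


peerings = {
    frozenset(("vpc-1", "vpc-2")),
    frozenset(("vpc-2", "vpc-4")),
}

print(can_route(peerings, "vpc-1", "vpc-2"))  # direct peering exists
print(can_route(peerings, "vpc-1", "vpc-4"))  # only reachable through vpc-2

# Adding a separate peering connection makes the route possible.
peerings.add(frozenset(("vpc-1", "vpc-4")))
print(can_route(peerings, "vpc-1", "vpc-4"))
```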
Partial sharing of centralized resources
Your organization’s IT department maintains a central VPC for file sharing. Multiple VPCs require access to this resource but do not need to send traffic to each other. A peering connection is established to connect the VPCs solely to this resource.
Non-valid peering configurations
Overlapping CIDR blocks
You cannot create a VPC peering connection between VPCs that have matching or overlapping IPv4 Classless Inter-Domain Routing (CIDR) blocks. This limitation applies even if the VPCs have nonoverlapping IPv6 CIDR blocks and you intend to use the VPC peering connection for IPv6 communication only.
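Before requesting a peering connection, you can check for overlap yourself. The following is a minimal sketch that uses Python's standard `ipaddress` module. The function name `can_peer` and the CIDR blocks are assumptions chosen for illustration, and the check models only the IPv4 overlap rule described above.

```python
import ipaddress


def can_peer(cidrs_a, cidrs_b):
    """Return False if any IPv4 CIDR block of one VPC overlaps one of the other.

    IPv6 blocks are ignored on purpose: an IPv4 overlap blocks the peering
    connection even when the IPv6 blocks do not overlap.
    """
    nets_a = [ipaddress.ip_network(c) for c in cidrs_a]
    nets_b = [ipaddress.ip_network(c) for c in cidrs_b]
    v4_a = [n for n in nets_a if n.version == 4]
    v4_b = [n for n in nets_b if n.version == 4]
    return not any(a.overlaps(b) for a in v4_a for b in v4_b)


# Hypothetical CIDR blocks for illustration.
print(can_peer(["10.0.0.0/16"], ["10.1.0.0/16"]))    # distinct ranges
print(can_peer(["10.0.0.0/16"], ["10.0.128.0/17"]))  # second range sits inside the first
```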
You have a VPC peering connection between VPC A and VPC B, and between VPC A and VPC C. There is no VPC peering connection between VPC B and VPC C. You cannot route packets directly from VPC B to VPC C through VPC A.
Edge-to-edge routing through a gateway or private connection
If either VPC in a peering relationship has one of the following connections, you cannot extend the peering relationship to that connection:
A VPN connection or a Direct Connect connection to a corporate network
An internet connection through an internet gateway
An internet connection in a private subnet through a NAT device
A gateway VPC endpoint to an AWS service, for example, an endpoint to Amazon S3
A VPC endpoint lets you privately connect your VPC to supported AWS services and VPC endpoint services. With VPC endpoints, resources inside a VPC do not require public IP addresses to communicate with resources outside the VPC. Traffic between Amazon Virtual Private Cloud (Amazon VPC) and a service does not leave the Amazon network.
VPC endpoints are a security product first and a connectivity product second. VPC endpoints do not allow traffic between your VPC and the other services to leave the Amazon network.
You might have stringent compliance requirements that prevent connectivity between a VPC and a public-facing service endpoint. In this case, VPC endpoints offer a way to use AWS services from your VPCs that would otherwise not be available.
A VPC endpoint does not require an internet gateway, virtual private gateway, network address translation (NAT) device, virtual private network (VPN) connection, or Direct Connect connection. Instances in your VPC do not require a public IP address to connect to services presented through a VPC endpoint.
The following are the different types of VPC endpoints. You create the type of VPC endpoint that is required by the supported service.
Gateway VPC endpoints
A gateway VPC endpoint targets specific IP routes in a VPC route table in the form of a prefix list. This is used for traffic destined to Amazon DynamoDB or Amazon Simple Storage Service (Amazon S3).
Instances in a VPC do not require public IP addresses to communicate with VPC endpoints. This is because interface endpoints use local IP addresses within the consumer VPC. Gateway endpoints are destinations that are reachable from within a VPC through prefix lists within the VPC’s route table.
In the following diagram, instances in subnet 1 can send and receive traffic to and from the internet and the S3 bucket. Instances in subnet 2 only have access to the S3 bucket.
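The way a prefix-list route steers traffic to a gateway endpoint can be modeled as a longest-prefix-match lookup. The sketch below is a simplified model, not the actual VPC routing implementation. The CIDR used for the S3 prefix list, the target names, and the `resolve_target` helper are all hypothetical.

```python
import ipaddress


def resolve_target(route_table, destination_ip):
    """Pick the most specific (longest-prefix) route that matches the destination."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no matching route: the traffic is dropped
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical stand-in for the CIDR ranges in the Amazon S3 prefix list.
S3_PREFIX = "198.51.100.0/24"

# Subnet 1 has a default route to an internet gateway and a gateway endpoint route.
subnet_1_routes = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "igw-example"),
    (S3_PREFIX, "vpce-example"),
]

# Subnet 2 has only the local route and the gateway endpoint route.
subnet_2_routes = [
    ("10.0.0.0/16", "local"),
    (S3_PREFIX, "vpce-example"),
]

print(resolve_target(subnet_1_routes, "198.51.100.10"))
print(resolve_target(subnet_2_routes, "203.0.113.5"))
```

In this model, subnet 2 reaches only the S3 range because its route table has no default route to an internet gateway.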
Interface VPC endpoints
Powered by AWS PrivateLink, an interface endpoint is an elastic network interface with a private IP address from the IP address range of your subnet. It serves as an entry point for traffic destined to a supported AWS service or a VPC endpoint service.
Gateway Load Balancer endpoint
A Gateway Load Balancer endpoint is an elastic network interface with a private IP address from the IP address range of your subnet. This type of endpoint serves as an entry point to intercept traffic and route it to a service that you’ve configured using Gateway Load Balancers, for example, for security inspection. You specify a Gateway Load Balancer endpoint as a target for a route in a route table. Gateway Load Balancer endpoints are supported for endpoint services that are configured for Gateway Load Balancers only.
Like interface endpoints, Gateway Load Balancer endpoints are also powered by AWS PrivateLink.
What is AWS PrivateLink?
AWS PrivateLink provides a private connection between your VPCs and supported AWS services. Traffic stays within the AWS network and is not exposed to the public internet.
Before AWS PrivateLink, services within a single VPC were connected to multiple VPCs in two ways:
Public IP addresses using the internet gateway of the VPC
Private IP addresses using VPC peering
With AWS PrivateLink, services establish a Transmission Control Protocol (TCP) connection between the service provider’s VPC and the service consumer’s VPC. This provides a secure and scalable solution.
In the following diagram, traffic from Amazon Elastic Compute Cloud (Amazon EC2) instances in private subnets is routed to a Network Load Balancer. The Network Load Balancer is connected to instances in public subnets that communicate with the internet. This architecture permits backend EC2 instances to communicate with the front-end instances through the AWS PrivateLink endpoint, and it avoids the security and cost implications of data traveling through the public internet.
Benefits of AWS PrivateLink
AWS PrivateLink considerations
AWS PrivateLink does not support IPv6.
Traffic will be sourced from the Network Load Balancer inside the service provider VPC. From the perspective of the service provider application, all IP traffic will originate from the Network Load Balancer. All IP addresses logged by the application will be the private IP addresses of the Network Load Balancer. The service provider application will never see the IP addresses of the customer or service consumer.
You can activate Proxy Protocol v2 to gain insight into the network traffic. Network Load Balancers use Proxy Protocol v2 to send additional connection information such as the source and destination. This might require changes to the application.
Endpoint services cannot be tagged.
The private Domain Name System (DNS) of the endpoint does not resolve outside of the VPC. Private DNS hostnames can be configured to point directly to endpoint network interface IP addresses. Endpoint services are available in the AWS Region in which they are created and can be accessed in remote AWS Regions using inter-Region VPC peering.
Availability Zone names in a customer account might not map to the same locations as Availability Zone names in another account. For example, the Availability Zone us-east-1a in one account might not be the same physical location as us-east-1a in another account. An endpoint service is configured in Availability Zones according to their mapping in a customer’s account.
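The Proxy Protocol v2 consideration above is the one most likely to require application changes, because the service must strip and decode a binary header before it reads the request. The following is a minimal sketch of such a decoder, written against the public Proxy Protocol v2 header layout. It handles only TCP over IPv4, and the function name `parse_proxy_v2_ipv4` is an assumption for illustration.

```python
import socket
import struct

# Fixed 12-byte signature that starts every Proxy Protocol v2 header.
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def parse_proxy_v2_ipv4(data):
    """Decode a Proxy Protocol v2 header for TCP over IPv4.

    Returns ((source_ip, source_port), (destination_ip, destination_port)).
    """
    if len(data) < 16 or data[:12] != PP2_SIGNATURE:
        raise ValueError("not a Proxy Protocol v2 header")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd >> 4 != 2:
        raise ValueError("unsupported protocol version")
    if fam_proto != 0x11:  # high nibble 1 = IPv4, low nibble 1 = TCP stream
        raise ValueError("only TCP over IPv4 is handled in this sketch")
    if length < 12 or len(data) < 28:
        raise ValueError("truncated address block")
    src, dst, src_port, dst_port = struct.unpack("!4s4sHH", data[16:28])
    return (socket.inet_ntoa(src), src_port), (socket.inet_ntoa(dst), dst_port)


# Hypothetical header, as a load balancer might prepend it to a connection.
header = (
    PP2_SIGNATURE
    + bytes([0x21, 0x11])      # version 2, PROXY command; IPv4 + TCP
    + struct.pack("!H", 12)    # length of the address block that follows
    + socket.inet_aton("10.0.1.5")
    + socket.inet_aton("10.0.2.9")
    + struct.pack("!HH", 40000, 443)
)
print(parse_proxy_v2_ipv4(header))
```

With the header decoded, the application can log the consumer's source address instead of the private IP address of the Network Load Balancer.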
When building and testing a function, you must specify three primary configuration settings: memory, timeout, and concurrency. These settings are important in defining how each function performs. Deciding how to configure memory, timeout, and concurrency comes down to testing your function in real-world scenarios and against peak volume. As you monitor your functions, you must adjust the settings to optimize costs and ensure the desired customer experience with your application.
You can allocate up to 10 GB of memory to a Lambda function. Lambda allocates CPU and other resources linearly in proportion to the amount of memory configured. Any increase in memory size triggers an equivalent increase in CPU available to your function. To find the right memory configuration for your functions, use the AWS Lambda Power Tuning tool.
The AWS Lambda timeout value dictates how long a function can run before Lambda terminates the Lambda function. At the time of this publication, the maximum timeout for a Lambda function is 900 seconds. This limit means that a single invocation of a Lambda function cannot run longer than 900 seconds (which is 15 minutes).
It is important to analyze how long your function runs. When you analyze the duration, you can more easily identify problems that cause an invocation to run longer than you expect. Load testing your Lambda function is the best way to determine the optimum timeout value.
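One way to turn load-test results into a timeout value is to take the slowest observed run and add headroom, while staying under the 900-second maximum. This is only a sketch of one possible heuristic. The 1.5 headroom factor and the function name `suggest_timeout` are assumptions, not AWS guidance.

```python
import math

MAX_TIMEOUT_SECONDS = 900  # maximum Lambda timeout stated above


def suggest_timeout(observed_seconds, headroom=1.5):
    """Suggest a timeout: slowest observed run times a safety factor, capped at the maximum."""
    if not observed_seconds:
        raise ValueError("at least one observed duration is required")
    return min(MAX_TIMEOUT_SECONDS, math.ceil(max(observed_seconds) * headroom))


# Hypothetical durations (in seconds) collected from a load test.
print(suggest_timeout([1.2, 3.8, 2.5]))
```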
Lambda billing costs
With AWS Lambda, you pay only for what you use. You are charged based on the number of requests for your functions and the duration, the time it takes for your code to run. Lambda counts a request each time it starts running in response to an event notification or an invoke call, including test invokes from the console.
Duration is calculated from the time your code begins running until it returns or otherwise terminates, rounded up to the nearest 1 ms. Price depends on the amount of memory you allocate to your function, not the amount of memory your function uses. If you allocate 10 GB to a function and the function only uses 2 GB, you are charged for the 10 GB. This is another reason to test your functions using different memory allocations to determine which is the most beneficial for the function and your budget.
In the AWS Lambda resource model, you can choose the amount of memory you want for your function and are allocated proportional CPU power and other resources. An increase in memory triggers an equivalent increase in CPU available to your function. The AWS Lambda Free Tier includes 1 million free requests per month and 400,000 GB-seconds of compute time per month.
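The billing model described above can be expressed as a short calculation. The sketch below assumes that compute is metered in GB-seconds (allocated memory multiplied by billed duration) and applies the Free Tier amounts from the text. The prices are passed in as parameters because they vary by Region and architecture. The function names and the values in the usage lines are hypothetical.

```python
import math

FREE_REQUESTS = 1_000_000   # Free Tier requests per month
FREE_GB_SECONDS = 400_000   # Free Tier compute per month


def monthly_gb_seconds(invocations, avg_duration_ms, memory_mb):
    """Compute usage in GB-seconds, billing each invocation in whole milliseconds."""
    billed_ms = math.ceil(avg_duration_ms)
    return invocations * billed_ms * memory_mb / (1000 * 1024)


def monthly_cost(invocations, avg_duration_ms, memory_mb,
                 price_per_gb_second, price_per_million_requests):
    """Estimate the monthly charge after subtracting the Free Tier."""
    gb_seconds = monthly_gb_seconds(invocations, avg_duration_ms, memory_mb)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    billable_requests = max(0, invocations - FREE_REQUESTS)
    return (billable_gb_seconds * price_per_gb_second
            + billable_requests / 1_000_000 * price_per_million_requests)


# Hypothetical workload: 3 million invocations, 120 ms each, 512 MB allocated.
print(monthly_gb_seconds(3_000_000, 120, 512))
```

Note that `memory_mb` is the allocated memory, not the memory the function actually uses, which matches the charging rule described above.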
The balance between power and duration
Depending on the function, you might find that the higher memory level might actually cost less because the function can complete much more quickly than at a lower memory configuration.
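To see why more memory can cost less, compare the GB-seconds consumed per invocation at each memory setting. The measurements below are invented for illustration, and the helper names are hypothetical. In practice you would take the durations from your own tests.

```python
import math


def gb_seconds(memory_mb, duration_ms):
    """GB-seconds consumed by one invocation, billed in whole milliseconds."""
    return (memory_mb / 1024) * (math.ceil(duration_ms) / 1000)


def cheapest(measurements):
    """Return the memory setting (MB) with the lowest GB-seconds per invocation."""
    return min(measurements, key=lambda mem: gb_seconds(mem, measurements[mem]))


# Hypothetical measurements: memory in MB -> average duration in ms.
measurements = {128: 11000, 512: 2500, 1024: 1100, 2048: 900}

for memory_mb, duration_ms in measurements.items():
    print(memory_mb, gb_seconds(memory_mb, duration_ms))
print(cheapest(measurements))
```

In this invented data set the 1,024 MB setting is cheaper than 128 MB because the function finishes ten times faster, while the 2,048 MB setting costs more because the extra memory no longer shortens the run enough to pay for itself.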
You can use an open-source tool called AWS Lambda Power Tuning to find the best configuration for a function. The tool helps you to visualize and fine-tune the memory and power configurations of Lambda functions. The tool runs in your own AWS account, is powered by AWS Step Functions, and supports three optimization strategies: cost, speed, and balanced. It’s language-agnostic, so you can optimize any Lambda function regardless of the language it is written in.
Concurrency and scaling
Concurrency is the third major configuration that affects your function’s performance and its ability to scale on demand. Concurrency is the number of invocations your function runs at any given moment. When your function is invoked, Lambda launches an instance of the function to process the event. When the function code finishes running, it can handle another request. If the function is invoked again while the first request is still being processed, another instance is allocated. Having more than one invocation running at the same time is the function’s concurrency.
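The definition above can be made concrete by counting how many invocations overlap in time. The following is a minimal sketch with invented timestamps. The function name `peak_concurrency` is hypothetical, and the model assumes that an instance that finishes at time t can take a new request at time t.

```python
def peak_concurrency(invocations):
    """Return the largest number of invocations running at the same moment.

    invocations is a list of (start, end) times in any consistent unit.
    """
    events = []
    for start, end in invocations:
        events.append((start, 1))   # an invocation begins
        events.append((end, -1))    # an invocation finishes
    # At equal timestamps, process finishes (-1) before starts (+1) so that a
    # freed instance can be reused by a request arriving at the same moment.
    events.sort(key=lambda e: (e[0], e[1]))
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


# Hypothetical (start, end) times in seconds.
print(peak_concurrency([(0, 5), (1, 3), (2, 6), (5, 7)]))
```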
As an analogy, you can think of concurrency as the total capacity of a restaurant for serving a certain number of diners at one time. If you have seats in the restaurant for 100 diners, only 100 people can sit at the same time. Anyone who comes while the restaurant is full must wait for a current diner to leave before a seat is available. If you use a reservation system, and a dinner party has called to reserve 20 seats, only 80 of those 100 seats are available for people without a reservation. Lambda functions also have a concurrency limit and a reservation system that can be used to set aside capacity for specific functions.
Unreserved concurrency
The amount of concurrency that is not allocated to any specific set of functions. The minimum is 100 unreserved concurrency. This allows functions that do not have any reserved concurrency to still be able to run. If you reserve all your concurrency for one or two functions, no concurrency is left for any other function. Having at least 100 available allows all your functions to run when they are invoked.
Reserved concurrency
Guarantees the maximum number of concurrent instances for the function. When a function has reserved concurrency, no other function can use that concurrency. No charge is incurred for configuring reserved concurrency for a function.
Provisioned concurrency
Initializes a requested number of runtime environments so that they are prepared to respond immediately to your function’s invocations. This option is used when you need high performance and low latency.
You pay for the amount of provisioned concurrency that you configure and for the period of time that you have it configured.
For example, you might want to increase provisioned concurrency when you are expecting a significant increase in traffic. To avoid paying for unnecessary warm environments, you scale back down when the event is over.
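The relationship between the account limit, reserved concurrency, and the unreserved pool is simple arithmetic. The sketch below models it under stated assumptions: the account limit of 1,000 and the function names are hypothetical, and the check enforces only the 100-unit minimum described above.

```python
MIN_UNRESERVED = 100  # minimum unreserved concurrency stated above


def unreserved_concurrency(account_limit, reserved):
    """Return the concurrency left for functions without a reservation.

    reserved maps a function name to its reserved concurrency.
    """
    remaining = account_limit - sum(reserved.values())
    if remaining < MIN_UNRESERVED:
        raise ValueError(
            f"only {remaining} unreserved concurrency would remain; "
            f"at least {MIN_UNRESERVED} is required"
        )
    return remaining


# Hypothetical account limit and reservations.
print(unreserved_concurrency(1000, {"checkout": 200, "reports": 50}))
```

As in the restaurant analogy, every reservation shrinks the pool that is shared by all other functions.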
Reasons for setting concurrency limits
Limit a function’s concurrency to achieve the following:
Regulate how long it takes you to process a batch of events
Match it with a downstream resource that cannot scale as quickly as Lambda
Reserve function concurrency to achieve the following:
Ensure that you can handle peak expected volume for a critical function
Address invocation errors
CloudWatch metrics for concurrency
When your function finishes processing an event, Lambda sends metrics about the invocation to Amazon CloudWatch. You can build graphs and dashboards with these metrics in the CloudWatch console. You can also set alarms to respond to changes in use, performance, or error rates.
CloudWatch includes two built-in metrics that help determine concurrency: ConcurrentExecutions and UnreservedConcurrentExecutions.
ConcurrentExecutions
Shows the sum of concurrent invocations for a given function at a given point in time. Provides historical data on how functions are performing.
You can view all functions in the account or only the functions that have a custom concurrency limit specified.
UnreservedConcurrentExecutions
Shows the sum of the concurrency for the functions that do not have a custom concurrency limit specified.