Delete All VCNs in OCI Using a Bash Script

The script below lists all the VCNs in the compartment identified by COMPARTMENT_OCID and deletes them together with the resources attached to them.

Note: I wrote this script to perform the tasks mentioned below; it can be updated and expanded based on your needs. Feel free to do that, and please credit the source.

Complete Resource Deletion Chain: The script now handles the proper order of deletion:

  • Compute instances first
  • Clean route tables and security lists
  • Load balancers
  • Gateways (NAT, Internet, Service, DRG attachments)
  • Subnets
  • Custom security lists, route tables, and DHCP options
  • Finally, the VCN itself

#!/bin/bash

# ✅ Set this to the target compartment OCID
COMPARTMENT_OCID="Set Your OCID Here"

# (Optional) Force region
export OCI_CLI_REGION=me-jeddah-1

echo "📍 Region: $OCI_CLI_REGION"
echo "📦 Compartment: $COMPARTMENT_OCID"
echo "⚠️  WARNING: This will delete ALL VCNs and related resources in the compartment!"
echo "Press Ctrl+C within 10 seconds to cancel..."
sleep 10

# Function to wait for resource deletion
wait_for_deletion() {
    local resource_id=$1
    local resource_type=$2
    local max_attempts=30
    local attempt=1

    # Map the resource type to the ID option expected by "oci network <type> get"
    # (most types use --<type>-id, but a few have shortened option names)
    local id_option="--${resource_type}-id"
    case $resource_type in
        "route-table") id_option="--rt-id" ;;
        "internet-gateway") id_option="--ig-id" ;;
        "dhcp-options") id_option="--dhcp-id" ;;
    esac

    echo "    ⏳ Waiting for $resource_type deletion..."
    while [ $attempt -le $max_attempts ]; do
        if ! oci network "$resource_type" get "$id_option" "$resource_id" &>/dev/null; then
            echo "    ✅ $resource_type deleted successfully"
            return 0
        fi
        sleep 10
        ((attempt++))
    done
    echo "    ⚠️  Timeout waiting for $resource_type deletion"
    return 1
}

# Function to check if resource is default
is_default_resource() {
    local resource_id=$1
    local resource_type=$2
    
    case $resource_type in
        "security-list")
            result=$(oci network security-list get --security-list-id "$resource_id" --query "data.\"display-name\"" --raw-output 2>/dev/null)
            [[ "$result" == "Default Security List"* ]]
            ;;
        "route-table")
            result=$(oci network route-table get --rt-id "$resource_id" --query "data.\"display-name\"" --raw-output 2>/dev/null)
            [[ "$result" == "Default Route Table"* ]]
            ;;
        "dhcp-options")
            result=$(oci network dhcp-options get --dhcp-id "$resource_id" --query "data.\"display-name\"" --raw-output 2>/dev/null)
            [[ "$result" == "Default DHCP Options"* ]]
            ;;
        *)
            false
            ;;
    esac
}

# Function to clean all route tables in a VCN
clean_all_route_tables() {
    local VCN_ID=$1
    echo "  🧹 Cleaning all route tables..."
    
    local RT_IDS=$(oci network route-table list \
        --compartment-id "$COMPARTMENT_OCID" \
        --vcn-id "$VCN_ID" \
        --query "data[?\"lifecycle-state\" == 'AVAILABLE'].id" \
        --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)
    
    for RT_ID in $RT_IDS; do
        if [ -n "$RT_ID" ]; then
            echo "    🔧 Clearing routes in route table: $RT_ID"
            oci network route-table update --rt-id "$RT_ID" --route-rules '[]' --force &>/dev/null || true
        fi
    done
    
    # Wait a bit for route updates to propagate
    sleep 5
}

# Function to clean all security lists in a VCN
clean_all_security_lists() {
    local VCN_ID=$1
    echo "  🧹 Cleaning all security lists..."
    
    local SL_IDS=$(oci network security-list list \
        --compartment-id "$COMPARTMENT_OCID" \
        --vcn-id "$VCN_ID" \
        --query "data[?\"lifecycle-state\" == 'AVAILABLE'].id" \
        --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)
    
    for SL_ID in $SL_IDS; do
        if [ -n "$SL_ID" ]; then
            echo "    🔧 Clearing rules in security list: $SL_ID"
            oci network security-list update \
                --security-list-id "$SL_ID" \
                --egress-security-rules '[]' \
                --ingress-security-rules '[]' \
                --force &>/dev/null || true
        fi
    done
    
    # Wait a bit for security list updates to propagate
    sleep 5
}

# Function to delete compute instances in subnets
delete_compute_instances() {
    local VCN_ID=$1
    echo "  🖥️  Checking for compute instances..."
    
    local INSTANCES=$(oci compute instance list \
        --compartment-id "$COMPARTMENT_OCID" \
        --query "data[?\"lifecycle-state\" != 'TERMINATED'].id" \
        --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)
    
    for INSTANCE_ID in $INSTANCES; do
        if [ -n "$INSTANCE_ID" ]; then
            # Determine the instance's VCN: the VNIC reports its subnet,
            # and the subnet carries the VCN OCID
            local SUBNET_ID=$(oci compute instance list-vnics \
                --instance-id "$INSTANCE_ID" \
                --query "data[0].\"subnet-id\"" \
                --raw-output 2>/dev/null)
            local INSTANCE_VCN=""
            if [ -n "$SUBNET_ID" ]; then
                INSTANCE_VCN=$(oci network subnet get \
                    --subnet-id "$SUBNET_ID" \
                    --query "data.\"vcn-id\"" \
                    --raw-output 2>/dev/null)
            fi

            if [[ "$INSTANCE_VCN" == "$VCN_ID" ]]; then
                echo "    🔻 Terminating compute instance: $INSTANCE_ID"
                oci compute instance terminate --instance-id "$INSTANCE_ID" --force &>/dev/null || true
            fi
        fi
    done
}

# Main cleanup function for a single VCN
cleanup_vcn() {
    local VCN_ID=$1
    echo -e "\n🧹 Cleaning resources for VCN: $VCN_ID"
    
    # Step 1: Delete compute instances first
    delete_compute_instances "$VCN_ID"
    
    # Step 2: Clean route tables and security lists
    clean_all_route_tables "$VCN_ID"
    clean_all_security_lists "$VCN_ID"
    
    # Step 3: Delete Load Balancers
    echo "  🔻 Deleting load balancers..."
    local LBS=$(oci lb load-balancer list \
        --compartment-id "$COMPARTMENT_OCID" \
        --query "data[?\"lifecycle-state\" == 'ACTIVE'].id" \
        --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)
    
    for LB_ID in $LBS; do
        if [ -n "$LB_ID" ]; then
            echo "    🔻 Deleting Load Balancer: $LB_ID"
            oci lb load-balancer delete --load-balancer-id "$LB_ID" --force &>/dev/null || true
        fi
    done
    
    # Step 4: Delete NAT Gateways
    echo "  🔻 Deleting NAT gateways..."
    local NAT_GWS=$(oci network nat-gateway list \
        --compartment-id "$COMPARTMENT_OCID" \
        --vcn-id "$VCN_ID" \
        --query "data[?\"lifecycle-state\" == 'AVAILABLE'].id" \
        --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)
    
    for NAT_ID in $NAT_GWS; do
        if [ -n "$NAT_ID" ]; then
            echo "    🔻 Deleting NAT Gateway: $NAT_ID"
            oci network nat-gateway delete --nat-gateway-id "$NAT_ID" --force &>/dev/null || true
        fi
    done
    
    # Step 5: Delete DRG Attachments
    echo "  🔻 Deleting DRG attachments..."
    local DRG_ATTACHMENTS=$(oci network drg-attachment list \
        --compartment-id "$COMPARTMENT_OCID" \
        --query "data[?\"vcn-id\" == '$VCN_ID' && \"lifecycle-state\" == 'ATTACHED'].id" \
        --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)
    
    for DRG_ATTACHMENT_ID in $DRG_ATTACHMENTS; do
        if [ -n "$DRG_ATTACHMENT_ID" ]; then
            echo "    🔻 Deleting DRG Attachment: $DRG_ATTACHMENT_ID"
            oci network drg-attachment delete --drg-attachment-id "$DRG_ATTACHMENT_ID" --force &>/dev/null || true
        fi
    done
    
    # Step 6: Delete Internet Gateways
    echo "  🔻 Deleting internet gateways..."
    local IGWS=$(oci network internet-gateway list \
        --compartment-id "$COMPARTMENT_OCID" \
        --vcn-id "$VCN_ID" \
        --query "data[?\"lifecycle-state\" == 'AVAILABLE'].id" \
        --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)
    
    for IGW_ID in $IGWS; do
        if [ -n "$IGW_ID" ]; then
            echo "    🔻 Deleting Internet Gateway: $IGW_ID"
            oci network internet-gateway delete --ig-id "$IGW_ID" --force &>/dev/null || true
        fi
    done
    
    # Step 7: Delete Service Gateways
    echo "  🔻 Deleting service gateways..."
    local SGWS=$(oci network service-gateway list \
        --compartment-id "$COMPARTMENT_OCID" \
        --vcn-id "$VCN_ID" \
        --query "data[?\"lifecycle-state\" == 'AVAILABLE'].id" \
        --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)
    
    for SGW_ID in $SGWS; do
        if [ -n "$SGW_ID" ]; then
            echo "    🔻 Deleting Service Gateway: $SGW_ID"
            oci network service-gateway delete --service-gateway-id "$SGW_ID" --force &>/dev/null || true
        fi
    done
    
    # Step 8: Wait for gateways to be deleted
    echo "  ⏳ Waiting for gateways to be deleted..."
    sleep 30
    
    # Step 9: Delete Subnets
    echo "  🔻 Deleting subnets..."
    local SUBNETS=$(oci network subnet list \
        --compartment-id "$COMPARTMENT_OCID" \
        --vcn-id "$VCN_ID" \
        --query "data[?\"lifecycle-state\" == 'AVAILABLE'].id" \
        --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)
    
    for SUBNET_ID in $SUBNETS; do
        if [ -n "$SUBNET_ID" ]; then
            echo "    🔻 Deleting Subnet: $SUBNET_ID"
            oci network subnet delete --subnet-id "$SUBNET_ID" --force &>/dev/null || true
        fi
    done
    
    # Step 10: Wait for subnets to be deleted
    echo "  ⏳ Waiting for subnets to be deleted..."
    sleep 30
    
    # Step 11: Delete non-default Security Lists
    echo "  🔻 Deleting custom security lists..."
    local SL_IDS=$(oci network security-list list \
        --compartment-id "$COMPARTMENT_OCID" \
        --vcn-id "$VCN_ID" \
        --query "data[?\"lifecycle-state\" == 'AVAILABLE'].id" \
        --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)
    
    for SL_ID in $SL_IDS; do
        if [ -n "$SL_ID" ] && ! is_default_resource "$SL_ID" "security-list"; then
            echo "    🔻 Deleting Security List: $SL_ID"
            oci network security-list delete --security-list-id "$SL_ID" --force &>/dev/null || true
        fi
    done
    
    # Step 12: Delete non-default Route Tables
    echo "  🔻 Deleting custom route tables..."
    local RT_IDS=$(oci network route-table list \
        --compartment-id "$COMPARTMENT_OCID" \
        --vcn-id "$VCN_ID" \
        --query "data[?\"lifecycle-state\" == 'AVAILABLE'].id" \
        --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)
    
    for RT_ID in $RT_IDS; do
        if [ -n "$RT_ID" ] && ! is_default_resource "$RT_ID" "route-table"; then
            echo "    🔻 Deleting Route Table: $RT_ID"
            oci network route-table delete --rt-id "$RT_ID" --force &>/dev/null || true
        fi
    done
    
    # Step 13: Delete non-default DHCP Options
    echo "  🔻 Deleting custom DHCP options..."
    local DHCP_IDS=$(oci network dhcp-options list \
        --compartment-id "$COMPARTMENT_OCID" \
        --vcn-id "$VCN_ID" \
        --query "data[?\"lifecycle-state\" == 'AVAILABLE'].id" \
        --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)
    
    for DHCP_ID in $DHCP_IDS; do
        if [ -n "$DHCP_ID" ] && ! is_default_resource "$DHCP_ID" "dhcp-options"; then
            echo "    🔻 Deleting DHCP Options: $DHCP_ID"
            oci network dhcp-options delete --dhcp-id "$DHCP_ID" --force &>/dev/null || true
        fi
    done
    
    # Step 14: Wait before attempting VCN deletion
    echo "  ⏳ Waiting for all resources to be cleaned up..."
    sleep 60
    
    # Step 15: Finally, delete the VCN
    echo "  🔻 Deleting VCN: $VCN_ID"
    local max_attempts=5
    local attempt=1
    
    while [ $attempt -le $max_attempts ]; do
        if oci network vcn delete --vcn-id "$VCN_ID" --force &>/dev/null; then
            echo "    ✅ VCN deletion initiated successfully"
            break
        else
            echo "    ⚠️  VCN deletion attempt $attempt failed, retrying in 30 seconds..."
            sleep 30
            ((attempt++))
        fi
    done
    
    if [ $attempt -gt $max_attempts ]; then
        echo "    ❌ Failed to delete VCN after $max_attempts attempts"
        echo "    💡 You may need to manually check for remaining dependencies"
    fi
}

# Main execution
echo -e "\n🚀 Starting VCN cleanup process..."

# Fetch all VCNs in the compartment
VCN_IDS=$(oci network vcn list \
    --compartment-id "$COMPARTMENT_OCID" \
    --query "data[?\"lifecycle-state\" == 'AVAILABLE'].id" \
    --raw-output 2>/dev/null | jq -r '.[]' 2>/dev/null)

if [ -z "$VCN_IDS" ]; then
    echo "📭 No VCNs found in compartment $COMPARTMENT_OCID"
    exit 0
fi

echo "📋 Found VCNs to delete:"
for VCN_ID in $VCN_IDS; do
    VCN_NAME=$(oci network vcn get --vcn-id "$VCN_ID" --query "data.\"display-name\"" --raw-output 2>/dev/null)
    echo "  - $VCN_NAME ($VCN_ID)"
done

# Process each VCN
for VCN_ID in $VCN_IDS; do
    if [ -n "$VCN_ID" ]; then
        cleanup_vcn "$VCN_ID"
    fi
done

echo -e "\n✅ Cleanup complete for compartment: $COMPARTMENT_OCID"
echo "🔍 You may want to verify in the OCI Console that all resources have been deleted."

Output example

Regards

Automating Block Volume Backups in Oracle Cloud Infrastructure (OCI) using CLI and Terraform

Block volumes are a core building block for storing data in OCI, and automated backups are essential for protecting that data. This post covers two methods of automating block volume backups: the OCI CLI and Terraform.

Automating Block Volume Backups using OCI CLI

Prerequisites:

  • Set up and configure the OCI CLI on your machine.
  • Ensure that you have the right permissions to manage block volumes.
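
Before running the commands below, it helps to confirm the CLI is configured and to look up the availability domain the volume will be created in. A minimal check (the compartment OCID is a placeholder):

# Verify the CLI is working and list availability domains for the volume
oci iam availability-domain list \
    --compartment-id <your_compartment_ocid> \
    --query 'data[].name' --output table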

Step-by-step guide:

  • Command to create a block volume
oci bv volume create --compartment-id <your_compartment_ocid> --availability-domain <your_ad> --display-name "MyVolume" --size-in-gbs 50

Command to take a backup of the block volume:

oci bv backup create --volume-id <your_volume_ocid> --display-name "MyVolumeBackup"

You can schedule the backup command with a cron job for automation.

  • Example cron job configuration (daily at 02:00):
0 2 * * * /usr/local/bin/oci bv backup create --volume-id <your_volume_ocid> --display-name "ScheduledBackup" >> /var/log/oci_backup.log 2>&1
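
To confirm that the scheduled backups are actually being created, you can list the backups for the volume. A quick check, using the same placeholder OCIDs as above:

# List backups of the volume to verify the cron job is producing them
oci bv backup list \
    --compartment-id <your_compartment_ocid> \
    --volume-id <your_volume_ocid> \
    --query 'data[]."display-name"' --output table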

Automating Block Volume Backups using Terraform

Prerequisites

  1. OCI Credentials: Make sure you have the proper API keys and permissions configured in your OCI tenancy.
  2. Terraform Setup: Terraform should be installed and configured to interact with OCI, including the OCI provider setup in your environment.

Step 1: Define the OCI Block Volume Resource

First, define the block volume that you want to automate backups for. Here’s an example of a simple block volume resource in Terraform:

resource "oci_core_volume" "my_block_volume" {
  availability_domain = "your-availability-domain"
  compartment_id      = "ocid1.compartment.oc1..your-compartment-id"
  display_name        = "my_block_volume"
  size_in_gbs         = 50
}

Step 2: Define a Backup Policy

OCI provides predefined backup policies such as gold, silver, and bronze, which define how frequently backups are taken. You can create a custom backup policy as well, but for simplicity, we’ll use one of the predefined policies in this example. The Terraform resource oci_core_volume_backup_policy_assignment will assign a backup policy to the block volume.

Here’s an example to assign the gold backup policy to the block volume:

resource "oci_core_volume_backup_policy_assignment" "backup_assignment" {
  asset_id  = oci_core_volume.my_block_volume.id
  policy_id = data.oci_core_volume_backup_policies.gold.volume_backup_policies[0].id
}

data "oci_core_volume_backup_policies" "gold" {
  # With no compartment_id, the Oracle-defined policies (gold, silver, bronze) are returned
  filter {
    name   = "display_name"
    values = ["gold"]
  }
}

Step 3: Custom Backup Policy (Optional)

If you need a custom backup policy rather than using the predefined gold, silver, or bronze policies, you can define a custom backup policy using OCI’s native scheduling.

You can create a custom schedule by combining these elements in your oci_core_volume_backup_policy resource.

resource "oci_core_volume_backup_policy" "custom_backup_policy" {
  compartment_id = "ocid1.compartment.oc1..your-compartment-id"
  display_name   = "CustomBackupPolicy"

  schedules {
    backup_type       = "INCREMENTAL"
    period            = "ONE_DAY"
    retention_seconds = 2592000 # 30 days
  }

  schedules {
    backup_type       = "FULL"
    period            = "ONE_WEEK"
    retention_seconds = 7776000 # 90 days
  }
}

You can then assign this policy to the block volume using the same method as earlier.

Step 4: Apply the Terraform Configuration

Once your Terraform configuration is ready, apply it using the standard Terraform workflow:

  1. Initialize Terraform:
terraform init

  2. Plan the Terraform deployment:
terraform plan

  3. Apply the Terraform plan:
terraform apply

This process will automatically provision your block volumes and assign the specified backup policy.



Regards
Osama

AWS Data migration tools

AWS offers a wide variety of services and Partner tools to help you migrate your data sets, whether they are files, databases, machine images, block volumes, or even tape backups.

AWS Storage Gateway

AWS Storage Gateway is a service that gives your applications seamless and secure integration between on-premises environments and AWS storage.

It provides low-latency access to cloud data through a Storage Gateway appliance deployed on premises.

Storage Gateway types

Choose a Storage Gateway type that is the best fit for your workload.

  • Amazon S3 File Gateway
  • Amazon FSx File Gateway
  • Tape Gateway
  • Volume Gateway

The Storage Gateway Appliance supports the following protocols to connect to your local data:

  • NFS or SMB for files
  • iSCSI for volumes
  • iSCSI VTL for tapes

Your storage gateway appliance runs in one of four modes: Amazon S3 File Gateway, Amazon FSx File Gateway, Tape Gateway, or Volume Gateway.

Data moved to AWS using Storage Gateway can be sent to the following destinations through the Storage Gateway managed service:

  • Amazon S3 (Amazon S3 File Gateway, Tape Gateway)
  • Amazon S3 Glacier (Amazon S3 File Gateway, Tape Gateway)
  • Amazon FSx for Windows File Server (Amazon FSx File Gateway)
  • Amazon EBS (Volume Gateway)

AWS DataSync

Manual tasks related to data transfers can slow down migrations and burden IT operations. DataSync facilitates moving large amounts of data between on-premises storage and Amazon S3, Amazon EFS, or Amazon FSx for Windows File Server. By default, data is encrypted in transit using Transport Layer Security (TLS) 1.2. DataSync automatically handles scripting copy jobs, scheduling and monitoring transfers, validating data, and optimizing network usage.

You can also reduce on-premises storage infrastructure by shifting SMB-based data stores and content repositories from file servers and NAS arrays to Amazon S3 and Amazon EFS for analytics.

DataSync deploys as a single software agent that can connect to multiple shared file systems and run multiple tasks. The software agent is typically deployed on premises through a virtual machine to handle the transfer of data over the wide area network (WAN) to AWS. On the AWS side, the agent connects to the DataSync service infrastructure. Because DataSync is a service, there is no infrastructure for customers to set up or maintain in the cloud. DataSync configuration is managed directly from the console.
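
As a rough sketch of that workflow with the AWS CLI, the commands below create an NFS source location, an S3 destination location, and a task, then start an execution. The agent, role, location, and task ARNs are placeholders for resources you would create first:

# Source: an on-premises NFS export reachable from the DataSync agent
aws datasync create-location-nfs \
    --server-hostname nfs.example.internal \
    --subdirectory /exports/data \
    --on-prem-config AgentArns=<agent-arn>

# Destination: an S3 bucket, accessed through an IAM role created for DataSync
aws datasync create-location-s3 \
    --s3-bucket-arn arn:aws:s3:::my-datasync-bucket \
    --s3-config BucketAccessRoleArn=<bucket-access-role-arn>

# Create the transfer task and run it
aws datasync create-task \
    --source-location-arn <nfs-location-arn> \
    --destination-location-arn <s3-location-arn>
aws datasync start-task-execution --task-arn <task-arn>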

AWS Snow Family service models

The AWS Snow Family helps customers that need to run operations in austere, non-data center environments and in locations that lack consistent network connectivity. The AWS Snow Family, composed of AWS Snowcone, AWS Snowball, and AWS Snowmobile, offers several physical devices and capacity points.

You can check my blog post about the Snow Family models here: https://osamaoracle.com/2023/01/28/aws-snow-family-members/

Regards

Osama

AWS database services part 2

Part one https://osamaoracle.com/2023/01/03/aws-database-services/

Amazon RDS

Amazon RDS is a web service that makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks. This allows you to focus on your applications and business. Amazon RDS gives you access to the full capabilities of the MySQL, Oracle, SQL Server, and Aurora database engines. This means that the code, applications, and tools you already use today with your existing databases can be used with Amazon RDS.

Amazon RDS automatically patches the database software and backs up your database. It stores the backups for a user-defined retention period and provides point-in-time recovery. You benefit from the flexibility of scaling the compute resources or storage capacity associated with your relational DB instance with a single API call.

Amazon RDS is available on six database engines, which optimize for memory, performance, or I/O. The database engines include:

  • Amazon Aurora
  • PostgreSQL
  • MySQL
  • MariaDB
  • Oracle Database
  • SQL Server

Amazon RDS Multi-AZ deployments

Amazon RDS Multi-AZ deployments provide enhanced availability and durability for DB instances, making them a natural fit for production database workloads. When you provision a Multi-AZ DB instance, Amazon RDS synchronously replicates the data to a standby instance in a different Availability Zone. 

You can modify your environment from Single-AZ to Multi-AZ at any time. Each Availability Zone runs on its own physically distinct, independent infrastructure and is engineered to be highly reliable. Upon failure, the secondary instance picks up the load. Note that this is not used for read-only scenarios.
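
For example, an existing Single-AZ DB instance can be converted to Multi-AZ with a single CLI call. A minimal sketch (the instance identifier is a placeholder):

# Convert an existing DB instance from Single-AZ to Multi-AZ
aws rds modify-db-instance \
    --db-instance-identifier mydb \
    --multi-az \
    --apply-immediately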

Read replicas

With Amazon RDS, you can create read replicas of your database. Amazon RDS automatically keeps them in sync with the primary DB instance. Read replicas are available in Amazon RDS for Aurora, MySQL, MariaDB, PostgreSQL, Oracle, and Microsoft SQL Server. Read replicas can help you:

  • Relieve pressure on your primary node with additional read capacity.
  • Bring data close to your applications in different AWS Regions.
  • Promote a read replica to a standalone instance as a disaster recovery (DR) solution if the primary DB instance fails.

You can add read replicas to handle read workloads so your primary database doesn’t become overloaded with read requests. Depending on the database engine, you can also place your read replica in a different Region from your primary database. This gives you the ability to have a read replica closer to a particular locality.

You can configure a source database as Multi-AZ for high availability and create a read replica (in Single-AZ) for read scalability. With RDS for MySQL and MariaDB, you can also set the read replica as Multi-AZ, and as a DR target. When you promote the read replica to be a standalone database, it will be replicated to multiple Availability Zones.
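
As an illustration, a read replica can be created from an existing source instance, optionally in another Region. A sketch with placeholder identifiers (an encrypted source has additional requirements):

# Create a cross-Region read replica of an existing RDS instance
aws rds create-db-instance-read-replica \
    --db-instance-identifier mydb-replica \
    --source-db-instance-identifier arn:aws:rds:us-east-1:111122223333:db:mydb \
    --region eu-west-1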

Amazon DynamoDB tables

DynamoDB is a fully managed NoSQL database service. DynamoDB uses primary keys to uniquely identify each item in a table and secondary indexes to provide more querying flexibility. When creating a table, you must specify a table name and a primary key. These are the only two required entities.

There are two types of primary keys supported:

  • Simple primary key: A simple primary key is composed of just one attribute designated as the partition key. If you use only a partition key, no two items can have the same partition key value.
  • Composite primary key: A composite primary key is composed of both a partition key and a sort key. In this case, the partition key value for multiple items can be the same, but their sort key values must be different (see the sketch below).
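
A minimal sketch of creating a table with a composite primary key; the table and attribute names are made up for illustration:

# Create a table keyed by Artist (partition key) and SongTitle (sort key)
aws dynamodb create-table \
    --table-name Music \
    --attribute-definitions AttributeName=Artist,AttributeType=S AttributeName=SongTitle,AttributeType=S \
    --key-schema AttributeName=Artist,KeyType=HASH AttributeName=SongTitle,KeyType=RANGE \
    --billing-mode PAY_PER_REQUEST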

You work with the core components: tables, items, and attributes. A table is a collection of items, and each item is a collection of attributes. For example, a table might include two items with the primary keys Nikki Wolf and John Stiles. The item with the primary key Nikki Wolf includes three attributes: Role, Year, and Genre. The John Stiles item includes a Height attribute and does not include the Genre attribute.

Amazon DynamoDB consistency options

When your application writes data to a DynamoDB table and receives an HTTP 200 response (OK), the write has occurred and is durable. The data is eventually consistent across all storage locations, usually within one second or less. DynamoDB supports eventually consistent and strongly consistent reads.

DynamoDB uses eventually consistent reads, unless you specify otherwise. Read operations (such as GetItem, Query, and Scan) provide a ConsistentRead parameter. If you set this parameter to true, DynamoDB uses strongly consistent reads during the operation.
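
For instance, a GetItem call can request a strongly consistent read by adding that flag; the table and key below reuse the hypothetical Music table from the earlier sketch:

# Read a single item with strong consistency
aws dynamodb get-item \
    --table-name Music \
    --key '{"Artist": {"S": "Nikki Wolf"}, "SongTitle": {"S": "Example Song"}}' \
    --consistent-read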

EVENTUALLY CONSISTENT READS

When you read data from a DynamoDB table, the response might not reflect the results of a recently completed write operation. The response might include some stale data. If you repeat your read request after a short time, the response should return the latest data.

STRONGLY CONSISTENT READS

When you request a strongly consistent read, DynamoDB returns a response with the most up-to-date data, reflecting the updates from all prior write operations that were successful. A strongly consistent read might not be available if there is a network delay or outage.

Amazon DynamoDB global tables

A global table is a collection of one or more DynamoDB tables, all owned by a single AWS account, identified as replica tables. A replica table (or replica, for short) is a single DynamoDB table that functions as part of a global table. Each replica stores the same set of data items. Any given global table can only have one replica table per Region, and every replica has the same table name and the same primary key schema.

DynamoDB global tables provide a fully managed solution for deploying a multi-Region, multi-active database, without having to build and maintain your own replication solution. When you create a global table, you specify the AWS Regions where you want the table to be available. DynamoDB performs all the necessary tasks to create identical tables in these Regions and propagate ongoing data changes to all of them.
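
As a sketch, adding a replica Region to an existing table (global tables version 2019.11.21) can look like the following; it assumes the table already meets the prerequisites, such as having DynamoDB Streams enabled with new and old images:

# Add a replica of the hypothetical Music table in eu-west-1
aws dynamodb update-table \
    --table-name Music \
    --replica-updates '[{"Create": {"RegionName": "eu-west-1"}}]'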

Database Caching

Without caching, EC2 instances read and write directly to the database. With caching, instances first attempt to read from a cache, which uses high-performance memory. They use a cache cluster that contains a set of cache nodes distributed between subnets. Resources within those subnets have high-speed access to those nodes.

Common caching strategies

There are multiple strategies for keeping information in the cache in sync with the database. Two common caching strategies include lazy loading and write-through.

Lazy loading

In lazy loading, updates are made to the database without updating the cache. In the case of a cache miss, the information retrieved from the database can be subsequently written to the cache. Lazy loading ensures that the data loaded in the cache is data needed by the application but can result in high cache-miss-to-cache-hit ratios in some use cases.
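
A rough shell sketch of the lazy-loading pattern against a Redis cache; the key name, TTL, and the query_database helper are placeholders for your own logic:

# Lazy loading: read from the cache first, fall back to the database on a miss
VALUE=$(redis-cli GET user:42:profile)
if [ -z "$VALUE" ]; then
    VALUE=$(query_database user 42)                  # hypothetical database read
    redis-cli SET user:42:profile "$VALUE" EX 300    # cache the result for 5 minutes
fi
echo "$VALUE"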

Write-through

An alternative strategy is to write through to the cache every time the database is accessed. This approach results in fewer cache misses. This improves performance but requires additional storage for data, which may not be needed by the applications.

Managing your cache

As your application writes to the cache, you need to consider cache validity and make sure that the data written to the cache is accurate. You also need to develop a strategy for managing cache memory. When your cache is full, you determine which items should be deleted by setting an eviction policy.

CACHE VALIDITY

Lazy loading allows for stale data but doesn’t fail with empty nodes. Write-through ensures that data is always fresh but can fail with empty nodes and can populate the cache with superfluous data. By adding a time to live (TTL) value to each write to the cache, you can ensure fresh data without cluttering up the cache with extra data. 

TTL is an integer value that specifies the number of seconds or milliseconds until the key expires. When an application attempts to read an expired key, it is treated as though the data is not found in the cache, meaning that the database is queried and the cache is updated. This keeps data from getting too stale and requires that values in the cache are occasionally refreshed from the database.

MANAGING MEMORY

When cache memory is full, the cache engine removes data from memory to make space for new data. It chooses this data based on the eviction policy you set. An eviction policy evaluates the following characteristics of your data:

  • Which items were accessed least recently?
  • Which items have been accessed least frequently?
  • Which items have a TTL set, and what is that TTL value?

Amazon ElastiCache

Amazon ElastiCache is a web service that makes it easy to set up, manage, and scale a distributed in-memory data store or cache environment in the cloud. When you’re using a cache for a backend data store, a side-cache is perhaps the most commonly known approach. Redis and Memcached are general-purpose caches that are decoupled from the underlying data store.

Use ElastiCache for Memcached for data-intensive apps. The service works as an in-memory data store and cache to support the most demanding applications requiring sub-millisecond response times. It is fully managed, scalable, and secure—making it an ideal candidate for cases where frequently accessed data must be in memory. The service is a popular choice for web, mobile apps, gaming, ad tech, and e-commerce. 

ElastiCache for Redis is an in-memory data store that provides sub-millisecond latency at internet scale. It can power the most demanding real-time applications in gaming, ad tech, e-commerce, healthcare, financial services, and IoT. 
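
A minimal sketch of provisioning a small, single-node Redis cluster to use as a side-cache; the cluster ID and node type are placeholders:

# Launch a single-node Redis cluster
aws elasticache create-cache-cluster \
    --cache-cluster-id my-redis-cache \
    --engine redis \
    --cache-node-type cache.t3.micro \
    --num-cache-nodes 1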

ElastiCache engines

Feature | ElastiCache for Memcached | ElastiCache for Redis
Simple cache to offload database burden | Yes | Yes
Ability to scale horizontally for writes and storage | Yes | Yes (if cluster mode is enabled)
Multi-threaded performance | Yes | —
Advanced data types | — | Yes
Sorting and ranking data sets | — | Yes
Pub and sub capability | — | Yes
Multi-AZ with Auto Failover | — | Yes
Backup and restore | — | Yes

Amazon DynamoDB Accelerator

DynamoDB is designed for scale and performance. In most cases, the DynamoDB response times can be measured in single-digit milliseconds. However, there are certain use cases that require response times in microseconds. For those use cases, DynamoDB Accelerator (DAX) delivers fast response times for accessing eventually consistent data.

DAX is an Amazon DynamoDB compatible caching service that provides fast in-memory performance for demanding applications.

AWS Database Migration Service

AWS Database Migration Service (AWS DMS) supports migration between the most widely used databases like Oracle, PostgreSQL, SQL Server, Amazon Redshift, Aurora, MariaDB, and MySQL. AWS DMS supports both homogeneous (same engine) and heterogeneous (different engines) migrations.

  • The service can be used to migrate between databases on Amazon EC2, Amazon RDS, and on-premises. Either the source or the target database must be located in AWS. It cannot be used to migrate between two on-premises databases.
  • AWS DMS automatically handles formatting of the source data for consumption by the target database. It does not perform schema or code conversion.
  • For homogenous migrations, you can use native tools to perform these conversions. For heterogeneous migrations, you can use the AWS Schema Conversion Tool (AWS SCT).
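
As a rough sketch of how the pieces fit together with the CLI, a migration ties a replication instance and two endpoints into a task; the ARNs and the table-mappings file below are placeholders for resources you would create first:

# Create a replication task once the replication instance and endpoints exist
aws dms create-replication-task \
    --replication-task-identifier mydb-migration \
    --source-endpoint-arn <source-endpoint-arn> \
    --target-endpoint-arn <target-endpoint-arn> \
    --replication-instance-arn <replication-instance-arn> \
    --migration-type full-load-and-cdc \
    --table-mappings file://table-mappings.json

# Start the task (full load followed by ongoing change data capture)
aws dms start-replication-task \
    --replication-task-arn <replication-task-arn> \
    --start-replication-task-type start-replication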

AWS Schema Conversion Tool

The AWS Schema Conversion Tool (AWS SCT) automatically converts the source database schema and a majority of the database code objects. The conversion includes views, stored procedures, and functions. They are converted to a format that is compatible with the target database. Any objects that cannot be automatically converted are marked so that they can be manually converted to complete the migration.

Source databases:

  • Oracle Database
  • Oracle Data Warehouse
  • Azure SQL
  • SQL Server
  • Teradata
  • IBM Netezza
  • Greenplum
  • HPE Vertica
  • MySQL and MariaDB
  • PostgreSQL
  • Aurora
  • IBM DB2 LUW
  • Apache Cassandra
  • SAP ASE

Target databases on AWS (after conversion with AWS SCT):

  • MySQL
  • PostgreSQL
  • Oracle
  • Amazon DynamoDB
  • RDS for MySQL
  • Aurora MySQL
  • RDS for PostgreSQL
  • Aurora PostgreSQL
The AWS SCT can also scan your application source code for embedded SQL statements and convert them as part of a database schema conversion project. During this process, the AWS SCT performs cloud-native code optimization by converting legacy Oracle and SQL Server functions to their equivalent AWS services, modernizing the applications at the same time as the migration.

Regards

Osama

AWS Step Functions

It’s common for modern cloud applications to be composed of many services and components. As applications grow, an increasing amount of code needs to be written to coordinate the interaction of all components. With AWS Step Functions, you can focus on defining the component interactions, rather than writing all the software to make the interactions work.

AWS Step Functions integrates with the AWS services listed below. You can directly call API actions from the Amazon States Language in AWS Step Functions and pass parameters to the APIs of these services:

  • Compute services (AWS Lambda, Amazon ECS, Amazon EKS, and AWS Fargate)
  • Database services (Amazon DynamoDB)
  • Messaging services (Amazon SNS and Amazon SQS)
  • Data processing and analytics services (Amazon Athena, AWS Batch, AWS Glue, Amazon EMR, and AWS Glue DataBrew)
  • Machine learning services (Amazon SageMaker)
  • APIs created by API Gateway

You can configure your AWS Step Functions workflow to call other AWS services using AWS Step Functions service tasks. 

Step Functions: State machine

A state machine is an object that has a set number of operating conditions that depend on its previous condition to determine output.

A common example of a state machine is a soda vending machine. The machine starts in the operating state (waiting for a transaction), and then moves to soda selection when money is added. After that, it enters a vending state, where the soda is dispensed to the customer. After completion, the machine returns to the operating state.

Build workflows using state types

States are elements in your state machine. A state is referred to by its name, which can be any string, but must be unique within the scope of the entire state machine.

States can perform a variety of functions in your state machine (a short example follows the list below):

  • Do some work in your state machine (a Task state)
  • Make a choice between different branches to run (a Choice state)
  • Stop with a failure or success (a Fail or Succeed state)
  • Pass its input to its output or inject some fixed data (a Pass state)
  • Provide a delay for a certain amount of time or until a specified time or date (a Wait state)
  • Begin parallel branches (a Parallel state)
  • Dynamically iterate steps (a Map state)
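
Putting a few of these together, a minimal state machine with a Task state followed by a Succeed state could be created from the CLI as sketched below; the Lambda function ARN and the execution role ARN are placeholders:

# Create a two-state machine: a Task state that invokes a Lambda function, then Succeed
aws stepfunctions create-state-machine \
    --name order-processor \
    --role-arn <stepfunctions-execution-role-arn> \
    --definition '{
        "StartAt": "ProcessOrder",
        "States": {
            "ProcessOrder": {
                "Type": "Task",
                "Resource": "<lambda-function-arn>",
                "Next": "Done"
            },
            "Done": { "Type": "Succeed" }
        }
    }'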

Orchestration of complex distributed workflows

Express Workflows are ideal for high-volume, event-processing workloads such as IoT data ingestion, streaming data processing and transformation, and mobile application backends. They can run for up to 5 minutes. Express Workflows employ an at-least-once model, meaning your code might run more than once. This makes them ideal for orchestrating idempotent actions, such as transforming input data and storing it with PUT requests in DynamoDB. Express Workflow executions are billed by the number of executions, the duration of execution, and the memory consumed.

Regards

Osama

Amazon Kinesis

Amazon Kinesis for data collection and analysis

With Amazon Kinesis, you:

  • Collect, process, and analyze data streams in real time. Kinesis has the capacity to process streaming data at any scale. It provides you the flexibility to choose the tools that best suit the requirements of your application in a cost-effective way.
  • Ingest real-time data such as video, audio, application logs, website clickstreams, and Internet of Things (IoT) telemetry data. The ingested data can be used for machine learning, analytics, and other applications.
  • Process and analyze data as it arrives, and respond instantly. You don’t have to wait until all data is collected before the processing begins.

Amazon Kinesis Data Streams

To get started using Amazon Kinesis Data Streams, create a stream and specify the number of shards. Each shard is a unit of read and write capacity: a shard supports writes of up to 1 MB of data per second and reads of up to 2 MB per second. The total capacity of a stream is the sum of the capacities of its shards. Increase or decrease the number of shards in a stream as needed. Data being written is in the form of a record, which can be up to 1 MB in size.
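
For example, creating a small stream and writing a single record might look like this; the stream name and payload are illustrative:

# Create a stream with two shards, then write one record to it
aws kinesis create-stream --stream-name clickstream --shard-count 2

aws kinesis put-record \
    --stream-name clickstream \
    --partition-key user-42 \
    --cli-binary-format raw-in-base64-out \
    --data '{"page": "/home", "ts": "2024-01-01T00:00:00Z"}'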

  • Producers write data into the stream. A producer might be an Amazon EC2 instance, a mobile client, an on-premises server, or an IoT device.
  • Consumers receive the streaming data that the producers generate. A consumer might be an application running on an EC2 instance or AWS Lambda. If it’s on an Amazon EC2 instance, the application will need to scale as the amount of streaming data increases. If this is the case, run it in an Auto Scaling group. 
  • Each consumer reads from a particular shard. There might be more than one application processing the same data. 
  • Another way to write a consumer application is to use AWS Lambda, which lets you run code without having to provision or manage servers. 
  • The results of the consumer applications can be stored by AWS services such as Amazon S3, Amazon DynamoDB, and Amazon Redshift.

Amazon Kinesis Data Firehose

Amazon Kinesis Data Firehose starts to process data in near-real time. Kinesis Data Firehose can send records to Amazon S3, Amazon Redshift, Amazon Elasticsearch Service (ES), and any HTTP endpoint owned by you. It can also send records to any of your third-party service providers, including Datadog, New Relic, and Splunk.

Regards

Osama

SQS vs. SNS

Loose coupling with Amazon Simple Queue Service

Amazon Simple Queue Service (Amazon SQS) is a fully managed message queuing service that you can use to decouple and scale microservices, distributed systems, and serverless applications. The service works on a massive scale, processing billions of messages per day. It stores all message queues and messages within a single, highly available AWS Region with multiple redundant Availability Zones. This ensures that no single computer, network, or Availability Zone failure can make messages inaccessible. Messages can be sent and read simultaneously.

A loosely coupled workload involves processing a large number of smaller jobs. The loss of one node or job in a loosely coupled workload usually doesn’t delay the entire calculation. The lost work can be picked up later or omitted altogether.

With Amazon SQS, you can decouple pre-processing steps from compute steps and post-processing steps. Building applications from individual components that perform discrete functions improves scalability and reliability. Decoupling components is a best practice for designing modern applications. Amazon SQS frequently lies at the heart of cloud-native loosely coupled solutions.

SQS queue types

Amazon SQS offers two types of message queues:


STANDARD QUEUES

Standard queues support at-least-once message delivery and provide best-effort ordering. Messages are generally delivered in the same order in which they are sent. However, because of the highly distributed architecture, more than one copy of a message might be delivered out of order. Standard queues can handle a nearly unlimited number of API calls per second. You can use standard message queues if your application can process messages that arrive more than once and out of order.

FIFO QUEUES

FIFO (First-In-First-Out) queues are designed to enhance messaging between applications when the order of operations and events is critical or when duplicates can’t be tolerated. FIFO queues provide exactly-once processing but support a limited number of API calls per second.

Optimizing your Amazon SQS queue configurations

When creating an Amazon SQS queue, you need to consider how your application interacts with the queue. This information will help you optimize the configuration of your queue to control costs and increase performance.

TUNE YOUR VISIBILITY TIMEOUT

When a consumer receives an SQS message, that message remains in the queue until the consumer deletes it. You can configure the SQS queue’s visibility timeout setting to make that message invisible to other consumers for a period of time. This helps to prevent another consumer from processing the same message. The default visibility timeout is 30 seconds. The consumer deletes the message once it completes processing the message. If the consumer fails to delete the message before the visibility timeout expires, it becomes visible to other consumers and can be processed again. 

Typically, you should set the visibility timeout to the maximum time that it takes your application to process and delete a message from the queue. Setting too short of a timeout increases the possibility of your application processing a message twice. Too long of a visibility timeout delays subsequent attempts at processing a message.
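
For instance, raising the visibility timeout on an existing queue to two minutes is a one-line change; the queue URL is a placeholder:

# Give consumers up to 120 seconds to process and delete each message
aws sqs set-queue-attributes \
    --queue-url https://sqs.us-east-1.amazonaws.com/111122223333/my-queue \
    --attributes VisibilityTimeout=120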

CHOOSE THE RIGHT POLLING TYPE

You can configure an Amazon SQS queue to use either short polling or long polling. Queues with short polling:

  • Send a response to the consumer immediately after receiving a request, providing a faster response.
  • Increase the number of responses, and therefore costs.

SQS queues with long polling:

  • Do not return a response until at least one message arrives or the poll times out.
  • Return responses less frequently, which decreases costs.

Depending on the frequency of messages arriving in your queue, many of the responses from a queue using short polling could just be reporting an empty queue. Unless your application requires an immediate response to its poll requests, long polling is the preferable option.
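
A quick sketch of enabling long polling, both as the queue default and on an individual receive call; the queue URL is a placeholder:

# Make long polling the default for the queue (wait up to 20 seconds per poll)
aws sqs set-queue-attributes \
    --queue-url https://sqs.us-east-1.amazonaws.com/111122223333/my-queue \
    --attributes ReceiveMessageWaitTimeSeconds=20

# Or request long polling for a single receive call
aws sqs receive-message \
    --queue-url https://sqs.us-east-1.amazonaws.com/111122223333/my-queue \
    --wait-time-seconds 20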

Amazon SNS

Amazon SNS is a web service that makes it easy to set up, operate, and send notifications from the cloud. The service follows the publish-subscribe (pub-sub) messaging paradigm, with notifications being delivered to clients using a push mechanism.

Amazon SNS publisher to multiple SQS queues

Using highly available services, such as Amazon SNS, to perform basic message routing is an effective way of distributing messages to microservices. The two main forms of communication between microservices are request-response and observer. For example, an observer pattern can be used to fan out orders to two different SQS queues based on the order type.

To deliver Amazon SNS notifications to an SQS queue, you subscribe to a topic specifying Amazon SQS as the transport and a valid SQS queue as the endpoint. To permit the SQS queue to receive notifications from Amazon SNS, the SQS queue owner must subscribe the SQS queue to the topic for Amazon SNS. If the user owns the Amazon SNS topic being subscribed to and the SQS queue receiving the notifications, nothing else is required. Any message published to the topic will automatically be delivered to the specified SQS queue. If the owner of the SQS queue is not the owner of the topic, Amazon SNS requires an explicit confirmation to the subscription request.
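
A minimal sketch of that wiring; the topic and queue ARNs are placeholders, and the queue's access policy must also allow the topic to send messages to it:

# Subscribe an SQS queue to an SNS topic so published messages fan out to it
aws sns subscribe \
    --topic-arn arn:aws:sns:us-east-1:111122223333:orders \
    --protocol sqs \
    --notification-endpoint arn:aws:sqs:us-east-1:111122223333:standard-orders

# Publish a message to the topic; it is delivered to every subscribed queue
aws sns publish \
    --topic-arn arn:aws:sns:us-east-1:111122223333:orders \
    --message '{"orderId": "1234", "type": "standard"}'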

Amazon SNS and Amazon SQS

Feature | Amazon SNS | Amazon SQS
Message persistence | No | Yes
Delivery mechanism | Push (passive) | Poll (active)
Producer and consumer | Publisher and subscriber | Sender and receiver
Distribution model | One to many | One to one

Regards

Osama

AWS API Gateway

With API Gateway, you can create, publish, maintain, monitor, and secure APIs.

With API Gateway, you can connect your applications to AWS services and other public or private websites. It provides consistent RESTful and HTTP APIs for mobile and web applications to access AWS services and other resources hosted outside of AWS.

As a gateway, it handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls. These include traffic management, authorization and access control, monitoring, and API version management.

API Gateway sample architecture

API Gateway integrates with Amazon CloudWatch by sending log messages and detailed metrics to it. You can activate logging for each stage in your API or for each method. You can set the verbosity of the logging (Error or Info) and whether full request and response data should be logged.

The detailed metrics that API Gateway can send to Amazon CloudWatch are:

  • Number of API calls
  • Latency
  • Integration latency
  • HTTP 400 and 500 errors
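
These metrics can also be retrieved from CloudWatch with the CLI; a rough sketch for one metric, with the API name, Region, and time window as placeholders:

# Average latency for an API over one day, in one-hour buckets
aws cloudwatch get-metric-statistics \
    --namespace AWS/ApiGateway \
    --metric-name Latency \
    --dimensions Name=ApiName,Value=my-api \
    --start-time 2024-01-01T00:00:00Z \
    --end-time 2024-01-02T00:00:00Z \
    --period 3600 \
    --statistics Average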

API Gateway features

  • Creates a unified API front end for multiple microservices.
  • Provides DDoS protection and throttling for your backend.
  • Authenticates and authorizes requests to a backend.
  • Throttles, meters, and monetizes API usage by third-party developers.

Regards

Osama

AWS Community Builder

I woke up today with fantastic news: my AWS Community Builder membership has been renewed for the second time.

The AWS Community Builders program offers technical resources, education, and networking opportunities to AWS technical enthusiasts and emerging thought leaders passionate about sharing knowledge and connecting with the technical community.

Interested AWS builders should apply to the program to build relationships with AWS product teams, AWS Heroes, and the AWS community.

You can check the program here.

Regards

Osama

VPC endpoints

A VPC endpoint enables private connections between your VPC and supported AWS services without requiring an internet gateway, NAT device, VPN connection, or Direct Connect connection. Instances in your VPC do not require public IP addresses to communicate with resources in the service. Traffic between your VPC and the other service does not leave the AWS network.

Endpoints are virtual devices. They are horizontally scaled, redundant, and highly available VPC components. They permit communication between instances in your VPC and services without imposing availability risks or bandwidth constraints on your network traffic.

Types of VPC endpoints

GATEWAY ENDPOINT

You specify a gateway endpoint as a route target in your route table. A gateway endpoint is meant for traffic destined for Amazon S3 or Amazon DynamoDB, and that traffic remains inside the AWS network.

For example, instance A in a public subnet communicates with Amazon S3 through an internet gateway and has a route to local destinations in the VPC. Instance B communicates with an Amazon S3 bucket and an Amazon DynamoDB table using separate gateway endpoints. A private route table directs the Amazon S3 and DynamoDB requests through each gateway endpoint, using a prefix list to target the specific Region for each service.

INTERFACE ENDPOINT

With an interface VPC endpoint (interface endpoint), you can privately connect your VPC to services as if they were in your VPC. When the interface endpoint is created, traffic is directed to the new endpoint without changes to any route tables in your VPC.

For example, consider a VPC with a public and a private subnet, each containing an Amazon Elastic Compute Cloud (Amazon EC2) instance, with AWS Systems Manager outside the VPC. Systems Manager traffic sent to ssm.region.amazonaws.com is routed to an elastic network interface in the private subnet.

Gateway VPC endpoints and interface VPC endpoints help you access services over the AWS backbone.

A gateway VPC endpoint (gateway endpoint) is a gateway that you specify as a target for a route in your route table for traffic destined for a supported AWS service. The following AWS services are supported: Amazon S3 and Amazon DynamoDB.

An interface VPC endpoint (interface endpoint) is an elastic network interface with a private IP address from the IP address range of your subnet. The network interface serves as an entry point for traffic destined to a supported service. AWS PrivateLink powers interface endpoints and keeps that traffic off the public internet.
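
To make this concrete, here is a sketch of creating one endpoint of each type; the VPC, route table, subnet, and security group IDs, and the Region in the service names, are placeholders:

# Gateway endpoint for Amazon S3, added as a target in a route table
aws ec2 create-vpc-endpoint \
    --vpc-id vpc-0123456789abcdef0 \
    --vpc-endpoint-type Gateway \
    --service-name com.amazonaws.us-east-1.s3 \
    --route-table-ids rtb-0123456789abcdef0

# Interface endpoint for Systems Manager, backed by an elastic network interface
aws ec2 create-vpc-endpoint \
    --vpc-id vpc-0123456789abcdef0 \
    --vpc-endpoint-type Interface \
    --service-name com.amazonaws.us-east-1.ssm \
    --subnet-ids subnet-0123456789abcdef0 \
    --security-group-ids sg-0123456789abcdef0 \
    --private-dns-enabled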

Regards

Osama