Ensuring high availability (HA) for your applications is critical in today’s cloud-first environment. Oracle Cloud Infrastructure (OCI) provides robust tools such as Load Balancers and Compute Instances to help you create a resilient, highly available architecture for your applications. In this post, we’ll walk through the steps to set up an HA architecture using OCI Load Balancer with multiple compute instances across availability domains for fault tolerance.
Prerequisites
- OCI Account: A working Oracle Cloud Infrastructure account.
- OCI CLI: Installed and configured with necessary permissions.
- Terraform: Installed and set up for provisioning infrastructure.
- Basic knowledge of Load Balancers and Compute Instances in OCI.
Step 1: Set Up a Virtual Cloud Network (VCN)
A VCN is required to house your compute instances and load balancers. To begin, create a new VCN with subnets in different availability domains (ADs) for high availability.
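Availability domain names are tenancy-specific (they look like `Uocm:PHX-AD-1`, not `AD-1`), so rather than hard-coding them you can look them up with the provider's identity data source. A small sketch:

```hcl
# Look up the availability domains visible to this tenancy so the
# subnets and instances below can reference real AD names.
data "oci_identity_availability_domains" "ads" {
  compartment_id = "<compartment_ocid>"
}

# Example reference in a subnet or instance:
#   availability_domain = data.oci_identity_availability_domains.ads.availability_domains[0].name
```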
Terraform Configuration (vcn.tf):
resource "oci_core_virtual_network" "vcn" {
  compartment_id = "<compartment_ocid>"
  cidr_block     = "10.0.0.0/16"
  display_name   = "HA-Virtual-Network"
}

resource "oci_core_subnet" "subnet1" {
  compartment_id      = "<compartment_ocid>"
  vcn_id              = oci_core_virtual_network.vcn.id
  cidr_block          = "10.0.1.0/24"
  availability_domain = "<availability_domain_1>" # full AD name, e.g. "Uocm:PHX-AD-1"
  display_name        = "HA-Subnet-AD1"
}

resource "oci_core_subnet" "subnet2" {
  compartment_id      = "<compartment_ocid>"
  vcn_id              = oci_core_virtual_network.vcn.id
  cidr_block          = "10.0.2.0/24"
  availability_domain = "<availability_domain_2>" # a different AD in the same region
  display_name        = "HA-Subnet-AD2"
}
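A public load balancer and public instances also need a path to the internet. The snippet below is a minimal sketch of the extra gateway and routing resources (resource names are illustrative):

```hcl
# Internet gateway so the public load balancer and instances are reachable.
resource "oci_core_internet_gateway" "igw" {
  compartment_id = "<compartment_ocid>"
  vcn_id         = oci_core_virtual_network.vcn.id
  display_name   = "HA-Internet-Gateway"
}

# Default route sending all outbound traffic through the gateway.
resource "oci_core_route_table" "rt" {
  compartment_id = "<compartment_ocid>"
  vcn_id         = oci_core_virtual_network.vcn.id
  display_name   = "HA-Route-Table"

  route_rules {
    destination       = "0.0.0.0/0"
    destination_type  = "CIDR_BLOCK"
    network_entity_id = oci_core_internet_gateway.igw.id
  }
}
```

Attach the table to both subnets by setting `route_table_id = oci_core_route_table.rt.id` in each `oci_core_subnet` block.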
Step 2: Provision Compute Instances
Create two compute instances (one in each subnet) to ensure redundancy.
Terraform Configuration (compute.tf):
resource "oci_core_instance" "instance1" {
  compartment_id      = "<compartment_ocid>"
  availability_domain = "<availability_domain_1>" # must match subnet1's AD
  shape               = "VM.Standard2.1"
  display_name        = "HA-Instance-1"

  create_vnic_details {
    subnet_id        = oci_core_subnet.subnet1.id
    assign_public_ip = true
  }

  source_details {
    source_type = "image"
    source_id   = "<image_ocid>"
  }
}

resource "oci_core_instance" "instance2" {
  compartment_id      = "<compartment_ocid>"
  availability_domain = "<availability_domain_2>" # must match subnet2's AD
  shape               = "VM.Standard2.1"
  display_name        = "HA-Instance-2"

  create_vnic_details {
    subnet_id        = oci_core_subnet.subnet2.id
    assign_public_ip = true
  }

  source_details {
    source_type = "image"
    source_id   = "<image_ocid>"
  }
}
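The load balancer configured in the next step probes `HTTP /health` on port 80, so each instance needs something listening there. One way to bootstrap this is cloud-init user data passed through the instance `metadata` argument. The sketch below assumes an Oracle Linux platform image (package names and firewall commands differ on other images):

```hcl
# Hypothetical addition to each oci_core_instance block: cloud-init
# user data that installs nginx and serves a /health endpoint on port 80.
metadata = {
  user_data = base64encode(<<-EOF
    #!/bin/bash
    yum install -y nginx
    echo "OK" > /usr/share/nginx/html/health
    systemctl enable --now nginx
    firewall-cmd --permanent --add-port=80/tcp
    firewall-cmd --reload
  EOF
  )
}
```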
Step 3: Set Up the OCI Load Balancer
Now, configure the OCI Load Balancer to distribute traffic between the compute instances in both availability domains. Note that in the OCI Terraform provider, the backend set, its backends, and the listener are standalone resources rather than blocks nested inside the load balancer.
Terraform Configuration (load_balancer.tf):
resource "oci_load_balancer_load_balancer" "ha_lb" {
  compartment_id = "<compartment_ocid>"
  display_name   = "HA-Load-Balancer"
  shape          = "100Mbps"
  subnet_ids = [
    oci_core_subnet.subnet1.id,
    oci_core_subnet.subnet2.id
  ]
}

resource "oci_load_balancer_backend_set" "backend_set_1" {
  load_balancer_id = oci_load_balancer_load_balancer.ha_lb.id
  name             = "backend-set-1"
  policy           = "ROUND_ROBIN"

  health_checker {
    port              = 80
    protocol          = "HTTP"
    url_path          = "/health"
    retries           = 3
    timeout_in_millis = 3000  # must be shorter than the probe interval
    interval_ms       = 5000
  }
}

resource "oci_load_balancer_backend" "backend1" {
  load_balancer_id = oci_load_balancer_load_balancer.ha_lb.id
  backendset_name  = oci_load_balancer_backend_set.backend_set_1.name
  ip_address       = oci_core_instance.instance1.private_ip
  port             = 80
}

resource "oci_load_balancer_backend" "backend2" {
  load_balancer_id = oci_load_balancer_load_balancer.ha_lb.id
  backendset_name  = oci_load_balancer_backend_set.backend_set_1.name
  ip_address       = oci_core_instance.instance2.private_ip
  port             = 80
}

resource "oci_load_balancer_listener" "ha_listener" {
  load_balancer_id         = oci_load_balancer_load_balancer.ha_lb.id
  name                     = "http-listener"
  default_backend_set_name = oci_load_balancer_backend_set.backend_set_1.name
  port                     = 80
  protocol                 = "HTTP"
}
Step 4: Set Up Health Checks for High Availability
Health checks are critical to ensure that the load balancer sends traffic only to healthy instances. The health check configuration is included in the backend set definition above, but you can customize it as needed.
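For example, a stricter check that also pins the expected HTTP status code might look like the following (values are illustrative; the provider expects millisecond fields, and the timeout should stay shorter than the interval):

```hcl
health_checker {
  port              = 80
  protocol          = "HTTP"
  url_path          = "/health"
  return_code       = 200   # any other status marks the backend unhealthy
  retries           = 2
  timeout_in_millis = 3000
  interval_ms       = 10000
}
```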
Step 5: Testing and Validation
Once all resources are provisioned, test the HA architecture:
- Verify Load Balancer Health: Confirm that both backend instances are reported as healthy by querying the backend set's health status:
oci lb backend-set-health get --load-balancer-id <load_balancer_id> --backend-set-name backend-set-1
- Access the Application: Browse to your application through the Load Balancer’s public IP. With the ROUND_ROBIN policy, requests should alternate between the two compute instances.
- Failover Testing: Manually shut down one of the instances to verify that the Load Balancer reroutes traffic to the other instance.
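A failover test can be driven entirely from the CLI. The commands below are a sketch; the OCIDs and the IP are placeholders you would substitute from your own tenancy:

```shell
# Stop one backend instance (a graceful shutdown via the CLI).
oci compute instance action --instance-id <instance1_ocid> --action SOFTSTOP

# Send repeated requests through the load balancer; once the health
# check marks instance 1 unhealthy, all responses should come from instance 2.
for i in $(seq 1 10); do
  curl -s http://<load_balancer_public_ip>/
  sleep 2
done

# Bring the instance back once the test is done.
oci compute instance action --instance-id <instance1_ocid> --action START
```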