Enabling TLS Encryption on a PubSub+ Broker – Technical Guide

Secure communication between clients and your messaging broker is critical in modern distributed systems. Transport Layer Security (TLS) protects data in transit from eavesdropping and tampering by encrypting the connection between clients and the broker. In this guide, you’ll learn how to generate certificates, configure TLS on a Solace PubSub+ broker, and validate secure connections.

1. Overview

PubSub+ supports TLS for client connections. Prefer TLSv1.2 or later where your broker version allows it; TLSv1.1 is deprecated (RFC 8996) and should be enabled only for legacy clients. This guide covers server authentication only, in which the broker proves its identity to clients.

2. Certificate and Key Generation

Before enabling TLS, you must create the cryptographic materials:

2.1 Generate a Private Key (RSA 2048 bit)

Use OpenSSL to create a password-protected RSA private key in PEM format:

openssl genpkey -algorithm RSA \
  -aes-256-cbc \
  -out private_key.pem \
  -pkeyopt rsa_keygen_bits:2048

You will be prompted for a passphrase — make sure to record it.

2.2 Extract Public Key

Export the public key from the private key in PEM format:

openssl pkey -in private_key.pem -pubout -out public_key.pem

You will be prompted for the passphrase you set earlier.

2.3 Create a Certificate Signing Request (CSR)

Generate a CSR to issue a certificate:

openssl req -new -key private_key.pem -out certificate.csr

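You will be asked to complete the Distinguished Name (DN) attributes (e.g., Common Name, Organization). Use the broker’s real hostname as the Common Name (CN) so that hostname verification can succeed during the TLS handshake. Many current TLS clients match the hostname against the Subject Alternative Name (SAN) extension rather than the CN, so include the hostname there as well if your clients require it.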

2.4 Generate the TLS Certificate

You can use the CSR to create a self-signed certificate (for testing), or send the CSR to a CA (recommended for production).

For a self-signed certificate:

openssl x509 -req -in certificate.csr \
  -signkey private_key.pem \
  -days 365 \
  -out server_certificate.pem

This results in a PEM-encoded TLS certificate valid for one year.

3. Prepare the PubSub+ Broker

TLS on PubSub+ requires the certificate file and key to be available in the broker’s certificate directory (/usr/sw/jail/certs).

4. Configure TLS on Solace PubSub+

4.1 Load the Certificate File

Transfer the certificate file to the broker’s /certs directory, for example using SFTP:

solace# copy sftp://admin@<host-ip>/server_certificate.pem /certs/server_certificate.pem

Replace <host-ip> and credentials as appropriate.

4.2 Set the Server Certificate

In the broker CLI:

solace(configure)# ssl
solace(configure/ssl)# server-certificate server_certificate.pem

This tells the broker to use that certificate for all TLS connections.

⚠️ Only one TLS certificate can be active at a time.

4.3 Cipher Suite (Optional, Recommended)

Solace supports selecting specific cipher suites. For example:

solace(configure/ssl)# cipher-suite msg-backbone name AES256-SHA

This restricts the message backbone to the named cipher suite, so client sessions negotiate only that cipher.

5. Client-Side Requirements

5.1 Trust Store

Clients must trust the CA that signed the server’s certificate. A self-signed certificate has no separate root, so add the certificate itself (server_certificate.pem) to each client’s trust store. If a public CA issued the certificate, most clients already trust it through their default trust store.

5.2 Secure Connection URI

Instead of using plaintext connections like:

tcp://broker.example.com:55555

Clients must connect over TLS, e.g.:

tcps://broker.example.com:55443

Where tcps:// indicates TLS transport.

6. Verify the Setup

Once TLS is enabled, attempt a secure connection from a client using TLS-enabled APIs (e.g., Solace Messaging APIs or MQTT with TLS support):

  • Confirm that the TLS handshake completes
  • Ensure the client validates the server certificate and hostname
  • Observe that plaintext connections are rejected

Tools like openssl s_client can also be used for validation:

openssl s_client -connect broker.example.com:55443 \
  -CAfile rootCA.pem

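For the self-signed certificate from section 2.4, pass server_certificate.pem as the -CAfile. If the certificate is trusted and the connection succeeds, the output shows the handshake details, the certificate chain, and a verify return code of 0 (ok).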
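The same checks can be scripted. The following is a minimal sketch using only the Python standard library; the function names build_tls_context and probe are illustrative, and the host, port, and CA file in the usage comment are the example values from this guide rather than anything the broker requires.

```python
import socket
import ssl
from typing import Optional


def build_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a client context that verifies the certificate chain and hostname."""
    # create_default_context enables CERT_REQUIRED and hostname checking.
    ctx = ssl.create_default_context(cafile=ca_file)
    # Refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def probe(host: str, port: int, ctx: ssl.SSLContext, timeout: float = 5.0) -> dict:
    """Open a TLS connection and report the negotiated protocol and cipher."""
    with socket.create_connection((host, port), timeout=timeout) as raw:
        # server_hostname drives both SNI and hostname verification.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return {
                "protocol": tls.version(),
                "cipher": tls.cipher()[0],
                "subject": tls.getpeercert().get("subject"),
            }


# Usage (example values from this guide):
#   context = build_tls_context("rootCA.pem")
#   print(probe("broker.example.com", 55443, context))
```

A failed chain or hostname check surfaces as ssl.SSLCertVerificationError, which makes the script usable as a simple smoke test after certificate changes.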

Regards
Osama

Basic Guide to Build a Production-Architecture on OCI

1. Why OCI for Modern Architecture?

Many architects underestimate how much OCI has matured. Today, OCI offers:

  • Low-latency networking with deterministic performance.
  • Flexible compute shapes (standard, dense I/O, high memory).
  • A Kubernetes service (OKE) with enterprise-level resilience.
  • Cloud-native storage (Block, Object, File).
  • A full security stack (Vault, Cloud Guard, WAF, IAM policies).
  • A pricing model that is often 30–50% cheaper than equivalent hyperscaler deployments.

Reference: OCI Overview
https://docs.oracle.com/en-us/iaas/Content/home.htm

2. Multi-Tier Production Architecture Overview

A typical production workload on OCI includes:

  • Network Layer: VCN, subnets, NAT, DRG, Load Balancers
  • Compute Layer: OKE, VMs, Functions
  • Data Layer: Autonomous DB, PostgreSQL, MySQL, Object Storage
  • Security Layer: OCI Vault, WAF, IAM policies
  • Observability Layer: Logging, Monitoring, Alarms, Prometheus/Grafana
  • Automation Layer: Terraform, OCI CLI, GitHub Actions/Azure DevOps

3. Networking Foundation

You start with a Virtual Cloud Network (VCN), structured in a way that isolates traffic properly:

VCN Example Layout

  • 10.10.0.0/16 — VCN Root
    • 10.10.1.0/24 — Public Subnet (Load Balancers)
    • 10.10.2.0/24 — Private Subnet (Applications / OKE Nodes)
    • 10.10.3.0/24 — DB Subnet
    • 10.10.4.0/24 — Bastion Subnet
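Before committing a layout like this to Terraform, it can help to check that every subnet sits inside the VCN and that none overlap. The sketch below uses Python’s standard ipaddress module; the subnet names in SUBNETS are labels chosen for this example, not OCI identifiers.

```python
import ipaddress
from itertools import combinations

VCN_CIDR = "10.10.0.0/16"
SUBNETS = {
    "public-lb": "10.10.1.0/24",
    "private-app": "10.10.2.0/24",
    "db": "10.10.3.0/24",
    "bastion": "10.10.4.0/24",
}


def validate_layout(vcn_cidr: str, subnets: dict) -> list:
    """Return a list of problems; an empty list means the layout is consistent."""
    vcn = ipaddress.ip_network(vcn_cidr)
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    problems = []
    # Every subnet must be contained in the VCN range.
    for name, net in nets.items():
        if not net.subnet_of(vcn):
            problems.append(f"{name} ({net}) is outside the VCN {vcn}")
    # No two subnets may share addresses.
    for (a, net_a), (b, net_b) in combinations(nets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{a} ({net_a}) overlaps {b} ({net_b})")
    return problems


# Usage: print(validate_layout(VCN_CIDR, SUBNETS))
```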

Terraform Example

resource "oci_core_vcn" "main" {
  cidr_block = "10.10.0.0/16"
  compartment_id = var.compartment_ocid
  display_name = "prod-vcn"
}

resource "oci_core_subnet" "private_app" {
  vcn_id                     = oci_core_vcn.main.id
  compartment_id             = var.compartment_ocid
  cidr_block                 = "10.10.2.0/24"
  prohibit_public_ip_on_vnic = true
  display_name               = "app-private-subnet"
}

Reference: OCI Networking Concepts
https://docs.oracle.com/en-us/iaas/Content/Network/Concepts/overview.htm


4. Deploying Workloads on OKE (Oracle Kubernetes Engine)

OKE is one of OCI’s strongest services due to:

  • Native integration with VCN
  • Worker nodes running inside your own subnets
  • The ability to use OCI Load Balancers or NGINX ingress
  • Strong security by default

Cluster Creation Example (CLI)

oci ce cluster create \
  --name prod-oke \
  --vcn-id ocid1.vcn.oc1... \
  --kubernetes-version "v1.30.1" \
  --compartment-id <compartment_ocid>

Node Pool Example

oci ce node-pool create \
  --name prod-nodepool \
  --compartment-id <compartment_ocid> \
  --cluster-id <cluster_ocid> \
  --node-shape VM.Standard3.Flex \
  --node-shape-config '{"ocpus":4,"memoryInGBs":32}' \
  --subnet-ids '["<subnet_ocid>"]'

5. Adding Ingress Traffic: OCI LB + NGINX

In multi-cloud architectures (Azure, GCP, OCI), it’s common to use Cloudflare or F5 for global routing, but within OCI you typically rely on:

  • OCI Load Balancer (Layer 4/7)
  • NGINX Ingress Controller on OKE

Example: Basic Ingress for Microservices

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: payments-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: payments.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: payments-svc
            port:
              number: 8080

6. Secure Secrets With OCI Vault

Never store secrets in ConfigMaps or Docker images.
OCI Vault integrates tightly with:

  • Kubernetes Secrets via CSI Driver
  • Database credential rotation
  • Key management (KMS)

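Example: Using OCI Vault with Kubernetes

Kubernetes does not expand ${...} placeholders on its own. Treat the manifest below as a template and substitute the value fetched from OCI Vault at deploy time (or mount the secret through the CSI driver) so that the password never lands in source control:

apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
stringData:
  username: appuser
  password: ${OCI_VAULT_SECRET_DB_PASSWORD}  # placeholder, filled in at deploy time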
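One way to fill in the placeholder at deploy time is a small rendering step. This is a minimal sketch with Python’s string.Template; the function name render_manifest is illustrative, and how the secret value is fetched from Vault is left out as an assumption about your pipeline.

```python
from string import Template

MANIFEST_TEMPLATE = """\
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
stringData:
  username: appuser
  password: ${OCI_VAULT_SECRET_DB_PASSWORD}
"""


def render_manifest(template: str, values: dict) -> str:
    """Fill ${NAME} placeholders; raises KeyError if a value is missing."""
    return Template(template).substitute(values)


# Usage: pass a mapping that holds the secret (for example os.environ) and
# pipe the rendered text to `kubectl apply -f -` instead of writing it to disk.
#   rendered = render_manifest(MANIFEST_TEMPLATE, os.environ)
```

Using substitute rather than safe_substitute means a missing secret fails the deployment instead of silently shipping the literal placeholder. Passwords containing YAML-special characters would additionally need quoting.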

7. Observability: Logging + Monitoring + Prometheus

OCI Monitoring handles metrics out of the box (CPU, memory, LB metrics, OKE metrics).
But for application-level observability, you deploy Prometheus/Grafana.

Prometheus Helm Install

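helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace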

Add ServiceMonitor for your applications:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
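metadata:
  name: payments-monitor
  labels:
    release: prometheus  # matches the Helm release name so the chart's default selector picks it up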
spec:
  selector:
    matchLabels:
      app: payments
  endpoints:
  - port: http

8. Disaster Recovery and Multi-Region Strategy

OCI provides:

  • Block Volume replication
  • Object Storage Cross-Region Replication
  • Multi-AD (Availability Domain) deployment
  • Cross-region DR using Remote Peering

Example: Autonomous DB Cross-Region DR

oci db autonomous-database create-adb-cross-region-disaster-recovery \
  --autonomous-database-id <db_ocid> \
  --disaster-recovery-region "eu-frankfurt-1"

9. CI/CD on OCI Using GitHub Actions

Example pipeline to build a Docker image and deploy to OKE:

name: Deploy to OKE

on:
  push:
    branches: [ "main" ]

jobs:
  build-deploy:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3

    - name: Build Docker Image
      run: docker build -t myapp:${{ github.sha }} .

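    - name: Log in to OCIR
      # OCIR_USERNAME (<tenancy-namespace>/<user>) and OCIR_AUTH_TOKEN are
      # assumed repository secrets; the interactive `oci session authenticate`
      # flow cannot run on a hosted runner.
      run: |
        echo "${{ secrets.OCIR_AUTH_TOKEN }}" | docker login iad.ocir.io \
          -u "${{ secrets.OCIR_USERNAME }}" --password-stdin

    - name: Push Image to OCIR
      run: |
        docker tag myapp:${{ github.sha }} \
          iad.ocir.io/tenancy/myapp:${{ github.sha }}
        docker push iad.ocir.io/tenancy/myapp:${{ github.sha }}

    - name: Deploy to OKE
      # Assumes a kubeconfig for the cluster is already available to kubectl.
      run: |
        kubectl set image deployment/myapp myapp=iad.ocir.io/tenancy/myapp:${{ github.sha }}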

[Figure: final architecture diagram]

Building a Fully Private, Zero-Trust API Platform on OCI Using API Gateway, Private Endpoints, and VCN Integration

1. Why a Private API Gateway Matters

A typical API Gateway sits at the edge and exposes public REST endpoints.
But some environments require:

  • APIs callable only from internal systems
  • Backend microservices running in private subnets
  • Zero inbound public access
  • Authentication and authorization enforced at gateway level
  • Isolation between dev, test, pre-production, and prod environments

These requirements push you toward a private deployment using Private Endpoint Mode.

This means:

  • The API Gateway receives traffic only from inside your VCN
  • Clients must be inside the private network (on-prem, FastConnect, VPN, or private OCI services)
  • The entire flow stays within the private topology

2. Architecture Overview

A private API Gateway requires several OCI components working together:

  • API Gateway (Private Endpoint Mode)
  • VCN with private subnets
  • Service Gateway for private object storage access
  • Private Load Balancer for backend microservices
  • IAM policies controlling which groups can deploy APIs
  • VCN routing configuration to direct requests correctly
  • Optional WAF (private) for east-west inspection inside the VCN

The call flow:

  1. A client inside your VCN sends a request to the Gateway’s private IP.
  2. The Gateway handles authentication, request validation, and OCI IAM signature checks.
  3. The Gateway forwards traffic to a backend private LB or private OKE services.
  4. Logs go privately to Logging service via the service gateway.

All traffic stays private. No NAT, no public egress.

3. Deploying the Gateway in Private Endpoint Mode

When creating the API Gateway:

  • Choose Private Gateway Type
  • Select the VCN and Private Subnet
  • Ensure the subnet has no internet gateway
  • Disable public routing

You will receive a private IP instead of a public endpoint.

Example shape:

Private Gateway IP: 10.0.4.15
Subnet: app-private-subnet-1
VCN CIDR: 10.0.0.0/16

Only systems inside the VCN (10.0.0.0/16) or in networks connected to it can call it.

4. Routing APIs to Private Microservices

Your backend might be:

  • A microservice running in OKE
  • A VM instance
  • A container on Container Instances
  • A private load balancer
  • A function in a private subnet
  • An internal Oracle DB REST endpoint

For reliable routing:

a. Attach a Private Load Balancer

It’s best practice to put microservices behind an internal load balancer.

Example LB private IP: 10.0.20.10

b. Add Route Table Entries

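Ensure the subnet hosting the API Gateway can route to the backend. Within a single VCN, subnets reach each other through the VCN’s implicit local routing, so no route rule has to be added:

Destination: 10.0.20.0/24
Target: local (implicit)

Explicit route rules are needed only when the backend sits outside the VCN, for example in a peered VCN or on-premises.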

If OKE is involved, ensure proper security list or NSG rules:

  • Allow port 80 or 443 from Gateway subnet to LB subnet
  • Allow health checks

5. Creating an API Deployment (Technical Example)

Here is a minimal private deployment that routes to a backend behind the internal load balancer:

Deployment specification

{
  "routes": [
    {
      "path": "/v1/customer",
      "methods": ["GET"],
      "backend": {
        "type": "HTTP_BACKEND",
        "url": "http://10.0.20.10:8080/api/customer"
      }
    }
  ]
}

Upload this JSON file and create a new deployment under your private API Gateway.

The Gateway privately calls 10.0.20.10 using internal routing.
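Since a public backend URL would defeat the private design, a pre-upload check on the specification can be useful. The sketch below is one possible check under the assumption that HTTP backends are addressed by IP literal; the function name non_private_backends is illustrative, and hostnames are flagged for manual review because they cannot be classified without DNS.

```python
import ipaddress
import json
from urllib.parse import urlparse

SPEC = json.loads("""
{
  "routes": [
    {
      "path": "/v1/customer",
      "methods": ["GET"],
      "backend": {
        "type": "HTTP_BACKEND",
        "url": "http://10.0.20.10:8080/api/customer"
      }
    }
  ]
}
""")


def non_private_backends(spec: dict) -> list:
    """Return the route paths whose HTTP backend is not a private IP literal."""
    flagged = []
    for route in spec.get("routes", []):
        backend = route.get("backend", {})
        if backend.get("type") != "HTTP_BACKEND":
            continue
        host = urlparse(backend.get("url", "")).hostname
        try:
            is_private = ipaddress.ip_address(host).is_private
        except (ValueError, TypeError):
            # Not an IP literal (or no host at all): flag it for review.
            is_private = False
        if not is_private:
            flagged.append(route["path"])
    return flagged


# Usage: an empty list means every HTTP backend targets a private address.
#   print(non_private_backends(SPEC))
```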

6. Adding Authentication and Authorization

OCI API Gateway supports:

  • OCI IAM Authorization (for IAM-authenticated clients)
  • JWT validation (OIDC tokens)
  • Custom authorizers using Functions

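Example: validate a token from an internal identity provider. The policy goes under requestPolicies in the deployment specification; the issuer, audience, and cache duration values below are placeholders.

"authentication": {
  "type": "JWT_AUTHENTICATION",
  "tokenHeader": "Authorization", "tokenAuthScheme": "Bearer",
  "issuers": ["https://id.internal.example.com"],
  "audiences": ["internal-api"],
  "publicKeys": {
    "type": "REMOTE_JWKS", "maxCacheDurationInHours": 1,
    "uri": "https://id.internal.example.com/.well-known/jwks.json"
  }
}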

This ensures zero-trust by requiring token validation even inside the private network.

7. Logging, Metrics, and Troubleshooting 100 Percent Privately

Because we are running in private-only mode, logs and metrics must also stay private.

Use:

  • Service Gateway for Logging service
  • VCN Flow Logs for traffic inspection
  • WAF (private deployment) if deeper L7 filtering is needed

Enable Access Logs:

Enable access logs: Yes
Retention: 90 days

You will see logs in the Logging service with no public egress.

8. Common Mistakes and How to Avoid Them

Route table missing entries

Most issues come from mismatched route tables between:

  • Gateway subnet
  • Backend subnet
  • OKE node pools

Security Lists or NSGs blocking traffic

Ensure the backend allows inbound traffic from the Gateway subnet.

Incorrect backend URL

Use private IP or private LB hostname.

Backend certificate errors

If the backend uses HTTPS, make sure the Gateway trusts the CA that issued the backend certificate.

Regards

Osama

Building a Real-Time Data Enrichment & Inference Pipeline on AWS Using Kinesis, Lambda, DynamoDB, and SageMaker

Modern cloud applications increasingly depend on real-time processing, especially when dealing with fraud detection, personalization, IoT telemetry, or operational monitoring.
In this post, we’ll build a fully functional AWS pipeline that:

  • Streams events using Amazon Kinesis
  • Enriches and transforms them via AWS Lambda
  • Stores real-time feature data in Amazon DynamoDB
  • Performs machine-learning inference using a SageMaker Endpoint

1. Architecture Overview

2. Step-By-Step Pipeline Build


2.1. Create a Kinesis Data Stream

aws kinesis create-stream \
  --stream-name RealtimeEvents \
  --shard-count 2 \
  --region us-east-1

This stream will accept incoming events from your apps, IoT devices, or microservices.


2.2. DynamoDB Table for Real-Time Features

aws dynamodb create-table \
  --table-name UserFeatureStore \
  --attribute-definitions AttributeName=userId,AttributeType=S \
  --key-schema AttributeName=userId,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --region us-east-1

This table holds live user features, updated every time an event arrives.


2.3. Lambda Function (Real-Time Data Enrichment)

This Lambda:

  • Reads events from Kinesis
  • Computes simple features (e.g., last event time, rolling count)
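  • Saves enriched data to DynamoDB

import base64
import json
import boto3
from datetime import datetime, timedelta
from decimal import Decimal

ddb = boto3.resource("dynamodb")
table = ddb.Table("UserFeatureStore")

def lambda_handler(event, context):

    for record in event["Records"]:
        # Kinesis delivers the record payload base64-encoded.
        raw = base64.b64decode(record["kinesis"]["data"])
        # DynamoDB rejects Python floats, so parse numbers as Decimal.
        payload = json.loads(raw, parse_float=Decimal)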

        user = payload["userId"]
        metric = payload["metric"]
        ts = datetime.fromisoformat(payload["timestamp"])

        # Fetch old features
        old = table.get_item(Key={"userId": user}).get("Item", {})

        last_ts = old.get("lastTimestamp")
        count = old.get("count", 0)

        # Count events in the current burst: reset once the gap since the
        # previous event reaches 5 minutes
        if last_ts:
            prev_ts = datetime.fromisoformat(last_ts)
            if ts - prev_ts < timedelta(minutes=5):
                count += 1
            else:
                count = 1
        else:
            count = 1

        # Save new enriched features
        table.put_item(Item={
            "userId": user,
            "lastTimestamp": ts.isoformat(),
            "count": count,
            "lastMetric": metric
        })

    return {"status": "ok"}

Attach the Lambda to the Kinesis stream.
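The counter in the handler resets based on the gap since the previous event, which approximates a rolling count but is not a true sliding window. For comparison, here is a minimal sketch of a sliding 5-minute window; the class name SlidingWindowCounter is illustrative, it assumes events arrive in timestamp order, and in Lambda the retained timestamps would have to be persisted (for example as a list attribute on the DynamoDB item) because state does not survive between invocations.

```python
from collections import deque
from datetime import datetime, timedelta


class SlidingWindowCounter:
    """Counts events whose timestamps fall within the trailing window."""

    def __init__(self, window: timedelta = timedelta(minutes=5)):
        self.window = window
        self.timestamps = deque()

    def add(self, ts: datetime) -> int:
        """Record an event and return the number of events still in the window."""
        self.timestamps.append(ts)
        cutoff = ts - self.window
        # Drop events that fall outside the window relative to the newest one.
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        return len(self.timestamps)


# Usage:
#   counter = SlidingWindowCounter()
#   counter.add(datetime.fromisoformat("2025-01-01T10:00:00"))
```

The trade-off is storage: the gap-based counter needs one timestamp per user, while the sliding window needs every timestamp from the last five minutes.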


2.4. Creating a SageMaker Endpoint for Inference

Train your model offline, then deploy it:

aws sagemaker create-endpoint-config \
  --endpoint-config-name RealtimeInferenceConfig \
  --production-variants VariantName=AllInOne,ModelName=MyInferenceModel,InitialInstanceCount=1,InstanceType=ml.m5.large

aws sagemaker create-endpoint \
  --endpoint-name RealtimeInference \
  --endpoint-config-name RealtimeInferenceConfig


2.5. API Layer Performing Live Inference

Your application now requests predictions like this:

import boto3
import json

runtime = boto3.client("sagemaker-runtime")
ddb = boto3.resource("dynamodb").Table("UserFeatureStore")

def predict(user_id, extra_input):

    # An unknown user has no item yet, so fall back to an empty feature set.
    user_features = ddb.get_item(Key={"userId": user_id}).get("Item", {})

    payload = {
        "userId": user_id,
        "features": user_features,
        "input": extra_input
    }

    response = runtime.invoke_endpoint(
        EndpointName="RealtimeInference",
        ContentType="application/json",
        # DynamoDB returns numbers as Decimal, which json cannot serialize directly.
        Body=json.dumps(payload, default=float)
    )

    return json.loads(response["Body"].read())

This combines freshly enriched features with model inference, so each prediction reflects the user’s most recent activity.


3. Production Considerations

Performance

  • Tune Lambda concurrency (reserved concurrency and the Kinesis parallelization factor)
  • Use DynamoDB DAX caching
  • Use Kinesis Enhanced Fan-Out for high throughput

Security

  • Use IAM roles with least privilege
  • Encrypt Kinesis, Lambda, DynamoDB, and SageMaker with KMS

Monitoring

  • CloudWatch Metrics
  • CloudWatch Logs Insights queries
  • DynamoDB capacity alarms
  • SageMaker Model error monitoring

Cost Optimization

  • Use PAY_PER_REQUEST DynamoDB
  • Use Lambda Power Tuning
  • Scale SageMaker endpoints with autoscaling

Implementing a Real-Time Anomaly Detection Pipeline on OCI Using Streaming Data, Oracle Autonomous Database & ML

Detecting unusual patterns in real time is critical to preventing outages, catching fraud, ensuring SLA compliance, and maintaining high-quality user experiences.
In this post, we build a real working pipeline on OCI that:

  • Ingests streaming data
  • Computes features in near-real time
  • Stores results in Autonomous Database
  • Runs anomaly detection logic
  • Sends alerts and exposes dashboards

This guide contains every technical step, including:
Streaming → Function → Autonomous DB → Anomaly Logic → Notifications → Dashboards

1. Architecture Overview

Components Used

  • OCI Streaming
  • OCI Functions
  • Oracle Autonomous Database
  • DBMS_SCHEDULER for anomaly detection job
  • OCI Notifications
  • Oracle Analytics Cloud / Grafana

2. Step-by-Step Implementation


2.1 Create OCI Streaming Stream

oci streaming admin stream create \
  --compartment-id $COMPARTMENT_OCID \
  --name "anomaly-events-stream" \
  --partitions 3

2.2 Autonomous Database Table

CREATE TABLE raw_events (
  event_id       VARCHAR2(50),
  event_time     TIMESTAMP,
  metric_value   NUMBER,
  feature1       NUMBER,
  feature2       NUMBER,
  processed_flag CHAR(1) DEFAULT 'N',
  anomaly_flag   CHAR(1) DEFAULT 'N',
  CONSTRAINT pk_raw_events PRIMARY KEY(event_id)
);

2.3 OCI Function – Feature Extraction

func.py:

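import io
import json
import os
from datetime import datetime

import cx_Oracle

def handler(ctx, data: io.BytesIO = None):
    # The Fn Python FDK passes the request body as a BytesIO object.
    event = json.loads(data.getvalue())

    evt_id = event['id']
    evt_time = datetime.fromisoformat(event['time'])
    value = event['metric']

    # DB connection. DB_USER, DB_PASSWORD and DB_DSN are assumed function
    # configuration values, used here instead of hard-coded credentials.
    conn = cx_Oracle.connect(user=os.environ['DB_USER'],
                             password=os.environ['DB_PASSWORD'],
                             dsn=os.environ['DB_DSN'])
    cur = conn.cursor()

    # Fetch the most recent earlier event. event_id is the primary key, so
    # looking up the incoming id could never return a previous value.
    cur.execute("""
        SELECT metric_value FROM raw_events
        WHERE event_time < :1
        ORDER BY event_time DESC
        FETCH FIRST 1 ROWS ONLY
    """, (evt_time,))
    prev = cur.fetchone()
    prev_val = prev[0] if prev else 1.0

    # Compute features (guard the ratio against a zero previous value)
    feature1 = value - prev_val
    feature2 = value / prev_val if prev_val else None

    # Insert new event
    cur.execute("""
        INSERT INTO raw_events(event_id, event_time, metric_value, feature1, feature2)
        VALUES(:1, :2, :3, :4, :5)
    """, (evt_id, evt_time, value, feature1, feature2))

    conn.commit()
    cur.close()
    conn.close()

    return "ok"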

Deploy the function and attach the streaming trigger.


2.4 Anomaly Detection Job (DBMS_SCHEDULER)

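Create the detection logic as a stored procedure so that the scheduler job can call it by name:

CREATE OR REPLACE PROCEDURE anomaly_detection_proc AS
  meanv  NUMBER;
  stdv   NUMBER;
  zscore NUMBER;
BEGIN
  -- Compute the baseline once instead of once per row
  SELECT AVG(feature1), STDDEV(feature1) INTO meanv, stdv FROM raw_events;

  FOR rec IN (
    SELECT event_id, feature1
    FROM raw_events
    WHERE processed_flag = 'N'
  ) LOOP
    -- NULLIF avoids division by zero; a NULL z-score is never flagged
    zscore := (rec.feature1 - meanv) / NULLIF(stdv, 0);

    IF ABS(zscore) > 3 THEN
      UPDATE raw_events SET anomaly_flag='Y' WHERE event_id=rec.event_id;
    END IF;

    UPDATE raw_events SET processed_flag='Y' WHERE event_id=rec.event_id;
  END LOOP;

  COMMIT;
END;
/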
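The z-score rule used by the job can be prototyped outside the database before tuning the threshold. This is a minimal sketch in plain Python; the function name flag_anomalies is illustrative, and it uses the sample standard deviation, which is also what Oracle’s STDDEV returns.

```python
from statistics import mean, stdev


def flag_anomalies(values, threshold=3.0):
    """Return the indices whose z-score magnitude exceeds the threshold."""
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        # All values identical: nothing can be an outlier.
        return []
    return [i for i, v in enumerate(values) if abs((v - mu) / sigma) > threshold]


# Usage: pass the feature1 column as a list of numbers.
#   flag_anomalies([0.2, -0.1, 0.0, 0.3, 12.5, ...])
```

One property worth knowing when testing: with a sample standard deviation, the largest possible z-score in n points is (n - 1) / sqrt(n), so a table with fewer than 11 rows can never produce a value above 3.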

Schedule this to run every 2 minutes:

BEGIN
  DBMS_SCHEDULER.CREATE_JOB (
    job_name        => 'ANOMALY_JOB',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN anomaly_detection_proc; END;',
    repeat_interval => 'FREQ=MINUTELY;INTERVAL=2;',
    enabled         => TRUE
  );
END;


2.5 Notifications

oci ons topic create \
  --compartment-id $COMPARTMENT_OCID \
  --name "AnomalyAlerts"

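In the database, add a trigger that fires when a row is first flagged. As written, it only prints a message through DBMS_OUTPUT; publishing to the AnomalyAlerts topic requires an additional call to the Notifications service: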

CREATE OR REPLACE TRIGGER notify_anomaly
AFTER UPDATE ON raw_events
FOR EACH ROW
WHEN (NEW.anomaly_flag='Y' AND OLD.anomaly_flag='N')
BEGIN
  DBMS_OUTPUT.PUT_LINE('Anomaly detected for event ' || :NEW.event_id);
END;
/


2.6 Dashboarding

You may use:

  • Oracle Analytics Cloud (OAC)
  • Grafana + ADW Integration
  • Any BI tool with SQL

Example Query:

SELECT event_time, metric_value, anomaly_flag 
FROM raw_events
ORDER BY event_time;

3. Terraform + OCI CLI Script Bundle

Terraform – Streaming + Function + Policies

resource "oci_streaming_stream" "anomaly" {
  name           = "anomaly-events-stream"
  partitions     = 3
  compartment_id = var.compartment_id
}

resource "oci_functions_application" "anomaly_app" {
  compartment_id = var.compartment_id
  display_name   = "anomaly-function-app"
  subnet_ids     = var.subnets
}

Terraform Notification Topic

resource "oci_ons_notification_topic" "anomaly" {
  compartment_id = var.compartment_id
  name           = "AnomalyAlerts"
}

CLI Insert Test Events

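# Keys and values are sent base64-encoded ("MQ==" is "1"); $STREAM_ENDPOINT is the stream's messages endpoint
VALUE=$(printf '%s' '{"id":"1","time":"2025-01-01T10:00:00","metric":58}' | base64)
oci streaming stream message put \
  --stream-id $STREAM_OCID \
  --endpoint $STREAM_ENDPOINT \
  --messages "[{\"key\":\"MQ==\",\"value\":\"$VALUE\"}]"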
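For more than a handful of test events, building the --messages payload by hand gets error-prone, because the Streaming API expects keys and values to be base64-encoded. The sketch below generates that payload with the standard library; the function name encode_messages and the choice of the id field as the message key are assumptions made for this example.

```python
import base64
import json


def encode_messages(events, key_field="id"):
    """Build the --messages JSON with base64-encoded keys and values."""
    messages = []
    for event in events:
        key = str(event[key_field]).encode("utf-8")
        value = json.dumps(event, separators=(",", ":")).encode("utf-8")
        messages.append({
            "key": base64.b64encode(key).decode("ascii"),
            "value": base64.b64encode(value).decode("ascii"),
        })
    return json.dumps(messages)


# Usage: print the payload and pass it to the CLI's --messages option.
#   print(encode_messages([{"id": "1", "time": "2025-01-01T10:00:00", "metric": 58}]))
```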

Deploying Real-Time Feature Store on Amazon SageMaker Feature Store with Amazon Kinesis Data Streams & Amazon DynamoDB for Low-Latency ML Inference

Modern ML inference often depends on up-to-date features (customer behaviour, session counts, recent events) that need to be available in low-latency operations. In this article you’ll learn how to build a real-time feature store on AWS using:

  • Amazon Kinesis Data Streams for streaming events
  • AWS Lambda for processing and feature computation
  • Amazon DynamoDB (or SageMaker Feature Store) for storage of feature vectors
  • Amazon SageMaker Endpoint for low-latency inference

You’ll see end-to-end code snippets and architecture guidance so you can implement this in your environment.

1. Architecture Overview

The pipeline works like this:

  1. Front-end/app produces events (e.g., user click, transaction) → published to Kinesis.
  2. A Lambda function consumes from Kinesis, computes derived features (for example: rolling window counts, recency, session features).
  3. The Lambda writes/updates these features into a DynamoDB table (or directly into SageMaker Feature Store).
  4. When a request arrives for inference, the application fetches the current feature set from DynamoDB (or Feature Store) and calls a SageMaker endpoint.
  5. Optionally, after inference you can stream feedback events for model refinement.

This architecture provides real-time feature freshness and low-latency inference.

2. Setup & Implementation

2.1 Create the Kinesis data stream

aws kinesis create-stream \
  --stream-name UserEventsStream \
  --shard-count 2 \
  --region us-east-1

2.2 Create DynamoDB table for features

aws dynamodb create-table \
  --table-name RealTimeFeatures \
  --attribute-definitions AttributeName=userId,AttributeType=S \
  --key-schema AttributeName=userId,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --region us-east-1

2.3 Lambda function to compute features

Here is a Python snippet (using boto3) which will be triggered by Kinesis:

import base64
import json
import boto3
from datetime import datetime, timedelta

dynamo = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamo.Table('RealTimeFeatures')

def lambda_handler(event, context):
    for record in event['Records']:
        # Kinesis delivers the record payload base64-encoded.
        payload = json.loads(base64.b64decode(record['kinesis']['data']))
        user_id = payload['userId']
        event_type = payload['eventType']
        ts = datetime.fromisoformat(payload['timestamp'])

        # Fetch current features
        resp = table.get_item(Key={'userId': user_id})
        item = resp.get('Item', {})
        
        # Derive features: e.g., event_count_last_5min, last_event_type
        last_update = item.get('lastUpdate', ts.isoformat())
        count_5min = item.get('count5min', 0)
        then = datetime.fromisoformat(last_update)
        if ts - then < timedelta(minutes=5):
            count_5min += 1
        else:
            count_5min = 1
        
        # Update feature item
        new_item = {
            'userId': user_id,
            'lastEventType': event_type,
            'count5min': count_5min,
            'lastUpdate': ts.isoformat()
        }
        table.put_item(Item=new_item)
    return {'statusCode': 200}

2.4 Deploy and connect Lambda to Kinesis

  • Create Lambda function in AWS console or via CLI.
  • Add Kinesis stream UserEventsStream as an event source, setting a batch size (for example, 100) and starting position TRIM_HORIZON.
  • Assign an IAM role allowing kinesis:DescribeStream, kinesis:GetShardIterator, kinesis:GetRecords, kinesis:ListShards, dynamodb:GetItem, dynamodb:PutItem, etc.
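The event source step can also be scripted. The sketch below wraps the Lambda API’s create_event_source_mapping call; the helper name attach_stream and the function name ComputeUserFeatures in the usage comment are hypothetical, and the client is passed in so the helper can be exercised without AWS access.

```python
def attach_stream(lambda_client, stream_arn, function_name, batch_size=100):
    """Create the Kinesis -> Lambda event source mapping and return its UUID."""
    response = lambda_client.create_event_source_mapping(
        EventSourceArn=stream_arn,
        FunctionName=function_name,
        StartingPosition="TRIM_HORIZON",  # start from the oldest available record
        BatchSize=batch_size,
    )
    return response["UUID"]


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   attach_stream(
#       boto3.client("lambda", region_name="us-east-1"),
#       "arn:aws:kinesis:us-east-1:123456789012:stream/UserEventsStream",
#       "ComputeUserFeatures",
#   )
```

TRIM_HORIZON replays whatever is still retained in the stream; LATEST would process only records that arrive after the mapping is created.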

2.5 Prepare SageMaker endpoint for inference

  • Train model offline (outside scope here) with features stored in training dataset matching real-time features.
  • Deploy model as endpoint, e.g., arn:aws:sagemaker:us-east-1:123456789012:endpoint/RealtimeModel.
  • In your application code, fetch the features from DynamoDB and then invoke the endpoint:

import json
import boto3

sagemaker = boto3.client('sagemaker-runtime', region_name='us-east-1')
dynamo = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamo.Table('RealTimeFeatures')

def get_prediction(user_id, input_payload):
    resp = table.get_item(Key={'userId': user_id})
    # An unknown user has no item yet, so fall back to an empty feature set.
    features = resp.get('Item', {})
    payload = {
        'features': features,
        'input': input_payload
    }
    response = sagemaker.invoke_endpoint(
        EndpointName='RealtimeModel',
        ContentType='application/json',
        # DynamoDB returns numbers as Decimal, which json cannot serialize directly.
        Body=json.dumps(payload, default=float)
    )
    result = json.loads(response['Body'].read().decode())
    return result

Conclusion

In this blog post you learned how to build a real-time feature store on AWS: streaming event ingestion with Kinesis, real-time feature computation with Lambda, storage in DynamoDB, and serving via SageMaker. You got specific code examples and operational considerations for production readiness. With this setup, you’re well-positioned to deliver low-latency, ML-powered applications.

Enjoy the cloud
Osama

Building an Embedding-Driven Similarity API Using a Vector Database on Oracle Database 23ai

Introduction

In modern AI workflows, one common requirement is: given some piece of content (a document, image caption, query text), find “similar” items in your data store — not by exact keyword match, but by meaning. This is where vector embeddings + vector search come in. In this post we build a real API that:

  • Takes input text,
  • Generates an embedding,
  • Stores embeddings in Oracle’s vector-enabled database,
  • Builds a vector index,
  • Exposes an API endpoint that returns the top K similar items.

2. Setup & Embedding Generation

2.1 Provisioning

Ensure you have an Oracle Database release that supports the VECTOR data type and vector indexes, such as Oracle Database 23ai.

2.2 Embedding generation code (Python)

import array

import oracledb
from sentence_transformers import SentenceTransformer

# Load embedding model (produces 384-dimensional vectors)
model = SentenceTransformer('all-MiniLM-L12-v2')

# Sample dataset
docs = [
    {"id":1, "title":"Cloud cost management", "category":"Finance", "text":"How to optimize cloud costs …"},
    {"id":2, "title":"Vendor contract termination", "category":"Legal", "text":"Steps and risks around vendor termination …"},
    # more documents...
]

# Connect to Oracle
conn = oracledb.connect(user="vec_user", password="pwd", dsn="your_dsn")
cursor = conn.cursor()

# Create table
cursor.execute("""
  CREATE TABLE doc_store (
    doc_id     NUMBER PRIMARY KEY,
    title      VARCHAR2(500),
    category   VARCHAR2(100),
    doc_text   CLOB,
    embed_vec  VECTOR(384, FLOAT32)
  )
""")
conn.commit()

# Insert embeddings
for d in docs:
    # python-oracledb binds VECTOR columns from array.array values, not plain lists
    vec = array.array("f", model.encode(d["text"]).tolist())
    cursor.execute("""
      INSERT INTO doc_store(doc_id, title, category, doc_text, embed_vec)
      VALUES(:1, :2, :3, :4, :5)
    """, (d["id"], d["title"], d["category"], d["text"], vec))
conn.commit()

At this point you have your texts stored with their embedding vectors.

3. Vector Indexing & Querying

3.1 Create index

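CREATE VECTOR INDEX idx_doc_embed
  ON doc_store (embed_vec)
  ORGANIZATION NEIGHBOR PARTITIONS
  DISTANCE COSINE
  WITH TARGET ACCURACY 95;

(This creates an IVF index; the dimension comes from the vectors in the column. ORGANIZATION INMEMORY NEIGHBOR GRAPH creates an HNSW index instead, provided a vector memory pool is configured. Queries request approximate, index-backed results with FETCH APPROX FIRST n ROWS ONLY.)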

3.2 API Query: embedding + vector similarity

import array

import oracledb
from flask import Flask, request, jsonify
from sentence_transformers import SentenceTransformer

app = Flask(__name__)
model = SentenceTransformer('all-MiniLM-L12-v2')
conn = oracledb.connect(user="vec_user", password="pwd", dsn="your_dsn")

@app.route('/similar', methods=['POST'])
def similar():
    query = request.json["text"]
    # Bind the query embedding as array.array so it maps to a VECTOR value
    q_vec = array.array("f", model.encode([query]).tolist()[0])
    # A per-request cursor avoids sharing cursor state between requests
    with conn.cursor() as cursor:
        cursor.execute("""
          SELECT doc_id, title, category, vector_distance(embed_vec, :qv, COSINE) AS dist
          FROM doc_store
          ORDER BY dist
          FETCH FIRST 5 ROWS ONLY
        """, {"qv": q_vec})
        results = [{"doc_id": r[0], "title": r[1], "category": r[2], "distance": r[3]} for r in cursor]
    return jsonify(results)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)

When you call this API with input text, it returns the top 5 similar documents by semantic meaning.
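To make the ranking concrete, here is a brute-force sketch of what the query computes, assuming the cosine metric: the distance is one minus the cosine similarity, and the smallest distances come first. The function names cosine_distance and top_k are illustrative; the database does the same ranking far more efficiently, especially with a vector index.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query_vec, rows, k=5):
    """rows: iterable of (doc_id, vector). Returns the k nearest (doc_id, distance) pairs."""
    scored = [(doc_id, cosine_distance(query_vec, vec)) for doc_id, vec in rows]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Usage with toy 2-dimensional vectors:
#   top_k([1.0, 0.0], [(1, [1.0, 0.0]), (2, [0.0, 1.0]), (3, [1.0, 1.0])], k=2)
```

Because cosine distance ignores vector length, documents are ranked by direction only, which is why it is a common default for sentence embeddings.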

3.3 Hybrid filtering example

Suppose you want only results in category = “Legal”. Modify the SQL:

SELECT doc_id, title, vector_distance(embed_vec, :qv) AS dist
FROM doc_store
WHERE category = 'Legal'
ORDER BY vector_distance(embed_vec, :qv)
FETCH FIRST 5 ROWS ONLY;

This combines business metadata and semantic similarity.
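In an API, the category usually comes from the request, so it is safer to bind it than to embed it in the SQL text. The helper below is a sketch of that idea; the name build_similarity_query and the :cat bind name are choices made for this example, and the caller is expected to add the :qv query vector to the returned binds.

```python
def build_similarity_query(category=None, k=5):
    """Return (sql, extra_binds) for a top-k search, optionally filtered by category."""
    if isinstance(k, bool) or not isinstance(k, int) or k < 1:
        raise ValueError("k must be a positive integer")
    where = ""
    binds = {}
    if category is not None:
        # Bind the filter value instead of concatenating it into the SQL text.
        where = "WHERE category = :cat\n"
        binds["cat"] = category
    sql = (
        "SELECT doc_id, title, vector_distance(embed_vec, :qv) AS dist\n"
        "FROM doc_store\n"
        f"{where}"
        "ORDER BY vector_distance(embed_vec, :qv)\n"
        f"FETCH FIRST {k} ROWS ONLY"
    )
    return sql, binds


# Usage with an open cursor and a query vector q_vec:
#   sql, binds = build_similarity_query(category="Legal")
#   cursor.execute(sql, {"qv": q_vec, **binds})
```

Only k is interpolated into the statement, and it is validated as a positive integer first, so user-supplied text never reaches the SQL string.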

Conclusion

This tutorial walked you through building a vector-based similarity API: embedding generation, vector storage, indexing, a query API, and hybrid filtering. While the example uses text embeddings, the same pattern works for images, audio, logs, or any other data converted into vectors. As a next step, you might add embedding refresh jobs, user feedback logging, or multi-modal embeddings (text+image), or integrate the API into a larger microservices architecture.

Regards

Osama

Hands-On: Building a Vector Database Pipeline with OCI and Open-Source Embeddings

Introduction

Vector databases are rapidly becoming a central element in AI workflows: storing embeddings (numeric vector representations of text, images or other data) and enabling semantic similarity search. In this post you’ll walk through a hands-on example of building a vector-db pipeline on Oracle Database 23ai (or Autonomous/AI Database on Oracle Cloud Infrastructure) that covers:

  1. Generating embeddings with an open-source model.
  2. Loading embeddings into the vector-enabled database.
  3. Constructing vector indexes and performing similarity queries.
  4. Integrating with metadata to produce hybrid search.
  5. Discussing performance, scalability, maintenance and best practices.

1. Pipeline Overview

Here’s the architecture of the pipeline we’ll build:

  • Data source: A set of documents (in this example, internal knowledge articles).
  • Embedding generation: Use an open-source sentence-transformer (e.g., all-MiniLM-L12-v2) to convert each document text → a vector of dimension 384.
  • Storage: Use Oracle’s VECTOR data type in a table that also holds metadata (title, date, department).
  • Indexing: Create a vector index (approximate nearest-neighbour) for fast similarity search.
  • Querying: Accept a search query (text), embed it, and run a similarity search among documents. Combine vector similarity with metadata filters (e.g., department = “Legal”).
  • Serving: Return top K results ranked by semantic similarity and metadata weight.

Here is a conceptual diagram:

Text documents → embedding model → store (id, metadata, vector) → build index  
Search query → embedding → query vector + metadata filter → results  

2. Setup & Embedding Generation

Prerequisites

  • Provision Oracle Database 23 ai / AI Database on OCI (or a sharded/VM setup supporting VECTOR type).
  • Ensure the database supports the VECTOR column type and vector indexing.
  • Python environment with sentence-transformers and the python-oracledb driver (the successor to cx_Oracle, and the one that supports binding VECTOR values).

Embedding generation (Python)

import array

from sentence_transformers import SentenceTransformer
import oracledb

# Load model (all-MiniLM-L12-v2 produces 384-dimensional vectors)
model = SentenceTransformer('all-MiniLM-L12-v2')

# Sample documents
docs = [
    {"id": 1, "title": "Employee onboarding policy", "dept": "HR", "text": "..."},
    {"id": 2, "title": "Vendor contract guidelines", "dept": "Legal", "text": "..."},
    # … more rows
]

# Generate embeddings; python-oracledb binds VECTOR values from array.array,
# so convert each embedding to a float32 array rather than a plain list
for doc in docs:
    vec = model.encode(doc['text']).tolist()
    doc['embed'] = array.array("f", vec)

# Connect to Oracle DB
conn = oracledb.connect(user="vector_usr", password="pwd", dsn="your_dsn")
cursor = conn.cursor()

# Create table; declaring dimension and format lets the database reject bad vectors
cursor.execute("""
  CREATE TABLE kb_documents (
    doc_id     NUMBER PRIMARY KEY,
    title      VARCHAR2(500),
    dept       VARCHAR2(100),
    content    CLOB,
    doc_vector VECTOR(384, FLOAT32)
  )
""")
conn.commit()

# Insert rows
for doc in docs:
    cursor.execute("""
      INSERT INTO kb_documents(doc_id, title, dept, content, doc_vector)
      VALUES(:1, :2, :3, :4, :5)
    """, (doc['id'], doc['title'], doc['dept'], doc['text'], doc['embed']))
conn.commit()

Why this matters

  • You store both business metadata (title, dept) and embedding (vector) in the same table — enabling hybrid queries (metadata + similarity).
  • Using a stable, open-source embedding model ensures reproducible vectors; you can later upgrade model version and re-embed to evolve.

3. Vector Indexing & Similarity Querying

Create vector index

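Once vectors are stored, you create a vector index for fast approximate search. The statement below builds an in-memory HNSW (neighbor graph) index using cosine distance; this index type needs memory reserved in the vector pool (VECTOR_MEMORY_SIZE).

CREATE VECTOR INDEX idx_kb_vector
  ON kb_documents (doc_vector)
  ORGANIZATION INMEMORY NEIGHBOR GRAPH
  DISTANCE COSINE
  WITH TARGET ACCURACY 95;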

Running a query: semantic search + metadata filter

Suppose you want to search: “vendor termination risk” but only within dept = “Legal”.

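import array

query = "vendor termination risk"
# python-oracledb binds VECTOR values from array.array, so convert the embedding
query_vec = array.array("f", model.encode([query])[0].tolist())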

cursor.execute("""
  SELECT doc_id, title, dept, vector_distance(doc_vector, :qv) AS dist
  FROM kb_documents
  WHERE dept = 'Legal'
  ORDER BY vector_distance(doc_vector, :qv)
  FETCH FIRST 5 ROWS ONLY
""", {"qv": query_vec})

for row in cursor:
    print(row)

Explanation

  • vector_distance computes a distance, not a similarity score: lower values mean more similar. With no metric argument it uses cosine distance.
  • We combine a standard filter WHERE dept = 'Legal' with the vector search.
  • The result returns the closest (by meaning) documents among the “Legal” department.
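
To make the ranking concrete, here is a minimal in-memory sketch of what the query does: filter by department, then sort by ascending cosine distance. The helper names (cosine_distance, top_k) and the three-dimensional toy vectors are assumptions for illustration only; the database performs the same ranking with vector_distance and, at scale, with the vector index.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity: 0 for the same direction, 2 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(rows, query_vec, dept, k=5):
    """Filter rows by department, then rank by ascending cosine distance."""
    candidates = [r for r in rows if r["dept"] == dept]
    ranked = sorted(candidates, key=lambda r: cosine_distance(r["vec"], query_vec))
    return [(r["doc_id"], cosine_distance(r["vec"], query_vec)) for r in ranked[:k]]


# Toy rows standing in for kb_documents (real vectors have 384 dimensions)
rows = [
    {"doc_id": 1, "dept": "HR", "vec": [1.0, 0.0, 0.0]},
    {"doc_id": 2, "dept": "Legal", "vec": [0.0, 1.0, 0.0]},
    {"doc_id": 3, "dept": "Legal", "vec": [0.6, 0.8, 0.0]},
]

result = top_k(rows, [0.0, 1.0, 0.0], dept="Legal", k=2)
for doc_id, dist in result:
    print(doc_id, round(dist, 4))
```

The HR row never appears in the output even though it is a valid vector, because the metadata filter runs before the ranking.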

4. Enhancements & Production Considerations

Chunking & embedding size

  • For large documents (e.g., whitepapers), chunk them into ~512 token segments before embedding; store each segment as a separate row with parent document id.
  • Maintain model_version column so you can know which embedding model was used.
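
The chunking step could look roughly like the sketch below. It is a simplification under stated assumptions: whitespace-separated words stand in for model tokens (a real tokenizer will count differently), and the function name chunk_document, the overlap parameter, and the MODEL_VERSION constant are hypothetical. Check your embedding model's maximum sequence length before settling on a chunk size, since longer inputs may be truncated.

```python
MODEL_VERSION = "all-MiniLM-L12-v2"  # assumed value for the model_version column


def chunk_document(doc_id, text, max_tokens=512, overlap=50):
    """Split text into overlapping windows of whitespace-separated words.

    Each chunk carries its parent document id and a sequence number so it can
    be stored as its own row next to its embedding.
    """
    if overlap >= max_tokens:
        raise ValueError("overlap must be smaller than max_tokens")
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for seq, start in enumerate(range(0, len(words), step)):
        window = words[start:start + max_tokens]
        chunks.append({
            "parent_doc_id": doc_id,
            "chunk_seq": seq,
            "model_version": MODEL_VERSION,
            "text": " ".join(window),
        })
        if start + max_tokens >= len(words):
            break  # the last window already reached the end of the document
    return chunks


sample_text = " ".join(f"w{i}" for i in range(1000))
chunks = chunk_document(42, sample_text)
print(len(chunks), [len(c["text"].split()) for c in chunks])
```

The overlap keeps sentences that straddle a boundary visible in both neighbouring chunks, at the cost of some duplicated storage.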

Hybrid ranking

You may want to combine semantic similarity + recency or popularity. For example:

SELECT doc_id, title,
       vector_distance(doc_vector, :qv) * 0.7 + ((SYSDATE - created_date) / 365) * 0.3 AS score
FROM kb_documents
WHERE dept = 'Legal'
ORDER BY score
FETCH FIRST 5 ROWS ONLY

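Here you give 70% weight to semantic distance and 30% to document age in years. Because ORDER BY score sorts ascending, lower scores rank first, so at equal distance a newer document beats an older one. The query assumes a created_date column, which the kb_documents table above does not define, so add it first. Adjust weights based on business logic.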
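
The same weighting can be prototyped in the application layer before committing it to SQL. The sketch below is an illustration under assumptions: the function name blended_score and the max_age_days cap are not from the original, and the cap is added so that the age term stays in the range 0 to 1 instead of growing without bound for very old documents.

```python
from datetime import date


def blended_score(distance, created, today, w_dist=0.7, w_age=0.3, max_age_days=365):
    """Combine vector distance with normalised document age; lower is better."""
    age_days = (today - created).days
    age_norm = min(max(age_days, 0) / max_age_days, 1.0)  # clamp to [0, 1]
    return w_dist * distance + w_age * age_norm


today = date(2025, 1, 1)
docs = [
    ("old-but-close", 0.10, date(2023, 1, 1)),
    ("new-but-farther", 0.30, date(2024, 12, 1)),
]

ranked = sorted(docs, key=lambda d: blended_score(d[1], d[2], today))
for name, distance, created in ranked:
    print(name, round(blended_score(distance, created, today), 4))
```

With these weights the newer document ranks first despite its larger distance, which shows how strongly a 30% age weight can reorder results.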

Scaling

  • With millions of vectors, approximate nearest-neighbour (ANN) indexing is crucial; tune index parameters such as NEIGHBORS and EFCONSTRUCTION (HNSW) or NEIGHBOR PARTITIONS (IVF), along with the target accuracy.
  • Monitor latency of vector_distance queries, and monitor index size/maintenance cost.
  • Consider sharding or partitioning the embedding table (by dept, date) if usage grows.

Maintenance

  • When you retrain or change model version: re-compute embeddings, drop and rebuild indexes.
  • Monitor performance drift: track metrics like top-K retrieval relevance, query latency, user feedback.
  • Maintain metadata hygiene: e.g., ensure each row has a valid dept, tag, creation date.

Regards
Osama

Unlocking Semantic Search and Generative-AI with Vector Databases on OCI: A Deep Dive into Oracle’s AI Vector Search

In the age of generative AI and LLM-driven applications, one of the biggest challenges enterprises face is how to connect their business-critical data (structured and unstructured) to AI models in a performant, scalable and governed way. Enter vector databases and vector search: these allow you to represent unstructured data (documents, images, embeddings) as high-dimensional “vectors”, index them for speedy similarity or semantic search, and combine them with relational business data.

With the Oracle stack — particularly the release of Oracle Database 23 ai / AI Database 26 ai — this capability is built into the database, giving you a unified platform for relational, JSON, spatial, graph and vector data.

In this article you’ll learn:

  • What vector databases and vector search are, and why they matter for AI use-cases.
  • How Oracle’s AI Vector Search works: data types, indexes, distance functions.
  • A step-by-step example: ingest text embeddings into Oracle, query them via SQL using the VECTOR data type, combine with business metadata.
  • Architectural and operational considerations: when to use, how to scale, best practices.
  • Real-world use cases and governance implications.


Vector Databases & Why They Matter

What is a vector?

A vector is simply a list of numbers that represent features of an object: could be a sentence, document, image or audio snippet. By converting raw content into vectors (embeddings) via a model, you can perform similarity or semantic search in a high-dimensional space.

What is a vector database / vector search?

A vector database supports the storage, indexing and efficient querying of vectors — typically enabling nearest-neighbour or similarity search. According to Oracle:

“A vector database is any database that can natively store and manage vector embeddings and handle the unstructured data they describe.”

Importantly, in Oracle’s case, they’ve integrated vector search into their flagship database platform so you don’t need a separate vector store — you can keep relational data + vector embeddings in one system.

Why does this matter for AI and enterprise apps?

  • Search not just by keywords, but by meaning. For example: “find all documents about contracts with high risk” might match content without the word “risk” explicitly.
  • Enables Retrieval-Augmented Generation (RAG): your LLM can query your private business data (via vector search) and feed it into the prompt to generate more accurate responses.
  • Combines unstructured data (embeddings) with structured business data (metadata, JSON, graph) in one platform, leading to a simpler architecture and fewer data silos.

How Oracle’s AI Vector Search Works

New data type: VECTOR

With Oracle Database 23 ai / AI Database 26 ai, the VECTOR data type is introduced: you can define table columns as VECTOR, store high-dimensional embeddings, and perform vector-specific operations.

Example:

CREATE TABLE docs (
  doc_id   INT,
  doc_text CLOB,
  doc_vector VECTOR  -- storing embedding
);

Vector Indexes & Distance Metrics

To deliver performant searches, Oracle supports vector indexes and distance functions (cosine, Euclidean, etc.). You can build indexes on the VECTOR column.

SQL Example – similarity query:

SELECT doc_id, doc_text
FROM docs
WHERE vector_distance(doc_vector, :query_vector) < 0.3
ORDER BY vector_distance(doc_vector, :query_vector)
FETCH FIRST 10 ROWS ONLY;

Embedding generation & model support

You have two broad options:

  • Generate embeddings externally (for example using an open-source transformer model) and load them into the VECTOR column.
  • Use built-in or integrated embedding models (Oracle offers embedding generation or ONNX support) so that vector creation and storage is closer to the database.

Hybrid queries: relational + vector

Because everything is in the same database, you can combine structured filters (e.g., WHERE region = 'EMEA') with vector similarity queries. This enables richer semantics. Example: “Find contract documents similar to this one and related to Europe market” in one query.

Retrieval-Augmented Generation (RAG) support

By using vector search to fetch relevant documents and feeding them into your LLM prompt, you create a pipeline where your AI model is grounded in your private enterprise data. Oracle emphasises this with the AI Vector Search feature.

3. Example Walk-through: Text Embeddings + Similarity Search on OCI

Let’s walk through a practical example of how you might use Oracle AI Vector Search on OCI.

Step 1: Set up the environment

  • Provision the Oracle AI Database 26 ai service in your OCI tenancy (or use Exadata/Autonomous with vector support).
  • Ensure a compatible version: the VECTOR data type was introduced in Oracle Database 23ai (release 23.4), and some newer vector features need later release updates.
  • Create a user/table space for embeddings.

Step 2: Create tables for content and embeddings

CREATE TABLE knowledge_base (
  kb_id       NUMBER GENERATED BY DEFAULT AS IDENTITY,
  title       VARCHAR2(500),
  content     CLOB,
  embed_vector VECTOR
);

Step 3: Generate embeddings and load them

Example with Python using sentence-transformers to generate embeddings, and oracledb python driver to insert:

import array

import oracledb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L12-v2')
texts = [
    "Contract for vendor A",
    "Service Level Agreement for cloud services",
    # … more rows
]
embeds = model.encode(texts).tolist()

conn = oracledb.connect(user="vector_usr", password="pwd", dsn="your_dsn")
cursor = conn.cursor()

for text, embed in zip(texts, embeds):
    # python-oracledb binds VECTOR values from array.array, not from a plain list
    cursor.execute("""
        INSERT INTO knowledge_base(title, content, embed_vector)
        VALUES(:1, :2, :3)
    """, (text, text, array.array("f", embed)))
conn.commit()

Step 4: Build a vector index (optional but recommended)

CREATE VECTOR INDEX idx_kb_embed ON knowledge_base (embed_vector)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95;

Step 5: Run a similarity search query

Suppose you want documents similar to a query “cloud SLA compliance vendor”:

import array

query_text = "cloud SLA compliance vendor"
# Convert the embedding to a float32 array so it binds as a VECTOR
query_embed = array.array("f", model.encode([query_text])[0].tolist())

cursor.execute("""
  SELECT kb_id, title, vector_distance(embed_vector, :qb) AS dist
  FROM knowledge_base
  ORDER BY vector_distance(embed_vector, :qb)
  FETCH FIRST 5 ROWS ONLY
""", {"qb": query_embed})
for row in cursor:
    print(row)

Step 6: Combine with relational filters

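For example, search only documents where region = 'EMEA' and then rank them by vector distance. This assumes you have added a region column to knowledge_base; the table created in Step 2 does not include one.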

SELECT kb_id, title
FROM knowledge_base
WHERE region = 'EMEA'
ORDER BY vector_distance(embed_vector, :qb)
FETCH FIRST 5 ROWS ONLY;

Step 7: Build RAG pipeline

  • Use vector search to fetch top K relevant documents for a given input.
  • Pass those documents plus user input to an LLM in your application layer (OCI Functions, Data Science notebook, etc).
  • Return generated answer citing which documents were used.
  • Store feedback/metrics to refine embeddings over time.
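
The prompt-assembly part of these steps can be sketched without committing to any particular LLM service. In the example below, build_rag_prompt, the max_chars budget, and the prompt wording are all assumptions for illustration; rows is expected to be the output of a vector search that also selects the content column, already sorted by ascending distance.

```python
def build_rag_prompt(question, rows, max_chars=1500):
    """Assemble a grounded prompt from search results.

    rows: iterable of (kb_id, title, content, dist), closest first.
    Returns the prompt and the ids of the documents that fit the budget,
    so the application can cite them alongside the generated answer.
    """
    context_parts, used_ids, budget = [], [], max_chars
    for kb_id, title, content, _dist in rows:
        snippet = f"[{kb_id}] {title}\n{content}"
        if len(snippet) > budget:
            break  # stop at the first document that no longer fits
        context_parts.append(snippet)
        used_ids.append(kb_id)
        budget -= len(snippet)
    prompt = (
        "Answer using only the context below and cite sources by their [id].\n\n"
        "Context:\n" + "\n\n".join(context_parts)
        + f"\n\nQuestion: {question}\nAnswer:"
    )
    return prompt, used_ids


rows = [
    (1, "Cloud SLA", "Uptime commitment is 99.9% per month.", 0.12),
    (2, "Vendor contract", "x" * 2000, 0.25),
]
prompt, used_ids = build_rag_prompt("What uptime does the SLA commit to?", rows)
print(used_ids)
print(prompt)
```

Returning used_ids separately is what makes the "cite which documents were used" step possible, and the same ids can be logged with user feedback.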

4. Architecture & Operational Considerations

When to use vector databases

Use cases:

  • Semantic document search across large unstructured corpora
  • Recommendation engines (product similarity, content suggestions)
  • Anomaly/outlier detection (embeddings of transactions or sessions)
  • RAG workflows, chatbots backed by enterprise data

Architecture variations

  • Fully integrated: Use Oracle AI Database / Exadata with vector support. One system for relational + vector.
  • Hybrid: Vector store + separate LLM + service layer (if you already have a vector DB elsewhere). But the integrated approach simplifies data movement and governance.
    Oracle emphasises eliminating data silos by embedding vector search within the database.

Performance & scaling

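  • Choose an appropriate vector index type according to scale: HNSW (in-memory neighbor graph) or IVF (neighbor partitions).
  • Ensure correct dimension of embeddings (e.g., 384, 768) and tune index parameters (e.g., NEIGHBORS and EFCONSTRUCTION for HNSW, NEIGHBOR PARTITIONS for IVF).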
  • Use horizontal scalability: Oracle supports sharding, parallel SQL, and Exadata acceleration for vector workloads.
  • Keep control of memory and storage: high-dimensional embeddings and large volumes need planning (embedding store size, index maintenance).
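
For capacity planning, a back-of-the-envelope estimate of raw embedding size is a useful starting point. The sketch below assumes FLOAT32 storage (4 bytes per dimension) and deliberately ignores row overhead, metadata columns, and index structures, all of which add to the real footprint; the function name is hypothetical.

```python
def embedding_storage_bytes(num_vectors, dimension, bytes_per_value=4):
    """Raw size of the vector payload only (FLOAT32 = 4 bytes per dimension)."""
    return num_vectors * dimension * bytes_per_value


raw = embedding_storage_bytes(10_000_000, 384)
print(f"{raw / 1024**3:.1f} GiB")  # prints "14.3 GiB"

# Doubling the dimension (e.g. a 768-dimensional model) doubles the payload
raw_768 = embedding_storage_bytes(10_000_000, 768)
print(f"{raw_768 / 1024**3:.1f} GiB")
```

An in-memory graph index needs to hold vectors in memory as well, so this number is also a lower bound when sizing the vector pool.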

Data governance, security & maintainability

  • Embeddings often represent sensitive data: apply encryption / access controls as you would relational data.
  • Versioning of embeddings: if you regenerate embeddings (new model version), you need to update vectors & indexes.
  • Monitoring & freshness: track metrics like query latency, drift in embeddings, relevance degradation.
  • Explainability: embeddings are opaque. When building enterprise apps, you may need audit trails showing “why” a result was returned.

Best practices

  • Define embedding generation strategy: consistent model, dimension size, pipeline for updating.
  • Build hybrid search queries to mix semantic + business filters.
  • Keep embedding tables small and well-partitioned (e.g., by date or region) if you expect high volumes.
  • Automate index rebuilds/maintenance during low traffic periods.
  • Cache top results where appropriate if you have frequent similar queries.
  • Perform A/B testing: compare semantic search vs keyword search to measure lift.
  • Document and govern vector fields: vector type, model version, embedding timestamp.
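
For the A/B testing point, one simple way to quantify lift is precision at K over a set of queries with known relevant documents. The sketch below uses made-up document ids and a hypothetical precision_at_k helper; in practice the relevance labels would come from user feedback or manual review.

```python
def precision_at_k(retrieved_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved documents that are labelled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant_ids)
    return hits / k


relevant = {2, 3, 4, 8}                 # labelled relevant for one test query
keyword_results = [7, 2, 9, 4, 1]       # ids returned by keyword search
semantic_results = [2, 3, 4, 9, 8]      # ids returned by vector search

p_keyword = precision_at_k(keyword_results, relevant)
p_semantic = precision_at_k(semantic_results, relevant)
print(f"keyword={p_keyword:.2f} semantic={p_semantic:.2f} lift={p_semantic - p_keyword:+.2f}")
```

Averaging this metric over a representative query set, rather than judging a single query, gives a fairer comparison between the two search modes.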

5. Use-Cases and Business Value

Use-case: Contract Search & Compliance

Imagine a legal department with thousands of contracts. Traditional keyword search misses meaning (“vendor terminated for cause”) if wording varies. With vector search you embed all contracts, allow semantic queries (“supplier termination risk Europe”), retrieve relevant ones quickly, and then feed into an LLM to summarise risk across contracts.

Use-case: Product Recommendation & RAG-enabled chatbot

Retailer: store product embeddings + user behaviour embeddings in vector table. When a user asks “What new hiking boots would you recommend given my past purchases?”, the system vector-searches similar items + user profile, then uses RAG+LLM to explain recommendations (“Based on your past purchase of Trailblazer 200 and preference for Gore-Tex, here are these three options…”).

Business value

  • Faster time-to-insight from unstructured data.
  • More relevant search & recommendations → higher engagement or productivity.
  • Better AI confidence: feeding enterprise data through vector search into LLM reduces hallucinations by anchoring responses.
  • Unified cost & architecture: no separate vector store means less operational overhead and fewer data-movement risks.

Automating Cost-Governance Workflows in Oracle Cloud Infrastructure (OCI) with APIs & Infrastructure as Code

Introduction

Cloud cost management isn’t just about checking invoices once a month — it’s about embedding automation, governance, and insights into your infrastructure so that your engineering teams make cost-aware decisions in real time. With OCI, you have native tools (Cost Analysis, Usage APIs, Budgets, etc.) and infrastructure-as-code (IaC) tooling that can help turn cost governance from an after-thought into a proactive part of your DevOps workflow.

In this article you’ll learn how to:

  1. Extract usage and cost data via the OCI Usage API / Cost Reports.
  2. Define IaC workflows (e.g., with Terraform) that enforce budget/usage guardrails.
  3. Build a simple example where you automatically tag resources, monitor spend by tag, and alert/correct when thresholds are exceeded.
  4. Discuss best practices, pitfalls, and governance recommendations for embedding FinOps into OCI operations.

1. Understanding OCI Cost & Usage Data

What data is available?

OCI provides several cost/usage-data mechanisms:

  • The Cost Analysis tool in the console allows you to view trends by service, compartment, tag, etc.
  • The Usage/Cost Reports (CSV format) which you can download or programmatically access via the Usage API.
  • The Usage API (CLI/SDK) to query usage-and-cost programmatically.

Why this matters

By surfacing cost data at a resource, compartment, or tag level, teams can answer questions like:

  • “Which tag values are consuming cost disproportionately?”
  • “Which compartments have heavy spend growth month-over-month?”
  • “Which services (Compute, Storage, Database, etc.) are the highest spenders and require optimization?”

Example: Downloading a cost report via CLI

Here are a CLI command and an equivalent Python snippet that download a cost-report CSV from your tenancy. Cost reports are stored under the reports/cost-csv/ prefix (usage reports use reports/usage-csv/), in the Oracle-owned "bling" namespace, in a bucket named after your tenancy OCID:

oci os object get \
  --namespace-name bling \
  --bucket-name <your-tenancy-OCID> \
  --name reports/cost-csv/<report_name>.csv.gz \
  --file local_report.csv.gz

The same download with the Python SDK:

import oci

config = oci.config.from_file("~/.oci/config", "DEFAULT")
os_client = oci.object_storage.ObjectStorageClient(config)
namespace = "bling"
bucket = "<your-tenancy-OCID>"
object_name = "reports/cost-csv/2025-10-19-report-00001.csv.gz"

# Stream the object to disk in 1 MiB chunks without decompressing it
resp = os_client.get_object(namespace, bucket, object_name)
with open("report-2025-10-19.csv.gz", "wb") as f:
    for chunk in resp.data.raw.stream(1024*1024, decode_content=False):
        f.write(chunk)
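
Once a report is downloaded, aggregating it by tag needs only the standard library. The sketch below is illustrative: the column names in COST_COLUMN and TAG_COLUMN are assumptions about the report layout (check the header row of your own report), and the sample data is generated in memory so the example runs without an OCI account.

```python
import csv
import gzip
import io
from collections import defaultdict

COST_COLUMN = "cost/myCost"                      # assumed cost column name
TAG_COLUMN = "tags/governance_tags.cost_center"  # assumed tag column name


def cost_by_tag(gz_bytes):
    """Sum the cost column per tag value from a gzip-compressed CSV report."""
    totals = defaultdict(float)
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", newline="") as handle:
        for row in csv.DictReader(handle):
            tag_value = row.get(TAG_COLUMN) or "untagged"
            totals[tag_value] += float(row.get(COST_COLUMN) or 0)
    return dict(totals)


# Tiny in-memory stand-in for a downloaded report
sample_csv = (
    "cost/myCost,tags/governance_tags.cost_center\n"
    "1.50,CC-101\n"
    "2.25,CC-101\n"
    "0.75,\n"
)
totals = cost_by_tag(gzip.compress(sample_csv.encode("utf-8")))
print(totals)
```

For a file on disk, read it with open("local_report.csv.gz", "rb").read() and pass the bytes to cost_by_tag. Rows with an empty tag land in the "untagged" bucket, which is often the first number worth driving down.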

2. Defining Cost-Governance Workflows with IaC

Once you have data flowing in, you can enforce guardrails and automate actions. Here’s one example pattern.

a) Enforce tagging rules

Ensure that every resource created in a compartment has a cost_center tag (for example). You can do this via policy + IaC.

# Example Terraform: tag namespace and cost-tracking tag definition
resource "oci_identity_tag_namespace" "governance" {
  compartment_id = var.compartment_id
  name           = "governance_tags"
  description    = "Tags used for cost governance"
  is_retired     = false
}

resource "oci_identity_tag" "cost_center" {
  tag_namespace_id = oci_identity_tag_namespace.governance.id
  name             = "cost_center"
  description      = "Cost Center code for FinOps tracking"
  is_cost_tracking = true
  is_retired       = false
}

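You can then add an IAM policy that restricts resource creation based on tags. Treat the statement below as an illustrative sketch rather than verified policy syntax, and check the condition variables against the OCI IAM policy reference before using it. To force a tag value at creation time, a tag default with a required user-supplied value is another option. For example:

Allow group ComputeAdmins to manage instance-family in compartment Prod
  where request.operation = 'LaunchInstance'
  and request.resource.tag.cost_center is not null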

b) Monitor vs budget

Use the Usage API or Cost Reports to pull monthly spend per tag, then compare against defined budgets. If thresholds are exceeded, trigger an alert or remediation.

Here’s an example Python pseudo-code:

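from collections import defaultdict
from datetime import datetime, timedelta, timezone
import oci

config = oci.config.from_file()
usage_client = oci.usage_api.UsageapiClient(config)

# DAILY granularity expects timestamps aligned to midnight UTC
midnight = datetime.now(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)
start = midnight.replace(day=1)
end = midnight + timedelta(days=1)  # include today's usage so far

req = oci.usage_api.models.RequestSummarizedUsagesDetails(
    tenant_id=config["tenancy"],
    time_usage_started=start,
    time_usage_ended=end,
    granularity="DAILY",
    query_type="COST",
    # Group by one defined tag; the namespace matches the Terraform example
    group_by_tag=[oci.usage_api.models.Tag(namespace="governance_tags", key="cost_center")],
)

resp = usage_client.request_summarized_usages(req)

# One item is returned per day and tag value, so sum them into a monthly total
totals = defaultdict(float)
for item in resp.data.items:
    tag_value = next((t.value for t in (item.tags or []) if t.key == "cost_center" and t.value), "untagged")
    totals[tag_value] += float(item.computed_amount or 0)

for tag_value, cost in totals.items():
    print(f"Cost for cost_center={tag_value}: {cost:.2f}")

    # budget_for, send_alert and take_remediation are placeholders you supply
    if cost > budget_for(tag_value):
        send_alert(tag_value, cost)
        take_remediation(tag_value)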

c) Automated remediation

Remediation could mean:

  • Auto-shut down non-production instances in compartments after hours.
  • Resize or terminate idle resources.
  • Notify owners of over-spend via email/Slack.

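Terraform, OCI Functions and the Events service can help orchestrate that. For example, create a budget with an alert rule for the compartment or cost-tracking tag; when the alert fires, an Events rule or a Notifications subscription invokes a Function that tags the affected resources with “cost_alerted” and optionally shuts them down.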

3. Putting It All Together

Here is a step-by-step scenario:

  1. Define budget categories – e.g., cost_center codes: CC-101, CC-202, CC-303.
  2. Tag resources on creation – via policy/IaC ensure all resources include cost_center tag with one of those codes.
  3. Collect cost data – using Usage API daily, group by tag.cost_center.
  4. Evaluate current spend vs budget – for each code, compare cumulative cost for current month against budget.
  5. If over budget – then:
    • send an alert to the team (via OCI Notifications, email, Slack)
    • optionally trigger remediation: e.g., stop non-critical compute in that cost center’s compartments.
  6. Dashboard & visibility – load cost data into a BI tool (could be OCI Analytics Cloud or Oracle Analytics) with trends, forecasts, anomaly detection. Use the “Show cost” in OCI Ops Insights to view usage & forecast cost.
  7. Continuous improvement – right-size instances, pause dev/test at night, switch to cheaper shapes or reserved/commit models (depending on your discount model). See OCI best practice guide for optimizing cost.
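
Steps 4 and 5 boil down to a small decision function that is easy to unit-test separately from any OCI calls. The sketch below is an assumption-laden illustration: the function name evaluate_budgets, the 80% warning threshold, the action labels, and the sample figures are all made up, and the returned actions would be wired to your own alerting and remediation code.

```python
def evaluate_budgets(spend_by_tag, budgets, warn_ratio=0.8):
    """Map each cost_center to an action based on month-to-date spend.

    Returns "remediate" when spend exceeds the budget, "alert" when it has
    reached warn_ratio of the budget, "ok" otherwise, and "no_budget" for
    tag values (such as untagged spend) that have no budget defined.
    """
    actions = {}
    for tag, spend in spend_by_tag.items():
        budget = budgets.get(tag)
        if budget is None:
            actions[tag] = "no_budget"
        elif spend > budget:
            actions[tag] = "remediate"
        elif spend >= warn_ratio * budget:
            actions[tag] = "alert"
        else:
            actions[tag] = "ok"
    return actions


spend = {"CC-101": 1200.0, "CC-202": 850.0, "CC-303": 100.0, "untagged": 40.0}
budgets = {"CC-101": 1000.0, "CC-202": 1000.0, "CC-303": 1000.0}

actions = evaluate_budgets(spend, budgets)
for tag, action in actions.items():
    print(tag, action)
```

Surfacing "no_budget" explicitly, rather than silently skipping it, keeps untagged or newly introduced cost centers visible in the daily run.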

Example snippet – alerting logic in CLI

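# Example command to get summarized cost for the last 7 days, grouped by the cost_center tag
# (CLI output uses kebab-case keys, and tags are returned as a list on each item)
oci usage-api usage-summary request-summarized-usages \
  --tenant-id $TENANCY_OCID \
  --time-usage-started $(date -u -d '-7 days' +%Y-%m-%dT00:00:00Z) \
  --time-usage-ended   $(date -u +%Y-%m-%dT00:00:00Z) \
  --granularity DAILY \
  --query-type COST \
  --group-by-tag '[{"namespace":"governance_tags","key":"cost_center"}]' \
  --query "data.items[?tags[?value=='CC-101']].\"computed-amount\"" \
  --raw-output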

Enjoy the OCI
Osama