Oracle Autonomous Database (ADB) on Oracle Cloud Infrastructure (OCI) is a cloud service that leverages machine learning to automate routine database tasks, offering users a self-driving, self-securing, and self-repairing database solution. This blog post will delve into setting up and interacting with an Autonomous Transaction Processing (ATP) instance, showcasing how to deploy a sample application to demonstrate its capabilities.
Overview of Oracle Autonomous Database
Self-Driving: Automates performance tuning, patching, and scaling.
Self-Securing: Applies security updates automatically and encrypts data by default.
Self-Repairing: Offers built-in high availability and automated backups.
Step 1: Creating an Autonomous Database
Log into OCI Console: Go to cloud.oracle.com and sign in to your account.
Create Autonomous Database:
Navigate to the Database section and click on Autonomous Database.
Click on Create Autonomous Database.
Fill in the required details:
Display Name: MyATPDB
Database Name: MYATPDB
Database Type: Autonomous Transaction Processing
CPU Count: 1 (can be adjusted later)
Storage: 1 TB (adjust as necessary)
Configure the Admin Password and ensure you store it securely.
Click Create Autonomous Database.
Step 2: Setting Up the Network
2.1: Create a Virtual Cloud Network (VCN)
Navigate to the Networking Section.
Click on Create VCN and fill in the necessary details:
VCN Name: MyVCN
CIDR Block: 10.0.0.0/16
Subnets: Create a public subnet with a CIDR block of 10.0.0.0/24.
2.2: Configure Security Lists
In the VCN settings, add an ingress security rule to allow traffic from your machine to the database:
Source CIDR: your public IP address, as a /32 (for SQL Developer access).
IP Protocol: TCP
Source Port Range: All
Destination Port Range: 1522 (the default port for ADB connections)
Step 3: Connecting to the Autonomous Database
3.1: Download Wallet
In the ADB console, navigate to your database and click on DB Connection.
Download the Client Credentials (Wallet). This will be a zip file containing the wallet and connection files.
3.2: Set Up SQL Developer
Open Oracle SQL Developer (recent versions support Cloud Wallet connections natively, so no extra preferences are required).
In the Connections pane, click on the green + icon to create a new connection.
Set the connection type to Cloud Wallet, then specify:
Connection Name: MyATPConnection
Username: ADMIN
Password: Your admin password
Wallet Location: Path to the unzipped wallet directory
Click Test to verify the connection, then click Save.
Step 4: Creating a Sample Schema and Table
Once connected to your database, execute the following SQL commands to create a sample schema and a table:
-- Create a new user/schema (ADB requires a 12-30 character password
-- containing upper case, lower case, and a digit)
CREATE USER sample_user IDENTIFIED BY SamplePassword123;
GRANT ALL PRIVILEGES TO sample_user;

-- Switch the session to the new schema
ALTER SESSION SET CURRENT_SCHEMA = sample_user;

-- Create a sample table
CREATE TABLE employees (
  employee_id NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  first_name  VARCHAR2(50)  NOT NULL,
  last_name   VARCHAR2(50)  NOT NULL,
  email       VARCHAR2(100) NOT NULL UNIQUE,
  hire_date   DATE DEFAULT SYSDATE
);

-- Insert sample data
INSERT INTO employees (first_name, last_name, email)
VALUES ('John', 'Doe', 'john.doe@example.com');

INSERT INTO employees (first_name, last_name, email)
VALUES ('Jane', 'Smith', 'jane.smith@example.com');

COMMIT;
Querying the Data
To verify the data insertion, run:
SELECT * FROM employees;
Step 5: Using Autonomous Database Features
5.1: Auto-Scaling
ADB allows you to scale compute and storage resources automatically. To enable auto-scaling:
Navigate to your Autonomous Database instance in the OCI console.
Click on Edit.
Enable Auto Scaling for both CPU and storage.
Specify the minimum and maximum resources.
5.2: Monitoring Performance
Utilize the Performance Hub feature to monitor real-time database performance. You can view metrics such as CPU utilization, active sessions, wait events, and top SQL statements.
Building a Scalable Data Pipeline with OCI Data Flow
In this blog, we will explore how to build a scalable data pipeline on Oracle Cloud Infrastructure (OCI) using OCI Data Flow. We’ll cover the end-to-end process, from setting up OCI Data Flow and processing large datasets to integrating with other OCI services.
Introduction to OCI Data Flow
Overview of OCI Data Flow and its key features.
Benefits of using a serverless, scalable data processing service.
Common use cases for OCI Data Flow, including ETL, real-time analytics, and machine learning.
Setting Up OCI Data Flow
Prerequisites
An active Oracle Cloud account.
Necessary permissions and quotas for creating OCI resources.
Configuration Steps
Create a Data Flow Application:
Navigate to the OCI Console and open the Data Flow service.
Click on “Create Application” and provide the necessary details.
Define your application’s parameters and Spark version.
Configure Networking:
Set up Virtual Cloud Network (VCN) and subnets.
Ensure proper security lists and network security groups (NSGs) for secure communication.
Creating a Scalable Data Pipeline
Designing the Data Pipeline
Outline the flow of data from source to target.
Example pipeline: Ingest data from OCI Object Storage, process it using Data Flow, and store results in an Autonomous Database.
Developing Data Flow Jobs
Write Spark jobs in Scala, Python, or Java.
Example Spark job to process data:
// Read JSON data from Object Storage
val df = spark.read.json("oci://<bucket_name>@<namespace>/data/")
// Filter rows and write the results back to Object Storage as CSV
df.filter("age > 30").write.csv("oci://<bucket_name>@<namespace>/output/")
Deploying and Running Jobs
Deploy the Spark job to OCI Data Flow.
Schedule and manage job runs using OCI Console or CLI.
Processing Large Datasets
Handling Big Data
Techniques for optimizing Spark jobs for large datasets.
Using partitions and caching to improve performance.
Example: Processing a 1TB Dataset
Step-by-step guide to ingest, process, and analyze a 1TB dataset using OCI Data Flow.
Integrating with Other OCI Services
OCI Object Storage
Use Object Storage for data ingestion and storing intermediate results.
Configure Data Flow to directly access Object Storage buckets.
OCI Autonomous Database
Store processed data in an Autonomous Database.
Example of loading data from Data Flow to Autonomous Database.
OCI Streaming
Integrate with OCI Streaming for real-time data processing.
Example: Stream processing pipeline using OCI Streaming and Data Flow.
Optimizing Data Flow Jobs
Performance Tuning
Tips for optimizing resource usage and job execution times.
Adjusting executor memory, cores, and dynamic allocation settings.
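For example, the executor and dynamic-allocation settings mentioned above map to standard Spark configuration keys; the values below are illustrative placeholders to tune for your own workload, not recommendations:

```properties
spark.executor.memory=8g
spark.executor.cores=4
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.minExecutors=2
spark.dynamicAllocation.maxExecutors=20
```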
Cost Management
Strategies for minimizing costs while running Data Flow jobs.
Monitor job execution and cost metrics using the OCI Console.
As a worked example, consider putting in place a secure data pipeline that uses OCI Data Integration to load log data into an OCI Autonomous Database, OCI Data Flow to process the log data, and an OCI Object Storage bucket to stage it. To protect the security and integrity of the data, the pipeline includes access controls, encryption, and monitoring.
Setting up the OCI CLI (Command Line Interface) involves several steps to authenticate, configure, and start using it effectively. Here’s a detailed guide to help you set up OCI CLI.
Step 1: Prerequisites
OCI Account: Ensure you have an Oracle Cloud Infrastructure account.
Access: Make sure you have appropriate permissions to create and manage resources.
Operating System: OCI CLI supports Windows, macOS, and Linux distributions.
Step 2: Install OCI CLI
Install Python: OCI CLI requires Python 3 (check the OCI CLI documentation for the current minimum version). Install Python if it’s not already installed:
On Linux:
sudo apt update
sudo apt install python3
On macOS: Install via Homebrew:
brew install python3
On Windows: Download and install Python from python.org.
Install OCI CLI: Use pip, Python’s package installer, to install OCI CLI:
pip3 install oci-cli
Step 3: Configure OCI CLI
Generate API Signing Keys: OCI CLI uses API signing keys for authentication. If you haven’t created keys yet, generate them through the OCI Console:
Go to Identity → Users.
Select your user.
Under Resources, click on API Keys.
Generate a new key pair if none exists.
Configure OCI CLI: After installing OCI CLI, configure it with your tenancy, user details, and API key:
Open a terminal or command prompt.
Run the following command:
oci setup config
Enter a location for your config file: Choose a path where OCI CLI configuration will be stored (default is ~/.oci/config).
Enter a user OCID: Enter your user OCID (Oracle Cloud Identifier).
Enter a tenancy OCID: Enter your tenancy OCID.
Enter a region name: Choose the OCI region where your resources are located (e.g., us-ashburn-1).
Do you want to generate a new API Signing RSA key pair?: If you haven’t generated API keys, choose yes and follow the prompts.
Once configured, OCI CLI will create a configuration file (config) and a key file (oci_api_key.pem) in the specified location.
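After setup completes, the generated config file looks roughly like the following; the OCID values, fingerprint, and key path shown here are placeholders, not real identifiers:

```ini
[DEFAULT]
user=ocid1.user.oc1..<your_user_ocid>
fingerprint=12:34:56:78:90:ab:cd:ef:12:34:56:78:90:ab:cd:ef
key_file=~/.oci/oci_api_key.pem
tenancy=ocid1.tenancy.oc1..<your_tenancy_ocid>
region=us-ashburn-1
```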
In today’s rapidly evolving digital landscape, choosing the right cloud infrastructure is crucial for organizations aiming to scale, secure, and innovate efficiently. Oracle Cloud Infrastructure (OCI) stands out as a robust platform offering a comprehensive suite of cloud services tailored for enterprise-grade performance and reliability.
1. Overview of OCI: Oracle Cloud Infrastructure (OCI) provides a highly scalable and secure cloud computing platform designed to meet the needs of both traditional enterprise workloads and modern cloud-native applications. Key components include:
Compute Services: OCI offers Virtual Machines (VMs) for general-purpose and high-performance computing, Bare Metal instances for demanding workloads, and Container Engine for Kubernetes clusters.
Storage Solutions: Includes Block Volumes for persistent storage, Object Storage for scalable and durable data storage, and File Storage for file-based workloads.
Networking Capabilities: Virtual Cloud Network (VCN) enables customizable network topologies with VPN and FastConnect for secure and high-bandwidth connectivity. Load Balancer distributes incoming traffic across multiple instances.
Database Options: Features Autonomous Database for self-driving, self-securing, and self-repairing databases, MySQL Database Service for fully managed MySQL databases, and Exadata Cloud Service for high-performance databases.
Implementing Autonomous Database
Autonomous Database handles routine tasks like patching, backups, and updates automatically, allowing the IT team to focus on enhancing customer experiences.
Security and Compliance: OCI provides robust security features such as Identity and Access Management (IAM) for centralized control over access policies, Security Zones for isolating critical workloads, and Web Application Firewall (WAF) for protecting web applications from threats.
Management and Monitoring: OCI’s Management Tools offer comprehensive monitoring, logging, and resource management capabilities. With tools like Oracle Cloud Infrastructure Monitoring and Logging, organizations gain insights into performance metrics and operational logs, ensuring proactive management and troubleshooting.
Integration and Developer Tools: For seamless integration, OCI offers Oracle Integration Cloud and API Gateway, enabling organizations to connect applications and services securely across different environments. Developer Tools like Oracle Cloud Developer Tools and SDKs support agile development and deployment practices.
Oracle Cloud Infrastructure (OCI) emerges as a robust solution for enterprises seeking a secure, scalable, and high-performance cloud platform. Whether it’s deploying mission-critical applications, managing large-scale databases, or ensuring compliance and security, OCI offers the tools and capabilities to drive innovation and business growth.
Amazon RDS is a web service that makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks. This allows you to focus on your applications and business. Amazon RDS gives you access to the full capabilities of a MySQL, Oracle, SQL Server, or Aurora database engine. This means that the code, applications, and tools you already use today with your existing databases can be used with Amazon RDS.
Amazon RDS automatically patches the database software and backs up your database. It stores the backups for a user-defined retention period and provides point-in-time recovery. You benefit from the flexibility of scaling the compute resources or storage capacity associated with your relational DB instance with a single API call.
Amazon RDS is available on six database engines, which optimize for memory, performance, or I/O. The database engines include:
Amazon Aurora
PostgreSQL
MySQL
MariaDB
Oracle Database
SQL Server
Amazon RDS Multi-AZ deployments
Amazon RDS Multi-AZ deployments provide enhanced availability and durability for DB instances, making them a natural fit for production database workloads. When you provision a Multi-AZ DB instance, Amazon RDS synchronously replicates the data to a standby instance in a different Availability Zone.
You can modify your environment from Single-AZ to Multi-AZ at any time. Each Availability Zone runs on its own physically distinct, independent infrastructure and is engineered to be highly reliable. Upon failure, the secondary instance picks up the load. Note that this is not used for read-only scenarios.
Read replicas
With Amazon RDS, you can create read replicas of your database. Amazon automatically keeps them in sync with the primary DB instance. Read replicas are available in Amazon RDS for Aurora, MySQL, MariaDB, PostgreSQL, Oracle, and Microsoft SQL Server. Read replicas can help you:
Relieve pressure on your primary node with additional read capacity.
Bring data close to your applications in different AWS Regions.
Promote a read replica to a standalone instance as a disaster recovery (DR) solution if the primary DB instance fails.
You can add read replicas to handle read workloads so your primary database doesn’t become overloaded with read requests. Depending on the database engine, you can also place your read replica in a different Region from your primary database. This gives you the ability to have a read replica closer to a particular locality.
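As a sketch of the routing pattern this enables, here is a minimal stdlib-Python illustration. The endpoint names and dict-based "connections" are hypothetical stand-ins for real database endpoints, not the RDS API:

```python
import itertools

# App-side routing sketch: all writes go to the primary endpoint,
# while reads are spread round-robin across replica endpoints.
primary = {"name": "primary", "data": {}}
replicas = [{"name": "replica-1"}, {"name": "replica-2"}]
_replica_cycle = itertools.cycle(replicas)

def write(key, value):
    # Writes always hit the primary, keeping it the source of truth.
    primary["data"][key] = value

def read_endpoint():
    # Each read is served by the next replica in the rotation.
    return next(_replica_cycle)["name"]

write("k", "v")
print(read_endpoint())  # replica-1
print(read_endpoint())  # replica-2
```

In a real application, the same round-robin (or latency-based) choice would be made over replica connection strings rather than plain dicts.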
You can configure a source database as Multi-AZ for high availability and create a read replica (in Single-AZ) for read scalability. With RDS for MySQL and MariaDB, you can also set the read replica itself as Multi-AZ and use it as a DR target. Because such a replica is already Multi-AZ, when you promote it to a standalone database it is replicated across multiple Availability Zones.
Amazon DynamoDB tables
DynamoDB is a fully managed NoSQL database service. DynamoDB uses primary keys to uniquely identify each item in a table and secondary indexes to provide more querying flexibility. When creating a table, you must specify a table name and a primary key. These are the only two required entities.
There are two types of primary keys supported:
Simple primary key: A simple primary key is composed of just one attribute designated as the partition key. If you use only a partition key, no two items can have the same partition key value.
Composite primary key: A composite primary key is composed of both a partition key and a sort key. In this case the partition key value for multiple items can be the same, but their sort key values must be different.
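To make the composite-key shape concrete, here is a minimal sketch in plain Python that mimics how a partition key plus sort key identifies items. The dict-based "table" and the key values are illustrative, not the DynamoDB API:

```python
# Toy model of a DynamoDB-style table with a composite primary key.
# The real service enforces uniqueness server-side; here a dict keyed
# by (partition_key, sort_key) tuples plays the same role.
table = {}

def put_item(partition_key, sort_key, attributes):
    # Many items may share a partition key...
    table[(partition_key, sort_key)] = attributes

def get_item(partition_key, sort_key):
    # ...but the (partition, sort) pair identifies exactly one item.
    return table.get((partition_key, sort_key))

# Two items share the partition key "Artist#1" but differ in sort key.
put_item("Artist#1", "Song#A", {"Title": "First Song"})
put_item("Artist#1", "Song#B", {"Title": "Second Song"})

print(get_item("Artist#1", "Song#A")["Title"])  # First Song
```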
You work with three core components: tables, items, and attributes. A table is a collection of items, and each item is a collection of attributes. For example, consider a table containing two items with the primary keys Nikki Wolf and John Stiles. The Nikki Wolf item includes three attributes: Role, Year, and Genre. The John Stiles item includes a Height attribute but no Genre attribute.
Amazon DynamoDB consistency options
When your application writes data to a DynamoDB table and receives an HTTP 200 response (OK), the write has occurred and is durable. The data is eventually consistent across all storage locations, usually within one second or less. DynamoDB supports eventually consistent and strongly consistent reads.
DynamoDB uses eventually consistent reads, unless you specify otherwise. Read operations (such as GetItem, Query, and Scan) provide a ConsistentRead parameter. If you set this parameter to true, DynamoDB uses strongly consistent reads during the operation.
EVENTUALLY CONSISTENT READS
When you read data from a DynamoDB table, the response might not reflect the results of a recently completed write operation. The response might include some stale data. If you repeat your read request after a short time, the response should return the latest data.
STRONGLY CONSISTENT READS
When you request a strongly consistent read, DynamoDB returns a response with the most up-to-date data, reflecting the updates from all prior write operations that were successful. A strongly consistent read might not be available if there is a network delay or outage.
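The difference between the two read modes can be sketched with a toy replica model in plain Python. This is an illustration of the concept, not the DynamoDB API; the replica list and function names are hypothetical:

```python
# Toy model of eventually vs. strongly consistent reads.
# A write lands on the primary first; the replica catches up later.
replicas = [{"k": "v0"}, {"k": "v0"}]  # replicas[0] acts as the primary

def write(key, value):
    # The write is durable on the primary immediately.
    replicas[0][key] = value

def replicate():
    # Replication lag closes (usually within a second in DynamoDB).
    replicas[1].update(replicas[0])

def read(key, consistent=False):
    # Strongly consistent: always the up-to-date primary.
    # Eventually consistent: may hit the lagging replica.
    return replicas[0][key] if consistent else replicas[1][key]

write("k", "v1")
stale = read("k")                   # may still return the old "v0"
fresh = read("k", consistent=True)  # always returns "v1"
replicate()
caught_up = read("k")               # now "v1" everywhere
```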
Amazon DynamoDB global tables
A global table is a collection of one or more DynamoDB tables, all owned by a single AWS account, identified as replica tables. A replica table (or replica, for short) is a single DynamoDB table that functions as part of a global table. Each replica stores the same set of data items. Any given global table can only have one replica table per Region, and every replica has the same table name and the same primary key schema.
DynamoDB global tables provide a fully managed solution for deploying a multi-Region, multi-active database, without having to build and maintain your own replication solution. When you create a global table, you specify the AWS Regions where you want the table to be available. DynamoDB performs all the necessary tasks to create identical tables in these Regions and propagate ongoing data changes to all of them.
Database Caching
Without caching, EC2 instances read and write directly to the database. With caching, instances first attempt to read from a cache, which uses high-performance memory. They use a cache cluster that contains a set of cache nodes distributed across subnets. Resources within those subnets have high-speed access to those nodes.
Common caching strategies
There are multiple strategies for keeping information in the cache in sync with the database. Two common caching strategies include lazy loading and write-through.
Lazy loading
In lazy loading, updates are made to the database without updating the cache. In the case of a cache miss, the information retrieved from the database can be subsequently written to the cache. Lazy loading ensures that the data loaded in the cache is data needed by the application but can result in high cache-miss-to-cache-hit ratios in some use cases.
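A minimal lazy-loading sketch in plain Python, using dicts as stand-ins for the database and the cache (the key names are illustrative):

```python
# Lazy loading: reads go through the cache; on a miss, fetch from the
# database and populate the cache. Writes go only to the database.
database = {"user:1": "Alice"}
cache = {}

def get(key):
    if key in cache:          # cache hit
        return cache[key]
    value = database[key]     # cache miss: go to the database...
    cache[key] = value        # ...and populate the cache afterwards
    return value

def update(key, value):
    database[key] = value     # the cache is NOT updated, so it can go stale

get("user:1")                 # miss: loads "Alice" into the cache
update("user:1", "Alicia")
print(get("user:1"))          # hit: still returns the stale "Alice"
```

The final read demonstrates the trade-off: only data the application actually requested is cached, but a cached value can lag behind the database until it is refreshed.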
Write-through
An alternative strategy is to write to the cache every time the database is written to. This approach results in fewer cache misses, which improves read performance, but it requires additional storage for data that may never be needed by the applications.
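The write-through counterpart to the lazy-loading pattern, again as a dict-based sketch rather than any particular cache API:

```python
# Write-through: every write updates the database and the cache
# together, so subsequent reads rarely miss.
database = {}
cache = {}

def put(key, value):
    database[key] = value
    cache[key] = value        # cache is updated on every write

def get(key):
    if key in cache:          # almost always a hit after a write
        return cache[key]
    value = database[key]     # fallback for keys written elsewhere
    cache[key] = value
    return value

put("user:1", "Alice")
print(get("user:1"))          # hit: "Alice", no database round trip
```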
Managing your cache
As your application writes to the cache, you need to consider cache validity and make sure that the data written to the cache is accurate. You also need to develop a strategy for managing cache memory. When your cache is full, you determine which items should be deleted by setting an eviction policy.
CACHE VALIDITY
Lazy loading allows for stale data but doesn’t fail with empty nodes. Write-through ensures that data is always fresh but can fail with empty nodes and can populate the cache with superfluous data. By adding a time to live (TTL) value to each write to the cache, you can ensure fresh data without cluttering up the cache with extra data.
TTL is an integer value that specifies the number of seconds (or milliseconds) until the key expires. When an application attempts to read an expired key, the key is treated as not found in the cache, so the database is queried and the cache is updated. This keeps data from getting too stale and ensures that values in the cache are occasionally refreshed from the database.
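The TTL mechanism can be sketched in a few lines of stdlib Python; the 0.05-second TTL is deliberately tiny so the expiry is observable, and the dict-based store is an illustration only:

```python
import time

# Each cache entry carries an absolute expiry time; an expired key is
# treated as a miss, so the database is re-queried and the cache refreshed.
database = {"user:1": "Alice"}
cache = {}  # key -> (value, expires_at)
TTL_SECONDS = 0.05

def get(key):
    entry = cache.get(key)
    if entry is not None and time.monotonic() < entry[1]:
        return entry[0]                    # fresh hit
    value = database[key]                  # miss or expired: re-query
    cache[key] = (value, time.monotonic() + TTL_SECONDS)
    return value

get("user:1")                  # populates the cache
database["user:1"] = "Alicia"  # database changes behind the cache's back
time.sleep(0.06)               # wait past the TTL
print(get("user:1"))           # expired, so refreshed: "Alicia"
```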
MANAGING MEMORY
When cache memory is full, the cache engine removes data from memory to make space for new data. It chooses this data based on the eviction policy you set. An eviction policy evaluates the following characteristics of your data:
Which were accessed least recently?
Which have been accessed least frequently?
Which have a TTL set and the TTL value?
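The first of these characteristics corresponds to a least-recently-used (LRU) policy, which can be sketched with the stdlib `OrderedDict`. This is a generic illustration of the eviction idea, not the policy implementation of any particular cache engine:

```python
from collections import OrderedDict

# LRU eviction: when the cache is full, drop the entry that was
# accessed longest ago.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)   # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # "a" is now most recently used
cache.put("c", 3)       # evicts "b", the least recently used
print(cache.get("b"))   # None
```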
Amazon ElastiCache
Amazon ElastiCache is a web service that makes it easy to set up, manage, and scale a distributed in-memory data store or cache environment in the cloud. When you’re using a cache for a backend data store, a side-cache is perhaps the most commonly known approach. Redis and Memcached are general-purpose caches that are decoupled from the underlying data store.
Use ElastiCache for Memcached for data-intensive apps. The service works as an in-memory data store and cache to support the most demanding applications requiring sub-millisecond response times. It is fully managed, scalable, and secure—making it an ideal candidate for cases where frequently accessed data must be in memory. The service is a popular choice for web, mobile apps, gaming, ad tech, and e-commerce.
ElastiCache for Redis is an in-memory data store that provides sub-millisecond latency at internet scale. It can power the most demanding real-time applications in gaming, ad tech, e-commerce, healthcare, financial services, and IoT.
ElastiCache engines

| Feature | ElastiCache for Memcached | ElastiCache for Redis |
| --- | --- | --- |
| Simple cache to offload database burden | Yes | Yes |
| Ability to scale horizontally for writes and storage | Yes | Yes (if cluster mode is enabled) |
| Multi-threaded performance | Yes | – |
| Advanced data types | – | Yes |
| Sorting and ranking data sets | – | Yes |
| Pub and sub capability | – | Yes |
| Multi-AZ with Auto Failover | – | Yes |
| Backup and restore | – | Yes |
Amazon DynamoDB Accelerator
DynamoDB is designed for scale and performance. In most cases, the DynamoDB response times can be measured in single-digit milliseconds. However, there are certain use cases that require response times in microseconds. For those use cases, DynamoDB Accelerator (DAX) delivers fast response times for accessing eventually consistent data.
DAX is an Amazon DynamoDB compatible caching service that provides fast in-memory performance for demanding applications.
AWS Database Migration Service
AWS Database Migration Service (AWS DMS) supports migration between the most widely used databases like Oracle, PostgreSQL, SQL Server, Amazon Redshift, Aurora, MariaDB, and MySQL. AWS DMS supports both homogeneous (same engine) and heterogeneous (different engines) migrations.
The service can be used to migrate between databases on Amazon EC2, Amazon RDS, and on premises. Either the source or the target database must be located in AWS; AWS DMS cannot be used to migrate between two on-premises databases.
AWS DMS automatically handles formatting of the source data for consumption by the target database. It does not perform schema or code conversion.
For homogenous migrations, you can use native tools to perform these conversions. For heterogeneous migrations, you can use the AWS Schema Conversion Tool (AWS SCT).
AWS Schema Conversion Tool
The AWS Schema Conversion Tool (AWS SCT) automatically converts the source database schema and a majority of the database code objects. The conversion includes views, stored procedures, and functions. They are converted to a format that is compatible with the target database. Any objects that cannot be automatically converted are marked so that they can be manually converted to complete the migration.
Source databases (AWS SCT): Oracle Database, Oracle Data Warehouse, Azure SQL, SQL Server, Teradata, IBM Netezza, Greenplum, HPE Vertica, MySQL and MariaDB, PostgreSQL, Aurora, IBM Db2 LUW, Apache Cassandra, SAP ASE

Target databases on AWS: MySQL, PostgreSQL, Oracle, Amazon Redshift, Amazon RDS for MySQL, Aurora MySQL, Amazon RDS for PostgreSQL, Aurora PostgreSQL
The AWS SCT can also scan your application source code for embedded SQL statements and convert them as part of a database schema conversion project. During this process, the AWS SCT performs cloud-native code optimization by converting legacy Oracle and SQL Server functions to their equivalent AWS services, modernizing the applications at the same time as the migration.
Oracle Database 23c Free Version Now Available to Developers.
The new Oracle Database 23c Free – Developer Release is a free version of the trusted Oracle Database used by businesses of all sizes around the globe. Obtaining the only converged database that works with any data model and any task type is as easy as downloading it from the internet with no oracle.com user account or license click-through requirements.
If you’re looking for a free database for developing data-driven applications, look no further than Oracle Database 23c Free – Developer Release. Because it is backwards compatible with Oracle Database Enterprise Edition and Oracle Database cloud services, users can upgrade to other Oracle Database products at any time.
Infrastructure as code (IaC) is one of the most common ways to set up a cloud environment, whether with AWS CloudFormation, OCI Resource Manager stacks, or third-party tools such as Pulumi or Terraform.
Below, I would like to share the IaC tools I use, in case they are useful to someone.