AWS database services part 2

Part one https://osamaoracle.com/2023/01/03/aws-database-services/

Amazon RDS

Amazon RDS is a web service that makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks. This allows you to focus on your applications and business. Amazon RDS gives you access to the full capabilities of a MySQL, Oracle, SQL Server, or Aurora database engine. This means that the code, applications, and tools you already use today with your existing databases can be used with Amazon RDS.

Amazon RDS automatically patches the database software and backs up your database. It stores the backups for a user-defined retention period and provides point-in-time recovery. You benefit from the flexibility of scaling the compute resources or storage capacity associated with your relational DB instance with a single API call.

Amazon RDS is available with six database engines, optimized for different memory, performance, or I/O needs. The database engines include:

  • Amazon Aurora
  • PostgreSQL
  • MySQL
  • MariaDB
  • Oracle Database
  • SQL Server

Amazon RDS Multi-AZ deployments

Amazon RDS Multi-AZ deployments provide enhanced availability and durability for DB instances, making them a natural fit for production database workloads. When you provision a Multi-AZ DB instance, Amazon RDS synchronously replicates the data to a standby instance in a different Availability Zone. 

You can modify your environment from Single-AZ to Multi-AZ at any time. Each Availability Zone runs on its own physically distinct, independent infrastructure and is engineered to be highly reliable. Upon failure, the secondary instance picks up the load. Note that this is not used for read-only scenarios.
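
For example, converting an existing instance from Single-AZ to Multi-AZ is a single API call. The following is a minimal boto3 sketch (Python), assuming your credentials and default Region are already configured and that a DB instance named mydb-instance exists; the same call can also resize the instance class or storage.

import boto3

rds = boto3.client("rds")

# Convert a hypothetical Single-AZ DB instance to Multi-AZ. ApplyImmediately
# avoids waiting for the next maintenance window.
rds.modify_db_instance(
    DBInstanceIdentifier="mydb-instance",
    MultiAZ=True,
    ApplyImmediately=True,
)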

Read replicas

With Amazon RDS, you can create read replicas of your database. Amazon automatically keeps them in sync with the primary DB instance. Read replicas are available in Amazon RDS for Aurora, MySQL, MariaDB, PostgreSQL, Oracle, and Microsoft SQL Server. Read replicas can help you:

  • Relieve pressure on your primary node with additional read capacity.
  • Bring data close to your applications in different AWS Regions.
  • Promote a read replica to a standalone instance as a disaster recovery (DR) solution if the primary DB instance fails.

You can add read replicas to handle read workloads so your primary database doesn’t become overloaded with read requests. Depending on the database engine, you can also place your read replica in a different Region from your primary database. This gives you the ability to have a read replica closer to a particular locality.

You can configure a source database as Multi-AZ for high availability and create a read replica (in Single-AZ) for read scalability. With RDS for MySQL and MariaDB, you can also set the read replica as Multi-AZ, and as a DR target. When you promote the read replica to be a standalone database, it will be replicated to multiple Availability Zones.
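
As a rough illustration, the sketch below creates a read replica with boto3 and later promotes it to a standalone instance, assuming a hypothetical source instance named mydb-instance; for a cross-Region replica, the source would be specified by its ARN against the destination Region's endpoint.

import boto3

rds = boto3.client("rds")

# Create a read replica of a hypothetical source DB instance.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="mydb-replica",
    SourceDBInstanceIdentifier="mydb-instance",
)

# Later, as part of a DR plan, promote the replica to a standalone instance.
rds.promote_read_replica(DBInstanceIdentifier="mydb-replica")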

Amazon DynamoDB tables

DynamoDB is a fully managed NoSQL database service. DynamoDB uses primary keys to uniquely identify each item in a table and secondary indexes to provide more querying flexibility. When creating a table, you must specify a table name and a primary key. These are the only two required entities.

There are two types of primary keys supported:

  • Simple primary key: A simple primary key is composed of just one attribute designated as the partition key. If you use only the partition key, no two items can have the same partition key value.
  • Composite primary key: A composite primary key is composed of both a partition key and a sort key. In this case the partition key value for multiple items can be the same, but their sort key values must be different.

You work with the core components: tables, items, and attributes. A table is a collection of items, and each item is a collection of attributes. For example, consider a table with two items whose primary key values are Nikki Wolf and John Stiles. The Nikki Wolf item includes three other attributes: Role, Year, and Genre. The John Stiles item includes a Height attribute but does not include a Genre attribute.
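
To make the example concrete, here is a minimal boto3 sketch that creates a table with a simple primary key (Name as the partition key) and writes the two items. The table name and attribute values are hypothetical, and a composite key would simply add a RANGE entry to KeySchema.

import boto3

dynamodb = boto3.resource("dynamodb")

# Table with a simple primary key: "Name" is the partition key.
table = dynamodb.create_table(
    TableName="Actors",
    KeySchema=[{"AttributeName": "Name", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "Name", "AttributeType": "S"}],
    BillingMode="PAY_PER_REQUEST",
)
table.wait_until_exists()

# Items in the same table do not need to share attributes beyond the primary key.
table.put_item(Item={"Name": "Nikki Wolf", "Role": "Lead", "Year": 2001, "Genre": "Drama"})
table.put_item(Item={"Name": "John Stiles", "Role": "Support", "Height": 180})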

Amazon DynamoDB consistency options

When your application writes data to a DynamoDB table and receives an HTTP 200 response (OK), the write has occurred and is durable. The data is eventually consistent across all storage locations, usually within one second or less. DynamoDB supports eventually consistent and strongly consistent reads.

DynamoDB uses eventually consistent reads, unless you specify otherwise. Read operations (such as GetItem, Query, and Scan) provide a ConsistentRead parameter. If you set this parameter to true, DynamoDB uses strongly consistent reads during the operation.
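
For instance, with boto3 the same GetItem call can be issued either way; this sketch reuses the hypothetical Actors table from above.

import boto3

table = boto3.resource("dynamodb").Table("Actors")

# Default read: eventually consistent.
eventual = table.get_item(Key={"Name": "Nikki Wolf"})

# Strongly consistent read: set ConsistentRead to True.
strong = table.get_item(Key={"Name": "Nikki Wolf"}, ConsistentRead=True)
print(strong.get("Item"))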

EVENTUALLY CONSISTENT READS

When you read data from a DynamoDB table, the response might not reflect the results of a recently completed write operation. The response might include some stale data. If you repeat your read request after a short time, the response should return the latest data.

STRONGLY CONSISTENT READS

When you request a strongly consistent read, DynamoDB returns a response with the most up-to-date data, reflecting the updates from all prior write operations that were successful. A strongly consistent read might not be available if there is a network delay or outage.

Amazon DynamoDB global tables

A global table is a collection of one or more DynamoDB tables, all owned by a single AWS account, identified as replica tables. A replica table (or replica, for short) is a single DynamoDB table that functions as part of a global table. Each replica stores the same set of data items. Any given global table can only have one replica table per Region, and every replica has the same table name and the same primary key schema.

DynamoDB global tables provide a fully managed solution for deploying a multi-Region, multi-active database, without having to build and maintain your own replication solution. When you create a global table, you specify the AWS Regions where you want the table to be available. DynamoDB performs all the necessary tasks to create identical tables in these Regions and propagate ongoing data changes to all of them.
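
As a rough sketch, the original (2017) version of global tables can be created with a single call, assuming identical tables named Actors, with DynamoDB Streams enabled, already exist in both Regions; newer global tables are instead managed by adding replicas through UpdateTable.

import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Group the existing per-Region tables into one global table.
dynamodb.create_global_table(
    GlobalTableName="Actors",
    ReplicationGroup=[
        {"RegionName": "us-east-1"},
        {"RegionName": "eu-west-1"},
    ],
)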

Database Caching

Without caching, EC2 instances read and write directly to the database. With caching, instances first attempt to read from a cache, which uses high-performance memory. They use a cache cluster that contains a set of cache nodes distributed between subnets. Resources within those subnets have high-speed access to those nodes.

Common caching strategies

There are multiple strategies for keeping information in the cache in sync with the database. Two common caching strategies include lazy loading and write-through.

Lazy loading

In lazy loading, updates are made to the database without updating the cache. In the case of a cache miss, the information retrieved from the database can be subsequently written to the cache. Lazy loading ensures that the data loaded in the cache is data needed by the application but can result in high cache-miss-to-cache-hit ratios in some use cases.
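
A minimal Python sketch of lazy loading, assuming a reachable Redis endpoint, the redis-py client, and a hypothetical query_database helper:

import json
import redis

cache = redis.Redis(host="my-cache.example.com", port=6379)  # placeholder endpoint

def get_user(user_id):
    """Lazy loading (cache-aside): check the cache first, fall back to the database on a miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:                       # cache hit
        return json.loads(cached)
    record = query_database(user_id)             # hypothetical database helper (cache miss)
    cache.set(key, json.dumps(record), ex=300)   # populate the cache with a 300-second TTL
    return record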

Write-through

An alternative strategy is to write to the cache every time data is written to the database. This approach results in fewer cache misses, which improves performance, but it requires additional storage for data that the applications may never need.
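
Continuing the sketch above, a write-through version updates the cache as part of every database write (update_database is again a hypothetical helper):

def save_user(user_id, record):
    """Write-through: update the database and the cache together on every write."""
    update_database(user_id, record)                           # hypothetical database helper
    cache.set(f"user:{user_id}", json.dumps(record), ex=300)   # keep the cached copy fresh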

Managing your cache

As your application writes to the cache, you need to consider cache validity and make sure that the data written to the cache is accurate. You also need to develop a strategy for managing cache memory. When your cache is full, you determine which items should be deleted by setting an eviction policy.

CACHE VALIDITY

Lazy loading allows for stale data but doesn’t fail with empty nodes. Write-through ensures that data is always fresh but can fail with empty nodes and can populate the cache with superfluous data. By adding a time to live (TTL) value to each write to the cache, you can ensure fresh data without cluttering up the cache with extra data. 

TTL is an integer value that specifies the number of seconds (or milliseconds) until the key expires. When an application attempts to read an expired key, it is treated as though the data was not found in the cache, meaning that the database is queried and the cache is updated. This keeps data from getting too stale and requires that values in the cache are occasionally refreshed from the database.

MANAGING MEMORY

When cache memory is full, the cache engine removes data from memory to make space for new data. It chooses this data based on the eviction policy you set. An eviction policy evaluates the following characteristics of your data:

  • Which were accessed least recently?
  • Which have been accessed least frequently?
  • Which have a TTL set, and what is that TTL value?

Amazon ElastiCache

Amazon ElastiCache is a web service that makes it easy to set up, manage, and scale a distributed in-memory data store or cache environment in the cloud. When you’re using a cache for a backend data store, a side-cache is perhaps the most commonly known approach. Redis and Memcached are general-purpose caches that are decoupled from the underlying data store.

Use ElastiCache for Memcached for data-intensive apps. The service works as an in-memory data store and cache to support the most demanding applications requiring sub-millisecond response times. It is fully managed, scalable, and secure—making it an ideal candidate for cases where frequently accessed data must be in memory. The service is a popular choice for web, mobile apps, gaming, ad tech, and e-commerce. 

ElastiCache for Redis is an in-memory data store that provides sub-millisecond latency at internet scale. It can power the most demanding real-time applications in gaming, ad tech, e-commerce, healthcare, financial services, and IoT. 

ElastiCache engines

The two engines compare as follows:

  • Simple cache to offload database burden: Memcached and Redis
  • Ability to scale horizontally for writes and storage: Memcached and Redis (Redis requires cluster mode enabled)
  • Multi-threaded performance: Memcached only
  • Advanced data types: Redis only
  • Sorting and ranking data sets: Redis only
  • Pub and sub capability: Redis only
  • Multi-AZ with Auto Failover: Redis only
  • Backup and restore: Redis only

Amazon DynamoDB Accelerator

DynamoDB is designed for scale and performance. In most cases, the DynamoDB response times can be measured in single-digit milliseconds. However, there are certain use cases that require response times in microseconds. For those use cases, DynamoDB Accelerator (DAX) delivers fast response times for accessing eventually consistent data.

DAX is an Amazon DynamoDB compatible caching service that provides fast in-memory performance for demanding applications.

AWS Database Migration Service

AWS Database Migration Service (AWS DMS) supports migration between the most widely used databases like Oracle, PostgreSQL, SQL Server, Amazon Redshift, Aurora, MariaDB, and MySQL. AWS DMS supports both homogeneous (same engine) and heterogeneous (different engines) migrations.

  • The service can be used to migrate between databases on Amazon EC2, Amazon RDS, and on premises. Either the target or the source database must be located in AWS; it cannot be used to migrate between two on-premises databases.
  • AWS DMS automatically handles formatting of the source data for consumption by the target database. It does not perform schema or code conversion.
  • For homogenous migrations, you can use native tools to perform these conversions. For heterogeneous migrations, you can use the AWS Schema Conversion Tool (AWS SCT).
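
As a hedged illustration only, once the replication instance and the source and target endpoints exist, a boto3 sketch for starting a migration task might look like the following (all ARNs are placeholders):

import json
import boto3

dms = boto3.client("dms")

# Define a task that performs a full load and then replicates ongoing changes.
dms.create_replication_task(
    ReplicationTaskIdentifier="mysql-to-aurora",
    SourceEndpointArn="arn:aws:dms:us-east-1:111122223333:endpoint:SOURCE",
    TargetEndpointArn="arn:aws:dms:us-east-1:111122223333:endpoint:TARGET",
    ReplicationInstanceArn="arn:aws:dms:us-east-1:111122223333:rep:INSTANCE",
    MigrationType="full-load-and-cdc",
    TableMappings=json.dumps({
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all",
            "object-locator": {"schema-name": "%", "table-name": "%"},
            "rule-action": "include",
        }]
    }),
)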

AWS Schema Conversion Tool

The AWS Schema Conversion Tool (AWS SCT) automatically converts the source database schema and a majority of the database code objects. The conversion includes views, stored procedures, and functions. They are converted to a format that is compatible with the target database. Any objects that cannot be automatically converted are marked so that they can be manually converted to complete the migration.

Source databases:

  • Oracle Database
  • Oracle data warehouse
  • Azure SQL
  • SQL Server
  • Teradata
  • IBM Netezza
  • Greenplum
  • HPE Vertica
  • MySQL and MariaDB
  • PostgreSQL
  • Aurora
  • IBM DB2 LUW
  • Apache Cassandra
  • SAP ASE

Target databases on AWS (after conversion with AWS SCT):

  • MySQL
  • PostgreSQL
  • Oracle
  • Amazon DynamoDB
  • RDS for MySQL
  • Aurora MySQL
  • RDS for PostgreSQL
  • Aurora PostgreSQL

The AWS SCT can also scan your application source code for embedded SQL statements and convert them as part of a database schema conversion project. During this process, the AWS SCT performs cloud-native code optimization by converting legacy Oracle and SQL Server functions to their equivalent AWS services, modernizing the applications at the same time as the migration.

Regards

Osama

AWS Infrastructure

The AWS Global Cloud Infrastructure is the most secure, extensive, and reliable cloud platform, offering over 200 fully featured services from data centers globally.

AWS Data Center

AWS pioneered cloud computing in 2006 to provide rapid and secure infrastructure. AWS continuously innovates on the design and systems of data centers to protect them from man-made and natural risks. Today, AWS provides data centers at a large, global scale. AWS implements controls, builds automated systems, and conducts third-party audits to confirm security and compliance. As a result, the most highly-regulated organizations in the world trust AWS every day.

Availability Zone – AZ

An Availability Zone (AZ) is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. Availability Zones are multiple, isolated areas within a particular geographic location. When you launch an instance, you can select an Availability Zone or let AWS choose one for you. If you distribute your instances across multiple Availability Zones and one instance fails, you can design your application so that an instance in another Availability Zone can handle requests.

Region

Each AWS Region consists of multiple, isolated, and physically separate Availability Zones within a geographic area. This achieves the greatest possible fault tolerance and stability. In your account, you determine which Regions you need. You can run applications and workloads from a Region to reduce latency to end users. You can do this while avoiding the upfront expenses, long-term commitments, and scaling challenges associated with maintaining and operating a global infrastructure.

AWS Local Zone

AWS Local Zones can be used for highly demanding applications that require single-digit millisecond latency to end users. Media and entertainment content creation, real-time multiplayer gaming, and machine learning hosting and training are some use cases for AWS Local Zones.

CloudFront – Edge Location

An edge location is the nearest point to a requester of an AWS service. Edge locations are located in major cities around the world. They receive requests and cache copies of your content for faster delivery.

Regards

Osama

AWS Snow Family members

The AWS Snow Family is a collection of physical devices that help to physically transport up to exabytes of data into and out of AWS. 

The AWS Snow Family is composed of AWS Snowcone, AWS Snowball, and AWS Snowmobile.

These devices offer different capacity points, and most include built-in computing capabilities. AWS owns and manages the Snow Family devices and integrates with AWS security, monitoring, storage management, and computing capabilities.  

AWS Snowcone

AWS Snowcone is a small, rugged, and secure edge computing and data transfer device. 

It features 2 CPUs, 4 GB of memory, and 8 TB of usable storage.

AWS Snowball

AWS Snowball offers two types of devices:

  • Snowball Edge Storage Optimized devices are well suited for large-scale data migrations and recurring transfer workflows, in addition to local computing with higher capacity needs.
    • Storage: 80 TB of hard disk drive (HDD) capacity for block volumes and Amazon S3 compatible object storage, and 1 TB of SATA solid state drive (SSD) for block volumes. 
    • Compute: 40 vCPUs, and 80 GiB of memory to support Amazon EC2 sbe1 instances (equivalent to C5).
  • Snowball Edge Compute Optimized provides powerful computing resources for use cases such as machine learning, full motion video analysis, analytics, and local computing stacks.
    • Storage: 42-TB usable HDD capacity for Amazon S3 compatible object storage or Amazon EBS compatible block volumes and 7.68 TB of usable NVMe SSD capacity for Amazon EBS compatible block volumes. 
    • Compute: 52 vCPUs, 208 GiB of memory, and an optional NVIDIA Tesla V100 GPU. Devices run Amazon EC2 sbe-c and sbe-g instances, which are equivalent to C5, M5a, G3, and P3 instances.

AWS Snowmobile

AWS Snowmobile is an exabyte-scale data transfer service used to move large amounts of data to AWS. 

You can transfer up to 100 petabytes of data per Snowmobile, a 45-foot long ruggedized shipping container, pulled by a semi trailer truck.

Cheers

Osama

AWS Support

AWS offers four different Support plans to help you troubleshoot issues, lower costs, and efficiently use AWS services. 

You can choose from the following Support plans to meet your company’s needs: 

  • Basic
  • Developer
  • Business
  • Enterprise

Basic Support

Basic Support is free for all AWS customers. It includes access to whitepapers, documentation, and support communities. With Basic Support, you can also contact AWS for billing questions and service limit increases.

With Basic Support, you have access to a limited selection of AWS Trusted Advisor checks. Additionally, you can use the AWS Personal Health Dashboard, a tool that provides alerts and remediation guidance when AWS is experiencing events that may affect you. 

If your company needs support beyond the Basic level, you could consider purchasing Developer, Business, or Enterprise Support.

Developer, Business, and Enterprise Support

The Developer, Business, and Enterprise Support plans include all the benefits of Basic Support, in addition to the ability to open an unrestricted number of technical support cases. These three Support plans have pay-by-the-month pricing and require no long-term contracts.

The information in this course highlights only a selection of details for each Support plan. A complete overview of what is included in each Support plan, including pricing for each plan, is available on the AWS Support site.

In general, for pricing, the Developer plan has the lowest cost, the Business plan is in the middle, and the Enterprise plan has the highest cost. 

Developer Support

Customers in the Developer Support plan have access to features such as:

  • Best practice guidance
  • Client-side diagnostic tools
  • Building-block architecture support, which consists of guidance for how to use AWS offerings, features, and services together

For example, suppose that your company is exploring AWS services. You’ve heard about a few different AWS services. However, you’re unsure of how to potentially use them together to build applications that can address your company’s needs. In this scenario, the building-block architecture support that is included with the Developer Support plan could help you to identify opportunities for combining specific services and features.

Business Support

Customers with a Business Support plan have access to additional features, including: 

  • Use-case guidance to identify AWS offerings, features, and services that can best support your specific needs
  • All AWS Trusted Advisor checks
  • Limited support for third-party software, such as common operating systems and application stack components

Suppose that your company has the Business Support plan and wants to install a common third-party operating system onto your Amazon EC2 instances. You could contact AWS Support for assistance with installing, configuring, and troubleshooting the operating system. For advanced topics such as optimizing performance, using custom scripts, or resolving security issues, you may need to contact the third-party software provider directly.

Enterprise Support

In addition to all the features included in the Basic, Developer, and Business Support plans, customers with an Enterprise Support plan have access to features such as:

  • Application architecture guidance, which is a consultative relationship to support your company’s specific use cases and applications
  • Infrastructure event management: A short-term engagement with AWS Support that helps your company gain a better understanding of your use cases. This also provides your company with architectural and scaling guidance.
  • A Technical Account Manager

Amazon Simple Storage Service (Amazon S3)

Amazon Simple Storage Service (Amazon S3) is a service that provides object-level storage. Amazon S3 stores data as objects in buckets.

You can upload any type of file to Amazon S3, such as images, videos, text files, and so on. For example, you might use Amazon S3 to store backup files, media files for a website, or archived documents. Amazon S3 offers unlimited storage space. The maximum file size for an object in Amazon S3 is 5 TB.

Amazon S3 storage classes

With Amazon S3, you pay only for what you use. You can choose from a range of storage classes to select a fit for your business and cost needs. When selecting an Amazon S3 storage class, consider these two factors:

  • How often you plan to retrieve your data
  • How available you need your data to be

S3 Standard

  • Designed for frequently accessed data
  • Stores data in a minimum of three Availability Zones

S3 Standard provides high availability for objects. This makes it a good choice for a wide range of use cases, such as websites, content distribution, and data analytics. S3 Standard has a higher cost than other storage classes intended for infrequently accessed data and archival storage.

S3 Standard-Infrequent Access (S3 Standard-IA)

  • Ideal for infrequently accessed data
  • Similar to S3 Standard but has a lower storage price and higher retrieval price

S3 Standard-IA is ideal for data infrequently accessed but requires high availability when needed. Both S3 Standard and S3 Standard-IA store data in a minimum of three Availability Zones. S3 Standard-IA provides the same level of availability as S3 Standard but with a lower storage price and a higher retrieval price.

S3 One Zone-Infrequent Access (S3 One Zone-IA)

  • Stores data in a single Availability Zone
  • Has a lower storage price than S3 Standard-IA

Compared to S3 Standard and S3 Standard-IA, which store data in a minimum of three Availability Zones, S3 One Zone-IA stores data in a single Availability Zone. This makes it a good storage class to consider if the following conditions apply:

  • You want to save costs on storage.
  • You can easily reproduce your data in the event of an Availability Zone failure.

S3 Intelligent-Tiering

  • Ideal for data with unknown or changing access patterns
  • Requires a small monthly monitoring and automation fee per object

In the S3 Intelligent-Tiering storage class, Amazon S3 monitors objects’ access patterns. If you haven’t accessed an object for 30 consecutive days, Amazon S3 automatically moves it to the infrequent access tier, S3 Standard-IA. If you access an object in the infrequent access tier, Amazon S3 automatically moves it to the frequent access tier, S3 Standard.

S3 Glacier

  • Low-cost storage designed for data archiving
  • Able to retrieve objects within a few minutes to hours

S3 Glacier is a low-cost storage class that is ideal for data archiving. For example, you might use this storage class to store archived customer records or older photos and video files.

S3 Glacier Deep Archive

  • Lowest-cost object storage class ideal for archiving
  • Able to retrieve objects within 12 hours

When deciding between Amazon S3 Glacier and Amazon S3 Glacier Deep Archive, consider how quickly you need to retrieve archived objects. You can retrieve objects stored in the S3 Glacier storage class within a few minutes to a few hours. By comparison, you can retrieve objects stored in the S3 Glacier Deep Archive storage class within 12 hours.
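
To tie the classes together, here is a minimal boto3 sketch, with placeholder bucket and key names, that uploads an object directly into S3 Standard-IA and adds a lifecycle rule that moves older objects to S3 Glacier Deep Archive:

import boto3

s3 = boto3.client("s3")

# Upload an object directly into the Standard-IA storage class.
with open("records-2022.csv", "rb") as body:
    s3.put_object(
        Bucket="my-backup-bucket",
        Key="archive/records-2022.csv",
        Body=body,
        StorageClass="STANDARD_IA",  # other values include ONEZONE_IA, INTELLIGENT_TIERING,
    )                                # GLACIER, and DEEP_ARCHIVE

# Transition objects under the archive/ prefix to Glacier Deep Archive after 180 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-backup-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-old-objects",
            "Status": "Enabled",
            "Filter": {"Prefix": "archive/"},
            "Transitions": [{"Days": 180, "StorageClass": "DEEP_ARCHIVE"}],
        }]
    },
)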

Cheers

Osama

Amazon EC2 Options

With Amazon EC2, you pay only for the compute time that you use. Amazon EC2 offers a variety of pricing options for different use cases. For example, if your use case can withstand interruptions, you can save with Spot Instances. You can also save by committing early and locking in a minimum level of use with Reserved Instances.

On-Demand

On-Demand Instances are ideal for short-term, irregular workloads that cannot be interrupted. No upfront costs or minimum contracts apply. The instances run continuously until you stop them, and you pay for only the compute time you use.

Sample use cases for On-Demand Instances include developing and testing applications and running applications that have unpredictable usage patterns. On-Demand Instances are not recommended for workloads that last a year or longer because these workloads can experience greater cost savings using Reserved Instances.

Amazon EC2 Savings Plans

AWS offers Savings Plans for several compute services, including Amazon EC2. Amazon EC2 Savings Plans enable you to reduce your compute costs by committing to a consistent amount of compute usage for a 1-year or 3-year term. This term commitment results in savings of up to 66% over On-Demand costs.

Any usage up to the commitment is charged at the discounted plan rate (for example, $10 an hour). Any usage beyond the commitment is charged at regular On-Demand rates.

Later in this course, you will review AWS Cost Explorer, a tool that enables you to visualize, understand, and manage your AWS costs and usage over time. If you are considering your options for Savings Plans, AWS Cost Explorer can analyze your Amazon EC2 usage over the past 7, 30, or 60 days. AWS Cost Explorer also provides customized recommendations for Savings Plans. These recommendations estimate how much you could save on your monthly Amazon EC2 costs, based on previous Amazon EC2 usage and the hourly commitment amount in a 1-year or 3-year plan.

Reserved Instances

Reserved Instances are a billing discount applied to the use of On-Demand Instances in your account. You can purchase Standard Reserved and Convertible Reserved Instances for a 1-year or 3-year term, and Scheduled Reserved Instances for a 1-year term. You realize greater cost savings with the 3-year option.

At the end of a Reserved Instance term, you can continue using the Amazon EC2 instance without interruption. However, you are charged On-Demand rates until you do one of the following:

  • Terminate the instance.
  • Purchase a new Reserved Instance that matches the instance attributes (instance type, Region, tenancy, and platform).

Spot Instances

Spot Instances are ideal for workloads with flexible start and end times, or that can withstand interruptions. Spot Instances use unused Amazon EC2 computing capacity and offer cost savings of up to 90% off On-Demand prices.

Suppose that you have a background processing job that can start and stop as needed (such as the data processing job for a customer survey). You want to start and stop the processing job without affecting the overall operations of your business. If you make a Spot request and Amazon EC2 capacity is available, your Spot Instance launches. However, if you make a Spot request and Amazon EC2 capacity is unavailable, the request is not successful until capacity becomes available. The unavailable capacity might delay the launch of your background processing job.

After you have launched a Spot Instance, if capacity is no longer available or demand for Spot Instances increases, your instance may be interrupted. This might not pose any issues for your background processing job. However, in the earlier example of developing and testing applications, you would most likely want to avoid unexpected interruptions. Therefore, you would choose a different EC2 purchasing option that is better suited to those tasks.
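
For illustration, a Spot Instance for such an interruptible job can be requested at launch time with boto3; the AMI ID and instance type below are placeholders:

import boto3

ec2 = boto3.client("ec2")

# Launch a single Spot Instance for an interruptible background job.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",        # placeholder AMI
    InstanceType="c5.large",                # placeholder instance type
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {"SpotInstanceType": "one-time"},
    },
)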

Dedicated Hosts

Dedicated Hosts are physical servers with Amazon EC2 instance capacity that is fully dedicated to your use.

You can use your existing per-socket, per-core, or per-VM software licenses to help maintain license compliance. You can purchase On-Demand Dedicated Hosts and Dedicated Hosts Reservations. Of all the Amazon EC2 options that were covered, Dedicated Hosts are the most expensive.

Cheers

Osama

Amazon EC2 instance types

Amazon EC2 instance types are optimized for different tasks. When selecting an instance type, consider the specific needs of your workloads and applications. This might include requirements for compute, memory, or storage capabilities.

General purpose instances

General purpose instances provide a balance of compute, memory, and networking resources. You can use them for a variety of workloads, such as:

  • application servers
  • gaming servers
  • backend servers for enterprise applications
  • small and medium databases

Suppose that you have an application in which the resource needs for compute, memory, and networking are roughly equivalent. You might consider running it on a general purpose instance because the application does not require optimization in any single resource area.

Compute optimized instances

Compute optimized instances are ideal for compute-bound applications that benefit from high-performance processors. Like general purpose instances, you can use compute optimized instances for workloads such as web, application, and gaming servers.

However, the difference is that compute optimized instances are ideal for high-performance web servers, compute-intensive application servers, and dedicated gaming servers. You can also use compute optimized instances for batch processing workloads that require processing many transactions in a single group.

Memory optimized instances

Memory optimized instances are designed to deliver fast performance for workloads that process large datasets in memory. In computing, memory is a temporary storage area. It holds all the data and instructions that a central processing unit (CPU) needs to be able to complete actions. Before a computer program or application is able to run, it is loaded from storage into memory. This preloading process gives the CPU direct access to the computer program.

Suppose that you have a workload that requires large amounts of data to be preloaded before running an application. This scenario might be a high-performance database or a workload that involves performing real-time processing of a large amount of unstructured data. In these types of use cases, consider using a memory optimized instance. Memory optimized instances enable you to run workloads with high memory needs and receive great performance.

Accelerated computing instances

Accelerated computing instances use hardware accelerators, or coprocessors, to perform some functions more efficiently than is possible in software running on CPUs. Examples of these functions include floating-point number calculations, graphics processing, and data pattern matching.

In computing, a hardware accelerator is a component that can expedite data processing. Accelerated computing instances are ideal for workloads such as graphics applications, game streaming, and application streaming.

Storage optimized instances

Storage optimized instances are designed for workloads that require high, sequential read and write access to large datasets on local storage. Examples of workloads suitable for storage optimized instances include distributed file systems, data warehousing applications, and high-frequency online transaction processing (OLTP) systems.

In computing, the term input/output operations per second (IOPS) is a metric that measures the performance of a storage device. It indicates how many different input or output operations a device can perform in one second. Storage optimized instances are designed to deliver tens of thousands of low-latency, random IOPS to applications. 

You can think of input operations as data put into a system, such as records entered into a database. An output operation is data generated by a server. An example of output might be the analytics performed on the records in a database. If you have an application that has a high IOPS requirement, a storage optimized instance can provide better performance over other instance types not optimized for this kind of use case.

Cheers

Osama

Using PersistentVolumes in Kubernetes

PersistentVolumes provide a way to treat storage as a dynamic resource in Kubernetes. This lab will allow you to demonstrate your knowledge of PersistentVolumes. You will mount some persistent storage to a container using a PersistentVolume and a PersistentVolumeClaim.

Create a custom Storage Class by using vi localdisk.yml.

apiVersion: storage.k8s.io/v1 
kind: StorageClass 
metadata: 
  name: localdisk 
provisioner: kubernetes.io/no-provisioner
allowVolumeExpansion: true

Finish creating the Storage Class by using kubectl create -f localdisk.yml.
Create the PersistentVolume by using vi host-pv.yml.

kind: PersistentVolume 
apiVersion: v1 
metadata: 
   name: host-pv 
spec: 
   storageClassName: localdisk
   persistentVolumeReclaimPolicy: Recycle 
   capacity: 
      storage: 1Gi 
   accessModes: 
      - ReadWriteOnce 
   hostPath: 
      path: /var/output

Finish creating the PersistentVolume by using kubectl create -f host-pv.yml.

Check the status of the PersistentVolume by using kubectl get pv.

Create a PersistentVolumeClaim

Start creating a PersistentVolumeClaim for the PersistentVolume to bind to by using vi host-pvc.yml.

apiVersion: v1 
kind: PersistentVolumeClaim 
metadata: 
   name: host-pvc 
spec: 
   storageClassName: localdisk 
   accessModes: 
      - ReadWriteOnce 
   resources: 
      requests: 
         storage: 100Mi

Finish creating the PersistentVolumeClaim by using kubectl create -f host-pvc.yml.

Check the status of the PersistentVolume and PersistentVolumeClaim to verify that they have been bound:

kubectl get pv
kubectl get pvc

Create a Pod That Uses a PersistentVolume for Storage

Create a Pod that uses the PersistentVolumeClaim by using vi pv-pod.yml.

apiVersion: v1 
kind: Pod 
metadata: 
   name: pv-pod 
spec: 
   containers: 
      - name: busybox 
        image: busybox 
        command: ['sh', '-c', 'while true; do echo Success! > /output/success.txt; sleep 5; done'] 

Mount the PersistentVolume to the /output location by adding the following, which should be level with the containers spec in terms of indentation:

volumes: 
 - name: pv-storage 
   persistentVolumeClaim: 
      claimName: host-pvc

In the containers spec, below the command, set the list of volume mounts by using:

volumeMounts: 
- name: pv-storage 
  mountPath: /output 

Finish creating the Pod by using kubectl create -f pv-pod.yml.

Check that the Pod is up and running by using kubectl get pods.

If you wish, you can log in to the worker node and verify the output data by using cat /var/output/success.txt.

Managing Container Storage with Kubernetes Volumes

Kubernetes volumes offer a simple way to mount external storage to containers. This lab will test your knowledge of volumes as you provide storage to some containers according to a provided specification. This will allow you to practice what you know about using Kubernetes volumes.

Create a Pod That Outputs Data to the Host Using a Volume

  • Create a Pod that will interact with the host file system by using vi maintenance-pod.yml.

apiVersion: v1
kind: Pod
metadata:
    name: maintenance-pod
spec:
    containers:
    - name: busybox
      image: busybox
      command: ['sh', '-c', 'while true; do echo Success! >> /output/output.txt; sleep 5; done']

  • Under the basic YAML, begin creating volumes, which should be level with the containers spec:

volumes:
- name: output-vol
  hostPath:
      path: /var/data

  • In the containers spec of the basic YAML, add a line for volume mounts:

volumeMounts:
- name: output-vol
  mountPath: /output

The complete YAML will be:

apiVersion: v1
kind: Pod
metadata:
    name: maintenance-pod
spec:
    containers:
    - name: busybox
      image: busybox
      command: ['sh', '-c', 'while true; do echo Success! >> /output/output.txt; sleep 5; done']
      volumeMounts:
      - name: output-vol
        mountPath: /output
    volumes:
    - name: output-vol
      hostPath:
          path: /var/data

Create a Multi-Container Pod That Shares Data Between Containers Using a Volume

  1. Create another YAML file for a shared-data multi-container Pod by using vi shared-data-pod.yml.
  2. Start with the basic Pod definition and add multiple containers, where the first container will write the output.txt file and the second container will read the output.txt file:

apiVersion: v1
kind: Pod
metadata:
    name: shared-data-pod
spec:
    containers:
    - name: busybox1
      image: busybox
      command: ['sh', '-c', 'while true; do echo Success! >> /output/output.txt; sleep 5; done']
    - name: busybox2
      image: busybox
      command: ['sh', '-c', 'while true; do cat /input/output.txt; sleep 5; done']

Set up the volumes, again at the same level as containers with an emptyDir volume that only exists to share data between two containers in a simple way:

volumes:
- name: shared-vol
  emptyDir: {}

Mount that volume between the two containers by adding the following lines under command for the busybox1 container:

volumeMounts:
- name: shared-vol
  mountPath: /output

For the busybox2 container, add the following lines to mount the same volume under command to complete creating the shared file:

volumeMounts:
- name: shared-vol
  mountPath: /input

The complete file:

apiVersion: v1
kind: Pod
metadata:
    name: shared-data-pod
spec:
    containers:
    - name: busybox1
      image: busybox
      command: ['sh', '-c', 'while true; do echo Success! >> /output/output.txt; sleep 5; done']
      volumeMounts:
        - name: shared-vol
          mountPath: /output
    - name: busybox2
      image: busybox
      command: ['sh', '-c', 'while true; do cat /input/output.txt; sleep 5; done']
      volumeMounts:
        - name: shared-vol
          mountPath: /input
    volumes:
    - name: shared-vol
      emptyDir: {}

Finish creating the multi-container Pod by using kubectl create -f shared-data-pod.yml.

Cheers

Osama