Managing OKE Cluster Add-ons
On Private Cloud Appliance, cluster add-ons are components that you can choose to deploy on a Kubernetes cluster. Cluster add-ons extend core Kubernetes functionality and improve cluster manageability and performance. This section describes how to install and manage the following supported add-ons.
- The WebLogic Kubernetes Operator add-on supports running WebLogic Server and Fusion Middleware domains on Kubernetes. For more information about the WebLogic Kubernetes Operator, see the public documentation at https://github.com/oracle/weblogic-kubernetes-operator.
- The Database Operator for Kubernetes add-on (OraOperator) helps developers, database administrators, DevOps, and GitOps teams reduce the time and complexity of deploying and managing Oracle databases. For more information, see the public documentation at https://github.com/oracle/oracle-database-operator/tree/main.
- The NVIDIA GPU Plugin add-on is a convenient way to manage the NVIDIA Device Plugin for Kubernetes. The NVIDIA Device Plugin for Kubernetes is the NVIDIA implementation of the Kubernetes device plug-in framework for exposing the number of NVIDIA GPUs on each worker node, and tracking the health of those GPUs. For more information about the NVIDIA Device Plugin for Kubernetes, see https://github.com/NVIDIA/k8s-device-plugin.
- The optional Certificate Manager add-on, also known as cert-manager, adds certificates and certificate issuers to Kubernetes clusters as resource types, and simplifies the process of obtaining, using, and renewing those certificates (see the sketch after this list). For more information, see https://github.com/cert-manager/cert-manager.
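For example, cert-manager lets you declare issuers and certificates as ordinary Kubernetes resources. The following is a minimal sketch of a self-signed Issuer and a Certificate that references it; the namespace, resource names, and DNS name are illustrative placeholders, not values defined by the add-on.

```yaml
# Minimal sketch: a self-signed Issuer and a Certificate issued from it.
# Namespace, names, and DNS name are illustrative placeholders.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: selfsigned-issuer
  namespace: demo
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: demo-cert
  namespace: demo
spec:
  secretName: demo-cert-tls   # cert-manager stores the generated key pair in this Secret
  dnsNames:
    - demo.example.com
  issuerRef:
    name: selfsigned-issuer
    kind: Issuer
```

After these resources are applied, cert-manager creates and renews the key pair in the demo-cert-tls Secret, which workloads can mount or reference for TLS.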
Add-on Prerequisites
Review the following requirements before you install the add-ons.
Oracle Database Operator add-on
- The Database Operator can only be enabled on an existing cluster.
- The Certificate Manager add-on must be installed, enabled, and in the ACTIVE state before you can use the Database Operator add-on. See Install an Add-on for an Existing Cluster.
NVIDIA GPU Plugin add-on
- For both the cluster and the node pool, Kubernetes version 1.29.14 or higher is required.
- Enable the NVIDIA GPU Plugin add-on before creating a GPU node pool.
- Once a node pool is created as either GPU or non-GPU, its type cannot be switched.
- Use the appropriate required base image for your use case, as shown in the following table; a pod sketch follows the table.
| Base Image | Use Case |
|---|---|
| nvcr.io/nvidia/cuda:12.9.0-runtime-ubi9 or equivalent, as a runtime environment | Deploying a prebuilt AI application |
| nvcr.io/nvidia/cuda:12.9.0-devel-ubi9 or equivalent, as a development environment | Developing, compiling, or training AI with native CUDA/C++ code |
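As an illustrative sketch, the following pod runs the runtime base image from the table and requests one GPU, which the NVIDIA Device Plugin exposes as the nvidia.com/gpu resource. The pod name and command are placeholders.

```yaml
# Sketch: run the runtime base image with one GPU.
# The pod name and command are illustrative placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: cuda-runtime-demo
spec:
  restartPolicy: Never
  containers:
    - name: app
      image: nvcr.io/nvidia/cuda:12.9.0-runtime-ubi9
      command: ["nvidia-smi"]   # prints the GPUs visible to the container
      resources:
        limits:
          nvidia.com/gpu: 1     # GPU resource exposed by the NVIDIA Device Plugin
```

Because GPU nodes are tainted by default (see the notes below), a complete pod spec also needs a matching toleration.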
All worker nodes come with a default block volume size of 50 GB; however, GPU applications that use the required base images might run out of disk space. Use a Persistent Volume Claim in GPU applications to provide additional storage, as shown in the sketch below.
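As a minimal sketch, a claim such as the following can be mounted into the application pod. The claim name and requested size are illustrative, and the sketch assumes a default StorageClass is configured on the cluster.

```yaml
# Sketch: a PersistentVolumeClaim for GPU application data.
# Assumes a default StorageClass; the name and size are illustrative.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gpu-app-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
```

Reference the claim from the pod through spec.volumes and a container volumeMount so that large models and datasets land on the volume instead of the 50 GB node volume.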
To avoid unintended scheduling, GPU nodes are tainted by default. Without this taint, any pod could be scheduled onto a GPU node, even if it does not need a GPU. With the taint in place, only pods that explicitly request GPU resources and tolerate the taint are scheduled on GPU nodes.
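The exact taint key and effect are not listed here, so the values in the following pod-spec fragment are assumptions (a common convention is the key nvidia.com/gpu with effect NoSchedule); verify the actual taint on your GPU nodes with kubectl describe node.

```yaml
# Fragment of a pod spec: tolerate the default GPU node taint.
# The key and effect are assumed values; verify them on your GPU nodes.
tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
```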
To ensure that the NVIDIA device plugin pods are scheduled and run only on nodes that have NVIDIA GPU hardware, GPU nodes are labeled by default. The OKE controller labels GPU nodes with the following node label:
"nvidia.com/gpu": "true"