Managing OKE Cluster Add-ons
On Private Cloud Appliance, cluster add-ons are components that you can choose to deploy on a Kubernetes cluster. Cluster add-ons extend core Kubernetes functionality and improve cluster manageability and performance. This section describes how to install and manage the following supported add-ons.
- The WebLogic Kubernetes Operator add-on supports running WebLogic Server and Fusion Middleware domains on Kubernetes. For more information about the WebLogic Kubernetes Operator, see the public documentation at https://github.com/oracle/weblogic-kubernetes-operator.
- The Database Operator for Kubernetes add-on (OraOperator) helps developers, database administrators, DevOps, and GitOps teams reduce the time and complexity of deploying and managing Oracle databases. For more information, see the public documentation at https://github.com/oracle/oracle-database-operator/tree/main.
- The NVIDIA GPU Plugin add-on is a convenient way to manage the NVIDIA Device Plugin for Kubernetes. The NVIDIA Device Plugin for Kubernetes is the NVIDIA implementation of the Kubernetes device plug-in framework for exposing the number of NVIDIA GPUs on each worker node, and tracking the health of those GPUs. For more information about the NVIDIA Device Plugin for Kubernetes, see https://github.com/NVIDIA/k8s-device-plugin.
- The optional Certificate Manager add-on, also known as cert-manager, adds certificates and certificate issuers to Kubernetes clusters as resource types, and simplifies the process of obtaining, using, and renewing those certificates (see the sketch after this list). For more information, see https://github.com/cert-manager/cert-manager.
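For example, cert-manager lets you declare issuers and certificates as ordinary Kubernetes resources. The following is a minimal sketch of a self-signed Issuer and a Certificate that references it; the namespace, resource names, and DNS name are illustrative placeholders, not values defined by the add-on.

```yaml
# Minimal sketch: a self-signed Issuer and a Certificate issued from it.
# Namespace, names, and DNS name are illustrative placeholders.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: selfsigned-issuer
  namespace: demo
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: demo-cert
  namespace: demo
spec:
  secretName: demo-cert-tls   # cert-manager stores the generated key pair in this Secret
  dnsNames:
    - demo.example.com
  issuerRef:
    name: selfsigned-issuer
    kind: Issuer
```

After these resources are applied, cert-manager creates and renews the key pair in the demo-cert-tls Secret, which workloads can mount or reference for TLS.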
Add-on Prerequisites
Review the following requirements before you install the add-ons.
Oracle Database Operator add-on
- The Database Operator can only be enabled on an existing cluster.
- The Certificate Manager add-on must be installed, enabled, and in the ACTIVE state before you can use the Database Operator add-on. See Install an Add-on for an Existing Cluster.
NVIDIA GPU Plugin add-on
- For both the cluster and the node pool, Kubernetes version 1.29.14 or higher is required.
- Enable the NVIDIA GPU Plugin add-on before creating a GPU node pool.
- Once a node pool is created as either GPU or non-GPU, its type cannot be switched.
- Use the appropriate required base image for your use case, as shown in the following table; a pod sketch follows the table.
| Base Image | Use Case |
|---|---|
| nvcr.io/nvidia/cuda:12.9.0-runtime-ubi9 or equivalent, as a runtime environment | Deploying a prebuilt AI application |
| nvcr.io/nvidia/cuda:12.9.0-devel-ubi9 or equivalent, as a development environment | Developing, compiling, or training AI with native CUDA/C++ code |
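As an illustrative sketch, the following pod runs the runtime base image from the table and requests one GPU, which the NVIDIA Device Plugin exposes as the nvidia.com/gpu resource. The pod name and command are placeholders.

```yaml
# Sketch: run the runtime base image with one GPU.
# The pod name and command are illustrative placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: cuda-runtime-demo
spec:
  restartPolicy: Never
  containers:
    - name: app
      image: nvcr.io/nvidia/cuda:12.9.0-runtime-ubi9
      command: ["nvidia-smi"]   # prints the GPUs visible to the container
      resources:
        limits:
          nvidia.com/gpu: 1     # GPU resource exposed by the NVIDIA Device Plugin
```

Because GPU nodes are tainted by default (see the notes below), a complete pod spec also needs a matching toleration.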
All worker nodes come with a default block volume size of 50 GB; however, GPU applications that use the required base images might run out of disk space. Use a Persistent Volume Claim in GPU applications to provide additional storage, as shown in the sketch below.
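As a minimal sketch, a claim such as the following can be mounted into the application pod. The claim name and requested size are illustrative, and the sketch assumes a default StorageClass is configured on the cluster.

```yaml
# Sketch: a PersistentVolumeClaim for GPU application data.
# Assumes a default StorageClass; the name and size are illustrative.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gpu-app-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
```

Reference the claim from the pod through spec.volumes and a container volumeMount so that large models and datasets land on the volume instead of the 50 GB node volume.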
To avoid unintended scheduling, GPU nodes are tainted by default. Without this taint, any pod could be scheduled onto a GPU node, even if it does not need a GPU. With the taint in place, only pods that explicitly request GPU resources and tolerate the taint are scheduled on GPU nodes.
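The exact taint key and effect are not listed here, so the values in the following pod-spec fragment are assumptions (a common convention is the key nvidia.com/gpu with effect NoSchedule); verify the actual taint on your GPU nodes with kubectl describe node.

```yaml
# Fragment of a pod spec: tolerate the default GPU node taint.
# The key and effect are assumed values; verify them on your GPU nodes.
tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
```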
To ensure that the NVIDIA device plugin pods are scheduled and run only on nodes that have NVIDIA GPU hardware, GPU nodes are labeled by default. The OKE controller labels GPU nodes with the following node label:
"nvidia.com/gpu": "true"