Manage VM Clusters
Learn how to manage your VM clusters on Oracle Exadata Database Service on Cloud@Customer.
- About Managing VM Clusters on Oracle Exadata Database Service on Cloud@Customer
The VM cluster provides a link between your Oracle Exadata Database Service on Cloud@Customer infrastructure and Oracle Databases you deploy.
- Overview of VM Cluster Node Subsetting
VM Cluster Node Subsetting enables you to allocate a subset of database servers to new and existing VM clusters to enable maximum flexibility in the allocation of compute (CPU, memory, local storage) resources.
- Overview of Automatic Diagnostic Collection
By enabling diagnostics collection and notifications, Oracle Cloud Operations and you will be able to identify, investigate, track, and resolve guest VM issues quickly and effectively. Subscribe to Events to get notified about resource state changes.
- Incident Logs and Trace Files
This section lists all of the files that can be collected by Oracle Support if you opt-in for incident logs and trace collection.
- Health Metrics
Review the list of database and non-database health metrics collected by Oracle Trace File Analyzer.
- Introduction to Scale Up or Scale Down Operations
With the Multiple VMs per Exadata system (MultiVM) feature release, you can scale up or scale down your VM cluster resources.
- Using the Console to Manage VM Clusters on Oracle Exadata Database Service on Cloud@Customer
Learn how to use the console to create, edit, and manage your VM Clusters on Oracle Exadata Database Service on Cloud@Customer.
- Using the API to Manage Oracle Exadata Database Service on Cloud@Customer VM Clusters
Review the list of API calls to manage your Oracle Exadata Database Service on Cloud@Customer VM cluster networks and VM clusters.
- Troubleshooting Virtual Machines Using Console Connections
You can troubleshoot malfunctioning virtual machines using console connections. For example, a previously working Guest VM stops responding.
About Managing VM Clusters on Oracle Exadata Database Service on Cloud@Customer
The VM cluster provides a link between your Oracle Exadata Database Service on Cloud@Customer infrastructure and Oracle Databases you deploy.
The VM cluster contains an installation of Oracle Clusterware, which supports databases in the cluster. In the VM cluster definition, you also specify the number of enabled CPU cores, which determines the amount of CPU resources that are available to your databases.
Before you can create any databases on your Exadata Cloud@Customer infrastructure, you must create a VM cluster network, and you must associate it with a VM cluster.
Avoid entering confidential information when assigning descriptions, tags, or friendly names to your cloud resources through the Oracle Cloud Infrastructure Console, API, or CLI.
Overview of VM Cluster Node Subsetting
VM Cluster Node Subsetting enables you to allocate a subset of database servers to new and existing VM clusters to enable maximum flexibility in the allocation of compute (CPU, memory, local storage) resources.
- Create a smaller VM cluster to host databases that have low resource and scalability requirements or to host a smaller number of databases that require isolation from the rest of the workload.
- Expand or shrink an existing VM cluster by adding and removing nodes to ensure optimal utilization of available resources.
- VM Cluster Node Subsetting capability is available for new and existing VM clusters in Gen2 Exadata Cloud@Customer service.
- All VMs across a VM cluster will have the same resource allocation per VM irrespective of whether the VM was created during cluster provisioning or added later by extending an existing VM cluster.
- VM Clusters only need a minimum of 1 VM with node subsetting. However, Oracle recommends a minimum of 2 VMs per VM Cluster to provide high availability.
- Each VM cluster network is pre-provisioned with IP addresses for every DB Server in the infrastructure. One cluster network can only be used by a single VM cluster and is validated to ensure the IP addresses do not overlap with other cluster networks. Adding or removing VMs to the cluster does not impact the pre-provisioned IP addresses assigned to each DB server in the associated cluster network.
For the Maximum number of VMs per DB server and Maximum number of VM Clusters per System, see the System Shape and Configuration Tables. The Maximum number of VM Clusters per System depends on the resources available per DB server and is subject to the per DB Server maximum VM limit.
When a cluster contains a node-subsetted database, the attributed usage and cost feature for pluggable databases will not work because the process of creating node-subsetted databases happens on the backend, and the metadata for node-subsetted databases doesn't get synchronized with the Control Plane Server.
However, if the database was originally created without using node-subsetting and later converted to a node-subsetted database, this issue will not arise since the metadata is already available in the Control Plane.
Overview of Automatic Diagnostic Collection
By enabling diagnostics collection and notifications, Oracle Cloud Operations and you will be able to identify, investigate, track, and resolve guest VM issues quickly and effectively. Subscribe to Events to get notified about resource state changes.
- Enable Diagnostic Events
Allow Oracle to collect and publish critical, warning, error, and information events to you. For more information, see Database Service Events.
- Enable Health Monitoring
Allow Oracle to collect health metrics/events such as Oracle Database up/down, disk space usage, and so on, and share them with Oracle Cloud operations. You will also receive notification of some events. For more information, see Health Metrics.
- Enable Incident Logs and Trace Collection
Allow Oracle to collect incident logs and traces to enable fault diagnosis and issue resolution. For more information, see Incident Logs and Trace Files.
Diagnostics Collection is:
- Enabled: When you choose to collect diagnostics, health metrics, incident logs, and trace files (all three options).
- Disabled: When you choose not to collect diagnostics, health metrics, incident logs, and trace files (all three options).
- Partially Enabled: When you choose to collect only one or two of the three options (diagnostics, health metrics, or incident logs and trace files).
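The three states above follow directly from how many of the three options are selected. A minimal sketch of that mapping, assuming hypothetical flag names (they are illustrative and are not the names used by the Console or API):

```python
def diagnostics_collection_state(diagnostic_events: bool,
                                 health_monitoring: bool,
                                 incident_logs_and_traces: bool) -> str:
    """Derive the overall Diagnostics Collection state from the three opt-in options.

    The parameter names are hypothetical; only the Enabled / Disabled /
    Partially Enabled rule comes from the text above.
    """
    enabled_count = sum([diagnostic_events, health_monitoring, incident_logs_and_traces])
    if enabled_count == 3:
        return "Enabled"
    if enabled_count == 0:
        return "Disabled"
    return "Partially Enabled"


if __name__ == "__main__":
    # Two of three options selected.
    print(diagnostics_collection_state(True, True, False))
```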
Disabling diagnostic events and health monitoring will only stop the collection and notification of data/events from the time you uncheck the checkboxes tied to the options. However, historical data will not be purged from Oracle Cloud Operations data repositories.
Incident Logs and Trace Files
This section lists all of the files that can be collected by Oracle Support if you opt-in for incident logs and trace collection.
- Oracle will create a service request (SR) against the infrastructure Customer Support Identifier (CSI) when an issue is detected and needs customer interaction to resolve.
- The customer's Oracle Cloud Infrastructure tenancy admin email will be used as the CSI contact to create SR and attach logs to it. Ensure tenancy admin is added as a CSI contact in My Oracle Support (MOS).
Oracle Trace File Analyzer (TFA) Component Driven Logs Collections
The directories are generally assigned to a component, and that component can then be used to guide TFA to the files it needs to collect. For example, requesting the CRS component tells TFA to look at the directories mapped to the CRS component and find files that match the required collection time frame.
If you have previously opted in for incident log and trace file collection and decide to opt out when Oracle Cloud operations run a log collection job, then the job will run its course and will not cancel. Future log collections won't happen until you opt-in again to the incident logs and trace file collection option.
TFA is shipped with scripts that run when a particular component is requested. For example, for the CRS component, crscollect.pl will run a number of crsctl commands and gather the input. By default, TFA does not redact collected logs.
Table 5-12 Oracle Trace File Analyzer (TFA) Component Driven Logs Collections
Component | Script | Files/Directories |
---|---|---|
|
|
|
|
|
|
|
No DB Specific Script - runs |
|
Cloud Tool Logs
- Creg files: /var/opt/oracle/creg/*.ini files with masked sensitive info
- Cstate file: /var/opt/oracle/cstate.xml
- Database related tooling logs:
  If dbName specified, /var/opt/oracle/log/<dbName>, else collect logs for all databases /var/opt/oracle/log/
  If dbName specified, /var/opt/oracle/dbaas_acfs/log/<dbName>, else collect logs for all databases /var/opt/oracle/log/<dbName>
- Database env files: If dbName specified, /home/oracle/<dbName>.env, else collect logs for all databases /home/oracle/*.env
- Pilot logs: /home/opc/.pilotBase/logs
- List of log directories:
  /var/opt/oracle/log
  /var/opt/oracle/dbaas_acfs/log
  /var/opt/oracle/dbaas_acfs/dbsystem_details
  /var/opt/oracle/dbaas_acfs/job_manager
  /opt/oracle/dcs/log
DCS Agent Logs
/opt/oracle/dcs/log/
Tooling-Related Grid Infrastructure/Database Logs
- Grid Infrastructure:
GI_HOME/cfgtoollogs
- Database alertlog:
/u02/app/oracle/diag/rdbms/*/*/alert*.log
Health Metrics
Review the list of database and non-database health metrics collected by Oracle Trace File Analyzer.
Oracle may add more metrics in the future, but if you have already chosen to collect metrics, you need not update your opt-in value. It will remain enabled/disabled based on your current preference.
Guest VM Health Metrics List - Database Metrics
Table 5-13 Guest VM Health Metrics List - Database Metrics
Metric Name | Metric Display Name | Unit | Aggregation | Interval | Collection Frequency | Description |
---|---|---|---|---|---|---|
| | CPU Utilization | Percentage | Mean | One minute | Five minutes | The CPU utilization is expressed as a percentage, which is aggregated across all consumer groups. The utilization percentage is reported with respect to the number of CPUs the database is allowed to use, which is two times the number of OCPUs. |
| | Storage Utilization | Percentage | Mean | One hour | One hour | The percentage of provisioned storage capacity currently in use. Represents the total allocated space for all tablespaces. |
| | DB Block Changes | Changes per second | Mean | One minute | Five minutes | The average number of blocks changed per second. |
| | Execute Count | Count | Sum | One minute | Five minutes | The number of user and recursive calls that executed SQL statements during the selected interval. |
| | Current Logons | Count | Sum | One minute | Five minutes | The number of successful logons during the selected interval. |
| | Transaction Count | Count | Sum | One minute | Five minutes | The combined number of user commits and user rollbacks during the selected interval. |
| | User Calls | Count | Sum | One minute | Five minutes | The combined number of logons, parses, and execute calls during the selected interval. |
| | Parse Count | Count | Sum | One minute | Five minutes | The number of hard and soft parses during the selected interval. |
| | Storage Space Used | GB | Max | One hour | One hour | Total amount of storage space used by the database at the collection time. |
| | Storage Space Allocated | GB | Max | One hour | One hour | Total amount of storage space allocated to the database at the collection time. |
| | Storage Space Used By Tablespace | GB | Max | One hour | One hour | Total amount of storage space used by tablespace at the collection time. In the case of container databases, this metric provides root container tablespaces. |
| | Allocated Storage Space By Tablespace | GB | Max | One hour | One hour | Total amount of storage space allocated to the tablespace at the collection time. In the case of container databases, this metric provides root container tablespaces. |
| | Storage Space Utilization By Tablespace | Percentage | Mean | One hour | One hour | This indicates the percentage of storage space utilized by the tablespace at the collection time. In the case of container databases, this metric provides root container tablespaces. |
Guest VM Health Metrics List - Non-Database Metrics
Table 5-14 Guest VM Health Metrics List - Non-Database Metrics
Metric Name | Metric Display Name | Unit | Aggregation | Collection Frequency | Description |
---|---|---|---|---|---|
| | ASM Diskgroup Utilization | Percentage | Max | 10 minutes | Percentage of usable space used in a Disk Group. Usable space is the space available for growth. DATA disk group stores our Oracle database files. RECO disk group contains database files for recovery such as archives and flashback logs. |
| | Filesystem Utilization | Percentage | Max | One minute | Percent utilization of provisioned filesystem. |
| | CPU Utilization | Percentage | Mean | One minute | Percent CPU utilization. |
| | Memory Utilization | Percentage | Mean | One minute | Percentage of memory available for starting new applications, without swapping. The available memory can be obtained via the following command: |
| | Swap Utilization | Percentage | Mean | One minute | Percent utilization of total swap space. |
| | Load Average | Number | Mean | One minute | System load average over 5 minutes. |
| | Node Status | Integer | Mean | One minute | Indicates whether the host is reachable. |
| | OCPU Allocated | Integer | Max | One minute | The number of OCPUs allocated. |
Introduction to Scale Up or Scale Down Operations
With the Multiple VMs per Exadata system (MultiVM) feature release, you can scale up or scale down your VM cluster resources.
- Scaling Up or Scaling Down the VM Cluster Resources
You can scale up or scale down the memory, local disk size (/u02), ASM Storage, and CPUs.
- Resizing Memory and Large Pages
- Calculating the ASM Storage
- Estimating How Much Local Storage You Can Provision to Your VMs
- Scaling Local Storage
Scaling Up or Scaling Down the VM Cluster Resources
You can scale up or scale down the memory, local disk size (/u02), ASM Storage, and CPUs.
Oracle doesn't stop billing when a VM or VM Cluster is stopped. To stop billing for a VM Cluster, lower the OCPU count to zero.
Scaling these resources up or down requires thorough auditing of existing usage and capacity management by the customer DB administrator. Review the existing usage to avoid failures during or after a scale down operation. While scaling up, consider how much of these resources are left for the next VM cluster you are planning to create. Exadata Cloud@Customer Cloud tooling calculates the current usage of memory, local disk, and ASM storage in the VM cluster, adds headroom to it, and arrives at a "minimum" value below which you cannot scale down. The value you specify must be at or above this minimum value.
- When creating or scaling a VM Cluster, setting the number of OCPUs to zero will shut down the VM Cluster and eliminate any billing for that VM Cluster, but the hypervisor will still reserve the minimum 2 OCPUs for each VM. These reserved OCPUs cannot be allocated to any other VMs, even though the VM to which they are allocated is shut down. The Control Plane does not account for reserved OCPUs when showing maximum available OCPU, so you should account for these reserved OCPUs when performing any subsequent scaling operations to ensure the operation can acquire enough OCPUs to complete successfully.
- For memory and /u02 scale up or scale down operations, if the difference between the current value and the new value is less than 2%, then no change will be made to that VM. This is because a memory change involves rebooting the VM, and a /u02 change involves bringing down the Oracle Grid Infrastructure stack and un-mounting /u02. Production customers will not resize for such a small increase or decrease, and hence such requests are a no-op.
- You can scale the VM Cluster resources even if any of the DB servers in the VM Cluster are down:
  - If a DB server is down and scaling is performed, the VMs on that server will not be automatically scaled to the new OCPUs when the DB server and the VMs come back online. It's your responsibility to ensure that all the VMs in the cluster have the same OCPU values.
  - Even if the DB server is down, billing does not stop for the VM Cluster that has the VMs on that DB server.
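Two of the rules above lend themselves to a quick pre-flight check before submitting a scale request. The following is a rough sketch only: it assumes the 2% difference is measured relative to the current value, and that reserved OCPUs are subtracted from the maximum shown by the Control Plane for the same DB server. Both assumptions, and all function and parameter names, are illustrative rather than taken from the service.

```python
NOOP_THRESHOLD = 0.02        # changes smaller than 2% are ignored (from the text)
RESERVED_OCPUS_PER_VM = 2    # still reserved for each VM scaled to zero OCPUs


def is_noop_resize(current: float, requested: float) -> bool:
    """Return True when a memory or /u02 change is too small to be applied.

    Assumption: the difference is taken relative to the current value.
    """
    if current <= 0:
        raise ValueError("current value must be positive")
    return abs(requested - current) / current < NOOP_THRESHOLD


def usable_ocpus(max_available_shown: int, vms_scaled_to_zero: int) -> int:
    """Estimate the OCPUs that can really be acquired on a DB server.

    Assumption: the figure shown by the Control Plane ignores the OCPUs
    reserved for VMs on that server that were scaled to zero.
    """
    reserved = RESERVED_OCPUS_PER_VM * vms_scaled_to_zero
    return max(0, max_available_shown - reserved)


if __name__ == "__main__":
    print(is_noop_resize(current=100, requested=101))                 # 1% change
    print(usable_ocpus(max_available_shown=20, vms_scaled_to_zero=3))
```

A request flagged by is_noop_resize would leave the VM unchanged, so it may be worth combining small adjustments into one larger change.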
Resizing Memory and Large Pages
You can scale the database server memory up and down in a VM Cluster. Scaling memory requires a rolling restart of the database servers to take effect. For memory scaling to succeed, the databases must autostart in the Open state.
Changing the memory in a VM Cluster will affect the large pages (HugePages) settings for the VMs in that cluster. When a VM is initially created, each VM's operating system is configured with 50% of the memory allocated to the VM for large pages, and databases are configured to use that memory for their SGA. Oracle recommends you not modify the large pages configuration unless you understand the implication of any changes you make. Improper configurations can prevent all databases from starting, and even prevent the VM from booting.
Although modifying the large pages configuration is not recommended, it is permitted. However, any changes made may be overridden by cloud automation if the VM's memory is resized later. During a memory resize operation, cloud automation will attempt to maintain the large pages memory as a percentage of total memory, with a maximum limit of 60%. If large pages are configured to use more than 60% of the total memory, cloud automation will automatically resize it to this 60% limit.
- Condition 1: The current HugePages usage, multiplied by 1.15 (15% more than currently used), must be less than the new large pages allocation.
- Condition 2: The current HugePages usage, multiplied by 1.15, must also be less than 60% of the new total memory size.
The current HugePages usage is determined by subtracting the free HugePages from the total current HugePages.
EXACLOUD: Requested memory is insufficient. The new hugepage count is <<>>, which is less than the minimum required for the VM. Not proceeding with the change.
This process ensures there is enough conventional memory for the VM to boot. Before proceeding with the resize, automation performs a precheck to determine the current large pages usage by running database instances. If the precheck indicates that there will not be enough large pages memory after the resize to support the existing databases, the resize will fail, and the process will not continue.
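The two precheck conditions can be approximated ahead of time from values you already know. The sketch below is an assumption-laden illustration, not the automation's actual logic: it assumes the new large pages allocation keeps the current percentage of total memory, capped at 60%, and works in GB rather than in page counts. All names are hypothetical.

```python
HEADROOM = 1.15                 # current usage plus 15%
MAX_HUGEPAGES_FRACTION = 0.60   # cap applied by cloud automation


def hugepages_precheck(total_hugepages_gb: float,
                       free_hugepages_gb: float,
                       current_memory_gb: float,
                       new_memory_gb: float) -> bool:
    """Return True if a memory resize would likely pass both conditions."""
    # Current usage = total HugePages minus free HugePages.
    used_gb = total_hugepages_gb - free_hugepages_gb
    required_gb = used_gb * HEADROOM

    # Assumed: automation keeps the same share of memory, limited to 60%.
    share = min(total_hugepages_gb / current_memory_gb, MAX_HUGEPAGES_FRACTION)
    new_hugepages_gb = share * new_memory_gb

    condition_1 = required_gb < new_hugepages_gb
    condition_2 = required_gb < MAX_HUGEPAGES_FRACTION * new_memory_gb
    return condition_1 and condition_2


if __name__ == "__main__":
    # VM with 120 GB memory, 60 GB large pages (50%), 20 GB of them free.
    print(hugepages_precheck(60, 20, 120, 100))  # shrink to 100 GB
    print(hugepages_precheck(60, 20, 120, 90))   # shrink to 90 GB
```

In this example, 40 GB of large pages are in use, so roughly 46 GB must remain available; shrinking to 90 GB would leave only about 45 GB of large pages and would be expected to fail.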
Calculating the ASM Storage
Use the following formula to calculate the minimum required ASM storage:
- For each disk group, for example, DATA, RECO, note the total size and free size by running the asmcmd lsdg command on any Guest VM of the VM cluster.
- Calculate the used size as (Total size - Free size) / 3 for each disk group. The /3 is used because the disk groups are triple mirrored.
- DATA:RECO ratio is:
  80:20 if Local Backups option was NOT selected in the user interface.
  40:60 if Local Backups option was selected in the user interface.
- Ensure that the new total size as given in the user interface passes the following conditions:
  Used size for DATA * 1.15 <= (New Total size * DATA % )
  Used size for RECO * 1.15 <= (New Total size * RECO % )
Example 5-3 Calculating the ASM Storage
- Run the asmcmd lsdg command in the Guest VM:
  - Without SPARSE:
    /u01/app/19.0.0.0/grid/bin/asmcmd lsdg
    ASMCMD>
    State    Type  Rebal  Sector  Logical_Sector  Block  AU       Total_MB  Free_MB   Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
    MOUNTED  HIGH  N      512     512             4096   4194304  12591936  10426224  1399104          3009040         0              Y             DATAC5/
    MOUNTED  HIGH  N      512     512             4096   4194304  3135456   3036336   348384           895984          0              N             RECOC5/
    ASMCMD>
  - With SPARSE:
    /u01/app/19.0.0.0/grid/bin/asmcmd lsdg
    ASMCMD>
    State    Type  Rebal  Sector  Logical_Sector  Block  AU       Total_MB  Free_MB   Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
    MOUNTED  HIGH  N      512     512             4096   4194304  12591936  10426224  1399104          3009040         0              Y             DATAC5/
    MOUNTED  HIGH  N      512     512             4096   4194304  3135456   3036336   348384           895984          0              N             RECOC5/
    MOUNTED  HIGH  N      512     512             4096   4194304  31354560  31354500  3483840          8959840         0              N             SPRC5/
    ASMCMD>
  Note
  The listed values of all attributes for SPARSE diskgroup (SPRC5) present the virtual size. In Exadata DB Systems and Exadata Cloud@Customer, we use the ratio of 1:10 for physicalSize:virtualSize. Hence, for all purposes of our calculation we must use 1/10th of the values displayed above in case of SPARSE for those attributes.
- Used size for a disk group = (Total_MB - Free_MB) / 3 (results below are converted from MB to GB, 1 GB = 1024 MB)
  - Without SPARSE:
    Used size for DATAC5 = (12591936 - 10426224) / 3 ~= 704.98 GB
    Used size for RECOC5 = (3135456 - 3036336) / 3 ~= 32.27 GB
  - With SPARSE:
    Used size for DATAC5 = (12591936 - 10426224) / 3 ~= 704.98 GB
    Used size for RECOC5 = (3135456 - 3036336) / 3 ~= 32.27 GB
    Used size for SPRC5 = (1/10 * (31354560 - 31354500)) / 3 ~= 0 GB
- Storage distribution among diskgroups
  - Without SPARSE:
    DATA:RECO ratio is 80:20 in this example.
  - With SPARSE:
    DATA:RECO:SPARSE ratio is 60:20:20 in this example.
- New requested size should pass the following conditions:
  - Without SPARSE: (For example, 5 TB in the user interface.)
    5 TB = 5120 GB; 5120 * .8 = 4096 GB; 5120 * .2 = 1024 GB
    For DATA: (704.98 * 1.15) <= 4096 GB
    For RECO: (32.27 * 1.15) <= 1024 GB
  - With SPARSE: (For example, 8 TB in the user interface.)
    8 TB = 8192 GB; 8192 * .6 = 4915 GB; 8192 * .2 = 1638 GB; 8192 * .2 = 1638 GB
    For DATA: (704.98 * 1.15) <= 4915 GB
    For RECO: (32.27 * 1.15) <= 1638 GB
    For SPR: (0 * 1.15) <= 1638 GB
The above resize will go through. If the above conditions are not met by the new size, then the resize will fail the precheck.
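The arithmetic in this example can be reproduced with a short script. This is a minimal sketch of the formula described in this section, using the Total_MB and Free_MB figures from the sample asmcmd lsdg output; the function names are illustrative, and the actual precheck performed by cloud tooling may differ in detail.

```python
MIRROR_COPIES = 3          # disk groups are triple mirrored
HEADROOM = 1.15            # used size plus 15%
MB_PER_GB = 1024
SPARSE_VIRTUAL_RATIO = 10  # physicalSize:virtualSize is 1:10


def used_gb(total_mb: float, free_mb: float, sparse: bool = False) -> float:
    """Used size of a disk group in GB, from asmcmd lsdg Total_MB and Free_MB."""
    used = (total_mb - free_mb) / MIRROR_COPIES / MB_PER_GB
    return used / SPARSE_VIRTUAL_RATIO if sparse else used


def resize_passes(used_by_group: dict, ratio_by_group: dict, new_total_gb: float) -> bool:
    """Check 'used * 1.15 <= new total * share' for every disk group."""
    return all(
        used_by_group[name] * HEADROOM <= new_total_gb * ratio_by_group[name]
        for name in used_by_group
    )


if __name__ == "__main__":
    data = used_gb(12591936, 10426224)
    reco = used_gb(3135456, 3036336)
    sparse = used_gb(31354560, 31354500, sparse=True)

    # Without SPARSE, 80:20 split, 5 TB requested.
    print(resize_passes({"DATA": data, "RECO": reco},
                        {"DATA": 0.8, "RECO": 0.2}, 5 * 1024))

    # With SPARSE, 60:20:20 split, 8 TB requested.
    print(resize_passes({"DATA": data, "RECO": reco, "SPARSE": sparse},
                        {"DATA": 0.6, "RECO": 0.2, "SPARSE": 0.2}, 8 * 1024))
```

Lowering the requested total until resize_passes returns False gives a rough idea of the smallest size the precheck is likely to accept.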
Estimating How Much Local Storage You Can Provision to Your VMs
VM Images include the files necessary to boot and run the VM and its operating system, as well as space for Oracle Homes stored in /u02. To estimate how much additional local storage space beyond the minimum can be allocated to any file system associated with a VM, subtract the size of the VM images for all VMs on a server from the total available space. If you have not modified the default VM Image size by expanding any file systems, use the VM Image size (default and minimum) below. If you have modified or plan to modify your VM Image size, you must use the OCI console and the "Scale VM Cluster" action to check the allocated and available storage for an existing VM Cluster, because expanding some non-/u02 file systems will consume more incremental storage than was added to the file system. This information is also available in the "Configure VM Cluster" action while creating a new VM Cluster.
X8-2 and X7-2 Systems
- Total space available for VM images (X7 All Systems): 1237 GB
- Total space available for VM images (X8 All Systems): 1037 GB
- VM Image size (default and minimum) including /u02: 244 GB
- Default (minimum) /u02: 60 GB
X8M-2 Systems
- Total space available for VM images (X8M Base System): 1237 GB
- Total space available for VM images (X8M Elastic): 2500 GB
- VM Image size (default and minimum) including /u02: 244 GB
- Default (minimum) /u02: 60 GB
X10M and X9M-2 Systems
- Total Available for VM Images (Base System X9M): 1077 GB
- Total Available for VM Images (Elastic): 2243 GB
- VM Image size (default and minimum) including /u02: 244 GB
- Default (minimum) /u02: 60 GB
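As a quick illustration of the subtraction described above, the sketch below estimates the additional local storage left on one DB server when every VM still uses the default 244 GB image. It is a simplification under that assumption; if any file system has been expanded, rely on the figures shown by the "Scale VM Cluster" action instead. The function name is illustrative.

```python
DEFAULT_VM_IMAGE_GB = 244  # default and minimum image size, including 60 GB for /u02


def extra_local_storage_gb(total_for_vm_images_gb: int, vm_count: int) -> int:
    """Space left on a DB server after placing vm_count default-size VM images."""
    remaining = total_for_vm_images_gb - vm_count * DEFAULT_VM_IMAGE_GB
    if remaining < 0:
        raise ValueError("not enough space for that many default-size VM images")
    return remaining


if __name__ == "__main__":
    # Elastic X9M/X10M DB server (2243 GB) hosting four VMs.
    print(extra_local_storage_gb(2243, 4))  # 2243 - 4 * 244 = 1267
```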
Scaling Local Storage
Scale Local Space Operation Guidelines
You can scale local storage by modifying the size of many of the individual file systems in a VM. By default, the file systems are created at their minimum size. You can increase the size of the file systems as required. However, note that you can only shrink /u02. Other file systems can only be increased in size. The maximum supported size of any file system is 900 GB.
The storage consumed by all file systems is greater than the sum of the file system sizes. Refer to the calculations displayed in the OCI console to see the effects on free local storage when resizing a file system.
Using the OCI Console or API, you can increase or decrease the size of the following local file systems:
/u02
Using the OCI Console or API, you can increase the size of the following local file systems:
/
/u01
/tmp
/var
/var/log
/var/log/audit
/home
However, you cannot resize the following local file systems:
/crashfiles
/boot
/acfs01
/u01/app/19.0.0.0/grid
- With the exception of /u02, you can only expand the file systems and cannot reduce their size once they have been expanded.
- A rolling restart of each VM is required for the resizing to take effect.
- Each file system can only be expanded to a maximum of 900 GB.
- Ability to increase the size of additional local file systems is only supported on X8M and later systems.
For more information about resizing these file systems, see Estimating How Much Local Storage You Can Provision to Your VMs.
Resource Limit Based On Current Utilization
- Any scale-down operation must leave a 15% buffer on top of the highest local space utilization across all nodes in the cluster.
- The lowest local space per node allowed is the higher of the above two limits.
- Run the df -kh command on each node to find out the node with the highest local storage usage.
- You can also use a utility like cssh to issue the same command from all hosts in a cluster by typing it just once.
- Lowest value of local storage each node can be scaled down to would be = 1.15x (highest value of local space used among all nodes).
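Once you have the used space reported by df on each node, the 1.15x rule can be applied as follows. This is a small sketch of that single rule; the input values are made up for illustration, and any other limit enforced by the service is not modeled here.

```python
from typing import Sequence

BUFFER_FACTOR = 1.15  # 15% buffer on top of the busiest node


def min_local_space_gb(used_per_node_gb: Sequence[float]) -> float:
    """Lowest local storage value each node can be scaled down to."""
    if not used_per_node_gb:
        raise ValueError("at least one node is required")
    return BUFFER_FACTOR * max(used_per_node_gb)


if __name__ == "__main__":
    # Hypothetical used space (GB) collected from df on a two-node cluster.
    print(round(min_local_space_gb([412.0, 388.5]), 1))
```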
ACFS File Systems
If requested by support, you can also resize the /acfs01 file system. This file system is used by the system to stage software. It uses Exadata storage and is not subject to the limits described above for /u02. It is a shared file system visible from all nodes in the cluster, and can be resized online from the command line of any VM.
- Default size: The default size of /acfs01 is 100 GB.
- Scaling /acfs01: You can scale /acfs01 as user grid from any VM via the /sbin/acfsutil command. No reboot is required. The resize operation will not affect the availability of the database service running in the VM Cluster. The following command issued by the grid user will increase the size of /acfs01 by 100 GB:
  /sbin/acfsutil size +100G /acfs01
- You can create additional ACFS file systems if required. These will also consume storage from the Exadata Storage diskgroups and may be shared across all VMs in the cluster. Refer to the ACFS documentation for more information.
Using the Console to Manage VM Clusters on Oracle Exadata Database Service on Cloud@Customer
Learn how to use the console to create, edit, and manage your VM Clusters on Oracle Exadata Database Service on Cloud@Customer.
- Using the Console to Create a VM Cluster
To create your VM cluster, be prepared to provide values for the fields required for configuring the infrastructure.
- Using the Console to Enable, Partially Enable, or Disable Diagnostics Collection
You can enable, partially enable, or disable diagnostics collection for your Guest VMs after provisioning the VM cluster. Enabling diagnostics collection at the VM cluster level applies the configuration to all the resources such as DB home, Database, and so on under the VM cluster.
- Using the Console to Add VMs to a Provisioned Cluster
To add virtual machines to a provisioned cluster, use this procedure.
- Using the Console to View a List of DB Servers on an Exadata Infrastructure
To view a list of database server hosts on an Oracle Exadata Cloud@Customer system, use this procedure.
- Using the Console to Remove a VM from a VM Cluster
To remove a virtual machine from a provisioned cluster, use this procedure.
- Using the Console to Update the License Type on a VM Cluster
To modify licensing, be prepared to provide values for the fields required for modifying the licensing information.
- Using the Console to Add SSH Keys After Creating a VM Cluster
- Using the Console to Scale the Resources on a VM Cluster
Starting in Oracle Exadata Database Service on Cloud@Customer, you can scale up or down multiple resources at the same time. You can also scale up or down resources one at a time.
- Using the Console to Stop, Start, or Reboot a VM Cluster Virtual Machine
Use the console to stop, start, or reboot a virtual machine.
- Using the Console to Check the Status of a VM Cluster Virtual Machine
Review the health status of a VM cluster virtual machine.
- Using the Console to Move a VM Cluster to Another Compartment
To change the compartment that contains your VM cluster on Oracle Exadata Database Service on Cloud@Customer, use this procedure.
- Using the Console to Terminate a VM Cluster
Before you can terminate a VM cluster, you must first terminate the databases that it contains.
Using the Console to Create a VM Cluster
To create your VM cluster, be prepared to provide values for the fields required for configuring the infrastructure.
To create a VM cluster, ensure that you have:
- An active Exadata infrastructure available to host the VM cluster.
- A validated VM cluster network available for the VM cluster to use.
Related Topics
- Oracle Exadata Database Service on Cloud@Customer Service Description
- Using the Console to Scale the Resources on a VM Cluster
- Introduction to Scale Up or Scale Down Operations
- Estimating How Much Local Storage You Can Provision to Your VMs
- Resource Tags
- Oracle PaaS/IaaS Cloud Service Description documents
- Oracle Platform as a Service and Infrastructure as a Service – Public Cloud Service Descriptions Metered & Non-Metered
- Getting Started with Events
- Overview of Database Service Events
- Overview of Automatic Diagnostic Collection
- Incident Logs and Trace Files
- Health Metrics
- Using the Console to Enable, Partially Enable, or Disable Diagnostics Collection
- Resource Manager and Terraform
Using the Console to Enable, Partially Enable, or Disable Diagnostics Collection
You can enable, partially enable, or disable diagnostics collection for your Guest VMs after provisioning the VM cluster. Enabling diagnostics collection at the VM cluster level applies the configuration to all the resources such as DB home, Database, and so on under the VM cluster.
- You are opting in with the understanding that the list of events, metrics, and log files collected can change in the future. You can opt-out of this feature at any time.
- Oracle may add more metrics in the future, but if you have already chosen to collect metrics, you need not update your opt-in value. It will remain enabled/disabled based on your current preference.
- If you have previously opted in for incident log and trace file collection and decide to opt out when Oracle Cloud operations run a log collection job, then the job will run its course and will not cancel. Future log collections won't happen until you opt-in again to the incident logs and trace file collection option.
Using the Console to Add VMs to a Provisioned Cluster
To add virtual machines to a provisioned cluster, use this procedure.
Once the VM cluster is upgraded to Exadata Database Service Guest VM OS 23.1, you will be able to add a new VM or a new database server to this VM cluster if the Exadata Cloud@Customer Infrastructure is running Exadata System Software version 22.1.16 or later.
The upgrade to Exadata System Software 23.1 for Exadata Cloud@Customer Infrastructure will be available with the February 2023 update cycle.
- The same Guest OS Image version running on the existing provisioned VMs in the cluster is used to provision new VMs added to extend the VM cluster. However, any customizations made to the Guest OS Image on the existing VMs must be manually applied to the newly added VM.
- For VM clusters running a Guest OS Image version older than a year, you must update the Guest OS Image version before adding a VM to extend the cluster.
- Adding a VM to a cluster will not automatically extend any database which is part of a Data Guard configuration (either primary or standby) to the newly provisioned VM.
- For databases not part of a Data Guard configuration, only databases that are running on all VMs in the existing cluster will be added to the newly provisioned VM. Any database running on a subset of VMs will not extend automatically to run on the newly added VM.
To extend the database instance for Data Guard-enabled databases for the newly added VMs, see Nodelist is not Updated for Data Guard-Enabled Databases.
Using the Console to View a List of DB Servers on an Exadata Infrastructure
To view a list of database server hosts on an Oracle Exadata Cloud@Customer system, use this procedure.
Using the Console to Remove a VM from a VM Cluster
To remove a virtual machine from a provisioned cluster, use this procedure.
Terminating a VM from a cluster requires the removal of any database which is part of a Data Guard configuration (either primary or standby) from the VM to proceed with the terminate flow. For more information on manual steps, see My Oracle Support note 2811352.1.
Using the Console to Update the License Type on a VM Cluster
To modify licensing, be prepared to provide values for the fields required for modifying the licensing information.
Using the Console to Scale the Resources on a VM Cluster
Starting in Oracle Exadata Database Service on Cloud@Customer, you can scale up or down multiple resources at the same time. You can also scale up or down resources one at a time.
- Use Case 1: If you have allocated all of the resources to one VM cluster, and if you want to create multiple VM clusters, then there wouldn't be any resources available to allocate to the new clusters. Therefore, scale down the resources as needed to then create additional VM clusters.
- Use Case 2: If you want to allocate different resources based on the workload, then scale down or scale up accordingly. For example, you may want to run nightly batch jobs for reporting/ETL and scale down the VM once the job is over.
- OCPU
- Memory
- Local storage
- Exadata storage
Each scaling operation can take several minutes to complete. The time for each operation will vary based on activity in the system, but as a general rule, most operations should complete within 15 minutes for a quarter rack, 20 minutes for a half rack, and 30 minutes for a full or larger rack. Performing multiple OCPU scaling operations over a short period of time can lengthen the time for completion. Although online, OCPU scaling is not implemented on all VMs in parallel so as to detect and protect from any anomalies before they affect the entire system. Memory and Local Storage scaling require a VM reboot, and are performed one VM at a time in a rolling manner.
If you run multiple scale down operations, then each operation is performed serially. For example, if you scale memory and local storage from the Console, then the system will first scale memory, and when that operation completes, it will scale storage. The time to complete all operations will be the sum of the time to complete individual operations.
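Because serial operations add up, a rough total can be estimated by summing the per-operation times. The following Python sketch is illustrative only: the function name and the 15-minute figures are assumptions chosen for the example, not values published for any specific system.

```python
from typing import Mapping


def total_scaling_minutes(operation_minutes: Mapping[str, float]) -> float:
    """Estimate the total time for operations that run one after another."""
    # Serial execution means the durations are added, not overlapped.
    return sum(operation_minutes.values())


# Hypothetical per-operation estimates for one request that scales two resources.
estimates = {"memory": 15, "local_storage": 15}
print(total_scaling_minutes(estimates))
```

Treat the result as a planning figure; actual durations vary with system activity.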
Using the Console to Stop, Start, or Reboot a VM Cluster Virtual Machine
Use the console to stop, start, or reboot a virtual machine.
Using the Console to Check the Status of a VM Cluster Virtual Machine
Review the health status of a VM cluster virtual machine.
Using the Console to Move a VM Cluster to Another Compartment
To change the compartment that contains your VM cluster on Oracle Exadata Database Service on Cloud@Customer, use this procedure.
When you move a VM cluster, the compartment change is also applied to the virtual machines and databases that are associated with the VM cluster. However, the compartment change does not affect any other associated resources, such as the Exadata infrastructure, which remains in its current compartment.
Using the API to Manage Oracle Exadata Database Service on Cloud@Customer VM Clusters
Review the list of API calls to manage your Oracle Exadata Database Service on Cloud@Customer VM cluster networks and VM clusters.
For information about using the API and signing requests, see "REST APIs" and "Security Credentials". For information about SDKs, see "Software Development Kits and Command Line Interface".
Use these API operations to manage Oracle Exadata Database Service on Cloud@Customer VM cluster networks and VM clusters:
GenerateRecommendedVmClusterNetwork
CreateVmClusterNetwork
DeleteVmClusterNetwork
GetVmClusterNetwork
ListVmClusterNetworks
UpdateVmClusterNetwork
ValidateVmClusterNetwork
CreateVmCluster
DeleteVmCluster
GetVmCluster
ListVmClusters
UpdateVmCluster
For the complete list of APIs, see "Database Service API".
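If you call these operations through an SDK rather than raw REST, the operation names usually appear in the SDK's own naming style. The sketch below assumes the snake_case convention of the OCI Python SDK (for example, ListVmClusters exposed as list_vm_clusters). The helper name is hypothetical, and each resulting method name should be checked against the SDK reference before use.

```python
import re

# Operation names exactly as listed in this topic.
VM_CLUSTER_OPERATIONS = [
    "GenerateRecommendedVmClusterNetwork",
    "CreateVmClusterNetwork",
    "DeleteVmClusterNetwork",
    "GetVmClusterNetwork",
    "ListVmClusterNetworks",
    "UpdateVmClusterNetwork",
    "ValidateVmClusterNetwork",
    "CreateVmCluster",
    "DeleteVmCluster",
    "GetVmCluster",
    "ListVmClusters",
    "UpdateVmCluster",
]


def to_sdk_method_name(operation: str) -> str:
    """Convert a CamelCase operation name to snake_case."""
    # Insert an underscore before every capital letter except the first one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", operation).lower()


for operation in VM_CLUSTER_OPERATIONS:
    print(f"{operation} -> {to_sdk_method_name(operation)}")
```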
Parent topic: Manage VM Clusters
Troubleshooting Virtual Machines Using Console Connections
You can troubleshoot malfunctioning virtual machines using console connections. For example, a previously working Guest VM stops responding.
The serial console feature requires Exadata Infrastructure version 22.1.10 or higher for 22.X users and version 23.1.1 or higher for 23.X users. The feature is available immediately on newly created VM clusters, but on previously existing VM clusters it becomes available only after the next quarterly maintenance cycle. Also, review all prerequisites stated below, including setting a password for either the opc or the root user. If you do not meet these requirements in advance, you will be unable to connect to the serial console urgently when the VM is not otherwise accessible.
To connect to a running instance for administration and general use, use Secure Shell (SSH). For more information, see Connecting to a Virtual Machine with SSH.
- Ensure that you have the correct permissions.
- Complete the prerequisites, including creating your SSH key pair (in case you don't have one yet).
- Create the Virtual Machine Serial Console.
- Connect to the serial console via SSH.
- Open the navigation menu. Under Oracle Database, click Exadata Database Service on Cloud@Customer.
- Under Region, select the region that you want to associate with the Oracle Exadata infrastructure.
- Under Infrastructure, click Exadata Infrastructure.
- Click the name of the infrastructure that you are interested in.
- In the resulting Infrastructure Details page, go to the Version section to find the DB Server version installed.
- Required IAM Policies
An administrator must grant you secure access to the virtual machine console on the Exadata Database Service on Cloud@Customer system through an IAM policy.
- Prerequisites
You must install an SSH client and create SSH key pairs.
- Create the Virtual Machine Serial Console Connection
- Make an SSH Connection to the Serial Console
- Using Cloud Shell to Connect to the Serial Console
- Displaying the Console History for a Virtual Machine
- Troubleshooting Virtual Machines from Guest VM Console Connections on Linux Operating Systems
- Exiting the Virtual Machine Serial Console Connection
Parent topic: Manage VM Clusters
Required IAM Policies
An administrator must grant you secure access to the virtual machine console on the Exadata Database Service on Cloud@Customer system through an IAM policy.
This access is required whether you're using the Console or the REST API with an SDK, CLI, or other tools. If you get a message that you don’t have permission or are unauthorized, verify with your administrator what type of access you have and which compartment to work in.
To create virtual machine console connections, an administrator needs to grant users access to read and manage virtual machine console connections through an IAM policy. The resource name for virtual machine console connections is dbnode-console-connection. The resource name for virtual machines is db-nodes. The following policies grant users the ability to create virtual machine console connections:
Allow group <group_name> to manage dbnode-console-connection in tenancy
Allow group <group_name> to read db-nodes in tenancy
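When several groups need the same access, the two statements above can be generated from a template. This is a minimal Python sketch: the function name and the example group name are hypothetical, and the statement text is copied from the policies shown above.

```python
from typing import List

# Statement templates taken from the two policies listed in this topic.
POLICY_TEMPLATES = (
    "Allow group {group} to manage dbnode-console-connection in tenancy",
    "Allow group {group} to read db-nodes in tenancy",
)


def console_connection_policies(group_name: str) -> List[str]:
    """Return the policy statements for one IAM group."""
    group = group_name.strip()
    if not group:
        raise ValueError("group_name must not be empty")
    return [template.format(group=group) for template in POLICY_TEMPLATES]


# "ConsoleOperators" is a made-up group name used only for illustration.
for statement in console_connection_policies("ConsoleOperators"):
    print(statement)
```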
Prerequisites
You must install an SSH client and create SSH key pairs.
Ports to Open for Control Plane Connectivity
Ensure that the firewall rules are correct so that the Control Plane Server (CPS) can reach the required OCI endpoints. For more information, see Table 3-2.
Parent topic: Prerequisites
Install an SSH Client and a Command-line Shell (Microsoft Windows)
Microsoft Windows does not include an SSH client by default. If you are connecting from a Windows client, you need to install an SSH client. You can use PuTTY plink.exe with Windows PowerShell or software that includes a version of OpenSSH such as:
The instructions in this topic frequently use PuTTY and Windows PowerShell.
If you want to make the console connection from Windows with Windows PowerShell, PowerShell might already be installed on your Windows operating system. If not, follow the steps at the link. If you are connecting to the instance from a Windows client using PowerShell, plink.exe is required. plink.exe is the command-line connection tool included with PuTTY. You can install PuTTY or install plink.exe separately. For installation information, see http://www.putty.org.
Parent topic: Prerequisites
Create SSH Key Pairs
To create the secure console connection, you need an SSH key pair. The method to use for creating key pairs depends on your operating system. When connecting to the serial console, you must use an RSA key. The instructions in this section show how to create an RSA SSH key pair.
Parent topic: Prerequisites
Create the SSH Key Pair for Linux
If you're using a UNIX-style system, you probably already have the ssh-keygen utility installed. To determine whether the utility is installed, type ssh-keygen on the command line. If the utility isn't installed, you can download OpenSSH for UNIX from http://www.openssh.com/portable.html and install it.
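As an illustration of the key creation step, the following Python sketch builds and optionally runs an ssh-keygen command for an RSA key. The function names, the 4096-bit key size, the comment string, and the example file path are assumptions made for this example; the only requirement stated in this topic is that the key type is RSA.

```python
import shutil
import subprocess
from typing import List


def ssh_keygen_command(key_path: str, bits: int = 4096,
                       comment: str = "serial-console") -> List[str]:
    """Build an ssh-keygen argument list for an RSA key pair."""
    # -t key type, -b key size, -C comment, -f output file for the private key.
    return ["ssh-keygen", "-t", "rsa", "-b", str(bits), "-C", comment, "-f", key_path]


def create_rsa_key_pair(key_path: str) -> None:
    """Run ssh-keygen; it prompts for a passphrase interactively."""
    if shutil.which("ssh-keygen") is None:
        raise FileNotFoundError("ssh-keygen not found; install OpenSSH first")
    subprocess.run(ssh_keygen_command(key_path), check=True)


# Print the command without running it (the path is a hypothetical example).
print(" ".join(ssh_keygen_command("/home/user/.ssh/console_key")))
```

ssh-keygen writes the private key to the given path and the public key to the same path with a .pub suffix.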
Parent topic: Create SSH Key Pairs
Create the SSH Key Pair for Windows Using PuTTY
If you are using a Windows client to connect to the instance console connection, use an SSH key pair generated by PuTTY.
Parent topic: Create SSH Key Pairs
To create a connection using the SSH key pair generated using PuTTY
Do the following on the Create serial console access window:
- Paste the SSH Key generated from OpenSSH format or choose Upload SSH key file and provide the path of the public key saved at step 8 in Create the SSH Key Pair for Windows Using PuTTY.
- Once the connection is Active, click Copy serial console connection for Windows.
- Paste the connection string copied from the previous step into a text file.
- In the text file, replace <PATH_FILE_PUTTY_PRIVATE.ppk> with the path to your PuTTY Private Key (PPK) file on your computer, for example, $HOME\Documents\mykey.ppk.
- Paste the modified connection string into the PowerShell window, and then press Enter to connect to the console.
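The placeholder substitution in the steps above can also be scripted. In this minimal Python sketch, the function name is hypothetical and the sample connection string is a shortened stand-in, not the real string copied from the Console; only the placeholder text and the example key path come from this topic.

```python
PLACEHOLDER = "<PATH_FILE_PUTTY_PRIVATE.ppk>"


def fill_ppk_path(connection_string: str, ppk_path: str) -> str:
    """Replace the PPK placeholder in a copied Windows connection string."""
    if PLACEHOLDER not in connection_string:
        raise ValueError("placeholder not found; copy the Windows connection string again")
    return connection_string.replace(PLACEHOLDER, ppk_path)


# Shortened, made-up stand-in for the string copied from the Console.
copied = "plink.exe -i <PATH_FILE_PUTTY_PRIVATE.ppk> (rest of copied string)"
print(fill_ppk_path(copied, r"$HOME\Documents\mykey.ppk"))
```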
Parent topic: Create the SSH Key Pair for Windows Using PuTTY
Sign in to a Virtual Machine From the Serial Console
If you want to sign in to a virtual machine using a virtual machine console connection, you can use a Secure Shell (SSH) connection. If you want to sign in with a username and password, you need a user account with a password. Oracle Exadata Cloud does not set a default password for the opc or root users. Therefore, if you want to sign in as the opc or root user, you need to create a password for that user. Otherwise, add a different user with a password and sign in as that user. Complete this step in advance, before a situation arises that requires you to log in to the serial console.
Parent topic: Prerequisites
Connect Through Firewalls
If the client you will use to access the serial console is behind a firewall, you must ensure that this client can reach the required endpoint to access the serial console of the virtual machine. The client system connecting to the serial console must be able to reach the serial console server (for example, vm-console.exacc.us-ashburn-1.oci.oraclecloud.com) over SSH using port 443, directly or through a proxy.
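A quick way to check the firewall requirement is to attempt a TCP connection from the client to the serial console server on port 443. The Python sketch below is a minimal check under stated assumptions: the function name is hypothetical, it tests only a direct connection (it does not go through a proxy), and a successful TCP connection does not by itself prove that SSH authentication will work.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a direct TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example (opens a real network connection when uncommented):
# print(can_reach("vm-console.exacc.us-ashburn-1.oci.oraclecloud.com"))
```

Replace the host name with the serial console server for your own region.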
Parent topic: Prerequisites
Create the Virtual Machine Serial Console Connection
Before you can make a local connection to the serial console, you need to create the virtual machine console connection.
Virtual machine console connections are limited to one client at a time. If the client fails, the connection remains active for approximately five minutes. During this time, no other client can connect. After five minutes, the connection is closed, and a new client can connect. During the five-minute timeout, any attempt to connect a new client fails with the following message:
channel 0: open failed: administratively prohibited: console access is limited to one connection at a time
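A script that opens console connections can recognize this message and wait before retrying. The following Python sketch is an assumption-laden illustration: the function names are hypothetical, the message text is taken from the error shown above, and the 300-second wait reflects the approximate five-minute timeout described in this topic.

```python
BUSY_MESSAGE = "console access is limited to one connection at a time"
STALE_CONNECTION_SECONDS = 5 * 60  # approximate timeout described above


def is_console_busy(ssh_stderr: str) -> bool:
    """Return True if the SSH error output reports the one-client limit."""
    return BUSY_MESSAGE in ssh_stderr


def seconds_before_retry(ssh_stderr: str) -> int:
    """Suggest how long to wait before a new connection attempt."""
    return STALE_CONNECTION_SECONDS if is_console_busy(ssh_stderr) else 0


error_text = ("channel 0: open failed: administratively prohibited: "
              "console access is limited to one connection at a time")
print(seconds_before_retry(error_text))
```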
Make an SSH Connection to the Serial Console
After you create the console connection for the virtual machine, you can connect to the serial console using a Secure Shell (SSH) connection. When making an SSH connection to the serial console, you must use an RSA key. You can use the same SSH key for the serial console that was used when you launched the instance, or you can use a different SSH key.
When you are finished with the serial console and have terminated the SSH connection, you should delete the serial console connection. If you do not disconnect from the session, Oracle Cloud Infrastructure terminates the serial console session after 24 hours and you must reauthenticate to connect again.
Validate Server Host Keys
When you first connect to the serial console, you're prompted to validate the fingerprint of the server host key. The fingerprint of the server host key is the SHA256 hash of the server host's public SSH key. The server SSH handshake response is signed with the associated private key. Validating the server host key's fingerprint protects against potential attacks.
When you make a manual connection to the serial console, the fingerprint of the server host key is not automatically validated. To manually validate the fingerprint, compare the fingerprint value displayed in the Oracle Cloud Infrastructure Console to the value of the RSA key fingerprint that appears in the terminal when you connect.
To find the fingerprint of the server host key in the Console, on the Virtual Machine details page, under Resources, click Console connection. The table displays the fingerprint of the server host key. The fingerprint in the Console should match the value of the RSA key fingerprint shown in the terminal when you connect to the serial console.
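To make the comparison concrete, the Python sketch below computes a SHA256 fingerprint from a public key line in OpenSSH format and compares two fingerprint strings. It assumes the OpenSSH display convention ("SHA256:" followed by unpadded Base64) and that the value shown in the Console uses the same format; the function names are hypothetical.

```python
import base64
import hashlib


def openssh_sha256_fingerprint(public_key_line: str) -> str:
    """Compute the fingerprint of '<type> <base64-blob> [comment]'."""
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<type> <base64-blob> [comment]'")
    key_blob = base64.b64decode(fields[1])
    digest = hashlib.sha256(key_blob).digest()
    # OpenSSH prints the digest as Base64 without trailing '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def fingerprints_match(console_value: str, terminal_value: str) -> bool:
    """Compare two fingerprint strings, ignoring surrounding whitespace."""
    return console_value.strip() == terminal_value.strip()
```

If the two values differ, do not accept the host key until you have confirmed the correct fingerprint.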
The server host keys are periodically rotated for security purposes. Key rotation reduces the risk posed when keys are compromised by limiting the amount of data encrypted or signed by one key version. When your key is rotated and you try to connect to the serial console, a warning appears indicating a potential attack. The warning includes a Host key verification failed error and a line number in your .ssh/known_hosts file. Delete that line in your .ssh/known_hosts file and then reconnect to the serial console. You are then prompted to accept a new server host key fingerprint.
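Deleting the reported line can be done in any text editor. As an alternative, this minimal Python sketch removes a single line by its 1-based number. The function name is hypothetical and the file path is supplied by the caller; consider keeping a backup copy of the file before you change it.

```python
from pathlib import Path


def remove_known_hosts_line(known_hosts: Path, line_number: int) -> str:
    """Delete one line (1-based, as reported in the warning) and return it."""
    lines = known_hosts.read_text().splitlines(keepends=True)
    if not 1 <= line_number <= len(lines):
        raise ValueError(f"line {line_number} is outside 1..{len(lines)}")
    removed = lines.pop(line_number - 1)
    known_hosts.write_text("".join(lines))
    return removed


# Example (modifies your real file when uncommented):
# remove_known_hosts_line(Path.home() / ".ssh" / "known_hosts", 12)
```

OpenSSH also provides ssh-keygen -R followed by the host name, which removes all keys recorded for that host.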
Parent topic: Make an SSH Connection to the Serial Console
Connect from Mac OS X and Linux Operating Systems
Use an SSH client to connect to the serial console. Mac OS X and most Linux and UNIX-like operating systems include the SSH client OpenSSH by default.
To connect to the serial console using OpenSSH on Mac OS X or Linux
Connect from Windows Operating Systems
The steps to connect to the serial console from Microsoft Windows PowerShell are different from the steps for OpenSSH. The following steps do not work in the Windows terminal.
If you are connecting to the instance from a Windows client using PowerShell, plink.exe is required. plink.exe is the command-line connection tool included with PuTTY. You can install PuTTY or install plink.exe separately. For more information, see Installing an SSH Client and a Command-line Shell (Windows).
To connect to the serial console on Microsoft Windows
Parent topic: Make an SSH Connection to the Serial Console
Using Cloud Shell to Connect to the Serial Console
You can connect to the serial console quickly and easily using the Cloud Shell integration. Cloud Shell is a web browser-based terminal accessible from the Console. The Cloud Shell integration automatically creates the instance console connection and a temporary SSH key. The only prerequisite for connecting to the serial console from Cloud Shell is granting users the correct permissions. For an introductory walkthrough of using Cloud Shell, see Using Cloud Shell.
- By default, Cloud Shell limits network access to OCI internal resources in your tenancy home region only unless you have enabled the Cloud Shell managed Public Network. Your administrator must configure an Identity policy to enable Cloud Shell Public Network. For more information, see Cloud Shell Networking.
- You cannot concurrently connect to more than one DB node using Cloud Shell. As an example, if you have an open connection to DBnode1 and want to connect to DBnode2, you must first exit the active Cloud Shell from DBnode1 and then establish a connection to DBnode2.
- Ensure that the firewall rules are correct so that the Control Plane Server (CPS) can reach the required OCI endpoints. For more information, see Table 3-2.
When you are finished with the serial console and have terminated the SSH connection, you should delete the serial console connection. If you do not disconnect from the session, Oracle Cloud Infrastructure terminates the serial console session after 24 hours and you must re-authenticate to connect again.
To connect to the serial console using Cloud Shell
Parent topic: Using Cloud Shell to Connect to the Serial Console
Displaying the Console History for a Virtual Machine
To access the serial console and to use console history, firewall rules must be configured so that the Control Plane Server (CPS) can access the necessary OCI endpoints. Please review Table 3-2 details for Object Storage and VM console connectivity requirements.
You can capture and display recent serial console data for a Virtual Machine. The data includes configuration messages that occur when the Virtual Machine boots, such as kernel and BIOS messages, and is useful for checking the status of the Virtual Machine or diagnosing and troubleshooting problems.
The console history captures up to a megabyte of the most recent serial console data for the specified Virtual Machine. Note that the raw console data, including multi-byte characters, is captured.
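Because the capture is raw and size-limited, downloaded data may begin or end in the middle of a multi-byte character. The Python sketch below shows one tolerant way to read such data. It assumes that "a megabyte" means 1,048,576 bytes and that the content is UTF-8; both are assumptions for this example, and the function name is hypothetical.

```python
ASSUMED_LIMIT_BYTES = 1024 * 1024  # assumption: one megabyte = 1,048,576 bytes


def most_recent_console_text(raw: bytes, limit: int = ASSUMED_LIMIT_BYTES) -> str:
    """Keep the newest bytes up to the limit and decode them tolerantly."""
    tail = raw[-limit:] if limit > 0 else b""
    # errors="replace" substitutes U+FFFD for bytes cut off mid-character.
    return tail.decode("utf-8", errors="replace")


print(most_recent_console_text(b"kernel: boot complete", limit=13))
```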
The console history is a point-in-time record. To troubleshoot a malfunctioning Virtual Machine using an interactive console connection, use a serial console connection.
Managing Console History Data
You can use the Console or API to manage console history captures. Console history lets you see serial output from your Virtual Machine without having to connect to the instance remotely. The console history can be used to audit previous access and actions taken with the serial console.
On the instance details page in the Console, you can capture and download console histories, view and edit metadata details, and delete console history captures.
- Using the Console to Capture the Console History
- Using the Console to Download Console History Captures
- Using the Console to View Console History Captures
- Using the Console to View and Edit the Metadata Details of a Console History Capture
- Using the Console to Delete Console History Captures
- Using the API to Manage the Console History Data
Review the list of API calls to manage console history data.
Parent topic: Displaying the Console History for a Virtual Machine
Using the Console to View and Edit the Metadata Details of a Console History Capture
Parent topic: Managing Console History Data
Using the API to Manage the Console History Data
Review the list of API calls to manage console history data.
For information about using the API and signing requests, see REST APIs and Security Credentials. For information about SDKs, see Software Development Kits and Command Line Interface.
For the complete list of APIs, see Database Service API.
Use the following API operations to manage the console history data.
- To capture the console history, use the createDbNodeConsoleHistory method.
- To get details of console history metadata, use the getDbNodeConsoleHistory method.
- To get the details of console history content, use the getDbNodeConsoleHistoryContent method.
- To edit console history metadata, use the updateDbNodeConsoleHistory method.
- To list console history captures, use the listDbNodeConsoleHistories method.
- To delete console history captures, use the deleteDbNodeConsoleHistory method.
Parent topic: Managing Console History Data
Troubleshooting Virtual Machines from Guest VM Console Connections on Linux Operating Systems
After you are connected with an instance console connection, you can perform various tasks, such as:
- Edit system configuration files.
- Add or reset the SSH keys for the opc user.
- Reset the password for the opc user.
These tasks require you to boot into a Bash shell in maintenance mode.
To boot into maintenance mode
Default user and password:
- Account: Grub boot loader
- Username: root
- Default Password: sos1Exadata
- Account Type: Operating system user
For more information, see Default User Accounts for Oracle Exadata.
Exiting the Virtual Machine Serial Console Connection
To exit the serial console connection
When using SSH, the ~ character at the beginning of a new line is used as an escape character.
Parent topic: Exiting the Virtual Machine Serial Console Connection
To delete the serial console connection for a Virtual Machine
Parent topic: Exiting the Virtual Machine Serial Console Connection