Monitor Metrics for VM Cluster Resources
You can monitor the health, capacity, and performance of your VM clusters and databases with metrics, alarms, and notifications. You can use the Oracle Cloud Infrastructure Console, Monitoring APIs, or Database Management APIs to view metrics.
Note: To view metrics you must have the required access as specified in an Oracle Cloud Infrastructure policy (whether you're using the Console, the REST API, or another tool). See Getting Started with Policies for information on policies.
WARNING: Metrics, events, and audit events will not be sent if Cluster Ready Services (CRS) is not running before Autonomous Health Framework (AHF) starts.
- Prerequisites for Using Metrics
- View Metrics for VM Cluster
- View Metrics for a Database
- View Metrics for VM Clusters in a Compartment
- View Metrics for Databases in a Compartment
- Manage Oracle Trace File Analyzer
- Manage Database Service Agent
Parent topic: Reference Guides for Exadata Cloud Infrastructure
Prerequisites for Using Metrics
The following prerequisites must be met for metrics to flow out of the VM cluster.
- Metrics on the VM clusters depend on the Oracle Trace File Analyzer (TFA) agent. Ensure that this component is up and running. AHF version 22.2.4 or higher is required for capturing metrics from the VM clusters. To start, stop, or check the status of TFA, see Manage Oracle Trace File Analyzer.
- To view the metrics on the Oracle Cloud Infrastructure Console, the TFA flag defaultocimonitoring must be set to ON. This flag is set to ON by default and you need not perform any action to set it. If you are not seeing metrics on the Console, then as the root user on the guest VM, check if the flag is set to ON.
    tfactl get defaultocimonitoring
    .---------------------------------------------------------------------.
    | <host name>                                                         |
    +-------------------------------------------------------------+-------+
    | Configuration Parameter                                     | Value |
    +-------------------------------------------------------------+-------+
    | Send CEF metrics to OCI Monitoring ( defaultOciMonitoring ) | ON    |
    '-------------------------------------------------------------+-------'
If the defaultocimonitoring flag is set to OFF, then run the tfactl set defaultocimonitoring=on or tfactl set defaultocimonitoring=ON command to turn it on:
    tfactl set defaultocimonitoring=on
    Successfully set defaultOciMonitoring=ON
    .---------------------------------------------------------------------.
    | <host name>                                                         |
    +-------------------------------------------------------------+-------+
    | Configuration Parameter                                     | Value |
    +-------------------------------------------------------------+-------+
    | Send CEF metrics to OCI Monitoring ( defaultOciMonitoring ) | ON    |
    '-------------------------------------------------------------+-------'
- The following network configurations are required.
- Egress rules for outgoing traffic: The default egress rules are sufficient to enable the required network path. For more information, see Default Security List. If you have blocked outgoing traffic by modifying the default egress rules on your Virtual Cloud Network (VCN), you must revert the settings to allow outgoing traffic. The default egress rule allowing outgoing traffic (as shown in Rules Required for Both Client and Backup Networks) is as follows:
- Stateless: No (all rules must be stateful)
- Destination Type: CIDR
- Destination CIDR: All <region> Services in Oracle Services Network
- IP Protocol: TCP, destination port 443 (HTTPS)
- Public IP or Service Gateway: The compute instance must have either a public IP address or a service gateway to be able to send compute instance metrics to the Monitoring service.
If the instance does not have a public IP address, set up a service gateway on the virtual cloud network (VCN). The service gateway lets the instance send compute instance metrics to the Monitoring service without the traffic going over the internet. Here are special notes for setting up the service gateway to access the Monitoring service:
- When creating the service gateway, enable the service label called All <region> Services in Oracle Services Network. It includes the Monitoring service.
- When setting up routing for the subnet that contains the instance, set up a route rule with Target Type set to Service Gateway, and the Destination Service set to All <region> Services in Oracle Services Network.
For detailed instructions, see Access to Oracle Services: Service Gateway.
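The flag check in the prerequisites can also be scripted. The following is a minimal sketch, assuming you have already captured the text printed by tfactl get defaultocimonitoring on the guest VM. The function name oci_monitoring_enabled and the embedded sample text are illustrative assumptions, not part of any Oracle tooling.

```python
def oci_monitoring_enabled(tfactl_output: str) -> bool:
    """Return True if the defaultOciMonitoring row of the table reports ON."""
    for line in tfactl_output.splitlines():
        # Data rows look like: | <parameter text> | <value> |
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) == 2 and "defaultocimonitoring" in cells[0].lower():
            return cells[1].upper() == "ON"
    raise ValueError("defaultOciMonitoring row not found in tfactl output")


# Sample text in the same shape as the output shown in the prerequisites.
sample = """\
.---------------------------------------------------------------------.
| <host name>                                                         |
+-------------------------------------------------------------+-------+
| Configuration Parameter                                     | Value |
+-------------------------------------------------------------+-------+
| Send CEF metrics to OCI Monitoring ( defaultOciMonitoring ) | ON    |
'-------------------------------------------------------------+-------'
"""

print(oci_monitoring_enabled(sample))
```

If the function returns False, the flag is OFF and can be turned on with tfactl set defaultocimonitoring=on as described above.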
View Metrics for VM Cluster
Perform the following steps to view the metrics for Guest VMs using the console.
When there is a network problem and Oracle Trace File Analyzer (TFA) is unable to post metrics, TFA will wait for one hour before attempting to retry posting the metrics. This is required to avoid creating a backlog of metrics processing on TFA.
Potentially one hour of metrics will be lost between network restore and the first metric posted.
- Open the navigation menu. Click Oracle Database, then click Oracle Exadata Database Service on Dedicated Infrastructure.
- Choose your Compartment. A list of VM clusters is displayed.
- In the list of VM clusters, click the VM cluster for which you want to view the metrics. Details of the VM cluster you selected are displayed.
- In the Resources section, click Metrics.
A chart for each metric is displayed. By default, the metrics for the last one hour are displayed.
You can only select the oci_database_cluster namespace from the Metric namespace drop-down list.
- If you want to change the interval, select the required start time and end time. Alternatively, you can select the interval from the Quick Selects drop-down menu. The metrics are refreshed immediately for the selected interval.
- For each metric, you can choose the interval and statistic independently.
- Interval - The time period for which the metric is calculated.
- Statistic - The mathematical method by which the metric is calculated.
- For each metric, you can choose the following options from the Options drop-down menu.
- View Query in Metrics Explorer
- Copy Chart URL
- Copy Query (MQL)
- Create an Alarm on this Query
- Table View
For detailed information on the options for viewing the metrics chart, see Viewing Default Metric Charts.
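The interval and statistic chosen for a chart map onto the text returned by Copy Query (MQL). As a rough sketch of that mapping, the following Python helper assembles a query in the general Monitoring Query Language form metric[interval].statistic(). The helper name build_mql, the metric name CpuUtilization, and the list of accepted statistics are assumptions made for illustration; check the metric names that your namespace actually publishes.

```python
import re

# A subset of MQL statistics; this set is an illustrative assumption.
STATISTICS = {"mean", "max", "min", "sum", "count", "rate"}


def build_mql(metric: str, interval: str = "1m", statistic: str = "mean") -> str:
    """Build a query string of the form metric[interval].statistic()."""
    # Intervals are written as a number followed by m, h, or d (for example 5m).
    if not re.fullmatch(r"\d+[mhd]", interval):
        raise ValueError(f"unsupported interval: {interval!r}")
    if statistic not in STATISTICS:
        raise ValueError(f"unsupported statistic: {statistic!r}")
    return f"{metric}[{interval}].{statistic}()"


# Hypothetical metric name, used only to show the shape of the query.
print(build_mql("CpuUtilization", "5m", "max"))
```

The resulting string can be pasted into Metrics Explorer in advanced mode to compare it with what the Console produces for the same chart.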
View Metrics for a Database
Perform the following steps to view the metrics for a database using the console.
When there is a network problem and Oracle Trace File Analyzer (TFA) is unable to post metrics, TFA will wait for one hour before attempting to retry posting the metrics. This is required to avoid creating a backlog of metrics processing on TFA.
Potentially one hour of metrics will be lost between network restore and the first metric posted.
- Open the navigation menu. Click Oracle Database, then click Oracle Exadata Database Service on Dedicated Infrastructure.
- Choose your Compartment. A list of VM clusters is displayed.
- In the list of VM clusters, click the VM cluster that contains the database for which you want to view the metrics. Details of the VM cluster you selected are displayed.
- In the list of databases, click the database for which you want to view the metrics.
- In the Resources section, click Metrics.
A chart for each metric is displayed. By default, the metrics for the last one hour are displayed.
- From the Metric namespace drop-down list, select the namespace whose metrics you want to view.
Note:
- When Database Management is enabled, you can choose from the oci_database or oracle_oci_database namespace.
- When Database Management is disabled, you can view metrics only from the oci_database namespace.
- If you want to change the interval, select the required start time and end time. Alternatively, you can select the interval from the Quick Selects drop down menu. The metrics are refreshed immediately for the selected interval.
- For each metric, you can choose the interval and statistic independently.
- Interval - The time period for which the metric is calculated.
- Statistic - The mathematical method by which the metric is calculated.
- For each metric, you can choose the following options from the Options drop-down menu.
- View Query in Metrics Explorer
- Copy Chart URL
- Copy Query (MQL)
- Create an Alarm on this Query
- Table View
For detailed information on the options for viewing the metrics chart, see Viewing Default Metric Charts.
View Metrics for a PDB
- Open the navigation menu. Click Oracle Database, then click Oracle Exadata Database Service on Dedicated Infrastructure.
- Choose your Compartment. A list of VM clusters is displayed.
- In the list of VM clusters, click the VM cluster that contains the database for which you want to view the metrics. Details of the VM cluster you selected are displayed.
- In the list of databases, click the database that contains the PDB for which you want to view the metrics.
- Under Resources, click Pluggable Databases.
- In the list of pluggable databases, click the PDB for which you want to view metrics.
- From the Metric namespace drop-down list, select the namespace whose metrics you want to view.
Note:
- When Database Management is enabled, you can choose the oracle_oci_database namespace.
- When Database Management is disabled, the system displays a banner asking you to enable Database Management to provide metrics.
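The namespace rules stated for VM clusters, databases, and PDBs can be summarized as a small lookup. The following Python sketch encodes only what the notes in these sections say. The function name available_namespaces and the resource labels vm_cluster, database, and pdb are hypothetical names chosen for this example.

```python
from typing import List


def available_namespaces(resource: str, database_management_enabled: bool) -> List[str]:
    """Return the metric namespaces selectable in the Console for a resource."""
    if resource == "vm_cluster":
        # VM cluster charts offer a single namespace.
        return ["oci_database_cluster"]
    if resource == "database":
        if database_management_enabled:
            return ["oci_database", "oracle_oci_database"]
        return ["oci_database"]
    if resource == "pdb":
        # Without Database Management the Console shows a banner instead of metrics.
        return ["oracle_oci_database"] if database_management_enabled else []
    raise ValueError(f"unknown resource type: {resource!r}")


print(available_namespaces("database", True))
print(available_namespaces("pdb", False))
```

An empty list for a PDB corresponds to the case where the Console asks you to enable Database Management before any metrics are shown.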
View Metrics for VM Clusters in a Compartment
Perform the following steps to view the metrics for VM clusters in a compartment using the console.
When there is a network problem and Oracle Trace File Analyzer (TFA) is unable to post metrics, TFA will wait for one hour before attempting to retry posting the metrics. This is required to avoid creating a backlog of metrics processing on TFA.
Potentially one hour of metrics will be lost between network restore and the first metric posted.
- Open the Oracle Cloud Infrastructure Console by clicking the menu icon next to Oracle Cloud.
- From the left navigation list click Observability & Management.
- Under Monitoring, click Service Metrics.
- On the Service Metrics page, under Compartment select your compartment.
- On the Service Metrics page, under Metric Namespace select oci_database_cluster.
- If there are multiple VM clusters in the compartment you can show metrics aggregated across the clusters by selecting Aggregate Metric Streams.
- If you want to limit the metrics you see, next to Dimensions click Add (click Edit if you have already added dimensions).
- In the Dimension Name field select a dimension.
- In the Dimension Value field select a value.
- Click Done.
- In the Edit dimensions dialog click +Additional Dimension to add an additional dimension. Click X to remove a dimension.
- To create an alarm on a specific metric, click Options and select Create an Alarm on this Query. See Managing Alarms for information on setting and using alarms.
If you don't see any metrics, check the network settings and AHF version listed in the prerequisites section.
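The same selections (namespace, dimensions, and Aggregate Metric Streams) can be expressed as a Monitoring API request. The following Python sketch builds a query string and a request body shaped like the SummarizeMetricsData operation, without calling the service. The metric name CpuUtilization, the dimension name resourceName, the value my-vm-cluster, and both helper names are assumptions for illustration. Verify the metric and dimension names against what the Console lists for your compartment.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def mql_with_dimensions(metric: str, interval: str, statistic: str,
                        dimensions: Optional[Dict[str, str]] = None,
                        aggregate: bool = False) -> str:
    """Build metric[interval]{dim = "value"}.grouping().statistic()."""
    query = f"{metric}[{interval}]"
    if dimensions:
        filters = ", ".join(f'{name} = "{value}"'
                            for name, value in sorted(dimensions.items()))
        query += "{" + filters + "}"
    if aggregate:
        # grouping() combines all matching metric streams into one.
        query += ".grouping()"
    return f"{query}.{statistic}()"


def summarize_request(namespace: str, query: str, end: datetime, hours: int = 1) -> dict:
    """Build a request body covering the last `hours` hours before `end`."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    end_utc = end.astimezone(timezone.utc)
    start_utc = end_utc - timedelta(hours=hours)
    return {
        "namespace": namespace,
        "query": query,
        "startTime": start_utc.strftime(fmt),
        "endTime": end_utc.strftime(fmt),
    }


# Hypothetical dimension name and value, for illustration only.
query = mql_with_dimensions("CpuUtilization", "1m", "mean",
                            {"resourceName": "my-vm-cluster"}, aggregate=True)
body = summarize_request("oci_database_cluster", query,
                         datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc))
print(json.dumps(body, indent=2))
```

The one-hour default mirrors the Console, which shows the last hour of metrics unless you change the interval. The compartment is supplied separately when the request is sent, so it is not part of this body.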
View Metrics for Databases in a Compartment
Perform the following steps to view the metrics for databases in a compartment using the console.
When there is a network problem and Oracle Trace File Analyzer (TFA) is unable to post metrics, TFA will wait for one hour before attempting to retry posting the metrics. This is required to avoid creating a backlog of metrics processing on TFA.
Potentially one hour of metrics will be lost between network restore and the first metric posted.
- Open the Oracle Cloud Infrastructure Console by clicking the menu icon next to Oracle Cloud.
- From the left navigation list click Observability & Management.
- Under Monitoring, click Service Metrics.
- On the Service Metrics page, under Compartment select your compartment.
- On the Service Metrics page, under Metric Namespace select oci_database.
- If there are multiple databases in the compartment you can show metrics aggregated across the databases by selecting Aggregate Metric Streams.
- If you want to limit the metrics you see, next to Dimensions click Add (click Edit if you have already added dimensions).
- In the Dimension Name field select a dimension.
- In the Dimension Value field select a value.
- Click Done.
- In the Edit dimensions dialog click +Additional Dimension to add an additional dimension. Click X to remove a dimension.
- To create an alarm on a specific metric, click Options and select Create an Alarm on this Query. See Managing Alarms for information on setting and using alarms.
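Create an Alarm on this Query starts from the chart's query and adds a trigger condition. As a hedged sketch of what such a condition looks like, the following Python code appends a comparison to a base query and checks sample datapoints locally. The metric name CpuUtilization, the threshold of 80, and the helper names are assumptions for this example. The local check is a simplification and does not reproduce how the Monitoring service evaluates alarms over time.

```python
import operator
from typing import Iterable

# Comparison operators supported by this sketch.
_OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def alarm_query(base_query: str, op: str, threshold: float) -> str:
    """Append a threshold condition to a metric query."""
    if op not in _OPERATORS:
        raise ValueError(f"unsupported operator: {op!r}")
    return f"{base_query} {op} {threshold:g}"


def would_breach(datapoints: Iterable[float], op: str, threshold: float) -> bool:
    """Local stand-in: True if any datapoint satisfies the condition."""
    compare = _OPERATORS[op]
    return any(compare(value, threshold) for value in datapoints)


# Hypothetical metric and threshold, for illustration only.
print(alarm_query("CpuUtilization[5m].mean()", ">", 80))
print(would_breach([42.0, 67.5, 91.2], ">", 80))
```

Testing a condition against recent values in this way can help you pick a threshold before you create the alarm. See Managing Alarms for the options the service itself provides.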
Manage Oracle Trace File Analyzer
The deployment of the cloud-certified Autonomous Health Framework (AHF), which includes Oracle Trace File Analyzer, is managed by Oracle. You shouldn’t install this manually on the guest VMs.
- To check the run status of Oracle Trace File Analyzer, run the tfactl status command as root or a non-root user:
    # tfactl status
    .----------------------------------------------------------------------------------------------.
    | Host  | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
    +-------+---------------+--------+------+------------+----------------------+------------------+
    | node1 | RUNNING       | 41312  | 5000 | 22.1.0.0.0 | 22100020220310214615 | COMPLETE         |
    | node2 | RUNNING       | 272300 | 5000 | 22.1.0.0.0 | 22100020220310214615 | COMPLETE         |
    '-------+---------------+--------+------+------------+----------------------+------------------'
- To start the Oracle Trace File Analyzer daemon on the local node, run the tfactl start command as root:
    # tfactl start
    Starting TFA..
    Waiting up to 100 seconds for TFA to be started..
    . . . . .
    . . . . .
    . . . . .
    . . . . .
    . . . . .
    . . . . .
    . . . . .
    . . . . .
    Successfully started TFA Process..
    . . . . .
    TFA Started and listening for commands
- To stop the Oracle Trace File Analyzer daemon on the local node, run the tfactl stop command as root:
    # tfactl stop
    Stopping TFA from the Command Line
    Nothing to do !
    Please wait while TFA stops
    Please wait while TFA stops
    TFA-00002 Oracle Trace File Analyzer (TFA) is not running
    TFA Stopped Successfully
    Successfully stopped TFA..
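The tfactl status table can be checked against the prerequisites programmatically. The following Python sketch parses text in the shape shown above and reports hosts where TFA is not running or where the reported version is below 22.2.4. It assumes that the Version column reflects the installed AHF version. The function names and the embedded sample are illustrative only.

```python
from typing import Dict, List, Tuple

MIN_AHF_VERSION = (22, 2, 4)  # minimum version named in the prerequisites


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '22.1.0.0.0' into (22, 1, 0, 0, 0) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def parse_tfactl_status(output: str) -> List[Dict[str, str]]:
    """Return one dict per host row, keyed by the table's column headers."""
    header, rows = None, []
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the command line and border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def hosts_needing_attention(rows: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """List (host, reason) pairs that fail the metrics prerequisites."""
    problems = []
    for row in rows:
        if row["Status of TFA"] != "RUNNING":
            problems.append((row["Host"], "TFA is not running"))
        elif parse_version(row["Version"]) < MIN_AHF_VERSION:
            problems.append((row["Host"], f"version {row['Version']} is below 22.2.4"))
    return problems


status_sample = """\
| Host  | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+-------+---------------+--------+------+------------+----------------------+------------------+
| node1 | RUNNING       | 41312  | 5000 | 22.1.0.0.0 | 22100020220310214615 | COMPLETE         |
| node2 | RUNNING       | 272300 | 5000 | 22.1.0.0.0 | 22100020220310214615 | COMPLETE         |
"""

for host, reason in hosts_needing_attention(parse_tfactl_status(status_sample)):
    print(host, reason)
```

With the sample rows above, both nodes are reported because 22.1.0.0.0 is lower than the 22.2.4 minimum, even though TFA is running on each.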
Manage Database Service Agent
View the /opt/oracle/dcs/log/dcs-agent.log file to identify issues with the agent.
- To check the status of the Database Service Agent, run the systemctl status command:
    # systemctl status dbcsagent.service
    dbcsagent.service
       Loaded: loaded (/usr/lib/systemd/system/dbcsagent.service; enabled; vendor preset: disabled)
       Active: active (running) since Fri 2022-04-01 13:40:19 UTC; 6min ago
      Process: 9603 ExecStopPost=/bin/bash -c kill `ps -fu opc |grep "java.*dbcs-agent.*jar"|awk '{print $2}'` (code=exited, status=0/SUCCESS)
     Main PID: 10055 (sudo)
       CGroup: /system.slice/dbcsagent.service
               ‣ 10055 sudo -u opc /bin/bash -c umask 077; /bin/java
- To start the agent if it is not running, run the systemctl start command as the root user:
    systemctl start dbcsagent.service
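To decide whether the agent needs to be started, you can inspect the Active line of the status output. The following Python sketch does this on captured text. The function name agent_is_running and the sample strings are assumptions for illustration, and the pattern only recognizes the active (running) form shown above.

```python
import re


def agent_is_running(systemctl_output: str) -> bool:
    """Return True only if the Active line reads 'active (running)'."""
    match = re.search(r"^\s*Active:\s*(\S+)\s*\((\w+)\)",
                      systemctl_output, re.MULTILINE)
    if match is None:
        return False
    return match.group(1) == "active" and match.group(2) == "running"


running_sample = """\
dbcsagent.service
   Loaded: loaded (/usr/lib/systemd/system/dbcsagent.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2022-04-01 13:40:19 UTC; 6min ago
"""
stopped_sample = running_sample.replace("active (running)", "inactive (dead)")

print(agent_is_running(running_sample))
print(agent_is_running(stopped_sample))
```

When the result is False, start the agent with systemctl start dbcsagent.service as the root user, then review /opt/oracle/dcs/log/dcs-agent.log if it does not stay up.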