Metrics
You can monitor the health, capacity, and performance of some Data Science resources using metrics, alarms, and notifications.
Data Science monitors the following running resources to collect and report metrics:
- Job Metrics
  - CPU usage
  - GPU usage
  - Disk usage
  - Memory usage
  - Network bytes in
  - Network bytes out
- Model Deployment Metrics
  - CPU usage
  - Memory usage
  - Network bytes
  - Predict request count
  - Predict response
  - Predict latency
  - Predict bandwidth usage
- Notebook Session Metrics
  - CPU usage
  - Memory usage
  - Network bytes in
  - Network bytes out
- Pipeline Run Metrics
  - CPU usage
  - GPU usage
  - Disk usage
  - Memory usage
  - Network bytes in
  - Network bytes out
Before You Begin
IAM policies:
To monitor resources, you must be given the required access in a policy. This is true whether you're using the Console or the REST API with an SDK, CLI, or other tool. The policy must give you access to the monitoring services and the resources being monitored. If you try to perform an action and get a message that you don't have permission or are unauthorized, confirm with your administrator the type of access you have been granted, and which compartment you should work in. For more information, see Monitoring authentication and authorization or Notifications authentication and authorization.
Viewing Metrics from the Monitoring Service
You can view the default metric charts for all the notebook sessions in a compartment using the Monitoring service.
For more information about monitoring metrics and using alarms, see Overview of Monitoring. For information about notifications for alarms, see Notifications Overview.
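As a minimal sketch of pulling the same metric data programmatically, the following example uses the OCI Python SDK to summarize one of the metrics listed above over the last hour. The namespace (`oci_datascience`), the metric name (`CpuUtilization`), and the compartment OCID are illustrative assumptions; confirm the exact namespace and metric names for your resource type in the Monitoring service before relying on them.

```python
import datetime
import oci

# Authenticate with the default profile in ~/.oci/config.
config = oci.config.from_file()
monitoring = oci.monitoring.MonitoringClient(config)

# Query the last hour of data points.
end_time = datetime.datetime.now(datetime.timezone.utc)
start_time = end_time - datetime.timedelta(hours=1)

details = oci.monitoring.models.SummarizeMetricsDataDetails(
    namespace="oci_datascience",        # assumed namespace; check your tenancy
    query="CpuUtilization[1m].mean()",  # assumed metric name for CPU usage
    start_time=start_time,
    end_time=end_time,
)

response = monitoring.summarize_metrics_data(
    compartment_id="ocid1.compartment.oc1..exampleuniqueID",  # placeholder OCID
    summarize_metrics_data_details=details,
)

# Each item is one metric stream; print its dimensions and aggregated points.
for item in response.data:
    print(item.name, item.dimensions)
    for point in item.aggregated_datapoints:
        print(point.timestamp, point.value)
```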
Using the API
For information about using the API and signing requests, see REST APIs and Security Credentials. For information about SDKs, see Software Development Kits and Command Line Interface.
Use the following APIs:
- Monitoring Metrics API for metrics and alarms.
- Notifications Topic API for notifications (used with alarms).
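As a sketch of tying these two APIs together, the following example uses the OCI Python SDK to create a Notifications topic and then an alarm that publishes to that topic when a Data Science metric crosses a threshold. The topic name, alarm name, namespace, metric name, threshold, and compartment OCID are illustrative assumptions; substitute values that match your tenancy.

```python
import oci

config = oci.config.from_file()
monitoring = oci.monitoring.MonitoringClient(config)
ons = oci.ons.NotificationControlPlaneClient(config)

compartment_id = "ocid1.compartment.oc1..exampleuniqueID"  # placeholder OCID

# Notifications Topic API: create a topic to receive alarm messages.
topic = ons.create_topic(
    oci.ons.models.CreateTopicDetails(
        name="datascience-alarms",  # hypothetical topic name
        compartment_id=compartment_id,
    )
).data

# Monitoring Alarms API: fire when average CPU usage exceeds 80 percent.
alarm = monitoring.create_alarm(
    oci.monitoring.models.CreateAlarmDetails(
        display_name="datascience-cpu-high",     # hypothetical alarm name
        compartment_id=compartment_id,
        metric_compartment_id=compartment_id,
        namespace="oci_datascience",             # assumed namespace
        query="CpuUtilization[1m].mean() > 80",  # assumed metric and threshold
        severity="WARNING",
        destinations=[topic.topic_id],           # route alarm messages to the topic
        is_enabled=True,
    )
).data

print(alarm.id, alarm.lifecycle_state)
```

Subscriptions that deliver the alarm messages (email, PagerDuty, Slack, HTTPS endpoints, and so on) are attached to the topic separately through the Notifications service.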