Metrics
You can monitor the health, capacity, and performance of some Data Science resources using metrics, alarms, and notifications.
Data Science monitors the following running resources to collect and report metrics:
- Job Metrics
  - CPU usage
  - GPU usage
  - Disk usage
  - Memory usage
  - Network bytes in
  - Network bytes out
- Model Deployment Metrics
  - CPU usage
  - Memory usage
  - Network bytes
  - Predict request count
  - Predict response
  - Predict latency
  - Predict bandwidth usage
- Notebook Session Metrics
  - CPU usage
  - Memory usage
  - Network bytes in
  - Network bytes out
- Pipeline Run Metrics
  - CPU usage
  - GPU usage
  - Disk usage
  - Memory usage
  - Network bytes in
  - Network bytes out
Before You Begin
IAM policies:
To monitor resources, you must be given the required access in a policy. This is true whether you're using the Console or the REST API with an SDK, CLI, or other tool. The policy must give you access to the monitoring services and the resources being monitored. If you try to perform an action and get a message that you don't have permission or are unauthorized, confirm with your administrator the type of access you have been granted, and which compartment you should work in. For more information, see Monitoring authentication and authorization or Notifications authentication and authorization.
Viewing Metrics from the Monitoring Service
You can view the default metric charts for all the notebook sessions in a compartment using the Monitoring service.
For more information about monitoring metrics and using alarms, see Overview of Monitoring. For information about notifications for alarms, see Notifications Overview.
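As a minimal sketch of pulling the same metric data programmatically, the following example uses the OCI Python SDK to summarize one of the metrics listed above over the last hour. The namespace (`oci_datascience`), the metric name (`CpuUtilization`), and the compartment OCID are illustrative assumptions; confirm the exact namespace and metric names for your resource type in the Monitoring service before relying on them.

```python
import datetime
import oci

# Authenticate with the default profile in ~/.oci/config.
config = oci.config.from_file()
monitoring = oci.monitoring.MonitoringClient(config)

# Query the last hour of data points.
end_time = datetime.datetime.now(datetime.timezone.utc)
start_time = end_time - datetime.timedelta(hours=1)

details = oci.monitoring.models.SummarizeMetricsDataDetails(
    namespace="oci_datascience",        # assumed namespace; check your tenancy
    query="CpuUtilization[1m].mean()",  # assumed metric name for CPU usage
    start_time=start_time,
    end_time=end_time,
)

response = monitoring.summarize_metrics_data(
    compartment_id="ocid1.compartment.oc1..exampleuniqueID",  # placeholder OCID
    summarize_metrics_data_details=details,
)

# Each item is one metric stream; print its dimensions and aggregated points.
for item in response.data:
    print(item.name, item.dimensions)
    for point in item.aggregated_datapoints:
        print(point.timestamp, point.value)
```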
Using the API
For information about using the API and signing requests, see REST APIs and Security Credentials. For information about SDKs, see Software Development Kits and Command Line Interface.
Use the following APIs:
- Monitoring Metrics API for metrics and alarms.
- Notifications Topic API for notifications (used with alarms).
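As a sketch of tying these two APIs together, the following example uses the OCI Python SDK to create a Notifications topic and then an alarm that publishes to that topic when a Data Science metric crosses a threshold. The topic name, alarm name, namespace, metric name, threshold, and compartment OCID are illustrative assumptions; substitute values that match your tenancy.

```python
import oci

config = oci.config.from_file()
monitoring = oci.monitoring.MonitoringClient(config)
ons = oci.ons.NotificationControlPlaneClient(config)

compartment_id = "ocid1.compartment.oc1..exampleuniqueID"  # placeholder OCID

# Notifications Topic API: create a topic to receive alarm messages.
topic = ons.create_topic(
    oci.ons.models.CreateTopicDetails(
        name="datascience-alarms",  # hypothetical topic name
        compartment_id=compartment_id,
    )
).data

# Monitoring Alarms API: fire when average CPU usage exceeds 80 percent.
alarm = monitoring.create_alarm(
    oci.monitoring.models.CreateAlarmDetails(
        display_name="datascience-cpu-high",     # hypothetical alarm name
        compartment_id=compartment_id,
        metric_compartment_id=compartment_id,
        namespace="oci_datascience",             # assumed namespace
        query="CpuUtilization[1m].mean() > 80",  # assumed metric and threshold
        severity="WARNING",
        destinations=[topic.topic_id],           # route alarm messages to the topic
        is_enabled=True,
    )
).data

print(alarm.id, alarm.lifecycle_state)
```

Subscriptions that deliver the alarm messages (email, PagerDuty, Slack, HTTPS endpoints, and so on) are attached to the topic separately through the Notifications service.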