OCI Cache Metrics
You can monitor the health, capacity, and performance of OCI Cache clusters by using metrics, alarms, and notifications.
See Available Metrics for a list of OCI Cache metrics.
Metrics Terminology
- Namespace
- A namespace is a container for metrics. The namespace identifies the service sending the metrics. The namespace for OCI Cache is oci_redis.
- Metrics
- Metrics are the fundamental concept in telemetry and monitoring. Metrics define a time-series set of datapoints. Each metric has a namespace, metric name, compartment identifier, one or more dimensions, and a unit of measure. Each datapoint has a timestamp, value, and count associated with it.
- Dimensions
- A dimension is a key-value pair that defines the characteristics associated with the metric. For example, resourceId is the OCID of the resource that the metric describes.
- Statistics
- Statistics are metric data aggregations over specified periods of time. Aggregations are done using the namespace, metric name, dimensions, and the data point unit of measure within the time period specified.
- Alarms
- Alarms are used to automate operations monitoring and performance. An alarm tracks changes that occur over a specific time period and performs one or more defined actions, based on the rules defined for the metric.
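The terms above come together in a Monitoring Query Language (MQL) expression, which names a metric, an interval, optional dimension filters, and a statistic, and can add a threshold comparison for an alarm. The following is a minimal sketch of how such an expression could be assembled. The helper name `build_mql_query` and the OCID value are hypothetical placeholders, and the namespace (oci_redis) is assumed to be passed separately from the query text when the query is submitted.

```python
def build_mql_query(metric, interval, statistic, dimensions=None, threshold=None):
    """Assemble an MQL-style expression string (illustrative helper, not an OCI API).

    metric     -- metric name, for example "CPUUtilization"
    interval   -- aggregation window, for example "1m"
    statistic  -- aggregation function name, for example "mean"
    dimensions -- optional dict of dimension key-value filters
    threshold  -- optional numeric value; adds a "greater than" comparison
    """
    query = f"{metric}[{interval}]"
    if dimensions:
        # Each dimension becomes a key = "value" filter inside braces.
        filters = ", ".join(f'{key} = "{value}"' for key, value in dimensions.items())
        query += "{" + filters + "}"
    query += f".{statistic}()"
    if threshold is not None:
        query += f" > {threshold}"
    return query


# Hypothetical OCID used purely as a placeholder.
EXAMPLE_QUERY = build_mql_query(
    "CPUUtilization", "1m", "mean",
    dimensions={"resourceId": "ocid1.example"},
    threshold=90,
)
print(EXAMPLE_QUERY)
```

With the placeholder values shown, the sketch produces a query that averages CPUUtilization over one-minute intervals for a single resource and compares the result against 90.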
IAM Policy for Metrics
To monitor resources in Oracle Cloud Infrastructure, you must be granted the required type of access in an IAM policy written by an administrator, whether you're using the Console or the REST API with an SDK, CLI, or other tool.
The policy must give you access to the monitoring services and the resources being monitored. If you try to perform an action and get a message that you don't have permission or are unauthorized, confirm with your administrator the type of access you were granted and which compartment you're supposed to work in.
For more information on user authorizations for monitoring, see the Authentication and Authorization section for the related service: Monitoring or Notifications.
Available Metrics
Metrics are per cluster node, unless otherwise specified.
Metric | Metric Display Name | Unit | Description | Dimensions |
---|---|---|---|---|
CPUUtilization | CPU Utilization | percent | Activity level from the CPU, expressed as a percentage of total time (busy time plus idle time) versus idle time alone. A typical alarm threshold is 90% CPU utilization. | |
MemoryUtilization | Memory Utilization | percent | Amount of available memory in use, measured in pages and expressed as a percentage of used pages compared to unused pages. A typical alarm threshold is 85% of available memory in use. | |
NetworksBytesIn | Network Receive Bytes | bytes | Network receipt throughput. Expressed in bytes received per second. | |
NetworksBytesOut | Network Transmit Bytes | bytes | Network transmission throughput. Expressed in bytes transmitted per second. | |
UsedMemory | Used Memory | bytes | The total number of bytes allocated for all purposes, including the dataset, buffers, and so on. | |
KeyspaceHits | Keyspace Hits | count | The number of successful read-only key lookups in the main dictionary. | |
ConnectedClients | Connected Clients | count | The number of client connections, excluding connections from read replicas. | |
EvictedKeys | Evicted Keys | count | The number of keys that have been evicted because the maximum memory limit was exceeded. | |
ReplicationLag | Replication Lag | seconds | How far behind, in seconds, the replica is in applying changes from the primary node. Only applicable to a node running as a read replica. | |
KeyspaceMisses | Keyspace Misses | count | The number of unsuccessful read-only key lookups in the main dictionary. | |
PrimaryReplicaOffset | Primary Replica Offset | bytes | The number of bytes that the primary node is sending to all replicas. Applicable to nodes in a replicated configuration. This metric is representative of the write load on the replication group. | |
ExpiredKeys | Expired Keys | count | The total number of key expiration events. | |
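Some of these metrics are more useful in combination. As a rough sketch, KeyspaceHits and KeyspaceMisses can be combined into a cache hit ratio, and utilization values can be checked against the typical alarm thresholds mentioned in the table. The function names below are hypothetical, and the sketch assumes that the hit and miss counts cover the same time interval.

```python
# Typical alarm thresholds quoted in the metrics table (percent).
TYPICAL_THRESHOLDS = {
    "CPUUtilization": 90.0,
    "MemoryUtilization": 85.0,
}


def cache_hit_ratio(keyspace_hits, keyspace_misses):
    """Return hits / (hits + misses), or None when there were no lookups."""
    total = keyspace_hits + keyspace_misses
    if total == 0:
        return None
    return keyspace_hits / total


def exceeds_typical_threshold(metric, value):
    """Return True when value is above the typical threshold for the metric.

    Raises KeyError for metrics that have no threshold in the table.
    """
    return value > TYPICAL_THRESHOLDS[metric]


# Example datapoints (made-up values for illustration only).
ratio = cache_hit_ratio(keyspace_hits=900, keyspace_misses=100)
print(f"hit ratio: {ratio:.2f}")
print("CPU alarm:", exceeds_typical_threshold("CPUUtilization", 93.5))
print("Memory alarm:", exceeds_typical_threshold("MemoryUtilization", 70.0))
```

A low hit ratio together with a rising EvictedKeys count may suggest that the cluster is short on memory, although the right thresholds depend on the workload.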