OCI Cache Metrics

You can monitor the health, capacity, and performance of OCI Cache clusters by using metrics, alarms, and notifications.

See Available Metrics for a list of OCI Cache metrics.

Metrics Terminology

Namespace
A namespace is a container for metrics. The namespace identifies the service sending the metrics. The namespace for OCI Cache is oci_redis.
Metrics
Metrics are the fundamental concept in telemetry and monitoring. Metrics define a time-series set of datapoints. Each metric has a namespace, metric name, compartment identifier, one or more dimensions, and a unit of measure. Each datapoint has a timestamp, value, and count associated with it.
Dimensions
A dimension is a key-value pair that defines the characteristics associated with the metric. For example, resourceId is the OCID of the resource that the metric describes.
Statistics
Statistics are metric data aggregations over specified periods of time. Aggregations are done using the namespace, metric name, dimensions, and the data point unit of measure within the time period specified.
Alarms
Alarms are used to automate operations monitoring and performance. An alarm tracks changes that occur over a specific time period and performs one or more defined actions, based on the rules defined for the metric.
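
To show how these terms fit together, the following minimal sketch composes a Monitoring Query Language (MQL) expression from a metric name, an interval, a dimension, and a statistic, then appends an alarm condition. The MetricQuery class, the placeholder OCID, and the 1-minute interval are hypothetical and exist only for illustration. The sketch assumes the general MQL form metric[interval]{dimension = "value"}.statistic(); check the Monitoring documentation for the exact syntax before using an expression in an alarm.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MetricQuery:
    """Hypothetical helper that renders an MQL-style expression as a string."""

    metric: str                      # metric name, for example "CPUUtilization"
    interval: str = "1m"             # aggregation window (assumed value)
    statistic: str = "mean"          # statistic applied within each window
    dimensions: Dict[str, str] = field(default_factory=dict)

    def to_mql(self) -> str:
        # Render dimensions as key = "value" pairs inside braces, if any exist.
        dims = ", ".join(
            f'{key} = "{value}"' for key, value in sorted(self.dimensions.items())
        )
        selector = f"{{{dims}}}" if dims else ""
        return f"{self.metric}[{self.interval}]{selector}.{self.statistic}()"

    def alarm_rule(self, operator: str, threshold: float) -> str:
        # An alarm rule compares the aggregated statistic against a threshold.
        return f"{self.to_mql()} {operator} {threshold:g}"


NAMESPACE = "oci_redis"                      # namespace for OCI Cache
NODE_OCID = "ocid1.example..placeholder"     # placeholder, not a real OCID

cpu = MetricQuery("CPUUtilization", dimensions={"resourceId": NODE_OCID})
cpu_query = cpu.to_mql()
cpu_alarm = cpu.alarm_rule(">", 90)

print(f"namespace: {NAMESPACE}")
print(f"query:     {cpu_query}")
print(f"alarm:     {cpu_alarm}")
```

The namespace is not part of the expression itself. It is supplied separately when the query or alarm is defined, which is why the sketch keeps it as a standalone value.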

IAM Policy for Metrics

To monitor resources in Oracle Cloud Infrastructure, you must be granted the required type of access in an IAM policy written by an administrator, whether you're using the Console or the REST API with an SDK, CLI, or other tool.

The policy must give you access to the monitoring services and the resources being monitored. If you try to perform an action and get a message that you don't have permission or are unauthorized, confirm with your administrator the type of access you were granted and which compartment you're supposed to work in.

For more information on user authorizations for monitoring, see the Authentication and Authorization section for the related service: Monitoring or Notifications.

Available Metrics

Metrics are per cluster node, unless otherwise specified.

| Metric | Metric Display Name | Unit | Description | Dimensions |
| --- | --- | --- | --- | --- |
| CPUUtilization | CPU Utilization | percent | Activity level from the CPU, expressed as busy time as a percentage of total time (busy time plus idle time). A typical alarm threshold is 90% CPU utilization. | resourceId |
| MemoryUtilization | Memory Utilization | percent | Amount of available memory in use. Measured by pages. Expressed as a percentage of used pages compared to unused pages. A typical alarm threshold is 85% of available memory in use. | resourceId |
| NetworksBytesIn | Network Receive Bytes | bytes | Network receipt throughput. Expressed in bytes received per second. | resourceId |
| NetworksBytesOut | Network Transmit Bytes | bytes | Network transmission throughput. Expressed in bytes transmitted per second. | resourceId |
| UsedMemory | Used Memory | bytes | The total number of bytes allocated for all purposes, including the dataset, buffers, and so on. | resourceId |
| KeyspaceHits | Keyspace Hits | count | The number of successful read-only key lookups in the main dictionary. | resourceId |
| ConnectedClients | Connected Clients | count | The number of client connections, excluding connections from read replicas. | resourceId |
| EvictedKeys | Evicted Keys | count | The number of keys evicted because the maximum memory limit was exceeded. | resourceId |
| ReplicationLag | Replication Lag | seconds | How far behind, in seconds, the replica is in applying changes from the primary node. Only applicable for a node running as a read replica. | resourceId |
| KeyspaceMisses | Keyspace Misses | count | The number of unsuccessful read-only key lookups in the main dictionary. | resourceId |
| PrimaryReplicaOffset | Primary Replica Offset | bytes | The number of bytes that the primary node is sending to all replicas. Applicable to nodes in a replicated configuration. This metric is representative of the write load on the replication group. | resourceId |
| ExpiredKeys | Expired Keys | count | The total number of key expiration events. | resourceId |
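
KeyspaceHits and KeyspaceMisses are often combined into a cache hit ratio, hits / (hits + misses). The following minimal sketch shows that arithmetic. The function name and the sample values are hypothetical. It assumes both counts cover the same node and the same time window. The source does not state whether these metrics are reported as running totals or per-interval counts, so if they are running totals, compute the ratio from the difference between two readings instead.

```python
from typing import Optional


def cache_hit_ratio(hits: float, misses: float) -> Optional[float]:
    """Return hits / (hits + misses), or None when there were no lookups."""
    if hits < 0 or misses < 0:
        raise ValueError("counts must be non-negative")
    total = hits + misses
    if total == 0:
        # No read-only lookups in the window, so the ratio is undefined.
        return None
    return hits / total


# Made-up sample values for one node over one window (not real measurements).
sample_hits = 950.0
sample_misses = 50.0

ratio = cache_hit_ratio(sample_hits, sample_misses)
if ratio is None:
    print("hit ratio: no lookups in this window")
else:
    print(f"hit ratio: {ratio:.2%}")
```

Returning None for an empty window avoids a division by zero and keeps "no traffic" distinct from "every lookup missed", which would otherwise both appear as 0.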