Metrics in Generative AI Agents
By using metrics, you can monitor the endpoints in Generative AI Agents. Review the following topics for more information about these metrics.
Endpoint Metrics
This section lists the metrics for agent endpoints in Generative AI Agents. The following metrics are available on an endpoint's details page.
| Metric Display Name | Description |
|---|---|
| Number of calls | Number of calls that the agent hosted on this endpoint has processed |
| Total processing time (ms) | Total time, in milliseconds, for a call to finish |
| Service errors count | Number of calls that failed with a service-side error |
| Client errors count | Number of calls that failed with a client-side error |
| Total input characters consumed | Number of input characters that the agent hosted on this endpoint has processed |
| Total output characters produced | Number of output characters that the agent hosted on this endpoint has produced |
| Number of error traces | Number of traces with an error (applies only when tracing is enabled for the endpoint) |
| Success rate | Successful calls divided by the total number of calls |
On an endpoint's details page in the Generative AI Agents service, select the Options menu in any of the endpoint metric charts to get the following options:
- View Query in Metrics Explorer
- Copy chart URL
- Copy query in Monitoring Query Language (MQL)
- Create an alarm on this query
- Table View
Viewing a Query in Metrics Explorer
The metrics explorer is a resource in the Monitoring service. To get permission to work with the Monitoring service resources, ask an administrator to review the IAM policies in Securing Monitoring and grant you the proper access for your role.
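For example, a policy statement similar to the following grants a group read access to metric data (the group and compartment names are placeholders):

Allow group <group-name> to read metrics in compartment <compartment-name>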
To open a metric's query, select the Options menu in that metric's chart and then click View Query in Metrics Explorer. The following table lists the metric parameters and their queries in Monitoring Query Language (MQL).
| Metric Display Name | Metric Parameter | MQL |
|---|---|---|
| Number of calls | TotalInvocationCount | TotalInvocationCount[1m].count() |
| Total processing time | InvocationLatency | InvocationLatency[1m].mean() |
| Service errors count | ServerErrorCount | ServerErrorCount[1m].count() |
| Client errors count | ClientErrorCount | ClientErrorCount[1m].count() |
| Total input characters consumed | InputCharactersCount | InputCharactersCount[1m].sum() |
| Total output characters produced | OutputCharactersCount | OutputCharactersCount[1m].sum() |
| Number of error traces | ErrorTraceCount | ErrorTraceCount[1m].sum() |
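If you prefer to pull these values programmatically, the same MQL can be passed to the Monitoring API. The following Python sketch uses the OCI Python SDK's summarize_metrics_data operation; the metric namespace, endpoint OCID, and compartment OCID are placeholders, so copy the exact namespace and query from the chart's Copy query in Monitoring Query Language (MQL) option.

```python
# Sketch: query an endpoint metric with the Monitoring API, using MQL copied from a chart.
# The namespace and OCIDs below are placeholders; take the real values from Metrics Explorer.
from datetime import datetime, timedelta, timezone

import oci

config = oci.config.from_file()  # reads ~/.oci/config, DEFAULT profile
monitoring = oci.monitoring.MonitoringClient(config)

end = datetime.now(timezone.utc)
start = end - timedelta(hours=1)

details = oci.monitoring.models.SummarizeMetricsDataDetails(
    namespace="<metric-namespace>",  # placeholder: copy from Metrics Explorer
    query='TotalInvocationCount[1m]{resourceId = "<endpoint-OCID>"}.count()',
    start_time=start,
    end_time=end,
)

response = monitoring.summarize_metrics_data(
    compartment_id="<compartment-OCID>",
    summarize_metrics_data_details=details,
)

# Each series contains aggregated datapoints for the selected interval.
for series in response.data:
    for point in series.aggregated_datapoints:
        print(point.timestamp, point.value)
```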
The success rate is calculated as the number of successful calls divided by the total number of calls, using the following MQL:
TotalInvocationCount[1m]{resourceId = "<endpoint-OCID>", StatusCode="200"}.grouping().count()
/ TotalInvocationCount[1m]{resourceId = "<endpoint-OCID>"}.grouping().count() * 100
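For example, if the endpoint processed 200 calls in the selected interval and 190 of them returned status code 200, the success rate is 190 / 200 * 100 = 95%.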
Creating an Alarm for an Endpoint Metric
To create an alarm on a metric, select the Options menu in that metric's chart and then click Create an alarm on this query to go to a prepopulated Create alarm page in the Monitoring service. Complete the remaining fields to set an alarm for the metric that you selected.
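As an alternative to the console workflow, an alarm on one of these metrics can also be created programmatically. The following Python sketch uses the OCI Python SDK's create_alarm operation; the metric namespace, compartment OCIDs, endpoint OCID, and notification topic OCID are placeholders that you replace with your own values.

```python
# Sketch: create an alarm on the service-errors metric with the Monitoring API.
# Namespace, OCIDs, and the notification topic are placeholders for your own values.
import oci

config = oci.config.from_file()
monitoring = oci.monitoring.MonitoringClient(config)

alarm_details = oci.monitoring.models.CreateAlarmDetails(
    display_name="genai-agent-endpoint-service-errors",
    compartment_id="<compartment-OCID>",         # compartment that holds the alarm
    metric_compartment_id="<compartment-OCID>",  # compartment that emits the metric
    namespace="<metric-namespace>",              # placeholder: copy from Metrics Explorer
    query='ServerErrorCount[1m]{resourceId = "<endpoint-OCID>"}.count() > 0',
    severity="CRITICAL",
    destinations=["<notification-topic-OCID>"],  # Notifications topic that receives alerts
    is_enabled=True,
)

response = monitoring.create_alarm(create_alarm_details=alarm_details)
print(response.data.id)
```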