For Autoscaling Model Deployments

Troubleshoot autoscaling model deployments.

To help troubleshoot autoscaling issues, see Debugging a Model Deployment Failure. The autoscaling operation uses the update work request type, so when an error occurs, examine the failed update work requests. Any errors are listed under Error Messages. Review the details of the error and take appropriate action.

Service Timed Out during Operation

When creating, updating, or activating a model deployment with an autoscaling scaling policy, the operation fails with a Service Timed Out error.

The system checks the customer tenancy for an IAM policy that lets the autoscaling service retrieve metrics. If that policy is missing, the service might time out.

Check whether the policy is missing, add it if necessary, and try the operation again.
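As a rough illustration of the check above, the following sketch scans a list of policy statements (for example, copied from the Console) for one that grants the autoscaling service read access to metrics. The helper name and the statement wording it matches, "allow service autoscaling to read metrics in ...", are assumptions made for this example. Confirm the exact policy text against the autoscaling policy documentation before relying on it.

```python
import re

# ASSUMPTION: the required statement begins with
# "allow service autoscaling to read metrics in ...".
# Verify the exact wording in the autoscaling policy documentation.
_AUTOSCALING_METRICS = re.compile(
    r"^\s*allow\s+service\s+autoscaling\s+to\s+read\s+metrics\s+in\s+",
    re.IGNORECASE,
)


def has_autoscaling_metrics_policy(statements):
    """Return True if any statement matches the assumed autoscaling metrics policy."""
    return any(_AUTOSCALING_METRICS.match(s) for s in statements)


# Hypothetical statements, used only to exercise the helper.
present = [
    "allow group data-scientists to manage data-science-family in tenancy",
    "Allow service autoscaling to read metrics in tenancy",
]
missing = [
    "allow group data-scientists to manage data-science-family in tenancy",
]

print(has_autoscaling_metrics_policy(present))
print(has_autoscaling_metrics_policy(missing))
```

The match is deliberately loose (case-insensitive, prefix only) so that a trailing condition on the statement does not cause a false negative.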

Invalid TQL query

When deploying a model with autoscaling using a custom metric MQL query, the operation fails with the error:
  • Failed to provision compute resources because of an invalid parameter in the request. Invalid TQL query.

The query is incorrect or invalid.

Check the work request error messages. Correct the query and test it in the Metrics Explorer before creating or updating a model deployment that uses it.

Scaling isn't Triggered or Takes too Long to Complete

Scaling isn't triggered or takes too long to complete.

The model deployment scaling operation operates on a best-effort basis. Following the creation of the model deployment with autoscaling enabled, a cool-down period is enforced. Autoscaling triggers only after the cool-down time has finished.
Note

This cool-down period is applied after creation or update, and between each scaling event. This cool-down period also resets after every user-performed update operation.
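The cool-down behavior described above can be sketched as a small timing check. This is an illustrative model only, not the service's implementation: the function names and the 10-minute cool-down value are hypothetical. The "last reset" time stands for whichever happened most recently: creation, a user-performed update, or a scaling event.

```python
from datetime import datetime, timedelta, timezone


def cooldown_remaining(last_reset, cooldown, now):
    """Return the time left in the cool-down window (zero once it has elapsed)."""
    elapsed = now - last_reset
    return max(cooldown - elapsed, timedelta(0))


def can_scale(last_reset, cooldown, now):
    """Scaling is eligible only after the cool-down has fully elapsed."""
    return cooldown_remaining(last_reset, cooldown, now) == timedelta(0)


# Hypothetical timeline: deployment created at 12:00 with a 10-minute cool-down.
created = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
cooldown = timedelta(minutes=10)  # ASSUMPTION: example value only

# A user update at 12:08 resets the window, so 12:12 is still inside it.
updated = created + timedelta(minutes=8)

print(can_scale(created, cooldown, created + timedelta(minutes=5)))
print(can_scale(updated, cooldown, created + timedelta(minutes=12)))
print(can_scale(updated, cooldown, created + timedelta(minutes=18)))
```

Modeling the reset as a single timestamp makes it easy to see why repeated updates can postpone scaling: each one moves the start of the window forward.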
After the scaling operation starts on the model deployment, the process might take up to six minutes on average to complete. The scaling time depends mainly on the size of the artifact. If the artifact is large, the model deployment might need more time to bootstrap, which extends the overall duration of the scaling operation.