Use Custom Networking
Create a model deployment with the custom networking option.
The workload is attached by using a secondary VNIC to a customer-managed VCN and subnet. The subnet can be configured for egress to the public internet through a NAT/Internet gateway.
To use custom networking, the tenancy needs a policy that lets the Data Science service use networking resources in the subnet's compartment:

allow service datascience to use virtual-network-family in compartment <subnet_compartment>
For custom egress, the subnet must have at least 127 IP addresses available.
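The 127-address requirement can be checked against a subnet's CIDR block before you create the deployment. A minimal sketch using Python's stdlib ipaddress module; the three reserved addresses per subnet is an assumption based on typical VCN behavior, so adjust if yours differs:

```python
import ipaddress

def subnet_capacity(cidr: str, reserved: int = 3) -> int:
    """Return the number of usable IP addresses in a subnet.

    Assumes a few addresses per subnet are reserved by the cloud
    provider (here, 3); adjust the `reserved` count as needed.
    """
    net = ipaddress.ip_network(cidr)
    return net.num_addresses - reserved

# A /24 subnet comfortably exceeds the 127-address requirement:
print(subnet_capacity("10.0.0.0/24"))  # 253
# A /25 falls just short once reserved addresses are subtracted:
print(subnet_capacity("10.0.0.0/25"))  # 125
```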
You can create and run custom networking model deployments using the Console, the OCI Python SDK, the OCI CLI, or the Data Science API.
- Use the Console to sign in to a tenancy with the necessary policies.
- Open the navigation menu and select Analytics & AI. Under Machine Learning, select Data Science.
- Select the compartment that contains the project that you want to create the model deployment in.
  All projects in the compartment are listed.
- Select the name of the project.
  The project details page opens and lists the notebook sessions.
- Under Resources, select Model deployments.
  A tabular list of model deployments in the project is displayed.
- Select Create model deployment.
- (Optional) Enter a unique name for the model deployment (limit of 255 characters). If you don't provide a name, a name such as modeldeployment20200108222435 is generated automatically.
- (Optional) Enter a description (limit of 400 characters) for the model deployment.
- (Optional) Under Default configuration, enter a custom environment variable key and corresponding value. Select + Additional custom environment key to add more environment variables.
- In the Models section, select Select to choose an active model to deploy from the model catalog.
  - Find a model by using the default compartment and project, or by selecting Using OCID and searching for the model by entering its OCID.
  - Select the model.
  - Select Submit.
  Important: Model artifacts that exceed 400 GB aren't supported for deployment. Select a smaller model artifact for deployment.
- (Optional) Change the Compute shape by selecting Change shape, and then follow these steps in the Select compute panel:
  - Select an instance type.
  - Select a shape series.
  - Select one of the supported Compute shapes in the series.
  - Select the shape that best suits how you want to use the resource. For an AMD shape, you can use the default or set the number of OCPUs and memory.
    For each OCPU, select up to 64 GB of memory, to a maximum total of 512 GB. The minimum amount of memory allowed is either 1 GB or a value matching the number of OCPUs, whichever is greater.
  - Select Select shape.
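The flexible-shape sizing rules above (at most 64 GB of memory per OCPU, a 512 GB total cap, and a minimum of 1 GB or one GB per OCPU, whichever is greater) can be sketched as a quick validation helper. `valid_flex_memory` is an illustrative name for this document, not part of any OCI SDK:

```python
def valid_flex_memory(ocpus: int, memory_gb: int) -> bool:
    """Check flexible-shape sizing: at most 64 GB per OCPU, at most
    512 GB total, and at least max(1, ocpus) GB of memory."""
    if ocpus < 1:
        return False
    return max(1, ocpus) <= memory_gb <= min(64 * ocpus, 512)

print(valid_flex_memory(1, 16))    # True
print(valid_flex_memory(4, 512))   # False: 4 OCPUs cap at 256 GB
print(valid_flex_memory(8, 512))   # True: 8 * 64 = 512, at the cap
```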
- Enter the number of instances for the model deployment to replicate the model on.
- Select Custom networking to configure the network type.
  Select the VCN and subnet that you want to use for the model deployment. If you don't see the VCN or subnet that you want to use, select Change Compartment, and then select the compartment that contains the VCN or subnet.
  Note: Changing from default networking to custom networking is allowed. After custom networking is selected, it can't be changed back to default networking.
- Select one of the following options to configure the endpoint type:
  - Public endpoint: Data access in a managed instance from outside a VCN.
  - Private endpoint: The private endpoint that you want to use for the model deployment.
    If you select Private endpoint, select a private endpoint from Private Endpoint in Data Science. Select Change compartment to select the compartment that contains the private endpoint.
- (Optional) If you configured access or predict logging, in the Logging section, select Select, and then follow these steps:
  - For access logs, select a compartment, log group, and log name.
  - For predict logs, select a compartment, log group, and log name.
  - Select Submit.
- (Optional) Select Show Advanced Options to add tags.
- (Optional) Select the serving mode for the model deployment, either as an HTTPS endpoint or using a Streaming service stream.
- (Optional) Select the load balancing bandwidth in Mbps, or use the default of 10 Mbps.
  Tips for load balancing:
  If you know the common payload size and the frequency of requests per second, you can use the following formula to estimate the bandwidth of the load balancer that you need. We recommend that you add an extra 20% to account for estimation errors and sporadic peak traffic.
  (Payload size in KB) * (Estimated requests per second) * 8 / 1024
  For example, if the payload is 1,024 KB and you estimate 120 requests per second, then the recommended load balancer bandwidth is (1024 * 120 * 8 / 1024) * 1.2 = 1152 Mbps.
  Keep in mind that the maximum supported payload size is 10 MB when dealing with image payloads.
  If the request payload size exceeds the allocated load balancer bandwidth, the request is rejected with a 429 status code.
- (Optional) Select Use a custom container image and enter the following:
  - Repository in <tenancy>: The repository that contains the custom image.
  - Image: The custom image to use in the model deployment at runtime.
  - CMD: More commands to run when the container starts. Add one instruction per text box. For example, if CMD is ["--host", "0.0.0.0"], then pass --host in one text box and 0.0.0.0 in another. Don't use quotation marks.
  - Entrypoint: One or more entry point files to run when the container starts. For example, /opt/script/entrypoint.sh. Don't use quotation marks.
  - Server port: The port that the web server serving the inference is running on. The default is 8080. The port can be anything between 1024 and 65535. Don't use ports 24224, 8446, or 8447.
  - Health check port: The port that the container HEALTHCHECK listens on. Defaults to the server port. The port can be anything between 1024 and 65535. Don't use ports 24224, 8446, or 8447.
- (Optional) Select the Tags tab, and then enter the tag namespace (for a defined tag), key, and value to assign tags to the resource.
  To add more than one tag, select Add tag.
  Tagging describes the various tags that you can use to organize and find resources, including cost-tracking tags.
- Select Create.
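The load-balancer sizing formula from the tips above can be wrapped in a small helper; the 20% headroom factor follows the recommendation in the text:

```python
def recommended_bandwidth_mbps(payload_kb: float, requests_per_sec: float,
                               headroom: float = 1.2) -> float:
    """Estimate required load balancer bandwidth in Mbps.

    (payload size in KB) * (requests per second) * 8 bits / 1024 gives
    Mbps; the headroom factor adds the recommended 20% safety margin.
    """
    return payload_kb * requests_per_sec * 8 / 1024 * headroom

# The worked example from the text: 1,024 KB payloads at 120 requests/s.
print(recommended_bandwidth_mbps(1024, 120))  # 1152.0
```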
You can use the OCI CLI to create a model deployment as in this example.
- Deploy the model with:

  oci data-science model-deployment create \
    --compartment-id <MODEL_DEPLOYMENT_COMPARTMENT_OCID> \
    --model-deployment-configuration-details file://<MODEL_DEPLOYMENT_CONFIGURATION_FILE> \
    --project-id <PROJECT_OCID> \
    --category-log-details file://<OPTIONAL_LOGGING_CONFIGURATION_FILE> \
    --display-name <MODEL_DEPLOYMENT_NAME>
- Use this model deployment JSON configuration file:

  {
    "deploymentType": "SINGLE_MODEL",
    "modelConfigurationDetails": {
      "bandwidthMbps": <YOUR_BANDWIDTH_SELECTION>,
      "instanceConfiguration": {
        "subnetId": <YOUR_SUBNET_ID>,
        "instanceShapeName": "<YOUR_VM_SHAPE>"
      },
      "modelId": "<YOUR_MODEL_OCID>",
      "scalingPolicy": {
        "instanceCount": <YOUR_INSTANCE_COUNT>,
        "policyType": "FIXED_SIZE"
      }
    }
  }
  If you're specifying an environment configuration, you must include the environmentConfigurationDetails object, as in this example:

  {
    "modelDeploymentConfigurationDetails": {
      "deploymentType": "SINGLE_MODEL",
      "modelConfigurationDetails": {
        "modelId": "ocid1.datasciencemodel.oc1.iad........",
        "instanceConfiguration": {
          "subnetId": <YOUR_SUBNET_ID>,
          "instanceShapeName": "VM.Standard.E4.Flex",
          "modelDeploymentInstanceShapeConfigDetails": {
            "ocpus": 1,
            "memoryInGBs": 16
          }
        },
        "scalingPolicy": {
          "policyType": "FIXED_SIZE",
          "instanceCount": 1
        },
        "bandwidthMbps": 10
      },
      "environmentConfigurationDetails": {
        "environmentConfigurationType": "OCIR_CONTAINER",
        "image": "iad.ocir.io/testtenancy/image_name:1.0.0",
        "entrypoint": ["python", "/opt/entrypoint.py"],
        "serverPort": "5000",
        "healthCheckPort": "5000"
      },
      "streamConfigurationDetails": {
        "inputStreamIds": null,
        "outputStreamIds": null
      }
    }
  }
- (Optional) Use this logging JSON configuration file:

  {
    "access": {
      "logGroupId": "<YOUR_LOG_GROUP_OCID>",
      "logId": "<YOUR_LOG_OCID>"
    },
    "predict": {
      "logGroupId": "<YOUR_LOG_GROUP_OCID>",
      "logId": "<YOUR_LOG_OCID>"
    }
  }
- (Optional) To use a custom container, run:

  oci data-science model-deployment create \
    --compartment-id <MODEL_DEPLOYMENT_COMPARTMENT_OCID> \
    --model-deployment-configuration-details file://<MODEL_DEPLOYMENT_CONFIGURATION_FILE> \
    --project-id <PROJECT_OCID> \
    --category-log-details file://<OPTIONAL_LOGGING_CONFIGURATION_FILE> \
    --display-name <MODEL_DEPLOYMENT_NAME>
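Rather than writing the configuration file by hand, you can generate it with a short script. This sketch uses only the Python standard library; the OCID values are placeholders that you would replace with your own before passing the file to the CLI:

```python
import json

# Placeholder values -- replace with your own OCIDs, shape, and sizing
# before using this file with `oci data-science model-deployment create`.
config = {
    "deploymentType": "SINGLE_MODEL",
    "modelConfigurationDetails": {
        "bandwidthMbps": 10,
        "instanceConfiguration": {
            "subnetId": "<YOUR_SUBNET_ID>",
            "instanceShapeName": "VM.Standard.E4.Flex",
        },
        "modelId": "<YOUR_MODEL_OCID>",
        "scalingPolicy": {"instanceCount": 1, "policyType": "FIXED_SIZE"},
    },
}

# Write the file referenced by --model-deployment-configuration-details.
with open("model_deployment_config.json", "w") as f:
    json.dump(config, f, indent=2)
print("wrote model_deployment_config.json")
```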
You can also use the Data Science API directly: call the CreateModelDeployment operation to create a model deployment with custom networking, setting the subnet ID as described in the Instance Configuration API documentation.