Using an OpenSearch Pretrained Model
OCI Search with OpenSearch provides built-in support for OpenSearch pretrained models.
This topic describes how to register and deploy any of the pretrained Hugging Face sentence transformer models to a cluster by specifying only the model name. See Pretrained Models for the list of approved models. For an end-to-end procedure on how to use an OpenSearch pretrained model for semantic search in OCI Search with OpenSearch, see Semantic Search Walkthrough.
Prerequisites
Before you start, you need to do the following:
Select one of the pretrained models supported by OCI Search with OpenSearch.
Confirm that the OpenSearch cluster is version 2.11 or newer.
Update the cluster settings to perform semantic search. The following example includes the recommended settings:
PUT _cluster/settings
{
  "persistent": {
    "plugins": {
      "ml_commons": {
        "only_run_on_ml_node": "false",
        "model_access_control_enabled": "true",
        "native_memory_threshold": "99",
        "rag_pipeline_feature_enabled": "true",
        "memory_feature_enabled": "true",
        "allow_registering_model_via_local_file": "true",
        "allow_registering_model_via_url": "true",
        "model_auto_redeploy.enable": "true",
        "model_auto_redeploy.lifetime_retry_times": 10
      }
    }
  }
}
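If you prefer to script this step, the same settings can be applied from a Python client. The following is a minimal sketch, assuming the `requests` package is available and that `base_url` and `auth` are placeholders for your cluster endpoint and credentials:

```python
import json

# Recommended ml_commons cluster settings from above, expressed as a dict.
settings_body = {
    "persistent": {
        "plugins": {
            "ml_commons": {
                "only_run_on_ml_node": "false",
                "model_access_control_enabled": "true",
                "native_memory_threshold": "99",
                "rag_pipeline_feature_enabled": "true",
                "memory_feature_enabled": "true",
                "allow_registering_model_via_local_file": "true",
                "allow_registering_model_via_url": "true",
                "model_auto_redeploy.enable": "true",
                "model_auto_redeploy.lifetime_retry_times": 10,
            }
        }
    }
}

def apply_settings(base_url, auth):
    """PUT the settings to the cluster (requires the `requests` package)."""
    import requests
    resp = requests.put(f"{base_url}/_cluster/settings",
                        json=settings_body, auth=auth)
    resp.raise_for_status()
    return resp.json()

print(json.dumps(settings_body["persistent"]["plugins"]["ml_commons"], indent=2))
```

Note that the boolean settings are passed as strings, matching the request body shown above.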
Step 1: Register the Model Group
Model groups enable you to manage access to specific models. Registering a model group is optional; however, if you don't register a model group, ML Commons registers a new model group for you, so we recommend that you register the model group.
Register a model group using the register operation in the Model Group APIs, as shown in the following example:
POST /_plugins/_ml/model_groups/_register
{
"name": "new_model_group",
"description": "A model group for local models"
}
Make note of the model_group_id
returned in the response:
{
"model_group_id": "<modelgroupID>",
"status": "CREATED"
}
Step 2: Register the Model
The register request requires the following values:
model_group_id: If you completed Step 1, the value returned for model_group_id, passed to the _register request.
name: The model name for the pretrained model you want to use.
version: The version number for the pretrained model you want to use.
model_format: The format for the model, either TORCH_SCRIPT or ONNX.
Register the model using the register operation from the Model APIs, as shown in the following example:
POST /_plugins/_ml/models/_register
{
"name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
"version": "1.0.2",
"model_group_id": "TOD4Zo0Bb0cPUYbzwcpD",
"model_format": "TORCH_SCRIPT"
}
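A client submitting this register request typically captures the task_id from the response for the polling that follows. A minimal sketch, assuming the `requests` package and a placeholder model_group_id:

```python
import json

# Register request body for the pretrained model. model_group_id is the
# value returned in Step 1 (placeholder shown here).
register_body = {
    "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
    "version": "1.0.2",
    "model_group_id": "<modelgroupID>",
    "model_format": "TORCH_SCRIPT",
}

def register_model(base_url, auth, body):
    """POST the register request and return the task_id to poll."""
    import requests
    resp = requests.post(f"{base_url}/_plugins/_ml/models/_register",
                         json=body, auth=auth)
    resp.raise_for_status()
    return resp.json()["task_id"]

print(json.dumps(register_body, indent=2))
```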
Make note of the task_id returned in the response; you can use the task_id to check the status of the operation. For example, consider the following response:
{
"task_id": "TuD6Zo0Bb0cPUYbz3Moz",
"status": "CREATED"
}
To check the status of the register operation, use the task_id with the Get operation of the Tasks APIs, as shown in the following example:
GET /_plugins/_ml/tasks/TuD6Zo0Bb0cPUYbz3Moz
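Because both the register and deploy operations are asynchronous, a client usually polls the task until it completes. The following is a sketch of such a polling helper; the `get_task` callable is a hypothetical stand-in for whatever function fetches GET /_plugins/_ml/tasks/&lt;task_id&gt; and returns the parsed JSON:

```python
import time

def wait_for_task(get_task, task_id, timeout=300, interval=5):
    """Poll a task until its state is COMPLETED, returning the final task
    document. Raises on FAILED/COMPLETED_WITH_ERROR states or on timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        task = get_task(task_id)
        state = task.get("state")
        if state == "COMPLETED":
            return task
        if state in ("FAILED", "COMPLETED_WITH_ERROR"):
            raise RuntimeError(f"task {task_id} ended in state {state}: {task}")
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not complete within {timeout}s")

# Demonstration with a stub fetcher that "completes" on the second poll:
responses = iter([
    {"state": "RUNNING"},
    {"state": "COMPLETED", "model_id": "iDf6Zo0BkIugivXi3E7z"},
])
result = wait_for_task(lambda tid: next(responses), "TuD6Zo0Bb0cPUYbz3Moz",
                       interval=0)
print(result["model_id"])  # → iDf6Zo0BkIugivXi3E7z
```

The same helper can be reused for the deploy task in Step 3.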
When the register operation is complete, the state value in the response to the Get operation is COMPLETED, as shown in the following example:
{
"model_id": "iDf6Zo0BkIugivXi3E7z",
"task_type": "REGISTER_MODEL",
"function_name": "TEXT_EMBEDDING",
"state": "COMPLETED",
"worker_node": [
"3qSqVfK2RvGJv1URKfS1bw"
],
"create_time": 1706829732915,
"last_update_time": 1706829780094,
"is_async": true
}
Make note of the model_id
value returned in the response to use when you
deploy the model.
Step 3: Deploy the Model
After the register operation completes for the model, you can deploy the model to the cluster using the deploy operation of the Model APIs, passing the model_id from the Get operation response in the previous step, as shown in the following example:
POST /_plugins/_ml/models/iDf6Zo0BkIugivXi3E7z/_deploy
Make note of the task_id returned in the response; you can use the task_id to check the status of the operation. For example, consider the following response:
{
"task_id": "T-D7Zo0Bb0cPUYbz-cod",
"task_type": "DEPLOY_MODEL",
"status": "CREATED"
}
To check the status of the deploy operation, use the task_id with the Get operation of the Tasks APIs, as shown in the following example:
GET /_plugins/_ml/tasks/T-D7Zo0Bb0cPUYbz-cod
When the deploy operation is complete, the state value in the response to the Get operation is COMPLETED, as shown in the following example:
{
"model_id": "iDf6Zo0BkIugivXi3E7z",
"task_type": "DEPLOY_MODEL",
"function_name": "TEXT_EMBEDDING",
"state": "COMPLETED",
"worker_node": [
"3qSqVfK2RvGJv1URKfS1bw"
],
"create_time": 1706829732915,
"last_update_time": 1706829780094,
"is_async": true
}
Step 4: Test the Model
Use the Predict API to test the model, as shown in the following example for a text embedding model:
POST /_plugins/_ml/_predict/text_embedding/<your_embedding_model_ID>
{
"text_docs": ["today is sunny"],
"return_number": true,
"target_response": ["sentence_embedding"]
}
The response contains text embeddings for the provided sentence, as shown in the following response example:
{
"inference_results" : [
{
"output" : [
{
"name" : "sentence_embedding",
"data_type" : "FLOAT32",
"shape" : [
768
],
"data" : [
0.25517133,
-0.28009856,
0.48519906,
...
]
}
]
}
]
}
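To use the embeddings downstream, a client needs to pull the vector out of this nested response. The following is a minimal sketch that parses a response shaped like the example above (truncated here to three dimensions for illustration):

```python
# Example Predict API response, truncated to three embedding dimensions.
predict_response = {
    "inference_results": [
        {
            "output": [
                {
                    "name": "sentence_embedding",
                    "data_type": "FLOAT32",
                    "shape": [768],
                    "data": [0.25517133, -0.28009856, 0.48519906],
                }
            ]
        }
    ]
}

def extract_embedding(response):
    """Return the first sentence_embedding vector from a Predict response."""
    for result in response["inference_results"]:
        for output in result["output"]:
            if output["name"] == "sentence_embedding":
                return output["data"]
    raise KeyError("no sentence_embedding in response")

vector = extract_embedding(predict_response)
print(len(vector), vector[0])  # → 3 0.25517133
```

In a real response the data list contains all 768 values indicated by the shape field.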