Semantic Search in OCI OpenSearch
OCI Search with OpenSearch supports semantic search starting with OpenSearch version 2.11.
With semantic search, search engines use the context and content of a query to understand its meaning, rather than relying on content that matches the query's keywords, as keyword search does. OpenSearch implements semantic search using neural search, a technique that uses large language models to understand the relationships between terms. For more information about neural search in OpenSearch, see Neural search tutorial.
Using Neural Search in OCI Search with OpenSearch
To use neural search for semantic search in OCI Search with OpenSearch, you need to:
1. Register and deploy your choice of model to the cluster.
2. Create an index and set up an ingestion pipeline using the deployed model. Use the ingestion pipeline to ingest documents into the index.
3. Run semantic search queries on the index using either hybrid search or neural search.
Prerequisites
To use semantic search, the OpenSearch version for the cluster must be 2.11 or newer. By default, new clusters use version 2.11. See Creating an OpenSearch Cluster.
For existing clusters configured for version 2.3, you can perform an inline upgrade to version 2.11. For more information, see Inline Upgrade for OpenSearch Clusters.
To upgrade existing clusters configured for version 1.2.3 to 2.11, you need to use the upgrade process described in Upgrading an OpenSearch Cluster.
Before you start setting up the model for semantic search, you need to complete the prerequisites, which include specifying the applicable IAM policy if required, and configuring the recommended cluster settings.
IAM Policy for Custom Models and Generative AI Connectors
If you're using one of the pretrained models hosted within OCI Search with OpenSearch, you don't need to configure permissions, and you can skip to the next prerequisite, Cluster Settings for Semantic Search. See also Semantic Search Walkthrough.
Otherwise, you need to create a policy to grant the required access.
If you're new to policies, see Getting Started with Policies and Common Policies.
IAM Policy for Custom Models
If you're using a custom model, you need to grant OCI Search with OpenSearch access to the Object Storage bucket that contains the model.
The following policy example includes the required permission:
ALLOW ANY-USER to manage object-family in tenancy WHERE ALL {request.principal.type='opensearchcluster', request.resource.compartment.id='<cluster_compartment_id>'}
IAM Policy for Generative AI Connectors
If you're using a Generative AI connector, you need to grant OCI Search with OpenSearch access to Generative AI resources.
The following policy example includes the required permission:
ALLOW ANY-USER to manage generative-ai-family in tenancy WHERE ALL {request.principal.type='opensearchcluster', request.resource.compartment.id='<cluster_compartment_id>'}
Regions for Generative AI Connectors
To use OCI Generative AI, your tenancy must be subscribed to the US Midwest (Chicago) region or the Germany Central (Frankfurt) region. You don't need to create the cluster in either of those regions, just ensure that your tenancy is subscribed to one of the regions.
Cluster Settings for Semantic Search
Use the settings operation of the Cluster APIs to configure the recommended cluster settings for semantic search. The following example includes the recommended settings:
PUT _cluster/settings
{
  "persistent": {
    "plugins": {
      "ml_commons": {
        "only_run_on_ml_node": "false",
        "model_access_control_enabled": "true",
        "native_memory_threshold": "99",
        "rag_pipeline_feature_enabled": "true",
        "memory_feature_enabled": "true",
        "allow_registering_model_via_local_file": "true",
        "allow_registering_model_via_url": "true",
        "model_auto_redeploy.enable": "true",
        "model_auto_redeploy.lifetime_retry_times": 10
      }
    }
  }
}
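To confirm that the settings were applied, you can read them back with the standard cluster settings Get request; the optional flat_settings parameter, which flattens the nested keys, is shown here as a convenience:
GET _cluster/settings?flat_settings=true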
Setting up a Model
The first step when configuring neural search is setting up the large language model you want to use. The model is used to generate vector embeddings from text fields.
Register a Model Group
Model groups enable you to manage access to specific models. Registering a model group is optional; however, if you don't register one, ML Commons registers a new model group for you, so we recommend that you register the model group yourself.
Register a model group using the register operation in the Model Group APIs, as shown in the following example:
POST /_plugins/_ml/model_groups/_register
{
  "name": "new_model_group",
  "description": "A model group for local models"
}
Make note of the model_group_id returned in the response:
{
  "model_group_id": "<model_group_ID>",
  "status": "CREATED"
}
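If you need to recover the ID later, you can look up the model group by name using the search operation of the Model Group APIs; a minimal sketch:
POST /_plugins/_ml/model_groups/_search
{
  "query": {
    "match": {
      "name": "new_model_group"
    }
  }
}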
Register the Model to the Model Group
Register the model using the register operation from the Model APIs. The content of the POST request to the register operation depends on the type of model you're using.
- Option 1: Built-in OpenSearch pretrained models
  Several pretrained sentence transformer models are available for you to register and deploy directly to a cluster, without needing to download and then upload them manually into a private storage bucket, as the custom models option requires. With this option, when you register a pretrained model, you only need the model's model_group_id, name, version, and model_format. See Using an OpenSearch Pretrained Model for how to use a pretrained model. A minimal example register request for this option is sketched after this list.
- Option 2: Custom models
  You need to pass the Object Storage URL in the actions section of the register operation, as follows:
  POST /_plugins/_ml/models/_register
  {
    .....
    "actions": [
      {
        "method": "GET",
        "action_type": "DOWNLOAD",
        "url": "<Object_Storage_URL_Path>"
      }
    ]
  }
  For a complete example of a register operation, see Custom Models - 2: Register the Model.
- Option 3: Generative AI connector
  To use a Generative AI connector to register a remote embedding model such as the cohere.embed-english-v3.0 model, you need to create a connector first and then register the model, using the following steps:
  - Create a connector to the Cohere embedding model:
    POST /_plugins/_ml/connectors/_create
    {
      "name": "OCI GenAI Chat Connector cohere-embed-v5",
      "description": "The connector to public Cohere model service for embed",
      "version": "2",
      "protocol": "oci_sigv1",
      "parameters": {
        "endpoint": "inference.generativeai.us-chicago-1.oci.oraclecloud.com",
        "auth_type": "resource_principal",
        "model": "cohere.embed-english-v3.0",
        "input_type": "search_document",
        "truncate": "END"
      },
      "credential": {
      },
      "actions": [
        {
          "action_type": "predict",
          "method": "POST",
          "url": "https://${parameters.endpoint}/20231130/actions/embedText",
          "request_body": "{ \"inputs\":[\"${parameters.passage_text}\"], \"truncate\": \"${parameters.truncate}\", \"compartmentId\": \"<compartment_ID>\", \"servingMode\": { \"modelId\": \"${parameters.model}\", \"servingType\": \"ON_DEMAND\" } }",
          "pre_process_function": "return '{\"parameters\": {\"passage_text\": \"' + params.text_docs[0] + '\"}}';",
          "post_process_function": "connector.post_process.cohere.embedding"
        }
      ]
    }
    The response:
    {
      "connector_id": "<connector_ID>"
    }
  - Register the model:
    POST /_plugins/_ml/models/_register
    {
      "name": "oci-genai-embed-model",
      "function_name": "remote",
      "model_group_id": "<model_group_ID>",
      "description": "test semantic",
      "connector_id": "<connector_ID>"
    }
To use a dedicated Generative AI model endpoint, reconfigure the connector payload with the following changes:
- Use endpointId instead of modelId, and then specify the dedicated model endpoint's OCID instead of the model name. For example, change:
  \"modelId\": \"${parameters.model}\"
  to:
  \"endpointId\": \"<dedicated_model_endpoint_OCID>\"
- Change servingType from ON_DEMAND to DEDICATED. For example, change:
  \"servingType\": \"ON_DEMAND\"
  to:
  \"servingType\": \"DEDICATED\"
Make note of the task_id returned in the response to the register operation; you can use the task_id with the Get operation of the Tasks APIs to check the status of the register operation, as shown in the following example:
GET /_plugins/_ml/tasks/<task_ID>
When the register operation is complete, the state value in the response to the Get operation is COMPLETED, as shown in the following example:
{
  "model_id": "<embedding_model_ID>",
  "task_type": "REGISTER_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED",
  "worker_node": [
    "3qSqVfK2RvGJv1URKfS1bw"
  ],
  "create_time": 1706829732915,
  "last_update_time": 1706829780094,
  "is_async": true
}
Make note of the model_id value returned in the response to use when you deploy the model.
Deploy the Model
After the register operation is complete, you can deploy the model to the cluster using the deploy operation of the Model APIs, passing the model_id from the Get operation response in the previous step, as shown in the following example:
POST /_plugins/_ml/models/<embedding_model_ID>/_deploy
Make note of the task_id returned in the response; you can use the task_id to check the status of the operation. For example, from the following response:
{
  "task_id": "<task_ID>",
  "task_type": "DEPLOY_MODEL",
  "status": "CREATED"
}
To check the status of the deploy operation, use the task_id with the Get operation of the Tasks APIs, as shown in the following example:
GET /_plugins/_ml/tasks/<task_ID>
When the deploy operation is complete, the state value in the response to the Get operation is COMPLETED.
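As an optional extra check, you can also retrieve the model itself with the Get operation of the Model APIs; a deployed model should report a model_state of DEPLOYED:
GET /_plugins/_ml/models/<embedding_model_ID>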
Ingest Data
After the model is deployed, the next step is to ingest data: create an ingestion pipeline that uses the deployed model to generate vector embeddings from text fields, create an index that uses the pipeline, and then ingest documents into the index.
Create Ingestion Pipeline
Using the model ID of the deployed model, create an ingestion pipeline, as shown in the following example:
PUT _ingest/pipeline/test-nlp-pipeline
{
  "description": "An example neural search pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "<model_ID>",
        "field_map": {
          "passage_text": "passage_embedding"
        }
      }
    }
  ]
}
The ingestion pipeline defines a processor and the field mappings (in this case, "passage_text" → "passage_embedding"). This means that if you use this pipeline on an index to ingest data, the pipeline automatically finds the "passage_text" field, uses the pipeline's model to generate the corresponding embeddings, and writes them to the "passage_embedding" field before indexing.
Remember that "passage_text" and "passage_embedding" are user defined and can be anything you want. Ensure that you're consistent with this naming when you create the index where you plan to use the pipeline, so that the pipeline processor can map the fields as described.
Create an Index
During index creation, you can specify the pipeline you want to use to ingest documents into the index.
The following example API call shows how to create an index using the test-nlp-pipeline pipeline created in the previous step.
PUT /test-index
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "test-nlp-pipeline"
  },
  "mappings": {
    "properties": {
      "passage_embedding": {
        "type": "knn_vector",
        "dimension": <model_dimension>,
        "method": {
          "name": "hnsw",
          "engine": "lucene",
          "space_type": "l2",
          "parameters": {
            "m": 512,
            "ef_construction": 245
          }
        }
      },
      "passage_text": {
        "type": "text"
      }
    }
  }
}
When creating the index, you also need to specify which approximate nearest neighbor (ANN) library implementation you want to use. OCI Search with OpenSearch supports the NMSLIB, Faiss, and Lucene libraries; for more information, see Search Engines. The previous example uses the Lucene engine.
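If you choose Faiss instead, only the method block of the knn_vector mapping changes. A minimal sketch of an equivalent index follows, with illustrative parameter values:
PUT /test-index-faiss
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "test-nlp-pipeline"
  },
  "mappings": {
    "properties": {
      "passage_embedding": {
        "type": "knn_vector",
        "dimension": <model_dimension>,
        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "space_type": "l2",
          "parameters": {
            "m": 16,
            "ef_construction": 128
          }
        }
      },
      "passage_text": {
        "type": "text"
      }
    }
  }
}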
Ingest Data into Index
After successfully creating the index, you can ingest data into it, as shown in the following example:
POST /test-index/_doc/1
{
"passage_text": "there are many sharks in the ocean"
}
POST /test-index/_doc/2
{
"passage_text": "fishes must love swimming"
}
POST /test-index/_doc/3
{
"passage_text": "summers are usually very hot"
}
POST /test-index/_doc/4
{
"passage_text": "florida has a nice weather all year round"
}
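For larger document sets, the standard _bulk API works with the same pipeline, because the index's default_pipeline also applies to bulk writes; a minimal sketch:
POST /test-index/_bulk
{ "index": { "_id": "5" } }
{ "passage_text": "whales are the largest mammals in the ocean" }
{ "index": { "_id": "6" } }
{ "passage_text": "winters in the mountains are very cold" }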
Use a GET request to verify that documents are being ingested correctly and that embeddings are automatically generated during ingestion:
GET /test-index/_doc/3
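Finally, to run the semantic search queries mentioned in step 3 at the start of this topic, you can issue a neural query against the index. The following is a minimal sketch, assuming the deployed model's ID and the field names used above; k is the number of nearest neighbors to return:
GET /test-index/_search
{
  "query": {
    "neural": {
      "passage_embedding": {
        "query_text": "what is the weather like in florida",
        "model_id": "<embedding_model_ID>",
        "k": 2
      }
    }
  }
}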