Semantic Search Walkthrough

Prerequisites

Before you start, you need to do the following:

  • Select one of the pretrained models supported by OCI Search with OpenSearch

  • Confirm that the OpenSearch cluster is version 2.11.

  • Update the cluster settings to perform semantic search. The following example command updates the applicable settings; a request to verify them follows the example:

    PUT _cluster/settings
    {
      "persistent": {
        "plugins": {
          "ml_commons": {
            "only_run_on_ml_node": "false",
            "model_access_control_enabled": "true",
            "native_memory_threshold": "99",
            "rag_pipeline_feature_enabled": "true",
            "memory_feature_enabled": "true",
            "allow_registering_model_via_local_file": "true",
            "allow_registering_model_via_url": "true",
            "model_auto_redeploy.enable":"true",
            "model_auto_redeploy.lifetime_retry_times": 10
          }
        }
      }
    }
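
To confirm that the settings were applied, you can read them back with a standard cluster settings request (a quick sanity check; the exact response shape depends on your cluster version):

GET _cluster/settings?flat_settings=true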

Step 1: Register the Model Group

Use model groups to logically group models and control who gets access to them. Register a model group using the register operation in the Model Group APIs, as shown in the following example:

POST /_plugins/_ml/model_groups/_register
{
  "name": "general pretrained models",
  "description": "A model group for pretrained models hosted by OCI Search with OpenSearch"
}

Make note of the model_group_id returned in the response:

{
  "model_group_id": "<model_group_ID>",
  "status": "CREATED"
}
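
If you lose track of the ID, you can look it up later by name using the search operation in the Model Group APIs. The following is a minimal sketch, assuming the group name used in this walkthrough:

POST /_plugins/_ml/model_groups/_search
{
  "query": {
    "match": {
      "name": "general pretrained models"
    }
  }
}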

Step 2: Register the Model

You have the following three options for semantic search:

  • Option 1: Register and deploy a pretrained model hosted in OCI Search with OpenSearch using the steps described in Using an OpenSearch Pretrained Model. This option is the simplest to use: you don't need to configure any additional IAM policies, and the payload is simpler than the payload for the next option.

  • Option 2: Import, register and deploy an OpenSearch pretrained model using the steps described in Custom Models. This includes uploading the model file to an Object Storage bucket, and then specifying the model file's Object Storage URL when you register the model.

  • Option 3: Create a Generative AI connector to register a remote embedding model such as the cohere.embed-english-v3.0 model. For more information, see Conversational Search with OCI Generative AI.

This walkthrough shows you how to use option 1, a pretrained model.
To register a pretrained model, you need the following:
  • model_group_id: The model group ID returned when you registered the model group in Step 1. Pass this value as model_group_id in the _register request.
  • name: The model name for the pretrained model you want to use.
  • version: The version number for the pretrained model you want to use.
  • model_format: The format for the model, either TORCH_SCRIPT or ONNX.

Register the model using the register operation from the Model APIs, as shown in the following example:

POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
  "version": "1.0.2",
  "model_group_id": "<model_group_ID>",
  "model_format": "TORCH_SCRIPT"
}

Make note of the task_id returned in the response; you can use the task_id to check the status of the operation.

For example, the register request returns a response such as the following:

{
  "task_id": "<task_ID>",
  "status": "CREATED"
}

To check the status of the register operation, use the task_id with the Get operation of the Tasks APIs, as shown in the following example:

GET /_plugins/_ml/tasks/<task_ID>

When the register operation is complete, the state value in the response to the Get operation is COMPLETED, as shown in the following example:

{
  "model_id": "<embedding_model_ID>",
  "task_type": "REGISTER_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED",
  "worker_node": [
    "f2b_8-mVRVyVqeKqsA7dcQ"
  ],
  "create_time": 1706831015570,
  "last_update_time": 1706831070740,
  "is_async": true
}

Make note of the model_id value returned in the response to use when you deploy the model.
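
Optionally, you can also confirm the registration by retrieving the model directly with the Get operation of the Model APIs (a quick sanity check; the response fields vary by version):

GET /_plugins/_ml/models/<embedding_model_ID>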

Step 3: Deploy the Model

After the register operation is complete, you can deploy the model to the cluster using the deploy operation of the Model APIs, passing the model_id from the Get operation response in the previous step, as shown in the following example:

POST /_plugins/_ml/models/<embedding_model_ID>/_deploy

Make note of the task_id returned in the response; you can use the task_id to check the status of the operation.

For example, the deploy request returns a response such as the following:

{
  "task_id": "<task_ID>",
  "task_type": "DEPLOY_MODEL",
  "status": "CREATED"
}

To check the status of the deploy operation, use the task_id with the Get operation of the Tasks APIs, as shown in the following example:

GET /_plugins/_ml/tasks/<task_ID>

When the deploy operation is complete, the status value in the response to the Get operation is COMPLETED.
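
Before wiring the model into an ingestion pipeline, you can optionally run a test inference against the deployed model using the Predict API. The following is a minimal sketch for a text embedding model; the sample sentence is arbitrary:

POST /_plugins/_ml/_predict/text_embedding/<embedding_model_ID>
{
  "text_docs": ["today is sunny"],
  "return_number": true,
  "target_response": ["sentence_embedding"]
}

The response should contain an embedding whose length matches the model's dimension, which is also the value to use for <model_dimension> when you create the index in Step 5.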

Step 4: Create a k-NN Ingestion Pipeline

After the deploy operation is complete, create an ingestion pipeline that uses the deployed model. The ingestion pipeline uses the model to automatically generate the embedding vectors for each document at ingestion time. The processor handles the embedding for you, so you only need to map the text field in the document to the field that stores the embedding. The following example shows how to create an ingestion pipeline:

PUT _ingest/pipeline/<pipeline_name>
{
  "description": "An example neural search pipeline",
  "processors" : [
    {
      "text_embedding": {
        "model_id": "<embedding_model_ID>",
        "field_map": {
           "<text_field_name>": "<embedding_field_name>"
        }
      }
    }
  ]
}

If the ingestion pipeline was created, the following response is returned:

{
  "acknowledged": true
}
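
Before creating the index, you can optionally test the pipeline with the simulate API. The following sketch reuses one of the sample sentences from this walkthrough as a test document:

POST _ingest/pipeline/<pipeline_name>/_simulate
{
  "docs": [
    {
      "_source": {
        "<text_field_name>": "there are many sharks in the ocean"
      }
    }
  ]
}

If the processor is configured correctly, the simulated document in the response includes a generated <embedding_field_name> vector.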

Step 5: Create an Index

Create an index using the ingestion pipeline created in the previous step. You can use any of the available ANN engines in the index. The following example uses the Lucene Engine:

PUT /lucene-index
{
    "settings": {
        "index.knn": true,
        "default_pipeline": "<pipeline_name>"
    },
    "mappings": {
        "properties": {
            "<embedding_field_name>": {
                "type": "knn_vector",
                "dimension": <model_dimension>,
                "method": {
                    "name":"hnsw",
                    "engine":"lucene",
                    "space_type": "l2",
                    "parameters":{
                        "m":512,
                        "ef_construction": 245
                    }
                }
            },
            "<text_field_name>": {
                "type": "text"
            }
        }
    }
}

The <text_field_name> field in the index mapping must match the text field specified in the ingestion pipeline's field_map. This is how the pipeline knows how to create embeddings and then map them to documents at ingestion time.

To help you choose an engine and understand the configuration parameters available for each engine, see k-NN index and Approximate k-NN search.

The following is an example response for a successful index creation:

{
        "acknowledged": true,
        "shards_acknowledged": true,
        "index": "lucene-index"
}
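
To verify that the knn_vector field was mapped as expected, you can inspect the index mapping:

GET /lucene-index/_mapping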

Step 6: Ingest Documents into the Index

Ingest data into your index as shown in the following example:

POST /lucene-index/_doc/1
{
  "<text_field_name>": "there are many sharks in the ocean"
}
 
POST /lucene-index/_doc/2
{
  "<text_field_name>": "fishes must love swimming"
}
 
POST /lucene-index/_doc/3
{
  "<text_field_name>": "summers are usually very hot"
}
 
POST /lucene-index/_doc/4
{
  "<text_field_name>": "florida has a nice weather all year round"
}

Response:

# POST /lucene-index/_doc/1
{
  "_index": "lucene-index",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
# POST /lucene-index/_doc/2
{
  "_index": "lucene-index",
  "_id": "2",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 1,
  "_primary_term": 1
}
# POST /lucene-index/_doc/3
{
  "_index": "lucene-index",
  "_id": "3",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 2,
  "_primary_term": 1
}
# POST /lucene-index/_doc/4
{
  "_index": "lucene-index",
  "_id": "4",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 3,
  "_primary_term": 1
}
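
The single-document requests above are convenient for a walkthrough. For larger datasets, the Bulk API is the usual approach; because the index sets default_pipeline, bulk-indexed documents also pass through the ingestion pipeline. The following is a minimal sketch with two hypothetical sample documents:

POST /lucene-index/_bulk
{ "index": { "_id": "5" } }
{ "<text_field_name>": "scuba diving is popular in tropical waters" }
{ "index": { "_id": "6" } }
{ "<text_field_name>": "winters in the north are long and cold" }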

Run the following to check one of the documents to ensure that embeddings are generated correctly:

GET /lucene-index/_doc/3

Response:

{
  "_index": "lucene-index",
  "_id": "3",
  "_version": 1,
  "_seq_no": 2,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "<embedding_field_list>": [
      -0.1254959,
      -0.3151774,
      0.03526799,
      0.39322096,
      -0.0475556,
      -0.12378334,
      -0.032554347,
      0.4033895,
      0.050718695,
      -0.3587931,
      0.097042784,
      0.11742551,
      -0.06573639,
      0.14252506,
      -0.466573,
      0.56093556,
      -0.2815812,
      -0.00016521096,
      -0.2858566,
      ...
    ],
    "<text_field_name>": "summers are usually very hot"
  }
}

Performing Semantic Search

Semantic Search with Hybrid Search

Using the model ID from Step 2: Register the Model, perform semantic search with hybrid search as follows:

GET lucene-index/_search
{
  "query": {
    "bool" : {
      "should" : [
        {
          "script_score": {
            "query": {
              "neural": {
                "<embedding_field_name>": {
                  "query_text": "what are temperatures in miami like",
                  "model_id": "<embedding_model_ID>",
                  "k": 2
                }
              }
            },
            "script": {
              "source": "_score * 1.5"
            }
          }
        }
      ]
    }
  },
  "fields": [
    "<text_field_name>"
  ],
  "_source": false
}

Response:

{
  "took": 343,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 0.032253794,
    "hits": [
      {
        "_index": "lucene-index",
        "_id": "4",
        "_score": 0.032253794,
        "fields": {
          "<text_field_name>": [
            "florida has a nice weather all year round"
          ]
        }
      },
      {
        "_index": "lucene-index",
        "_id": "3",
        "_score": 0.031487543,
        "fields": {
          "<text_field_name>": [
            "summers are usually very hot"
          ]
        }
      }
    ]
  }
}

You can see that the query doesn't mention Florida, weather, or summer; however, the model infers the semantic meaning of temperatures and miami to return the most relevant answers.
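
The preceding example boosts a single neural clause with script_score. If you also want lexical matching to contribute to the score, one common variation (a sketch, not the only approach) adds a match clause on the text field to the same bool query:

GET lucene-index/_search
{
  "query": {
    "bool" : {
      "should" : [
        {
          "match": {
            "<text_field_name>": {
              "query": "what are temperatures in miami like"
            }
          }
        },
        {
          "script_score": {
            "query": {
              "neural": {
                "<embedding_field_name>": {
                  "query_text": "what are temperatures in miami like",
                  "model_id": "<embedding_model_ID>",
                  "k": 2
                }
              }
            },
            "script": {
              "source": "_score * 1.5"
            }
          }
        }
      ]
    }
  },
  "fields": [
    "<text_field_name>"
  ],
  "_source": false
}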

Semantic Search with Neural Search

Using the model ID from Step 2: Register the Model, perform semantic search with neural search as follows:

GET /lucene-index/_search
{
  "_source": {
    "excludes": [
      "<embedding_field_name>"
    ]
  },
  "query": {
    "neural": {
      "<embedding_field_name>": {
        "query_text": "good climate",
        "model_id": "<embedding_model_ID>",
        "k": 5
      }
    }
  }
}

Response:

{
  "took": 36,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 4,
      "relation": "eq"
    },
    "max_score": 0.012060728,
    "hits": [
      {
        "_index": "lucene-index",
        "_id": "4",
        "_score": 0.012060728,
        "_source": {
          "passage_text": "florida has a nice weather all year round"
        }
      },
      {
        "_index": "lucene-index",
        "_id": "1",
        "_score": 0.010120423,
        "_source": {
          "passage_text": "there are many sharks in the ocean"
        }
      },
      {
        "_index": "lucene-index",
        "_id": "3",
        "_score": 0.00985171,
        "_source": {
          "passage_text": "summers are usually very hot"
        }
      },
      {
        "_index": "lucene-index",
        "_id": "2",
        "_score": 0.009575767,
        "_source": {
          "passage_text": "fishes must love swimming"
        }
      }
    ]
  }
}