Semantic Search Walkthrough
Prerequisites
Before you start, you need to do the following:
- Select one of the pretrained models supported by OCI Search with OpenSearch.
- Confirm that the OpenSearch cluster is version 2.11.
- Update the cluster settings to perform semantic search. The following example command updates the applicable settings:

PUT _cluster/settings
{
  "persistent": {
    "plugins": {
      "ml_commons": {
        "only_run_on_ml_node": "false",
        "model_access_control_enabled": "true",
        "native_memory_threshold": "99",
        "rag_pipeline_feature_enabled": "true",
        "memory_feature_enabled": "true",
        "allow_registering_model_via_local_file": "true",
        "allow_registering_model_via_url": "true",
        "model_auto_redeploy.enable": "true",
        "model_auto_redeploy.lifetime_retry_times": 10
      }
    }
  }
}
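To confirm that the settings were applied, you can read them back; the flat_settings parameter makes the plugins.ml_commons.* values easy to scan in the response:

GET _cluster/settings?flat_settings=true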
Step 1: Register the Model Group
Use model groups to logically group models and control who gets access to them. Register a model group using the register operation in the Model Group APIs, as shown in the following example:
POST /_plugins/_ml/model_groups/_register
{
"name": "general pretrained models",
"description": "A model group for pretrained models hosted by OCI Search with OpenSearch"
}
Make note of the model_group_id returned in the response:
{
"model_group_id": "<model_group_ID>",
"status": "CREATED"
}
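If you lose track of the model group ID later, you can look it up by name with the Search Model Group API; a minimal sketch, assuming the group name used in the preceding example:

POST /_plugins/_ml/model_groups/_search
{
  "query": {
    "match": {
      "name": "general pretrained models"
    }
  }
}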
Step 2: Register the Model
You have the following three options for semantic search:
- Option 1: Register and deploy a pretrained model hosted in OCI Search with OpenSearch using the steps described in Using an OpenSearch Pretrained Model. This option is the simplest to use: you don't need to configure any additional IAM policies, and the payload isn't as complex as the payload for the next option.
- Option 2: Import, register, and deploy an OpenSearch pretrained model using the steps described in Custom Models. This includes uploading the model file to an Object Storage bucket, and then specifying the model file's Object Storage URL when you register the model.
- Option 3: Create a Generative AI connector to register a remote embedding model, such as the cohere.embed-english-v3.0 model. For more information, see Conversational Search with OCI Generative AI.
Registering the model requires the following values:
- model_group_id: If you completed Step 1, this is the value of model_group_id for the _register request.
- name: The model name for the pretrained model you want to use.
- version: The version number for the pretrained model you want to use.
- model_format: The format for the model, either TORCH_SCRIPT or ONNX.
Register the model using the register operation from the Model APIs, as shown in the following example:
POST /_plugins/_ml/models/_register
{
"name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
"version": "1.0.2",
"model_group_id": "<model_group_ID>",
"model_format": "TORCH_SCRIPT"
}
Make note of the task_id returned in the response; you can use the task_id to check the status of the operation. For example, the response resembles the following:
{
"task_id": "<task_ID>",
"status": "CREATED"
}
To check the status of the register operation, use the task_id with the Get operation of the Task APIs, as shown in the following example:
GET /_plugins/_ml/tasks/<task_ID>
When the register operation is complete, the state value in the response to the Get operation is COMPLETED, as shown in the following example:
{
"model_id": "<embedding_model_ID>",
"task_type": "REGISTER_MODEL",
"function_name": "TEXT_EMBEDDING",
"state": "COMPLETED",
"worker_node": [
"f2b_8-mVRVyVqeKqsA7dcQ"
],
"create_time": 1706831015570,
"last_update_time": 1706831070740,
"is_async": true
}
Make note of the model_id value returned in the response to use when you deploy the model.
Step 3: Deploy the Model
After the register operation is complete, you can deploy the model to the cluster using the deploy operation of the Model APIs, passing the model_id from the Get operation response in the previous step, as shown in the following example:
POST /_plugins/_ml/models/<embedding_model_ID>/_deploy
Make note of the task_id returned in the response; you can use the task_id to check the status of the operation. For example, the response resembles the following:
{
"task_id": "<task_ID>",
"task_type": "DEPLOY_MODEL",
"status": "CREATED"
}
To check the status of the deploy operation, use the task_id with the Get operation of the Task APIs, as shown in the following example:
GET /_plugins/_ml/tasks/<task_ID>
When the deploy operation is complete, the state value in the response to the Get operation is COMPLETED.
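You can also confirm the deployment directly on the model with the Get Model API; a minimal check (exact response fields can vary by version):

GET /_plugins/_ml/models/<embedding_model_ID>

In the response, the model_state value should be DEPLOYED.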
Step 4: Create a k-NN Ingestion Pipeline
After the deploy operation is complete, create an ingestion pipeline that uses the deployed model to automatically generate an embedding vector for each document at ingestion time. The text_embedding processor handles the embedding itself; you only need to map the document's text field to the field that stores the embedding. The following example shows creating an ingestion pipeline:
PUT _ingest/pipeline/<pipeline_name>
{
"description": "An example neural search pipeline",
"processors" : [
{
"text_embedding": {
"model_id": "<embedding_model_ID>",
"field_map": {
"<text_field_name>": "<embedding_field_name>"
}
}
}
]
}
If the ingestion pipeline is created successfully, the following response is returned:
{
"acknowledged": true
}
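Before wiring the pipeline into an index, you can optionally verify that it produces embeddings by simulating it on a sample document; a minimal sketch using the Simulate Pipeline API (the sample sentence is illustrative):

POST _ingest/pipeline/<pipeline_name>/_simulate
{
  "docs": [
    {
      "_source": {
        "<text_field_name>": "a quick test sentence"
      }
    }
  ]
}

A successful simulation returns the document with an added <embedding_field_name> array of floats.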
Step 5: Create an Index
Create an index using the ingestion pipeline created in the previous step. You can use any of the available ANN engines in the index. The following example uses the Lucene engine:
PUT /lucene-index
{
"settings": {
"index.knn": true,
"default_pipeline": "<pipeline_name>"
},
"mappings": {
"properties": {
"<embedding_field_name>": {
"type": "knn_vector",
"dimension": <model_dimension>,
"method": {
"name":"hnsw",
"engine":"lucene",
"space_type": "l2",
"parameters":{
"m":512,
"ef_construction": 245
}
}
},
"<text_field_name>": {
"type": "text"
}
}
}
}
The <text_field_name> field in the index mapping matches the <text_field_name> field in the ingestion pipeline's field_map. This is how the pipeline knows which field to generate embeddings from and where to store them at ingestion time.
To help you choose the engine you want to use, and the available configuration parameters for those engines, see k-NN index and Approximate k-NN search.
The following is an example response for a successful index creation:
{
"acknowledged": true,
"shards_acknowledged": true,
"index": "lucene-index"
}
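The <model_dimension> value must match the output dimension of the deployed model; for example, the msmarco-distilbert-base-tas-b model registered earlier produces 768-dimensional vectors. To confirm that the index was created with the expected k-NN mapping and default pipeline, you can inspect it:

GET /lucene-index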
Step 6: Ingest Documents into the Index
Ingest data into your index as shown in the following example:
POST /lucene-index/_doc/1
{
"<text_field_name>": "there are many sharks in the ocean"
}
POST /lucene-index/_doc/2
{
"<text_field_name>": "fishes must love swimming"
}
POST /lucene-index/_doc/3
{
"<text_field_name>": "summers are usually very hot"
}
POST /lucene-index/_doc/4
{
"<text_field_name>": "florida has a nice weather all year round"
}
Response:
# POST /lucene-index/_doc/1
{
"_index": "lucene-index",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
# POST /lucene-index/_doc/2
{
"_index": "lucene-index",
"_id": "2",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 1,
"_primary_term": 1
}
# POST /lucene-index/_doc/3
{
"_index": "lucene-index",
"_id": "3",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 2,
"_primary_term": 1
}
# POST /lucene-index/_doc/4
{
"_index": "lucene-index",
"_id": "4",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 3,
"_primary_term": 1
}
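For more than a handful of documents, the Bulk API is more efficient and still runs each document through the index's default pipeline; a minimal sketch (the document IDs and sentences are illustrative):

POST /lucene-index/_bulk
{ "index": { "_id": "5" } }
{ "<text_field_name>": "winters in the north are long and cold" }
{ "index": { "_id": "6" } }
{ "<text_field_name>": "deserts stay dry for most of the year" }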
Run the following to check one of the documents to ensure that embeddings are generated correctly:
GET /lucene-index/_doc/3
Response:
{
"_index": "lucene-index",
"_id": "3",
"_version": 1,
"_seq_no": 2,
"_primary_term": 1,
"found": true,
"_source": {
"<embedding_field_list>": [
-0.1254959,
-0.3151774,
0.03526799,
0.39322096,
-0.0475556,
-0.12378334,
-0.032554347,
0.4033895,
0.050718695,
-0.3587931,
0.097042784,
0.11742551,
-0.06573639,
0.14252506,
-0.466573,
0.56093556,
-0.2815812,
-0.00016521096,
-0.2858566,
...
],
"<text_field_name>": "summers are usually very hot"
}
}
Performing Semantic Search
Semantic Search with Hybrid Search
Using the model ID from Step 2: Register the Model, perform semantic search with hybrid search as follows:
GET lucene-index/_search
{
"query": {
"bool" : {
"should" : [
{
"script_score": {
"query": {
"neural": {
"<embedding_field_name>": {
"query_text": "what are temperatures in miami like",
"model_id": "<embedding_model_ID>",
"k": 2
}
}
},
"script": {
"source": "_score * 1.5"
}
}
}
]
}
},
"fields": [
"<text_field_name>"
],
"_source": false
}
Response:
{
"took": 343,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.032253794,
"hits": [
{
"_index": "lucene-index",
"_id": "4",
"_score": 0.032253794,
"fields": {
"<text_field_name>": [
"florida has a nice weather all year round"
]
}
},
{
"_index": "lucene-index",
"_id": "3",
"_score": 0.031487543,
"fields": {
"<text_field_name>": [
"summers are usually very hot"
]
}
}
]
}
}
You can see that the query doesn't mention Florida, weather, or summer; however, the model infers the semantic meaning of temperatures and miami to return the most relevant answers.
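The bool/should structure above is what makes this query hybrid-friendly. As an illustrative extension that isn't part of the original walkthrough, you can add a lexical match clause alongside the boosted neural clause so that keyword matches and semantic matches both contribute to the final score:

GET lucene-index/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "<text_field_name>": "temperatures in miami"
          }
        },
        {
          "script_score": {
            "query": {
              "neural": {
                "<embedding_field_name>": {
                  "query_text": "what are temperatures in miami like",
                  "model_id": "<embedding_model_ID>",
                  "k": 2
                }
              }
            },
            "script": {
              "source": "_score * 1.5"
            }
          }
        }
      ]
    }
  },
  "fields": [
    "<text_field_name>"
  ],
  "_source": false
}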
Semantic Search with Neural Search
Using the model ID from Step 2: Register the Model, perform semantic search with neural search as follows:
GET /lucene-index/_search
{
"_source": {
"excludes": [
"<embedding_field_name>"
]
},
"query": {
"neural": {
"<embedding_field_name>": {
"query_text": "good climate",
"model_id": "<embedding_model_ID>",
"k": 5
}
}
}
}
Response:
{
"took": 36,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 4,
"relation": "eq"
},
"max_score": 0.012060728,
"hits": [
{
"_index": "lucene-index",
"_id": "4",
"_score": 0.012060728,
"_source": {
"passage_text": "florida has a nice weather all year round"
}
},
{
"_index": "lucene-index",
"_id": "1",
"_score": 0.010120423,
"_source": {
"passage_text": "there are many sharks in the ocean"
}
},
{
"_index": "lucene-index",
"_id": "3",
"_score": 0.00985171,
"_source": {
"passage_text": "summers are usually very hot"
}
},
{
"_index": "lucene-index",
"_id": "2",
"_score": 0.009575767,
"_source": {
"passage_text": "fishes must love swimming"
}
}
]
}
}
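To restrict semantic results to a subset of documents, the neural query also accepts a filter clause on engines that support efficient filtering, such as Lucene; a hedged sketch (the match filter shown is illustrative, and filter support depends on the engine and OpenSearch version you use):

GET /lucene-index/_search
{
  "_source": {
    "excludes": [
      "<embedding_field_name>"
    ]
  },
  "query": {
    "neural": {
      "<embedding_field_name>": {
        "query_text": "good climate",
        "model_id": "<embedding_model_ID>",
        "k": 5,
        "filter": {
          "match": {
            "<text_field_name>": "summer"
          }
        }
      }
    }
  }
}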