Vector Search in the Data Analysis Tool
You can use Oracle AI Vector Search capabilities to search for relevant text from your source data on a specific column.
The Vector search helps quickly locate and match data that is similar to the text you provide as your search input.
Prerequisites
- You must have 23ai Database with Vector support.
- The ORDS REST APIs must support VECTOR data type.
- You must use the Data Studio Settings page to select the same credential you use for vector search.
- The source data must already have a non-numeric column to be embedded first.
- To utilize the Vector Search in the Data Analysis tool, you must have:
- An OpenAI, or Cohere or An Azure OpenAI account service with credit
- An access to
DBMS_VECTOR
andDBMS_VECTOR_CHAIN.UTL_TO_EMBEDDING
package.
Note
- You must set your AI profile. Follow the steps mentioned in this chapter to Use Select AI to Generate SQL from Natural Language Prompts.
- After you have created and configured your AI profile, set your AI profile in the Data Studio Settings wizard on the Data Studio navigation menu to use AI features such as generating SQL from Natural Language and Vector Search in the Data Studio tool.
The Data Analysis tool utilizes the Vector Utility PL/SQL package DBMS_VECTOR
and DBMS_VECTOR_CHAIN.UTL_TO_EMBEDDING
to provide the third-party REST APIs that let you interact with external embedding models such as Cohere, Google AI, Hugging Face, Oracle Cloud Infrastructure (OCI) Generative AI, OpenAI, or Vertex AI. You must understand the terms of using third-party embedding models.
Certain features of the database may allow you to access services offered separately by third-parties, for example, through the use of JSON specifications that facilitate your access to REST APIs.
Your use of these features is solely at your own risk, and you are solely responsible for complying with any terms and conditions related to use of any such third-party services. Notwithstanding any other terms and conditions related to the third-party services, your use of such database features constitutes your acceptance of that risk and express exclusion of Oracle's responsibility or liability for any damages resulting from such access.
Suppose you are a finance enthusiast and you want to learn about finance through news to gain better sense of financial products and investments. In this example, you can use the Vector Search feature to find news related to the word Bank
.
Consider a table that has the columns CATEGORY
, FILENAME
, TITLE
, CONTENT
and a different column labeled VECTOR
, that contains the vector representation of the CONTENT
column. Since we can use vector search on source columns with non-numeric data (image or text), any of the non-numeric column can be embedded and stored as a vector column. But in this example, we need to search for relevant text from CONTENT
column.
You can use the Data Analysis tool to perform vector search.
To indicate which column's embedding is kept in the vector column, we must first map the appropriate column with the appropriate vector using the Data Analysis tool, search for similar text, and then run the query. The Query Result tab will display the content similar to Bank
.
- From the Data Analysis tool, click + under Analyses to create a new Analysis.
- Select the Schema from Schema drop-down and Query from the Type drop-down.
- Select the table you want to query from the table browser. In this example, you will select
BBCNEWS_DATA_ENCODE
. -
Enter the Select statement in the SQL Worksheet edit area to retrieve all the columns from the
BBCNEWS_DATA_ENCODE
table.Specify the following command:Select * from BBCNEWS_DATA_ENCODE;
- Click Run.
You can view all the columns of the
BBCNEWS_DATA_ENCODE
table in the Query Results tab. - Select tabular view of mode from the icons that display various modes of visualizing the query result.
- Drag and drop the
CONTENT
column to the Column drop area and click it to map the vector column. - On the Map Vector Column wizard, specify the following field values:
- Map Vector: Select the column that consists of the vector embeddings of the selected column from the drop-down list. In this example, it is VECTOR.
- Select Service: Select the AI service you will use to utilize this feature of the tool. In this example, it is OCI Generative AI.
- Select Credential: Enter the credential you create from the Data Studio Settings to access the third party service provider. You require these credentials to enable access during REST API calls to your chosen third-party service provider.
- Generative AI URL: Specify the Generative AI URL you will use to call the third-party service provider, such as Cohere, Google AI, Hugging Face, Oracle Cloud Infrastructure (OCI) Generative AI, OpenAI, or Vertex AI.
- Select Model: Specify the embed model you will use to generate embeddings from the text. In this example, you will enter cohere.embed-english-light-v2.0.
Click Save to save the mapping. Select Clear to clear the field values you mention in this step.
- On the Faceted Filters, you can view text box under the Content column. Click the + sign on it to expand the text box.
- Enter the following text in the text box to content similar to the word
Bank
in theCONTENT
column:BANK
Click Search.
- The Query Results tab of the Output panel displays the content column that consists of content related to
Bank
.You can hover over each content to view it's expanded form.
Parent topic: The Data Analysis Tool