Vector Search in the Data Analysis Tool

You can use Oracle AI Vector Search capabilities to search for relevant text from your source data on a specific column.

The Vector search helps quickly locate and match data that is similar to the text you provide as your search input.

Prerequisites

For using the vector search feature in the Data Analysis tool, you need the following:
  • You must have 23ai Database with Vector support.
  • The ORDS REST APIs must support VECTOR data type.
  • You must use the Data Studio Settings page to select the same credential you use for vector search.
  • The source data must already have a non-numeric column to be embedded first.
  • To utilize the Vector Search in the Data Analysis tool, you must have:
    • An OpenAI, or Cohere or An Azure OpenAI account service with credit
    • An access to DBMS_VECTOR and DBMS_VECTOR_CHAIN.UTL_TO_EMBEDDING package.
    Note

The Data Analysis tool utilizes the Vector Utility PL/SQL package DBMS_VECTOR and DBMS_VECTOR_CHAIN.UTL_TO_EMBEDDING to provide the third-party REST APIs that let you interact with external embedding models such as Cohere, Google AI, Hugging Face, Oracle Cloud Infrastructure (OCI) Generative AI, OpenAI, or Vertex AI. You must understand the terms of using third-party embedding models.

Note

Certain features of the database may allow you to access services offered separately by third-parties, for example, through the use of JSON specifications that facilitate your access to REST APIs.

Your use of these features is solely at your own risk, and you are solely responsible for complying with any terms and conditions related to use of any such third-party services. Notwithstanding any other terms and conditions related to the third-party services, your use of such database features constitutes your acceptance of that risk and express exclusion of Oracle's responsibility or liability for any damages resulting from such access.

Suppose you are a finance enthusiast and you want to learn about finance through news to gain better sense of financial products and investments. In this example, you can use the Vector Search feature to find news related to the word Bank.

Consider a table that has the columns CATEGORY, FILENAME, TITLE, CONTENT and a different column labeled VECTOR, that contains the vector representation of the CONTENT column. Since we can use vector search on source columns with non-numeric data (image or text), any of the non-numeric column can be embedded and stored as a vector column. But in this example, we need to search for relevant text from CONTENT column.

You can use the Data Analysis tool to perform vector search.

To indicate which column's embedding is kept in the vector column, we must first map the appropriate column with the appropriate vector using the Data Analysis tool, search for similar text, and then run the query. The Query Result tab will display the content similar to Bank.

  1. From the Data Analysis tool, click + under Analyses to create a new Analysis.
  2. Select the Schema from Schema drop-down and Query from the Type drop-down.
  3. Select the table you want to query from the table browser. In this example, you will select BBCNEWS_DATA_ENCODE.


    Description of select-table.jpg follows

  4. Enter the Select statement in the SQL Worksheet edit area to retrieve all the columns from the BBCNEWS_DATA_ENCODE table.

    Specify the following command:
    Select * from BBCNEWS_DATA_ENCODE;
  5. Click Run.


    Description of run-data-analysis.jpg follows

    You can view all the columns of the BBCNEWS_DATA_ENCODE table in the Query Results tab.

  6. Select tabular view of mode from the icons that display various modes of visualizing the query result.


    Description of tabular-view.jpg follows

  7. Drag and drop the CONTENT column to the Column drop area and click it to map the vector column.


    Description of map-vector-column.jpg follows

  8. On the Map Vector Column wizard, specify the following field values:
    • Map Vector: Select the column that consists of the vector embeddings of the selected column from the drop-down list. In this example, it is VECTOR.
    • Select Service: Select the AI service you will use to utilize this feature of the tool. In this example, it is OCI Generative AI.
    • Select Credential: Enter the credential you create from the Data Studio Settings to access the third party service provider. You require these credentials to enable access during REST API calls to your chosen third-party service provider.
    • Generative AI URL: Specify the Generative AI URL you will use to call the third-party service provider, such as Cohere, Google AI, Hugging Face, Oracle Cloud Infrastructure (OCI) Generative AI, OpenAI, or Vertex AI.
    • Select Model: Specify the embed model you will use to generate embeddings from the text. In this example, you will enter cohere.embed-english-light-v2.0.

    Click Save to save the mapping. Select Clear to clear the field values you mention in this step.

  9. On the Faceted Filters, you can view text box under the Content column. Click the + sign on it to expand the text box.


    Description of text.jpg follows

  10. Enter the following text in the text box to content similar to the word Bank in the CONTENT column:
    BANK

    Click Search.


    Description of bank.jpg follows

  11. The Query Results tab of the Output panel displays the content column that consists of content related to Bank.


    Description of results.jpg follows

    You can hover over each content to view it's expanded form.


    Description of expanded-content.jpg follows