Creating a Knowledge Base in Generative AI Agents

  1. On the Knowledge Bases list page, click Create knowledge base.
    If you need help finding the list page, see Listing Knowledge Bases.
  2. Enter the following values:
    • Name: A name that starts with a letter or underscore, followed by letters, numbers, hyphens, or underscores. The length can be from 1 to 255 characters.
    • Compartment: The compartment that you want to store the knowledge base in
    • Description: An optional description
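The naming rule in this step can be checked before you open the console. The following is a minimal sketch, not part of the service: the function name `is_valid_kb_name` is hypothetical, and the pattern assumes that "letters" and "numbers" mean ASCII characters, which the rule above does not state explicitly.

```python
import re

# Assumption: letters and digits are ASCII. The first character is a letter
# or underscore; up to 254 more characters may follow (1 to 255 in total).
_KB_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,254}")


def is_valid_kb_name(name: str) -> bool:
    """Return True if name satisfies the knowledge base naming rule above."""
    return _KB_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("support_docs-2024", "_internal", "2024-docs", ""):
        print(repr(candidate), is_valid_kb_name(candidate))
```

`fullmatch` is used so that the whole string must satisfy the pattern, including the length limit.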
  3. Select one of the following options for the data store type: Object Storage, OCI OpenSearch, or Oracle AI Vector Search.
  4. If you selected Object Storage, perform the following actions:
    1. (Optional) Select Enable hybrid search to combine lexical and semantic search. Without hybrid search, you get lexical search only.
    2. (Optional) Select Show advanced options and add one or more tags to the data source. If you have permissions to create a resource, then you have permission to update its tags. If you need help, see Tags and Tag Namespace Concepts.
    3. Select Specify data source and enter a name and optional description for the data source.
    4. Select the bucket that contains the data for the knowledge base. Change the compartment if the bucket is in another compartment.
      See Data requirements for Object Storage to ensure that the files in the bucket meet the requirements for Generative AI Agents.
    5. After the contents of the bucket are listed, perform one of the following actions:
      • Select all in bucket
      • Select the files and folders to include.
      • Click Add object prefixes manually to type the prefixes for the files and folders to be included.
    6. Click Create.
    7. (Optional) Select Automatically start ingestion job for above data sources.
      If you don't select this option, you must ingest the data later for your agent to use it.
    Note

    You can only have one data source per knowledge base. See Limits and Limitations for Generative AI Agents.
  5. If you selected OCI OpenSearch, use the information in OCI Search with OpenSearch Guidelines for Generative AI Agents to enter the following values:
    • OpenSearch cluster name
    • OpenSearch index information:
      • Index Name
      • Body key
      • Embed body key (Optional)
      • URL key (Optional)
      • Title key (Optional)
    • For Secret details, select one of the following options:
      • Basic auth secret: For this option, select the Vault secret for OCI Search with OpenSearch.
      • IDCS secret: For this option, enter the following information for the IDCS confidential application that you want to use for the agent:
        • Identity domain application name - Change the compartment if the identity domain is in another compartment.
        • Client ID for the OpenSearch cluster's IDCS client application.
        • Client secret vault that contains the client secret - Change the compartment if the secret is in another compartment.
        • Scope URL that's the API endpoint for the identity domain's resource server application and includes the agent scope. For example, for genaiagent scope, the URL is https://*.agent.aiservice.us-chicago-1.oci.oraclecloud.com/genaiagent.

      To learn more about OpenSearch clusters, see the documentation for an OpenSearch cluster's details page.

  6. If you selected Oracle AI Vector Search, select the Database tool connection and then click Test connection to confirm that the connection to the database succeeds. If the test is successful, the database name and version are displayed. Then, enter the Vector search function for the database tool connection.
    Note

    See the Oracle Database Guidelines for Generative AI Agents to help you enter the values for this step.
  7. (Optional) Select Show advanced options and add one or more tags to the knowledge base.
  8. Click Create.
    Note

    For Object Storage data:
    After Creating an Ingestion Job
    1. Review the status logs to confirm that all updated files were successfully ingested.
    2. If the ingestion job fails (for example, because of a file being too large), address the issue and restart the job.
    How the Ingestion Pipeline Handles Previously Run Jobs

    When you restart a previously run ingestion job, the pipeline:

    1. Detects files that were successfully ingested earlier and skips them.
    2. Only ingests files that failed previously and have since been updated.
    Example Scenario

    Suppose you have 20 files to ingest, and the initial job run results in 2 failed files. When you restart the job, the pipeline:

    1. Recognizes that 18 files have already been successfully ingested and ignores them.
    2. Ingests only the 2 files that failed earlier and have since been updated.
The knowledge base takes a while to create. After the knowledge base is created, if you don't have ingested data, follow the steps in Ingesting Data Source Data in Generative AI Agents.
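
The restart behavior described in the note for Object Storage data can be modeled with a small sketch. This is an illustration of the selection rule only, not the service's implementation: the `FileRecord` type, its fields, and the `select_files_for_restart` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    name: str
    ingested_ok: bool        # True if the previous run ingested this file
    updated_since_run: bool  # True if the file changed after the previous run


def select_files_for_restart(records: list[FileRecord]) -> list[str]:
    """Return the files a restarted job would process under the rule above:
    skip files that were already ingested, and pick up only files that
    failed previously and have since been updated."""
    return [
        r.name
        for r in records
        if not r.ingested_ok and r.updated_since_run
    ]


# Example scenario: 20 files, of which 2 failed and were later updated.
records = [
    FileRecord(f"doc_{i:02d}.pdf", ingested_ok=True, updated_since_run=False)
    for i in range(18)
]
records += [
    FileRecord(f"doc_{i:02d}.pdf", ingested_ok=False, updated_since_run=True)
    for i in (18, 19)
]

to_ingest = select_files_for_restart(records)
print(to_ingest)  # ['doc_18.pdf', 'doc_19.pdf']
```

In this model, a file that failed but was not updated is not selected, which matches the wording of the note: only files that failed earlier and have since been updated are ingested.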