Creating a Knowledge Base in Generative AI Agents

Create a knowledge base in the Generative AI Agents service.

On the Knowledge bases list page, select Create knowledge base. If you need help finding the list page, see Listing Knowledge Bases.
Enter the following information:
- Name: A name that starts with a letter or underscore, followed by letters, numbers, hyphens, or underscores. The length can be from 1 to 255 characters.
- Compartment: The compartment that you want to store the knowledge base in.
- Description: An optional description.
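The naming rule above can be checked before you open the console. The following is a minimal sketch, assuming that "letters" means ASCII letters (the service might accept a wider character set); the function name `is_valid_kb_name` is hypothetical and not part of any Oracle SDK.

```python
import re

# Pattern derived from the rule stated above:
#   - first character: a letter or an underscore
#   - remaining characters: letters, digits, hyphens, or underscores
#   - total length: 1 to 255 characters
# Assumption: only ASCII letters are allowed.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,254}")


def is_valid_kb_name(name: str) -> bool:
    """Return True if the name satisfies the documented naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_kb_name("support_docs-2024")` returns `True`, while `is_valid_kb_name("2024docs")` returns `False` because the name starts with a digit.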
For Data store type, select one of the following options:
- Object storage. See RAG Tool Object Storage Guidelines.
- OCI OpenSearch:
  Before you continue, you must have your documents chunked into files of fewer than 512 tokens each, and you must have ingested and indexed those documents in OpenSearch. See RAG Tool OCI Search with OpenSearch Guidelines.
- Database AI Vector Search:
  - For data in Oracle Database 23ai. See RAG Tool Oracle Database Guidelines for the required setup.
  - For data in HeatWave MySQL. See RAG Tool Heatwave MySQL Guidelines for the required setup.
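For the OCI OpenSearch option, the requirement that each chunk has fewer than 512 tokens can be sketched as follows. This is only an illustration: it treats whitespace-separated words as tokens, which is an assumption. The actual token count depends on the tokenizer of the embedding model that you use, so a real pipeline should count tokens with that tokenizer and might need a lower limit. The function name `chunk_words` is hypothetical.

```python
def chunk_words(text: str, max_tokens: int = 511) -> list[str]:
    """Split text into chunks of at most max_tokens whitespace-separated words.

    Assumption: one word counts as one token. Real tokenizers often produce
    more tokens than words, so treat this as a rough upper-bound sketch.
    """
    words = text.split()
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]
```

Each returned chunk could then be written to its own file and indexed in OpenSearch according to the guidelines referenced above.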
If you selected Object storage, perform the following actions:
1. Under Data sources, select Specify data source and enter a name and optional description for the data source.
2. Select the bucket that contains the data for the knowledge base. Change the compartment if the bucket is in another compartment.
  See RAG Tool Object Storage Guidelines to ensure that the files in the bucket meet the requirements for Generative AI Agents.
3. After the contents of the bucket are listed, perform one of the following actions to select the files to use:
  - To include all items in the bucket, click Select all in bucket.
  - Select the files and folders to include.
  - Expand Add object prefixes manually to type the prefixes for the files and folders to include.
4. (Optional) Select Automatically start ingestion job for above data sources.
  If you don't select this option, you must ingest the data later for the agent to use it.
Note

You can have only one data source per knowledge base. See Limits and Limitations for Generative AI Agents.
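To preview which objects a set of manually entered prefixes would cover, you can model the selection locally. The sketch below assumes that a prefix matches any object whose name starts with that exact string, which is how Object Storage name prefixes generally behave. The function name `select_by_prefixes` and the sample object names are hypothetical.

```python
def select_by_prefixes(object_names: list[str], prefixes: list[str]) -> list[str]:
    """Return the object names that start with at least one of the prefixes."""
    return [
        name
        for name in object_names
        if any(name.startswith(prefix) for prefix in prefixes)
    ]


# Hypothetical bucket contents; "folders" are only shared name prefixes.
bucket_objects = [
    "manuals/a.pdf",
    "manuals/2024/b.pdf",
    "faq/c.txt",
    "readme.txt",
]

selected = select_by_prefixes(bucket_objects, ["manuals/"])
```

Here `selected` contains `manuals/a.pdf` and `manuals/2024/b.pdf`, because both names begin with `manuals/`.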
If you selected OCI OpenSearch for the data store type, enter the following information. For guidelines, see RAG Tool OCI Search with OpenSearch Guidelines.
1. For OpenSearch cluster, select the cluster that contains the data for the knowledge base. Change the compartment if the cluster is in another compartment.
  
  To learn about OpenSearch clusters, read about an OpenSearch cluster's detail page.
2. For OpenSearch index, enter the details of the OpenSearch index. See RAG Tool OCI Search with OpenSearch Guidelines.
3. For Secret details, select one of the following options:
  - Basic auth secret: For this option, select the Vault secret for OCI Search with OpenSearch.
  - IDCS secret: For this option, enter the following information for the IDCS confidential application that you want to use for the agent:
    - Identity domain: Select the identity domain to use to access the cluster. Change the compartment if the identity domain is in another compartment.
    - Client Id: Enter the ID for the OpenSearch cluster's IDCS client application.
    - Client secret vault: Select the vault that contains the client secret. Change the compartment if the secret is in another compartment.
    - Scope URL: Enter the URL of the API endpoint for the identity domain's resource server application, including the agent scope. For example, for the genaiagent scope, the URL is https://*.agent.aiservice.us-chicago-1.oci.oraclecloud.com/genaiagent.
If you selected Database AI Vector Search for the data store type, select the database tool connection and then select Test connection to confirm that the connection to the database succeeds. If the test is successful, the database name and version are displayed. Then, enter the vector search function or procedure for the database tool connection.

Note

See the RAG Tool Oracle Database Guidelines or RAG Tool Heatwave MySQL Guidelines for information about the function or procedure.
(Optional) Select Show tagging options and add one or more tags to this resource. If you have permissions to create a resource, then you also have permissions to apply free-form tags to that resource. To apply a defined tag, you must have permissions to use the tag namespace. For more information about tagging, see Resource Tags. If you're not sure whether to apply tags, skip this option or ask an administrator. You can apply tags later.
Select Create.

The knowledge base takes a while to create. After the knowledge base is created, if you haven't yet ingested the data for an Object Storage data source, follow the steps in Ingesting Data Source Data in Generative AI Agents.

Note

After a data ingestion job runs for an Object Storage data source, review the status and status logs to confirm that all updated files were successfully ingested.

For the meaning of each ingestion job status and the action to take if a job fails, see Ingesting Data Source Data, Step 6.

If the ingestion job fails (for example, because a file was too large), address the issue and restart the job.

When you restart a previously run ingestion job, the pipeline detects files that were successfully ingested earlier and skips them. The pipeline ingests only files that failed before and have since been updated. For example, you have 20 files to ingest, and the initial job run results in 2 failed files. When you restart the job, the pipeline recognizes that 18 files have already been successfully ingested and ignores them. It ingests only the 2 files that failed earlier and have since been updated.
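The restart behavior described above can be modeled with a short sketch. This is not the service's implementation; it only mirrors the stated rule that a restarted job skips files that were already ingested and picks up files that failed earlier and have since been updated. The names `FileState` and `files_to_ingest_on_restart` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileState:
    name: str
    ingested: bool                       # True if an earlier run ingested the file
    updated_since_failure: bool = False  # True if a failed file was fixed afterward


def files_to_ingest_on_restart(files: list[FileState]) -> list[str]:
    """Return the names of files that a restarted job would process."""
    return [
        f.name
        for f in files
        if not f.ingested and f.updated_since_failure
    ]


# The example from the text: 20 files, 18 succeeded, 2 failed and were then updated.
files = [FileState(f"doc_{i}.pdf", ingested=True) for i in range(18)]
files += [
    FileState("doc_18.pdf", ingested=False, updated_since_failure=True),
    FileState("doc_19.pdf", ingested=False, updated_since_failure=True),
]

to_ingest = files_to_ingest_on_restart(files)
```

In this model, `to_ingest` holds only `doc_18.pdf` and `doc_19.pdf`; the 18 files that were ingested earlier are skipped.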
