Ingesting Data Source Data in Generative AI Agents

A data ingestion job extracts data from data source documents, converts it into a structured format suitable for analysis, and then stores it in a knowledge base.

On the Knowledge Bases list page, select the knowledge base whose data source you want to ingest.
If you need help finding the list page, see Listing Knowledge Bases.
Select the data source whose data you want to ingest.
Select Create Ingestion job.
Enter the following values:
- Name: A name that starts with a letter or underscore, followed by letters, numbers, hyphens, or underscores. The length can be from 1 to 255 characters.
- Description: An optional description.
- Tags: Select Show advanced options and add one or more tags to the ingestion job. If you have permissions to create a resource, then you have permission to update its tags. If you need help, see Tags and Tag Namespace Concepts.
Select Create.
Wait for the ingestion job status to change.
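The naming rule for the Name field can be checked before you open the console. The following is a minimal sketch, not part of the service: the function name is_valid_job_name is hypothetical, and it assumes that "letters" and "numbers" mean ASCII characters, which the rule above does not state explicitly.

```python
import re

# Assumption: "letters" and "numbers" are ASCII only; the service may
# accept a wider character set than this sketch does.
# One leading letter or underscore, then 0 to 254 further characters,
# for a total length of 1 to 255.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,254}")


def is_valid_job_name(name: str) -> bool:
    """Return True if name fits the documented ingestion job naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, is_valid_job_name("nightly_ingest-01") returns True, while is_valid_job_name("1st-run") returns False because the name starts with a number.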

Note

After Creating an Ingestion Job

Review the status logs to confirm that all updated files were successfully ingested. If you need help getting the status logs, see Getting a Data Ingestion Job's Details.
If the ingestion job fails (for example, because a file is too large), address the issue and restart the job.

How the Ingestion Pipeline Handles Previously Run Jobs

When you restart a previously run ingestion job, the pipeline:

Detects files that were successfully ingested earlier and skips them.
Ingests only the files that failed previously and have since been updated.

Example Scenario

Suppose you have 20 files to ingest, and the initial job run results in 2 failed files. When you restart the job, the pipeline:

Recognizes that 18 files have already been successfully ingested and ignores them.
Ingests only the 2 files that failed earlier and have since been updated.
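The restart behavior in this scenario can be modeled in a few lines. This is an illustrative sketch only and does not reflect how the service is implemented: the FileRecord class, its fields, the function files_to_ingest_on_restart, and the file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    name: str
    ingested: bool               # True if an earlier run ingested the file successfully
    updated_since_failure: bool  # True if the file was changed after it failed


def files_to_ingest_on_restart(records):
    """Select the files a restarted job would process under the rules above."""
    # Skip files already ingested; keep only failed files that were updated.
    return [r.name for r in records if not r.ingested and r.updated_since_failure]


# Scenario: 20 files, 18 ingested on the first run, 2 failed and were then updated.
records = [FileRecord(f"doc{i:02d}.pdf", True, False) for i in range(1, 19)]
records += [
    FileRecord("doc19.pdf", False, True),
    FileRecord("doc20.pdf", False, True),
]

selected = files_to_ingest_on_restart(records)
print(selected)  # ['doc19.pdf', 'doc20.pdf']
```

In this model, a file that failed but was not updated is also left out, which matches the wording "failed earlier and have since been updated".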

Oracle Cloud Infrastructure Documentation
