Ingesting Data Source Data in Generative AI Agents
A data ingestion job extracts data from data source documents, converts it into a structured format suitable for analysis, and then stores it in a knowledge base.
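To make that workflow concrete, the sketch below starts an ingestion job for a data source. The IngestionClient, its methods, and the identifiers are hypothetical placeholders rather than the service's actual SDK; substitute the operations from the Generative AI Agents SDK or CLI that you use.

```python
# Hypothetical client and identifiers, for illustration only; replace them
# with the Generative AI Agents SDK or CLI operations you actually use.
from my_agents_sdk import IngestionClient  # hypothetical module

client = IngestionClient()

# Start an ingestion job: the service extracts documents from the data
# source, converts them into a structured format, and stores the result
# in the knowledge base.
job = client.create_data_ingestion_job(
    data_source_id="<data-source-id>",  # placeholder identifier
    display_name="initial-ingestion",
)
print(job.id, job.lifecycle_state)
```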
Note
After Creating an Ingestion Job
- Review the status logs to confirm that all updated files were ingested successfully. If you need help getting the status logs, see Getting a Data Ingestion Job's Details.
- If the ingestion job fails (for example, because a file is too large), address the issue and restart the job (see the sketch after this list).
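The follow-up check might look like the sketch below, which waits for the job to finish, prints its status log, and restarts it after a failure. The client, method, and field names are the same hypothetical placeholders as above, not the service's actual API; see Getting a Data Ingestion Job's Details for the documented operations.

```python
import time

from my_agents_sdk import IngestionClient  # hypothetical module, as above

client = IngestionClient()
job_id = "<ingestion-job-id>"  # placeholder identifier

# Wait for the job to reach a terminal state, then review its status log
# to confirm which files were ingested.
job = client.get_data_ingestion_job(job_id)
while job.lifecycle_state not in ("SUCCEEDED", "FAILED"):
    time.sleep(30)
    job = client.get_data_ingestion_job(job_id)

print(client.get_data_ingestion_job_log(job_id))  # hypothetical log accessor

# If the job failed (for example, a file was too large), fix the issue and
# restart the job. Restarting is modeled here as starting a new job for the
# same data source; files already ingested are skipped on the rerun.
if job.lifecycle_state == "FAILED":
    retry = client.create_data_ingestion_job(
        data_source_id=job.data_source_id,
        display_name="retry-ingestion",
    )
    print(retry.id)
```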
How the Ingestion Pipeline Handles Previously Run Jobs
When you restart a previously run ingestion job, the pipeline:
- Detects files that were ingested successfully earlier and skips them.
- Ingests only the files that failed previously and have since been updated.
Example Scenario
Suppose you have 20 files to ingest, and the initial job run results in 2 failed files. When you restart the job, the pipeline:
- Recognizes that 18 files have already been ingested successfully and ignores them.
- Ingests only the 2 files that failed earlier and have since been updated, as the sketch below illustrates.
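The restart behavior can be pictured with a small self-contained sketch. It does not call the service; the FileRecord structure and file names are made up solely to mirror this 20-file scenario.

```python
from dataclasses import dataclass

@dataclass
class FileRecord:
    name: str
    ingested_ok: bool        # succeeded in the earlier run?
    updated_since_run: bool  # modified after that earlier run?

def files_to_reingest(files: list[FileRecord]) -> list[FileRecord]:
    # Skip files that were already ingested successfully; re-ingest only
    # files that failed earlier and have since been updated.
    return [f for f in files if not f.ingested_ok and f.updated_since_run]

# 20 files total: 18 succeeded on the first run, 2 failed and were then
# updated (for example, split so they are no longer too large).
files = [FileRecord(f"doc{i}.pdf", True, False) for i in range(18)]
files += [FileRecord(f"big{i}.pdf", False, True) for i in range(2)]

print(len(files_to_reingest(files)))  # 2 -> only the failed, updated files
```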