Use a Pipeline
A pipeline lets you connect a set of tasks in a sequence or in parallel to orchestrate data processing.
By creating a pipeline, you can build a complex task dependency graph and automate an entire workload. Every task you add must be published, and it can come from any application in the current workspace or in another workspace.
In this tutorial, you:
- Create two data loader tasks to be run in parallel in a pipeline.
- Create a REST task to use the Notification service for sending email notifications.
- Create a pipeline and add operators for the data loader tasks, a merge, an integration task, and the REST task.
- Create a pipeline task to configure a runtime context for a pipeline.
- Publish a pipeline task and run a pipeline.
- Monitor a pipeline run.
Before You Begin
To complete this tutorial, you must have:
- Completed the data flow tutorial.
- Completed the integration task tutorial.
- Completed the data loader task tutorial.
- A topic and email subscription created in the Notifications service. See Create a topic and Create an email subscription.
1. Creating a Data Loader Task for Revenue Data
Duplicate the Load Revenue Data into Data Warehouse task to create a new task that loads and overwrites revenue data.
2. Creating a Data Loader Task for Customer Data
Create a data loader task to load customer data into Data Warehouse by creating a new target data entity.
3. Creating a REST Task for Sending Notifications
You can use a REST task to call a REST API endpoint in a pipeline. In this tutorial, you use the Notifications service API in a Data Integration REST task to send an email notification from within a pipeline.
To create the REST task, you need:
- A topic and email subscription created in the Notifications service. See Create a topic and Create an email subscription.
- The OCID of the topic you created. The OCID is available in the Topic Information section of the topic details page in the Notifications service.
- The following policy statement, which lets you run Data Integration tasks that invoke the Notifications REST API:
allow any-user to use notification-family in tenancy where ALL {request.principal.type='disworkspace'}
Then in Data Integration, create a REST task that uses the Notifications service API to publish an email.
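The request that such a REST task sends can be sketched in Python. This is a minimal illustration, not the tutorial's own configuration: the endpoint host pattern, the `/20181201/topics/{topic}/messages` path, and the `title` and `body` fields are assumptions to verify against the Notifications API reference, and the region and topic OCID below are placeholders. The sketch only builds the request; it does not sign or send it.

```python
import json


def build_publish_request(region, topic_ocid, title, body):
    """Return the method, URL, and JSON payload for publishing a message.

    Assumption: the endpoint host pattern and API version path below match
    the Notifications service; check them against the API reference.
    """
    base = f"https://notification.{region}.oci.oraclecloud.com"
    url = f"{base}/20181201/topics/{topic_ocid}/messages"
    # Assumption: the message is a JSON object with "title" and "body".
    payload = json.dumps({"title": title, "body": body})
    return "POST", url, payload


# Placeholder values for illustration only.
METHOD, URL, PAYLOAD = build_publish_request(
    region="us-ashburn-1",
    topic_ocid="ocid1.onstopic.oc1..example",
    title="Pipeline finished",
    body="The revenue and customer loads completed.",
)
```

In the REST task itself, the same three pieces (HTTP method, URL, and request body) are what you enter in the task configuration; the policy statement above is what authorizes the workspace to make the call.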
4. Publishing the Data Loader and REST Tasks
5. Creating a Pipeline
6. Adding Pipeline Operators
You add task operators to specify the published tasks to orchestrate in the pipeline.
Learn more about pipeline operators.
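To picture how operators form a dependency graph, the following Python sketch models one possible wiring of this tutorial's pipeline: two data loader tasks in parallel, a merge, an integration task, and a REST task. The operator names and the exact ordering are assumptions made for illustration; the code uses only the standard-library `graphlib` module (Python 3.9 or later) and does not call Data Integration.

```python
from graphlib import TopologicalSorter

# Hypothetical operator names. Each key maps to the operators it waits on.
PIPELINE = {
    "load_revenue": set(),
    "load_customers": set(),
    "merge": {"load_revenue", "load_customers"},
    "integration_task": {"merge"},
    "rest_notify": {"integration_task"},
}


def execution_stages(graph):
    """Group operators into stages; operators in one stage can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        # Everything whose upstream operators have finished is ready now.
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


STAGES = execution_stages(PIPELINE)
for number, stage in enumerate(STAGES, start=1):
    print(f"Stage {number}: {', '.join(stage)}")
```

The first stage contains both data loader operators because neither depends on the other, which is the sense in which they run in parallel; the merge is what makes downstream operators wait for both.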
7. Creating a Pipeline Task
8. Publishing and Running a Pipeline Task