Bring Your Own Container (BYOC) for Pipelines

Apart from defining pipeline steps based on jobs and scripts, you can use custom container images to define the step runtime.

You can select the container image, command, or entry point for container execution. You can provide the custom code in a script or a compressed archive, which lets you update the code without rebuilding the image.

BYOC Step Configuration

(Optional) Select From container.
Under Container configuration, select Configure.
In the Configure your container environment panel:
1. Select a repository from the list. If the repository is in a different compartment, select Change compartment.
2. Select an image from the list.
3. (Optional) Enter an entry point. To add another, select +Add parameter.
4. (Optional) Enter a CMD. To add another, select +Add parameter.
5. (Optional) Enter an image digest.
6. (Optional) If using signature verification, enter the OCID of the image signature. For example, ocid1.containerimagesignature.oc1.iad.aaaaaaaaab....
7. Select Select.
(Optional) Upload the step artifact by dragging it into the box. This step is optional only if BYOC is configured.
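
As a rough illustration of how the entry point and CMD fields relate, the sketch below follows the usual Docker convention, in which the CMD values are appended to the entry point to form the command line. It assumes the pipeline step follows the same convention. The helper name effective_command and the file name script.py are hypothetical and not part of any OCI SDK.

```python
from typing import List, Optional


def effective_command(entrypoint: Optional[List[str]], cmd: Optional[List[str]]) -> List[str]:
    """Return the argument vector a container would start with.

    Docker convention: entry point items come first and CMD items are
    appended as arguments. Hypothetical helper, for illustration only.
    """
    return list(entrypoint or []) + list(cmd or [])


# Entry point "python" plus CMD "script.py" (hypothetical file name).
print(effective_command(["python"], ["script.py"]))

# With no entry point, the CMD values alone form the command.
print(effective_command(None, ["python", "script.py"]))
```

Keeping the interpreter in the entry point and the script name in CMD makes it possible to change the script per step without touching the image.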

Step Artifact

Uploading a step artifact when using BYOC is optional. However, it lets you change the code that runs inside the container without rebuilding the image.

The step artifacts are mounted in the container to the folder /home/datascience/. If an artifact is an archive, its content is extracted to the folder /home/datascience/decompressed_artifact.

Common Docker image: It's convenient to build a generic container image with the required environment (for example, Python 3.8 and basic libraries, as shown in the Quick Start) and add Python scripts later as step artifacts.
Custom artifacts - folder override: When you use a custom step artifact, the service mounts a volume with the artifact to the /home/datascience folder, overriding the folder in your container image. Archive artifacts (zip/tar/...) are decompressed and the content is presented in the folder /home/datascience/decompressed_artifact.
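
The mount layout described above can be sketched as a small lookup that a container script might run at startup to find its code. The function name resolve_code_dir is hypothetical, and a temporary folder stands in for /home/datascience so that the sketch also runs outside a container.

```python
import tempfile
from pathlib import Path


def resolve_code_dir(base: Path) -> Path:
    """Return the folder that holds the step code.

    Archive artifacts are extracted to <base>/decompressed_artifact;
    a single-file artifact is mounted directly under <base>.
    """
    extracted = base / "decompressed_artifact"
    return extracted if extracted.is_dir() else base


# Inside the container, base would be Path("/home/datascience").
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    # Single-file artifact: the code sits directly in the base folder.
    print(resolve_code_dir(base) == base)
    # Archive artifact: simulate the extracted folder and look again.
    (base / "decompressed_artifact").mkdir()
    print(resolve_code_dir(base).name)
```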

Create a Container Pipeline - Quick Start

Follow these steps to create a container pipeline step.

Building the Container

Use an existing image in the OCI Registry or create a new image using the sample Dockerfile. Here is the sample code that builds an image that lets you run Python scripts.

Dockerfile:

FROM python:3.8-slim AS base
 
ENV DATASCIENCE_USER=datascience
ENV DATASCIENCE_UID=1000
ENV HOME=/home/$DATASCIENCE_USER
 
RUN python -m pip install \
        oci \
        ocifs
 
COPY simplest.py .
CMD ["python", "simplest.py"]

The Dockerfile assumes that the script, simplest.py, is in the same folder as the Dockerfile. Here is sample code for simplest.py:

import datetime
import os
import time
 
pipe_id = os.environ.get("PIPELINE_RUN_OCID", "LOCAL")
print(f"Starting pipeline run: {pipe_id}")
print(f"Current timestamp in UTC: {datetime.datetime.utcnow()}")
 
print("Delay 5s")
 
time.sleep(5)
 
print("Environment variables:")
 
for item, value in os.environ.items():
    print(f"\t {item}: {value}")

Run the docker build command:
```
docker build -t byoc:1.0.0 .
```

Testing the Container

Before pushing the image to a container registry, you can test it locally.

Run the image locally:

docker run --rm -it -v "$HOME/.oci:/home/datascience/.oci" byoc:1.0.0

Confirm the output is similar to:

Starting pipeline run: LOCAL
Current timestamp in UTC: 2024-03-07 14:44:08.506191
Delay 5s
Environment variables:
         PATH: /usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
         HOSTNAME: ae441d10f33e
         TERM: xterm
         LANG: C.UTF-8
         GPG_KEY: E3FF2839C048B25C084DEBE9B26995E310250568
         PYTHON_VERSION: 3.8.18
         PYTHON_PIP_VERSION: 23.0.1
         PYTHON_SETUPTOOLS_VERSION: 57.5.0
         PYTHON_GET_PIP_URL: https://github.com/pypa/get-pip/raw/dbf0c85f76fb6e1ab42aa672ffca6f0a675d9ee4/public/get-pip.py
         PYTHON_GET_PIP_SHA256: dfe9fd5c28dc98b5ac17979a953ea550cec37ae1b47a5116007395bfacff2ab9
         DATASCIENCE_USER: datascience
         DATASCIENCE_UID: 1000
         HOME: /home/datascience

Pushing the Container to OCIR

Follow the steps in the Container Registry documentation to generate an auth token to log in to OCIR.

Log in to Oracle Cloud Infrastructure Registry (OCIR):
```
docker login -u '<tenancy_ocir_namespace>/<username>' <region>.ocir.io
```
For more information, see the Container Registry documentation.

Tag the local container image:

docker tag <local_image_name>:<local_version> <region>.ocir.io/<tenancy_ocir_namespace>/<repo>:<version>
docker tag byoc:1.0.0 iad.ocir.io/testtenancy/byoc:1.0.0

Push the tagged image to OCIR:
```
docker push <region>.ocir.io/<tenancy_ocir_namespace>/<repo>:<version>
docker push iad.ocir.io/testtenancy/byoc:1.0.0
```
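
The tag and push commands both depend on the same image reference format, so it can help to build the reference once. The following is a minimal sketch: the helper ocir_image_ref is hypothetical, and the commands are only printed, not run.

```python
def ocir_image_ref(region: str, namespace: str, repo: str, version: str) -> str:
    """Build <region>.ocir.io/<tenancy_ocir_namespace>/<repo>:<version>."""
    return f"{region}.ocir.io/{namespace}/{repo}:{version}"


local_image = "byoc:1.0.0"
remote_image = ocir_image_ref("iad", "testtenancy", "byoc", "1.0.0")

# Print the commands instead of running them, so nothing is pushed by accident.
print(f"docker tag {local_image} {remote_image}")
print(f"docker push {remote_image}")
```

Note that docker push takes only the remote reference; the local image name appears only in the tag command.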

Creating the Pipeline

See the Pipeline Policies section to ensure that you have policies that let the pipeline run resource pull container images from the container registry.

Create a pipeline to use the container.

Create a pipeline with an appropriate name, for example, BYOC Demo.
Select Add pipeline steps.
Give the step a name, for example, Step 1.
To use Bring Your Own Container, select From container.
In Container configuration, select Configure.
In Configure your container environment:
1. Select the repository that you pushed the image to (byoc in this example) from the list. If the repository is in a different compartment, select Change compartment.
2. Select the image, iad.ocir.io/testtenancy/byoc:1.0.0, from the list.
3. Select Select.
Select Save.
Optional: Define logging.
Select Create.

Enabling the Pipeline Logs

Create a pipeline and start it.

This task is optional. If you don't want to generate logs, you can skip it.

From the list of pipelines, select the pipeline that you want to enable logs for.
From the pipeline details page, select Logs.
Select Enable logs.

Supported Configurations

Important information about configurations that are supported.

ML Pipelines only support container images residing in the OCI Registry.
The size of the container image is limited to 40 GB in uncompressed form.
The user who creates the ML Pipeline resource must have access to the container image in the OCI Registry. If not, create a user access IAM policy before creating the ML Pipeline resource.
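
The first two constraints lend themselves to a quick check before creating the pipeline. The sketch below is illustrative only: check_image is a hypothetical helper, the host pattern is inferred from the <region>.ocir.io form used in the docker commands, and reading "40 GB" as 40 * 1024**3 bytes is an assumption because the unit base isn't stated.

```python
import re

# Assumption: "40 GB" is read as 40 * 1024**3 bytes.
MAX_UNCOMPRESSED_BYTES = 40 * 1024**3

# Assumption: OCI Registry hosts follow the <region>.ocir.io pattern.
OCIR_IMAGE = re.compile(r"^[a-z0-9-]+\.ocir\.io/.+")


def check_image(image_ref: str, uncompressed_bytes: int) -> list:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if not OCIR_IMAGE.match(image_ref):
        problems.append("image is not hosted in the OCI Registry")
    if uncompressed_bytes > MAX_UNCOMPRESSED_BYTES:
        problems.append("uncompressed image exceeds 40 GB")
    return problems


print(check_image("iad.ocir.io/testtenancy/byoc:1.0.0", 2 * 1024**3))
print(check_image("docker.io/library/python:3.8-slim", 50 * 1024**3))
```

The access requirement in the third item depends on IAM policies and can't be verified from the image reference alone.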

Container images on Apple Silicon M1 Mac

For more information, see Docker Image on an Apple M1 MacBook.

Image digests

Images in a container registry are identified by repository, name, and tag. In addition, Docker gives each version of an image a unique alphanumeric digest. When pushing an updated container image, the best practice is to give it a new tag rather than reusing an existing one. However, even if you push an updated image with the same name and tag as an earlier version, the newly pushed version has a different digest from the earlier version.

When you create a pipeline resource, you specify the name and tag of a particular version of an image. To avoid inconsistencies later on, pipelines also record the unique digest of that particular version of the image. You can also provide the digest of the image when creating the pipeline resource.

By default, if you push an updated version of an image to the container registry with the same name and tag as the original version, pipelines continue to use the original digest to pull the original version of the image. This might be the behavior that you require. However, if you want pipelines to pull the later version, you can explicitly update the image name, tag, and digest that pipelines use to identify which version of the image to pull.
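
To make the tag-versus-digest distinction concrete, the sketch below simulates two pushes that reuse the same tag. It relies on the general fact that registry digests are SHA-256 hashes of the image manifest. The byte strings are stand-ins for real manifests, and the dictionary is a hypothetical model of registry state, not an OCI API.

```python
import hashlib


def digest_of(manifest: bytes) -> str:
    """Return a digest string in the sha256:<hex> form that registries display."""
    return "sha256:" + hashlib.sha256(manifest).hexdigest()


# Two pushes that reuse the tag byoc:1.0.0 but carry different content.
first_push = digest_of(b"manifest of the original image")
second_push = digest_of(b"manifest of the updated image")

# Hypothetical registry state: the tag now points at the second push.
tag_to_digest = {"byoc:1.0.0": second_push}

# A pipeline that recorded the digest at creation still refers to the first push.
recorded = first_push
pinned_ref = f"iad.ocir.io/testtenancy/byoc@{recorded}"

print(first_push != second_push)                # same tag, different digests
print(tag_to_digest["byoc:1.0.0"] == recorded)  # does the tag still match the recorded digest?
print(pinned_ref)
```

A reference of the form name@sha256:<hex> always resolves to the same image content, which is why recording the digest protects a pipeline from a tag being reused.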

Oracle Cloud Infrastructure Documentation