Before you Begin with Data Flow
Before you begin using Data Flow, you must have:
- An Oracle Cloud Infrastructure account. Trial accounts can be used to demo Data Flow.
- A Service Administrator role for your Oracle Cloud services. When the service is activated, Oracle sends the credentials and URL to the chosen Account Administrator. The Account Administrator creates an account for each user who needs access to the service.
- A supported browser, such as:
  - Microsoft Internet Explorer 11.x+
  - Mozilla Firefox ESR 38+
  - Google Chrome 42+
- A Spark Application uploaded to Object Storage. Do not provide it packaged in a zipped format such as .zip or .gzip.
- Data for processing loaded into Oracle Cloud Infrastructure Object Storage. Data can be read from external data sources or clouds. Data Flow optimizes performance and security for data stored in an Oracle Cloud Infrastructure Object Store.
- The supported application types are:
  - Java
  - Scala
  - SparkSQL
  - PySpark (Python 3 only)
The following table shows the Spark versions supported by Data Flow. It is for reference only and isn't meant to be comprehensive.

Supported Spark Versions

| Spark Version | Hadoop | Java | Python | Scala | oci-hdfs | oci-java-sdk | Spark Documentation |
|---|---|---|---|---|---|---|---|
| Spark 3.5.0 | 3.3.4 | 17.0.10 | 3.11.5 | 2.12.18 | 3.3.4.1.4.2 | 3.34.1 | Spark Release 3.5.0 Guide |
| Spark 3.2.1 | 3.3.1 | 11.0.14 | 3.8.13 | 2.12.15 | 3.3.1.0.3.2 | 2.45.0 | Spark Release 3.2.1 Guide |
| Spark 3.0.2 | 3.2.0 | 1.8.0_321 | 3.6.8 | 2.12.10 | 3.2.1.3 | 1.25.2 | Spark Release 3.0.2 Guide |
| Spark 2.4.4 | 2.9.2 | 1.8.0_162 | 3.6.8 | 2.11.12 | 2.9.2.6 | 1.25.0 | Spark Release 2.4.4 Guide |

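
Two of the prerequisites above lend themselves to a quick local check before uploading anything: the application file must not be in a zipped format, and the local Python should line up with the Python listed for the chosen Spark version. The following is a minimal sketch, not part of Data Flow or any Oracle SDK. The names `SUPPORTED_SPARK`, `is_zipped`, and `python_matches` are hypothetical, the version values are copied from the table, and comparing only the major and minor Python version is an assumption made for illustration.

```python
from pathlib import PurePosixPath

# Values copied from the "Supported Spark Versions" table (reference only).
SUPPORTED_SPARK = {
    "3.5.0": {"hadoop": "3.3.4", "java": "17.0.10", "python": "3.11.5", "scala": "2.12.18"},
    "3.2.1": {"hadoop": "3.3.1", "java": "11.0.14", "python": "3.8.13", "scala": "2.12.15"},
    "3.0.2": {"hadoop": "3.2.0", "java": "1.8.0_321", "python": "3.6.8", "scala": "2.12.10"},
    "2.4.4": {"hadoop": "2.9.2", "java": "1.8.0_162", "python": "3.6.8", "scala": "2.11.12"},
}

# Only the two suffixes named in the prerequisites; other archive
# formats are deliberately not guessed at here.
ZIPPED_SUFFIXES = (".zip", ".gzip")


def is_zipped(object_name: str) -> bool:
    """Return True if the object name ends in .zip or .gzip (case-insensitive)."""
    return PurePosixPath(object_name).suffix.lower() in ZIPPED_SUFFIXES


def python_matches(spark_version: str, local_python: str) -> bool:
    """Compare major.minor of a local Python with the table entry.

    Assumption: matching on major.minor is close enough for a local
    sanity check; the source does not state a compatibility rule.
    Raises KeyError for a Spark version that is not in the table.
    """
    expected = SUPPORTED_SPARK[spark_version]["python"]
    return expected.split(".")[:2] == local_python.split(".")[:2]


if __name__ == "__main__":
    import platform

    # Hypothetical object name, used only for illustration.
    print(is_zipped("apps/wordcount.py"))
    print(python_matches("3.5.0", platform.python_version()))
```

A check like this only inspects names and version strings. It does not contact Object Storage or Data Flow, so it cannot confirm that an upload exists or that an application will run.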
Note
Avoid entering confidential information when assigning descriptions, tags, or friendly names to your cloud resources through the Oracle Cloud Infrastructure Console, API, or CLI. This applies when creating or editing an application in Data Flow.