Getting Started with Spark-Submit and SDK
A tutorial to help you use the Java SDK to run a Spark application in Data Flow with spark-submit and the execute string.
Get started with spark-submit in Data Flow using the Java SDK. Follow the existing tutorial for Getting Started with Oracle Cloud Infrastructure Data Flow, but use the Java SDK to run spark-submit commands.
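The end-to-end pattern is to build a CreateRunDetails object whose execute field carries the spark-submit command line, then submit it with a DataFlowClient. The following is a minimal sketch, assuming the OCI Java SDK's com.oracle.bmc.dataflow package; the compartment OCID, JAR URI, shapes, and Spark version are placeholders, and the builder field names should be verified against the SDK version you use.

    import com.oracle.bmc.auth.ConfigFileAuthenticationDetailsProvider;
    import com.oracle.bmc.dataflow.DataFlowClient;
    import com.oracle.bmc.dataflow.model.CreateRunDetails;
    import com.oracle.bmc.dataflow.requests.CreateRunRequest;
    import com.oracle.bmc.dataflow.responses.CreateRunResponse;

    public class SparkSubmitExample {
        public static void main(String[] args) throws Exception {
            // Read credentials from the DEFAULT profile in ~/.oci/config.
            ConfigFileAuthenticationDetailsProvider provider =
                    new ConfigFileAuthenticationDetailsProvider("DEFAULT");

            try (DataFlowClient client = DataFlowClient.builder().build(provider)) {
                // The execute string holds the spark-submit command line.
                // Compartment OCID, JAR URI, shapes, and Spark version below
                // are placeholders; replace them with your own values.
                CreateRunDetails details = CreateRunDetails.builder()
                        .compartmentId("ocid1.compartment.oc1..<your-compartment>")
                        .displayName("spark-submit-via-sdk")
                        .execute("--class example.Main oci://bucket@namespace/app.jar")
                        .driverShape("VM.Standard2.1")
                        .executorShape("VM.Standard2.1")
                        .numExecutors(1)
                        .sparkVersion("3.2.1")
                        .build();

                CreateRunResponse response = client.createRun(
                        CreateRunRequest.builder().createRunDetails(details).build());
                System.out.println("Created run: " + response.getRun().getId());
            }
        }
    }

The call is asynchronous: createRun returns as soon as Data Flow accepts the run, not when the Spark job finishes.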
Before You Begin
Complete the prerequisites before you can use spark-submit commands in Data Flow with the Java SDK.
1. ETL with Java
Use spark-submit and the Java SDK to carry out ETL with Java.
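The execute string for a Java ETL job reads like a local spark-submit invocation: the main class, the application JAR in Object Storage, then the application arguments. A hedged fragment that would slot into the execute(...) call of the sketch above; the class name and every URI here are hypothetical placeholders, not the tutorial's actual artifacts.

    // Hypothetical spark-submit command line for a Java ETL job. The main
    // class, JAR location, and input/output URIs are placeholders for your
    // own artifacts in Object Storage.
    String execute = String.join(" ",
            "--class com.example.etl.Convert",
            "oci://bucket@namespace/etl-app.jar",
            "oci://bucket@namespace/input/",
            "oci://bucket@namespace/output/");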
2. Machine Learning with PySpark
Use spark-submit and the Java SDK to carry out machine learning with PySpark.
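For PySpark, the execute string points at a Python script instead of a JAR, and packaged dependencies travel with --py-files just as they would with a local spark-submit. Again a hypothetical fragment for the execute(...) call; the configuration value, dependency archive, script, and data URIs are all placeholders.

    // Hypothetical spark-submit command line for a PySpark job. The
    // --py-files archive, driver script, and data URI are placeholders.
    String execute = String.join(" ",
            "--conf spark.sql.shuffle.partitions=16",
            "--py-files oci://bucket@namespace/deps.zip",
            "oci://bucket@namespace/ml_job.py",
            "oci://bucket@namespace/training-data/");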
What's Next
Use spark-submit and the Java SDK in other situations.
You can use spark-submit and the Java SDK to create and run Java, Python, or SQL applications with Data Flow, and explore the results. Data Flow handles all details of deployment, teardown, log management, security, and UI access. With Data Flow, you focus on developing Spark applications without worrying about the infrastructure.
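To explore results programmatically, one option is to poll the run with GetRun until it reaches a terminal state, then read its output from the logs bucket. A minimal sketch, assuming the Run.LifecycleState values Accepted and InProgress mark the non-terminal states; verify the enum constants against your SDK version.

    import com.oracle.bmc.dataflow.DataFlowClient;
    import com.oracle.bmc.dataflow.model.Run;
    import com.oracle.bmc.dataflow.requests.GetRunRequest;

    public class RunWatcher {
        // Polls a run every 30 seconds until it leaves the Accepted and
        // InProgress states, then returns the terminal lifecycle state.
        static Run.LifecycleState waitForRun(DataFlowClient client, String runId)
                throws InterruptedException {
            while (true) {
                Run run = client.getRun(
                        GetRunRequest.builder().runId(runId).build()).getRun();
                Run.LifecycleState state = run.getLifecycleState();
                if (state != Run.LifecycleState.Accepted
                        && state != Run.LifecycleState.InProgress) {
                    return state;
                }
                Thread.sleep(30_000);
            }
        }
    }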