Data Science now connects to Data Flow
- Services: Data Science
- Release Date: November 15, 2022
You can connect to Data Flow and run an Apache Spark application from a Data Science notebook session. These sessions let you run interactive Spark workloads on a long-lasting Data Flow cluster through an Apache Livy integration.
Data Flow integration with Data Science uses fully managed Jupyter Notebooks to enable data scientists and data engineers to create, visualize, collaborate on, and debug data engineering and data science applications. You can write these applications in Python, Scala, and PySpark. You can also connect a Data Science notebook session to Data Flow to run applications. The Data Flow Studio kernels and applications run on Oracle Cloud Infrastructure Data Flow.
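As a minimal sketch of what this looks like in practice (not part of the release note itself), the cells below assume the Data Flow Spark magic commands shipped with a PySpark conda environment in the notebook session; the compartment OCID, display name, shapes, and session options are placeholder assumptions you would replace with your own values.

```python
# Notebook cell 1: authenticate and create a Data Flow session (illustrative sketch;
# OCIDs, shapes, and Spark version below are placeholders).
import json
import ads

# Use the notebook session's resource principal for authentication.
ads.set_auth("resource_principal")

# Load the Data Flow magic commands into the notebook kernel.
%load_ext dataflow.magics

# Describe the long-lasting Data Flow cluster reachable through the Livy integration.
session_config = json.dumps({
    "compartmentId": "ocid1.compartment.oc1..<placeholder>",
    "displayName": "interactive-spark-session",
    "sparkVersion": "3.2.1",
    "driverShape": "VM.Standard.E4.Flex",
    "executorShape": "VM.Standard.E4.Flex",
    "numExecutors": 2,
})
command = f"'{session_config}'"

# Create the interactive Data Flow session.
%create_session -l python -c $command
```

```python
%%spark
# Notebook cell 2: this code runs remotely on the Data Flow cluster, not in the
# notebook session itself.
df = spark.range(0, 100)
df.show(5)
```

The general pattern, regardless of the exact session options you choose, is that the notebook kernel submits each `%%spark` cell to the remote Data Flow cluster over Livy, so the interactive workload scales with the cluster rather than the notebook session.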
Data Flow is a fully managed Apache Spark service that performs processing tasks on extremely large datasets, without the need to deploy or manage infrastructure. For more information, see the Data Flow documentation.