Using OCI HDFS Connector

The OCI Hadoop Distributed File System (HDFS) connector lets your Apache Hadoop application read data from and write data to Object Storage.

Important

Use the HDFS connector only with Big Data Service versions earlier than 3.0.4. For version 3.0.4 or later, use the Object Storage API Key Integration to connect to Object Storage. The Big Data Service version is displayed on the Cluster Information tab of the cluster details page.

Examples of using HDFS Connector from Big Data Service Clusters

Before you run these examples, create the IAM policies required to access the Object Storage buckets and other resources.

  • Hadoop
    hadoop fs -ls oci://<bucket-name>@<namespace>/
  • Spark
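    // Run in spark-shell, where sc (the SparkContext) is already defined.
    import org.apache.spark._
    val conf = sc.getConf
    // Read the objects in the bucket as lines of text, then show them as a DataFrame.
    val test_prefix = sc.textFile("oci://<bucket-name>@<namespace>/")
    test_prefix.toDF().show()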
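
Both examples address the bucket with a URI of the form oci://<bucket-name>@<namespace>/. As a minimal sketch, the shell function below assembles such a URI from its parts so that scripts do not repeat the string. The function name oci_uri and its optional third argument (an object path appended after the trailing slash) are hypothetical and do not come from the connector itself. See the URI format reference for the authoritative syntax.

```shell
# Hypothetical helper (not part of the HDFS connector): prints an oci:// URI
# built from a bucket name, a namespace, and an optional object path.
oci_uri() {
  bucket="$1"
  namespace="$2"
  path="${3:-}"
  # Drop one leading slash so exactly one slash follows the namespace.
  path="${path#/}"
  printf 'oci://%s@%s/%s\n' "$bucket" "$namespace" "$path"
}
```

With the placeholder values my-bucket and my-namespace, the result can be passed to the Hadoop command shown above:

    hadoop fs -ls "$(oci_uri my-bucket my-namespace)"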

References: URI Formats for HDFS Connectors