Integrating with Data Catalog External Hive Metastore
Data Catalog provides a highly available and scalable metastore for Hive implementations.
Prerequisites
You must already have:
- Created a Big Data Service cluster with version 3.0.3 or later. The Big Data Service Version is displayed on the Cluster Information tab of the Cluster details page.
- Created a Data Catalog metastore and retrieved the OCID for the metastore.
- Configured the OCI HDFS Connector for Object Storage.
- Generated an API key and downloaded the private key.
- Copied the private key to all the nodes in the Big Data Service cluster.
Validate the Cluster
You can sign in to the Big Data Service cluster and run Spark jobs with the spark-shell, spark-sql, spark-submit, or spark-beeline client. Use the following examples to validate the cluster.
Examples for Managed Table
CREATE DATABASE IF NOT EXISTS managed_db LOCATION 'oci://<bucket-name>@<tenancy-name-of-bucket>/<path/to/managed/table/directory>'
DESCRIBE DATABASE EXTENDED managed_db
USE managed_db
create table IF NOT EXISTS myINTtable_metastorecert (id int, name string) partitioned by (part int, part2 int)
insert into myINTtable_metastorecert partition(part=1, part2=1) values (3, "SK")
show partitions myINTtable_metastorecert
msck repair table myINTtable_metastorecert
show tables
show databases
Examples for External Table
CREATE DATABASE IF NOT EXISTS external_db LOCATION 'oci://<bucket-name>@<tenancy-name-of-bucket>/<path/to/external/table/directory>'
DESCRIBE DATABASE EXTENDED external_db
USE external_db
CREATE EXTERNAL TABLE external_test(a string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' location 'oci://<bucket-name>@<tenancy-name-of-bucket>/<path/to/external/table/directory>'
select * from external_test
select count(*) from external_test
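The validation statements above all depend on a correctly formed oci:// location. As a rough sketch only, the following Python helper assembles that location from its parts and generates the first few validation statements. The function names (oci_location, validation_statements, run_all) and the sample bucket, tenancy, and path values are hypothetical and do not come from Big Data Service or Data Catalog.

```python
def oci_location(bucket, tenancy, path):
    """Build an oci://<bucket-name>@<tenancy-name-of-bucket>/<path> string.

    Hypothetical helper: it only formats the string and does not check
    that the bucket or path actually exists in Object Storage.
    """
    if not bucket or not tenancy:
        raise ValueError("bucket and tenancy must be non-empty")
    # Strip leading and trailing slashes so the result has exactly one
    # separator between the tenancy and the path.
    return f"oci://{bucket}@{tenancy}/{path.strip('/')}"


def validation_statements(db_name, location):
    """Return a subset of the validation statements for one database."""
    return [
        f"CREATE DATABASE IF NOT EXISTS {db_name} LOCATION '{location}'",
        f"DESCRIBE DATABASE EXTENDED {db_name}",
        f"USE {db_name}",
        "SHOW TABLES",
    ]


def run_all(statements, execute):
    """Pass each statement, in order, to the supplied callable."""
    for statement in statements:
        execute(statement)


if __name__ == "__main__":
    # Sample values are placeholders, not real resources.
    location = oci_location("my-bucket", "my-tenancy", "/warehouse/managed/")
    run_all(validation_statements("managed_db", location), print)
```

Run as a script, this only prints the statements. Assuming a PySpark session named spark is available on the cluster, passing spark.sql instead of print as the execute argument would submit each statement to Spark.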