Enabling Hive ACID

If you're using Big Data Service 3.0.9 or later, Spark and Hive share the same catalog, 'hive', and ACID is disabled in Hive by default. To enable ACID, update the following configurations in the Ambari UI. After ACID is enabled in Hive, Spark can't read from or write to Hive managed tables, so be sure to update the catalog within Spark. Only external tables work correctly from Spark.

  1. Access Apache Ambari.
  2. From the side toolbar, under Services select Hive.
  3. Select Configs.
  4. Under Custom hive-site, enter the following configuration information:
    • Search for hive.support.concurrency: Enable, and then save
    • Search for hive.txn.manager: Update value to org.apache.hadoop.hive.ql.lockmgr.DbTxnManager, and then save
    • Search for hive.enforce.bucketing: Enable, and then save
    • Search for hive.exec.dynamic.partition.mode: Update to nonstrict, and then save
    • Search for hive.compactor.initiator.on: Enable, and then save
    • Search for hive.compactor.worker.threads: Set to 5, and then save
    • Search for hive.strict.managed.tables: Update to true, and then save
    • Search for hive.create.as.insert.only: Enable, and then save
    • Search for metastore.create.as.acid: Enable, and then save
  5. Restart all impacted services.
    Note

    If you create a table with TBLPROPERTIES ('transactional'='true', 'transactional_properties'='insert_only'), a normal INSERT INTO statement on that table fails with the following error in the hiveserver/hivemetastore logs:

    Caused by: MetaException(message:Your client does not appear to support insert-only tables. To skip capability checks, please set metastore.client.capability.check to false. This setting can be set globally, or on the client for the current metastore session. Note that this may lead to incorrect results, data loss, undefined behavior, etc. if your client is actually incompatible. You can also specify custom client capabilities via get_table_req API.)

    To resolve:

    1. In Ambari, select Hive > Configs.
    2. Under Custom hive-site, select Add, and then set metastore.client.capability.check to false.
    3. Save the change, and then restart all impacted services.
  6. Restart the Hive service.
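
After the restart, it can help to confirm that the values from step 4 actually reached the Hive configuration. The following Python sketch is one possible way to do that, not an official tool: it parses the content of a hive-site.xml file (standard Hadoop configuration/property/name/value layout) and reports any ACID setting that differs from the values listed above. The function names and the sample XML are hypothetical, and the sketch assumes that "Enable" in the Ambari UI is stored as the value true.

```python
import xml.etree.ElementTree as ET

# Expected values taken from step 4 above.
# Assumption: "Enable" in the Ambari UI corresponds to the stored value "true".
EXPECTED_ACID_SETTINGS = {
    "hive.support.concurrency": "true",
    "hive.txn.manager": "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager",
    "hive.enforce.bucketing": "true",
    "hive.exec.dynamic.partition.mode": "nonstrict",
    "hive.compactor.initiator.on": "true",
    "hive.compactor.worker.threads": "5",
    "hive.strict.managed.tables": "true",
    "hive.create.as.insert.only": "true",
    "metastore.create.as.acid": "true",
}


def parse_hive_site(xml_text):
    """Return a dict of property name -> value from hive-site.xml content."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def values_match(name, want, got):
    """Compare one setting; class names are compared exactly, other values ignore case."""
    if name == "hive.txn.manager":
        return got == want
    return got.lower() == want.lower()


def find_mismatches(actual, expected=EXPECTED_ACID_SETTINGS):
    """Return {name: (expected, actual_or_None)} for settings that are missing or differ."""
    mismatches = {}
    for name, want in expected.items():
        got = actual.get(name)
        if got is None or not values_match(name, want, got):
            mismatches[name] = (want, got)
    return mismatches


# Hypothetical sample content; in practice, read the hive-site.xml from your cluster.
sample = """<configuration>
  <property><name>hive.support.concurrency</name><value>true</value></property>
  <property><name>hive.compactor.worker.threads</name><value>1</value></property>
</configuration>"""

for name, (want, got) in find_mismatches(parse_hive_site(sample)).items():
    print(f"{name}: expected {want!r}, found {got!r}")
```

To check a real cluster, replace the sample string with the content of the hive-site.xml file that the Hive service uses. The file location depends on the installation, so it is left to the reader.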