Row Level Transactions in Hive

I am new to HiveQL. While creating a table, I came to know that we need to set a few transaction-related properties to TRUE. Then I went through what they are:

hive> set hive.support.concurrency=true;
hive> set hive.enforce.bucketing=true;
hive> set hive.exec.dynamic.partition.mode=nonstrict;
hive> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
hive> set hive.compactor.initiator.on=true;
hive> set hive.compactor.worker.threads=1;  -- a positive number, on at least one instance of the Thrift metastore service
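
For context, a minimal sketch of the kind of table these settings are meant to enable (the table name and columns below are made up for illustration): a transactional (ACID) table in these Hive versions has to be bucketed, stored as ORC, and marked transactional.

    -- Illustrative table; name and columns are hypothetical.
    CREATE TABLE hello_acid (key INT, value STRING)
    CLUSTERED BY (key) INTO 3 BUCKETS
    STORED AS ORC
    TBLPROPERTIES ('transactional'='true');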

What exactly are concurrency, bucketing, and dynamic partition mode = 'nonstrict'?

I have been trying to learn about these, but all I keep finding is information about locking mechanisms, ZooKeeper, and in-memory concepts.

Since I am new to this area, I have not been able to get a proper understanding of these properties.

Can anyone explain them?

From the Hive documentation:

hive.support.concurrency

Whether Hive supports concurrency or not. A ZooKeeper instance must be up and running for the default Hive lock manager to support read-write locks.

Set to true to support INSERT ... VALUES, UPDATE, and DELETE transactions (Hive 0.14.0 and later). For a complete list of parameters required for turning on Hive transactions, see hive.txn.manager.
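
Once all of these parameters are set, row-level DML becomes available. A hedged sketch of what that looks like, reusing the hypothetical hello_acid table sketched above (any bucketed ORC table created with TBLPROPERTIES ('transactional'='true') would do):

    -- Row-level DML on a transactional (ACID) table.
    INSERT INTO TABLE hello_acid VALUES (1, 'a'), (2, 'b'), (3, 'c');
    UPDATE hello_acid SET value = 'updated' WHERE key = 2;
    DELETE FROM hello_acid WHERE key = 3;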

hive.enforce.bucketing

Whether bucketing is enforced. If true, while inserting into the table, bucketing is enforced.
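
In practice this means that an INSERT into a table declared with CLUSTERED BY ... INTO N BUCKETS is automatically hashed into N bucket files, without the user having to set the reducer count by hand. A small hedged sketch (table and column names are invented for illustration):

    -- Rows are hashed on user_id into 4 bucket files when enforcement is on.
    CREATE TABLE users_bucketed (user_id INT, name STRING)
    CLUSTERED BY (user_id) INTO 4 BUCKETS
    STORED AS ORC;

    -- users_staging is a hypothetical source table.
    INSERT INTO TABLE users_bucketed SELECT user_id, name FROM users_staging;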

hive.exec.dynamic.partition.mode

In strict mode, the user must specify at least one static partition in case the user accidentally overwrites all partitions. In nonstrict mode all partitions are allowed to be dynamic.
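
As an illustration of the difference (table and column names are hypothetical): the first INSERT below uses only dynamic partition columns, so it is rejected in strict mode but allowed in nonstrict mode; the second pins one partition column statically, so it works in either mode.

    -- All partition columns dynamic: requires nonstrict mode.
    INSERT INTO TABLE sales PARTITION (country, dt)
    SELECT amount, country, dt FROM sales_staging;

    -- One static partition column: permitted even in strict mode.
    INSERT INTO TABLE sales PARTITION (country='US', dt)
    SELECT amount, dt FROM sales_staging WHERE country = 'US';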

hive.txn.manager

Set this to org.apache.hadoop.hive.ql.lockmgr.DbTxnManager as part of turning on Hive transactions. The default DummyTxnManager replicates pre-Hive-0.13 behavior and provides no transactions.

hive.compactor.initiator.on

Whether to run the initiator and cleaner threads on this metastore instance. Set this to true on one instance of the Thrift metastore service as part of turning on Hive transactions. For a complete list of parameters required for turning on transactions, see hive.txn.manager.

It's critical that this is enabled on exactly one metastore service instance (not enforced yet).

hive.compactor.worker.threads

How many compactor worker threads to run on this metastore instance. Set this to a positive number on one or more instances of the Thrift metastore service as part of turning on Hive transactions. For a complete list of parameters required for turning on transactions, see hive.txn.manager.

Worker threads spawn MapReduce jobs to do compactions. They do not do the compactions themselves. Increasing the number of worker threads will decrease the time it takes tables or partitions to be compacted once they are determined to need compaction. It will also increase the background load on the Hadoop cluster as more MapReduce jobs will be running in the background.
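
Compaction normally happens automatically in the background once the initiator and workers are running, but you can also request one and inspect the queue yourself; a hedged sketch, reusing the hypothetical hello_acid table from above:

    -- Ask for a major compaction of the table, then check the compactor queue.
    ALTER TABLE hello_acid COMPACT 'major';
    SHOW COMPACTIONS;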