Cassandra connections - Shared 'Cluster' instance or multiple?

Question

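What is the best practice for managing connections when using the Cassandra driver in a Java project? Specifically, is it better to let multiple threads share a single Cluster instance, or to give each thread that needs to talk to Cassandra its own Cluster instance?

I set up my Cluster instance following the example code, like this: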

Cluster.builder().addContactPoint(HOST).withPort(PORT)
    .withCredentials(USER, PASS).build();

So what I am asking is whether the preferred approach is something like this (a single shared Cluster instance):

private static Cluster _cluster = null;

public static Cluster connect() {
    if (_cluster != null && ! _cluster.isClosed()) {
        //return the cached instance
        return _cluster;
    }

    //create a new instance
    _cluster = Cluster.builder().addContactPoint(HOST).withPort(PORT)
                .withCredentials(USER, PASS).build();
    return _cluster;
}

...or is it best practice to return multiple Cluster instances? Like this:

public static Cluster connect() {
    //every caller gets their own Cluster instance
    return Cluster.builder().addContactPoint(HOST).withPort(PORT)
                .withCredentials(USER, PASS).build();
}

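I guess the core points of this question are:

Is building a new Cluster instance an expensive operation?
Will the Cluster object internally manage/pool connections to the backing datastore, or does it function more like an abstraction of a single connection?
Is the Cluster object thread-safe?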

Answer 1

Is building a new Cluster instance an expensive operation?

Calling build to construct a Cluster instance performs no network I/O, so it is an inexpensive operation.

Will the Cluster object internally manage/pool connections to the backing datastore, or does it function more like an abstraction of a single connection?

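The expensive part is calling cluster.init(), which creates a single connection (the control connection) to one of your contact points. cluster.connect() is even more expensive, because it initializes the Cluster (if that has not already happened) and creates a Session, which manages a connection pool to each discovered host (with pool size based on your PoolingOptions). So yes, a Cluster has a 'control connection' to manage the state of hosts, and each Session created via Cluster.connect() will have a connection pool to each host.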
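To make the cost difference concrete, here is a rough back-of-the-envelope sketch of the arithmetic implied above: one control connection per Cluster, plus one pool per host for each Session. The class name `ConnectionEstimate` and all numbers are hypothetical, and the sketch assumes every host gets the same pool size, which a real PoolingOptions configuration need not do.

```java
/** Rough connection-count estimate; a sketch, not driver code. */
final class ConnectionEstimate {
    private ConnectionEstimate() {}

    /**
     * Connections opened by one Cluster: one control connection plus,
     * for each Session, a pool of poolSize connections to every host.
     * Assumes a uniform pool size across hosts (an illustration only).
     */
    static int perCluster(int sessions, int hosts, int poolSize) {
        return 1 + sessions * hosts * poolSize;
    }

    /** Total when several independent Cluster instances are created. */
    static int total(int clusters, int sessions, int hosts, int poolSize) {
        return clusters * perCluster(sessions, hosts, poolSize);
    }
}
```

With an assumed 3 hosts, 2 connections per host, and one Session per Cluster, a single shared Cluster gives `total(1, 1, 3, 2)` = 7 connections, while 20 threads each building their own Cluster would give `total(20, 1, 3, 2)` = 140.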

Is the Cluster object thread-safe?

Put simply, yes :)
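Note that the cached connect() method in the question checks and assigns _cluster without any synchronization, so two threads calling it at the same time could each build a Cluster. One possible way to share a single instance safely is a small synchronized holder. The sketch below is generic over any Closeable so it runs without the driver; `SharedHolder` is a hypothetical name, not part of the DataStax API.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Supplier;

/**
 * Holds one lazily created, shared instance of a closeable resource.
 * In the question's setting, T would be Cluster (an assumption of this sketch).
 */
final class SharedHolder<T extends Closeable> implements Closeable {
    private final Supplier<T> factory;
    private T instance;      // guarded by "this"
    private boolean closed;  // guarded by "this"

    SharedHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the shared instance, creating it on first use. */
    synchronized T get() {
        if (closed) {
            throw new IllegalStateException("holder is closed");
        }
        if (instance == null) {
            instance = factory.get();  // runs at most once while open
        }
        return instance;
    }

    /** Closes the shared instance, if one was ever created. */
    @Override
    public synchronized void close() throws IOException {
        closed = true;
        if (instance != null) {
            instance.close();
            instance = null;
        }
    }
}
```

In the question's terms, the factory passed to the constructor would be something like `() -> Cluster.builder().addContactPoint(HOST).withPort(PORT).withCredentials(USER, PASS).build()`, and every thread would call `get()` on the same holder.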

"4 simple rules when using the DataStax drivers for Cassandra" provides further guidance on this topic.

java

connection

concurrency

cassandra