TokenAware policy in Cassandra and several nodes in one query

Question

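What happens if our query contains several tokens that end up on different nodes? Is it possible for the client to run multiple sync or async queries against the nodes?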

Example 1:

// Our query (CQL has no OR operator, so the keys are listed with IN)
SELECT * FROM keyspace1.standard1 WHERE key IN (1, 2, 3);

// The client splits our query into multiple queries based on the token ranges and runs them sync or async.
SELECT * FROM keyspace1.standard1 WHERE key IN (1, 3); // tokens on node X
SELECT * FROM keyspace1.standard1 WHERE key = 2; // token on node Y

Example 2:

 // Our query
 SELECT * FROM kspc.standard1;

 // The client splits our query into multiple queries over the token ranges and runs them sync or async.
 SELECT * FROM kspc.standard1 WHERE token(key) > [start range node1] AND token(key) <= [end range node1];
 SELECT * FROM kspc.standard1 WHERE token(key) > [start range node2] AND token(key) <= [end range node2];
 // and so on for the remaining nodes

Answer 1

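For example 1, query one partition at a time and merge the results on the client side. This will be much faster. The DataStax driver has a token-aware policy, but it only works when the query refers to a single partition.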

You can refer to this link.
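The "one query per partition, merge on the client" approach can be sketched roughly as follows. This is a minimal, driver-agnostic sketch: `fetch_partitions` and `execute_one` are hypothetical names, and `execute_one` stands in for whatever call runs a prepared `SELECT * FROM keyspace1.standard1 WHERE key = ?` through your driver and returns a list of rows. A thread pool is used only to keep the example self-contained; real drivers usually offer their own async execution.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_partitions(execute_one, keys, max_workers=8):
    """Run one single-partition query per key and merge the rows.

    execute_one is a hypothetical callable: key -> list of rows.
    In a real application it would wrap a prepared statement so the
    driver's token-aware policy can route each query on its own.
    """
    keys = list(keys)
    if not keys:
        return []
    with ThreadPoolExecutor(max_workers=min(max_workers, len(keys))) as pool:
        # pool.map keeps the results in the same order as the input keys
        per_key_rows = list(pool.map(execute_one, keys))
    merged = []
    for rows in per_key_rows:
        merged.extend(rows)
    return merged
```

Because every statement carries exactly one partition key, each of them can be routed to a replica that owns that key, and no single coordinator has to collect the data from the other nodes.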

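Example 2 is an anti-pattern query, and you can't expect the client to do all that work for you. If you want to read the complete table, you can use Spark. DataStax provides the spark-cassandra-connector, which offers the same functionality as what you describe. Here you can find the description of the spark-cassandra-connector.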

Answer 2

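As Manish mentioned, if the query involves several partitions, the token-aware policy won't select a replica, and the query will be sent to any node in the cluster (the same behavior applies to non-prepared queries and DDL). In general this is an anti-pattern, because it puts more load on the nodes, and it should be avoided. But if you really need it, you can force the driver to send the query to one of the nodes that owns a specific partition key. In Java driver 3.x there is a function statement.setRoutingKey, and for Java driver 4.x there should be something similar. Other drivers should have something similar too, but maybe not.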
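To see why a multi-partition query has no single obvious target, here is a toy model of a token ring that groups keys by the node owning them. Everything in it is an assumption for illustration: `toy_token` is a stand-in hash (it is not the Murmur3 hash Cassandra uses), the ring has one token per node, and replication is ignored, so only the primary owner is computed.

```python
import bisect
import hashlib


def toy_token(key):
    """Stand-in for the partitioner hash (NOT Murmur3): signed 64-bit from MD5."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def owner(ring, token):
    """ring is a list of (token, node) pairs sorted by token.

    A node owns the range (previous token, its own token], so the owner
    is the first node whose token is >= the key's token; tokens above
    the last node's token wrap around to the first node.
    """
    tokens = [t for t, _ in ring]
    index = bisect.bisect_left(tokens, token)
    return ring[index % len(ring)][1]


def group_by_owner(ring, keys, token_fn=toy_token):
    """Group partition keys by the node that owns their token."""
    groups = {}
    for key in keys:
        groups.setdefault(owner(ring, token_fn(key)), []).append(key)
    return groups
```

Each group could then go out as its own statement, with the routing key set explicitly to one of the keys in that group, which is the workaround described above.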

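For the second class of queries it's the same: by default the driver can't find out which node to send the query to, and the routing key should be set explicitly. But in general, a full table scan can be tricky, because you need to handle the conditions on the lower and upper bounds, and you can't expect the token ranges to start exactly at the lower bound. There may be a token range that starts near the upper bound and ends slightly above the lower bound. This is a typical error that I see regularly. If you're interested, I have an example of how to perform a full table scan in Java (it uses the same algorithm as Spark Cassandra Connector and DSBulk); the main part is this cycle over the available token ranges. But if you're looking into writing a full table scan yourself, think about using the DSBulk parts as an SDK: you need to look into the partitioner module, which was designed specifically for that.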
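The wrap-around problem can be sketched like this. It is a simplified illustration, not the code from the linked example or from DSBulk: `unwrapped_ranges` and `to_predicate` are hypothetical helpers, the ring is given as a plain list of integer tokens, and ranges are treated as start-exclusive and end-inclusive. The range that runs from the highest token back around to the lowest one is split into two open-ended predicates so that no part of the ring is skipped.

```python
def unwrapped_ranges(ring_tokens):
    """Return (start, end] ranges covering the whole ring.

    None means "no bound on this side". The wrap-around range
    (highest token, lowest token] is split into two open-ended pieces.
    """
    tokens = sorted(set(ring_tokens))
    if not tokens:
        raise ValueError("ring has no tokens")
    ranges = list(zip(tokens, tokens[1:]))
    ranges.append((tokens[-1], None))  # upper part of the wrap-around range
    ranges.append((None, tokens[0]))   # lower part of the wrap-around range
    return ranges


def to_predicate(token_range, column="key"):
    """Render one range as a CQL WHERE fragment on token(column)."""
    start, end = token_range
    parts = []
    if start is not None:
        parts.append(f"token({column}) > {start}")
    if end is not None:
        parts.append(f"token({column}) <= {end}")
    return " AND ".join(parts)
```

For a ring with tokens -100, 0 and 100 this yields four predicates, the last two being `token(key) > 100` and `token(key) <= -100`. Those two cover the range that a naive loop from the lower bound to the upper bound would miss.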


cassandra