Cassandra query fails after data load of 5m rows - Cassandra failure during read query at consistency LOCAL_ONE
Question

A simple CQL select fails after I load a large amount of data.

Setup

I am using the following Cassandra schema:
CREATE KEYSPACE fv WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

CREATE TABLE entity_by_identifier (
    identifier text,
    state entity_state,
    PRIMARY KEY (identifier)
);

CREATE TYPE entity_state (
    identifier text,
    number1 int,
    number2 double,
    entity_type text,
    string1 text,
    string2 text
);
The query I am trying to execute:
SELECT * FROM fv.entity_by_identifier WHERE identifier=:identifier;
Problem

This query works fine on a small dataset (tested with 500 rows). However, for a larger load test I created over 5 million rows in this table before executing the query repeatedly (10 threads running it continuously for 1 hour).

Once the data load completed, the queries started but failed immediately with the following error:
com.datastax.driver.core.exceptions.ReadFailureException: Cassandra failure during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded, 1 failed)
at com.datastax.driver.core.exceptions.ReadFailureException.copy(ReadFailureException.java:85)
at com.datastax.driver.core.exceptions.ReadFailureException.copy(ReadFailureException.java:27)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:64)
...my calling classes...
I checked the Cassandra logs and found only this exception:
java.lang.AssertionError: null
at org.apache.cassandra.db.rows.BTreeRow.getCell(BTreeRow.java:212) ~[apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.canRemoveRow(SinglePartitionReadCommand.java:899) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.reduceFilter(SinglePartitionReadCommand.java:863) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndSSTablesInTimestampOrder(SinglePartitionReadCommand.java:748) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDiskInternal(SinglePartitionReadCommand.java:519) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDisk(SinglePartitionReadCommand.java:496) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.queryStorage(SinglePartitionReadCommand.java:358) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.ReadCommand.executeLocally(ReadCommand.java:366) ~[apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1797) ~[apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2466) ~[apache-cassandra-3.7.jar:3.7]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.concurrent.SEPExecutor.maybeExecuteImmediately(SEPExecutor.java:192) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.AbstractReadExecutor.makeRequests(AbstractReadExecutor.java:117) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.AbstractReadExecutor.makeDataRequests(AbstractReadExecutor.java:85) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.AbstractReadExecutor$NeverSpeculatingReadExecutor.executeAsync(AbstractReadExecutor.java:214) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.doInitialQueries(StorageProxy.java:1702) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1657) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1604) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1523) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.db.SinglePartitionReadCommand.execute(SinglePartitionReadCommand.java:335) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:67) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.pager.SinglePartitionPager.fetchPage(SinglePartitionPager.java:34) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.statements.SelectStatement$Pager$NormalPager.fetchPage(SelectStatement.java:325) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:361) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:237) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:78) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:208) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:486) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:463) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:130) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401) [apache-cassandra-3.7.jar:3.7]
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.36.Final.jar:4.0.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292) [netty-all-4.0.36.Final.jar:4.0.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.access0(AbstractChannelHandlerContext.java:32) [netty-all-4.0.36.Final.jar:4.0.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.run(AbstractChannelHandlerContext.java:283) [netty-all-4.0.36.Final.jar:4.0.36.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.7.jar:3.7]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
As you can see, I am using Cassandra 3.7. The DataStax driver is version 3.1.0.

Any idea why the larger dataset causes this error?
For the number of records you are retrieving, it is worth using pagination to fetch smaller chunks.
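A minimal sketch of driver-side paging with the DataStax Java driver 3.x (the version the question uses); the contact point, bound value, and fetch size here are illustrative assumptions, and this needs a live cluster to actually run:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class PagedRead {
    public static void main(String[] args) {
        // Contact point and identifier value are assumptions for illustration.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            Statement stmt = new SimpleStatement(
                    "SELECT * FROM fv.entity_by_identifier WHERE identifier = ?", "some-id");
            // Ask the server for pages of 500 rows instead of the default 5000;
            // the driver fetches the next page transparently as you iterate.
            stmt.setFetchSize(500);
            ResultSet rs = session.execute(stmt);
            for (Row row : rs) {
                System.out.println(row.getString("identifier"));
            }
        }
    }
}
```

Smaller pages keep each server-side read short, which reduces the chance of the coordinator timing out a single large response.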
Edit

As explained here, you may be encountering a read timeout: scanning through the millions of records you mention may take longer than the read_request_timeout_in_ms threshold (default is 5 seconds). One option is to raise that threshold.
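For reference, the relevant setting lives in cassandra.yaml on each node (the values shown are the 3.x defaults; nodes must be restarted after changing them):

```yaml
# cassandra.yaml (per node): how long the coordinator waits for a single-partition
# read before timing out. Default is 5000 ms; raise it cautiously, since slow reads
# usually point at a data-model problem rather than a too-low timeout.
read_request_timeout_in_ms: 5000

# Range scans have a separate, higher default.
range_request_timeout_in_ms: 10000
```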
Found the solution to the problem.

When using user-defined types, the "frozen" keyword is required.
create table entity_by_identifier (
    identifier text,
    state entity_state,
    PRIMARY KEY(identifier)
);

becomes:

create table entity_by_identifier (
    identifier text,
    state frozen<entity_state>,
    PRIMARY KEY(identifier)
);
You can find more information about the frozen keyword at:
http://docs.datastax.com/en/cql/3.1/cql/cql_using/cqlUseUDT.html
and
http://docs.datastax.com/en/cql/3.1/cql/cql_reference/create_table_r.html#reference_ds_v3f_vfk_xj__tuple-udt-columns
That said, it is still unclear to me why the missing "frozen" keyword caused the error I was seeing.