MongoDB aggregation is slow only on Spring Data

Question

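I'm using Spring Boot 2 with Spring Data and a local MongoDB 3.4. I currently have a collection of ~200k documents, correctly indexed via Spring's annotations.

I built an aggregation pipeline (see the bottom of this post) that takes about 2000 ms to complete from Python, Studio 3T, and a Spring Boot unit test.

When I run the exact same query on a deployed instance of my application (bootJar), where it normally executes, it takes 8000 ms. That is unacceptable for my use case, and also strange.

The query time reported in the database log is ~300 ms in all cases, so the database itself is performing fine.

It would be very interesting to find out why this happens, because the code is exactly the same in the normal deployment and in the unit test (with the same parameters too), so the performance should be comparable.

My guess is that the configuration differs between deployment and test. I use the default settings in both environments (no Mongo-specific bean declarations, just autowiring MongoOperations and enjoying the "Boot magic").

Here are some examples.

A document looks like this: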

{ "_id" : ObjectId("5b4f76696d370f30d401f246"), "description" : "IoT420", "timestamp" : NumberLong(1530286316), "sensor" : "Temperature", "value" : 30.02, "class" : "net.derp.iot.piws.entities.dto.MongoMeasurementRepr" }

The aggregation pipeline:

Aggregation aggregation = newAggregation(
            match( where( "description" ).is( filterDescription ) ),
            match( where( "sensor" ).is( sensorName ) ),
            match( where( "timestamp" ).gte( tsFrom ).lte( tsTo ) ),
            sort( Sort.Direction.ASC, "timestamp" ));

This aggregation returns about 120k documents.

I measure the time with System.nanoTime().
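
Since the database log reports ~300 ms while the end-to-end time is several seconds, it may help to time each phase separately instead of the whole call. Below is a minimal sketch that uses only System.nanoTime(). PhaseTimer and the phase names are hypothetical, and the lambdas in main are stand-ins for the real aggregation call and result mapping, not the application's actual code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper: accumulates wall-clock time per named phase. */
final class PhaseTimer {
    // Insertion-ordered so phases print in the order they first ran.
    private final Map<String, Long> elapsedNanos = new LinkedHashMap<>();

    /** Runs the work, adds its elapsed time to the phase total, returns its result. */
    public <T> T time(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            // Recorded even if the work throws, so failed phases are visible too.
            elapsedNanos.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    /** Total elapsed milliseconds for a phase, or 0 if it never ran. */
    public long millis(String phase) {
        return elapsedNanos.getOrDefault(phase, 0L) / 1_000_000;
    }

    /** Read-only view of the raw nanosecond totals. */
    public Map<String, Long> nanosByPhase() {
        return Collections.unmodifiableMap(elapsedNanos);
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();

        // Stand-in for the driver round trip (in the real app: the aggregate call).
        List<Integer> raw = timer.time("fetch", () -> {
            List<Integer> rows = new ArrayList<>();
            for (int i = 0; i < 120_000; i++) {
                rows.add(i);
            }
            return rows;
        });

        // Stand-in for converting raw results into domain objects.
        List<String> mapped = timer.time("map", () -> {
            List<String> out = new ArrayList<>(raw.size());
            for (Integer row : raw) {
                out.add("measurement-" + row);
            }
            return out;
        });

        System.out.println("rows=" + mapped.size());
        timer.nanosByPhase().forEach((phase, nanos) ->
                System.out.println(phase + ": " + nanos / 1_000_000 + " ms"));
    }
}
```

Wrapping the real query and the real mapping step in separate time(...) calls, in both the unit test and the deployed jar, should show which phase accounts for the extra seconds.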

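Update: after removing runtime('org.springframework.boot:spring-boot-devtools'), the time dropped to 5000 ms, which is still slower than the test case. I suspect some kind of "validation" is disabled during tests.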

Answer 1

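I looked at your query and noticed that you apply the timestamp criterion after two other match stages. Ideally, the date condition should come first in the Mongo query, because it narrows down the number of documents the following stages have to scan.

So your aggregation query should be: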

Aggregation aggregation = newAggregation(
            match( where( "timestamp" ).gte( tsFrom ).lte( tsTo ) ),
            match( where( "description" ).is( filterDescription ) ),
            match( where( "sensor" ).is( sensorName ) ),
            sort( Sort.Direction.ASC, "timestamp" ));

Answer 2

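After deploying the same jar on a Linux server (with a different database), everything looks normal (query time around 2000 ms). I guess something in my environment, or some similar configuration issue, was causing a conflict.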

Tags: spring, mongodb, spring-data, spring-data-mongodb, spring-boot