MongoDB - Error "Too many results for query, truncating output" with $geoNear
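I am running a $geoNear query on my sharded cluster (6 nodes with 3 replica sets, each replica set having 2 shardsvr and 1 arbiter).
I expect the query to return 1.1m documents, but I only receive ~130.xxx documents. I am using the Java driver to issue the query and process the data (for now, I only count the documents that are returned). I am using MongoDB 3.2.9 and the latest Java driver.
The mongod log shows the following error, which is caused by the output document being larger than 16MB: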
2016-10-10T12:00:22.933+0200 W COMMAND [conn22] Too many geoNear results for query { location: { $nearSphere: { type: "Point", coordinates: [ 10.xxxx, 52.xxxxx] }, $maxDistance: 3900.0 } }, truncating output.
2016-10-10T12:00:22.951+0200 I COMMAND [conn22] command mydb.data command: geoNear { geoNear: "data", near: { type: "Point", coordinates: [ 10.xxxx, 52.xxxxx ] },
num: 50000000, maxDistance: 3900.0, query: {}, spherical: true, distanceMultiplier: 1.0, includeLocs: true } keyUpdates:0 writeConflicts:0 numYields:890 reslen:16777310
locks:{ Global: { acquireCount: { r: 1784 } }, Database: { acquireCount: { r: 892 } }, Collection: { acquireCount: { r: 892 } } } protocol:op_query 589ms
2016-10-10T12:00:23.183+0200 I COMMAND [conn22] getmore mydb.data query: { aggregate: "data", pipeline: [ { $geoNear: { near: { type: "Point", coordinates: [ 10.xxxx, 52.xxxxx ] },
distanceField: "dist.calculated", limit: 50000000, maxDistance: 3900.0, query: {}, spherical: true, distanceMultiplier: 1.0, includeLocs: "dist.location" } }, { $project: { _id: false,
dist: { calculated: true } } } ], fromRouter: true, cursor: { batchSize: 0 } } cursorid:170255616227 ntoreturn:0 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:43558
reslen:1568108 locks:{ Global: { acquireCount: { r: 1786 } }, Database: { acquireCount: { r: 893 } }, Collection: { acquireCount: { r: 893 } } } 820ms
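The numbers in these log lines can be checked against the 16 MB limit with a little arithmetic. The sketch below only reuses figures printed in the log; the shard count of 3 and the idea that every shard truncated at about the same point are assumptions, not something the log confirms.

```javascript
// Figures copied from the mongod log lines above.
const BSON_MAX_BYTES = 16 * 1024 * 1024; // 16 MB BSON limit = 16777216 bytes
const geoNearReslen = 16777310;          // reslen of the internal geoNear command
const nreturned = 43558;                 // nreturned of the getmore on this shard
const shardCount = 3;                    // assumption: one result stream per replica set

// The reply size sits just above the limit, which is consistent with truncation.
const overLimitBy = geoNearReslen - BSON_MAX_BYTES;

// Rough average size of one entry in the internal geoNear reply.
const avgBytesPerResult = Math.round(geoNearReslen / nreturned);

// If every shard truncated at about the same point, the client would see roughly:
const estimatedTotal = nreturned * shardCount;

console.log({ overLimitBy, avgBytesPerResult, estimatedTotal });
```

Under those assumptions the estimate lands in the same range as the ~130.xxx documents actually received, which suggests each shard's output was cut off separately rather than the merged result.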
Query:
db.data.aggregate([
    {
        $geoNear:{
            near:{
                type:"Point",
                coordinates:[
                    10.xxxx,
                    52.xxxxx
                ]
            },
            distanceField:"dist.calculated",
            maxDistance:3900,
            num:50000000,
            includeLocs:"dist.location",
            spherical:true
        }
    }
])
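Note that I issued the query both with and without the num parameter, and both attempts fail with the error shown above.
I expected the query to return the data in chunks once the document size limit (16 MB) is exceeded.
What am I missing? How can I retrieve all of the data?
Edit:
The query also fails when I add a $group stage, with the same error in the mongod log: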
db.data.aggregate([
    {
        $geoNear:{
            near:{
                type:"Point",
                coordinates:[
                    10.xxxx,
                    52.xxxxxx
                ]
            },
            distanceField:"dist.calculated",
            maxDistance:3900,
            includeLocs:"dist.location",
            num:2000000,
            spherical:true
        }
    },
    {
        $group:{
            _id:"$root_document"
        }
    }
])
In the meantime, MongoDB staff member Lungang Fang replied to my question in the MongoDB user group. Here is his answer:
Currently, the “geoNear” aggregation stage is limited to return
results that are within the 16MB BSON size limit. This is related to
an issue with earlier version of MongoDB (which is described in
https://jira.mongodb.org/browse/SERVER-13486). Your query hit this
issue because “geoNear” returns a single document (contains an array
of result documents) and the “allowDiskUse” aggregation pipeline
option unfortunately does not help in this case.
There are two options that could be considered:
1. If you don’t need all the results, you could limit the “geoNear” aggregation result size using the num, limit, or maxDistance options.
2. If you require all of the results, you can use the find() operator, which is not limited to the BSON maximum size since it returns a cursor.
Below is a test I did on MongoDB 3.2.10, for your information.
Create “2dsphere” for designated collection:
db.coll.createIndex({location: '2dsphere'})
Create and insert several big documents:
var padding = '';
for (var j = 0; j < 15; j++) {
    for (var i = 1024*128; i > 0; --i) {
        padding += '12345678';
    }
}
db.coll.insert({location:{type:"Point", coordinates:[-73.861, 40.73]}, padding:padding})
db.coll.insert({location:{type:"Point", coordinates:[-73.862, 40.73]}, padding:padding})
db.coll.insert({location:{type:"Point", coordinates:[-73.863, 40.73]}, padding:padding})
db.coll.insert({location:{type:"Point", coordinates:[-73.864, 40.73]}, padding:padding})
db.coll.insert({location:{type:"Point", coordinates:[-73.865, 40.73]}, padding:padding})
db.coll.insert({location:{type:"Point", coordinates:[-73.866, 40.73]}, padding:padding})
Query using “geoNear” and server log shows “Too many geoNear results …, truncating output”:
db.coll.aggregate(
    [
        {
            $geoNear:{
                near:{type:"Point", coordinates:[-73.86, 40.73]},
                distanceField:"dist.calculated",
                maxDistance:150000000,
                spherical:true
            }
        },
        {$project: {location:1}}
    ]
)
Query using “find” and all expected documents are returned:
// This and following "var" are necessary to avoid the screen being flushed by padding string.
var cursor = db.coll.find(
    {
        location: {
            $near: {
                $geometry:{type:"Point", coordinates:[-73.86, 40.73]},
                // Inside $near, the distance option is spelled $maxDistance.
                $maxDistance:150000
            }
        }
    }
)
// It is necessary to iterate through the cursor. Otherwise, the query is not actually executed.
var x = cursor.next()
x._id
var x = cursor.next()
x._id
...
Regards, Lungang
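Applied to the original question, the find() suggestion could look roughly like the sketch below. buildNearFilter and countAll are hypothetical helper names introduced here for illustration; the field name location comes from the log output in the question, and the coordinates in the usage comment are placeholders because the question elides the real values. On a sharded collection, whether find() accepts $near or $nearSphere depends on the server version, so that is worth checking before relying on this.

```javascript
// Build a find() filter that mirrors the $geoNear stage from the question.
// Hypothetical helper; "location" is the indexed field seen in the log output.
function buildNearFilter(lng, lat, maxDistanceMeters) {
  return {
    location: {
      $nearSphere: {
        $geometry: { type: "Point", coordinates: [lng, lat] },
        $maxDistance: maxDistanceMeters
      }
    }
  };
}

// Count results by walking any cursor-like object that exposes
// hasNext()/next(), which is the interface of a mongo shell cursor.
function countAll(cursor) {
  let n = 0;
  while (cursor.hasNext()) {
    cursor.next(); // fetch and discard; only the count is needed here
    n += 1;
  }
  return n;
}

// In the mongo shell this would be used roughly as:
//   var total = countAll(db.data.find(buildNearFilter(10.0, 52.0, 3900)));
// The coordinates above are placeholders, not the values from the question.
```

Because the cursor is consumed batch by batch, no single reply has to hold the whole result set, which is the property the answer relies on.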