MongoDB: text index with arrays, only first term is indexed

Question

I have a document with the following schema:

{
  description : String,
  tags : [String]
}

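I have indexed both fields, but whenever I search for a specific string in the array, the document is returned only if that string is the first element of the array. So the $text index seems to apply only to the first element. Is this inherently how mongo works, or is there an option that has to be passed to the index?

Example document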

{
   description : 'random description',
   tags : ["hello", "there"]
}

Object used to create the index

{description : 'text', tags : 'text'}

Queries

db.myCollection.find({$text : {$search : 'hello'}});

returns one document, but

db.myCollection.find({$text : {$search : 'there'}});

returns nothing.

I am using version 2.6.11.

I have other indexes, but this is the only text index. Here is the corresponding output of db.myCollection.getIndexes():

{
  "v" : 1,
  "key" : {
    "_fts" : "text",
    "_ftsx" : 1
  },
  "name" : "description_text_tags_text",
  "ns" : "myDB.myCollection",
  "weights" : {
    "description" : 1,
    "tags" : 1
  },
  "default_language" : "english",
  "language_override" : "language",
  "textIndexVersion" : 2
}

Answer 1

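This has nothing to do with whether the string is the first or second element of the array. The word "there" is in the stop word list for the "english" language, so it is never added to the index. Text indexing stems the text and removes stop words before adding terms to the text index, and both steps are language-dependent.

You may want to create the text index as follows. A collection can have only one text index, so drop the existing one first.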

db.myCollection.ensureIndex({description : 'text', tags : 'text'}, { default_language: "none" })

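If you use "none" as the default language, text indexing performs simple tokenization with no stop word list and no stemming. By default, "english" is used as the "default_language" of a text index.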
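
To make the effect concrete, here is a minimal sketch in plain JavaScript of an indexer that drops stop words before storing terms. It is not MongoDB's implementation: `STOP_WORDS` is a tiny hypothetical sample (the real English list is much longer), `indexTerms` is an illustrative helper rather than a MongoDB API, and stemming is left out entirely.

```javascript
// Hypothetical, heavily abbreviated stop word sample. "there" is included
// because the answer above states it is an English stop word.
const STOP_WORDS = new Set(["the", "a", "is", "there"]);

// Illustrative helper (not a MongoDB API): collect the terms that survive
// stop word removal across the given string or array-of-string fields.
function indexTerms(doc, fields, stopWords) {
  const terms = new Set();
  for (const field of fields) {
    if (doc[field] == null) continue; // skip missing fields
    // Treat a plain string and an array of strings uniformly.
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const value of values) {
      // Simple tokenization: lowercase, split on non-word characters.
      for (const token of String(value).toLowerCase().split(/\W+/)) {
        if (token && !stopWords.has(token)) terms.add(token);
      }
    }
  }
  return terms;
}

const doc = { description: "random description", tags: ["hello", "there"] };

// Roughly analogous to default_language "english": stop words are dropped.
const english = indexTerms(doc, ["description", "tags"], STOP_WORDS);
// Roughly analogous to default_language "none": nothing is dropped.
const none = indexTerms(doc, ["description", "tags"], new Set());

console.log(english.has("hello")); // true
console.log(english.has("there")); // false
console.log(none.has("there"));    // true
```

Both array elements pass through the same loop, so position in the array plays no role. Only membership in the stop word list decides whether a term is kept.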
