How to import CSV data using Logstash into a completion-type field for the Elasticsearch suggester

Elasticsearch index creation

curl -XPOST 'http://localhost:9200/music/' -d '{}'

Field mapping

curl -XPUT 'http://localhost:9200/music/_mapping/song' -d '
{
  "properties": {
    "name" : {
      "type" : "string"
    },
    "suggest": {
      "type" : "completion"
    }
  }
}'

Logstash configuration file, musicStash.config

input {
    file {
        path => "pathToCsv"
        start_position => beginning
    }
}

filter {  
    csv {
        columns => ["id", "name", "suggest"]
        separator => ","
    }
}

output {
    elasticsearch {
        hosts => "localhost"
        index => "music"
        document_id => "%{id}"
    }
}
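
For reference, the csv filter above assigns the comma-separated values of each line, in order, to the fields id, name and suggest. The Python sketch below mimics only that positional mapping; it is an illustration, not Logstash's actual implementation. The sample rows are made up, and the extra fields Logstash adds on its own (such as @timestamp and message) are left out.

```python
import csv
import io

# Column names taken from the csv filter in musicStash.config.
COLUMNS = ["id", "name", "suggest"]

# Made-up sample rows; the real CSV file is not shown in the question.
SAMPLE_CSV = "1,Nevermind,Nevermind\n2,Nirvana Unplugged,Nirvana\n"


def rows_to_documents(text, columns=COLUMNS, separator=","):
    """Map each CSV line to a dict keyed by column name, by position."""
    reader = csv.reader(io.StringIO(text), delimiter=separator)
    return [dict(zip(columns, row)) for row in reader if row]


if __name__ == "__main__":
    for doc in rows_to_documents(SAMPLE_CSV):
        print(doc)
```

Each resulting dict corresponds to one document sent to the music index, with id also used as the document ID.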

Now, when I run the Logstash configuration file, I get the following exception in the Elasticsearch console:

failed to put mappings on indices [[music]], type [logs]
java.lang.IllegalArgumentException: Mapper for [suggest] conflicts with existing mapping in other types:
[mapper [suggest] cannot be changed from type [completion] to [string]]
at org.elasticsearch.index.mapper.FieldTypeLookup.checkCompatibility(FieldTypeLookup.java:117)

And I get this error in the Logstash console:

response=>{"index"=>{"_index"=>"music", "_type"=>"logs", "_id"=>"5", "status"=>400, 
"error"=>{"type"=>"illegal_argument_exception", 
"reason"=>"Mapper for [suggest] conflicts with existing mapping in other types:\n[mapper [suggest] cannot be changed from type [completion] to [string]]"}}}, :level=>:warn}

So how can I import a CSV file through Logstash so that Elasticsearch autocompletion works?

Your elasticsearch output is missing the following setting:

document_type => "song"

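Without it, Logstash indexes the documents into a new type named logs (the default), where the suggest field gets dynamically mapped as string. Since ES 2.0, fields with the same name in different types of the same index must share the same mapping (here string vs. completion), which is why the request fails.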

Simply modify your output like this and it will work:

output {
    elasticsearch {
        hosts => "localhost"
        index => "music"
        document_type => "song"
        document_id => "%{id}"
    }
}
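
Once the documents are indexed under the song type, the completion field can be checked with a suggest request. The Python sketch below builds the body for the _suggest endpoint that Elasticsearch 2.x exposes (later major versions moved suggesters into _search, so this assumes a 2.x node). The suggestion name song-suggest, the helper names and the prefix "n" are illustrative only.

```python
import json
import urllib.request

# Assumed to match the setup in the question (Elasticsearch 2.x on localhost).
SUGGEST_URL = "http://localhost:9200/music/_suggest"


def build_suggest_body(prefix, field="suggest", name="song-suggest"):
    """Return the request body for a completion suggestion on `field`."""
    return {name: {"text": prefix, "completion": {"field": field}}}


def fetch_suggestions(prefix, url=SUGGEST_URL):
    """POST the suggest request and return the decoded JSON response."""
    data = json.dumps(build_suggest_body(prefix)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the body; call fetch_suggestions("n") against a running node.
    print(json.dumps(build_suggest_body("n"), indent=2))
```

If the import worked, calling fetch_suggestions with a prefix of one of your suggest values should return matching options under song-suggest.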

I am the author of elasticsearch_loader.
If you only want to load CSV data into Elasticsearch, you can use elasticsearch_loader.
Once installed, you can load csv/json/parquet files into Elasticsearch by issuing the following command:

elasticsearch_loader \
  --index-settings-file mappings.json \
  --index completion \
  --type song \
  --id-field id \
  csv \
  input1.csv input2.csv