HBase-indexer & Solr: data not found

Question

I am currently using hbase-indexer to index HBase data into Solr. When I run the following command to check the indexer,

hbase-indexer$ bin/hbase-indexer list-indexers --zookeeper 127.0.0.1:2181

the result is:

myindexer
+ Lifecycle state: ACTIVE 
+ Incremental indexing state: SUBSCRIBE_AND_CONSUME
+ Batch indexing state: INACTIVE
+ SEP subscription ID: Indexer_myindexer
+ SEP subscription timestamp: 2017-01-24T13:15:48.614+09:00
+ Connection type: solr
+ Connection params:
  + solr.zk = localhost:2181/solr
  + solr.collection = tagcollect
+ Indexer config:
    222 bytes, use -dump to see content
+ Indexer component factory: com.ngdata.hbaseindexer.conf.DefaultIndexerComponentFactory
+ Additional batch index CLI arguments:
  (none)
+ Default additional batch index CLI arguments:
  (none)
+ Processes
  + 1 running processes
  + 0 failed processes

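I think hbase-indexer is working correctly, as shown above, because it reports 1 running process. (Before this, I had already started the hbase-indexer daemon with the command: ~$ bin/hbase-indexer server)

To test it, I inserted data into HBase with the put command and verified that the data was stored.

However, the Solr query returns the following (no records):

I would appreciate it if you could share any knowledge or experience related to this. Thank you.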

{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":7,
"params":{
  "q":"*:*",
  "indent":"on",
  "wt":"json",
  "_":"1485246329559"}},
"response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
}}

Answer 1

We ran into the same problem.

Since you say the server instance is running fine, here are the likely reasons it is not working.

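First, if the write-ahead log (WAL) is disabled (possibly for write performance reasons), your puts will not create Solr documents.

The HBase NRT indexer works by consuming the WAL, so when the WAL is disabled there is nothing for it to index and no Solr documents are created.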
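
To make the WAL point concrete, here is a minimal sketch in Python. It assumes the happybase Thrift client, which the question does not use (it uses the HBase shell), and put_for_indexing and RecordingTable are hypothetical names introduced only for illustration. The idea is simply to make sure each write goes through the WAL rather than skipping it.

```python
def put_for_indexing(table, row_key, cells):
    """Write cells with the WAL enabled so the indexer can see the edit.

    `table` is assumed to be a happybase.Table, or any object exposing the
    same put(row, data, wal=...) method.
    """
    # wal=True is already happybase's default; passing it explicitly guards
    # against a caller or wrapper that switched it off for write throughput.
    table.put(row_key, cells, wal=True)


class RecordingTable:
    """Stand-in for happybase.Table so the sketch runs without a cluster."""

    def __init__(self):
        self.calls = []

    def put(self, row, data, timestamp=None, wal=True):
        # Record what would have been sent to HBase.
        self.calls.append((row, data, wal))


if __name__ == "__main__":
    fake = RecordingTable()
    put_for_indexing(fake, b"row1", {b"cf:tag": b"hello"})
    print(fake.calls)
```

Against a real cluster, the table object would come from something like happybase.Connection(host).table(name), with the host and table name being your own. In the Java client, the equivalent thing to look for is a Put whose durability has been set to Durability.SKIP_WAL.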

The second possible reason is the Morphline configuration: if it is incorrect, no Solr documents will be created.

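However, if you do not need real-time indexing, I would suggest writing a custom MapReduce program (or Spark job) that reads the HBase data and indexes it into Solr. With this approach, data is not reflected in Solr as soon as you put it into HBase; the Solr documents are created only after the MapReduce Solr indexer has run.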
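
As a rough illustration of the batch approach, the sketch below is a single-process Python stand-in, not an actual MapReduce or Spark job. It assumes rows arrive as (row_key, cells) pairs with byte-string "family:qualifier" keys (the shape happybase's Table.scan() yields), that the Solr schema accepts field names of the form family_qualifier, and that values are UTF-8 text. The function names and the field naming scheme are hypothetical.

```python
import json
import urllib.request


def rows_to_solr_docs(rows):
    """Convert (row_key, {b'family:qualifier': value}) pairs into Solr docs."""
    docs = []
    for row_key, cells in rows:
        # Use the HBase row key as the Solr unique key.
        doc = {"id": row_key.decode("utf-8")}
        for column, value in cells.items():
            # b"cf:tag" -> "cf_tag"; this naming scheme is an assumption.
            field = column.decode("utf-8").replace(":", "_")
            doc[field] = value.decode("utf-8")
        docs.append(doc)
    return docs


def post_to_solr(docs, update_url):
    """POST docs as a JSON array to Solr's update handler and commit."""
    request = urllib.request.Request(
        update_url + "?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A run might look like post_to_solr(rows_to_solr_docs(table.scan()), "http://localhost:8983/solr/tagcollect/update"), where tagcollect is the collection from the question and the host and port are assumptions. For large tables the documents should be sent in batches rather than in one request.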
