Solr 5.3 ZooKeeper ensemble: create_collection times out after 180s

I have 3 servers, each running Solr 5.3 and ZooKeeper (solr-cloud-01/zookeeper-01, solr-cloud-02/zookeeper-02 and solr-cloud-03/zookeeper-03).

ZooKeeper is up and running: one of the servers is the leader and the others are followers.

# zkServer.sh status 

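If I try to create a Solr collection, the config is uploaded to ZooKeeper correctly, but the core itself is never created; instead the command times out after 180 seconds.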

# solr create_collection -c [collection_name] -d [config_name]

Connecting to ZooKeeper at zookeeper-01:2181,zookeeper-02:2181,zookeeper-03:2181 ...    
Uploading /opt/solr/server/solr/configsets/[config_name]/conf for config 
[collection_name] to ZooKeeper at zookeeper-01:2181,zookeeper-02:2181,zookeeper-03:2181

(or)

Re-using existing configuration directory [collection_name]

Then:

Creating new collection '[collection_name]' using command:
http://localhost:8983/solr/admin/collections?action=CREATE&name=
[collection_name]&numShards=1&replicationFactor=1&maxShardsPerNode=1&
collection.configName=[collection_name]

ERROR: Failed to create collection '[collection_name]' due to: 
create the collection time out:180s

The Solr admin console log shows 2 identical error messages, one from SolrCore and the other from SolrDispatchFilter:

null:org.apache.solr.common.SolrException: create the collection time out:180s
    at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:239)
    at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:170)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
    at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:675)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:443)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:214)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:499)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
    at org.eclipse.jetty.io.AbstractConnection.run(AbstractConnection.java:540)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)

If I then edit /opt/zookeeper/conf/zoo.cfg and comment out the other ZooKeeper servers (reducing the quorum to 1 server):

server.1=zookeeper-01:2888:3888
#server.2=zookeeper-02:2888:3888
#server.3=zookeeper-03:2888:3888

and change the ZK_HOST option in /var/solr/solr.in.sh:

#ZK_HOST="zookeeper-01:2181,zookeeper-02:2181,zookeeper-03:2181"
ZK_HOST="zookeeper-01:2181"

and restart ZooKeeper and Solr, the core does get created (was it queued somehow?). It is offline, though, because the quorum is down (only 1 of 3 ZooKeeper nodes).
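
For reference, the quorum arithmetic behind "1 of 3 is not enough" can be sketched in a few lines. This is only an illustration of the majority rule a ZooKeeper ensemble uses, assuming the ensemble is still counted as three servers; the function names are made up for this example.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size, servers_up):
    """True if enough servers are up to form a majority."""
    return servers_up >= quorum_size(ensemble_size)


if __name__ == "__main__":
    # With 3 servers, 2 must be up; a single node cannot form a quorum.
    for up in (1, 2, 3):
        print(up, "of 3 up -> quorum:", has_quorum(3, up))
```

So a three-node ensemble tolerates one failed node, but not two.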


Then I tried a standalone Solr/ZooKeeper setup (solr-cloud-01 / zookeeper-01):

# zkServer.sh status
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Mode: standalone

I ran the same command:

# solr create_collection -c [collection_name] -d [config_name]

Connecting to ZooKeeper at zookeeper-01:2181 ...
Uploading /opt/solr/server/solr/configsets/[config_name]/conf for config [collection_name] 
to ZooKeeper at zookeeper-01:2181

Creating new collection '[collection_name]' using command:
http://localhost:8983/solr/admin/collections?action=CREATE
&name=[collection_name]&numShards=1&replicationFactor=1&
maxShardsPerNode=1&collection.configName=[collection_name]

{
  "responseHeader":{
    "status":0,
    "QTime":9417},
  "success":{"":{
      "responseHeader":{
        "status":0,
        "QTime":8869},
      "core":"[collection_name]_shard1_replica1"}}}

That works!
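
If you want to check that response from a script rather than by eye, something like the following might do. It is a minimal sketch that assumes the response has the shape shown above (a top-level `responseHeader.status` and a `success` map whose entries carry a `core` name); the function name `created_cores` is invented for this example.

```python
import json


def created_cores(response_text):
    """Return the core names reported under "success" in a CREATE response.

    Raises RuntimeError if the top-level status is not 0.
    """
    body = json.loads(response_text)
    status = body.get("responseHeader", {}).get("status")
    if status != 0:
        raise RuntimeError("collection CREATE failed with status %r" % (status,))
    return [
        node["core"]
        for node in body.get("success", {}).values()
        if isinstance(node, dict) and "core" in node
    ]


if __name__ == "__main__":
    # The response from the standalone run above, with the placeholder name kept.
    sample = """
    {"responseHeader": {"status": 0, "QTime": 9417},
     "success": {"": {"responseHeader": {"status": 0, "QTime": 8869},
                      "core": "[collection_name]_shard1_replica1"}}}
    """
    print(created_cores(sample))
```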


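In short, I have the feeling that some routing is misconfigured, but I can't figure out which part... because ZooKeeper seems to work, and all the individual Solr instances work too.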

This is my hosts file:

127.0.0.1 localhost
10.0.0.1 solr-cloud-01  
10.0.0.2 solr-cloud-02
10.0.0.3 solr-cloud-03
10.0.0.1 zookeeper-01
10.0.0.2 zookeeper-02
10.0.0.3 zookeeper-03

So, I finally found the answer!

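After inspecting /clusterstate.json via zkCli.sh, I saw 3 'rogue' replicas that had been created for the standalone cluster while it was disconnected, all pointing to 127.0.1.1 (a Debian-specific loopback address for the local host name, see https://www.debian.org/doc/manuals/debian-reference/ch05.en.html#_the_hostname_resolution).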
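
A rough way to spot such replicas automatically is to scan a copy of the cluster state for `base_url` values that point at a loopback address. The sketch below assumes the collections / shards / replicas layout with a `base_url` field per replica, as I understand Solr 5.x to write it; the sample data and the function name `loopback_replicas` are invented for illustration.

```python
import ipaddress
import json
from urllib.parse import urlparse


def loopback_replicas(clusterstate):
    """Return (collection, shard, replica, base_url) for replicas on a loopback host."""
    found = []
    for collection, cdata in clusterstate.items():
        for shard, sdata in cdata.get("shards", {}).items():
            for replica, rdata in sdata.get("replicas", {}).items():
                base_url = rdata.get("base_url", "")
                host = urlparse(base_url).hostname
                if host is None:
                    continue
                try:
                    is_loopback = ipaddress.ip_address(host).is_loopback
                except ValueError:
                    # Not an IP literal; only the name "localhost" is treated as loopback.
                    is_loopback = host == "localhost"
                if is_loopback:
                    found.append((collection, shard, replica, base_url))
    return found


if __name__ == "__main__":
    # Hypothetical cluster state, not copied from a real cluster.
    sample = json.loads("""
    {"demo": {"shards": {"shard1": {"replicas": {
        "core_node1": {"core": "demo_shard1_replica1",
                       "base_url": "http://127.0.1.1:8983/solr",
                       "node_name": "127.0.1.1:8983_solr"}}}}}}
    """)
    for row in loopback_replicas(sample):
        print(row)
```

Any row this prints is a replica that other nodes cannot reach, since 127.0.1.1 only means something on the machine that registered it.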

The clue was in my hosts file.

So when I changed every host name entry that pointed to 127.0.1.1 to the external IP (10.0.0.x in my case), it started working!
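
To catch this kind of entry before it bites, a small check over the hosts file can help. This is a sketch under a few assumptions: the standard hosts format (address followed by one or more names, `#` for comments), and that names containing "localhost" or "loopback" are legitimately bound to loopback. The function name and the sample text are hypothetical.

```python
import ipaddress


def loopback_hostnames(hosts_text):
    """Return (address, name) pairs where a real host name maps to a loopback address."""
    flagged = []
    for line in hosts_text.splitlines():
        # Strip trailing comments, then split into address and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        address, names = fields[0], fields[1:]
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # first field is not an IP address; skip the line
        if not is_loopback:
            continue
        for name in names:
            # Names such as localhost or ip6-loopback are expected here.
            if "localhost" in name or "loopback" in name:
                continue
            flagged.append((address, name))
    return flagged


if __name__ == "__main__":
    # Hypothetical hosts file showing the problematic Debian-style entry.
    sample = (
        "127.0.0.1 localhost\n"
        "127.0.1.1 solr-cloud-01\n"
        "10.0.0.2 solr-cloud-02\n"
    )
    print(loopback_hostnames(sample))
```

To run it against a real machine, pass in the contents of /etc/hosts; an empty result means no cluster host name resolves to loopback through that file.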

My new hosts file:

127.0.0.1 localhost
10.0.0.1 solr-cloud-01
10.0.0.2 solr-cloud-02
10.0.0.3 solr-cloud-03
10.0.0.1 zookeeper-01
10.0.0.2 zookeeper-02
10.0.0.3 zookeeper-03