Consul cluster doesn't start
I have an AWS launch configuration for a Consul cluster. Until now it ran without problems, but now it no longer works. Querying any node results in "no leader elected".
So I SSHed into an instance. Running consul info results in:
Error querying agent: Get http://127.0.0.1:8500/v1/agent/self: dial tcp 127.0.0.1:8500: getsockopt: connection refused
Next I tried:
$ ps -ef | grep consul
consul 2760 1 0 Nov28 ? 00:01:38 /usr/local/bin/consul agent -server -config-file=/etc/consul.conf -data-dir=/tmp/consul -node=1.1.1.1_i-042b3e8f28c622a -bind=2.2.2.2 -config-dir=/etc/consul.d
(I have obscured the IPs and instance IDs here.)
Looking at the logs I see:
==> WARNING: Expect Mode enabled, expecting 3 servers
==> Starting Consul agent...
==> Consul agent running!
Version: 'v0.8.3'
Node ID: '6e0b3c-ad49-90d7-c8e2-121144a4ba'
Node name: '1.1.1.1_i-029b3e8f28622a'
Datacenter: 'dc1'
Server: true (bootstrap: false)
Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600)
Cluster Addr: 2.2.2.2 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
Atlas: <disabled>
==> Log data will now stream in as it occurs:
2017/11/28 13:19:36 [INFO] raft: Initial configuration (index=0): []
2017/11/28 13:19:36 [INFO] serf: EventMemberJoin: 1.1.1.1_i-029b3e8f28c46622a 2.2.2.2
2017/11/28 13:19:36 [INFO] serf: EventMemberJoin: 1.1.1.1_i-029b3e8f28c46622a.dc1 2.2.2.2
2017/11/28 13:19:36 [INFO] raft: Node at 2.2.2.2:8300 [Follower] entering Follower state (Leader: "")
2017/11/28 13:19:36 [INFO] consul: Adding LAN server 1.1.1.1_i-029b3e8f28c46622a (Addr: tcp/2.2.2.2:8300) (DC: dc1)
2017/11/28 13:19:36 [INFO] consul: Handled member-join event for server "1.1.1.1_i-029b3e8f28c22a.dc1" in area "wan"
2017/11/28 13:19:36 [INFO] agent: Joining cluster...
2017/11/28 13:19:36 [INFO] agent: No EC2 region provided, querying instance metadata endpoint...
2017/11/28 13:19:36 [INFO] agent: Discovered 0 servers from EC2
2017/11/28 13:19:36 [WARN] agent: Join failed: No servers to join, retrying in 30s
2017/11/28 13:19:43 [ERR] agent: failed to sync remote state: No cluster leader
Any ideas on how to fix this?
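You should bootstrap the cluster to allow the initial leader election. The simplest way is to pass -bootstrap-expect with the number of servers in the cluster (use the same flag and value on all servers).
More information on bootstrapping a cluster: https://www.consul.io/docs/guides/bootstrapping.html
and https://www.consul.io/docs/agent/options.html#_bootstrap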
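As a rough sketch of the flag in use, each server could be started along these lines. The data and config directories come from the ps output in the question; start_server is a hypothetical wrapper, and in practice the invocation would normally live in your init script or unit file.

```shell
#!/bin/sh
# Hypothetical wrapper: start a Consul server that waits for a fixed number
# of servers before bootstrapping.
# $1 = expected number of servers (same value on every server)
# $2 = address to bind to
start_server() {
    consul agent -server \
        -bootstrap-expect="$1" \
        -bind="$2" \
        -data-dir=/tmp/consul \
        -config-dir=/etc/consul.d
}
```

For the cluster in the question this would be called as start_server 3 2.2.2.2 on each of the three servers, each with its own bind address.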
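In your case it says "WARNING: Expect Mode enabled, expecting 3 servers", so it expects 3 servers before bootstrapping the cluster. I see you only used two? Join another one and it should work... (fewer than 3 servers is not recommended for a consensus system).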
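To make the "fewer than 3" remark concrete, here is the majority arithmetic a Raft-based system such as Consul relies on, sketched in shell. The function names are illustrative and not part of Consul.

```shell
#!/bin/sh
# A leader election needs a majority (quorum) of servers:
# quorum = floor(N / 2) + 1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that can be lost while a quorum still exists.
failure_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 5; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerates=$(failure_tolerance "$n")"
done
```

With 2 servers the quorum is 2, so losing either one stops elections; 3 servers is the smallest size that survives a single failure.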
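There is also another way: start one server node with -bootstrap. That server can elect itself leader, so you do not need 3 servers up before the Consul cluster picks a leader. Only one server per datacenter should run in this mode.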
Detailed explanation for Ubuntu + AWS:
- Your file /etc/consul/base.json should look like this:
{
"server": true,
"ui": true,
"bootstrap_expect":3,
"bind_addr": "102.102.3.1",
"performance": { "raft_multiplier": 1 },
"enable_syslog": true,
"retry_join": [ "provider=aws tag_key=HostIdentifier tag_value=us1-Consul-Prod addr_type=private_v4" ],
"disable_remote_exec": true,
"log_level": "DEBUG",
"data_dir": "/var/lib/consul",
"recursors": ["1.1.1.1"],
"datacenter": "us1"
}
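(retry_join is optional; use it if you rely on EC2 tags, and make sure an IAM role is attached to the instance.)
- service consul restart
- Run: consul operator raft list-peers
You should see the leader; otherwise check /var/log/syslog for more details to troubleshoot.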
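If no leader shows up, two quick checks can narrow things down: whether tag-based discovery finds any instances (the log in the question reports "Discovered 0 servers from EC2"), and whether the local agent reports a leader over HTTP. The following is a minimal sketch, assuming the aws CLI and curl are installed, the instance role allows ec2:DescribeInstances, and the agent listens on 127.0.0.1:8500 as in the question's log. All function names are hypothetical.

```shell
#!/bin/sh
# List private IPs of running instances that carry the given EC2 tag.
# $1 = tag key, $2 = tag value (the same pair used in retry_join).
discover_servers() {
    aws ec2 describe-instances \
        --filters "Name=tag:$1,Values=$2" "Name=instance-state-name,Values=running" \
        --query 'Reservations[].Instances[].PrivateIpAddress' \
        --output text
}

# Count the discovered addresses; 0 means discovery would find no servers.
count_servers() {
    discover_servers "$1" "$2" | wc -w | tr -d ' '
}

# Strip the JSON quotes from a /v1/status/leader response body.
# An empty result means no leader has been elected.
parse_leader() {
    printf '%s' "$1" | tr -d '"'
}

# Ask the local agent (assumed address) for the current leader.
check_leader() {
    body=$(curl -s "http://127.0.0.1:8500/v1/status/leader") || return 2
    leader=$(parse_leader "$body")
    if [ -n "$leader" ]; then
        echo "leader: $leader"
    else
        echo "no leader elected" >&2
        return 1
    fi
}
```

On a server, count_servers HostIdentifier us1-Consul-Prod should print the number of servers you expect, and check_leader should print a leader address once the cluster has bootstrapped. A count of 0 points at the tags or the IAM role rather than at Consul itself.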