JGroups does not join any group on RHEL with a link-local IPv6 address

When I run my code to form a JGroups cluster on RHEL 7.3.1, I see the following exception in the logs.

[DEBUG] 2017-10-03 20:23:01.339 [pool-10-thread-1] client.jgroups  - Creating new Channel
[WARN ] 2017-10-03 20:23:01.342 [pool-10-thread-1] stack.Configurator  - JGRP000014: TP.loopback has been deprecated: enabled by default
[DEBUG] 2017-10-03 20:23:01.343 [pool-10-thread-1] stack.Configurator  - set property UDP.bind_addr to default value /fe80:0:0:0:2d57:389e:e4fe:9520%eth0
[DEBUG] 2017-10-03 20:23:01.345 [pool-10-thread-1] stack.Configurator  - set property UDP.diagnostics_addr to default value /ff0e:0:0:0:0:0:75:75
[DEBUG] 2017-10-03 20:23:01.346 [pool-10-thread-1] client.jgroups  - STATE OPEN
[DEBUG] 2017-10-03 20:23:01.347 [pool-10-thread-1] protocols.UDP  - sockets will use interface fe80:0:0:0:2d57:389e:e4fe:9520%eth0
[ERROR] 2017-10-03 20:23:01.374 [pool-10-thread-1] client.jgroups  - Catching
java.lang.Exception: failed to open a port in range 40000-40255
    at org.jgroups.protocols.UDP.createDatagramSocketWithBindPort(UDP.java:500) ~[xxx-xxx.jar:2.0.1]
    at org.jgroups.protocols.UDP.createSockets(UDP.java:361) ~[xxx-xxx.jar:2.0.1]
    at org.jgroups.protocols.UDP.start(UDP.java:270) ~[xxx-xxx.jar:2.0.1]
    at org.jgroups.stack.ProtocolStack.startStack(ProtocolStack.java:965) ~[xxx-xxx.jar:2.0.1]
    at org.jgroups.JChannel.startStack(JChannel.java:891) ~[xxx-xxx.jar:2.0.1]
    at org.jgroups.JChannel._preConnect(JChannel.java:553) ~[xxx-xxx.jar:2.0.1]
    at org.jgroups.JChannel.connect(JChannel.java:288) ~[xxx-xxx.jar:2.0.1]
    at org.jgroups.JChannel.connect(JChannel.java:279) ~[xxx-xxx.jar:2.0.1]

The same client code runs perfectly on an Ubuntu 14.04 machine. Note also that in both cases the following flag was not supplied.

-Djava.net.preferIPv4Stack=true

Also, in both cases a link-local IPv6 address is used. How can I get the same code to run on RHEL?

Adding the following information in response to the questions raised by @bela-ban:

Options tried in the configuration XML

I tried both LINK_LOCAL and NON_LOOPBACK, but I still get the same error.

JGroups version?

I am using the 3.6.3-Final version of JGroups.

Omitting the IPv4 flag

We have omitted -Djava.net.preferIPv4Stack=true, as we want to test our client in an IPv6 client environment.

Running ifconfig -a

Running the command ifconfig -a gives the following output:

ifconfig -a
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.66.194.103  netmask 255.255.252.0  broadcast 10.66.195.255
        inet6 fe80::4b16:4a66:2bc3:c505  prefixlen 64  scopeid 0x20<link>
        inet6 fe80::30cb:2f41:5e04:51c2  prefixlen 64  scopeid 0x20<link>
        inet6 fe80::2d57:389e:e4fe:9520  prefixlen 64  scopeid 0x20<link>
        ether 00:15:5d:b8:65:47  txqueuelen 1000  (Ethernet)
        RX packets 8485475  bytes 1961303302 (1.8 GiB)
        RX errors 0  dropped 109087  overruns 0  frame 0
        TX packets 49088  bytes 4169469 (3.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1  (Local Loopback)
        RX packets 154252  bytes 11261136 (10.7 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 154252  bytes 11261136 (10.7 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

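Which version of JGroups are you using? (java -cp jgroups.jar org.jgroups.Version prints the version to stdout.)

Setting the system property -Djava.net.preferIPv4Stack=true forces the use of IPv4 addresses. In your case, on RHEL you seem to have omitted this property, so IPv6 addresses are used.

Make sure the address fe80:0:0:0:2d57:389e:e4fe:9520%eth0 actually exists on the machine (ifconfig -a). Note that you can use bind_addr=link_local to pick any link-local address [1].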

[1] http://www.jgroups.org/manual4/index.html#Transport
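One way to check whether an address is really usable, independently of JGroups, is to try binding a plain UDP socket to every link-local IPv6 address the JVM reports. The following is only a minimal sketch using the standard java.net API; the class name BindCheck and the helper canBind are made up for illustration and are not part of JGroups. On a host where the address is in a failed state, the bind would be expected to fail.

```java
import java.net.DatagramSocket;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;

// Hypothetical diagnostic helper; not part of JGroups.
public class BindCheck {

    /** Returns true if a UDP socket can be bound to addr on an ephemeral port. */
    public static boolean canBind(InetAddress addr) {
        try (DatagramSocket socket = new DatagramSocket(new InetSocketAddress(addr, 0))) {
            return socket.isBound();
        } catch (SocketException e) {
            return false;
        }
    }

    public static void main(String[] args) throws SocketException {
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            System.out.println("no network interfaces found");
            return;
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                // Only link-local IPv6 addresses (fe80::/10) are of interest here.
                if (addr instanceof Inet6Address && addr.isLinkLocalAddress()) {
                    System.out.println(nic.getName() + " " + addr.getHostAddress()
                            + " bindable=" + canBind(addr));
                }
            }
        }
    }
}
```

If every link-local address on the interface reports bindable=false, the problem most likely lies in the host's network configuration rather than in the JGroups stack.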

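So, all of this failed because of bad link-local addresses.

The first mistake was using the now-obsolete ifconfig command. It gives no indication of whether the assigned link-local addresses are valid.

The correct command is ip address. In my case, this command returns the following: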

# ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:5d:b8:65:47 brd ff:ff:ff:ff:ff:ff
    inet 10.66.194.103/22 brd 10.66.195.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::2d57:389e:e4fe:9520/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
    inet6 fe80::30cb:2f41:5e04:51c2/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
    inet6 fe80::4b16:4a66:2bc3:c505/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever

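As you can see, the IPv6 link-local addresses listed here are marked tentative dadfailed. The dadfailed flag means that IPv6 Duplicate Address Detection failed for the address, so these addresses cannot be used for anything. The next step was therefore to clear out the bad addresses and add a unique local address of our own. I performed the following steps to do this: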

# Add the new unique local address. This one could also turn out to be a duplicate, so choose carefully. A reboot may be required afterwards.
$ nmcli c mod eth0 ipv6.addresses fc00::10:8:8:71/7 ipv6.method manual
# Remove the old link-local addresses
$ ip address delete fe80::4b16:4a66:2bc3:c505/64 dev eth0
$ ip address delete fe80::30cb:2f41:5e04:51c2/64 dev eth0
$ ip address delete fe80::2d57:389e:e4fe:9520/64 dev eth0

Afterwards we can verify that the steps above worked:

ip address show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:5d:b8:65:47 brd ff:ff:ff:ff:ff:ff
    inet 10.66.194.103/22 brd 10.66.195.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fc00::10:8:8:71/7 scope global
       valid_lft forever preferred_lft forever

No more tentative dadfailed.
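If you want to automate this check, for example before starting the client, the ip address output can be scanned for the dadfailed flag. This is just a rough sketch: the class DadFailedScan and the method findDadFailed are hypothetical names, and the parsing assumes the line layout shown in the output above (inet6, then address/prefix, then flags).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; assumes the `ip address` line layout shown above.
public class DadFailedScan {

    /** Returns the inet6 addresses (with prefix length) flagged "dadfailed". */
    public static List<String> findDadFailed(String ipAddressOutput) {
        List<String> failed = new ArrayList<>();
        for (String line : ipAddressOutput.split("\\R")) {
            String[] tokens = line.trim().split("\\s+");
            // Address lines look like: inet6 <addr>/<prefix> scope link tentative dadfailed
            if (tokens.length >= 2 && tokens[0].equals("inet6")) {
                for (int i = 2; i < tokens.length; i++) {
                    if (tokens[i].equals("dadfailed")) {
                        failed.add(tokens[1]);
                        break;
                    }
                }
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Sample lines taken from the outputs in this post.
        String sample = String.join("\n",
            "2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000",
            "    inet 10.66.194.103/22 brd 10.66.195.255 scope global eth0",
            "    inet6 fe80::2d57:389e:e4fe:9520/64 scope link tentative dadfailed",
            "    inet6 fc00::10:8:8:71/7 scope global");
        System.out.println(findDadFailed(sample)); // prints [fe80::2d57:389e:e4fe:9520/64]
    }
}
```

An empty result means no address on the scanned interfaces is flagged dadfailed, which matches the verification output above.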

So the conclusion is that this had nothing to do with JGroups at all; it was caused purely by the bad link-local addresses.