Infinispan cluster nodes only see themselves and not the other instances running in Kubernetes nodes

Question

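I'm trying to set up an Infinispan cache for my application, which runs on several nodes on google-cloud-platform with Kubernetes and Docker.

Each of these caches should share its data with the caches on the other nodes, so that they all have the same data available.

My problem is that the JGroups configuration doesn't seem to work the way I want: the nodes don't see any of their siblings.

I tried several configurations, but each node always sees only itself and never forms a cluster with the others.

I tried some configurations from GitHub examples such as https://github.com/jgroups-extras/jgroups-kubernetes or https://github.com/infinispan/infinispan-simple-tutorials

Here is my jgroups.xml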

<config xmlns="urn:org:jgroups"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups-4.0.xsd">

    <TCP bind_addr="${jgroups.tcp.address:127.0.0.1}"
         bind_port="${jgroups.tcp.port:7800}"
         enable_diagnostics="false"
         thread_naming_pattern="pl"
         send_buf_size="640k"
         sock_conn_timeout="300"
         bundler_type="no-bundler"
         logical_addr_cache_expiration="360000"

         thread_pool.min_threads="${jgroups.thread_pool.min_threads:0}"
         thread_pool.max_threads="${jgroups.thread_pool.max_threads:200}"
         thread_pool.keep_alive_time="60000"
    />
    <org.jgroups.protocols.kubernetes.KUBE_PING
        port_range="1"
        namespace="${KUBERNETES_NAMESPACE:myGoogleCloudPlatformNamespace}"
    />
    <MERGE3 min_interval="10000"
            max_interval="30000"
    />
    <FD_SOCK />
    <!-- Suspect node `timeout` to `timeout + timeout_check_interval` millis after the last heartbeat -->
    <FD_ALL timeout="10000"
            interval="2000"
            timeout_check_interval="1000"
    />
    <VERIFY_SUSPECT timeout="1000"/>
    <pbcast.NAKACK2 xmit_interval="100"
                    xmit_table_num_rows="50"
                    xmit_table_msgs_per_row="1024"
                    xmit_table_max_compaction_time="30000"
                    resend_last_seqno="true"
    />
    <UNICAST3 xmit_interval="100"
              xmit_table_num_rows="50"
              xmit_table_msgs_per_row="1024"
              xmit_table_max_compaction_time="30000"
    />
    <pbcast.STABLE stability_delay="500"
                   desired_avg_gossip="5000"
                   max_bytes="1M"
    />
    <pbcast.GMS print_local_addr="false"
                join_timeout="${jgroups.join_timeout:5000}"
    />
    <MFC max_credits="2m"
         min_threshold="0.40"
    />
    <FRAG3 frag_size="8000"/>
</config>

And here is how I initialize the Infinispan cache (Kotlin)

import org.infinispan.configuration.cache.CacheMode
import org.infinispan.configuration.cache.ConfigurationBuilder
import org.infinispan.configuration.global.GlobalConfigurationBuilder
import org.infinispan.manager.DefaultCacheManager
import java.util.concurrent.TimeUnit

class MyCache<V : Any>(private val cacheName: String) {

    companion object {
        private var cacheManager = DefaultCacheManager(
            GlobalConfigurationBuilder()
                .transport().defaultTransport()
                .addProperty("configurationFile", "jgroups.xml")
                .build()
        )
    }

    private val backingCache = buildCache()

    private fun buildCache(): org.infinispan.Cache<CacheKey, V> {
        val cacheConfiguration = ConfigurationBuilder()
            .expiration().lifespan(8, TimeUnit.HOURS)
            .clustering().cacheMode(CacheMode.REPL_ASYNC)
            .build()
        cacheManager.defineConfiguration(this.cacheName, cacheConfiguration)
        log.info("Started cache with name $cacheName. Found cluster members are ${cacheManager.clusterMembers}")
        return cacheManager.getCache(this.cacheName)
    }
}

This is what the log says

INFO  o.i.r.t.jgroups.JGroupsTransport - ISPN000078: Starting JGroups channel ISPN
INFO  o.j.protocols.kubernetes.KUBE_PING - namespace myNamespace set; clustering enabled
INFO  org.infinispan.CLUSTER - ISPN000094: Received new cluster view for channel ISPN: [myNamespace-7d878d4c7b-cks6n-57621|0] (1) [myNamespace-7d878d4c7b-cks6n-57621]
INFO  o.i.r.t.jgroups.JGroupsTransport - ISPN000079: Channel ISPN local address is myNamespace-7d878d4c7b-cks6n-57621, physical addresses are [127.0.0.1:7800]

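I expected that on startup new nodes would find the already existing ones and fetch the data from them.

Currently, every node sees only itself on startup and nothing is shared.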

Answer 1

Usually, the first thing to do when you need help with JGroups/Infinispan is to enable trace-level logging.
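As a rough sketch of what enabling trace logging could look like: the log lines in the question resemble Logback output, so the fragment below assumes Logback is the logging backend (that is a guess, not something the question states). With a different backend, the equivalent logger setting for the `org.jgroups` package would be needed instead.

```xml
<!-- Fragment for logback.xml; place it inside the existing <configuration> element. -->
<!-- Assumption: Logback is the logging backend in use. -->
<!-- org.jgroups also covers org.jgroups.protocols.kubernetes.KUBE_PING. -->
<logger name="org.jgroups" level="TRACE"/>
```

Trace output is very verbose, so it is probably best enabled only while diagnosing the discovery problem.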

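The problem with KUBE_PING might be that the pod isn't running under the proper service account, so it doesn't have the authorization token needed to access the Kubernetes Master API. That's why the currently preferred approach is to use DNS_PING and register a headless service. See this example.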
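A minimal sketch of the DNS_PING variant, assuming the JGroups version in use ships the `dns.DNS_PING` protocol and that a headless service (a Kubernetes Service with `clusterIP: None`) selects the application's pods. The service name `my-app-ping`, the namespace `my-namespace`, and the property name `jgroups.dns.query` are placeholders chosen for illustration, not values from the question.

```xml
<!-- Replaces the <org.jgroups.protocols.kubernetes.KUBE_PING .../> element in jgroups.xml. -->
<!-- dns_query must be the DNS name of the headless service; the default below is hypothetical. -->
<dns.DNS_PING dns_query="${jgroups.dns.query:my-app-ping.my-namespace.svc.cluster.local}" />
```

Because discovery then goes through DNS rather than the Kubernetes API, the pod should not need an API token for this step.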

Answer 2

Also, bind_addr is set to 127.0.0.1. This means that members on different hosts won't be able to find each other. I recommend setting bind_addr, e.g. <TCP bind_addr="site_local".../>.

See [1] for details.

[1] http://www.jgroups.org/manual4/index.html#Transport
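Applied to the configuration in the question, the suggestion from Answer 2 might look like the fragment below. It is only a sketch: it keeps the `jgroups.tcp.address` property from the original file and changes just the fallback value, and it leaves out the other TCP attributes, which would stay as they are.

```xml
<!-- Only the bind_addr default changes; all other TCP attributes from the question stay unchanged. -->
<!-- site_local asks JGroups to pick a site-local (private network) address instead of loopback. -->
<TCP bind_addr="${jgroups.tcp.address:site_local}"
     bind_port="${jgroups.tcp.port:7800}"
/>
```

If the `jgroups.tcp.address` system property is set elsewhere, it overrides this default. With the change in place, the "physical addresses are [127.0.0.1:7800]" part of the startup log should presumably show the pod's IP address instead.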


jgroups

infinispan

google-cloud-platform

kubernetes
