Launching Helm Chart stable/minecraft on Kubernetes 1.14 on EKS Fails Liveness Probe

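I'm trying to deploy a vanilla Minecraft server from stable/minecraft using Helm on Kubernetes 1.14 running on AWS EKS, but I keep getting CrashLoopBackOff and liveness probe failures. This seems strange to me, since I'm deploying the chart as specified in the documentation: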

helm install --name mine-release --set minecraftServer.eula=TRUE --namespace=mine-release stable/minecraft

Debugging attempted so far:

  1. Tried decreasing and increasing the memory: helm install --name mine-release --set resources.requests.memory="1024Mi" --set minecraftServer.memory="1024M" --set minecraftServer.eula=TRUE --namespace=mine-release stable/minecraft
  2. Tried viewing the logs with kubectl logs mine-release-minecraft-56f9c8588-xn9pv --namespace mine-release, but this error always appears:
Error from server: Get https://10.0.143.216:10250/containerLogs/mine-release/mine-release-minecraft-56f9c8588-xn9pv/mine-release-minecraft: dial tcp 10.0.143.216:10250: i/o timeout

To give more context, here is the pod description and events from kubectl describe pods mine-release-minecraft-56f9c8588-xn9pv --namespace mine-release:

Name:               mine-release-minecraft-56f9c8588-xn9pv
Namespace:          mine-release
Priority:           0
PriorityClassName:  <none>
Node:               ip-10-0-143-216.ap-southeast-2.compute.internal/10.0.143.216
Start Time:         Fri, 11 Oct 2019 08:48:34 +1100
Labels:             app=mine-release-minecraft
                    pod-template-hash=56f9c8588
Annotations:        kubernetes.io/psp: eks.privileged
Status:             Running
IP:                 10.0.187.192
Controlled By:      ReplicaSet/mine-release-minecraft-56f9c8588
Containers:
  mine-release-minecraft:
    Container ID:   docker://893f622e1129937fab38dc902e25e95ac86c2058da75337184f105848fef773f
    Image:          itzg/minecraft-server:latest
    Image ID:       docker-pullable://itzg/minecraft-server@sha256:00f592eb6660682f327770d639cf10692b9617fa8b9a764b9f991c401e325105
    Port:           25565/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Fri, 11 Oct 2019 08:50:56 +1100
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Fri, 11 Oct 2019 08:50:03 +1100
      Finished:     Fri, 11 Oct 2019 08:50:53 +1100
    Ready:          False
    Restart Count:  2
    Requests:
      cpu:      500m
      memory:   1Gi
    Liveness:   exec [mcstatus localhost:25565 status] delay=30s timeout=1s period=5s #success=1 #failure=3
    Readiness:  exec [mcstatus localhost:25565 status] delay=30s timeout=1s period=5s #success=1 #failure=3
    Environment:
      EULA:                          true
      TYPE:                          VANILLA
      VERSION:                       1.14.4
      DIFFICULTY:                    easy
      WHITELIST:                     
      OPS:                           
      ICON:                          
      MAX_PLAYERS:                   20
      MAX_WORLD_SIZE:                10000
      ALLOW_NETHER:                  true
      ANNOUNCE_PLAYER_ACHIEVEMENTS:  true
      ENABLE_COMMAND_BLOCK:          true
      FORCE_GAMEMODE:                false
      GENERATE_STRUCTURES:           true
      HARDCORE:                      false
      MAX_BUILD_HEIGHT:              256
      MAX_TICK_TIME:                 60000
      SPAWN_ANIMALS:                 true
      SPAWN_MONSTERS:                true
      SPAWN_NPCS:                    true
      VIEW_DISTANCE:                 10
      SEED:                          
      MODE:                          survival
      MOTD:                          Welcome to Minecraft on Kubernetes!
      PVP:                           false
      LEVEL_TYPE:                    DEFAULT
      GENERATOR_SETTINGS:            
      LEVEL:                         world
      ONLINE_MODE:                   true
      MEMORY:                        512M
      JVM_OPTS:                      
      JVM_XX_OPTS:                   
    Mounts:
      /data from datadir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-j8zql (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  datadir:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  mine-release-minecraft-datadir
    ReadOnly:   false
  default-token-j8zql:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-j8zql
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                    From                                                      Message
  ----     ------                  ----                   ----                                                      -------
  Warning  FailedScheduling        2m25s                  default-scheduler                                         pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
  Normal   Scheduled               2m24s                  default-scheduler                                         Successfully assigned mine-release/mine-release-minecraft-56f9c8588-xn9pv to ip-10-0-143-216.ap-southeast-2.compute.internal
  Warning  FailedAttachVolume      2m22s (x3 over 2m23s)  attachdetach-controller                                   AttachVolume.Attach failed for volume "pvc-b48ba754-eba7-11e9-b609-02ed13ff0a10" : "Error attaching EBS volume \"vol-08b29bb4eeca4df56\"" to instance "i-00ae1f5b96eed8e6a" since volume is in "creating" state
  Normal   SuccessfulAttachVolume  2m18s                  attachdetach-controller                                   AttachVolume.Attach succeeded for volume "pvc-b48ba754-eba7-11e9-b609-02ed13ff0a10"
  Warning  Unhealthy               60s                    kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Readiness probe failed: Traceback (most recent call last):
  File "/usr/bin/mcstatus", line 11, in <module>
    sys.exit(cli())
  File "/usr/lib/python2.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python2.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/mcstatus/scripts/mcstatus.py", line 58, in status
    response = server.status()
  File "/usr/lib/python2.7/site-packages/mcstatus/server.py", line 49, in status
    connection = TCPSocketConnection((self.host, self.port))
  File "/usr/lib/python2.7/site-packages/mcstatus/protocol/connection.py", line 129, in __init__
    self.socket = socket.create_connection(addr, timeout=timeout)
  File "/usr/lib/python2.7/socket.py", line 575, in create_connection
    raise err
socket.error: [Errno 99] Address not available
  Normal   Pulling    58s (x2 over 2m14s)  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  pulling image "itzg/minecraft-server:latest"
  Normal   Killing    58s                  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Killing container with id docker://mine-release-minecraft:Container failed liveness probe.. Container will be killed and recreated.
  Normal   Started    55s (x2 over 2m11s)  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Started container
  Normal   Pulled     55s (x2 over 2m11s)  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Successfully pulled image "itzg/minecraft-server:latest"
  Normal   Created    55s (x2 over 2m11s)  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Created container
  Warning  Unhealthy  25s (x2 over 100s)   kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Readiness probe failed: Traceback (most recent call last):
  File "/usr/bin/mcstatus", line 11, in <module>
    sys.exit(cli())
  File "/usr/lib/python2.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python2.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python2.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/mcstatus/scripts/mcstatus.py", line 58, in status
    response = server.status()
  File "/usr/lib/python2.7/site-packages/mcstatus/server.py", line 61, in status
    raise exception
socket.error: [Errno 104] Connection reset by peer
  Warning  Unhealthy  20s (x8 over 95s)  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Readiness probe failed:
  Warning  Unhealthy  17s (x5 over 97s)  kubelet, ip-10-0-143-216.ap-southeast-2.compute.internal  Liveness probe failed:

Some more about my Kubernetes setup:

Kubernetes version 1.14, with nodes running on m5.large instances

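I reproduced your problem, and the answer lies in the readiness and liveness probes.

The server doesn't get enough time to start up: the probes start failing while it is still loading, and once the liveness probe has failed three times the kubelet kills the container and starts it again.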

livenessProbe: Indicates whether the Container is running. If the liveness probe fails, the kubelet kills the Container, and the Container is subjected to its restart policy. If a Container does not provide a liveness probe, the default state is Success.

readinessProbe: Indicates whether the Container is ready to service requests. If the readiness probe fails, the endpoints controller removes the Pod’s IP address from the endpoints of all Services that match the Pod. The default state of readiness before the initial delay is Failure. If a Container does not provide a readiness probe, the default state is Success.
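
To see why the default probe settings are too tight, the timing can be roughly sketched as below. This is an approximation and not the kubelet's exact scheduling: it assumes the first probe runs at initialDelaySeconds and each later one periodSeconds after that. The function name earliest_restart_seconds is made up for illustration.

```shell
#!/usr/bin/env bash
# Rough estimate of when the kubelet first restarts a container that never
# passes its liveness probe. Assumption: the first probe fires at
# initialDelaySeconds and later probes every periodSeconds; the real kubelet
# may add some jitter plus the probe's own timeout.
earliest_restart_seconds() {
  local initial_delay=$1 period=$2 failure_threshold=$3
  echo $(( initial_delay + (failure_threshold - 1) * period ))
}

# Chart defaults seen in the describe output: delay=30s period=5s #failure=3
echo "defaults: restart after ~$(earliest_restart_seconds 30 5 3)s"
# Values used in the fix: delay=90s period=15s, failure threshold unchanged
echo "adjusted: restart after ~$(earliest_restart_seconds 90 15 3)s"
```

With the defaults the container is restarted after roughly 40 seconds, which is less than the roughly 67 seconds the server needed to start in my reproduction.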

You can use your command with my edits:

helm install --name mine-release --set resources.requests.memory="1024Mi" --set minecraftServer.memory="1024M" --set minecraftServer.eula=TRUE --set livenessProbe.initialDelaySeconds=90 --set livenessProbe.periodSeconds=15 --set readinessProbe.initialDelaySeconds=90 --set readinessProbe.periodSeconds=15 --namespace=mine-release stable/minecraft
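
As a variation on that command, the probe settings could also live in a small override file passed with -f, which keeps the command line shorter. This is only a sketch: the file name probe-overrides.yaml is arbitrary, and it relies on Helm merging the file with the chart's defaults so the probe command itself stays as the chart defines it.

```shell
# Write only the probe overrides to a separate file (the file name is arbitrary).
cat > probe-overrides.yaml <<'EOF'
livenessProbe:
  initialDelaySeconds: 90
  periodSeconds: 15
readinessProbe:
  initialDelaySeconds: 90
  periodSeconds: 15
EOF

# Then pass the file to helm alongside the remaining --set flags, for example:
# helm install --name mine-release --set minecraftServer.eula=TRUE \
#   --namespace=mine-release -f probe-overrides.yaml stable/minecraft
```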

Or use helm fetch to download the Helm chart to your computer:

helm fetch stable/minecraft --untar 

Instead of changing values in the helm install command, you can use a text editor such as vi or nano to update everything in minecraft/values.yaml:

vi/nano ./minecraft/values.yaml

The minecraft/values.yaml file after editing:

# ref: https://hub.docker.com/r/itzg/minecraft-server/
image: itzg/minecraft-server
imageTag: latest

## Configure resource requests and limits
## ref: http://kubernetes.io/docs/user-guide/compute-resources/
##
resources:
  requests:
    memory: 1024Mi
    cpu: 500m

nodeSelector: {}

tolerations: []

affinity: {}

securityContext:
  # Security context settings
  runAsUser: 1000
  fsGroup: 2000
# Most of these map to environment variables. See Minecraft for details:
# https://hub.docker.com/r/itzg/minecraft-server/
livenessProbe:
  command:
    - mcstatus
    - localhost:25565
    - status
  initialDelaySeconds: 90
  periodSeconds: 15
readinessProbe:
  command:
    - mcstatus
    - localhost:25565
    - status
  initialDelaySeconds: 90
  periodSeconds: 15
minecraftServer:
  # This must be overridden, since we can't accept this for the user.
  eula: "TRUE"
  # One of: LATEST, SNAPSHOT, or a specific version (ie: "1.7.9").
  version: "1.14.4"
  # This can be one of "VANILLA", "FORGE", "SPIGOT", "BUKKIT", "PAPER", "FTB", "SPONGEVANILLA"
  type: "VANILLA"
  # If type is set to FORGE, this sets the version; this is ignored if forgeInstallerUrl is set
  forgeVersion:
  # If type is set to SPONGEVANILLA, this sets the version
  spongeVersion:
  # If type is set to FORGE, this sets the URL to download the Forge installer
  forgeInstallerUrl:
  # If type is set to BUKKIT, this sets the URL to download the Bukkit package
  bukkitDownloadUrl:
  # If type is set to SPIGOT, this sets the URL to download the Spigot package
  spigotDownloadUrl:
  # If type is set to PAPER, this sets the URL to download the PaperSpigot package
  paperDownloadUrl:
  # If type is set to FTB, this sets the server mod to run
  ftbServerMod:
  # Set to true if running Feed The Beast and get an error like "unable to launch forgemodloader"
  ftbLegacyJavaFixer: false
  # One of: peaceful, easy, normal, and hard
  difficulty: easy
  # A comma-separated list of player names to whitelist.
  whitelist:
  # A comma-separated list of player names who should be admins.
  ops:
  # A server icon URL for server listings. Auto-scaled and transcoded.
  icon:
  # Max connected players.
  maxPlayers: 20
  # This sets the maximum possible size in blocks, expressed as a radius, that the world border can obtain.
  maxWorldSize: 10000
  # Allows players to travel to the Nether.
  allowNether: true
  # Allows server to announce when a player gets an achievement.
  announcePlayerAchievements: true
  # Enables command blocks.
  enableCommandBlock: true
  # If true, players will always join in the default gameMode even if they were previously set to something else.
  forceGamemode: false
  # Defines whether structures (such as villages) will be generated.
  generateStructures: true
  # If set to true, players will be set to spectator mode if they die.
  hardcore: false
  # The maximum height in which building is allowed.
  maxBuildHeight: 256
  # The maximum number of milliseconds a single tick may take before the server watchdog stops the server with the message. -1 disables this entirely.
  maxTickTime: 60000
  # Determines if animals will be able to spawn.
  spawnAnimals: true
  # Determines if monsters will be spawned.
  spawnMonsters: true
  # Determines if villagers will be spawned.
  spawnNPCs: true
  # Max view distance (in chunks).
  viewDistance: 10
  # Define this if you want a specific map generation seed.
  levelSeed:
  # One of: creative, survival, adventure, spectator
  gameMode: survival
  # Message of the Day
  motd: "Welcome to Minecraft on Kubernetes!"
  # If true, enable player-vs-player damage.
  pvp: false
  # One of: DEFAULT, FLAT, LARGEBIOMES, AMPLIFIED, CUSTOMIZED
  levelType: DEFAULT
  # When levelType == FLAT or CUSTOMIZED, this can be used to further customize map generation.
  # ref: https://hub.docker.com/r/itzg/minecraft-server/
  generatorSettings:
  worldSaveName: world
  # If set, this URL will be downloaded at startup and used as a starting point
  downloadWorldUrl:
  # force re-download of server file
  forceReDownload: false
  # If set, the modpack at this URL will be downloaded at startup
  downloadModpackUrl:
  # If true, old versions of downloaded mods will be replaced with new ones from downloadModpackUrl
  removeOldMods: false
  # Check accounts against Minecraft account service.
  onlineMode: true
  # If you adjust this, you may need to adjust resources.requests above to match.
  memory: 1024M
  # General JVM options to be passed to the Minecraft server invocation
  jvmOpts: ""
  # Options like -X that need to proceed general JVM options
  jvmXXOpts: ""
  serviceType: LoadBalancer
  rcon:
    # If you enable this, make SURE to change your password below.
    enabled: false
    port: 25575
    password: "CHANGEME!"
    serviceType: LoadBalancer

  query:
    # If you enable this, your server will be "published" to Gamespy
    enabled: false
    port: 25565

## Additional minecraft container environment variables
##
extraEnv: {}

persistence:
  ## minecraft data Persistent Volume Storage Class
  ## If defined, storageClassName: <storageClass>
  ## If set to "-", storageClassName: "", which disables dynamic provisioning
  ## If undefined (the default) or set to null, no storageClassName spec is
  ##   set, choosing the default provisioner.  (gp2 on AWS, standard on
  ##   GKE, AWS & OpenStack)
  ##
  # storageClass: "-"
  dataDir:
    # Set this to false if you don't care to persist state between restarts.
    enabled: true
    Size: 1Gi

podAnnotations: {}

Then we run helm install:

helm install --name mine-release --namespace=mine-release ./minecraft -f ./minecraft/values.yaml

Result of helm install:

NAME:   mine-release
LAST DEPLOYED: Fri Oct 11 14:52:17 2019
NAMESPACE: mine-release
STATUS: DEPLOYED

RESOURCES:
==> v1/PersistentVolumeClaim
NAME                            STATUS   VOLUME  CAPACITY  ACCESS MODES  STORAGECLASS  AGE
mine-release-minecraft-datadir  Pending                                  standard      0s

==> v1/Pod(related)
NAME                                    READY  STATUS   RESTARTS  AGE
mine-release-minecraft-f4558bfd5-mwm55  0/1    Pending  0         0s

==> v1/Secret
NAME                    TYPE    DATA  AGE
mine-release-minecraft  Opaque  1     0s

==> v1/Service
NAME                    TYPE          CLUSTER-IP   EXTERNAL-IP  PORT(S)          AGE
mine-release-minecraft  LoadBalancer  10.0.13.180  <pending>    25565:32020/TCP  0s

==> v1beta1/Deployment
NAME                    READY  UP-TO-DATE  AVAILABLE  AGE
mine-release-minecraft  0/1    1           0          0s


NOTES:
Get the IP address of your Minecraft server by running these commands in the
same shell:

!! NOTE: It may take a few minutes for the LoadBalancer IP to be available. !!

You can watch for EXTERNAL-IP to populate by running:
  kubectl get svc --namespace mine-release -w mine-release-minecraft

Log output:

[12:53:45] [Server-Worker-1/INFO]: Preparing spawn area: 98%
[12:53:45] [Server thread/INFO]: Time elapsed: 26661 ms
[12:53:45] [Server thread/INFO]: Done (66.833s)! For help, type "help"
[12:53:45] [Server thread/INFO]: Starting remote control listener
[12:53:45] [RCON Listener #1/INFO]: RCON running on 0.0.0.0:25575
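
To pick a sensible initialDelaySeconds for your own cluster, the startup time can be read off the "Done" line of the log and compared with the delay. The snippet below is a minimal sketch that works on the sample line from the log above; the variable names are illustrative, and the reported figure may not cover everything that happens before the server process begins (image pull, jar download), so leave some headroom.

```shell
# Sample line copied from the log output above.
log_line='[12:53:45] [Server thread/INFO]: Done (66.833s)! For help, type "help"'

# Pull out the number of seconds between "Done (" and "s)".
startup_seconds=$(printf '%s\n' "$log_line" | sed -n 's/.*Done (\([0-9.]*\)s).*/\1/p')

initial_delay=90
# awk handles the decimal comparison, since shell arithmetic is integer-only.
if awk -v s="$startup_seconds" -v d="$initial_delay" 'BEGIN { exit !(s < d) }'; then
  echo "startup ${startup_seconds}s fits within initialDelaySeconds=${initial_delay}"
else
  echo "startup ${startup_seconds}s exceeds initialDelaySeconds=${initial_delay}"
fi
```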