LivenessProbe is failing but port-forward is working on the same port

I have the following Deployment YAML:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gofirst
  labels:
    app: gofirst
spec:
  selector:
    matchLabels:
      app: gofirst
  template:
    metadata:
      labels:
        app: gofirst
    spec:
      restartPolicy: Always
      containers:
      - name: gofirst
        image: lbvenkatesh/gofirst:0.0.5
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
        ports:
        - name: http
          containerPort: 8080
        livenessProbe:
          httpGet:
            path: /health
            port: http
            httpHeaders:
            - name: "X-Health-Check"
              value: "1"
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health
            port: http
            httpHeaders:
            - name: "X-Health-Check"
              value: "1"
          initialDelaySeconds: 30
          periodSeconds: 10

My Service YAML looks like this:

apiVersion: v1
kind: Service
metadata:
  name: gofirst
  labels:
    app: gofirst
spec:
  publishNotReadyAddresses: true
  type: NodePort
  selector:
    app: gofirst
  ports:
  - port: 8080
    targetPort: http
    name: http

"gofirst" is a simple web application written in Golang with Gin. Here is its Dockerfile:

FROM golang:latest 
LABEL MAINTAINER='Venkatesh Laguduva <lbvenkatesh@gmail.com>'
RUN mkdir /app 
ADD . /app/
RUN apt -y update && apt -y install git
RUN go get github.com/gin-gonic/gin
RUN go get -u github.com/RaMin0/gin-health-check
WORKDIR /app 
RUN go build -o main . 
ARG verArg="0.0.1"
ENV VERSION=$verArg
ENV PORT=8080
ENV GIN_MODE=release
EXPOSE 8080
CMD ["/app/main"]

I have deployed this application in Minikube, and when I describe the pod, I see these events:

  Type     Reason            Age                     From               Message
  ----     ------            ----                    ----               -------
  Warning  FailedScheduling  10m (x2 over 10m)       default-scheduler  0/1 nodes are available: 1 Insufficient cpu.
  Normal   Scheduled         10m                     default-scheduler  Successfully assigned default/gofirst-95fc8668c-6r4qc to m01
  Normal   Pulling           10m                     kubelet, m01       Pulling image "lbvenkatesh/gofirst:0.0.5"
  Normal   Pulled            10m                     kubelet, m01       Successfully pulled image "lbvenkatesh/gofirst:0.0.5"
  Normal   Killing           8m13s (x2 over 9m13s)   kubelet, m01       Container gofirst failed liveness probe, will be restarted
  Normal   Pulled            8m13s (x2 over 9m12s)   kubelet, m01       Container image "lbvenkatesh/gofirst:0.0.5" already present on machine
  Normal   Created           8m12s (x3 over 10m)     kubelet, m01       Created container gofirst
  Normal   Started           8m12s (x3 over 10m)     kubelet, m01       Started container gofirst
  Warning  Unhealthy         7m33s (x7 over 9m33s)   kubelet, m01       Liveness probe failed: Get http://172.17.0.4:8080/health: dial tcp 172.17.0.4:8080: connect: connection refused
  Warning  Unhealthy         5m35s (x12 over 9m25s)  kubelet, m01       Readiness probe failed: Get http://172.17.0.4:8080/health: dial tcp 172.17.0.4:8080: connect: connection refused
  Warning  BackOff           31s (x17 over 4m13s)    kubelet, m01       Back-off restarting failed container

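I tried the sample container "hello-world" and it worked fine with "minikube service hello-world", but when I try the same with "minikube service gofirst", I get a connection error in the browser.

I must be missing something simple, but I am unable to find the error. Please look over my YAML and Dockerfile and let me know if I have made a mistake.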

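I reproduced your scenario and ran into the same issue. So I removed the liveness and readiness probes, which stops the kubelet from restarting the container and let me log in to the pod to investigate.

Here is the YAML I used: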

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gofirst
  labels:
    app: gofirst
spec:
  selector:
    matchLabels:
      app: gofirst
  template:
    metadata:
      labels:
        app: gofirst
    spec:
      restartPolicy: Always
      containers:
      - name: gofirst
        image: lbvenkatesh/gofirst:0.0.5
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
        ports:
        - name: http
          containerPort: 8080

I logged in to the pod to check whether the application is listening on the port you are trying to probe:

kubectl exec -ti gofirst-65cfc7556-bbdcg -- bash

Then I installed netstat:

# apt update
# apt install net-tools

I checked that the application is running:

# ps -ef 
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 10:06 ?        00:00:00 /app/main
root           9       0  0 10:06 pts/0    00:00:00 sh
root          15       9  0 10:07 pts/0    00:00:00 ps -ef

Finally, I checked whether port 8080 is listening:

# netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 127.0.0.1:8080          0.0.0.0:*               LISTEN     
tcp        0      0 10.28.0.9:56106         151.101.184.204:80      TIME_WAIT  
tcp        0      0 10.28.0.9:56130         151.101.184.204:80      TIME_WAIT  
tcp        0      0 10.28.0.9:56104         151.101.184.204:80      TIME_WAIT  
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   Path

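As we can see, the application is listening on 127.0.0.1:8080 only, so it accepts connections from localhost and not from any other address. The kubelet sends the probes to the pod IP (172.17.0.4:8080 in your events), which is why they fail with "connection refused". kubectl port-forward, on the other hand, connects to localhost inside the pod's network namespace, which is why it still works on the same port. For the probes to succeed, the expected Local Address is 0.0.0.0:8080.

I hope this helps you solve the problem.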
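The source of the gofirst application is not shown in the question, so the following is only a minimal sketch of what binding to all interfaces could look like, not the actual fix for that code. It uses the standard net/http package instead of Gin to stay dependency-free; bindAddr and healthHandler are hypothetical names introduced for illustration. With Gin, the same distinction applies to the address passed to Run: router.Run(":8080") listens on all interfaces, while router.Run("localhost:8080") listens on loopback only.

```go
package main

import (
	"log"
	"net"
	"net/http"
	"os"
)

// bindAddr (hypothetical helper) returns an address that listens on every
// interface of the pod, so the kubelet can reach it through the pod IP.
// Using "localhost" or "127.0.0.1" as the host here would reproduce the
// "connection refused" probe failures.
func bindAddr(port string) string {
	return net.JoinHostPort("0.0.0.0", port)
}

// healthHandler (hypothetical) answers the /health path used by both probes.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
	_, _ = w.Write([]byte("ok\n"))
}

func main() {
	// PORT is set to 8080 in the Dockerfile; fall back to 8080 if it is unset.
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
	}

	mux := http.NewServeMux()
	mux.HandleFunc("/health", healthHandler)

	addr := bindAddr(port)
	log.Printf("listening on %s", addr)
	log.Fatal(http.ListenAndServe(addr, mux))
}
```

After rebuilding the image with a bind address like this, running netstat -an inside the pod should list 0.0.0.0:8080 instead of 127.0.0.1:8080, and the liveness and readiness probes can be added back to the Deployment.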