Horizontal Pod Autoscaler unable to read metrics

I am using the Kafka Helm chart from here, and I am trying to set up a Horizontal Pod Autoscaler for it.

I added an hpa.yaml file to the templates folder, as shown below.

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: kafka-hpa
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: {{ include "kafka.fullname" . }}
  minReplicas: {{ .Values.replicas }}
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 8000Mi

I have also tried the above YAML with kind: StatefulSet, but the same issue persists.

My intention is to start with 3 Kafka pods and scale up to 5 based on the CPU and memory target values mentioned above.

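The HPA does get deployed but, as far as I understand, it is unable to read the metrics, because the current usage shows as unknown, as shown below.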

NAME        REFERENCE                          TARGETS                           MINPODS   MAXPODS   REPLICAS   AGE
kafka-hpa   Deployment/whopping-walrus-kafka   <unknown>/8000Mi, <unknown>/50%   3         5         0          1h

I am new to Helm and Kubernetes, so I assume there may be something wrong with my understanding.

I have also deployed metrics-server.

$ kubectl get deployments
NAME                             DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
metrics-server                   1         1         1            1           1d
whopping-walrus-kafka-exporter   1         1         1            1           1h

Pods output

$ kubectl get pods
NAME                                              READY     STATUS    RESTARTS   AGE
metrics-server-55cbf87bbb-vm2v5                   1/1       Running   0          15m
whopping-walrus-kafka-0                           1/1       Running   1          1h
whopping-walrus-kafka-1                           1/1       Running   0          1h
whopping-walrus-kafka-2                           1/1       Running   0          1h
whopping-walrus-kafka-exporter-5c66b5b4f9-mv5kv   1/1       Running   1          1h
whopping-walrus-zookeeper-0                       1/1       Running   0          1h

I want the whopping-walrus-kafka pods to scale up to 5 under load, but there is no deployment corresponding to them.

StatefulSet output

$ kubectl get statefulset
NAME                        DESIRED   CURRENT   AGE
original-bobcat-kafka       3         2         2m
original-bobcat-zookeeper   1         1         2m

Output of describe hpa when kind in hpa.yaml is StatefulSet:

$ kubectl describe hpa
Name:                                                  kafka-hpa
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Fri, 18 Jan 2019 12:13:59 +0530
Reference:                                             StatefulSet/original-bobcat-kafka
Metrics:                                               ( current / target )
  resource memory on pods:                             <unknown> / 8000Mi
  resource cpu on pods  (as a percentage of request):  <unknown> / 5%
Min replicas:                                          3
Max replicas:                                          5
Conditions:
  Type         Status  Reason          Message
  ----         ------  ------          -------
  AbleToScale  False   FailedGetScale  the HPA controller was unable to get the target's current scale: no matches for kind "StatefulSet" in group "extensions"
Events:
  Type     Reason          Age                From                       Message
  ----     ------          ----               ----                       -------
  Warning  FailedGetScale  15s (x17 over 8m)  horizontal-pod-autoscaler  no matches for kind "StatefulSet" in group "extensions"

Output of describe hpa when kind in hpa.yaml is Deployment:

$ kubectl describe hpa
Name:                                                  kafka-hpa
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Fri, 18 Jan 2019 12:30:07 +0530
Reference:                                             Deployment/good-elephant-kafka
Metrics:                                               ( current / target )
  resource memory on pods:                             <unknown> / 8000Mi
  resource cpu on pods  (as a percentage of request):  <unknown> / 5%
Min replicas:                                          3
Max replicas:                                          5
Conditions:
  Type         Status  Reason          Message
  ----         ------  ------          -------
  AbleToScale  False   FailedGetScale  the HPA controller was unable to get the target's current scale: could not fetch the scale for deployments.extensions good-elephant-kafka: deployments/scale.extensions "good-elephant-kafka" not found
Events:
  Type     Reason          Age   From                       Message
  ----     ------          ----  ----                       -------
  Warning  FailedGetScale  9s    horizontal-pod-autoscaler  could not fetch the scale for deployments.extensions good-elephant-kafka: deployments/scale.extensions "good-elephant-kafka" not found

Output from the metrics-server pod

$ kubectl describe pods metrics-server-55cbf87bbb-vm2v5
Name:           metrics-server-55cbf87bbb-vm2v5
Namespace:      default
Node:           docker-for-desktop/192.168.65.3
Start Time:     Fri, 18 Jan 2019 11:26:33 +0530
Labels:         app=metrics-server
                pod-template-hash=1176943666
                release=metrics-server
Annotations:    <none>
Status:         Running
IP:             10.1.0.119
Controlled By:  ReplicaSet/metrics-server-55cbf87bbb
Containers:
  metrics-server:
    Container ID:  docker://ee4b3d9ed1b15c2c8783345b0ffbbc565ad25f1493dec0148f245c9581443631
    Image:         gcr.io/google_containers/metrics-server-amd64:v0.3.1
    Image ID:      docker-pullable://gcr.io/google_containers/metrics-server-amd64@sha256:78938f933822856f443e6827fe5b37d6cc2f74ae888ac8b33d06fdbe5f8c658b
    Port:          <none>
    Host Port:     <none>
    Command:
      /metrics-server
      --kubelet-insecure-tls
      --kubelet-preferred-address-types=InternalIP
      --logtostderr
    State:          Running
      Started:      Fri, 18 Jan 2019 11:26:35 +0530
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from metrics-server-token-d2g7b (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          True 
  PodScheduled   True 
Volumes:
  metrics-server-token-d2g7b:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  metrics-server-token-d2g7b
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

Please feel free to correct my understanding if I have gone wrong somewhere.

Any help would be appreciated.

You need to add the following command to your metrics-server deployment file:

containers:
   - command:
     - /metrics-server
     - --metric-resolution=30s
     - --kubelet-insecure-tls
     - --kubelet-preferred-address-types=InternalIP
     name: metrics-server

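I believe metrics-server could not find the kubelet by its InternalIP, hence the issue. For more details, see my following answer, which gives step-by-step instructions for setting up HPA.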

I performed a few steps similar to what @PrafullLadha mentioned above.

I modified the metrics-server deployment file and added the following:

containers:
 - command:
   - /metrics-server
   - --metric-resolution=30s
   - --kubelet-insecure-tls
   - --kubelet-preferred-address-types=InternalIP

I also uncommented the following section in the statefulset.yaml file:

resources:
  requests:
    cpu: 200m
    memory: 256Mi

After these changes it worked fine.
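For completeness, the FailedGetScale messages in the question ("no matches for kind "StatefulSet" in group "extensions"") suggest that scaleTargetRef points at an API group that does not serve the Kafka workload. The pod names ending in -0, -1, -2 indicate that the chart creates a StatefulSet, and StatefulSets live in the apps group. Below is a minimal sketch of hpa.yaml under that assumption. Using apps/v1 is my assumption about what the cluster serves (older clusters may only offer apps/v1beta1 or apps/v1beta2); everything else is taken from the question.

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: kafka-hpa
spec:
  scaleTargetRef:
    # Assumption: the chart's Kafka StatefulSet is served from the apps group.
    # Check with `kubectl api-versions` and adjust if your cluster differs.
    apiVersion: apps/v1
    kind: StatefulSet
    name: {{ include "kafka.fullname" . }}
  minReplicas: {{ .Values.replicas }}
  maxReplicas: 5
  metrics:
  # CPU utilization is a percentage of the container's CPU request,
  # so the pods must declare resources.requests.cpu.
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  # Memory is compared as an average absolute value per pod.
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 8000Mi
```

The CPU target is reported "as a percentage of request" in the describe hpa output, which is likely why uncommenting the resources.requests section matters: without a CPU request there is nothing to compute the percentage against.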

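This error can also occur if your deployment has no pods scheduled on any node so far and the nodes available in the cluster cannot satisfy your resource requirements. In that case there are obviously no metrics available.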