Horizontal Pod Autoscaler unable to read metrics
I am using the Kafka Helm chart from here.
I am trying to use the Horizontal Pod Autoscaler with it.
I added an hpa.yaml file to the templates folder, as shown below:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: kafka-hpa
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: {{ include "kafka.fullname" . }}
  minReplicas: {{ .Values.replicas }}
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 8000Mi
I also tried the YAML above with kind: StatefulSet, but the same problem persists.
My intent is to start with 3 Kafka pods and scale up to 5 based on the CPU and memory targets mentioned above.
However, although the HPA is deployed, it is apparently unable to read the metrics: the current usage shows <unknown>, as below.
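As background, the HPA controller scales by the ratio of the observed metric value to its target and then clamps the result to the replica bounds. A minimal sketch of that rule (ignoring the tolerance window the real controller applies), using the 3/5 bounds above and hypothetical usage numbers:

```python
import math

def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=3, max_replicas=5):
    """Standard HPA scaling rule: scale by the ratio of the current
    metric value to its target, then clamp to the min/max bounds."""
    raw = math.ceil(current_replicas * current_value / target_value)
    return max(min_replicas, min(max_replicas, raw))

# e.g. 3 pods at 80% average CPU against a 50% target: ceil(3 * 80/50) = 5
print(desired_replicas(3, 80, 50))  # -> 5
```

This is also why the `<unknown>` readings matter: with no current value, the ratio cannot be computed and the HPA never scales.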
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
kafka-hpa Deployment/whopping-walrus-kafka <unknown>/8000Mi, <unknown>/50% 3 5 0 1h
I am new to Helm and Kubernetes, so I assume there may be some gaps in my understanding.
I have also deployed metrics-server.
$ kubectl get deployments
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
metrics-server 1 1 1 1 1d
whopping-walrus-kafka-exporter 1 1 1 1 1h
Pods output:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
metrics-server-55cbf87bbb-vm2v5 1/1 Running 0 15m
whopping-walrus-kafka-0 1/1 Running 1 1h
whopping-walrus-kafka-1 1/1 Running 0 1h
whopping-walrus-kafka-2 1/1 Running 0 1h
whopping-walrus-kafka-exporter-5c66b5b4f9-mv5kv 1/1 Running 1 1h
whopping-walrus-zookeeper-0 1/1 Running 0 1h
I expect the whopping-walrus-kafka pods to scale up to 5 under load, but there is no Deployment corresponding to them.
StatefulSet output:
$ kubectl get statefulset
NAME DESIRED CURRENT AGE
original-bobcat-kafka 3 2 2m
original-bobcat-zookeeper 1 1 2m
Output of describe hpa when kind in hpa.yaml is StatefulSet:
$ kubectl describe hpa
Name: kafka-hpa
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Fri, 18 Jan 2019 12:13:59 +0530
Reference: StatefulSet/original-bobcat-kafka
Metrics: ( current / target )
resource memory on pods: <unknown> / 8000Mi
resource cpu on pods (as a percentage of request): <unknown> / 5%
Min replicas: 3
Max replicas: 5
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale False FailedGetScale the HPA controller was unable to get the target's current scale: no matches for kind "StatefulSet" in group "extensions"
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetScale 15s (x17 over 8m) horizontal-pod-autoscaler no matches for kind "StatefulSet" in group "extensions"
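The `no matches for kind "StatefulSet" in group "extensions"` message points at the scaleTargetRef.apiVersion: StatefulSet has never been part of the extensions API group, it belongs to apps. A sketch of the corrected target reference (keeping the chart's templated name as-is):

```yaml
spec:
  scaleTargetRef:
    apiVersion: apps/v1   # StatefulSet lives in the apps group, not extensions
    kind: StatefulSet
    name: {{ include "kafka.fullname" . }}
```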
Output of describe hpa when kind in hpa.yaml is Deployment:
$ kubectl describe hpa
Name: kafka-hpa
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Fri, 18 Jan 2019 12:30:07 +0530
Reference: Deployment/good-elephant-kafka
Metrics: ( current / target )
resource memory on pods: <unknown> / 8000Mi
resource cpu on pods (as a percentage of request): <unknown> / 5%
Min replicas: 3
Max replicas: 5
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale False FailedGetScale the HPA controller was unable to get the target's current scale: could not fetch the scale for deployments.extensions good-elephant-kafka: deployments/scale.extensions "good-elephant-kafka" not found
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetScale 9s horizontal-pod-autoscaler could not fetch the scale for deployments.extensions good-elephant-kafka: deployments/scale.extensions "good-elephant-kafka" not found
Output from the metrics-server pod:
$ kubectl describe pods metrics-server-55cbf87bbb-vm2v5
Name: metrics-server-55cbf87bbb-vm2v5
Namespace: default
Node: docker-for-desktop/192.168.65.3
Start Time: Fri, 18 Jan 2019 11:26:33 +0530
Labels: app=metrics-server
pod-template-hash=1176943666
release=metrics-server
Annotations: <none>
Status: Running
IP: 10.1.0.119
Controlled By: ReplicaSet/metrics-server-55cbf87bbb
Containers:
metrics-server:
Container ID: docker://ee4b3d9ed1b15c2c8783345b0ffbbc565ad25f1493dec0148f245c9581443631
Image: gcr.io/google_containers/metrics-server-amd64:v0.3.1
Image ID: docker-pullable://gcr.io/google_containers/metrics-server-amd64@sha256:78938f933822856f443e6827fe5b37d6cc2f74ae888ac8b33d06fdbe5f8c658b
Port: <none>
Host Port: <none>
Command:
/metrics-server
--kubelet-insecure-tls
--kubelet-preferred-address-types=InternalIP
--logtostderr
State: Running
Started: Fri, 18 Jan 2019 11:26:35 +0530
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from metrics-server-token-d2g7b (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
metrics-server-token-d2g7b:
Type: Secret (a volume populated by a Secret)
SecretName: metrics-server-token-d2g7b
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
Please feel free to correct my understanding wherever I have gone wrong.
Any help would be appreciated.
You need to add the following command to your metrics-server deployment file:
containers:
- command:
  - /metrics-server
  - --metric-resolution=30s
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
  name: metrics-server
I think metrics-server could not find the kubelet via its InternalIP, hence the issue. For details, see my answer below for step-by-step instructions on setting up HPA.
I did something similar to what @PrafullLadha mentioned above.
I modified the metrics-server deployment file and added the following code:
containers:
- command:
  - /metrics-server
  - --metric-resolution=30s
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
Additionally, uncomment the following section in the statefulset.yaml file:
resources:
  requests:
    cpu: 200m
    memory: 256Mi
It worked fine after that.
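Uncommenting the requests matters because targetAverageUtilization is expressed as a percentage of the pod's CPU request, so without a request the HPA has nothing to divide by. A quick sketch of the arithmetic with the values above:

```python
def cpu_threshold_millicores(request_millicores, target_utilization_pct):
    """Average per-pod CPU (in millicores) above which the HPA
    starts scaling up, given the CPU request and utilization target."""
    return request_millicores * target_utilization_pct / 100

# With the uncommented 200m request and the 50% target from hpa.yaml,
# scale-up begins above an average of 100 millicores per pod.
print(cpu_threshold_millicores(200, 50))  # -> 100.0
```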
This error also occurs when your deployment has no running pods yet and none of the nodes available in the cluster can satisfy your resource requests; in that case there are obviously no metrics to read.