HPA 无法读取 GKE 上的指标值(CPU 利用率)
HPA cannot read metric value (CPU utilization) on GKE
我正在单个集群上开发 Google Kubernetes Engine。
集群自动缩放节点数。
我已经创建了三个 Deployment 并使用网站设置了自动缩放策略(Workloads -> Deployment -> Actions -> Auto-scaling),所以没有手动编写 YAML 配置。
基于官方guide,我没有搞错
If you do not specify requests, you can autoscale based only on the
absolute value of the resource's utilization, such as milliCPUs for
CPU utilization.
以下是完整的部署 YAML:
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: student
name: student
namespace: ulibretto
spec:
replicas: 1
selector:
matchLabels:
app: student
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
labels:
app: student
spec:
containers:
- env:
- name: CLUSTER_HOST
valueFrom:
configMapKeyRef:
key: CLUSTER_HOST
name: shared-env-vars
- name: BIND_HOST
valueFrom:
configMapKeyRef:
key: BIND_HOST
name: shared-env-vars
- name: TOKEN_TIMEOUT
valueFrom:
configMapKeyRef:
key: TOKEN_TIMEOUT
name: shared-env-vars
image: gcr.io/ulibretto/github.com/ulibretto/studentservice
imagePullPolicy: IfNotPresent
name: studentservice-1
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
labels:
app: student
name: student-hpa-n3bp
namespace: ulibretto
spec:
maxReplicas: 100
metrics:
- resource:
name: cpu
targetAverageUtilization: 80
type: Resource
minReplicas: 1
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: student
---
apiVersion: v1
kind: Service
metadata:
annotations:
cloud.google.com/neg: '{"ingress":true}'
labels:
app: student
name: student-ingress
namespace: ulibretto
spec:
clusterIP: 10.44.5.59
ports:
- port: 5000
protocol: TCP
targetPort: 5000
selector:
app: student
sessionAffinity: None
type: ClusterIP
问题是 HPA 看不到指标(平均 CPU 利用率),这真的很奇怪(见图)。
HPA cannot read metric value
我错过了什么?
已编辑
你是对的。您不需要像我之前提到的那样在 scaleTargetRef:
中指定 namespace: ulibretto
。
由于您提供了所有 YAML,我能够找到正确的根本原因。
如果你检查 GKE docs 你会在代码中找到注释
resources:
# You must specify requests for CPU to autoscale
# based on CPU utilization
requests:
cpu: "250m"
您的部署没有指定 resource requests
。我试过这个(我删除了一些部分,因为我无法部署您的容器并更改了 HPA 中的 apiVersion):
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: student
name: student
namespace: ulibretto
spec:
replicas: 3
selector:
matchLabels:
app: student
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
labels:
app: student
spec:
containers:
- image: nginx
imagePullPolicy: IfNotPresent
name: studentservice-1
resources:
requests:
cpu: "250m"
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
labels:
app: student
name: student-hpa
namespace: ulibretto
spec:
maxReplicas: 100
minReplicas: 1
targetCPUUtilizationPercentage: 80
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: student
$ kubectl get all -n ulibretto
NAME READY STATUS RESTARTS AGE
pod/student-6f797d5888-84xfq 1/1 Running 0 7s
pod/student-6f797d5888-b7ctq 1/1 Running 0 7s
pod/student-6f797d5888-fbtmd 1/1 Running 0 7s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/student 3/3 3 3 7s
NAME DESIRED CURRENT READY AGE
replicaset.apps/student-6f797d5888 3 3 3 7s
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
horizontalpodautoscaler.autoscaling/student-hpa Deployment/student <unknown>/80% 1 100 0 7s
大约 1-5 分钟后,您将收到一些指标。
$ kubectl get all -n ulibretto
NAME READY STATUS RESTARTS AGE
pod/student-6f797d5888-84xfq 1/1 Running 0 95s
pod/student-6f797d5888-b7ctq 1/1 Running 0 95s
pod/student-6f797d5888-fbtmd 1/1 Running 0 95s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/student 3/3 3 3 95s
NAME DESIRED CURRENT READY AGE
replicaset.apps/student-6f797d5888 3 3 3 95s
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
horizontalpodautoscaler.autoscaling/student-hpa Deployment/student 0%/80% 1 100 3 95s
如果您想使用 CLI 创建 HPA,情况相同:
$ kubectl autoscale deployment student -n ulibretto --cpu-percent=50 --min=1 --max=100
horizontalpodautoscaler.autoscaling/student autoscaled
$ kubectl get hpa -n ulibretto
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
student Deployment/student <unknown>/50% 1 100 0 3s
一段时间后您将收到 0%
而不是 <unknown>
$ kubectl get all -n ulibretto
NAME READY STATUS RESTARTS AGE
pod/student-6f797d5888-84xfq 1/1 Running 0 4m4s
pod/student-6f797d5888-b7ctq 1/1 Running 0 4m4s
pod/student-6f797d5888-fbtmd 1/1 Running 0 4m4s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/student 3/3 3 3 4m5s
NAME DESIRED CURRENT READY AGE
replicaset.apps/student-6f797d5888 3 3 3 4m5s
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
horizontalpodautoscaler.autoscaling/student Deployment/student 0%/50% 1 100 3 58s
我正在单个集群上开发 Google Kubernetes Engine。 集群自动缩放节点数。 我已经创建了三个 Deployment 并使用网站设置了自动缩放策略(Workloads -> Deployment -> Actions -> Auto-scaling),所以没有手动编写 YAML 配置。 基于官方guide,我没有搞错
If you do not specify requests, you can autoscale based only on the absolute value of the resource's utilization, such as milliCPUs for CPU utilization.
以下是完整的部署 YAML:
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: student
name: student
namespace: ulibretto
spec:
replicas: 1
selector:
matchLabels:
app: student
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
labels:
app: student
spec:
containers:
- env:
- name: CLUSTER_HOST
valueFrom:
configMapKeyRef:
key: CLUSTER_HOST
name: shared-env-vars
- name: BIND_HOST
valueFrom:
configMapKeyRef:
key: BIND_HOST
name: shared-env-vars
- name: TOKEN_TIMEOUT
valueFrom:
configMapKeyRef:
key: TOKEN_TIMEOUT
name: shared-env-vars
image: gcr.io/ulibretto/github.com/ulibretto/studentservice
imagePullPolicy: IfNotPresent
name: studentservice-1
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
labels:
app: student
name: student-hpa-n3bp
namespace: ulibretto
spec:
maxReplicas: 100
metrics:
- resource:
name: cpu
targetAverageUtilization: 80
type: Resource
minReplicas: 1
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: student
---
apiVersion: v1
kind: Service
metadata:
annotations:
cloud.google.com/neg: '{"ingress":true}'
labels:
app: student
name: student-ingress
namespace: ulibretto
spec:
clusterIP: 10.44.5.59
ports:
- port: 5000
protocol: TCP
targetPort: 5000
selector:
app: student
sessionAffinity: None
type: ClusterIP
问题是 HPA 看不到指标(平均 CPU 利用率),这真的很奇怪(见图)。 HPA cannot read metric value
我错过了什么?
已编辑
你是对的。您不需要像我之前提到的那样在 scaleTargetRef:
中指定 namespace: ulibretto
。
由于您提供了所有 YAML,我能够找到正确的根本原因。
如果你检查 GKE docs 你会在代码中找到注释
resources:
# You must specify requests for CPU to autoscale
# based on CPU utilization
requests:
cpu: "250m"
您的部署没有指定 resource requests
。我试过这个(我删除了一些部分,因为我无法部署您的容器并更改了 HPA 中的 apiVersion):
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: student
name: student
namespace: ulibretto
spec:
replicas: 3
selector:
matchLabels:
app: student
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
labels:
app: student
spec:
containers:
- image: nginx
imagePullPolicy: IfNotPresent
name: studentservice-1
resources:
requests:
cpu: "250m"
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
labels:
app: student
name: student-hpa
namespace: ulibretto
spec:
maxReplicas: 100
minReplicas: 1
targetCPUUtilizationPercentage: 80
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: student
$ kubectl get all -n ulibretto
NAME READY STATUS RESTARTS AGE
pod/student-6f797d5888-84xfq 1/1 Running 0 7s
pod/student-6f797d5888-b7ctq 1/1 Running 0 7s
pod/student-6f797d5888-fbtmd 1/1 Running 0 7s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/student 3/3 3 3 7s
NAME DESIRED CURRENT READY AGE
replicaset.apps/student-6f797d5888 3 3 3 7s
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
horizontalpodautoscaler.autoscaling/student-hpa Deployment/student <unknown>/80% 1 100 0 7s
大约 1-5 分钟后,您将收到一些指标。
$ kubectl get all -n ulibretto
NAME READY STATUS RESTARTS AGE
pod/student-6f797d5888-84xfq 1/1 Running 0 95s
pod/student-6f797d5888-b7ctq 1/1 Running 0 95s
pod/student-6f797d5888-fbtmd 1/1 Running 0 95s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/student 3/3 3 3 95s
NAME DESIRED CURRENT READY AGE
replicaset.apps/student-6f797d5888 3 3 3 95s
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
horizontalpodautoscaler.autoscaling/student-hpa Deployment/student 0%/80% 1 100 3 95s
如果您想使用 CLI 创建 HPA,情况相同:
$ kubectl autoscale deployment student -n ulibretto --cpu-percent=50 --min=1 --max=100
horizontalpodautoscaler.autoscaling/student autoscaled
$ kubectl get hpa -n ulibretto
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
student Deployment/student <unknown>/50% 1 100 0 3s
一段时间后您将收到 0%
而不是 <unknown>
$ kubectl get all -n ulibretto
NAME READY STATUS RESTARTS AGE
pod/student-6f797d5888-84xfq 1/1 Running 0 4m4s
pod/student-6f797d5888-b7ctq 1/1 Running 0 4m4s
pod/student-6f797d5888-fbtmd 1/1 Running 0 4m4s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/student 3/3 3 3 4m5s
NAME DESIRED CURRENT READY AGE
replicaset.apps/student-6f797d5888 3 3 3 4m5s
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
horizontalpodautoscaler.autoscaling/student Deployment/student 0%/50% 1 100 3 58s