Kubernetes HPA not downscaling as expected
What happened:
I have configured an HPA with these details:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: api-horizontalautoscaler
  namespace: develop
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: api-deployment
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 400Mi
What I expected to happen:
When we put some load on and the average memory went above the 400 target, the pods scaled up to 3. Now the average memory has come back down to around 300, yet the pods still have not scaled down, even though they have been below the target for a few hours now.
One day later:
I expected the pods to scale down once memory dropped below 400.
Environment:
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.9", GitCommit:"3e4f6a92de5f259ef313ad876bb008897f6a98f0", GitTreeState:"clean", BuildDate:"2019-08-05T09:22:00Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.10", GitCommit:"37d169313237cb4ceb2cc4bef300f2ae3053c1a2", GitTreeState:"clean", BuildDate:"2019-08-19T10:44:49Z", GoVersion:"go1.11.13", Compiler:"gc", Platform:"linux/amd64"}
- OS (e.g. cat /etc/os-release):
> cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
- Kernel (e.g. uname -a):
x86_64 x86_64 x86_64 GNU/Linux
I would love to know why this is happening, and I am happy to provide any information that is needed.
Thanks!
Two things to look at:
The beta version, which includes support for scaling on memory and custom metrics, can be found in autoscaling/v2beta2. The new fields introduced in autoscaling/v2beta2 are preserved as annotations when working with autoscaling/v1.
autoscaling/v2beta2 was introduced in K8s 1.12, so even though you are on 1.13 (now six minor versions behind) it should work fine (upgrading to a newer release is still recommended). Try changing apiVersion: to autoscaling/v2beta2.
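As a sketch, the same HPA expressed against autoscaling/v2beta2 would look like this. Note that the metric target moves under a target: block with type: AverageValue, the quantity suffix is Mi, and I have also switched the scaleTargetRef to apps/v1, the stable Deployment API in 1.13 (adjust if your Deployment is served under a different group):

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: api-horizontalautoscaler
  namespace: develop
spec:
  scaleTargetRef:
    apiVersion: apps/v1      # extensions/v1beta1 is deprecated for Deployments
    kind: Deployment
    name: api-deployment
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue   # v2beta2 nests the target under this block
        averageValue: 400Mi
```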
--horizontal-pod-autoscaler-downscale-stabilization: The value for this option is a duration that specifies how long the autoscaler has to wait before another downscale operation can be performed after the current one has completed. The default value is 5 minutes (5m0s).
After changing the API version as suggested above, check the value of this particular flag on the kube-controller-manager.
The formula the HPA uses to decide how many pods to run is given in the Horizontal Pod Autoscaler documentation:
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
Plugging in the numbers you gave, currentReplicas is 3, currentMetricValue is 300 MiB, and desiredMetricValue is 400 MiB, so this reduces to:
desiredReplicas = ceil[3 * (300 / 400)]
desiredReplicas = ceil[3 * 0.75]
desiredReplicas = ceil[2.25]
desiredReplicas = 3
You will need to reduce the load further (below roughly 266 MiB average memory utilization) or raise the target memory value before the HPA will scale down.
(The target is not a threshold that triggers scaling in one direction or the other; merely dipping below it does not cause a scale-down — the same formula drives scaling both ways.)
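The arithmetic above can be checked with a small Python sketch of the HPA formula (the 266 MiB figure is just where ceil() first yields 2 with 3 replicas and a 400 MiB target):

```python
import math

def desired_replicas(current_replicas, current_value, desired_value):
    """HPA core formula: ceil(currentReplicas * currentMetricValue / desiredMetricValue)."""
    return math.ceil(current_replicas * current_value / desired_value)

# The numbers from the question: 3 replicas averaging ~300 MiB against a 400 MiB target.
# 3 * (300 / 400) = 2.25, and ceil(2.25) = 3, so no scale-down happens.
print(desired_replicas(3, 300, 400))

# Average usage must fall below 400 * 2/3 ≈ 266.7 MiB before the result drops to 2:
# 3 * (266 / 400) = 1.995, and ceil(1.995) = 2.
print(desired_replicas(3, 266, 400))
```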