AWS 上如何使用合适容量的 EKS 解决子网 IP 不足的问题?
How to use EKS with suitable volumes and resolve subnet IP insufficient issue on AWS?
我在 EKS 中部署了一个应用程序。部署总是挂起,当我检查事件时发现这些问题。
$ kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
89s Warning FailedScheduling pod/awx-demo-111111111-122222 running PreBind plugin "VolumeBinding": binding volumes: provisioning failed for PVC "awx-demo-projects-claim"
49m Warning FailedDeployModel ingress/awx-demo-ingress Failed deploy model due to InvalidSubnet: Not enough IP space available in subnet-031f9c702bc474e8f. ELB requires at least 8 free IP addresses in each subnet.
status code: 400, request id: 11111111-2222-3333-4444-555555555555
32m Warning FailedDeployModel ingress/awx-demo-ingress Failed deploy model due to InvalidSubnet: Not enough IP space available in subnet-01322i912fas0123na. ELB requires at least 8 free IP addresses in each subnet.
status code: 400, request id: 11111111-2222-3333-4444-555555555515
15m Warning FailedDeployModel ingress/awx-demo-ingress Failed deploy model due to InvalidSubnet: Not enough IP space available in subnet-031f9c702bc474e8f. ELB requires at least 8 free IP addresses in each subnet.
status code: 400, request id: 11111111-2222-3333-4444-555555555525
89s Normal WaitForPodScheduled persistentvolumeclaim/awx-demo-projects-claim waiting for pod awx-demo-111111111-122222 to be scheduled
21m Warning ProvisioningFailed persistentvolumeclaim/awx-demo-projects-claim Failed to provision volume with StorageClass "gp2": invalid AccessModes [ReadWriteMany]: only AccessModes [ReadWriteOnce] are supported
看来是设备问题和子网问题。我使用这些配置创建了 EKS 集群和节点组:
resource "aws_eks_cluster" "this" {
encryption_config {
resources = ["secrets"]
provider {
key_arn = aws_kms_key.this.arn
}
}
enabled_cluster_log_types = ["api", "authenticator", "audit", "scheduler", "controllerManager"]
name = local.cluster_name
version = "1.20"
role_arn = aws_iam_role.eks_cluster.arn
vpc_config {
subnet_ids = [
data.aws_ssm_parameter.private_subnet_0_id.value,
data.aws_ssm_parameter.private_subnet_1_id.value,
]
security_group_ids = [aws_security_group.this.id]
endpoint_public_access = true
}
depends_on = [
aws_iam_role_policy_attachment.eks_cluster_policy,
aws_iam_role_policy_attachment.eks_vpc_resource_controller,
aws_iam_role_policy_attachment.eks_service_policy,
]
tags = merge(
local.tags,
)
}
resource "aws_eks_node_group" "this" {
cluster_name = local.cluster_name
node_group_name = local.node_group_name
node_role_arn = aws_iam_role.eks_nodes.arn
instance_types = ["m5.2xlarge"]
subnet_ids = [
data.aws_ssm_parameter.private_subnet_0_id.value,
data.aws_ssm_parameter.private_subnet_1_id.value,
]
scaling_config {
desired_size = 2
max_size = 2
min_size = 2
}
lifecycle {
ignore_changes = [scaling_config[0].desired_size]
}
depends_on = [
aws_iam_role_policy_attachment.eks_worker_node_policy,
aws_iam_role_policy_attachment.eks_cni_policy,
aws_iam_role_policy_attachment.ec2_container_register_readonly,
]
tags = merge(
local.tags,
)
}
我没有为 EBS 定义卷类型,可能它使用的是默认设置。如何解决这个问题?
针对VPC的IP地址不足问题,如果新建子网供EKS使用,是否需要删除EKS集群或节点组?
顺便说一句,我使用的部署是https://raw.githubusercontent.com/ansible/awx-operator/0.13.0/deploy/awx-operator.yaml。
安装使用 https://github.com/ansible/awx-operator#basic-install.
@miantian,继续我们的评论讨论:
不能只增加子网大小。如果您更改子网大小,它将被重新创建。但由于 EKS 存在,子网创建将失败。所以,我会说——重新开始。删除所有内容,然后重新开始。
注册音量问题,默认EKS只支持ReadWriteOnce访问方式。这是因为 AWS 的技术限制,EBS 卷只能附加到 1 个 EC2 实例。如果要使用ReadWriteMany访问方式,需要使用EFS。
如果您想使用 EFS,请查找 NFS/EFS EKS 的客户端配置程序。要在 EKS 中创建 EFS 供应商,您需要执行几个步骤。然后,就可以开始使用ReadWriteMany访问模式了。
我在 EKS 中部署了一个应用程序。部署总是挂起,当我检查事件时发现这些问题。
$ kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
89s Warning FailedScheduling pod/awx-demo-111111111-122222 running PreBind plugin "VolumeBinding": binding volumes: provisioning failed for PVC "awx-demo-projects-claim"
49m Warning FailedDeployModel ingress/awx-demo-ingress Failed deploy model due to InvalidSubnet: Not enough IP space available in subnet-031f9c702bc474e8f. ELB requires at least 8 free IP addresses in each subnet.
status code: 400, request id: 11111111-2222-3333-4444-555555555555
32m Warning FailedDeployModel ingress/awx-demo-ingress Failed deploy model due to InvalidSubnet: Not enough IP space available in subnet-01322i912fas0123na. ELB requires at least 8 free IP addresses in each subnet.
status code: 400, request id: 11111111-2222-3333-4444-555555555515
15m Warning FailedDeployModel ingress/awx-demo-ingress Failed deploy model due to InvalidSubnet: Not enough IP space available in subnet-031f9c702bc474e8f. ELB requires at least 8 free IP addresses in each subnet.
status code: 400, request id: 11111111-2222-3333-4444-555555555525
89s Normal WaitForPodScheduled persistentvolumeclaim/awx-demo-projects-claim waiting for pod awx-demo-111111111-122222 to be scheduled
21m Warning ProvisioningFailed persistentvolumeclaim/awx-demo-projects-claim Failed to provision volume with StorageClass "gp2": invalid AccessModes [ReadWriteMany]: only AccessModes [ReadWriteOnce] are supported
看来是设备问题和子网问题。我使用这些配置创建了 EKS 集群和节点组:
resource "aws_eks_cluster" "this" {
encryption_config {
resources = ["secrets"]
provider {
key_arn = aws_kms_key.this.arn
}
}
enabled_cluster_log_types = ["api", "authenticator", "audit", "scheduler", "controllerManager"]
name = local.cluster_name
version = "1.20"
role_arn = aws_iam_role.eks_cluster.arn
vpc_config {
subnet_ids = [
data.aws_ssm_parameter.private_subnet_0_id.value,
data.aws_ssm_parameter.private_subnet_1_id.value,
]
security_group_ids = [aws_security_group.this.id]
endpoint_public_access = true
}
depends_on = [
aws_iam_role_policy_attachment.eks_cluster_policy,
aws_iam_role_policy_attachment.eks_vpc_resource_controller,
aws_iam_role_policy_attachment.eks_service_policy,
]
tags = merge(
local.tags,
)
}
resource "aws_eks_node_group" "this" {
cluster_name = local.cluster_name
node_group_name = local.node_group_name
node_role_arn = aws_iam_role.eks_nodes.arn
instance_types = ["m5.2xlarge"]
subnet_ids = [
data.aws_ssm_parameter.private_subnet_0_id.value,
data.aws_ssm_parameter.private_subnet_1_id.value,
]
scaling_config {
desired_size = 2
max_size = 2
min_size = 2
}
lifecycle {
ignore_changes = [scaling_config[0].desired_size]
}
depends_on = [
aws_iam_role_policy_attachment.eks_worker_node_policy,
aws_iam_role_policy_attachment.eks_cni_policy,
aws_iam_role_policy_attachment.ec2_container_register_readonly,
]
tags = merge(
local.tags,
)
}
我没有为 EBS 定义卷类型,可能它使用的是默认设置。如何解决这个问题?
针对VPC的IP地址不足问题,如果新建子网供EKS使用,是否需要删除EKS集群或节点组?
顺便说一句,我使用的部署是https://raw.githubusercontent.com/ansible/awx-operator/0.13.0/deploy/awx-operator.yaml。 安装使用 https://github.com/ansible/awx-operator#basic-install.
@miantian,继续我们的评论讨论:
不能只增加子网大小。如果您更改子网大小,它将被重新创建。但由于 EKS 存在,子网创建将失败。所以,我会说——重新开始。删除所有内容,然后重新开始。
注册音量问题,默认EKS只支持ReadWriteOnce访问方式。这是因为 AWS 的技术限制,EBS 卷只能附加到 1 个 EC2 实例。如果要使用ReadWriteMany访问方式,需要使用EFS。
如果您想使用 EFS,请查找 NFS/EFS EKS 的客户端配置程序。要在 EKS 中创建 EFS 供应商,您需要执行几个步骤。然后,就可以开始使用ReadWriteMany访问模式了。