EKS ALB is not able to auto-discover subnets
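Background:
I have a VPC with 3 public subnets (the subnets have access to an internet gateway).
I have an EKS cluster in this VPC. The cluster was created from the console, not with eksctl.
The cluster contains two node groups:
- The first node group has one node of type: t3a.micro
- The second node group has one node of type: t3.small
I followed the tutorial in the official AWS documentation and managed to set up my ALB controller, and the controller is running perfectly: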
$ kubectl get deployment -n kube-system aws-load-balancer-controller
NAME READY UP-TO-DATE AVAILABLE AGE
aws-load-balancer-controller 1/1 1 1 60m
I used their game example. Here is the manifest file:
---
apiVersion: v1
kind: Namespace
metadata:
  name: game-2048
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: game-2048
  name: deployment-2048
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: app-2048
  replicas: 1
  template:
    metadata:
      labels:
        app.kubernetes.io/name: app-2048
    spec:
      containers:
      - image: alexwhen/docker-2048
        imagePullPolicy: Always
        name: app-2048
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  namespace: game-2048
  name: service-2048
spec:
  ports:
    - port: 80
      targetPort: 80
      protocol: TCP
  type: NodePort
  selector:
    app.kubernetes.io/name: app-2048
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  namespace: game-2048
  name: ingress-2048
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  rules:
    - http:
        paths:
          - path: /*
            backend:
              serviceName: service-2048
              servicePort: 80
However, when I describe the ingress, I get the following message:
DNDT@DNDT-DEV-2 MINGW64 ~/Desktop/.k8s
$ kubectl describe ingress/ingress-2048 -n game-2048
Name: ingress-2048
Namespace: game-2048
Address:
Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
Host Path Backends
---- ---- --------
*
/* service-2048:80 (172.31.4.64:80)
Annotations: alb.ingress.kubernetes.io/scheme: internet-facing
alb.ingress.kubernetes.io/target-type: ip
kubernetes.io/ingress.class: alb
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedBuildModel 9s (x13 over 32s) ingress Failed build model due to couldn't auto-discover subnets: unable to discover at least one subnet
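Here are the tags set on the 3 subnets:
And here is the route table of the subnets. As you can see, they have an internet gateway attached:
I searched everywhere and everyone talks about adding the tags. I created a completely new cluster from scratch, but I still get this issue. Is there anything else I am missing?
I checked a related answer, but it is not relevant because it is for ELB, not ALB.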
==================================
UPDATE:
I explicitly added the subnets:
alb.ingress.kubernetes.io/subnets: subnet-xxxxxx, subnet-xxxxx, subnet-xxx
Now I get an external address, but with a warning:
$ kubectl describe ingress/ingress-2048 -n game-2048
Name: ingress-2048
Namespace: game-2048
Address: k8s-game2048-ingress2-330cc1efad-115981283.eu-central-1.elb.amazonaws.com
Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
Host Path Backends
---- ---- --------
*
/* service-2048:80 (172.31.13.183:80)
Annotations: alb.ingress.kubernetes.io/scheme: internet-facing
alb.ingress.kubernetes.io/subnets: subnet-8ea768e4, subnet-bf2821f2, subnet-7c023801
alb.ingress.kubernetes.io/target-type: ip
kubernetes.io/ingress.class: alb
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedDeployModel 43s ingress Failed deploy model due to ListenerNotFound: One or more listeners not found
status code: 400, request id: e866eba4-328c-4282-a399-4e68f55ee266
Normal SuccessfullyReconciled 43s ingress Successfully reconciled
Also, opening the external address in the browser returns: 503 Service Temporarily Unavailable
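Make sure --cluster-name is configured correctly in the aws-load-balancer-controller deployment.
Use
kubectl get deployment -n kube-system aws-load-balancer-controller -oyaml |grep "cluster-name"
to get the cluster name set in the deployment.
If it is not correct, edit the deployment with the following command and fix the name:
kubectl edit deployment -n kube-system aws-load-balancer-controller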
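The same check can be scripted instead of using grep. This is a minimal sketch, assuming the controller receives its cluster name as a `--cluster-name=<name>` container argument; the helper name `cluster_name_from_deployment` and the sample deployment object are hypothetical and only illustrate the shape of the data.

```python
from typing import Optional


def cluster_name_from_deployment(deployment: dict) -> Optional[str]:
    """Return the --cluster-name value found in any container's args, else None."""
    containers = deployment["spec"]["template"]["spec"]["containers"]
    for container in containers:
        for arg in container.get("args", []):
            if arg.startswith("--cluster-name="):
                return arg.split("=", 1)[1]
    return None


# Hypothetical, trimmed-down deployment object (illustration only).
sample_deployment = {
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "controller",
                        "args": ["--ingress-class=alb", "--cluster-name=my-cluster"],
                    }
                ]
            }
        }
    }
}

print(cluster_name_from_deployment(sample_deployment))  # my-cluster
```

To run it against a real cluster, the dictionary could come from `json.load(sys.stdin)` with the output of `kubectl get deployment -n kube-system aws-load-balancer-controller -o json` piped in. Compare the printed value with the name of your EKS cluster.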
In my case, it was because I had not tagged the AWS subnets with the correct resource tags. https://kubernetes-sigs.github.io/aws-load-balancer-controller/guide/controller/subnet_discovery/
EDIT - May 28, 2021
Public subnets should be tagged with:
kubernetes.io/role/elb: 1
Private subnets should be tagged with:
kubernetes.io/role/internal-elb: 1
Both private and public subnets should be tagged with:
kubernetes.io/cluster/${your-cluster-name}: owned
or, if the subnets are also used by non-EKS resources:
kubernetes.io/cluster/${your-cluster-name}: shared
Source:
https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.1/deploy/subnet_discovery/
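The tagging rules above can be turned into a quick check. This is a minimal sketch that encodes only the rules as stated in this answer; the controller's real discovery logic may be more lenient and differs between versions. The function names and the sample tag values are hypothetical. The `Key`/`Value` list is the shape in which `aws ec2 describe-subnets` reports tags.

```python
from typing import Dict, List


def tags_to_dict(ec2_tags: List[Dict[str, str]]) -> Dict[str, str]:
    """Convert EC2-style [{"Key": ..., "Value": ...}] tags into a plain dict."""
    return {tag["Key"]: tag["Value"] for tag in ec2_tags}


def discovery_tag_problems(tags: Dict[str, str], cluster_name: str,
                           internet_facing: bool = True) -> List[str]:
    """List what is missing on one subnet, per the rules quoted in this answer."""
    problems = []
    # Public subnets use the "elb" role tag, private subnets "internal-elb".
    role_key = ("kubernetes.io/role/elb" if internet_facing
                else "kubernetes.io/role/internal-elb")
    if tags.get(role_key) != "1":
        problems.append(f"{role_key} should be 1")
    cluster_key = f"kubernetes.io/cluster/{cluster_name}"
    if tags.get(cluster_key) not in ("owned", "shared"):
        problems.append(f"{cluster_key} should be owned or shared")
    return problems


# Hypothetical subnet that only carries a Name tag.
untagged = tags_to_dict([{"Key": "Name", "Value": "public-a"}])
for problem in discovery_tag_problems(untagged, "my-cluster"):
    print(problem)
```

An empty list means the subnet satisfies both rules for the chosen scheme. Pass `internet_facing=False` to check a private subnet for an internal load balancer.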
If you upgrade aws-load-balancer-controller from v2.1 to v2.2, note that you will see this same error because new IAM permissions are required. For details and links on these new permissions, see the changelog here: https://github.com/kubernetes-sigs/aws-load-balancer-controller/releases/tag/v2.2.0
Direct link to the IAM permissions: https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.2.0/docs/install/iam_policy.json
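To see which permissions a newer policy file adds relative to the one currently attached, the two JSON documents can be compared by action name. This is a rough sketch: it looks only at `Allow` statements, ignores `Resource` and `Condition`, and does not expand wildcards. The two sample policies are hypothetical and are not the actual v2.1 or v2.2 policies.

```python
from typing import Set


def policy_actions(policy: dict) -> Set[str]:
    """Collect every action named in the Allow statements of an IAM policy."""
    actions: Set[str] = set()
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        action = statement.get("Action", [])
        # "Action" may be a single string or a list of strings.
        actions.update([action] if isinstance(action, str) else action)
    return actions


def added_actions(old_policy: dict, new_policy: dict) -> Set[str]:
    """Actions allowed by new_policy that old_policy does not mention."""
    return policy_actions(new_policy) - policy_actions(old_policy)


# Hypothetical policies for illustration only.
old_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["ec2:DescribeSubnets"], "Resource": "*"},
    ],
}
new_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow",
         "Action": ["ec2:DescribeSubnets", "elasticloadbalancing:DescribeListeners"],
         "Resource": "*"},
        {"Effect": "Allow", "Action": "ec2:DescribeVpcs", "Resource": "*"},
    ],
}

print(sorted(added_actions(old_policy, new_policy)))
```

In practice the two dictionaries would be loaded with `json.load` from the policy currently attached to the controller's role and from the `iam_policy.json` linked above.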
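I had the same issue with a cluster that I created manually in the AWS console.
But then I tried creating the cluster with eksctl, and the subnets it created had slightly different tags, namely:
Key | Value |
---|---|
Name | eksctl-cluster-name-cluster/SubnetPublicUSEAST1A |
aws:cloudformation:logical-id | SubnetPublicUSEAST1A |
kubernetes.io/role/elb | 1 |
aws:cloudformation:stack-name | eksctl-cluster-name-cluster |
alpha.eksctl.io/cluster-name | cluster-name |
aws:cloudformation:stack-id | stack-id |
alpha.eksctl.io/eksctl-version | 0.76.0 |
eksctl.cluster.k8s.io/v1alpha1/cluster-name | cluster-name |
Subnet discovery may depend on some of these tags, or on some subnet or IAM configuration.
I suggest trying to launch the cluster with eksctl.
You can also explicitly specify your subnets:
alb.ingress.kubernetes.io/subnets: subnet-xxx,subnet-yyyy
although enabling auto-discovery is still recommended.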