AWS ALB Ingress Controller with IRSA authorization error

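I'm trying to set up the AWS ALB Ingress Controller using the IRSA method (IAM Roles for Service Accounts) instead of kube2iam. However, some of the documentation is missing, so I've hit a dead end.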

What I've done so far:

eksctl utils associate-iam-oidc-provider --cluster devops --approve

eksctl create iamserviceaccount --name alb-ingress --namespace default --cluster devops --attach-policy-arn arn:aws:iam::112233445566:policy/eks-ingressController-iam-policy-IngressControllerPolicy-1111111111 --approve

kubectl apply -f rbac-role.yaml

Everything deployed fine up to this point. Now I'm trying to deploy my Ingress service, but I get this error (in the controller logs):

kubebuilder/controller "msg"="Reconciler error" "error"="failed to build LoadBalancer configuration due to failed to get AWS tags. Error: AccessDeniedException: User: arn:aws:sts::1122334455:assumed-role/eksctl-devops-nodegroup-ng-1-work-NodeInstanceRole-J08FDJHIWPI7/i-000000000000 is not authorized to perform: tag:GetResources\n\tstatus code: 400, request id: 94d614a1-c05d-4b92-8ad6-86b450407f6a"  "Controller"="alb-ingress-controller" "Request"={"Namespace":"superset","Name":"superset-ingress"}

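Apparently the node doesn't have the permissions needed to create the ALB, and I guess that if I attached my policy to the role named in the log, it would work. But that defeats the whole purpose of using the IRSA method, right?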

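I would expect the Ingress Controller pod, rather than the node, to need the appropriate permissions (via the service account) to create the ALB. Am I missing something here?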
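To see at a glance which identity made the denied call, the role name and the denied action can be pulled out of the log line. This is only a sketch using standard sed; the log text is copied (shortened) from the error above, and the variable names are my own.

```shell
# Relevant part of the controller log line quoted above.
log_line='Error: AccessDeniedException: User: arn:aws:sts::1122334455:assumed-role/eksctl-devops-nodegroup-ng-1-work-NodeInstanceRole-J08FDJHIWPI7/i-000000000000 is not authorized to perform: tag:GetResources'

# The role name sits between "assumed-role/" and the next "/".
role_name=$(printf '%s\n' "$log_line" | sed -n 's|.*assumed-role/\([^/]*\)/.*|\1|p')

# The denied action follows "not authorized to perform: ".
denied_action=$(printf '%s\n' "$log_line" | sed -n 's|.*not authorized to perform: \([A-Za-z0-9:-]*\).*|\1|p')

echo "role:   $role_name"
echo "action: $denied_action"
```

A role name containing NodeInstanceRole, followed by an EC2 instance ID as the session name, suggests the pod is calling AWS with the node's instance profile rather than with the role bound to the service account.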

So, in case anyone runs into the same problem.

The solution is, when creating the RBAC roles, to comment out the last part of rbac-role.yaml (as provided here), which creates the service account.

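Since we already created a service account with eksctl and attached the AWS policy to it, we can also attach the RBAC permissions to that same service account. The ingress controller pod can then use this service account as usual and work its magic.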
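As an alternative to commenting the section out by hand, the ServiceAccount document can be filtered out of the manifest before applying it. The sketch below assumes a multi-document YAML file separated by `---` lines; the sample file is a hypothetical, heavily trimmed stand-in for rbac-role.yaml (not its real contents), and the file names are my own.

```shell
# Hypothetical, trimmed stand-in for rbac-role.yaml (illustration only).
cat > rbac-role-sample.yaml <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: alb-ingress-controller
subjects:
  - kind: ServiceAccount
    name: alb-ingress
    namespace: default
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: alb-ingress
  namespace: default
EOF

# Buffer each YAML document and print it only if its top-level kind
# is not ServiceAccount. Indented "kind:" keys (e.g. under subjects) are ignored.
awk '
  function emit_doc() {
    if (!skip && doc != "") {
      if (printed) print "---"
      printf "%s", doc
      printed = 1
    }
    doc = ""; skip = 0
  }
  /^---[[:space:]]*$/ { emit_doc(); next }
  /^kind:[[:space:]]*ServiceAccount[[:space:]]*$/ { skip = 1 }
  { doc = doc $0 "\n" }
  END { emit_doc() }
' rbac-role-sample.yaml > rbac-role-no-sa.yaml

cat rbac-role-no-sa.yaml
```

The filtered file can then be applied with kubectl apply -f in place of the original, leaving the service account that eksctl created as the only one the RBAC binding refers to.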

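Per the documentation, permissions to CRUD ALBs are required. You could try giving only the ALB driver pod a role with permissions to create ALBs, but I haven't tested that, and I'm not sure it matters if your entire scheduler has already been granted use of the ALB driver/pod to create these objects on AWS.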

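I'm not using the cluster creation tool for EKS 3.0; instead I use my own CFT (CloudFormation template) to create the workers, because my organization has additional security requirements.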

I created the following managed policy and attached it to the workers that need to create ALBs, and it just works.

  ALBPolicy:
    Type: "AWS::IAM::ManagedPolicy"
    Properties:
      Description: Allows workers to CRUD alb's
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          -
            Effect: "Allow"
            Action:
              - "acm:DescribeCertificate"
              - "acm:ListCertificates"
              - "acm:GetCertificate"
            Resource: "*"
          -
            Effect: "Allow"
            Action:
              - "ec2:AuthorizeSecurityGroupIngress"
              - "ec2:CreateSecurityGroup"
              - "ec2:CreateTags"
              - "ec2:DeleteTags"
              - "ec2:DeleteSecurityGroup"
              - "ec2:DescribeAccountAttributes"
              - "ec2:DescribeAddresses"
              - "ec2:DescribeInstances"
              - "ec2:DescribeInstanceStatus"
              - "ec2:DescribeInternetGateways"
              - "ec2:DescribeNetworkInterfaces"
              - "ec2:DescribeSecurityGroups"
              - "ec2:DescribeSubnets"
              - "ec2:DescribeTags"
              - "ec2:DescribeVpcs"
              - "ec2:ModifyInstanceAttribute"
              - "ec2:ModifyNetworkInterfaceAttribute"
              - "ec2:RevokeSecurityGroupIngress"
            Resource: "*"
          -
            Effect: "Allow"
            Action:
              - "elasticloadbalancing:AddListenerCertificates"
              - "elasticloadbalancing:AddTags"
              - "elasticloadbalancing:CreateListener"
              - "elasticloadbalancing:CreateLoadBalancer"
              - "elasticloadbalancing:CreateRule"
              - "elasticloadbalancing:CreateTargetGroup"
              - "elasticloadbalancing:DeleteListener"
              - "elasticloadbalancing:DeleteLoadBalancer"
              - "elasticloadbalancing:DeleteRule"
              - "elasticloadbalancing:DeleteTargetGroup"
              - "elasticloadbalancing:DeregisterTargets"
              - "elasticloadbalancing:DescribeListenerCertificates"
              - "elasticloadbalancing:DescribeListeners"
              - "elasticloadbalancing:DescribeLoadBalancers"
              - "elasticloadbalancing:DescribeLoadBalancerAttributes"
              - "elasticloadbalancing:DescribeRules"
              - "elasticloadbalancing:DescribeSSLPolicies"
              - "elasticloadbalancing:DescribeTags"
              - "elasticloadbalancing:DescribeTargetGroups"
              - "elasticloadbalancing:DescribeTargetGroupAttributes"
              - "elasticloadbalancing:DescribeTargetHealth"
              - "elasticloadbalancing:ModifyListener"
              - "elasticloadbalancing:ModifyLoadBalancerAttributes"
              - "elasticloadbalancing:ModifyRule"
              - "elasticloadbalancing:ModifyTargetGroup"
              - "elasticloadbalancing:ModifyTargetGroupAttributes"
              - "elasticloadbalancing:RegisterTargets"
              - "elasticloadbalancing:RemoveListenerCertificates"
              - "elasticloadbalancing:RemoveTags"
              - "elasticloadbalancing:SetIpAddressType"
              - "elasticloadbalancing:SetSecurityGroups"
              - "elasticloadbalancing:SetSubnets"
              - "elasticloadbalancing:SetWebACL"
            Resource: "*"
          -
            Effect: "Allow"
            Action:
              - "iam:CreateServiceLinkedRole"
              - "iam:GetServerCertificate"
              - "iam:ListServerCertificates"
            Resource: "*"
          -
            Effect: "Allow"
            Action:
              - "cognito-idp:DescribeUserPoolClient"
            Resource: "*"
          -
            Effect: "Allow"
            Action:
              - "waf-regional:GetWebACLForResource"
              - "waf-regional:GetWebACL"
              - "waf-regional:AssociateWebACL"
              - "waf-regional:DisassociateWebACL"
            Resource: "*"
          -
            Effect: "Allow"
            Action:
              - "tag:GetResources"
              - "tag:TagResources"
            Resource: "*"
          -
            Effect: "Allow"
            Action:
              - "waf:GetWebACL"
            Resource: "*"

I got a similar (not identical) error while working with version v1.1.8 of this controller:

kubebuilder/controller "msg"="Reconciler error" "error"="failed get WAFv2 webACL for load balancer arn:aws:elasticloadbalancing:...: AccessDeniedException: User: arn:aws:sts:::assumed-role/eks-node-group-role/ is not authorized to perform: wafv2:GetWebACLForResource on resource: arn:aws:wafv2:us-east-2::regional/webacl/*\n\tstatus code: 400, request id: ..." "controller"="alb-ingress-controller" "request"={"Namespace":"default","Name":"aws-alb-ingress"}

I'm adding it here because I think it can help people searching for the same error message.

The cause of the error above is that version v1.1.7 of this controller needs new IAM permissions in the nodegroup role's *PolicyALBIngress policy.

(!) Note that the new IAM permissions are required even if you don't use the wafv2 annotation.

Solution 1

Add the wafv2 section to the policy's allowed actions:

{
  "Effect": "Allow",
  "Action": [
    "wafv2:GetWebACL",
    "wafv2:GetWebACLForResource",
    "wafv2:AssociateWebACL",
    "wafv2:DisassociateWebACL"
  ],
  "Resource": "*"
}
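If editing the existing policy is inconvenient, one possible way to grant the same actions is a separate inline policy on the node group role. This is a sketch under assumptions, not necessarily what the answer's author did: the file name, the policy name alb-ingress-wafv2, and the ROLE_NAME value (taken from the role shown in the log above) are placeholders. The aws command is printed rather than executed, so nothing changes until you run it yourself.

```shell
# Wrap the wafv2 statement from above in a complete policy document.
cat > wafv2-statement.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "wafv2:GetWebACL",
        "wafv2:GetWebACLForResource",
        "wafv2:AssociateWebACL",
        "wafv2:DisassociateWebACL"
      ],
      "Resource": "*"
    }
  ]
}
EOF

# Placeholder: replace with the role your controller actually runs under.
ROLE_NAME="eks-node-group-role"

# Print the command for review instead of running it.
echo aws iam put-role-policy \
  --role-name "$ROLE_NAME" \
  --policy-name alb-ingress-wafv2 \
  --policy-document file://wafv2-statement.json
```

Keeping the wafv2 actions in their own inline policy makes them easy to remove later, at the cost of having the controller's permissions spread across two policies.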

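Solution 2

WAFV2 support can be disabled via a controller flag, as mentioned here.

A) If you installed it via kubectl, add - --feature-gates=waf=false to the spec -> containers -> args section.

B) If you installed it via helm, add --set extraArgs."feature-gates"='waf=false' to the helm upgrade command.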


Note that this requirement has already been updated in the eksctl tool (see also here).


Additional reference.