Adding an EKS managed Windows node group failed. How to debug?
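In the AWS Console,
- I created an AWS EKS node IAM role with the following IAM policies:
  AmazonEKSWorkerNodePolicy
  AmazonEKS_CNI_Policy
  AmazonEC2ContainerRegistryReadOnly
- I created a launch template with the AMI ami-0e6430de0e2d50a33
  (Windows_Server-English-Full-EKS-Optimized-1.16-2020.09.09)
I have an existing EKS cluster created by Terraform (0.11.13). It has one EKS node group. I want to add a new Windows EKS node group manually. In the AWS Console, I went to my EKS cluster, clicked "Add Node Group", used the launch template above, and clicked the "Create" button. However, I got "Create failed" with no indication of why. Where can I find the logs for the AWS Console?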
I'm not sure where to find those kinds of logs.
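One place that may be worth checking, offered as a sketch rather than a confirmed answer: for managed node groups, the EKS `DescribeNodegroup` API returns a `health.issues` list whose entries carry a `code`, a `message`, and `resourceIds`. The helper names below (`format_nodegroup_issues`, `fetch_nodegroup_issues`) are hypothetical, and the script assumes `boto3` is installed and that your credentials allow `eks:DescribeNodegroup`.

```python
def format_nodegroup_issues(response):
    """Turn a DescribeNodegroup response dict into readable lines."""
    nodegroup = response.get("nodegroup", {})
    lines = [f"status: {nodegroup.get('status', 'UNKNOWN')}"]
    for issue in nodegroup.get("health", {}).get("issues", []):
        # resourceIds may be empty; show a dash in that case.
        resources = ", ".join(issue.get("resourceIds", [])) or "-"
        lines.append(f"{issue.get('code')}: {issue.get('message')} [{resources}]")
    return lines


def fetch_nodegroup_issues(cluster_name, nodegroup_name, region_name=None):
    """Call the EKS API and return the formatted status and health issues."""
    # Imported lazily so the formatter above works without boto3 installed.
    import boto3

    eks = boto3.client("eks", region_name=region_name)
    response = eks.describe_nodegroup(
        clusterName=cluster_name, nodegroupName=nodegroup_name
    )
    return format_nodegroup_issues(response)
```

Calling `fetch_nodegroup_issues("my-cluster", "my-windows-nodes")` (both names are placeholders) prints nothing by itself; it returns a list of lines you can print or log. If the list contains only the status line, the API reported no health issues for that node group.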
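That said, this is the AWS CloudFormation template we use to create a self-managed Windows Server 2019 node group that joins a given cluster. Note that it uses spot instances and that the worker nodes also join an existing AD.
You will need to either export your EKS cluster name from another CF template or hard-code the value in the UserData property (or pass your EKS cluster name in as a parameter).
Remove the 'New-SSMAssociation' line if you are not joining AD.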
AWSTemplateFormatVersion: 2010-09-09
Description: Creates EC2 instances to support the EKS cluster worker nodes.
Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups:
      -
        Label:
          default: "EKS Worker Nodes Configuration"
        Parameters:
          - Environment
          - NodeImageIdSSMParam
          - SpotPrice
          - Subnets
          - ActiveDirectoryIdentifier
          - ActiveDirectoryName
          - DesiredCapacity
          - MaxCapacity
          - MinCapacity
Parameters:
  Environment:
    Type: String
    Description: The associated environment of the EKS cluster.
    AllowedValues:
      - preprod
      - prod
  BootstrapArguments:
    Type: String
    Default: ""
    Description: Arguments to pass to the bootstrap script.
  NodeImageIdSSMParam:
    Type: "AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>"
    Default: /aws/service/ami-windows-latest/Windows_Server-2019-English-Core-EKS_Optimized-1.17/image_id
    Description: AWS Systems Manager Parameter Store parameter of the AMI ID for the worker node instances.
  SpotPrice:
    Type: String
    Description: The spot price to bid for the EKS Optimized instances.
    Default: 0.4000
  Subnets:
    Description: Select the PRIVATE subnets where workers can be created.
    Type: List<AWS::EC2::Subnet::Id>
  ActiveDirectoryIdentifier:
    Type: String
    Description: The identifier of the shared Microsoft Managed AD
  ActiveDirectoryName:
    Type: String
    Description: The name of the shared Microsoft Managed AD
  DesiredCapacity:
    Type: Number
    Description: The desired number of EC2 instances for the Autoscaling group.
    Default: 6
  MaxCapacity:
    Type: Number
    Description: The maximum number of EC2 instances for the Autoscaling group.
    Default: 6
  MinCapacity:
    Type: Number
    Description: The minimum number of EC2 instances for the Autoscaling group.
    Default: 6
Resources:
  LaunchConfiguration:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
            DeleteOnTermination: true
            VolumeSize: 50
            VolumeType: gp2
      LaunchConfigurationName: !Sub eks-worker-nodes-windows-${Environment}-launch-config
      SpotPrice: !Ref SpotPrice
      AssociatePublicIpAddress: false
      ImageId: !Ref NodeImageIdSSMParam
      InstanceType: t3.large
      IamInstanceProfile: !ImportValue eks-worker-instance-profile-arn
      InstanceMonitoring: true
      KeyName: samtec-ec2-key
      SecurityGroups:
        - Fn::ImportValue: !Sub eks-${Environment}-sg
      UserData:
        Fn::Base64: !Sub
          - |
            <powershell>
            Set-DefaultAWSRegion -Region ${AWS::Region}
            Set-Variable -name instance_id -value (Invoke-Restmethod -uri http://169.254.169.254/latest/meta-data/instance-id)
            # Joins the instance to the AD domain; remove this line if you are not joining AD.
            New-SSMAssociation -InstanceId $instance_id -Name "awsconfig_Domain_${ActiveDirectoryIdentifier}_${ActiveDirectoryName}"
            [string]$EKSBinDir = "$env:ProgramFiles\Amazon\EKS"
            [string]$EKSBootstrapScriptName = 'Start-EKSBootstrap.ps1'
            [string]$EKSBootstrapScriptFile = "$EKSBinDir\$EKSBootstrapScriptName"
            [string]$cfn_signal = "$env:ProgramFiles\Amazon\cfn-bootstrap\cfn-signal.exe"
            & $EKSBootstrapScriptFile -EKSClusterName ${ClusterName} ${BootstrapArguments} 3>&1 4>&1 5>&1 6>&1
            $LastError = if ($?) { 0 } else { $Error[0].Exception.HResult }
            & $cfn_signal --exit-code=$LastError `
              --stack="${AWS::StackName}" `
              --resource="NodeGroup" `
              --region=${AWS::Region}
            </powershell>
          - ClusterName:
              'Fn::ImportValue': !Sub 'eks-${Environment}-name'
  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      AutoScalingGroupName: !Sub eks-worker-nodes-windows-${Environment}-autoscaler
      Cooldown: 30
      DesiredCapacity: !Ref DesiredCapacity
      HealthCheckGracePeriod: 300
      HealthCheckType: EC2
      LaunchConfigurationName: !Ref LaunchConfiguration
      MaxSize: !Ref MaxCapacity
      MinSize: !Ref MinCapacity
      MetricsCollection:
        - Granularity: 1Minute
      Tags:
        - Key: Name
          Value: !Sub eks-windows-${Environment}-worker
          PropagateAtLaunch: true
        - Key: operating-system
          Value: windows
          PropagateAtLaunch: true
        - Key: !Sub
            - |
              kubernetes.io/cluster/${ClusterName}
            - ClusterName:
                'Fn::ImportValue': !Sub 'eks-${Environment}-name'
          Value: owned
          PropagateAtLaunch: true
        - Key: !Sub
            - |
              k8s.io/cluster-autoscaler/${ClusterName}
            - ClusterName:
                'Fn::ImportValue': !Sub 'eks-${Environment}-name'
          Value: owned
          PropagateAtLaunch: true
        - Key: k8s.io/cluster-autoscaler/enabled
          Value: true
          PropagateAtLaunch: true
        - Key: eks:cluster-name
          Value:
            'Fn::ImportValue': !Sub 'eks-${Environment}-name'
          PropagateAtLaunch: true
        - Key: eks:nodegroup-name
          Value:
            'Fn::ImportValue': !Sub 'eks-${Environment}-name'
          PropagateAtLaunch: true
      VPCZoneIdentifier: !Ref Subnets
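The template resolves three values through `Fn::ImportValue`: `eks-${Environment}-name`, `eks-${Environment}-sg`, and `eks-worker-instance-profile-arn`. If any of these exports is missing, the stack cannot be created, so a quick pre-flight check may save a failed deployment. The following is a minimal sketch under the assumption that `boto3` is available; the function names are hypothetical and not part of our setup.

```python
def required_export_names(environment):
    """Export names the template imports for a given Environment value."""
    return [
        f"eks-{environment}-name",
        f"eks-{environment}-sg",
        "eks-worker-instance-profile-arn",
    ]


def missing_exports(environment, available_names):
    """Return the required export names that are not in available_names."""
    available = set(available_names)
    return [n for n in required_export_names(environment) if n not in available]


def list_export_names(region_name=None):
    """Collect all CloudFormation export names in the account and region."""
    # Imported lazily so the pure helpers above work without boto3 installed.
    import boto3

    cfn = boto3.client("cloudformation", region_name=region_name)
    names = []
    for page in cfn.get_paginator("list_exports").paginate():
        names.extend(export["Name"] for export in page.get("Exports", []))
    return names
```

For example, `missing_exports("prod", list_export_names())` returns an empty list when all three exports exist, and otherwise names the ones you still need to export or hard-code.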