AWS CloudFormation 堆栈失败并出现错误 Received 0 SUCCESS signal(s) out of 1

AWS CloudFormation stack fails with error Received 0 SUCCESS signal(s) out of 1

我的 AWS CloudFormation 模板失败并出现错误:

Received 0 SUCCESS signal(s) out of 1. Unable to satisfy 100% MinSuccessfulInstancesPercent requirement

我在想我的 WaitConditionHandles 设置不正确(或者 EC2 实例可能没有发送一个),但不确定如何解决这个问题。

似乎在 AWS 中正​​确创建了所有内容(ASG、EC2 实例)。

我正在使用以下 CloudFormation 模板:

AWSTemplateFormatVersion: "2010-09-09"
Description: "Auto Scaling Group"
Outputs:
  AsgArn: 
    Value: !Ref "AutoScalingGroup"
  AsgMinSize:
    Description: "The minimum size of the Auto Scaling Group"
    Value: !FindInMap [ "HighAvailability", "MinSize", !Ref "HighAvailabilityFlag" ]
Parameters:
  Ami:
    Description: "Base AMI"
    Type: "AWS::EC2::Image::Id"
  EnvironmentName:
    Description: "The environment name"
    Type: "String"
  HighAvailabilityFlag:
    Description: "Flag used to set the minimum and maximum size of the Auto Scaling Group"
    Default: false
    Type: "String"
    AllowedValues: [ "true", "false" ]
  KeyPairName:
    Description: "Name of EC2 key pair for logging in to the instances"
    Type: "String"
  SecurityGroupIds:
    Description: "The IDs of security groups that are permitted access to EC2 instances"
    Type: "String"
  Subnets:
    Description: "Subnets to associate with the ASG"
    Type: "List<AWS::EC2::Subnet::Id>"
  VersionToDeploy:
    Description: "Version to deploy"
    Type: "String"
  VpcId:
    Description: "The ID of the VPC"
    Type: "AWS::EC2::VPC::Id"
Mappings:
  HighAvailability:
    MinSize:
      "false": 1
      "true": 2
    MaxSize:
      "false": 1
      "true": 4
Resources:
  InstanceProfile:
    Properties:
      Path: "/"
      Roles:
        - !Ref "InstanceRole"
    Type: "AWS::IAM::InstanceProfile"
  InstanceRole:
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Action:
              - sts:AssumeRole
            Effect: "Allow"
            Principal:
              Service:
                - ec2.amazonaws.com
        Version: "2012-10-17"
      Path: "/"
    Type: "AWS::IAM::Role"
  Policy:
    Properties:
      PolicyDocument:
        Statement:
          - Action:
              - cloudformation:DescribeStacks
              - ec2:Describe*
            Effect: "Allow"
            Resource: "*"
        Version: "2012-10-17"
      PolicyName: "Service"
      Roles:
        - !Ref "InstanceRole"
    Type: "AWS::IAM::Policy"    
  AutoScalingGroup:
    Properties:
      HealthCheckGracePeriod: 300
      MetricsCollection:
        - Granularity: "1Minute"
      HealthCheckType: "ELB"
      LaunchConfigurationName: !Ref "LaunchConfiguration"
      MaxSize: !FindInMap [ "HighAvailability", "MaxSize", !Ref "HighAvailabilityFlag" ]
      MinSize: !FindInMap [ "HighAvailability", "MinSize", !Ref "HighAvailabilityFlag" ]
      VPCZoneIdentifier: !Ref "Subnets"
    CreationPolicy:
      ResourceSignal:
        Count: !FindInMap [ "HighAvailability", "MinSize", !Ref "HighAvailabilityFlag" ]
        Timeout: "PT5M"
    UpdatePolicy:
      AutoScalingRollingUpdate:
        MinInstancesInService: !FindInMap [ "HighAvailability", "MinSize", !Ref "HighAvailabilityFlag" ]
        PauseTime: "PT5M"
        WaitOnResourceSignals: true
    Type: "AWS::AutoScaling::AutoScalingGroup"
  LaunchConfiguration:
    Properties:
      AssociatePublicIpAddress: true
      IamInstanceProfile: !Ref "InstanceProfile"
      ImageId: !Ref "Ami"
      InstanceType: "t2.micro"
      KeyName: !Ref "KeyPairName"
      SecurityGroups: !Split [ ",", !Join [ ",", [ !Ref "SecurityGroupIds" ] ] ]
      UserData:
        Fn::Base64:
          cfn-init.exe -v -s "AWS::StackName" --region "AWS::Region" 
          cfn-signal.exe -e 0 !Ref "WindowsServerWaitHandle"
    Type: "AWS::AutoScaling::LaunchConfiguration"
  WindowsServerWaitHandle:
    Type: "AWS::CloudFormation::WaitConditionHandle"
  WindowsServerWaitCondition:
    DependsOn: "AutoScalingGroup"
    Properties:
      Handle: !Ref "WindowsServerWaitHandle"
      Timeout: "1800"
      Count: 0
    Type: "AWS::CloudFormation::WaitCondition"

创建 EC2 实例后,我看到生成了一些日志文件:

UserdataExecution.log

2017/03/05 05:54:47Z: Userdata execution begins
2017/03/05 05:54:47Z: Zero or more than one <persist> tag was not provided
2017/03/05 05:54:47Z: Unregistering the persist scheduled task
2017/03/05 05:54:50Z: Zero or more than one <runAsLocalSystem> tag was not provided
2017/03/05 05:54:50Z: Zero or more than one <script> tag was not provided
2017/03/05 05:54:50Z: Zero or more than one <powershell> tag was not provided
2017/03/05 05:54:50Z: Zero or more than one <powershellArguments> tag was not provided
2017/03/05 05:54:50Z: Userdata execution done

WindowsIsReadyToConsole.log

2017/03/03 04:46:27Z: Sending "Windows is Ready" message to console is scheduled successfully
2017/03/05 05:54:27Z: Sending windows is ready message started
2017/03/05 05:54:28Z: Opening COM port handle to write to the console
2017/03/05 05:54:30Z: Serial Port in use. Waiting for Serial Port...
2017/03/05 05:54:48Z: Message: Windows is Ready to use
2017/03/05 05:54:48Z: Sending windows is ready message done

您在 IAM 角色的 PolicyDocument 中缺少 - cloudformation:SignalResource 操作。发送信号需要此权限。

TLDR

这是当 EC2 无法向 ASG 发送成功信号时发生的一般错误。发生这种情况的可能原因有很多,但很可能您使用的任何健康检查都没有按预期工作。

使用下面的用户数据应该对健康检查进行硬编码,这是开始测试您的应用程序和 Cloud Formation 模板的好方法。

我的问题

我删除了对 AWS::CloudFormation::WaitConditionHandleAWS::CloudFormation::WaitCondition

的所有引用

我的 UserData 脚本有问题:

  • 脚本需要 <script> 个标签才能执行
  • 命令没有正确的参数
  • 未正确注入变量(例如${AWS::StackName}

结果是:

UserData:
  "Fn::Base64":
    !Sub |
      <script>
        cfn-init.exe -v --stack ${AWS::StackName} --resource AutoScalingGroup --region ${AWS::Region}
        cfn-signal.exe -e 0 --stack ${AWS::StackName} --resource AutoScalingGroup --region ${AWS::Region}
      </script>