Step Function stuck in "Running" state after 2 tries

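I have a Step Function that gets stuck in the "Running" state after two tries. The first 2 times it runs as expected.

The function checks the status of snapshots and updates records in a DynamoDB table. If a snapshot is still in the creating state, it raises an exception.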

{ "error": "SnapshotToolException", "cause": "{\"errorMessage\": \"There are still 1 snapshots in creating state.\", \"errorType\": \"SnapshotToolException\", \"stackTrace\": [[\"/var/task/lambda_function.py\", 5, \"lambda_handler\", \"checkSnapshotRecordsState()\"], [\"/var/task/dynamodb_control_utils.py\", 92, \"checkSnapshotRecordsState\", \"raise SnapshotToolException(log_message)\"]]}" }
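For readers who want to reproduce the error shape above, here is a minimal sketch of what such a Lambda might look like. It is not the original code: only the names lambda_handler and SnapshotToolException and the error message come from the stack trace, while check_snapshot_records_state, the "records" event key, and the "status" field are hypothetical stand-ins for the real DynamoDB lookup.

```python
class SnapshotToolException(Exception):
    """Raised while snapshots are still being created, so the state machine can retry."""


def check_snapshot_records_state(records):
    # `records` is a hypothetical list of dicts standing in for DynamoDB items;
    # the real function would read them from the table instead.
    creating = [r for r in records if r["status"] == "creating"]
    if creating:
        # Raising here is what surfaces as "error": "SnapshotToolException"
        # in the Step Functions execution history.
        raise SnapshotToolException(
            f"There are still {len(creating)} snapshots in creating state."
        )
    return len(records)


def lambda_handler(event, context):
    # Assumption: the snapshot records are passed in the event for illustration.
    return check_snapshot_records_state(event.get("records", []))
```

Because the exception class name matches the entry in "ErrorEquals", each raise triggers the Retry rule in the state machine definition.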

{
  "Comment": "Triggers check DynamoDB snapshots records lambda function",
  "StartAt": "CheckSnapshots",
  "States": {
    "CheckSnapshots": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:ACCOUNT:function:checkDynamoDBSnapshotRecords",
      "Retry": [
        {
          "ErrorEquals": ["SnapshotToolException"],
          "IntervalSeconds": 120,
          "MaxAttempts": 20,
          "BackoffRate": 30
        }
      ],
      "End": true
    }
  }
}

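Change "BackoffRate": 30 to "BackoffRate": 30.0. It seemed that leaving off the trailing .0 made the Step Function wait 30 minutes instead of the 30 seconds I took the documentation to suggest. Note, though, that the Amazon States Language defines BackoffRate as the multiplier applied to the retry interval after each attempt, not as a duration, so with "IntervalSeconds": 120 a rate of 30 means waits of 120 seconds, then 3,600 seconds, then 108,000 seconds, which by itself would leave the execution sitting in "Running" after the second try.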
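To see how the retry waits grow, the schedule can be worked out with a few lines of Python. This is only a sketch of the documented formula (each wait is the previous one multiplied by BackoffRate); the helper name retry_delays is made up for illustration, and it ignores any maximum-delay cap or jitter settings.

```python
def retry_delays(interval_seconds, backoff_rate, max_attempts):
    """Return the wait (in seconds) before each retry attempt.

    The wait before retry n (0-based) is interval_seconds * backoff_rate ** n.
    """
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


# Values from the state machine definition in the question.
delays = retry_delays(interval_seconds=120, backoff_rate=30, max_attempts=20)

for attempt, seconds in enumerate(delays[:3], start=1):
    print(f"wait before retry {attempt}: {seconds} s ({seconds / 3600:g} h)")
```

With these numbers the first retry comes after 2 minutes, the second after 1 hour, and the third after 30 hours. A rate closer to the default of 2.0 keeps the waits in a practical range.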