Unable to copy object from bucket to another bucket in Lambda function

I have a Lambda function that I use to copy files of a specific format to another bucket, triggered by PUT events on the source bucket. No errors show up in the CloudWatch logs, but the code does not copy the file. This only happens with this date-partitioned key.

Lambda event

{
  "Records": [
    {
      "s3": {
        "s3SchemaVersion": "1.0",
        "configurationId": "lasic2-artifacts",
        "bucket": {
          "name": "BUCKETNAME",
          "arn": "arn:aws:s3:::BUCKETNAME"
        },
        "object": {
          "key": "models/operatorai-model-store/lasic2/2022/03/08/10%3A21%3A05/artifacts.tar.gz"
        }
      }
    }
  ]
}

Lambda function

import boto3
from botocore.exceptions import ClientError

print("Loading function")

s3 = boto3.client("s3", region_name="us-east-1")

class NoRecords(Exception):
    """
    Exception thrown when there are no records found from
    s3:ObjectCreated:Put
    """

def get_source(bucket, key):
    """
    Returns the source object to be passed when copying over the contents from
    bucket A to bucket B
    :param bucket: name of the bucket to copy the key from
    :param key: the path of the object to copy
    """
    return {
        "Bucket": bucket,
        "Key": key,
    }


def process_record(
    record,
    production_bucket,
    staging_bucket,
):
    """
    Process an individual record (an example record can be found here:
    https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html#test-manual-invoke)
    :param record: a record from s3:ObjectCreated:Put
    :param production_bucket: name of the production bucket which comes from
    the records
    :param staging_bucket: name of the staging bucket to save the key from the
    production_bucket into
    """
    key = record["s3"]["object"]["key"]
    print(f"Key: \n{key}")
    try:
        s3_response = s3.get_object(Bucket=production_bucket, Key=key)
        s3_object = s3_response["Body"].read()
        copy_source = get_source(bucket=production_bucket, key=key)
        s3.copy_object(
            Bucket=staging_bucket,
            Key=key,
            CopySource=copy_source,
            ACL="bucket-owner-full-control",
        )
    except ClientError as error:
        error_code = error.response["Error"]["Code"]
        error_message = error.response["Error"]["Message"]
        if error_code == "NoSuchBucket":
            print(error_message)
            raise
    except Exception as error:
        print(f"Failed to upload {key}")
        print(error)
        raise


def lambda_handler(event, _):
    print(f"Event: \n{event}")
    records = event["Records"]
    num_records = len(records)
    if num_records == 0:
        raise NoRecords("No records found")
    record = records[0]
    production_bucket = record["s3"]["bucket"]["name"]
    staging_bucket = f"{production_bucket}-staging"
    process_record(
        record=record,
        production_bucket=production_bucket,
        staging_bucket=staging_bucket,
    )

Take a look at the key you are receiving in the event:

"key": "models/operatorai-model-store/lasic2/2022/03/08/10%3A21%3A05/artifacts.tar.gz"

You can see that the object key here is URL encoded. The documentation is explicit about this:

The s3 key provides information about the bucket and object involved in the event. The object key name value is URL encoded. For example, "red flower.jpg" becomes "red+flower.jpg" (Amazon S3 returns "application/x-www-form-urlencoded" as the content type in the response).
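
For example, decoding the two encoded forms mentioned in the documentation with Python's standard urllib shows what the SDK actually expects; note that unquote_plus also turns "+" back into a space, which plain unquote would not:

import urllib.parse

print(urllib.parse.unquote_plus("red+flower.jpg"))  # red flower.jpg
print(urllib.parse.unquote_plus("10%3A21%3A05"))    # 10:21:05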

Since all of the SDK APIs you call through boto3 expect the unencoded key, you need to decode the object key before using it in your Lambda function:

import urllib.parse
# ....
    key = record["s3"]["object"]["key"]
    key = urllib.parse.unquote_plus(key)
    print(f"Key: \n{key}")