How to download images from S3 as numpy arrays?

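I'm trying to train a neural network that I feed a series of images. I'd like to create a generator that yields each image as a numpy array. Given a URL, I know I can read an image like this: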

from skimage import io
image_array = io.imread(url)

I'd like something like this:

import boto3

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('some-bucket')
def my_generator():
    for obj in my_bucket.objects.all():
        # This will not work: obj is of type s3.ObjectSummary, e.g.
        # s3.ObjectSummary(bucket_name='manga-learn-data', key=u'one-piece-colored-5340113_06_05.png'),
        # whereas io.imread is expecting a url or uri.
        image_array = io.imread(obj)
        yield image_array

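But that image_array variable will never work. Everything I can find about downloading images from Amazon S3 says to download the object to a file. I want to download it into an image object that can be opened as an array.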

Based on the docs for imread, it appears that it only supports passing in a filename or a URL, not file-like objects. So it looks like you have two options: save to a temp file, or generate a presigned URL and pass that in. Here's an example of generating a presigned URL:

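import boto3
from skimage import io

s3 = boto3.client('s3')
params = {'Bucket': 'foo', 'Key': 'img.jpg'}
# The presigned URL is a temporary link to the object (it expires after one hour by default).
url = s3.generate_presigned_url(ClientMethod='get_object', Params=params)
image_array = io.imread(url)  # imread accepts a URL and returns a numpy array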
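
For the other option mentioned above, saving to a temp file, here is a rough sketch of what a generator could look like. The function name `iter_arrays_via_tempfile` and its `download` and `read_image` parameters are made up for illustration; they are passed in as callables so the temp-file handling can be tried without touching S3. It only assumes each object has a `key` attribute, as `s3.ObjectSummary` does.

```python
import os
import tempfile


def iter_arrays_via_tempfile(objects, download, read_image):
    """Yield one decoded image per object, going through a temporary file.

    objects    -- iterable of items with a .key attribute (e.g. s3.ObjectSummary)
    download   -- hypothetical callable (obj, path) that writes the object to path
    read_image -- hypothetical callable (path) that returns the image array
    """
    for obj in objects:
        # Keep the original extension so the reader can guess the image format.
        suffix = os.path.splitext(obj.key)[1]
        fd, path = tempfile.mkstemp(suffix=suffix)
        os.close(fd)  # only the path is needed; the downloader reopens the file
        try:
            download(obj, path)
            yield read_image(path)
        finally:
            # Remove the temp file once the consumer asks for the next image.
            os.remove(path)
```

With the bucket from the question, this could presumably be called as `iter_arrays_via_tempfile(my_bucket.objects.all(), lambda obj, path: my_bucket.download_file(obj.key, path), io.imread)`. Each file is deleted as soon as the generator moves on, so only one image sits on disk at a time.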