无法使用 Python 3 写入 gzip.open() 将压缩文件上传到云存储

Cannot upload compressed file to Cloud Storage with Python 3 writing with gzip.open()

当我尝试在云 Shell 实例上使用 python 脚本将压缩的 gzip 文件上传到云存储时,它总是上传一个空文件。

这是重现错误的代码:

import gzip
from google.cloud import storage

storage_client = storage.Client()

list=['hello', 'world', 'please', 'upload']

out_file=gzip.open('test.gz', 'wt')
    for line in list:
    out_file.write(line + '\n')
out_file.close

out_bucket = storage_client.bucket('test-bucket')
out_blob = out_bucket.blob('test')
out_blob.upload_from_filename('test.gz')

它只在我的存储桶中上传了一个名为 'test' 的空文件,这不是我所期望的。

但是,我写在我的云 Shell 中的文件不是空的,因为当我这样做时 zcat test.gz 它显示了预期的内容:

hello
world
please
upload

要了解您的代码中发生了什么,请参阅以下 gzip docs 的描述:

Calling a GzipFile object’s close() method does not close fileobj, since you might wish to append more material after the compressed data.

这解释了为什么未关闭文件对象会影响文件的上传。这是一个 ,它描述了您的代码在未关闭 fileobj 时的行为,其中:

The warning about fileobj not being closed only applies when you open the file, and pass it to the GzipFile via the fileobj= parameter. When you pass only a filename, GzipFile "owns" the file handle and will also close it.

解决方案是不通过 fileobj = parameter 传递 gzipfile 并像这样重写它:

import gzip
from google.cloud import storage

storage_client = storage.Client()

list=['hello', 'world', 'please', 'upload']

with gzip.open('test.gz', 'rt') as f_in, gzip.open('test.gz', 'wt') as f_out: 
  for line in list:
    f_out.writelines(line + '\n')

out_bucket = storage_client.bucket('test-bucket')
out_blob = out_bucket.blob('test.gz') # include file format in dest filename
out_blob.upload_from_filename("test.gz")