无法使用 Python 3 写入 gzip.open() 将压缩文件上传到云存储
Cannot upload compressed file to Cloud Storage with Python 3 writing with gzip.open()
当我尝试在云 Shell 实例上使用 python 脚本将压缩的 gzip 文件上传到云存储时,它总是上传一个空文件。
这是重现错误的代码:
import gzip
from google.cloud import storage
storage_client = storage.Client()
list=['hello', 'world', 'please', 'upload']
out_file=gzip.open('test.gz', 'wt')
for line in list:
out_file.write(line + '\n')
out_file.close
out_bucket = storage_client.bucket('test-bucket')
out_blob = out_bucket.blob('test')
out_blob.upload_from_filename('test.gz')
它只在我的存储桶中上传了一个名为 'test' 的空文件,这不是我所期望的。
但是,我写在我的云 Shell 中的文件不是空的,因为当我这样做时 zcat test.gz
它显示了预期的内容:
hello
world
please
upload
要了解您的代码中发生了什么,请参阅以下 gzip docs 的描述:
Calling a GzipFile object’s close() method does not close fileobj, since you might wish to append more material after the compressed data.
这解释了为什么未关闭文件对象会影响文件的上传。这是一个 ,它描述了您的代码在未关闭 fileobj 时的行为,其中:
The warning about fileobj not being closed only applies when you open the file, and pass it to the GzipFile via the fileobj= parameter. When you pass only a filename, GzipFile "owns" the file handle and will also close it.
解决方案是不通过 fileobj = parameter
传递 gzipfile 并像这样重写它:
import gzip
from google.cloud import storage
storage_client = storage.Client()
list=['hello', 'world', 'please', 'upload']
with gzip.open('test.gz', 'rt') as f_in, gzip.open('test.gz', 'wt') as f_out:
for line in list:
f_out.writelines(line + '\n')
out_bucket = storage_client.bucket('test-bucket')
out_blob = out_bucket.blob('test.gz') # include file format in dest filename
out_blob.upload_from_filename("test.gz")
当我尝试在云 Shell 实例上使用 python 脚本将压缩的 gzip 文件上传到云存储时,它总是上传一个空文件。
这是重现错误的代码:
import gzip
from google.cloud import storage
storage_client = storage.Client()
list=['hello', 'world', 'please', 'upload']
out_file=gzip.open('test.gz', 'wt')
for line in list:
out_file.write(line + '\n')
out_file.close
out_bucket = storage_client.bucket('test-bucket')
out_blob = out_bucket.blob('test')
out_blob.upload_from_filename('test.gz')
它只在我的存储桶中上传了一个名为 'test' 的空文件,这不是我所期望的。
但是,我写在我的云 Shell 中的文件不是空的,因为当我这样做时 zcat test.gz
它显示了预期的内容:
hello
world
please
upload
要了解您的代码中发生了什么,请参阅以下 gzip docs 的描述:
Calling a GzipFile object’s close() method does not close fileobj, since you might wish to append more material after the compressed data.
这解释了为什么未关闭文件对象会影响文件的上传。这是一个
The warning about fileobj not being closed only applies when you open the file, and pass it to the GzipFile via the fileobj= parameter. When you pass only a filename, GzipFile "owns" the file handle and will also close it.
解决方案是不通过 fileobj = parameter
传递 gzipfile 并像这样重写它:
import gzip
from google.cloud import storage
storage_client = storage.Client()
list=['hello', 'world', 'please', 'upload']
with gzip.open('test.gz', 'rt') as f_in, gzip.open('test.gz', 'wt') as f_out:
for line in list:
f_out.writelines(line + '\n')
out_bucket = storage_client.bucket('test-bucket')
out_blob = out_bucket.blob('test.gz') # include file format in dest filename
out_blob.upload_from_filename("test.gz")