How to add pandas data to an existing csv file in Google Cloud Storage?

I am using pandas to save a file as a CSV in my Google Cloud Storage bucket. The problem is that when I write data again, my file gets overwritten.

    url = "gs://mybucket/my.csv"
    df.to_csv(url, mode="a", index=False, header=False)

Yet I have specified the write mode as "a" so that later writes append to the file instead of rewriting it.

Thank you very much for your help :)

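Google Cloud Storage objects are immutable, which means you cannot modify an object once it has been created. You have to implement a read-modify-write cycle and replace the existing object.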

Object immutability

Objects are immutable, which means that an uploaded object cannot change throughout its storage lifetime. An object's storage lifetime is the time between successful object creation, such as uploading, and successful object deletion. In practice, this means that you cannot make incremental changes to objects, such as append operations or truncate operations. However, it is possible to replace objects that are stored in Cloud Storage, and doing so happens atomically: until the new upload completes, the old version of the object is served to readers, and after the upload completes the new version of the object is served to readers. So a single replacement operation simply marks the end of one immutable object's lifetime and the beginning of a new immutable object's lifetime.
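
The read-modify-write cycle could be sketched roughly as below. This is only an illustration, not a definitive implementation: the helper name `append_csv_rows` is hypothetical, and `blob` is assumed to be a `google.cloud.storage.Blob` (the sketch only relies on its `download_as_text` and `upload_from_string` methods). The sketch also assumes a single writer; concurrent writers could overwrite each other's changes.

```python
def append_csv_rows(blob, new_rows_csv):
    """Hypothetical helper: simulate an append by rewriting the whole object.

    `blob` is assumed to behave like google.cloud.storage.Blob.
    `new_rows_csv` is CSV text without a header row.
    """
    # Read: download the current contents of the object.
    existing = blob.download_as_text()
    # Modify: make sure the old data ends with a newline, then add the new rows.
    if existing and not existing.endswith("\n"):
        existing += "\n"
    combined = existing + new_rows_csv
    # Write: upload the result, which replaces the old object atomically.
    blob.upload_from_string(combined, content_type="text/csv")


def example():
    # Not called automatically; needs pandas, google-cloud-storage and credentials.
    import pandas as pd
    from google.cloud import storage

    df = pd.DataFrame({"a": [1], "b": [2]})  # illustrative data
    blob = storage.Client().bucket("mybucket").blob("my.csv")
    # With no path argument, to_csv returns the CSV text as a string.
    append_csv_rows(blob, df.to_csv(index=False, header=False))
```

Note that every append downloads and re-uploads the entire file, so this gets slower as the CSV grows.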

Google also supports the Compose API, which combines two or more objects into a new Cloud Storage object.

Composing Objects

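Using the Compose API, you can upload the data to append to a temporary object and then compose the original object with that temporary object. This simulates appending to a file.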
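
A minimal sketch of the compose approach, assuming the `google-cloud-storage` Python client: `bucket` is assumed to be a `google.cloud.storage.Bucket`, and the helper name `append_with_compose` and the temporary object naming scheme are hypothetical. It also assumes the existing object already ends with a newline, because compose concatenates the bytes as they are.

```python
import uuid


def append_with_compose(bucket, object_name, new_rows_csv):
    """Hypothetical helper: simulate an append with the Compose API.

    `bucket` is assumed to behave like google.cloud.storage.Bucket.
    `new_rows_csv` is CSV text without a header row.
    """
    target = bucket.blob(object_name)
    target.content_type = "text/csv"  # content type for the composed result
    # Upload the new rows to a uniquely named temporary object.
    temp = bucket.blob(f"{object_name}.append-{uuid.uuid4().hex}")
    temp.upload_from_string(new_rows_csv, content_type="text/csv")
    try:
        # Compose original + temporary object back into the original name.
        target.compose([target, temp])
    finally:
        # Remove the temporary object whether or not compose succeeded.
        temp.delete()


def example():
    # Not called automatically; needs pandas, google-cloud-storage and credentials.
    import pandas as pd
    from google.cloud import storage

    df = pd.DataFrame({"a": [1], "b": [2]})  # illustrative data
    bucket = storage.Client().bucket("mybucket")
    append_with_compose(bucket, "my.csv", df.to_csv(index=False, header=False))
```

Compared with read-modify-write, only the new rows are uploaded here; the concatenation happens on the Cloud Storage side.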