How to update pickled objects when adding an attribute to a Python class

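I defined a Python 3 class, then pickled an instance and saved it to a file. Later I added another instance attribute to the class, but I realized that if I load my instance and try to reference that attribute, I get an AttributeError ("object has no attribute"), because the instance was constructed without it. What is the best way to add the new attribute to my pickled objects and set it up?

In code, I defined a class like this: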

# First definition
class Foo:
  def __init__(self, params):
    # define and initialize attributes
    self.params = params
  def print_number(self):
    print(2)

I create an instance, serialize it with pickle, and save it to a file:

import pickle

inst = Foo(params)
with open("filename", 'wb') as f:
  pickle.dump(inst, f)

Later I wanted my class to behave a little differently, so I updated its definition:

# Updated definition
class Foo:
  def __init__(self, params):
    # define and initialize attributes
    self.params = params
    self.bar = "baz"                    # bar is a new attribute
  def print_number(self):
    print(3)                            # prints 3 instead of 2

Then I load my instance and try to call some methods:

import pickle

with open("filename", 'rb') as f:
  inst = pickle.load(f)

inst.print_number()
print(inst.bar)

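Because pickle does not save method definitions (it stores a reference to the class plus the instance's attributes), the instance method's behavior is updated, so inst.print_number() prints 3 instead of 2. However, referencing inst.bar raises an "object has no attribute" error, because inst was initialized before Foo defined that attribute, and unpickling does not call __init__ again.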
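The behavior described above can be reproduced in a single file. This is a minimal sketch that pickles to bytes in memory instead of a file and redefines Foo in the same module to stand in for editing the source; the number method is a hypothetical stand-in for print_number that returns its value so it can be checked.

```python
import pickle

class Foo:                               # first definition
    def __init__(self, params):
        self.params = params
    def number(self):
        return 2

# Pickled while the first definition is the one bound to the name Foo
data = pickle.dumps(Foo("params"))

class Foo:                               # updated definition, same name and module
    def __init__(self, params):
        self.params = params
        self.bar = "baz"                 # new attribute
    def number(self):
        return 3

inst = pickle.loads(data)                # __init__ is not called here
print(inst.number())                     # method is looked up on the current class
print(hasattr(inst, "bar"))              # the saved attributes predate bar
print(vars(inst))                        # only what was stored in the pickle
```

The pickle holds the class's module and name plus the instance's __dict__, so methods follow the current definition while attributes stay exactly as they were saved.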

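Update

This was a bit of a newbie question on my part. I did not realize that Python lets you write something like inst.bar = "baz" and set attributes dynamically (I come from a Java background, where everything has to be declared up front). I am still interested in how to do this properly, and/or Pythonically, and/or in a pickle-specific way, especially when multiple class updates can be expected.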
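For the dynamic approach mentioned in this update, a one-off migration could look like the sketch below: load each old pickle, set whatever is missing, and save it again. NEW_DEFAULTS and upgrade are hypothetical names for illustration, not part of pickle, and the old instance is simulated in memory rather than read from a file.

```python
import pickle

class Foo:
    def __init__(self, params):
        self.params = params
        self.bar = "baz"

# Hypothetical table of attributes added since the old pickles were written
NEW_DEFAULTS = {"bar": "baz"}

def upgrade(obj, defaults=NEW_DEFAULTS):
    """Set any attribute from `defaults` that the loaded object lacks."""
    for name, value in defaults.items():
        if not hasattr(obj, name):
            setattr(obj, name, value)
    return obj

# Simulate an instance saved before bar existed by bypassing __init__
old = Foo.__new__(Foo)
old.params = "params"
data = pickle.dumps(old)

inst = upgrade(pickle.loads(data))       # patch the loaded instance
data = pickle.dumps(inst)                # re-save so later loads already have bar
reloaded = pickle.loads(data)
print(reloaded.bar)
```

This keeps the class itself untouched, at the cost of having to remember to run the migration over every stored pickle.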

You can use class inheritance to add new methods/attributes to an existing class:

# First definition
class Foo:
    def __init__(self, params):
        self.params = params
    def print_number(self):
        print(2)

import pickle

inst = Foo('params')
with open("filename", 'wb') as f:
    pickle.dump(inst, f)

del inst

# Updated definition
class Foo(Foo):
    def __init__(self, params):
        super().__init__(params)
        self.bar = "baz"                    # bar is a new attribute
    def print_number(self):
        print(3)


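with open("filename", 'rb') as f:
    old = pickle.load(f)

# Rebuild from the saved constructor argument rather than passing the old
# instance itself as params
inst = Foo(old.params)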

inst.print_number()
print(inst.bar)

# Outputs:
# 3
# baz

Or, in practice, something like this may make more sense:

with open("filename", 'rb') as f:
    inst = pickle.load(f)

# Updated definition
class Foo(inst.__class__):
    def __init__(self, params):
        super().__init__(params)
        self.bar = "baz"                    # bar is a new attribute
    def print_number(self):
        print(3)



# Pass the saved constructor argument, not the old instance itself
inst = Foo(inst.params)
inst.print_number()
print(inst.bar)

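The general way I would approach this is to implement __setstate__. I have pasted some code below that you can play with to see how it works. You could also define a method that both __setstate__ and __init__ call with a dict (the keyword arguments for __init__, or the state for __setstate__); that ensures the expected attributes are all set regardless of how the object was created. You might also consider implementing __new__ for your class, since it is called even when unpickling (at least with pickle protocol 2 and newer, which includes the Python 3 defaults).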
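The shared-method idea from the paragraph above might look like the following sketch. The names _DEFAULTS and _apply are hypothetical, and the old instance is simulated in memory; this illustrates the pattern and is not the code from mod.py.

```python
import pickle

class A:
    # Hypothetical table of every attribute the current class expects
    _DEFAULTS = {"some_attr": 2, "new_attr": 5}

    def __init__(self, **kwargs):
        self._apply(kwargs)              # fresh construction

    def __setstate__(self, state):
        self._apply(state)               # unpickling

    def _apply(self, values):
        # Defaults first, then whatever the caller or the pickle supplied
        self.__dict__.update(self._DEFAULTS)
        self.__dict__.update(values)

# Simulate an instance saved before new_attr existed
old = A.__new__(A)
old.__dict__["some_attr"] = 7
restored = pickle.loads(pickle.dumps(old))

print(restored.some_attr)                # value carried over from the pickle
print(restored.new_attr)                 # filled in from _DEFAULTS
```

Note that pickle only calls __setstate__ when the saved state is truthy, so an instance pickled with an empty __dict__ would skip this path.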

mod.py:

VERSION = 1

if VERSION == 1:
    # Version 1
    class A:
        def __init__(self):
            # Note: either the instance's dict has to have something set or __getstate__
            # has to be overridden to return a `value` for which `bool(value) == True`
            #
            # See https://docs.python.org/3/library/pickle.html#object.__setstate__
            self.some_attr = 2
elif VERSION == 2:
    # Version 2
    class A:
        def __new__(cls):
            inst = super().__new__(cls)
            inst.other_new_attr = 6
            return inst

        def __init__(self):
            self.some_attr = 2
            self.new_attr = 5

        def __setstate__(self, state):
            print('setting state', state)
            self.__dict__.update(state)
            if not hasattr(self, 'new_attr'):
                print('adding new_attr')
                # you can do whatever you want to calculate new_attr here
                self.new_attr = 5

run.py:

import sys
from mod import A
from pickle import dump, load


if __name__ == '__main__':
    if sys.argv[1] == 'dump':
        with open('a.pickle', 'wb') as f:
            dump(A(), f)
    elif sys.argv[1] == 'load':
        # call this after adding the attribute
        with open('a.pickle', 'rb') as f:
            a = load(f)
        print(a.new_attr)
        print(a.other_new_attr)
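To see the __new__ part in isolation without switching VERSION and running two commands, here is a single-file sketch. It assumes pickle protocol 2 or newer and simulates an instance from before __new__ was added by calling object.__new__ directly.

```python
import pickle

class A:
    def __new__(cls):
        inst = super().__new__(cls)
        inst.other_new_attr = 6          # set before any saved state is applied
        return inst

    def __init__(self):
        self.some_attr = 2

# Bypass A.__new__ and A.__init__ to mimic an object created by the old class
old = object.__new__(A)
old.some_attr = 2

data = pickle.dumps(old, protocol=2)
restored = pickle.loads(data)            # calls A.__new__, then restores the state

print(hasattr(old, "other_new_attr"))
print(restored.other_new_attr)
print(restored.some_attr)
```

Because the saved state is applied after __new__ returns, any attribute present in the pickle overrides the value set in __new__.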