Objects saved in the SQLAlchemy session during process

Question

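In our Flask web application, we have some features that consist of a single function call which calls many subfunctions and does a lot behind the scenes. For example, it adds (financial) transactions to the (MSSQL) database, writes entries to a log table in the database, and changes properties of particular objects, which results in changed columns in particular tables in our database. All of this is done through objects using SQLAlchemy.

In a new approach, for the sake of testability, and because we sometimes only want to show these changes without actually committing them to the database, we have these functions return a composite Python object that contains all the changed objects. So instead of committing the database changes inside the function and subfunctions, we return the changed objects, so that we can decide outside the main function whether to show them or save them.

So the main function returns a composite object containing all these changed objects, and outside the main function we add the changed objects to our SQLAlchemy session and commit the session to the database. (Or, if we only need to show the information, we do not add and commit.) We do this by giving the composite result object a save_to_session() function, which saves our changed objects using SQLAlchemy's bulk_save_objects() operation: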

if result:
    result.save_to_session(current_app.db_session)
    current_app.db_session.commit()

def save_to_session(self, session):
    session.bulk_save_objects(self.adminlog)
    ...

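This new approach leads to errors we did not expect, raised on the current_app.db_session.commit() line. It appears that at the end of the process, when we add the returned objects to the session and try to commit the session to the database, we get an error about a duplicate key. It looks as if the returned objects have already been added to the session somewhere during the process, and SQLAlchemy tries to add them twice.

We came to this conclusion because when we comment out the call to bulk_save_objects(), the error message no longer appears, yet the changed data is still committed to the database correctly, and exactly once.

When we inspect the database after this error has occurred, there is no record with the primary key mentioned in the error message. That is because the error triggers a rollback. So it is not that the record already exists in the database; rather, the session tries to add the same record twice.

This is the error we get, using pymssql as the driver: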

sqlalchemy.exc.IntegrityError: (pymssql.IntegrityError) (2627, b"Violation of PRIMARY KEY constraint 'PK_adminlog_id'. Cannot insert duplicate key in object 'dbo.adminlog'. The duplicate key value is (0E5537FF-E45C-40C5-98FC-7B1ACAD8104E). DB-Lib error message 20018, severity 14:\n General SQL Server error: Check messages from the SQL Server\n ") [SQL: 'INSERT INTO adminlog ( alog_id, alog_ppl_id, alog_user_ppl_id, alog_user_name, alog_datetime, [alog_ipAddress], [alog_macAddress], alog_comment, alog_type, alog_act_id, alog_comp_id, alog_artc_id) VALUES ( %(alog_id)s, %(alog_ppl_id)s, %(alog_user_ppl_id)s, %(alog_user_name)s, %(alog_datetime)s, %(alog_ipAddress)s, %(alog_macAddress)s, %(alog_comment)s, %(alog_type)s, %(alog_act_id)s, %(alog_comp_id)s, %(alog_artc_id)s)'] [parameters: ( {'alog_act_id': None, 'alog_comment': 'Le service a été ajouté. Cours Coll (119,88)', 'alog_datetime': datetime.datetime(2018, 10, 29, 13, 46, 54, 837178), 'alog_macAddress': b'4A-NO-NY-MO-US', 'alog_type': b'user', 'alog_artc_id': None, 'alog_comp_id': None, 'alog_id': b'0E5537FF-E45C-40C5-98FC-7B1ACAD8104E', 'alog_user_ppl_id': b'99999999-9999-9999-1111-999999999999', 'alog_user_name': 'System', 'alog_ipAddress': b'0.0.0.0', 'alog_ppl_id': b'AE841D1C-5D8D-47F7-B81F-89C5C931BD14'}, {'alog_act_id': None, 'alog_comment': 'Le service a été supprimé. 01/12/2019 Cours Coll (119,88)', 'alog_datetime': datetime.datetime(2018, 10, 29, 13, 46, 55, 71600), 'alog_macAddress': b'4A-NO-NY-MO-US', 'alog_type': b'user', 'alog_artc_id': None, 'alog_comp_id': None, 'alog_id': b'E22176FB-7490-470F-A8BA-A35D5F55A96A', 'alog_user_ppl_id': b'99999999-9999-9999-1111-999999999999', 'alog_user_name': 'System', 'alog_ipAddress': b'0.0.0.0', 'alog_ppl_id': b'AE841D1C-5D8D-47F7-B81F-89C5C931BD14'} )]

We get a similar error when using PyODBC:

sqlalchemy.exc.IntegrityError: (pyodbc.IntegrityError) ('23000', "[23000] [Microsoft][SQL Server Native Client 11.0][SQL Server]Violation of PRIMARY KEY constraint 'PK_adminlog_id'. Cannot insert duplicate key in object 'dbo.adminlog'. The duplicate key value is (F5CABD8F-E000-4677-8F5F-78B4CD3B9560). (2627) (SQLExecDirectW); [23000] [Microsoft][SQL Server Native Client 11.0][SQL Server]The statement has been terminated. (3621)") [SQL: 'INSERT INTO adminlog ( alog_id, alog_ppl_id, alog_user_ppl_id, alog_user_name, alog_datetime, [alog_ipAddress], [alog_macAddress], alog_comment, alog_type, alog_act_id, alog_comp_id, alog_artc_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)'] [parameters: (( b'F5CABD8F-E000-4677-8F5F-78B4CD3B9560', b'0D10D3EF-F37E-45BE-8EED-B5987AE80732', b'99999999-9999-9999-1111-999999999999', 'System', datetime.datetime(2018, 10, 29, 13, 51, 30, 555495), b'0.0.0.0', b'4A-NO-NY-MO-US', 'Le service a été ajouté. Cours Coll (119,88)', b'user', None, None, None), ( b'39395ACA-0AFB-4C5F-90D4-0C6F95D7B8BC', b'0D10D3EF-F37E-45BE-8EED-B5987AE80732', b'99999999-9999-9999-1111-999999999999', 'System', datetime.datetime(2018, 10, 29, 13, 51, 30, 777909), b'0.0.0.0', b'4A-NO-NY-MO-US', 'Le service a été supprimé. 01/12/2019 Cours Coll (119,88)', b'user', None, None, None) )]

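My question is, is there an automatic process that adds (changed) objects to the session, without us using session.add()? Is there an option in SQLAlchemy to disable this behaviour and only commit to the session when it's explicitly done using session.add(object)?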

Answer 1

My question is, is there an automatic process that adds (changed) objects to the session, without us using session.add()?

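There is at least one feature that pulls objects into a Session without them being added explicitly: the save-update cascade. When an object is added to a Session, all objects associated with it through relationship() attributes that have this cascade configured are placed in the Session as well. The same happens when an object is associated with another object that is already in a Session.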
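The cascade described above can be observed with a small sketch. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the Person and LogEntry models are hypothetical stand-ins, not the models from the question.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Person(Base):  # hypothetical model, for illustration only
    __tablename__ = "person"
    id = Column(Integer, primary_key=True)
    # relationship() defaults to cascade="save-update, merge".
    logs = relationship("LogEntry", back_populates="person")


class LogEntry(Base):  # hypothetical model, for illustration only
    __tablename__ = "log_entry"
    id = Column(Integer, primary_key=True)
    person_id = Column(ForeignKey("person.id"))
    person = relationship("Person", back_populates="logs")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

person = Person()
first = LogEntry()
person.logs.append(first)

# Only `person` is added explicitly; `first` follows via the cascade.
session.add(person)
first_in_session = first in session

# `person` is already in the session, so associating a new object
# with it pulls that object into the session as well.
second = LogEntry()
person.logs.append(second)
second_in_session = second in session

print(first_in_session, second_in_session)
```

Neither LogEntry was ever passed to session.add(), yet both end up pending in the session.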

Is there an option in SQLAlchemy to disable this behaviour and only commit to the session when it's explicitly done using session.add(object)?

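You can of course configure your relationship() attributes so that they do not include this behaviour, but there does not seem to be a global switch for disabling cascades entirely.

If this is what is happening in your code, then the reason the objects are added twice is that you have done so explicitly. The bulk operations omit most of the Session's more advanced features in favour of raw performance. For example, they do not coordinate with the Session if an object has already been persisted, and they do not attach the persisted objects to the Session: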
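As a rough sketch of such a configuration, the cascade argument of relationship() can list only the cascades that are wanted. Here it keeps "merge" and leaves out "save-update". The models are hypothetical, and no database is needed because nothing is flushed.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Person(Base):  # hypothetical model, for illustration only
    __tablename__ = "person"
    id = Column(Integer, primary_key=True)
    # The default is "save-update, merge"; listing only "merge"
    # switches the save-update cascade off for this relationship.
    logs = relationship("LogEntry", cascade="merge")


class LogEntry(Base):  # hypothetical model, for illustration only
    __tablename__ = "log_entry"
    id = Column(Integer, primary_key=True)
    person_id = Column(ForeignKey("person.id"))


session = Session()  # unbound session; fine as long as nothing is flushed

person = Person()
log = LogEntry()
person.logs.append(log)

session.add(person)
log_cascaded = log in session   # the related object was not pulled in

session.add(log)                # it now has to be added explicitly
log_added = log in session

print(log_cascaded, log_added)
```

Note that with this setting every related object must be added by hand before the flush; otherwise SQLAlchemy skips it with a warning rather than inserting it.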

The objects as given have no defined relationship to the target Session, even when the operation is complete, meaning there’s no overhead in attaching them or managing their state in terms of the identity map or session.
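The following sketch tries to reproduce the double insert under simplified assumptions: SQLAlchemy 1.4 or later, in-memory SQLite instead of MSSQL, and a cut-down AdminLog model with only two of the columns from the error message. The explicit session.add() stands in for whatever cascade put the object into the session in the real application.

```python
from sqlalchemy import Column, String, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class AdminLog(Base):  # simplified stand-in for the adminlog table
    __tablename__ = "adminlog"
    alog_id = Column(String(36), primary_key=True)
    alog_comment = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

entry = AdminLog(
    alog_id="0E5537FF-E45C-40C5-98FC-7B1ACAD8104E",
    alog_comment="service added",
)

# Stand-in for the cascade: the object is already pending in the session.
session.add(entry)

try:
    # The bulk operation does not check whether the session already
    # tracks the object, so the same row gets inserted a second time
    # when the pending object is flushed on commit.
    session.bulk_save_objects([entry])
    session.commit()
    duplicate_detected = False
except IntegrityError:
    session.rollback()
    duplicate_detected = True

# The rollback discards both inserts, so no row with that key remains.
rows_after_rollback = session.query(AdminLog).count()
print(duplicate_detected, rows_after_rollback)
```

Dropping the bulk_save_objects() call leaves a single insert at commit time, which matches what the question observed.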

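As to why the problem arises in the first place: you should not need to keep a manual "staging area" for your objects, which is what your composite object is. This is exactly what the Session is for, in combination with proper use of transactions. The functions and subfunctions should add objects to the Session when it makes sense, but they should not control the ongoing transaction. That should happen only outside your main function, where you currently handle your composite object. If the transaction is rolled back, all the changes are gone.

In tests you can pass around a Session that has joined an external transaction, which will be rolled back explicitly no matter what the code under test does.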
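A minimal sketch of that division of labour, assuming SQLAlchemy 1.4 or later and in-memory SQLite. The add_service() helper and the simplified AdminLog model are hypothetical; the point is only that the subfunction stages changes while the caller alone decides between commit and rollback.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class AdminLog(Base):  # simplified, hypothetical model
    __tablename__ = "adminlog"
    id = Column(Integer, primary_key=True)
    comment = Column(String)


def add_service(session, name):
    """Subfunction: stages a change, never commits or rolls back."""
    session.add(AdminLog(comment=f"Service added: {name}"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

# "Show only": inspect what is pending, then discard it.
add_service(session, "preview")
preview = [log.comment for log in session.new]
session.rollback()
rows_after_preview = session.query(AdminLog).count()

# "Save": the same subfunction, but the caller commits.
add_service(session, "real")
session.commit()
rows_after_save = session.query(AdminLog).count()

print(preview, rows_after_preview, rows_after_save)
```

Here session.new plays the role of the composite result object: it lists the pending objects without anything being written.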
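One way this could look, as a sketch assuming SQLAlchemy 1.4 or later and in-memory SQLite. The code_under_test() function and the model are hypothetical. The Session is bound to a Connection whose transaction the test itself began, so the commit inside the code under test does not end that outer transaction.

```python
from sqlalchemy import Column, Integer, String, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class AdminLog(Base):  # simplified, hypothetical model
    __tablename__ = "adminlog"
    id = Column(Integer, primary_key=True)
    comment = Column(String)


def code_under_test(session):
    """Hypothetical application code that commits on its own."""
    session.add(AdminLog(comment="written during test"))
    session.commit()


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

# Test setup: begin the outer transaction on a connection we control.
connection = engine.connect()
outer = connection.begin()
session = Session(bind=connection)

code_under_test(session)
visible_inside = session.query(AdminLog).count()

# Test teardown: roll back everything the code under test did.
session.close()
outer.rollback()
rows_left = connection.execute(
    select(func.count()).select_from(AdminLog.__table__)
).scalar()
connection.close()

print(visible_inside, rows_left)
```

If the code under test also calls session.rollback(), more care is needed. In SQLAlchemy 2.0 the Session accepts join_transaction_mode="create_savepoint" for that case; in 1.4 the documentation describes a savepoint-based recipe instead.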

Tags: python, sqlalchemy, pyodbc, flask, flask-sqlalchemy