SQLAlchemy UPDATE whose FROM is an inline SELECT rather than an explicit table

How can I use the SQLAlchemy Core API (aka SQLAlchemy Expression Language) to perform a SQL UPDATE whose FROM is an inline SELECT rather than an explicit table?

For example, how can I express the following query using SQLAlchemy rather than a SQL string?

UPDATE data
SET invalid = True 
FROM (
    SELECT data.person_id as inner_col 
    FROM data
    LEFT JOIN people
    ON people.id = data.person_id
    WHERE people.id IS NULL
) inner_query
WHERE inner_query.inner_col = data.person_id

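I want to be able to construct queries like this dynamically, using variables for the table and column names, and I want to use the SQLAlchemy Expression Language to protect against SQL injection attacks. I am also hoping SQLAlchemy will yield more maintainable code than a pile of string concatenation (especially if, for example, the columns joined on in the inner query come from a dynamic list).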


For reference, here is a complete example whose behavior I would like to reproduce without using sqlalchemy.text:

#!/usr/bin/env python3

import sqlalchemy

# create and connect to in memory database 
engine = sqlalchemy.create_engine('sqlite://') 
metadata_obj = sqlalchemy.MetaData()
people = sqlalchemy.Table(
    'people', metadata_obj,
    sqlalchemy.Column('id', sqlalchemy.Integer),
    sqlalchemy.Column('name', sqlalchemy.String),
)
data = sqlalchemy.Table(
    'data', metadata_obj,
    sqlalchemy.Column('person_id', sqlalchemy.Integer),
    sqlalchemy.Column('data_foo', sqlalchemy.String),
    sqlalchemy.Column('invalid', sqlalchemy.Boolean),
)
metadata_obj.create_all(engine)
conn = engine.connect()

def create_records():
    conn.execute(people.insert().values(id=1, name='Mary'))
    conn.execute(people.insert().values(id=2, name='James'))

    conn.execute(data.insert().values(person_id=1, data_foo='good foo', invalid=None))
    conn.execute(data.insert().values(person_id=42, data_foo='chop suey', invalid=None))

def dynamic_update(referenced_table, referenced_column, referencing_table, referencing_column):
    sql = f"""
UPDATE {referencing_table}
SET invalid = True 
FROM (
    SELECT {referencing_table}.{referencing_column} as inner_col 
    FROM {referencing_table}
    LEFT JOIN {referenced_table}
    ON {referenced_table}.{referenced_column} = {referencing_table}.{referencing_column}
    WHERE {referenced_table}.{referenced_column} IS NULL
) inner_query
WHERE inner_query.inner_col = {referencing_table}.{referencing_column}
    """
    conn.execute(sqlalchemy.text(sql))
    

def print_records(msg):
    print(f'{msg}:')
    print("  people:")
    for person in conn.execute(sqlalchemy.sql.select(people)):
        print('    ', person)
    print("  data:")
    for datum in conn.execute(sqlalchemy.sql.select(data)):
        print('    ', datum)
    print()

create_records()
print_records('initial values')
dynamic_update(
    referenced_table="people",
    referenced_column="id",
    referencing_table="data",
    referencing_column="person_id",
)
# Instead of passing strings, I really want to do:
# dynamic_update(people, people.c.id, data, data.c.person_id)
print_records('after update')

This is the output when run with Python 3.9 and SQLAlchemy 1.4.36:

initial values:
  people:
     (1, 'Mary')
     (2, 'James')
  data:
     (1, 'good foo', None)
     (42, 'chop suey', None)

after update:
  people:
     (1, 'Mary')
     (2, 'James')
  data:
     (1, 'good foo', None)
     (42, 'chop suey', True)
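
The rows in this example can also be flagged without UPDATE ... FROM, by using a NOT IN subquery. The function takes Table and Column objects and is called as dynamic_update(people, people.c.id, data, data.c.person_id):

def dynamic_update(referenced_table, referenced_column, referencing_table, referencing_column):
    # Values present in the referenced column, e.g. SELECT people.id FROM people
    valid_referenced_values = sqlalchemy.sql.select(referenced_column)
    # Flag every referencing row whose value is not among them.
    # Caveat: if referenced_column contains NULLs, NOT IN matches no rows.
    conn.execute(
        referencing_table.update()
        .where(referencing_column.not_in(valid_referenced_values))
        .values(invalid=True)
    )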

SQLAlchemy executes this expression as the following SQL query:

UPDATE data
SET invalid=?
WHERE (data.person_id NOT IN (SELECT people.id FROM people))
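
If the literal UPDATE ... FROM (SELECT ...) shape from the question is wanted, something along the following lines may work. This is only a sketch, assuming SQLAlchemy 1.4 or later, where an UPDATE whose WHERE clause refers to another selectable is rendered with a FROM clause on backends that support it. The helper name build_update_from is hypothetical, and the statement is only compiled against the PostgreSQL dialect to show the generated SQL; it is not executed here.

```python
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql

metadata_obj = sa.MetaData()
people = sa.Table(
    "people", metadata_obj,
    sa.Column("id", sa.Integer),
    sa.Column("name", sa.String),
)
data = sa.Table(
    "data", metadata_obj,
    sa.Column("person_id", sa.Integer),
    sa.Column("data_foo", sa.String),
    sa.Column("invalid", sa.Boolean),
)


def build_update_from(referenced_column, referencing_column):
    """Hypothetical helper: build UPDATE ... FROM (SELECT ...) from two Column objects."""
    referenced_table = referenced_column.table
    referencing_table = referencing_column.table

    # Inner SELECT: referencing values that have no match in the referenced table.
    inner_query = (
        sa.select(referencing_column.label("inner_col"))
        .select_from(
            referencing_table.outerjoin(
                referenced_table, referenced_column == referencing_column
            )
        )
        .where(referenced_column.is_(None))
        .subquery("inner_query")
    )

    # Referring to the subquery in WHERE makes it an extra FROM element of the UPDATE.
    return (
        sa.update(referencing_table)
        .where(inner_query.c.inner_col == referencing_column)
        .values(invalid=True)
    )


stmt = build_update_from(people.c.id, data.c.person_id)
# Compile only; no database connection or driver is needed for this step.
compiled_sql = str(stmt.compile(dialect=postgresql.dialect()))
print(compiled_sql)
```

Passing the statement to conn.execute(stmt) should then behave like the sqlalchemy.text version on backends that accept UPDATE ... FROM. SQLite added that syntax in version 3.33.0, so the in-memory example above depends on the SQLite library bundled with the Python build.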