Preventing duplicate child entries in an ORM relationship

Basically, I have a service that reads from a spreadsheet and inserts into a database. In SQLAlchemy, I have the following relationship:

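from sqlalchemy import Boolean, Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, joinedload, relationship

Base = declarative_base()

class Customer(Base):
    __tablename__ = 'customers'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    # named `emails` to match the Customer.emails access used below
    emails = relationship('Email', backref='customer')

class Email(Base):
    __tablename__ = 'emails'
    id = Column(Integer, primary_key=True)
    # named customer_id so it does not clash with the `customer` backref
    customer_id = Column(Integer, ForeignKey('customers.id'))
    email = Column(String)
    primary = Column(Boolean)

Can SQLAlchemy check for duplicate entries between the resources it fetches and the ones created in the ORM? For example, suppose customer 123 already has the email some_email, and we try to add it again:

email_object = Email(customer_id=123, email='some_email', primary=True)
# `connection` is a Session here
cust = connection.query(Customer).options(joinedload(Customer.emails)).filter_by(
        id=123).first()
cust.emails.append(email_object)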

Ideally, I'd like SQLAlchemy to notice that such a combination already exists and merge/ignore it, or raise some kind of exception.

However, if I print out cust.emails, I get the following:
[<Email(id=1, email=some_email, primary=True, customer=123>), 
<Email(customer=192071, email='some_email', primary=True, customers=<Employee(id=123, name='John', emails=['some_email', 'some_email']>>)]

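Merging and committing just seems to add an extra identical row (apart from the pk) to the database. I think this may have something to do with the primary key not being set on the Email, but that is autogenerated on commit to the database. Any ideas? Let me know if anything needs clarifying.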

Setting the Email class to have two primary keys doesn't seem to stop SQLAlchemy from appending the extra email.

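That's right. Using a composite primary key on (customer_id, email) will not stop SQLAlchemy from trying to insert a new object that essentially duplicates an existing email. It will, however, warn you that an object with the same primary key already exists in the identity map, and the INSERT will fail because of the duplicate PK (an exception is raised and the transaction is rolled back), which prevents the duplicate child record.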
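To make that behaviour concrete, here is a minimal sketch. It assumes an in-memory SQLite database and a version of the models in which Email uses (customer_id, email) as its composite primary key. The engine URL and the names duplicate_rejected and row_count are only for illustration and are not from the original post.

```python
import warnings

from sqlalchemy import Boolean, Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Customer(Base):
    __tablename__ = "customers"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    emails = relationship("Email", backref="customer")


class Email(Base):
    __tablename__ = "emails"
    # Composite primary key: at most one row per (customer, address) pair.
    customer_id = Column(Integer, ForeignKey("customers.id"), primary_key=True)
    email = Column(String, primary_key=True)
    primary = Column(Boolean)


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

# Seed John with one email.
with Session(engine) as session:
    john = Customer(id=123, name="John")
    john.emails.append(Email(email="some_email", primary=True))
    session.add(john)
    session.commit()

# Try to append the same email again.
duplicate_rejected = False
with Session(engine) as session:
    john = session.get(Customer, 123)
    john.emails.append(Email(email="some_email", primary=True))
    try:
        with warnings.catch_warnings():
            # Silence the identity-map conflict warning emitted during flush.
            warnings.simplefilter("ignore")
            session.commit()
    except IntegrityError:
        # The database rejected the duplicate primary key.
        session.rollback()
        duplicate_rejected = True

# Count the emails actually stored for John.
with Session(engine) as session:
    row_count = len(session.get(Customer, 123).emails)

print(duplicate_rejected, row_count)
```

So the append itself is not blocked. The duplicate is only rejected when the session flushes, which is why checking beforehand is usually more pleasant than catching the error afterwards.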

If you want to check whether the email exists before trying to add it, you can use session.get():

with Session(engine) as session:
    # retrieve John's object
    john = (
        session.execute(select(Customer).where(Customer.name == "John"))
        .scalars()
        .one()
    )
    print(john)  # <Customer(id=123, name='John')>

    # check if email already exists using .get()
    email = session.get(Email, (john.id, "some_email"))
    if email:
        print(f"email already exists: {email}")
        # email already exists: <Email(customer_id=123, email='some_email')>
    else:
        print("email does not already exist")

…or the relationship on Customer can supply the existing emails, letting you search them for the one you want to add:

    # alternative (less efficient) method: check via relationship
    e_list = [e for e in john.emails if e.email == "some_email"]
    if e_list:  # list not empty
        print("email already exists")
    else:
        print("email does not already exist")
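If you need this check in several places, it could be wrapped in a small helper. The following is only a sketch: add_email_if_missing is a hypothetical name, and it again assumes the composite-primary-key version of Email and an in-memory SQLite database.

```python
from sqlalchemy import Boolean, Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Customer(Base):
    __tablename__ = "customers"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    emails = relationship("Email", backref="customer")


class Email(Base):
    __tablename__ = "emails"
    customer_id = Column(Integer, ForeignKey("customers.id"), primary_key=True)
    email = Column(String, primary_key=True)
    primary = Column(Boolean)


def add_email_if_missing(session, customer, address, primary=False):
    """Hypothetical helper: return (email_object, created)."""
    # Look the email up by its composite primary key first.
    existing = session.get(Email, (customer.id, address))
    if existing is not None:
        return existing, False
    # Not found, so create it and attach it through the relationship.
    new_email = Email(email=address, primary=primary)
    customer.emails.append(new_email)
    return new_email, True


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    john = Customer(id=123, name="John")
    session.add(john)
    session.commit()

    first, created_first = add_email_if_missing(session, john, "some_email", primary=True)
    session.commit()

    # The second call finds the existing row instead of appending a duplicate.
    second, created_second = add_email_if_missing(session, john, "some_email", primary=True)
    session.commit()

    email_count = len(john.emails)
```

Here the second call returns the object that is already in the session, so nothing new is appended and the commit has nothing to insert.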