Django ORM: Equivalent of SQL `NOT IN`? `exclude` and `Q` objects do not work

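Question

I am trying to use the Django ORM to do the equivalent of a SQL NOT IN clause, providing a list of IDs in a sub-select to bring back a set of records from the logging table. I can't figure out whether this is possible.

Model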

class JobLog(models.Model):
    job_number = models.BigIntegerField(blank=True, null=True)
    name = models.TextField(blank=True, null=True)
    username = models.TextField(blank=True, null=True)
    event = models.TextField(blank=True, null=True)
    time = models.DateTimeField(blank=True, null=True)

What I've tried

My first attempt was to use exclude, but this wraps the entire subquery condition in NOT (...) rather than producing the desired NOT IN:

query = (
    JobLog.objects.values(
        "username", "job_number", "name", "time",
    )
    .filter(time__gte=start, time__lte=end, event="delivered")
    .exclude(
        job_number__in=models.Subquery(
            JobLog.objects.values_list("job_number", flat=True).filter(
                time__gte=start, time__lte=end, event="finished",
            )
        )
    )
)

Unfortunately, this produces the following SQL:

SELECT "view_job_log"."username", "view_job_log"."group", "view_job_log"."job_number", "view_job_log"."name", "view_job_log"."time"
FROM "view_job_log"
WHERE (
    "view_job_log"."event" = 'delivered'
    AND "view_job_log"."time" >= '2020-03-12T11:22:28.300590+00:00'::timestamptz
    AND "view_job_log"."time" <= '2020-03-13T11:22:28.300600+00:00'::timestamptz
    AND NOT (
        "view_job_log"."job_number" IN (
            SELECT U0."job_number"
            FROM "view_job_log" U0
            WHERE (
                U0."event" = 'finished' AND U0."time" >= '2020-03-12T11:22:28.300590+00:00'::timestamptz
                AND U0."time" <= '2020-03-13T11:22:28.300600+00:00'::timestamptz
            )
        )
        AND "view_job_log"."job_number" IS NOT NULL
    )
)

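What I need is for the third AND clause to be AND "view_job_log"."job_number" NOT IN instead of AND NOT (.

I have also tried running the sub-select as its own query first and then using exclude, as suggested here:

Django equivalent of SQL not in

However, this yields the same problematic result. I then tried a Q object, which produced a similar query: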

query = (
    JobLog.objects.values(
        "username", "subscriber_code", "job_number", "name", "time",
    )
    .filter(
        ~models.Q(job_number__in=models.Subquery(
            JobLog.objects.values_list("job_number", flat=True).filter(
                time__gte=start, time__lte=end, event="finished",
            )
        )),
        time__gte=start,
        time__lte=end,
        event="delivered",
    )
)

This attempt with the Q object again produced the following SQL, still without NOT IN:

SELECT "view_job_log"."username", "view_job_log"."group", "view_job_log"."job_number", "view_job_log"."name", "view_job_log"."time"
FROM "view_job_log"
WHERE (
    NOT (
        "view_job_log"."job_number" IN (
            SELECT U0."job_number"
            FROM "view_job_log" U0
            WHERE (
                U0."event" = 'finished'
                AND U0."time" >= '2020-03-12T11:33:28.098653+00:00'::timestamptz
                AND U0."time" <= '2020-03-13T11:33:28.098678+00:00'::timestamptz
            )
        )
        AND "view_job_log"."job_number" IS NOT NULL
    )
    AND "view_job_log"."event" = 'delivered'
    AND "view_job_log"."time" >= '2020-03-12T11:33:28.098653+00:00'::timestamptz
    AND "view_job_log"."time" <= '2020-03-13T11:33:28.098678+00:00'::timestamptz
)

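Is there a way to get Django's ORM to do something equivalent to AND job_number NOT IN (12345, 12346, 12347)? Or will I have to drop down to raw SQL to accomplish this?

Thanks in advance for reading this entire wall-of-text question. Explicit is better than implicit. :)

Can you try this: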

JobLog.objects.filter(time__gte=start, time__lte=end, event="delivered").exclude(time__gte=start, event='finished').exclude(time__lte=end, event='finished')

I think the easiest way to do this would be to define a custom lookup, similar to this one or the in lookup:

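from django.db.models import Field
from django.db.models.lookups import In as LookupIn


class NotIn(LookupIn):
    lookup_name = "notin"

    def get_rhs_op(self, connection, rhs):
        # The built-in `in` lookup returns "IN %s"; only the operator changes.
        return "NOT IN %s" % rhs


# Make `<field>__notin=...` available on every field type.
Field.register_lookup(NotIn)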

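# Alternative to the class above, written directly on top of models.Lookup.
# Register it the same way. It is meant for a right-hand side that compiles
# to a parenthesised expression, such as a Subquery.
class NotIn(models.Lookup):
    lookup_name = "notin"

    def as_sql(self, compiler, connection):
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        # Combine without mutating, so list or tuple params both work.
        params = [*lhs_params, *rhs_params]

        return "%s NOT IN %s" % (lhs, rhs), params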

Then use it in your query:

query = (
    JobLog.objects.values(
        "username", "job_number", "name", "time",
    )
    .filter(time__gte=start, time__lte=end, event="delivered")
    .filter(
        job_number__notin=models.Subquery(
            JobLog.objects.values_list("job_number", flat=True).filter(
                time__gte=start, time__lte=end, event="finished",
            )
        )
    )
)

This generates the SQL:

SELECT
    "people_joblog"."username",
    "people_joblog"."job_number",
    "people_joblog"."name",
    "people_joblog"."time"
FROM
    "people_joblog"
WHERE ("people_joblog"."event" = 'delivered'
    AND "people_joblog"."time" >= '2020-03-13 15:24:34.691222+00:00'
    AND "people_joblog"."time" <= '2020-03-13 15:24:41.678069+00:00'
    AND "people_joblog"."job_number" NOT IN (
        SELECT
            U0."job_number"
        FROM
            "people_joblog" U0
        WHERE (U0."event" = 'finished'
            AND U0."time" >= '2020-03-13 15:24:34.691222+00:00'
            AND U0."time" <= '2020-03-13 15:24:41.678069+00:00')))
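
Before swapping exclude for a NOT IN lookup, it may help to see where the two SQL shapes actually differ. The following is only a minimal sketch: it uses Python's built-in sqlite3 module rather than PostgreSQL, and the job_log table and its four rows are made up for illustration. It runs the NOT (... IN ... AND ... IS NOT NULL) form that exclude generated in the question next to a plain NOT IN.

```python
import sqlite3

# Hypothetical stand-in for the real table: one job that finished (1),
# one that did not (2), and one delivered row with a NULL job_number.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_log (job_number INTEGER, event TEXT)")
conn.executemany(
    "INSERT INTO job_log (job_number, event) VALUES (?, ?)",
    [
        (1, "delivered"),
        (1, "finished"),
        (2, "delivered"),
        (None, "delivered"),
    ],
)

FINISHED = "SELECT job_number FROM job_log WHERE event = 'finished'"

# Shape produced by exclude(job_number__in=...) in the question.
exclude_style = conn.execute(
    "SELECT job_number FROM job_log WHERE event = 'delivered' "
    f"AND NOT (job_number IN ({FINISHED}) AND job_number IS NOT NULL)"
).fetchall()

# Shape produced by a plain NOT IN.
not_in_style = conn.execute(
    "SELECT job_number FROM job_log WHERE event = 'delivered' "
    f"AND job_number NOT IN ({FINISHED})"
).fetchall()

print("exclude form:", exclude_style)
print("NOT IN form: ", not_in_style)
```

With this sample data both forms drop job 1 and keep job 2. They differ only on the row whose job_number is NULL: the exclude form keeps it, while NOT IN drops it because NULL NOT IN (...) does not evaluate to true.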

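You may be able to achieve the same result by using Exists and special-casing NULL.

# Needs: from django.db.models import Exists, OuterRef, Q
.filter(
   ~Exists(
       JobLog.objects.filter(
           # Match a NULL job_number too, mirroring how NOT IN treats NULLs.
           Q(job_number=None) | Q(job_number=OuterRef('job_number')),
           time__gte=start,
           time__lte=end,
           event='finished',
       )
   )
)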
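
To illustrate why the NULL special case matters, here is a rough sketch in plain SQL, run through Python's built-in sqlite3 module instead of Django and PostgreSQL. The job_log table, its rows, and the run helper are all hypothetical. It compares NOT IN against NOT EXISTS with the extra IS NULL branch, first without and then with a NULL job_number among the finished rows. It does not cover outer rows whose own job_number is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_log (job_number INTEGER, event TEXT)")

NOT_IN = (
    "SELECT d.job_number FROM job_log d WHERE d.event = 'delivered' "
    "AND d.job_number NOT IN "
    "(SELECT f.job_number FROM job_log f WHERE f.event = 'finished')"
)
# NOT EXISTS with the NULL special case, as in the Exists suggestion above.
NOT_EXISTS = (
    "SELECT d.job_number FROM job_log d WHERE d.event = 'delivered' "
    "AND NOT EXISTS (SELECT 1 FROM job_log f WHERE f.event = 'finished' "
    "AND (f.job_number IS NULL OR f.job_number = d.job_number))"
)


def run(sql):
    """Return the selected job numbers as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


# Job 1 was delivered and finished; job 2 was only delivered.
conn.executemany(
    "INSERT INTO job_log VALUES (?, ?)",
    [(1, "delivered"), (1, "finished"), (2, "delivered")],
)
without_null = (run(NOT_IN), run(NOT_EXISTS))

# Add a finished row with a NULL job_number: NOT IN now matches nothing,
# and the IS NULL branch makes NOT EXISTS behave the same way.
conn.execute("INSERT INTO job_log VALUES (NULL, 'finished')")
with_null = (run(NOT_IN), run(NOT_EXISTS))

print(without_null)
print(with_null)
```

In this sketch the two queries agree in both cases. Without the IS NULL branch, NOT EXISTS would still return job 2 after the NULL row is inserted, which is the usual reason the two forms diverge.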