PostgreSQL, NOT IN clause

I want to calculate DAU and exclude users we do not consider "real" (employees, beta testers, etc.).

Previously, it worked fine when I wrote the filter directly in the query:

SELECT 
    count(distinct user_id) AS daily, 
    e.event_timestamp::DATE AS date
FROM 
    "public"."events" AS e
WHERE
   user_id IN (SELECT  
           distinct id
        from
            "user"."user"
        WHERE 
            username IS NOT NULL AND position IS NOT NULL )
GROUP BY date

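When I tried changing it to the following, it should have given more or less the same counts (basically, instead of defining the 4000 "real users", I defined the 1000 "non-users" I want to exclude). However, this gives me higher counts. It is as if the distinct statement does not work.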

I added a NOT NULL check to the subquery, but it did not change the result. Does NOT IN with a subquery work differently from the IN clause?

SELECT 
    count(distinct e.user_id) AS daily, 
    e.event_timestamp::DATE AS date
FROM 
    "public"."events" AS e
WHERE
   e.user_id NOT IN (SELECT distinct id FROM "public"."non_users" WHERE id IS NOT NULL)
GROUP BY 
    date
ORDER BY
    date

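Yes. If any value returned by the subquery is NULL, then NOT IN returns no rows. For this reason, I strongly recommend that you always use NOT EXISTS with a subquery; it behaves as expected.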
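The NULL behaviour described above can be reproduced without any tables. The following is a minimal sketch using inline VALUES lists; the aliases t, s, x and y are made up for illustration and are not part of the original schema.

```sql
-- NOT IN: for x = 2, the test is (2 <> 1) AND (2 <> NULL), which is NULL (unknown),
-- so the row is filtered out. With a NULL in the list, no row can ever pass.
SELECT x
FROM (VALUES (1), (2), (3)) AS t(x)
WHERE x NOT IN (SELECT y FROM (VALUES (1), (NULL)) AS s(y));
-- returns no rows

-- NOT EXISTS: only asks whether a matching row exists, so the NULL in s is harmless.
SELECT x
FROM (VALUES (1), (2), (3)) AS t(x)
WHERE NOT EXISTS (
    SELECT 1
    FROM (VALUES (1), (NULL)) AS s(y)
    WHERE s.y = t.x
);
-- returns the rows 2 and 3
```

Filtering the NULLs out inside the subquery (as the question does with id IS NOT NULL) makes NOT IN behave like NOT EXISTS here, which is why the NULL trap alone probably does not explain the higher counts.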

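You seem to know this, because you are using a NULL comparison in the WHERE clause. So the difference is probably due to the other conditions. Include those as well: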

SELECT count(distinct e.user_id) AS daily, 
       e.event_timestamp::DATE AS date
FROM  "public"."events" e
WHERE NOT EXISTS (SELECT 1
                  FROM "public"."non_users" nu
                  WHERE e.user_id = nu.id AND
                        nu.position IS NOT NULL
                 )
GROUP BY date
ORDER BY date;
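If the counts still differ, one possible check is to count the users in events who fall into neither group: not "real" by the first query's definition and not listed in non_users either. This is only a diagnostic sketch built from the table and column names in the question; it assumes "user"."user".id and "public"."non_users".id refer to the same identifiers as events.user_id.

```sql
-- Users with events who are neither "real" (username and position set)
-- nor listed in non_users. A non-zero result would account for the gap
-- between the IN version and the NOT IN / NOT EXISTS version.
SELECT count(DISTINCT e.user_id) AS unclassified_users
FROM "public"."events" e
WHERE NOT EXISTS (SELECT 1
                  FROM "user"."user" u
                  WHERE u.id = e.user_id AND
                        u.username IS NOT NULL AND
                        u.position IS NOT NULL
                 )
  AND NOT EXISTS (SELECT 1
                  FROM "public"."non_users" nu
                  WHERE nu.id = e.user_id
                 );
```

Such users might be, for example, accounts with a NULL username or position, or user_id values that have no row in "user"."user" at all.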