PostgreSQL, NOT IN clause
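I want to calculate DAU while excluding users we don't consider "real" (employees, beta testers, etc.).
Previously, when I wrote the filter directly in the query, it worked fine: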
SELECT
    count(distinct user_id) AS daily,
    e.event_timestamp::DATE AS date
FROM
    "public"."events" AS e
WHERE
    user_id IN (SELECT
                    distinct id
                FROM
                    "user"."user"
                WHERE
                    username IS NOT NULL AND position IS NOT NULL)
GROUP BY date
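When I tried changing it to the following, I expected more or less the same count (basically, instead of defining the 4000 "real users", I define the 1000 "non-users" I want to exclude). However, this gives me a higher count, as if the distinct were not working.
I added the NOT NULL condition to the subquery, but it did not change the result. Does NOT IN with a subquery work differently from an IN clause?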
SELECT
    count(distinct e.user_id) AS daily,
    e.event_timestamp::DATE AS date
FROM
    "public"."events" AS e
WHERE
    e.user_id NOT IN (SELECT distinct id FROM "public"."non_users" WHERE id IS NOT NULL)
GROUP BY
    date
ORDER BY
    date
Yes. If any value in the subquery is NULL, then NOT IN returns no rows. For this reason, I strongly recommend always using NOT EXISTS with a subquery; it behaves as expected.
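To see the NULL behaviour in isolation, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained; SQLite applies the same three-valued logic to NOT IN). The helper name and the toy rows are hypothetical, and the tables are reduced to the one column that matters.

```python
import sqlite3

def count_not_in_vs_not_exists(non_user_ids):
    """Return (not_in_count, not_exists_count): distinct event users left
    after excluding non_user_ids. Toy data, hypothetical schema."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE events (user_id INTEGER)")
    con.execute("CREATE TABLE non_users (id INTEGER)")
    # Three users generated events: 1, 2 and 3.
    con.executemany("INSERT INTO events VALUES (?)", [(1,), (2,), (3,)])
    # None is stored as SQL NULL.
    con.executemany("INSERT INTO non_users VALUES (?)",
                    [(i,) for i in non_user_ids])

    not_in = con.execute(
        "SELECT count(DISTINCT e.user_id) FROM events e "
        "WHERE e.user_id NOT IN (SELECT id FROM non_users)"
    ).fetchone()[0]

    not_exists = con.execute(
        "SELECT count(DISTINCT e.user_id) FROM events e "
        "WHERE NOT EXISTS (SELECT 1 FROM non_users nu WHERE nu.id = e.user_id)"
    ).fetchone()[0]

    con.close()
    return not_in, not_exists

# No NULL in the exclusion list: both forms agree.
print(count_not_in_vs_not_exists([1]))        # (2, 2)
# One NULL in the exclusion list: NOT IN filters out every row.
print(count_not_in_vs_not_exists([1, None]))  # (0, 2)
```

With a NULL in the list, `2 NOT IN (1, NULL)` evaluates to NULL rather than true, so the row is dropped. NOT EXISTS only asks whether a matching row exists, so the NULL is harmless there.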
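You seem to be aware of this, because your WHERE clause includes a NULL comparison. So the difference is probably due to the other conditions. So, include those as well: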
SELECT count(distinct e.user_id) AS daily,
       e.event_timestamp::DATE AS date
FROM "public"."events" e
WHERE NOT EXISTS (SELECT 1
                  FROM "public"."non_users" nu
                  WHERE e.user_id = nu.id AND
                        nu.position IS NOT NULL
                 )
GROUP BY date
ORDER BY date;
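One way to check whether the other conditions explain the gap is to list the event users that are in neither set: not a "real" user under the IN filter, and not in the exclusion list either. The IN query drops those users, while the NOT IN / NOT EXISTS query counts them. The sketch below is only an illustration under assumptions: it runs on Python's built-in sqlite3 module rather than PostgreSQL, `users` stands in for "user"."user", and all rows are made up.

```python
import sqlite3

def users_in_neither_list():
    """Return event user ids that are neither 'real' users nor non-users.
    Toy data, hypothetical schema."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE events (user_id INTEGER);
        CREATE TABLE users (id INTEGER, username TEXT, position TEXT);
        CREATE TABLE non_users (id INTEGER);
        INSERT INTO events VALUES (1), (2), (3), (4);
        -- user 1 passes the IN filter; user 2 fails it (position is NULL)
        INSERT INTO users VALUES (1, 'ann', 'dev'), (2, 'bob', NULL);
        -- user 3 is explicitly excluded; user 4 appears in no list at all
        INSERT INTO non_users VALUES (3);
    """)
    rows = con.execute("""
        SELECT DISTINCT e.user_id
        FROM events e
        WHERE NOT EXISTS (SELECT 1
                          FROM users u
                          WHERE u.id = e.user_id
                            AND u.username IS NOT NULL
                            AND u.position IS NOT NULL)
          AND NOT EXISTS (SELECT 1
                          FROM non_users nu
                          WHERE nu.id = e.user_id)
        ORDER BY e.user_id
    """).fetchall()
    con.close()
    return [r[0] for r in rows]

print(users_in_neither_list())  # [2, 4]
```

In this toy data the IN version counts 1 user and the exclusion version counts 3. The two users returned here account for the difference, with no NULL in the exclusion list involved.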