How to use the max() function to retrieve the most recent row using SQL / PostgreSQL
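I have a data set that looks like this:
example data set.
There are multiple users under each domain. I want only one row per email_domain, and that row should be the one with the max(last_login) value. In short, for each email_domain I want only the user who logged in most recently among all users from that domain.
I tried a query that looks like this: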
select *
FROM
(
select LOWER(SUBSTRING(ua.email FROM POSITION ('@' IN ua.email) + 1)) AS email_domain, last_login, last_name, first_name, email, phone
from user_with_address ua
order by email_domain
) as A
group by email_domain, last_login, last_name, first_name, email, phone
having last_login = max(last_login)
order by email_domain
I still get a list with multiple rows for each email domain. What am I doing wrong? Please help.
Disclaimer: I have basic-to-intermediate knowledge of SQL.
One option is to use ROW_NUMBER() and keep only the most recent login record within each group of records that share the same email domain.
SELECT t.email_domain, t.last_login, t.last_name, t.first_name, t.email, t.phone
FROM
(
SELECT a.*,
ROW_NUMBER() OVER (PARTITION BY a.email_domain ORDER BY a.last_login DESC) rn
FROM
(
SELECT LOWER(SUBSTRING(ua.email FROM POSITION ('@' IN ua.email) + 1)) AS email_domain,
last_login, last_name, first_name, email, phone
FROM user_with_address ua
) a
) t
WHERE t.rn = 1
ORDER BY t.email_domain
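Note that I actually subquery twice here, to avoid repeating the code that computes the email domain. We could do this with a single subquery, but the query would be a bit harder to read.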
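For comparison, here is a rough sketch of what the single-subquery form might look like. It assumes the same user_with_address table and columns as above; the domain expression has to be repeated inside PARTITION BY because a window function cannot refer to a column alias defined at the same query level.

```sql
-- Sketch only: the same ROW_NUMBER() approach with one subquery instead of two.
-- The domain expression appears twice, which is the readability cost mentioned above.
SELECT t.email_domain, t.last_login, t.last_name, t.first_name, t.email, t.phone
FROM (
    SELECT LOWER(SUBSTRING(ua.email FROM POSITION('@' IN ua.email) + 1)) AS email_domain,
           ua.last_login, ua.last_name, ua.first_name, ua.email, ua.phone,
           ROW_NUMBER() OVER (
               PARTITION BY LOWER(SUBSTRING(ua.email FROM POSITION('@' IN ua.email) + 1))
               ORDER BY ua.last_login DESC
           ) AS rn
    FROM user_with_address ua
) t
WHERE t.rn = 1
ORDER BY t.email_domain;
```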
I like Tim Biegeleisen's answer, but this is simpler, SQL-wise. I don't know about the performance difference, though.
select
LOWER(SUBSTRING(ua.email FROM POSITION ('@' IN ua.email) + 1)) AS email_domain,
last_login,
last_name,
first_name,
email,
phone
from user_with_address ua
where last_login = (select max(last_login)
from user_with_address ua2
where LOWER(SUBSTRING(ua.email FROM POSITION ('@' IN ua.email) + 1)) =
LOWER(SUBSTRING(ua2.email FROM POSITION ('@' IN ua2.email) + 1)))
order by email_domain;
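One caveat with this approach: the correlated subquery keeps every row whose last_login equals the maximum for its domain, so if two users in a domain share the same latest timestamp, both rows come back. The sketch below illustrates this with made-up sample rows supplied through a CTE (the emails and timestamps are invented for the example, and only two columns are included to keep it short).

```sql
-- Hypothetical sample data standing in for the real user_with_address table.
WITH user_with_address (email, last_login) AS (
    VALUES ('a@example.com', TIMESTAMP '2020-01-02 10:00:00'),
           ('b@Example.com', TIMESTAMP '2020-01-02 10:00:00'),  -- ties with the row above
           ('c@example.com', TIMESTAMP '2020-01-01 09:00:00')
)
SELECT LOWER(SUBSTRING(ua.email FROM POSITION('@' IN ua.email) + 1)) AS email_domain,
       ua.email,
       ua.last_login
FROM user_with_address ua
WHERE ua.last_login = (
    -- latest login among all users in the same (lower-cased) domain
    SELECT max(ua2.last_login)
    FROM user_with_address ua2
    WHERE LOWER(SUBSTRING(ua2.email FROM POSITION('@' IN ua2.email) + 1)) =
          LOWER(SUBSTRING(ua.email FROM POSITION('@' IN ua.email) + 1))
)
ORDER BY email_domain, ua.email;
-- Both a@example.com and b@Example.com are returned for example.com.
```

If exactly one row per domain is required even when timestamps tie, ROW_NUMBER() or DISTINCT ON with an extra tie-breaking column in the ORDER BY would be the safer choice.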
Using distinct on ():
select distinct on (email_domain) *
FROM (
select lower(split_part(email, '@', 2)) AS email_domain,
last_login,
last_name,
first_name,
email,
phone
from user_with_address
) as A
order by email_domain, last_login desc;
I also took Patrick's suggestion to simplify the expression that extracts the domain from the email.
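To see that the two domain expressions used in this thread agree, a quick side-by-side check like the following can help. The sample addresses are invented for illustration, and the comparison assumes each email contains exactly one '@'.

```sql
-- Compare the SUBSTRING/POSITION expression with the split_part() version
-- on a couple of hypothetical addresses.
SELECT email,
       LOWER(SUBSTRING(email FROM POSITION('@' IN email) + 1)) AS domain_via_substring,
       lower(split_part(email, '@', 2))                        AS domain_via_split_part
FROM (VALUES ('Alice@Example.COM'),
             ('bob@mail.example.org')) AS sample (email);
-- Both columns yield example.com and mail.example.org for these rows.
```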