GROUP BY user to display results ORDERed BY time
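I'm trying to create an inbox for users. I need to display all threads, grouped by correspondent and ordered by the time of the last posted message in each correspondence.
I'm stuck on this SQL and don't know how to proceed: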
CREATE TABLE `user_mail` (
`id` int(10) NOT NULL,
`author` int(10) NOT NULL,
`recipient` int(10) NOT NULL,
`title` varchar(100) NOT NULL,
`message` text NOT NULL,
`date` int(100) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
SELECT * FROM user_mail t1
INNER JOIN
(SELECT author, recipient, MAX(date) AS Ordered FROM user_mail
WHERE recipient = '$thisUser' OR author = '$thisUser' GROUP BY author) t2
ON t1.author = t2.author
WHERE t1.recipient = '$thisUser' OR t1.author = '$thisUser'
ORDER BY t2.Ordered DESC
This is the layout I need to display:
Correspondence with User 1
Newest reply - author: User 1 | time: 11:00
Next reply - author: This user | time: ...
Reply - author: User 1 | time: ...
...
Original post - author: This user | time: 09:30
________________________________________________
Correspondence with User 2
Newest reply - author: This user | time: 10:30
...
Original post - author: User 2 | time: 10:00
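You can see that the correspondence with User 1 comes first because it has the newest reply (even though its original post is older than the other one).
Also, all correspondences should be shown, whether they were started by this user or by the other user.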
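With the following SQL statement, the result will match your display example. (It uses SQL Server syntax and assumes a datetime column named posteddate.)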
SELECT id
,CASE WHEN rn_min = 1
THEN 'Original Post - '
WHEN rn_max = 1
THEN 'Newest reply - '
WHEN rn_min = 2 AND rn_max != 2
THEN 'Reply - '
ELSE 'Next reply - '
END +
CASE WHEN author = @thisuser
THEN 'author: This ' + CONVERT(VARCHAR, author)
ELSE 'author: User ' + CONVERT(VARCHAR, author)
END +
CASE WHEN rn_min = 1 OR rn_max = 1
THEN ' | time: '+ CONVERT(VARCHAR(8),posteddate,108)
ELSE ''
END value
FROM (SELECT id
,author
,recipient
,message
,posteddate
,row_number() OVER (PARTITION BY id ORDER BY posteddate) rn_min
,row_number() OVER (PARTITION BY id ORDER BY posteddate desc) rn_max
FROM user_mail
WHERE author = @thisuser OR recipient = @thisuser
) t1
Since your user can appear in either column, you have to use the values of both columns in the search and in the GROUP BY.
Try this:
select *
from user_mail t1
join
(
select max(date) as ConvMaxDate,
case when author = '$thisUser' then recipient
else author
end as OtherUser
from user_mail
where author = '$thisUser' or recipient = '$thisUser'
group by case when author = '$thisUser' then recipient
else author
end
) ConversationMaxDate
on Author = '$thisUser' and OtherUser = recipient
or Recipient = '$thisUser' and OtherUser = Author
order by ConvMaxDate desc, Date desc;
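The inner query, ConversationMaxDate, first determines the conversation partner, then groups by this "OtherUser" and computes the latest date for each thread. This works because you supply "ThisUser": only then can you tell, for a given message, which of the two users is the other party.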
You need indexes on (author, recipient, date) and (recipient, author, date), because MySQL can then use an index merge. Otherwise a full table/index scan is required.
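As a quick way to check the ordering this query produces, here is a minimal sketch that runs it from Python. It makes several assumptions: SQLite (via the standard sqlite3 module) stands in for MySQL, the user id 100 and the sample rows are hypothetical, the index names are made up, and the '$thisUser' interpolation is replaced by a bound parameter named :me.

```python
import sqlite3

THIS_USER = 100  # hypothetical id of the logged-in user

# Hypothetical sample rows: (id, author, recipient, title, message, date)
SAMPLE = [
    (1, 100, 1, "hi", "original post to user 1", 930),
    (2, 2, 100, "hey", "original post from user 2", 1000),
    (3, 100, 2, "re: hey", "reply to user 2", 1030),
    (4, 1, 100, "re: hi", "reply from user 1", 1100),
    (5, 1, 2, "other", "does not involve user 100", 1200),
]

# Same shape as the query above; :me is a bound parameter, not string interpolation.
INBOX_SQL = """
SELECT t1.id, t1.author, t1.recipient, t1.date
FROM user_mail t1
JOIN (
    SELECT MAX(date) AS conv_max_date,
           CASE WHEN author = :me THEN recipient ELSE author END AS other_user
    FROM user_mail
    WHERE author = :me OR recipient = :me
    GROUP BY CASE WHEN author = :me THEN recipient ELSE author END
) c
  ON (t1.author = :me AND c.other_user = t1.recipient)
  OR (t1.recipient = :me AND c.other_user = t1.author)
ORDER BY c.conv_max_date DESC, t1.date DESC
"""


def make_db():
    """Create an in-memory table with the two suggested indexes and the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_mail ("
        " id INTEGER PRIMARY KEY, author INTEGER NOT NULL,"
        " recipient INTEGER NOT NULL, title TEXT NOT NULL,"
        " message TEXT NOT NULL, date INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_author ON user_mail (author, recipient, date)")
    conn.execute("CREATE INDEX idx_recipient ON user_mail (recipient, author, date)")
    conn.executemany("INSERT INTO user_mail VALUES (?, ?, ?, ?, ?, ?)", SAMPLE)
    return conn


def inbox(conn, me):
    """Return every message of every conversation of `me`, newest thread first."""
    return conn.execute(INBOX_SQL, {"me": me}).fetchall()


if __name__ == "__main__":
    for row in inbox(make_db(), THIS_USER):
        print(row)
```

With this data the conversation with user 1 comes first, because its latest message (date 1100) is newer than anything exchanged with user 2, and the message between users 1 and 2 is left out. Binding the user id as a parameter also avoids interpolating '$thisUser' into the SQL string.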
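Since you don't know, for each message, whether $thisUser is the author or the recipient, you can use LEAST(author, recipient) and GREATEST(author, recipient) to identify a "thread", and use them in the subquery's GROUP BY clause and in the JOIN condition.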
SELECT m.*
FROM user_mail m
JOIN (
SELECT
LEAST(author, recipient) as user1,
GREATEST(author, recipient) as user2,
MAX(date) as date
FROM user_mail
WHERE $thisUser IN (author, recipient)
GROUP BY user1, user2
) s ON s.user1 = LEAST(m.author, m.recipient)
AND s.user2 = GREATEST(m.author, m.recipient)
WHERE $thisUser IN (m.author, m.recipient)
ORDER BY
s.date DESC,
LEAST(m.author, m.recipient),
GREATEST(m.author, m.recipient),
m.date DESC
However, this will be slow on large data sets, because no index can be used for the GROUP BY clause or the JOIN condition.
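To make the thread key concrete, here is a small sketch of the grouping step alone. It is an illustration under assumptions: it runs on SQLite through Python's sqlite3 module, where the multi-argument scalar functions min() and max() play the role of MySQL's LEAST() and GREATEST(), and the sample rows and user id 100 are hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (id, author, recipient, date)
SAMPLE = [
    (1, 100, 1, 930),
    (2, 2, 100, 1000),
    (3, 100, 2, 1030),
    (4, 1, 100, 1100),
]

# min(a, b) / max(a, b) with two arguments are scalar in SQLite,
# standing in for LEAST(a, b) / GREATEST(a, b) in MySQL.
THREADS_SQL = """
SELECT min(author, recipient) AS user1,
       max(author, recipient) AS user2,
       MAX(date)              AS last_date,
       COUNT(*)               AS messages
FROM user_mail
WHERE :me IN (author, recipient)
GROUP BY min(author, recipient), max(author, recipient)
ORDER BY last_date DESC
"""


def threads(me):
    """Return one row per thread of `me`: (user1, user2, last_date, messages)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_mail ("
        " id INTEGER PRIMARY KEY, author INTEGER, recipient INTEGER, date INTEGER)"
    )
    conn.executemany("INSERT INTO user_mail VALUES (?, ?, ?, ?)", SAMPLE)
    return conn.execute(THREADS_SQL, {"me": me}).fetchall()


if __name__ == "__main__":
    for row in threads(100):
        print(row)
```

Messages 1 and 4 go in opposite directions (100 to 1 and 1 to 100) but land in the same group, because the pair is normalized to (smaller id, larger id) before grouping. The key is computed per row, which is why an ordinary index on author or recipient does not help here.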
I would make id an AUTO_INCREMENT PRIMARY KEY and use it instead of date.
That way you can at least use an index (the PK) for the JOIN. The query also becomes shorter.
SELECT m.*
FROM user_mail m
JOIN (
SELECT MAX(id) as id
FROM user_mail
WHERE $thisUser IN (author, recipient)
GROUP BY
LEAST(author, recipient),
GREATEST(author, recipient)
) s ON s.id = m.id
ORDER BY s.id DESC, m.id DESC
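A small sketch of what this id-based query returns on sample data may help. The assumptions: SQLite through Python's sqlite3 module stands in for MySQL (INTEGER PRIMARY KEY AUTOINCREMENT in place of AUTO_INCREMENT, scalar min()/max() in place of LEAST()/GREATEST()), and the rows and user id 100 are hypothetical. The rows are inserted in chronological order, so a larger id means a newer message.

```python
import sqlite3

# Hypothetical messages in chronological order: (author, recipient, date)
SAMPLE = [
    (100, 1, 930),
    (2, 100, 1000),
    (100, 2, 1030),
    (1, 100, 1100),
    (1, 2, 1200),  # does not involve user 100
]

LATEST_SQL = """
SELECT m.id, m.author, m.recipient
FROM user_mail m
JOIN (
    SELECT MAX(id) AS id
    FROM user_mail
    WHERE :me IN (author, recipient)
    GROUP BY min(author, recipient), max(author, recipient)
) s ON s.id = m.id
ORDER BY m.id DESC
"""


def latest_per_thread(me):
    """Return the newest message of each thread of `me`, newest thread first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_mail ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " author INTEGER, recipient INTEGER, date INTEGER)"
    )
    # ids are assigned 1, 2, 3, ... in insertion order
    conn.executemany(
        "INSERT INTO user_mail (author, recipient, date) VALUES (?, ?, ?)", SAMPLE
    )
    return conn.execute(LATEST_SQL, {"me": me}).fetchall()


if __name__ == "__main__":
    for row in latest_per_thread(100):
        print(row)
```

Because the join condition is s.id = m.id, each thread contributes exactly one row, the one with the highest id. That suits an inbox overview that lists threads. It does not list every message inside each thread.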
You can get better performance by using a UNION ALL optimization in the subquery.
SELECT m.*
FROM user_mail m
JOIN (
SELECT MAX(id) as id
FROM (
SELECT recipient as user, MAX(id) as id
FROM user_mail
WHERE author = $thisUser
GROUP BY recipient
UNION ALL
SELECT author as user, MAX(id) as id
FROM user_mail
WHERE recipient = $thisUser
GROUP BY author
) sub1
GROUP BY user
) s ON s.id = m.id
ORDER BY s.id DESC, m.id DESC
For this query you should define composite indexes on (author, recipient) and (recipient, author).
Update
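Your comment is right: the last two queries only return the newest message of each conversation. The first one, however, should return all messages.
However, here is the correct version of the UNION ALL optimized query: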
SELECT m.*, s.max_id
FROM user_mail m
JOIN (
SELECT other_user, MAX(id) as max_id
FROM (
SELECT recipient as other_user, MAX(id) as id
FROM user_mail
WHERE author = $thisUser
GROUP BY recipient
UNION ALL
SELECT author as other_user, MAX(id) as id
FROM user_mail
WHERE recipient = $thisUser
GROUP BY author
) sub1
GROUP BY other_user
) s ON s.other_user = m.recipient
WHERE m.author = $thisUser
UNION ALL
SELECT m.*, s.max_id
FROM user_mail m
JOIN (
SELECT other_user, MAX(id) as max_id
FROM (
SELECT recipient as other_user, MAX(id) as id
FROM user_mail
WHERE author = $thisUser
GROUP BY recipient
UNION ALL
SELECT author as other_user, MAX(id) as id
FROM user_mail
WHERE recipient = $thisUser
GROUP BY author
) sub1
GROUP BY other_user
) s ON s.other_user = m.author
WHERE m.recipient = $thisUser
ORDER BY max_id DESC, id DESC
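Although it looks big, this query runs in under 20 ms on my test data set of one million rows (while the other solutions need 300 to 500 ms).
Note that the subquery is identical in both parts. MySQL should be able to cache and reuse the result.
To avoid code duplication, you can store the subquery in a string variable and reuse it. If you use MariaDB 10.2, you might also want to try a CTE.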
Also, don't forget to define indexes on (author, recipient) and (recipient, author).
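As a sketch of the CTE idea, the duplicated subquery can be written once and referenced from both halves of the UNION ALL. This is an illustration under assumptions, not a tested MySQL or MariaDB query: it runs on SQLite through Python's sqlite3 module, the CTE name conv, the index names, the sample rows and the user id 100 are all hypothetical, and $thisUser is replaced by the bound parameter :me.

```python
import sqlite3

# Hypothetical sample rows: (id, author, recipient); a larger id is a newer message
SAMPLE = [
    (1, 100, 1),
    (2, 2, 100),
    (3, 100, 2),
    (4, 1, 100),
    (5, 1, 2),  # does not involve user 100
]

# The per-partner MAX(id) subquery is defined once as the CTE `conv`.
# Output columns are aliased explicitly so the final ORDER BY can refer to them.
INBOX_CTE_SQL = """
WITH conv AS (
    SELECT other_user, MAX(id) AS max_id
    FROM (
        SELECT recipient AS other_user, MAX(id) AS id
        FROM user_mail WHERE author = :me GROUP BY recipient
        UNION ALL
        SELECT author AS other_user, MAX(id) AS id
        FROM user_mail WHERE recipient = :me GROUP BY author
    ) sub1
    GROUP BY other_user
)
SELECT m.id AS id, m.author AS author, m.recipient AS recipient, c.max_id AS max_id
FROM user_mail m JOIN conv c ON c.other_user = m.recipient
WHERE m.author = :me
UNION ALL
SELECT m.id AS id, m.author AS author, m.recipient AS recipient, c.max_id AS max_id
FROM user_mail m JOIN conv c ON c.other_user = m.author
WHERE m.recipient = :me
ORDER BY max_id DESC, id DESC
"""


def inbox_cte(me):
    """Return all messages of `me`, threads ordered by newest message first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_mail ("
        " id INTEGER PRIMARY KEY, author INTEGER, recipient INTEGER)"
    )
    conn.execute("CREATE INDEX idx_author_recipient ON user_mail (author, recipient)")
    conn.execute("CREATE INDEX idx_recipient_author ON user_mail (recipient, author)")
    conn.executemany("INSERT INTO user_mail VALUES (?, ?, ?)", SAMPLE)
    return conn.execute(INBOX_CTE_SQL, {"me": me}).fetchall()


if __name__ == "__main__":
    for row in inbox_cte(100):
        print(row)
```

Each returned row carries max_id, the id of the newest message in its thread, so the application can start a new "Correspondence with ..." block whenever max_id changes. One caveat of this sketch: a message a user sends to themselves would match both halves of the UNION ALL and appear twice.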