GROUP BY user to display results ORDERed BY time

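I'm trying to build an inbox for users. I need to show all threads grouped by correspondent and ordered by the time of the last posted message in each conversation. I'm stuck on this SQL and don't know how to continue: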

CREATE TABLE `user_mail` (
  `id` int(10) NOT NULL,
  `author` int(10) NOT NULL,
  `recipient` int(10) NOT NULL,
  `title` varchar(100) NOT NULL,
  `message` text NOT NULL,
  `date` int(100) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

SELECT * FROM user_mail t1 
        INNER JOIN 
        (SELECT author, recipient, MAX(date) AS Ordered FROM user_mail
        WHERE recipient = '$thisUser' OR author = '$thisUser' GROUP BY author) t2
        ON t1.author = t2.author
        WHERE t1.recipient = '$thisUser' OR t1.author = '$thisUser' 
        ORDER BY t2.Ordered DESC

This is the layout I need to display:

Correspondence with User 1        

 Newest reply  - author: User 1    | time: 11:00
 Next reply    - author: This user | time: ...
 Reply         - author: User 1    | time: ...
 ...
 Original post - author: This user | time: 09:30
________________________________________________
Correspondence with User 2

 Newest reply  - author: This user | time: 10:30
 ...
 Original post - author: User 2    | time: 10:00

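You can see that the correspondence with User 1 comes first because it has the newest reply (even though its original post is older than the other one).

Also, all correspondence should be shown, whether it was started by this user or by the other user.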

With the following SQL statement, the result will match your example display.

SELECT id
      ,CASE WHEN rn_min = 1
            THEN 'Original Post - '
            WHEN rn_max = 1
            THEN 'Newest reply  - '
            WHEN rn_min = 2 AND rn_max != 2
            THEN 'Reply         - '
            ELSE 'Next reply    - '
        END +
       CASE WHEN author = @thisuser
            THEN 'author: This ' + CONVERT(VARCHAR, author) 
            ELSE 'author: User ' + CONVERT(VARCHAR, author) 
        END +
       CASE WHEN rn_min = 1 OR rn_max = 1
            THEN ' | time: '+ CONVERT(VARCHAR(8),posteddate,108)
            ELSE ''
        END value
  FROM (SELECT id
              ,author
              ,recipient
              ,message
              ,posteddate
              ,row_number() OVER (PARTITION BY id ORDER BY posteddate) rn_min
              ,row_number() OVER (PARTITION BY id ORDER BY posteddate desc) rn_max
          FROM user_mail
         WHERE author = @thisuser OR recipient = @thisuser
       ) t1

Because your user can appear in either column, you have to use the values of both columns in the search and in the GROUP BY.

Try this:

select * 
from user_mail t1
join 
(  
  select max(date) as ConvMaxDate, 
    case when author = '$thisUser' then recipient 
         else author 
    end as OtherUser
  from user_mail
  where author = '$thisUser' or recipient = '$thisUser'
  group by case when author = '$thisUser' then recipient 
                else author 
           end
) ConversationMaxDate
on (t1.author = '$thisUser' and OtherUser = t1.recipient)
   or (t1.recipient = '$thisUser' and OtherUser = t1.author)
order by ConvMaxDate desc, t1.date desc;

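The inner query ConversationMaxDate first determines the conversation partner, then groups by this "OtherUser" and computes the latest date for each thread. This works because you can supply "ThisUser" (only then can you tell, for a particular mail, which of the two users is the conversation partner).

You need indexes on (author, recipient, date) and (recipient, author, date), because MySQL can then use an index merge. Otherwise a full table/index scan is required.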
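To make the grouping step concrete, here is a minimal sketch that runs the same CASE-based query against a small in-memory table. It assumes Python's standard sqlite3 module in place of MySQL, made-up sample rows (user 9 plays "this user", dates are plain integers), hypothetical helper names (demo_connection, inbox_rows), and a bound parameter :u instead of the interpolated '$thisUser'.

```python
import sqlite3


def demo_connection():
    """Build an in-memory table shaped like user_mail, with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_mail (id INTEGER PRIMARY KEY, author INTEGER NOT NULL,"
        " recipient INTEGER NOT NULL, title TEXT NOT NULL, message TEXT NOT NULL,"
        " date INTEGER NOT NULL)"
    )
    rows = [
        (1, 9, 1, "a", "original post to user 1", 930),
        (2, 2, 9, "b", "original post from user 2", 1000),
        (3, 9, 2, "b", "reply to user 2", 1030),
        (4, 1, 9, "a", "reply from user 1", 1100),
        (5, 1, 2, "c", "mail between two other users", 1200),
    ]
    conn.executemany("INSERT INTO user_mail VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def inbox_rows(conn, this_user):
    """Return (id, author, recipient, date) for this_user, newest conversation first."""
    sql = """
        SELECT t1.id, t1.author, t1.recipient, t1.date
        FROM user_mail t1
        JOIN (
            -- one row per conversation partner with the latest date of that thread
            SELECT MAX(date) AS conv_max_date,
                   CASE WHEN author = :u THEN recipient ELSE author END AS other_user
            FROM user_mail
            WHERE author = :u OR recipient = :u
            GROUP BY CASE WHEN author = :u THEN recipient ELSE author END
        ) c
          ON (t1.author = :u AND c.other_user = t1.recipient)
          OR (t1.recipient = :u AND c.other_user = t1.author)
        ORDER BY c.conv_max_date DESC, t1.date DESC
    """
    return conn.execute(sql, {"u": this_user}).fetchall()


if __name__ == "__main__":
    for row in inbox_rows(demo_connection(), 9):
        print(row)
```

With this sample data the thread with user 1 is listed first because it holds the newest message, and the mail between users 1 and 2 is left out. Binding the user id as a parameter also avoids building the SQL text from user input.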

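Since you don't know, for each message, whether $thisUser is the author or the recipient, you can use LEAST(author, recipient) and GREATEST(author, recipient) to identify a "thread" and use them in the subquery's GROUP BY clause and in the JOIN condition.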

SELECT m.* 
FROM user_mail m
JOIN (
    SELECT
        LEAST(author, recipient)    as user1,
        GREATEST(author, recipient) as user2,
        MAX(date) as date
    FROM user_mail
    WHERE $thisUser IN (author, recipient)
    GROUP BY user1, user2
) s ON  s.user1 = LEAST(m.author, m.recipient)
    AND s.user2 = GREATEST(m.author, m.recipient)
WHERE $thisUser IN (m.author, m.recipient)
ORDER BY
    s.date DESC,
    LEAST(m.author, m.recipient),
    GREATEST(m.author, m.recipient),
    m.date DESC
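As a rough illustration of how the LEAST/GREATEST pair identifies a thread, the sketch below collapses mails sent in both directions into one key per pair of users. It assumes Python's standard sqlite3 module rather than MySQL; SQLite has no LEAST/GREATEST, so its two-argument scalar MIN()/MAX() stand in for them. The sample rows and the helper names (demo_connection, thread_latest) are invented for the example.

```python
import sqlite3


def demo_connection():
    """In-memory stand-in for user_mail with invented rows (user 9 is 'this user')."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_mail (id INTEGER PRIMARY KEY, author INTEGER NOT NULL,"
        " recipient INTEGER NOT NULL, date INTEGER NOT NULL)"
    )
    rows = [
        (1, 9, 1, 930),   # this user -> user 1
        (2, 2, 9, 1000),  # user 2 -> this user
        (3, 9, 2, 1030),  # this user -> user 2
        (4, 1, 9, 1100),  # user 1 -> this user
    ]
    conn.executemany("INSERT INTO user_mail VALUES (?, ?, ?, ?)", rows)
    return conn


def thread_latest(conn, this_user):
    """Return (user1, user2, last_date) per thread, newest thread first."""
    sql = """
        SELECT MIN(author, recipient) AS user1,  -- LEAST(author, recipient) in MySQL
               MAX(author, recipient) AS user2,  -- GREATEST(author, recipient) in MySQL
               MAX(date)              AS last_date
        FROM user_mail
        WHERE :u IN (author, recipient)
        GROUP BY user1, user2
        ORDER BY last_date DESC
    """
    return conn.execute(sql, {"u": this_user}).fetchall()


if __name__ == "__main__":
    for row in thread_latest(demo_connection(), 9):
        print(row)
```

Mails 1 and 4 share the key (1, 9) and mails 2 and 3 share (2, 9), regardless of who sent which message.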

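But the LEAST/GREATEST query will be slow on large data sets, because no index can be used for the GROUP BY clause or the JOIN condition. I would make id an AUTO_INCREMENT PRIMARY KEY and use it instead of date. That way you can at least use an index (the PK) for the JOIN. The query also gets shorter.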

SELECT m.* 
FROM user_mail m
JOIN (
    SELECT MAX(id) as id
    FROM user_mail
    WHERE $thisUser IN (author, recipient)
    GROUP BY
        LEAST(author, recipient),
        GREATEST(author, recipient)
) s ON s.id = m.id
ORDER BY s.id DESC, m.id DESC

You can get better performance with a UNION ALL optimization of the subquery.

SELECT m.* 
FROM user_mail m
JOIN (
    SELECT MAX(id) as id
    FROM (
        SELECT recipient as user, MAX(id) as id
        FROM user_mail
        WHERE author = $thisUser
        GROUP BY recipient
        UNION ALL
        SELECT author as user, MAX(id) as id
        FROM user_mail
        WHERE recipient = $thisUser
        GROUP BY author
    ) sub1
    GROUP BY user
) s ON s.id = m.id
ORDER BY s.id DESC, m.id DESC

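For this query you should define composite indexes on (author, recipient) and (recipient, author).

Update

Your comment is right: the last two queries only return the latest message of each conversation. The first one, however, should return all messages.

However, here is the correct version of the UNION ALL optimized query: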

SELECT m.*, s.max_id
FROM user_mail m
JOIN (
    SELECT other_user, MAX(id) as max_id
    FROM (
        SELECT recipient as other_user, MAX(id) as id
        FROM user_mail
        WHERE author = $thisUser
        GROUP BY recipient
        UNION ALL
        SELECT author as other_user, MAX(id) as id
        FROM user_mail
        WHERE recipient = $thisUser
        GROUP BY author
    ) sub1
    GROUP BY other_user
) s ON s.other_user = m.recipient
WHERE m.author = $thisUser

UNION ALL

SELECT m.*, s.max_id
FROM user_mail m
JOIN (
    SELECT other_user, MAX(id) as max_id
    FROM (
        SELECT recipient as other_user, MAX(id) as id
        FROM user_mail
        WHERE author = $thisUser
        GROUP BY recipient
        UNION ALL
        SELECT author as other_user, MAX(id) as id
        FROM user_mail
        WHERE recipient = $thisUser
        GROUP BY author
    ) sub1
    GROUP BY other_user
) s ON s.other_user = m.author
WHERE m.recipient = $thisUser

ORDER BY max_id DESC, id DESC

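Although it looks big, this query runs in under 20 ms on my test data set of one million rows (while the other solutions need 300 to 500 ms). Note that the subquery is identical in both parts. MySQL should be able to cache and reuse the result. To avoid code duplication you can store the subquery in a string variable and reuse it. If you use MariaDB 10.2, you might also want to try a CTE.

Also, don't forget to define indexes on (author, recipient) and (recipient, author).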
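For the CTE idea, a minimal sketch could look like the following. It is run here through Python's standard sqlite3 module purely so that it is self-contained, which is an assumption; the WITH syntax should carry over to MariaDB 10.2+, but that has not been checked here. The CTE name last_per_user, the helper names, and the sample rows are invented, and whether the engine evaluates the CTE once or twice depends on the server.

```python
import sqlite3


def demo_connection():
    """In-memory stand-in for user_mail with invented rows (user 9 is 'this user')."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_mail (id INTEGER PRIMARY KEY, author INTEGER NOT NULL,"
        " recipient INTEGER NOT NULL, message TEXT NOT NULL)"
    )
    rows = [
        (1, 9, 1, "original post to user 1"),
        (2, 2, 9, "original post from user 2"),
        (3, 9, 2, "reply to user 2"),
        (4, 1, 9, "reply from user 1"),
        (5, 1, 2, "mail between two other users"),
    ]
    conn.executemany("INSERT INTO user_mail VALUES (?, ?, ?, ?)", rows)
    return conn


def inbox_with_cte(conn, this_user):
    """All mails of this_user as (id, author, recipient, max_id), newest thread first."""
    sql = """
        WITH last_per_user AS (
            -- newest message id per conversation partner, written only once
            SELECT other_user, MAX(id) AS max_id
            FROM (
                SELECT recipient AS other_user, MAX(id) AS id
                FROM user_mail WHERE author = :u GROUP BY recipient
                UNION ALL
                SELECT author AS other_user, MAX(id) AS id
                FROM user_mail WHERE recipient = :u GROUP BY author
            ) sub1
            GROUP BY other_user
        )
        SELECT m.id AS id, m.author AS author, m.recipient AS recipient,
               s.max_id AS max_id
        FROM user_mail m
        JOIN last_per_user s ON s.other_user = m.recipient
        WHERE m.author = :u
        UNION ALL
        SELECT m.id AS id, m.author AS author, m.recipient AS recipient,
               s.max_id AS max_id
        FROM user_mail m
        JOIN last_per_user s ON s.other_user = m.author
        WHERE m.recipient = :u
        ORDER BY max_id DESC, id DESC
    """
    return conn.execute(sql, {"u": this_user}).fetchall()


if __name__ == "__main__":
    for row in inbox_with_cte(demo_connection(), 9):
        print(row)
```

The two halves of the UNION ALL now differ only in the join column and the WHERE clause, so the duplicated subquery text disappears.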
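The two composite indexes could be created along these lines. This is only a sketch using Python's standard sqlite3 module against an in-memory table; the index names idx_author_recipient and idx_recipient_author and the helper functions are made up for the example, and on a real MySQL table you would run the equivalent index definitions there.

```python
import sqlite3


def create_mail_indexes(conn):
    """Create one composite index per lookup direction (names are hypothetical)."""
    conn.execute(
        "CREATE INDEX idx_author_recipient ON user_mail (author, recipient)"
    )
    conn.execute(
        "CREATE INDEX idx_recipient_author ON user_mail (recipient, author)"
    )


def index_names(conn):
    """List the names of the indexes defined on user_mail, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master"
        " WHERE type = 'index' AND tbl_name = 'user_mail'"
    ).fetchall()
    return sorted(r[0] for r in rows)


def demo_connection():
    """Empty in-memory stand-in for the user_mail table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_mail (id INTEGER PRIMARY KEY, author INTEGER NOT NULL,"
        " recipient INTEGER NOT NULL, message TEXT NOT NULL)"
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    create_mail_indexes(conn)
    print(index_names(conn))
```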