Order by date, while grouping matches by another column

I have this query:

SELECT *, COUNT(app.id) AS totalApps FROM users JOIN app ON app.id = users.id
  GROUP BY app.id ORDER BY app.time DESC LIMIT ?

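It should fetch all results from "users", ordered by another column (time) in a related table (the id in the app table references the id in the users table).

The problem I have is that the grouping is done before the ordering by date, so I get very old results. But I need the grouping to get distinct users, because each user can have multiple 'apps'... Is there a different way to achieve this?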


Table users:

id TEXT PRIMARY KEY

Table app:

id TEXT
time DATETIME
FOREIGN KEY(id) REFERENCES users(id)

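In my SELECT query I want to get a list of users ordered by the app.time column. But because one user can have several associated app records, I could get duplicate users, which is why I used GROUP BY. But then the order gets messed up.

Since you need the latest date in each group, you can take the MAX of them: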

SELECT
  *,
  COUNT(app.id) AS totalApps,
  MAX(app.time) AS latestDate
FROM users
  JOIN app ON app.id = users.id
GROUP BY app.id
ORDER BY latestDate DESC
LIMIT ?
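
A minimal sketch of this approach, run from Python against an in-memory SQLite database. Only the table and column names come from the question; the sample rows and the user ids ('u1', 'u2', 'u3') are invented for illustration. The sketch selects the grouped column and the aggregates explicitly instead of `*`, so every output column is well defined.

```python
import sqlite3

# In-memory database using the schema from the question.
# The rows below are made-up sample data, not from the original post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id TEXT PRIMARY KEY);
    CREATE TABLE app (id TEXT, time DATETIME,
                      FOREIGN KEY(id) REFERENCES users(id));
    INSERT INTO users (id) VALUES ('u1'), ('u2'), ('u3');
    INSERT INTO app (id, time) VALUES
        ('u1', '2020-01-01 10:00:00'),
        ('u1', '2020-03-01 10:00:00'),
        ('u2', '2020-02-01 10:00:00'),
        ('u3', '2019-12-31 10:00:00'),
        ('u3', '2020-01-15 10:00:00');
""")

# One row per user, sorted by that user's most recent app time.
rows = conn.execute("""
    SELECT users.id,
           COUNT(app.id) AS totalApps,
           MAX(app.time) AS latestDate
    FROM users
      JOIN app ON app.id = users.id
    GROUP BY users.id
    ORDER BY latestDate DESC
    LIMIT ?
""", (2,)).fetchall()

for row in rows:
    print(row)
```

The timestamps here are ISO-8601 strings, which sort chronologically when compared as text; other date formats may not.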

Maybe you could use:

SELECT DISTINCT

Read more here: https://www.w3schools.com/sql/sql_distinct.asp

Try grouping by id and time, then order by time.

select ...
group by app.id, app.time
order by app.time desc

I assume id is unique in the app table. And how are the ids assigned? Maybe an order by id desc is enough for you.

You can use a windowed COUNT:

SELECT *, COUNT(app.id) OVER(PARTITION BY app.id) AS totalApps 
FROM users 
JOIN app 
  ON app.id = users.id
ORDER BY app.time DESC
LIMIT ?
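
A small sketch of what the windowed query returns, again using Python's sqlite3 with invented sample rows (window functions need SQLite 3.25 or later). Because nothing is grouped, a user with several app rows still appears once per row; each of those rows just carries that user's total count.

```python
import sqlite3

# Schema from the question; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id TEXT PRIMARY KEY);
    CREATE TABLE app (id TEXT, time DATETIME,
                      FOREIGN KEY(id) REFERENCES users(id));
    INSERT INTO users (id) VALUES ('u1'), ('u2');
    INSERT INTO app (id, time) VALUES
        ('u1', '2020-01-01 10:00:00'),
        ('u1', '2020-03-01 10:00:00'),
        ('u2', '2020-02-01 10:00:00');
""")

# The window COUNT is computed per user but does not collapse rows.
rows = conn.execute("""
    SELECT users.id,
           app.time,
           COUNT(app.id) OVER (PARTITION BY app.id) AS totalApps
    FROM users
      JOIN app ON app.id = users.id
    ORDER BY app.time DESC
    LIMIT ?
""", (10,)).fetchall()

for row in rows:
    print(row)  # 'u1' shows up twice, each time with totalApps = 2
```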

The underlying issue is that the SELECT is an aggregate query, because it contains a GROUP BY clause:-

There are two types of simple SELECT statement - aggregate and non-aggregate queries. A simple SELECT statement is an aggregate query if it contains either a GROUP BY clause or one or more aggregate functions in the result-set.

SQL As Understood By SQLite - SELECT

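So a column's value for the group will be an arbitrary one of that column's values within the group (I suspect the first one found by the scan/search, hence the lower values):-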

If the SELECT statement is an aggregate query without a GROUP BY clause, then each aggregate expression in the result-set is evaluated once across the entire dataset. Each non-aggregate expression in the result-set is evaluated once for an arbitrarily selected row of the dataset. The same arbitrarily selected row is used for each non-aggregate expression. Or, if the dataset contains zero rows, then each non-aggregate expression is evaluated against a row consisting entirely of NULL values.

So in short, when it is an aggregate query you cannot rely on the values of columns that are not part of the group/aggregation.

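So the required value has to be retrieved with an aggregate expression, e.g. max(app.time). However, ordering directly by this value did not give the order I expected in my test (I am not sure why; most likely it is because the time column in the test is declared as TEXT, so the values are compared as strings rather than as numbers).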

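What you can do instead is build a CTE from the query and then sort its result, with no aggregation involved in the outer query.

Consider the following, which I believe mimics your issue:-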

DROP TABLE IF EXISTS users;
DROP TABLE IF EXISTS app;

CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, username TEXT);
INSERT INTO users (username) VALUES ('a'),('b'),('c'),('d');

CREATE TABLE app (the_id INTEGER PRIMARY KEY, id INTEGER, appname TEXT, time TEXT);
INSERT INTO app (id,appname,time) VALUES
    (4,'app9',721),(4,'app10',7654),(4,'app11',11),
        (3,'app1',1000),(3,'app2',7),
        (2,'app3',10),(2,'app4',101),(2,'app5',1),
        (1,'app6',15),(1,'app7',7),(1,'app8',212),
        (4,'app9',721),(4,'app10',7654),(4,'app11',11),
        (3,'app1',1000),(3,'app2',7),
        (2,'app3',10),(2,'app4',101),(2,'app5',1),
        (1,'app6',15),(1,'app7',7),(1,'app8',212)
    ;
-- Show the raw data
SELECT * FROM users;
SELECT * FROM app;

-- Aggregate per user and try to sort by each user's latest time
SELECT username
    , count(app.id)
    , max(app.time) AS latest_time
    , min(app.time) AS earliest_time
FROM users JOIN app ON users.id = app.id
GROUP BY users.id
ORDER BY max(app.time)
;

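This results in:-

Although the latest time for each group is extracted, the final result is not sorted as you might expect.

Wrapping it in a CTE can fix this, e.g.:-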

WITH cte1 AS 
(
    SELECT username 
        ,count(app.id) 
        , max(app.time) AS latest_time
        , min(app.time) AS earliest_time
    FROM users JOIN app ON users.id = app.id 
    GROUP BY users.id
)
SELECT * FROM cte1 ORDER BY cast(latest_time AS INTEGER) DESC;

Now:-

  • Note that, for convenience, I used simple integers rather than actual times.
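
A hedged side note on the test above: its time column is declared as TEXT, so SQLite stores the inserted integers as strings, and max() and ORDER BY then compare them as text. The sketch below, using Python's sqlite3 with a cut-down version of the test tables and invented rows, shows how the same data sorts differently with and without a CAST to INTEGER.

```python
import sqlite3

# Cut-down version of the test tables; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    INSERT INTO users (username) VALUES ('a'), ('b');
    CREATE TABLE app (the_id INTEGER PRIMARY KEY, id INTEGER,
                      appname TEXT, time TEXT);
    INSERT INTO app (id, appname, time) VALUES
        (1, 'app1', 9), (1, 'app2', 10),
        (2, 'app3', 80);
""")

# time has TEXT affinity, so max() and the sort compare strings: '9' > '80' > '10'.
as_text = conn.execute("""
    SELECT username, max(app.time) AS latest_time
    FROM users JOIN app ON users.id = app.id
    GROUP BY users.id
    ORDER BY latest_time DESC
""").fetchall()

# Casting before aggregating makes both max() and the sort numeric.
as_int = conn.execute("""
    SELECT username, max(CAST(app.time AS INTEGER)) AS latest_time
    FROM users JOIN app ON users.id = app.id
    GROUP BY users.id
    ORDER BY latest_time DESC
""").fetchall()

print(as_text)
print(as_int)
```

If the real column holds ISO-8601 date strings, text comparison already matches chronological order and no cast should be needed.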