Postgres Window Function and extracted/averaged duration between timestamps

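I've been reading SO for years, but this is my first post. Hopefully someone can help me with this problem.

I'm new to window functions, but from what I understand they seem to be what I'm looking for. I have 3 tables: users, tasks, and task_users. One or more users can be assigned to a task (via task_users). What I'd like to see is a table showing the following: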

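User ID
User full name
How many tasks have been issued to that user (occurrences)
The average duration of all tasks issued to that user (average_duration)

The method I currently use to extract the duration of a single task is: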

(EXTRACT(EPOCH FROM closed_at) - EXTRACT(EPOCH FROM started_at)) / 3600 AS duration
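One thing worth checking in a duration expression like this is operator precedence: in SQL, `/` binds more tightly than `-`, so the subtraction has to be parenthesized for the whole difference to be converted from seconds to hours. A minimal sketch, using literal timestamps that are made up purely for illustration:

```sql
-- Hypothetical literal timestamps two hours apart, for illustration only.
SELECT
    -- Without parentheses only the second epoch value is divided by 3600.
    EXTRACT(EPOCH FROM TIMESTAMP '2024-01-01 12:00:00')
      - EXTRACT(EPOCH FROM TIMESTAMP '2024-01-01 10:00:00') / 3600 AS unparenthesized,
    -- With parentheses the difference in seconds is converted to hours.
    (EXTRACT(EPOCH FROM TIMESTAMP '2024-01-01 12:00:00')
      - EXTRACT(EPOCH FROM TIMESTAMP '2024-01-01 10:00:00')) / 3600 AS hours,
    -- Alternative: subtract the timestamps to get an interval, then take its epoch.
    EXTRACT(EPOCH FROM (TIMESTAMP '2024-01-01 12:00:00'
                        - TIMESTAMP '2024-01-01 10:00:00')) / 3600 AS hours_from_interval;
```

The last two columns should both come out as 2 hours, while the first is a large number close to the raw epoch value.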

Here are the columns of interest in each table:

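users
id
last_name
first_name

tasks
id
started_at (timestamp w/o tz)
closed_at (timestamp w/o tz)

task_users
task_id (references tasks.id)
user_id (references users.id)

Using the SQL below, I can produce a table showing each user, their ID, and the number of tasks issued to that user: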
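For anyone who wants to reproduce this, the layout above could be sketched as the following DDL. The key and name column types are assumptions, since only the timestamp types and the foreign-key references are given:

```sql
-- Sketch of the schema described above; integer keys and text names are assumed.
CREATE TABLE users (
    id         integer PRIMARY KEY,
    last_name  text,
    first_name text
);

CREATE TABLE tasks (
    id         integer PRIMARY KEY,
    started_at timestamp without time zone,
    closed_at  timestamp without time zone  -- NULL while the task is still open
);

-- Join table: one row per (task, user) assignment.
CREATE TABLE task_users (
    task_id integer REFERENCES tasks (id),
    user_id integer REFERENCES users (id)
);
```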

SELECT 
    users.id AS u_id,
    concat(users.last_name, ', ', users.first_name) AS u_name, 
    COUNT(*) AS occurrences
FROM tasks
INNER JOIN task_users ON task_users.task_id = tasks.id
INNER JOIN users ON users.id = task_users.user_id
WHERE tasks.closed_at IS NOT NULL 
GROUP BY u_id
ORDER BY occurrences DESC

This query shows:

----------------------------------
id    u_name           occurrences
----------------------------------
1  |  Mike Smith     | 10
2  |  Dave Johnson   | 5
3  |  George Wilson  | 3
etc...

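What I'd like to produce is the same table as above, but with the average duration of all tasks issued to each user (the number of hours each task took to complete). Something like the following: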

------------------------------------------------------
id    u_name           occurrences    average_duration
------------------------------------------------------
1  |  Mike Smith     | 10           | 32.7
2  |  Dave Johnson   | 5            | 15.2
3  |  George Wilson  | 3            | 10.0
etc...

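I tried the following with a subquery and a window function, but it splits each user into multiple rows (each user appears in as many rows as their occurrences count).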

SELECT 
    users.id AS u_id,
    concat(users.last_name, ', ', users.first_name) AS u_name, 
    COUNT(*) AS occurrences,
    AVG(tsk.duration) OVER(PARTITION BY users.id) AS average_duration
FROM 
    (SELECT id, (EXTRACT(EPOCH from closed_at) - EXTRACT(EPOCH from started_at)) / 3600 AS duration FROM tasks) tsk
INNER JOIN task_users ON tsk.id = task_users.task_id
INNER JOIN users ON users.id = task_users.user_id

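I'm new to window functions and no SQL guru, but it seems to me that a window function is the best solution?

If anyone can point me in the right direction or offer suggestions, I'd greatly appreciate it.

Thanks!

A window function returns a value for every row; it does not collapse rows the way GROUP BY does. In your scenario a user has multiple tasks, so the join produces multiple rows per user.

You could try either of the following, which modify your original approach of grouping the data by user. The first computes the average as SUM / COUNT, the second uses AVG directly: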
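To make the difference concrete, here is a small sketch on made-up data. The `durations` CTE and its values are hypothetical and stand in for the joined tasks:

```sql
-- Window version: one output row per input row (3 rows),
-- with the per-user average repeated on each of that user's rows.
WITH durations(user_id, duration) AS (
    VALUES (1, 10.0), (1, 20.0), (2, 5.0)
)
SELECT user_id,
       AVG(duration) OVER (PARTITION BY user_id) AS average_duration
FROM durations;

-- Grouped version: one output row per user (2 rows).
WITH durations(user_id, duration) AS (
    VALUES (1, 10.0), (1, 20.0), (2, 5.0)
)
SELECT user_id,
       COUNT(*)      AS occurrences,
       AVG(duration) AS average_duration
FROM durations
GROUP BY user_id;
```

User 1 averages 15 in both queries; the only difference is that the window version repeats that value on both of user 1's rows.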

SELECT 
    users.id AS u_id,
    concat(users.last_name, ', ', users.first_name) AS u_name, 
    COUNT(*) AS occurrences,
    SUM(
        (EXTRACT(EPOCH from closed_at) - EXTRACT(EPOCH from started_at)) / 3600
    ) / COUNT(*) as average_duration
FROM tasks
INNER JOIN task_users ON task_users.task_id = tasks.id
INNER JOIN users ON users.id = task_users.user_id
WHERE tasks.closed_at IS NOT NULL 
GROUP BY u_id
ORDER BY occurrences DESC

SELECT 
    users.id AS u_id,
    concat(users.last_name, ', ', users.first_name) AS u_name, 
    COUNT(*) AS occurrences,
    AVG(
        (EXTRACT(EPOCH from closed_at) - EXTRACT(EPOCH from started_at)) / 3600
    ) as average_duration
FROM tasks
INNER JOIN task_users ON task_users.task_id = tasks.id
INNER JOIN users ON users.id = task_users.user_id
WHERE tasks.closed_at IS NOT NULL 
GROUP BY u_id
ORDER BY occurrences DESC
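If you want to check the grouped approach without touching real tables, the sketch below runs it against inline sample data. The sample rows, the table aliases, the rounding to one decimal, and the interval-based duration are my own choices for illustration, not something from your schema:

```sql
-- Hypothetical sample data supplied through CTEs that shadow the table names.
WITH users(id, last_name, first_name) AS (
    VALUES (1, 'Smith', 'Mike'), (2, 'Johnson', 'Dave')
),
tasks(id, started_at, closed_at) AS (
    VALUES
        (1, TIMESTAMP '2024-01-01 08:00', TIMESTAMP '2024-01-01 18:00'),  -- 10 hours
        (2, TIMESTAMP '2024-01-02 08:00', TIMESTAMP '2024-01-03 04:00'),  -- 20 hours
        (3, TIMESTAMP '2024-01-04 09:00', NULL)                           -- still open
),
task_users(task_id, user_id) AS (
    VALUES (1, 1), (2, 1), (2, 2), (3, 2)
)
SELECT
    u.id AS u_id,
    concat(u.last_name, ', ', u.first_name) AS u_name,
    COUNT(*) AS occurrences,
    -- Subtracting timestamps yields an interval; its epoch is the length in seconds.
    ROUND(AVG(EXTRACT(EPOCH FROM (t.closed_at - t.started_at)) / 3600)::numeric, 1)
        AS average_duration
FROM tasks t
INNER JOIN task_users tu ON tu.task_id = t.id
INNER JOIN users u ON u.id = tu.user_id
WHERE t.closed_at IS NOT NULL
GROUP BY u.id, u.last_name, u.first_name
ORDER BY occurrences DESC;
```

With this data, Smith should show 2 occurrences averaging 15.0 hours and Johnson 1 occurrence at 20.0 hours; the open task is excluded by the WHERE clause.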

Let me know if this works for you.