Postgres window function and extracted/averaged duration between timestamps
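I've been reading SO for years, but this is my first post. Hopefully someone can help me with this.
I'm new to window functions, but from what I understand they seem to be what I'm looking for. I have 3 tables: users, tasks, and task_users. One or more users can be assigned to a task (via task_users). What I would like to see is a table that shows the following:
User ID
User's full name
How many tasks were issued to that user (occurrences)
Average duration of all tasks issued to that user (average_duration)
What I am currently using to extract the duration of a single task is:
EXTRACT(EPOCH FROM closed_at) - EXTRACT(EPOCH FROM started_at)/3600 AS duration
Here are the columns of interest in each table:
users
id
last_name
first_name
tasks
id
started_at (timestamp w/o tz)
closed_at (timestamp w/o tz)
task_users
task_id (references tasks.id)
user_id (references users.id)
Using the SQL below, I can generate a table showing each user, their ID, and the number of tasks issued to that user: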
SELECT
users.id AS u_id,
concat(users.last_name, ', ', users.first_name) AS u_name,
COUNT(*) AS occurrences
FROM tasks
INNER JOIN task_users ON task_users.task_id = tasks.id
INNER JOIN users ON users.id = task_users.user_id
WHERE tasks.closed_at IS NOT NULL
GROUP BY u_id
ORDER BY occurrences DESC
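This query displays:
------------------------------------
u_id | u_name        | occurrences
------------------------------------
1    | Mike Smith    | 10
2    | Dave Johnson  | 5
3    | George Wilson | 3
etc...
What I would like to generate is the same table as above, but with the average duration of all tasks issued to each user (the number of hours it took to complete each task). Something like the following:
-------------------------------------------------------
u_id | u_name        | occurrences | average_duration
-------------------------------------------------------
1    | Mike Smith    | 10          | 32.7
2    | Dave Johnson  | 5           | 15.2
3    | George Wilson | 3           | 10.0
etc...
I tried the following subquery and window function, but it splits each user into multiple rows (each user appears in as many rows as their number of occurrences).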
SELECT
users.id AS u_id,
concat(users.last_name, ', ', users.first_name) AS u_name,
COUNT(*) AS occurrences,
AVG(tsk.duration) OVER(PARTITION BY users.id) AS average_duration
FROM
(SELECT id, (EXTRACT(EPOCH from closed_at) - EXTRACT(EPOCH from started_at)/3600) AS duration FROM tasks) tsk
INNER JOIN task_users ON tsk.id = task_users.task_id
INNER JOIN users ON users.id = task_users.user_id
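I'm new to window functions and no SQL guru, but it seems to me that a window function is the best solution?
If anyone could point me in the right direction or offer suggestions, I'd be grateful.
Thanks!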
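A window function returns a value for every row; unlike GROUP BY, it does not collapse rows. In your scenario a user has multiple tasks, so the join produces multiple rows per user.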
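To make the difference concrete, here is a minimal sketch using made-up inline data (the `task_hours` rows and their values are hypothetical, not taken from your tables). The first statement uses the window form and keeps every input row; the second groups and returns one row per user.

```sql
-- Hypothetical sample data: (user_id, task duration in hours).
-- Window form: every input row is kept, and the per-user
-- average is repeated on each of that user's rows.
WITH task_hours(user_id, hours) AS (
    VALUES (1, 10.0), (1, 20.0), (2, 5.0)
)
SELECT user_id,
       AVG(hours) OVER (PARTITION BY user_id) AS average_duration
FROM task_hours;
-- returns 3 rows: user 1 appears twice, user 2 once

-- Grouped form: rows are collapsed to one per user.
WITH task_hours(user_id, hours) AS (
    VALUES (1, 10.0), (1, 20.0), (2, 5.0)
)
SELECT user_id,
       COUNT(*)   AS occurrences,
       AVG(hours) AS average_duration
FROM task_hours
GROUP BY user_id;
-- returns 2 rows: one for user 1, one for user 2
```

If you wanted to stay with the window form, adding `DISTINCT` would collapse the repeated rows, but a plain `GROUP BY` is usually the simpler fit for a per-user summary like this.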
You can try the following, which modifies your original approach of grouping the data by user:
SELECT
users.id AS u_id,
concat(users.last_name, ', ', users.first_name) AS u_name,
COUNT(*) AS occurrences,
SUM(
-- parenthesize the subtraction so the whole difference (in seconds) is divided by 3600
(EXTRACT(EPOCH FROM closed_at) - EXTRACT(EPOCH FROM started_at)) / 3600
) / COUNT(*) AS average_duration
FROM tasks
INNER JOIN task_users ON task_users.task_id = tasks.id
INNER JOIN users ON users.id = task_users.user_id
WHERE tasks.closed_at IS NOT NULL
GROUP BY u_id
ORDER BY occurrences DESC
or
SELECT
users.id AS u_id,
concat(users.last_name, ', ', users.first_name) AS u_name,
COUNT(*) AS occurrences,
AVG(
-- parenthesize the subtraction so the whole difference (in seconds) is divided by 3600
(EXTRACT(EPOCH FROM closed_at) - EXTRACT(EPOCH FROM started_at)) / 3600
) AS average_duration
FROM tasks
INNER JOIN task_users ON task_users.task_id = tasks.id
INNER JOIN users ON users.id = task_users.user_id
WHERE tasks.closed_at IS NOT NULL
GROUP BY u_id
ORDER BY occurrences DESC
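As a side note, since both columns are timestamps, subtracting them directly yields an interval, and the hours can be taken from that instead. Here is a minimal sketch with a single made-up pair of timestamps (hypothetical values, not from your data); it also shows why the subtraction needs parentheses before dividing by 3600.

```sql
-- Hypothetical sample row: a task started at 08:00 and closed at 20:30 the same day.
SELECT
    -- Without parentheses, only the second EXTRACT is divided by 3600,
    -- so the result is not a number of hours.
    EXTRACT(EPOCH FROM closed_at) - EXTRACT(EPOCH FROM started_at) / 3600   AS unparenthesized,
    -- Parenthesized: (difference in seconds) / 3600 = 12.5 hours.
    (EXTRACT(EPOCH FROM closed_at) - EXTRACT(EPOCH FROM started_at)) / 3600 AS hours,
    -- Equivalent: subtract the timestamps first, then take the epoch of the interval.
    EXTRACT(EPOCH FROM (closed_at - started_at)) / 3600                     AS hours_from_interval
FROM (
    VALUES (TIMESTAMP '2024-01-01 08:00:00', TIMESTAMP '2024-01-01 20:30:00')
) AS t(started_at, closed_at);
```

Either of the last two expressions could be dropped into the `AVG(...)` call; which one reads better is mostly a matter of taste.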
Let me know if this works for you.