How to turn a column of timestamps into two columns depending on a separate column without splitting rows
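I have a table with one column for timestamps and another for status. I want to get, in a single row, the timestamp at which the status was checked in and the timestamp at which it was completed. When I try a CASE statement, the result ends up split across two rows. I would like it to return one row with a value in each timestamp column, rather than two rows that each have a NULL where the other row has a value.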
CASE WHEN aud.STATUS_DESCRIPTION = 'CHECKED_IN' THEN aud.STATUS_DATETIME
END AS "Check-In Time",
CASE WHEN aud.STATUS_DESCRIPTION = 'COMPLETED' THEN aud.STATUS_DATETIME
END AS "Completed Time",
Table with statuses and timestamps
What my case statement is returning
Thanks for your help.
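This happens because your data spans multiple rows.
You either need to do some form of aggregation, so GROUP BY and then an aggregate function like MIN/MAX,
or you need to narrow the data down to what you want and then use PIVOT to do the aggregation for you.
The first might look like: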
SELECT
some_column_a,
some_column_b,
MAX(IFF( aud.status_description = 'CHECKED_IN', aud.status_datetime, null)) as check_in_time,
MAX(IFF( aud.status_description = 'COMPLETED', aud.status_datetime, null)) as complete_time
FROM your_table AS aud -- placeholder table name
GROUP BY some_column_a, some_column_b
ORDER BY some_column_a, some_column_b;
So, adding a working example:
WITH data AS (
SELECT to_date(column1) as STATUS_DATETIME,
column2 as STATUS_DESCRIPTION,
column3 as customer_id
FROM VALUES
('2021-12-11 11:12:03','CREATED', 1),
('2021-12-11 11:12:03','CHECKED_IN', 1),
('2021-12-11 11:22:49','PROGRESS', 1),
('2021-12-11 11:55:03','COMPLETED', 1),
('2021-10-11 11:55:03','COMPLETED', 0)
)
SELECT
aud.customer_id,
MAX(IFF( aud.status_description = 'CHECKED_IN', aud.status_datetime, null)) as check_in_time,
MAX(IFF( aud.status_description = 'COMPLETED', aud.status_datetime, null)) as complete_time
FROM data as aud
GROUP BY 1
ORDER BY 1;
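To see why the bare CASE splits the result and why wrapping it in MAX with GROUP BY collapses it, here is a minimal sketch that can be run without a Snowflake account. It assumes Python's built-in sqlite3 module instead of Snowflake, uses CASE in place of IFF for portability, and stores the timestamps as text; the table name aud is only borrowed from the alias in the question.

```python
import sqlite3

# In-memory SQLite database standing in for Snowflake (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aud (status_datetime TEXT, status_description TEXT, customer_id INTEGER)"
)
conn.executemany(
    "INSERT INTO aud VALUES (?, ?, ?)",
    [
        ("2021-12-11 11:12:03", "CREATED", 1),
        ("2021-12-11 11:12:03", "CHECKED_IN", 1),
        ("2021-12-11 11:22:49", "PROGRESS", 1),
        ("2021-12-11 11:55:03", "COMPLETED", 1),
        ("2021-10-11 11:55:03", "COMPLETED", 0),
    ],
)

# CASE on its own is evaluated per input row, so each status keeps its own
# output row and the other column is NULL.
per_row = conn.execute(
    """
    SELECT customer_id,
           CASE WHEN status_description = 'CHECKED_IN' THEN status_datetime END,
           CASE WHEN status_description = 'COMPLETED' THEN status_datetime END
    FROM aud
    WHERE status_description IN ('CHECKED_IN', 'COMPLETED')
    ORDER BY customer_id, status_datetime
    """
).fetchall()

# MAX ignores NULLs, so grouping by customer folds those rows into one.
one_per_customer = conn.execute(
    """
    SELECT customer_id,
           MAX(CASE WHEN status_description = 'CHECKED_IN' THEN status_datetime END) AS check_in_time,
           MAX(CASE WHEN status_description = 'COMPLETED' THEN status_datetime END) AS complete_time
    FROM aud
    GROUP BY customer_id
    ORDER BY customer_id
    """
).fetchall()

print(len(per_row))       # 3 rows: customer 1 is split across two of them
print(one_per_customer)   # one row per customer
```

Customer 0 still appears in the grouped result, with a NULL check-in time, because the aggregate simply finds no CHECKED_IN row to pick from.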
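The GROUP BY approach above works well if you have many customer_id values and many entries per customer_id. If your tables are small and you never have two records in the 'COMPLETED' state for the same customer, then a JOIN can work as well: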
WITH data AS (
SELECT to_date(column1) as STATUS_DATETIME,
column2 as STATUS_DESCRIPTION,
column3 as customer_id
FROM VALUES
('2021-12-11 11:12:03','CREATED', 1),
('2021-12-11 11:12:03','CHECKED_IN', 1),
('2021-12-11 11:22:49','PROGRESS', 1),
('2021-12-11 11:55:03','COMPLETED', 1),
('2021-10-11 11:55:03','COMPLETED', 0)
)
SELECT
checked.customer_id,
checked.status_datetime as check_in_time,
completed.status_datetime as complete_time
FROM data as checked
JOIN data as completed
ON checked.customer_id = completed.customer_id
AND checked.STATUS_DESCRIPTION = 'CHECKED_IN'
AND completed.STATUS_DESCRIPTION = 'COMPLETED'
;
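The join does not work if a customer does not have both a 'COMPLETED' and a 'CHECKED_IN' row. With the SQL above there is no row for customer_id 0, because that customer only has one of the two statuses.
So you need a FULL OUTER JOIN, and it then makes sense to move the filters into CTEs (or sub-selects), like so: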
WITH data AS (
SELECT to_date(column1) as STATUS_DATETIME,
column2 as STATUS_DESCRIPTION,
column3 as customer_id
FROM VALUES
('2021-12-11 11:12:03','CREATED', 1),
('2021-12-11 11:12:03','CHECKED_IN', 1),
('2021-12-11 11:22:49','PROGRESS', 1),
('2021-12-11 11:55:03','COMPLETED', 1),
('2021-10-11 11:55:03','COMPLETED', 0)
), completed_data AS (
SELECT STATUS_DATETIME, STATUS_DESCRIPTION, customer_id
FROM data
WHERE STATUS_DESCRIPTION = 'COMPLETED'
), checked_in_data AS (
SELECT STATUS_DATETIME, STATUS_DESCRIPTION, customer_id
FROM data
WHERE STATUS_DESCRIPTION = 'CHECKED_IN'
)
SELECT
COALESCE(checked.customer_id, completed.customer_id) AS customer_id,
checked.status_datetime as check_in_time,
completed.status_datetime as complete_time
FROM checked_in_data as checked
FULL OUTER JOIN completed_data as completed
ON checked.customer_id = completed.customer_id
ORDER BY 1,2;
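This gives the output (dates only, because the sample data is cast with to_date):
CUSTOMER_ID | CHECK_IN_TIME | COMPLETE_TIME |
---|---|---|
0 | NULL | 2021-10-11 |
1 | 2021-12-11 | 2021-12-11 |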
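The difference between the plain join and the outer join can be checked locally with a small sketch. It assumes Python's sqlite3 module rather than Snowflake, and because older SQLite builds lack FULL OUTER JOIN it swaps in an equivalent for this data: a list of distinct customer ids (the hypothetical CTE ids) with two LEFT JOINs. That is a substitute technique for illustration, not the query from the answer.

```python
import sqlite3

# In-memory SQLite database standing in for Snowflake (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (status_datetime TEXT, status_description TEXT, customer_id INTEGER)"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        ("2021-12-11 11:12:03", "CREATED", 1),
        ("2021-12-11 11:12:03", "CHECKED_IN", 1),
        ("2021-12-11 11:22:49", "PROGRESS", 1),
        ("2021-12-11 11:55:03", "COMPLETED", 1),
        ("2021-10-11 11:55:03", "COMPLETED", 0),
    ],
)

# Inner self-join: a customer needs BOTH statuses to produce a row.
inner_rows = conn.execute(
    """
    SELECT checked.customer_id, checked.status_datetime, completed.status_datetime
    FROM data AS checked
    JOIN data AS completed
      ON checked.customer_id = completed.customer_id
     AND checked.status_description = 'CHECKED_IN'
     AND completed.status_description = 'COMPLETED'
    ORDER BY 1
    """
).fetchall()

# Outer-join behaviour, emulated with a list of ids and two LEFT JOINs:
# a customer with only one of the two statuses keeps its row.
outer_rows = conn.execute(
    """
    WITH ids AS (
        SELECT DISTINCT customer_id
        FROM data
        WHERE status_description IN ('CHECKED_IN', 'COMPLETED')
    )
    SELECT ids.customer_id, checked.status_datetime, completed.status_datetime
    FROM ids
    LEFT JOIN data AS checked
      ON checked.customer_id = ids.customer_id
     AND checked.status_description = 'CHECKED_IN'
    LEFT JOIN data AS completed
      ON completed.customer_id = ids.customer_id
     AND completed.status_description = 'COMPLETED'
    ORDER BY 1
    """
).fetchall()

print(inner_rows)  # customer 0 is missing
print(outer_rows)  # customer 0 is present with a NULL check-in time
```

Like the plain join, this still produces one output row per matching pair, so it relies on each customer having at most one row per status.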
I would start with a self-join.
SELECT
checked.STATUS_DATETIME as CHECKED_IN_TIME,
completed.STATUS_DATETIME as COMPLETED_TIME
FROM
yourtable as checked
JOIN
yourtable as completed
ON ....
This is just an example of how to use PIVOT, as a complement to Simeon's answer, using the sample data from the image provided.
Table creation and data insertion:
create or replace temporary table _temp (
ts timestamp_ntz,
_status varchar
);
insert into _temp
values ('2021-12-11 11:12:03','created'),
('2021-12-11 11:12:03','checked_in'),
('2021-12-11 11:22:49','progress'),
('2021-12-11 11:55:03','completed');
Pivot query:
select *
from _temp
pivot(max(ts) for _status in ('checked_in', 'completed')) as p;
Result:
'checked_in' | 'completed' |
---|---|
2021-12-11 11:12:03.000 | 2021-12-11 11:55:03.000 |
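Note that I used the MAX aggregate function, which can be replaced by other aggregate functions. With only these two columns the query always returns a single row; to get a better feel for PIVOT, add another column and look at the examples in the PIVOT documentation.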
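To get a feel for what adding another column changes, here is a rough sketch. SQLite has no PIVOT, so it assumes Python's sqlite3 module and emulates the pivot with MAX(CASE ...), on the understanding that PIVOT aggregates per combination of the remaining columns. The customer_id column and the second customer's row are hypothetical additions for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for Snowflake (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE _temp (customer_id INTEGER, ts TEXT, _status TEXT)")
conn.executemany(
    "INSERT INTO _temp VALUES (?, ?, ?)",
    [
        (1, "2021-12-11 11:12:03", "created"),
        (1, "2021-12-11 11:12:03", "checked_in"),
        (1, "2021-12-11 11:22:49", "progress"),
        (1, "2021-12-11 11:55:03", "completed"),
        (2, "2021-12-12 09:00:00", "checked_in"),  # hypothetical second customer
    ],
)

# No extra column to group on: everything folds into one row,
# which mixes timestamps from different customers.
whole_table = conn.execute(
    """
    SELECT MAX(CASE WHEN _status = 'checked_in' THEN ts END),
           MAX(CASE WHEN _status = 'completed' THEN ts END)
    FROM _temp
    """
).fetchall()

# With customer_id as the extra column: one row per customer.
per_customer = conn.execute(
    """
    SELECT customer_id,
           MAX(CASE WHEN _status = 'checked_in' THEN ts END),
           MAX(CASE WHEN _status = 'completed' THEN ts END)
    FROM _temp
    GROUP BY customer_id
    ORDER BY customer_id
    """
).fetchall()

print(whole_table)
print(per_customer)
```

The single-row result takes its check-in time from customer 2 and its completed time from customer 1, which is why a real table usually needs an identifying column alongside the pivoted ones.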