Presto/SQL: find the next value that meets a condition
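I have a table like the one below. The first 3 columns are the original table, and the last 2 columns are what I want to compute. For each record, I want to know:
- whether an "issued" event happens within the 30 days following the record
- if "issued" did happen, the timestamp of that "issued" event
I was able to determine whether "issued" happened by replacing "issued" with 1 in the original table, self-joining, and summing the replaced value with sum(case when t2.event_ts between t1.event_ts and date_add('day',30,t1.event_ts) then t2.issued end) as issued_in_month.
But I am not sure how to get the "issued" timestamp.
How can I do this in Presto?
Thanks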
| event_id | event_ts | event_name | issued_within_30_days | issued_ts |
|----------------------------------|---------------|------------|-----------------------|---------------|
| fc25df28c8bb188e8d15c1fea4acd06a | 6/1/18 21:31 | CLICK | 1 | 6/27/18 10:18 |
| 2b48d5ea80829cdd1014a68631160eb0 | 6/6/18 17:25 | OPEN | 1 | 6/27/18 10:18 |
| f4f6d9af7217031e5579c6d741fe3d21 | 6/8/18 0:00 | MAIL_DROP | 1 | 6/27/18 10:18 |
| 4a5af238e28688bf899def0905210550 | 6/10/18 4:25 | OPEN | 1 | 6/27/18 10:18 |
| a7945e7087c27d7f6bd7c66a397bf6bf | 6/25/18 17:53 | CLICK | 1 | 6/27/18 10:18 |
| 82eba7f62a9398347d67073291547765 | 6/27/18 4:58 | OPEN | 1 | 6/27/18 10:18 |
| 6d6984822e98009a0a30deda1204a5e0 | 6/27/18 4:58 | CLICK | 1 | 6/27/18 10:18 |
| 6229316a4ebcb56124e1180c9e2f60be | 6/27/18 10:18 | ISSUED | 1 | 6/27/18 10:18 |
| 3994ed57abc330e16965f13a2a37edd8 | 6/27/18 11:19 | SEND | 0 | |
| 70f0a4d73975f26523415625309f1f64 | 7/4/18 17:30 | SEND | 0 | |
| a7dd44648a0ce0029ad1ec7106fcf95c | 7/4/18 18:37 | OPEN | 0 | |
| 8a823394754276a46fa3021ddc7a7290 | 7/6/18 0:00 | MAIL_DROP | 0 | |
| 7a4c451554ba08da59a85a3fa61de7b6 | 7/6/18 20:18 | SEND | 0 | |
| f992b7a5a25f765e5fd6f7dde129f9dc | 7/7/18 1:16 | OPEN | 0 | |
| 9370e4901a4796371827e2f45862ba66 | 7/11/18 4:35 | CLICK | 0 | |
| 96badd7253344f79c7ac2a47032d5942 | 2/22/19 17:20 | ISSUED | 0 | |
The simplest method is probably a cumulative minimum plus a couple of case expressions:
    select t.*,
           (case when next_issued_ts < event_ts + interval '30' day
                 then 1 else 0
            end) as issued_within_30_days,
           (case when next_issued_ts < event_ts + interval '30' day
                 then next_issued_ts
            end) as issued_ts
    from (select t.*,
                 -- rows are ordered newest first, so the running minimum is the
                 -- earliest ISSUED timestamp at or after this row's event_ts
                 min(case when event_name = 'ISSUED' then event_ts end) over
                     (order by event_ts desc) as next_issued_ts
          from t
         ) t
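
To try the idea without touching a real table, here is a minimal self-contained sketch. It assumes Presto syntax; the `events` and `with_next` names are hypothetical, the inline sample is four rows copied from the question's table, and the timestamps are assumed to be 2018 and 2019 dates in month/day/year order. The window frame is spelled out explicitly, but it is the same frame the query above gets by default.

```sql
-- Sketch only: `events` is a hypothetical inline sample, not the real table.
with events as (
    select *
    from (
        values
            ('a7945e7087c27d7f6bd7c66a397bf6bf', timestamp '2018-06-25 17:53:00', 'CLICK'),
            ('6229316a4ebcb56124e1180c9e2f60be', timestamp '2018-06-27 10:18:00', 'ISSUED'),
            ('3994ed57abc330e16965f13a2a37edd8', timestamp '2018-06-27 11:19:00', 'SEND'),
            ('96badd7253344f79c7ac2a47032d5942', timestamp '2019-02-22 17:20:00', 'ISSUED')
    ) as v (event_id, event_ts, event_name)
),
with_next as (
    select e.*,
           -- Rows are sorted newest first, so "unbounded preceding .. current row"
           -- covers the current row and every row with a later event_ts.
           min(case when event_name = 'ISSUED' then event_ts end) over (
               order by event_ts desc
               range between unbounded preceding and current row
           ) as next_issued_ts
    from events e
)
select event_id,
       event_ts,
       event_name,
       case when next_issued_ts < event_ts + interval '30' day then 1 else 0 end as issued_within_30_days,
       case when next_issued_ts < event_ts + interval '30' day then next_issued_ts end as issued_ts
from with_next
order by event_ts
```

Because the frame includes the current row, an ISSUED row matches itself. That agrees with the 6/27/18 10:18 row in the question's table, but it also means the final 2/22/19 ISSUED row would get a flag of 1 here, whereas the question's table shows 0 for it. If 0 is really wanted for that row, it would need to be handled as a separate case.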