How to retain the first record while using the window function LAG()?
I have data that looks like this.
Actual table:
VIN | Mode | Status | Start | End |
---|---|---|---|---|
ABC123456789 | Mode 1 | Waiting for Auth | 01/01/2010 00:00:00 | 05/05/2014 14:54:54 |
ABC123456789 | Mode 1 | Waiting for URL | 05/05/2014 14:54:54 | 05/13/2014 19:09:51 |
ABC123456789 | Mode 1 | Waiting for User | 05/13/2014 19:09:51 | 11/13/2014 22:26:32 |
ABC123456789 | Mode 1 | Authorized | 11/13/2014 22:26:32 | 11/13/2014 22:31:00 |
ABC123456789 | Mode 1 | Authorized | 11/13/2014 22:31:00 | 11/14/2014 01:23:56 |
ABC123456789 | Mode 2 | Waiting for User | 11/14/2014 01:23:56 | 11/18/2014 19:38:51 |
ABC123456789 | Mode 2 | Waiting for User | 11/18/2014 19:38:51 | 11/18/2014 19:38:54 |
ABC123456789 | Mode 2 | Waiting for User | 11/18/2014 19:38:54 | 11/18/2014 20:07:52 |
ABC123456789 | Mode 2 | Authorized All | 11/18/2014 20:07:52 | 12/17/2014 19:22:50 |
ABC123456789 | Mode 2 | Authorized All | 12/17/2014 19:22:50 | 02/25/2015 20:03:44 |
ABC123456789 | Mode 2 | Authorized All | 02/25/2015 20:03:44 | 02/25/2015 20:03:48 |
ABC123456789 | Mode 3 | Authorized All | 02/25/2015 20:03:48 | 02/25/2015 20:14:05 |
ABC123456789 | Mode 3 | Revoke Auth | 02/25/2015 20:14:05 | 02/25/2015 20:14:29 |
ABC123456789 | Mode 3 | Waiting for Auth | 02/25/2015 20:14:29 | 02/25/2015 20:40:21 |
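I am using the window function query below to get the expected result shown in the output table, but I am unable to retain the first row. How can I achieve this?
Hive query: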
    WITH mma AS
    (
        select VIN,
               Mode,
               Status,
               case when lower(Status) like '%authorized%' then 'Authorized' else 'Deauthorized' end Event,
               Start,
               End
        from ModemAuth
        where VIN = 'ABC123456789'
        order by Start
    )
    select mma2.*
    from (select mma.*,
                 lag(event) over (partition by VIN order by Start) as Prev_Event
          from mma
         ) mma2
    where Prev_Event <> Event
Expected result:
VIN | Mode | Status | Event | Start | End |
---|---|---|---|---|---|
ABC123456789 | Mode 1 | Waiting for Auth | Deauthorized | 01/01/2010 00:00:00 | 05/05/2014 14:54:54 |
ABC123456789 | Mode 1 | Authorized | Authorized | 11/13/2014 22:26:32 | 11/13/2014 22:31:00 |
ABC123456789 | Mode 2 | Waiting for User | Deauthorized | 11/14/2014 01:23:56 | 11/18/2014 19:38:51 |
ABC123456789 | Mode 2 | Authorized All | Authorized | 11/18/2014 20:07:52 | 12/17/2014 19:22:50 |
ABC123456789 | Mode 3 | Revoke Auth | Deauthorized | 02/25/2015 20:14:05 | 02/25/2015 20:14:29 |
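Your where condition is not correct. LAG() returns NULL for the first row of each partition, and NULL <> Event evaluates to NULL rather than true, so that row is filtered out. The condition should be:

    where Prev_Event is null or prev_event <> Event

This is the same logic I gave in my answer to your earlier, similar question. You could accept that answer.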
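To see the effect of the two conditions side by side, here is a minimal sketch that runs the same query shape against the sample rows in an in-memory SQLite database from Python. It is only an illustration, not Hive itself: the table name modem_auth, the column names auth_mode and start_ts, and the run helper are all made up for this example, the End column is left out, and the timestamps are rewritten in ISO format so that ordering them as text matches chronological order. It assumes an SQLite build with window function support (3.25 or later).

```python
import sqlite3

# Sample rows from the question (VIN, mode, status, start time).
# Assumption: timestamps are rewritten as ISO strings so text ordering is chronological.
ROWS = [
    ("ABC123456789", "Mode 1", "Waiting for Auth", "2010-01-01 00:00:00"),
    ("ABC123456789", "Mode 1", "Waiting for URL", "2014-05-05 14:54:54"),
    ("ABC123456789", "Mode 1", "Waiting for User", "2014-05-13 19:09:51"),
    ("ABC123456789", "Mode 1", "Authorized", "2014-11-13 22:26:32"),
    ("ABC123456789", "Mode 1", "Authorized", "2014-11-13 22:31:00"),
    ("ABC123456789", "Mode 2", "Waiting for User", "2014-11-14 01:23:56"),
    ("ABC123456789", "Mode 2", "Waiting for User", "2014-11-18 19:38:51"),
    ("ABC123456789", "Mode 2", "Waiting for User", "2014-11-18 19:38:54"),
    ("ABC123456789", "Mode 2", "Authorized All", "2014-11-18 20:07:52"),
    ("ABC123456789", "Mode 2", "Authorized All", "2014-12-17 19:22:50"),
    ("ABC123456789", "Mode 2", "Authorized All", "2015-02-25 20:03:44"),
    ("ABC123456789", "Mode 3", "Authorized All", "2015-02-25 20:03:48"),
    ("ABC123456789", "Mode 3", "Revoke Auth", "2015-02-25 20:14:05"),
    ("ABC123456789", "Mode 3", "Waiting for Auth", "2015-02-25 20:14:29"),
]

# Same structure as the Hive query; only the final predicate is swapped in.
QUERY = """
WITH mma AS (
    SELECT vin, auth_mode, status, start_ts,
           CASE WHEN lower(status) LIKE '%authorized%'
                THEN 'Authorized' ELSE 'Deauthorized' END AS event
    FROM modem_auth
),
mma2 AS (
    SELECT mma.*,
           lag(event) OVER (PARTITION BY vin ORDER BY start_ts) AS prev_event
    FROM mma
)
SELECT status, event, start_ts
FROM mma2
WHERE {predicate}
ORDER BY start_ts
"""


def run(predicate):
    """Hypothetical helper: load the sample rows and run QUERY with the given WHERE predicate."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE modem_auth (vin TEXT, auth_mode TEXT, status TEXT, start_ts TEXT)"
    )
    conn.executemany("INSERT INTO modem_auth VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY.format(predicate=predicate)).fetchall()
    conn.close()
    return rows


# Original condition: the first row has prev_event = NULL, so the comparison is NULL and the row is dropped.
original = run("prev_event <> event")
# Corrected condition: the NULL case is kept explicitly.
fixed = run("prev_event IS NULL OR prev_event <> event")

if __name__ == "__main__":
    print("original:", len(original), "rows")
    for row in fixed:
        print(row)
```

With the corrected condition the first 'Waiting for Auth' row is returned in addition to the rows where the event changes, which matches the expected result table in the question.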