如何在使用Window 函数LAG() 时保留第一条记录?

How to retain the first record while using Window function LAG()?

我有一个数据看起来像 -

实际Table-

VIN Mode Status Start End
ABC123456789 Mode 1 Waiting for Auth 01/01/2010 00:00:00 05/05/2014 14:54:54
ABC123456789 Mode 1 Waiting for URL 05/05/2014 14:54:54 05/13/2014 19:09:51
ABC123456789 Mode 1 Waiting for User 05/13/2014 19:09:51 11/13/2014 22:26:32
ABC123456789 Mode 1 Authorized 11/13/2014 22:26:32 11/13/2014 22:31:00
ABC123456789 Mode 1 Authorized 11/13/2014 22:31:00 11/14/2014 01:23:56
ABC123456789 Mode 2 Waiting for User 11/14/2014 01:23:56 11/18/2014 19:38:51
ABC123456789 Mode 2 Waiting for User 11/18/2014 19:38:51 11/18/2014 19:38:54
ABC123456789 Mode 2 Waiting for User 11/18/2014 19:38:54 11/18/2014 20:07:52
ABC123456789 Mode 2 Authorized All 11/18/2014 20:07:52 12/17/2014 19:22:50
ABC123456789 Mode 2 Authorized All 12/17/2014 19:22:50 02/25/2015 20:03:44
ABC123456789 Mode 2 Authorized All 02/25/2015 20:03:44 02/25/2015 20:03:48
ABC123456789 Mode 3 Authorized All 02/25/2015 20:03:48 02/25/2015 20:14:05
ABC123456789 Mode 3 Revoke Auth 02/25/2015 20:14:05 02/25/2015 20:14:29
ABC123456789 Mode 3 Waiting for Auth 02/25/2015 20:14:29 02/25/2015 20:40:21

我正在使用下面的 window 函数查询来获得结果输出 table 中所示的预期结果。 但是我无法保留第一行。如何实现?

Hive 查询 -

WITH mma AS
(
select VIN, 
       Mode, 
       Status,
       case when lower(Status) like '%authorized%' then 'Authorized' else 'Deauthorized' end Event,
       Start, 
       End 
  from ModemAuth 
 where VIN = 'ABC123456789'
 order by Start
 )

select mma2.*
from (select mma.*,
             lag(event) over (partition by VIN order by Start) as Prev_Event
        from mma
     ) mma2
where Prev_Event <> Event

预期结果 -

VIN Mode Status Event Start End
ABC123456789 Mode 1 Waiting for Auth Deauthorized 01/01/2010 00:00:00 05/05/2014 14:54:54
ABC123456789 Mode 1 Authorized Authorized 11/13/2014 22:26:32 11/13/2014 22:31:00
ABC123456789 Mode 2 Waiting for User Deauthorized 11/14/2014 01:23:56 11/18/2014 19:38:51
ABC123456789 Mode 2 Authorized All Authorized 11/18/2014 20:07:52 12/17/2014 19:22:50
ABC123456789 Mode 3 Revoke Auth Deauthorized 02/25/2015 20:14:05 02/25/2015 20:14:29

您的where条件不正确。应该是:

where Prev_Event is null or prev_event <> Event

这是我放在你之前的逻辑上,类似。你可以接受那个答案。