Include only transition states in SQL query

I have a table of customers and their purchases, like this:

customer     shop       time
----------------------------
   1          5        13.30   
   1          5        14.33
   1          10       22.17
   2          3        12.15
   2          1        13.30
   2          1        15.55
   2          3        17.29

Since I am only interested in changes of shop, I need the following output:

customer     shop       time
----------------------------
   1          5        13.30   
   1          10       22.17
   2          3        12.15
   2          1        13.30
   2          3        17.29

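I tried using

ROW_NUMBER() OVER (PARTITION BY customer, shop ORDER BY time ASC) AS counter

and then keeping only the rows with counter = 1. However, this breaks when a customer visits the same shop again later, as with customer = 2, shop = 3 in my example: the second visit gets counter = 2 and is dropped.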
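To make the failure concrete, here is a minimal sketch that runs the `counter = 1` filter on the sample rows. It assumes Python's built-in `sqlite3` module with an SQLite build that supports window functions (3.25 or newer), not the engine the data actually lives on. Storing `time` as zero-padded text and the subquery alias `numbered` are choices made only for this illustration.

```python
import sqlite3

# Sample rows from the question. Times are kept as zero-padded "HH.MM" text
# (an assumption for this sketch) so that they sort chronologically.
ROWS = [
    (1, 5, "13.30"), (1, 5, "14.33"), (1, 10, "22.17"),
    (2, 3, "12.15"), (2, 1, "13.30"), (2, 1, "15.55"), (2, 3, "17.29"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db (customer INTEGER, shop INTEGER, time TEXT)")
conn.executemany("INSERT INTO db VALUES (?, ?, ?)", ROWS)

# Number the visits within each (customer, shop) pair and keep only the first.
first_visits = conn.execute("""
    SELECT customer, shop, time
    FROM (
        SELECT customer, shop, time,
               ROW_NUMBER() OVER (PARTITION BY customer, shop
                                  ORDER BY time ASC) AS counter
        FROM db
    ) AS numbered
    WHERE counter = 1
    ORDER BY customer, time
""").fetchall()

for row in first_visits:
    print(row)
```

The row (2, 3, '17.29') is absent from the result, because it is the second row in the (customer = 2, shop = 3) partition even though it follows a visit to a different shop.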

I came up with this:

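WITH a AS 
(
    SELECT 
        customer, shop, time, 
        ROW_NUMBER() OVER (PARTITION BY customer ORDER BY time ASC) AS counter
    FROM 
        db
)
-- rows whose shop differs from the same customer's previous row
SELECT a1.*
FROM a a1
JOIN a AS a2 ON (a1.customer = a2.customer AND a2.counter + 1 = a1.counter AND a2.shop <> a1.shop)

UNION 

-- plus each customer's first row, which has no previous row to compare with
SELECT a.*
FROM a 
WHERE counter = 1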

However, this is very inefficient, and running it on AWS, where my data lives, fails with an error telling me

Query exhausted resources at this scale factor

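Is there a way to make this query more efficient?

This is a gaps-and-islands problem. But the simplest solution is to use lag(), which compares each row with the same customer's previous row in a single pass, without a self-join: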

select customer, shop, time
from (select t.*, lag(shop) over (partition by customer order by time) as prev_shop
      from t
     ) t
where prev_shop is null or prev_shop <> shop;
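
As a quick check, the lag() approach can be run against the sample rows from the question. This is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` module with an SQLite build that supports window functions (3.25 or newer) rather than the original engine, stores `time` as zero-padded text, and renames the derived-table alias to `s`.

```python
import sqlite3

# Sample rows from the question. Times are kept as zero-padded "HH.MM" text
# (an assumption for this sketch) so that they sort chronologically.
ROWS = [
    (1, 5, "13.30"), (1, 5, "14.33"), (1, 10, "22.17"),
    (2, 3, "12.15"), (2, 1, "13.30"), (2, 1, "15.55"), (2, 3, "17.29"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer INTEGER, shop INTEGER, time TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", ROWS)

# Keep a row when it is the customer's first visit (prev_shop IS NULL)
# or when the shop differs from the shop of the previous visit.
transitions = conn.execute("""
    SELECT customer, shop, time
    FROM (SELECT t.*,
                 LAG(shop) OVER (PARTITION BY customer ORDER BY time) AS prev_shop
          FROM t
         ) AS s
    WHERE prev_shop IS NULL OR prev_shop <> shop
    ORDER BY customer, time
""").fetchall()

for row in transitions:
    print(row)
```

On this data the query returns the five rows of the desired output, including customer 2's return to shop 3 at 17.29.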