SQL select all rows per group after a condition is met

Question

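I would like to select all rows for each group after the last time a condition is met for that group. A related question has an answer that uses correlated subqueries.

In my case I will have millions of categories and hundreds of millions or billions of rows. Is there a way to get the same result with a more performant query?

Here is an example. The condition is: all rows (per group) after the last 0 in the condition column.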

category | timestamp |  condition 
--------------------------------------
   A     |     1     |     0 
   A     |     2     |     1 
   A     |     3     |     0 
   A     |     4     |     1
   A     |     5     |     1
   B     |     1     |     0 
   B     |     2     |     1
   B     |     3     |     1

The result I would like to achieve is

category | timestamp |  condition 
--------------------------------------
   A     |     4     |     1
   A     |     5     |     1
   B     |     2     |     1
   B     |     3     |     1

Answer 1

If you want everything after the last 0, you can use window functions:

select t.*
from (select t.*,
             max(case when condition = 0 then timestamp end) over (partition by category) as max_timestamp_0
      from t
     ) t
where timestamp > max_timestamp_0 or
      max_timestamp_0 is null;
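To check the query against the sample data, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. It assumes a SQLite build with window-function support (3.25 or later). The category C row is a hypothetical addition, not part of the question, included to show that the `is null` branch keeps every row of a category that has no 0 at all.

```python
import sqlite3

# Sample rows from the question, plus a hypothetical category C
# that never has condition = 0 (an assumption added for illustration).
ROWS = [
    ("A", 1, 0), ("A", 2, 1), ("A", 3, 0), ("A", 4, 1), ("A", 5, 1),
    ("B", 1, 0), ("B", 2, 1), ("B", 3, 1),
    ("C", 1, 1),
]

# The window-function query from this answer, with an explicit
# column list and an ORDER BY so the output is deterministic.
QUERY = """
select category, timestamp, condition
from (select t.*,
             max(case when condition = 0 then timestamp end)
                 over (partition by category) as max_timestamp_0
      from t
     ) t
where timestamp > max_timestamp_0 or
      max_timestamp_0 is null
order by category, timestamp
"""


def rows_after_last_zero(rows):
    """Load rows into an in-memory table t and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (category text, timestamp integer, condition integer)"
    )
    conn.executemany("insert into t values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rows_after_last_zero(ROWS):
        print(row)
```

Without the `max_timestamp_0 is null` branch, category C would drop out entirely, because comparing a timestamp with NULL never evaluates to true.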

With an index on (category, condition, timestamp), the correlated subquery version might also perform very well:

select t.*
from t
where t.timestamp > all (select t2.timestamp
                         from t t2
                         where t2.category = t.category and
                               t2.condition = 0
                        );
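As a rough illustration of the index and the correlated form, the sketch below creates the index and runs the query in an in-memory SQLite database. Two things here are assumptions rather than part of the answer: the index name `t_category_condition_timestamp` is made up, and because SQLite does not implement the `> all` quantifier, the query is rewritten as an equivalent `not exists` (equivalent as long as timestamp is never NULL). On an engine that supports `> all`, the original query can be used unchanged.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    ("A", 1, 0), ("A", 2, 1), ("A", 3, 0), ("A", 4, 1), ("A", 5, 1),
    ("B", 1, 0), ("B", 2, 1), ("B", 3, 1),
]

# Table plus the composite index suggested above (index name is hypothetical).
SETUP = """
create table t (category text, timestamp integer, condition integer);
create index t_category_condition_timestamp
    on t (category, condition, timestamp);
"""

# NOT EXISTS rewrite of "timestamp > all (...)": keep a row only if its
# category has no condition = 0 row at the same or a later timestamp.
QUERY = """
select t.category, t.timestamp, t.condition
from t
where not exists (select 1
                  from t t2
                  where t2.category = t.category and
                        t2.condition = 0 and
                        t2.timestamp >= t.timestamp)
order by t.category, t.timestamp
"""


def rows_after_last_zero_indexed(rows):
    """Build the indexed table in memory and run the correlated query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    conn.executemany("insert into t values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rows_after_last_zero_indexed(ROWS):
        print(row)
```

The column order in the index matters: the two equality predicates (category, condition) come first and the range predicate on timestamp comes last, so each subquery probe can be answered from the index alone.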

Answer 2

You might want to try window functions:

select category, timestamp, condition
from (
    select 
        t.*,
        min(condition) over(partition by category order by timestamp desc) min_cond
    from mytable t
) t
where min_cond = 1

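The window min() with an order by clause computes the minimum value of condition over the current row and the following rows of the same category: we can use it as a filter to eliminate rows for which there is a more recent row with a 0.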
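To make the running minimum concrete, the sketch below prints `min_cond` for every row of the sample data instead of filtering on it. It runs in an in-memory SQLite database via Python's sqlite3 module (assuming window-function support, SQLite 3.25 or later) and assumes, as in the question, that condition only takes the values 0 and 1.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    ("A", 1, 0), ("A", 2, 1), ("A", 3, 0), ("A", 4, 1), ("A", 5, 1),
    ("B", 1, 0), ("B", 2, 1), ("B", 3, 1),
]

# Inner query of this answer, unfiltered, so min_cond is visible per row.
# Rows are listed newest first, the order in which the window accumulates.
QUERY = """
select category, timestamp, condition,
       min(condition) over (partition by category
                            order by timestamp desc) as min_cond
from mytable
order by category, timestamp desc
"""


def trace_min_cond(rows):
    """Return (category, timestamp, condition, min_cond) for every row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table mytable "
        "(category text, timestamp integer, condition integer)"
    )
    conn.executemany("insert into mytable values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in trace_min_cond(ROWS):
        print(row)
```

In category A, `min_cond` stays at 1 for timestamps 5 and 4, drops to 0 at timestamp 3 (the last 0), and remains 0 for every older row, so `where min_cond = 1` keeps exactly the rows after the last 0.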

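The benefit of using window functions over the correlated subquery approach is that it reduces the number of scans needed on the table. Of course, this computation also has a cost, so you will need to assess both solutions against your sample data.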


sql

correlated-subquery

window-functions