Implement ROW_NUMBER() in beamSQL

I have the following query:

SELECT DISTINCT Summed, ROW_NUMBER () OVER (order by Summed desc) as Rank  from table1

I have to write it in Apache Beam (beamSql). Here is my code:

PCollection<BeamRecord> rec_2_part2 = rec_2.apply(BeamSql.query("SELECT DISTINCT Summed, ROW_NUMBER(Summed) OVER (ORDER BY Summed) Rank1 from PCOLLECTION "));

But I get the following error:

Caused by: java.lang.UnsupportedOperationException: Operator: ROW_NUMBER is not supported yet!

Any idea how to implement ROW_NUMBER() in beamSql?

Here is one way to approximate your current query without using ROW_NUMBER:

SELECT
    t1.Summed,
    (SELECT COUNT(*) FROM (SELECT DISTINCT Summed FROM table1) t2
     WHERE t2.Summed >= t1.Summed) AS Rank
FROM
(
    SELECT DISTINCT Summed
    FROM table1
) t1

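The basic idea is to first use a subquery to get a table containing only the distinct Summed values. Then, a correlated subquery simulates the row number: for each distinct value, it counts how many distinct values are greater than or equal to it, so the largest value gets rank 1. This is not very efficient, since the inner count is evaluated once per distinct value, but if ROW_NUMBER is not available you have to fall back on an alternative like this.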
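To make the counting logic concrete, here is a minimal plain-Java sketch of what the correlated subquery computes. It does not use Beam at all; the class name DistinctRank and the method rank are hypothetical names chosen for illustration, and the input is assumed to be a small in-memory list of integer Summed values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, for illustration only (not part of Beam).
class DistinctRank {

    // Maps each distinct value to the number of distinct values >= it,
    // mirroring the correlated COUNT(*) subquery above.
    static Map<Integer, Integer> rank(List<Integer> summed) {
        // Equivalent of: SELECT DISTINCT Summed FROM table1 (sorted descending for readable output)
        List<Integer> distinct = new ArrayList<>(new TreeSet<>(summed).descendingSet());

        Map<Integer, Integer> ranks = new LinkedHashMap<>();
        for (Integer t1 : distinct) {
            int count = 0;
            // Equivalent of: SELECT COUNT(*) FROM distinct t2 WHERE t2.Summed >= t1.Summed
            for (Integer t2 : distinct) {
                if (t2 >= t1) {
                    count++;
                }
            }
            ranks.put(t1, count);
        }
        return ranks;
    }

    public static void main(String[] args) {
        // prints {30=1, 20=2, 10=3}
        System.out.println(rank(Arrays.asList(10, 30, 20, 30)));
    }
}
```

The nested loop makes the quadratic cost visible: every distinct value is compared against every other distinct value, which is why this approach is only a fallback.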

The solution that worked for the above query:

PCollection<BeamRecord> rec_2 = rec_1.apply(BeamSql.query("SELECT max(Summed) as maxed, max(Summed)-10 as least, 'a' as Dummy from PCOLLECTION"));