Implement ROW_NUMBER() in beamSQL
I have the following query:
SELECT DISTINCT Summed, ROW_NUMBER () OVER (order by Summed desc) as Rank from table1
I have to write it in Apache Beam (beamSql). Here is my code:
PCollection<BeamRecord> rec_2_part2 = rec_2.apply(BeamSql.query("SELECT DISTINCT Summed, ROW_NUMBER(Summed) OVER (ORDER BY Summed) Rank1 from PCOLLECTION "));
But I get the following error:
Caused by: java.lang.UnsupportedOperationException: Operator: ROW_NUMBER is not supported yet!
Any idea how to implement ROW_NUMBER() in beamSql?
Here is one way to approximate your current query without using ROW_NUMBER:
SELECT
    t1.Summed,
    (SELECT COUNT(*) FROM (SELECT DISTINCT Summed FROM table1) t2
     WHERE t2.Summed >= t1.Summed) AS Rank
FROM
(
    SELECT DISTINCT Summed
    FROM table1
) t1
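The basic idea is to first use a subquery to get a table containing only the distinct Summed values. Then a correlated subquery simulates the row number: for each value, it counts how many distinct values are greater than or equal to it, so the largest value gets rank 1. This is not a very efficient approach, but if ROW_NUMBER is not available, you have to fall back on an alternative.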
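To make the counting logic concrete, here is a minimal plain-Java sketch of what the correlated subquery computes. It does not use any Beam API; the class name RankSketch, the method rankDistinct, and the sample values are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class RankSketch {

    // Hypothetical helper: maps each distinct value to its rank, where
    // rank = number of distinct values >= that value (largest value gets 1).
    public static Map<Integer, Long> rankDistinct(List<Integer> summed) {
        // Equivalent of: SELECT DISTINCT Summed FROM table1
        List<Integer> distinct = new ArrayList<>(new TreeSet<>(summed));

        Map<Integer, Long> ranks = new LinkedHashMap<>();
        for (Integer current : distinct) {
            // Equivalent of: SELECT COUNT(*) FROM t2 WHERE t2.Summed >= t1.Summed
            long rank = distinct.stream()
                    .filter(other -> other >= current)
                    .count();
            ranks.put(current, rank);
        }
        return ranks;
    }

    public static void main(String[] args) {
        // Made-up sample data with duplicates.
        List<Integer> summed = Arrays.asList(10, 30, 20, 30, 10);
        System.out.println(rankDistinct(summed)); // prints {10=3, 20=2, 30=1}
    }
}
```

Like the SQL version, this compares every distinct value against every other one, so the work grows quadratically with the number of distinct values. That is the inefficiency mentioned above.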
The solution that worked for the above query:
PCollection<BeamRecord> rec_2 = rec_1.apply(BeamSql.query("SELECT max(Summed) as maxed, max(Summed)-10 as least, 'a' as Dummy from PCOLLECTION"));