Spark SQL: Specify column name resulting from UDF in WHERE clause

Question

I have written a UDF that processes two columns and returns a single column (0 or 1). I need a SELECT query that returns only the records whose value is 1. I wrote the following query:

SELECT number, myUDF(col1, col2) as result
    FROM mytable 
    WHERE result is not null

But it does not recognize the column name 'result'. Is any special syntax needed for it to recognize this new output column? Thanks.

Answer 1

A CASE statement should solve the problem here:

SELECT number, CASE when myUDF(col1, col2) = 1 then myUDF(col1, col2) END as result FROM mytable
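
Note that the CASE query above keeps every row and puts NULL in result where the UDF does not return 1; it does not drop those rows. The alias is not visible in WHERE because the WHERE clause is evaluated before the SELECT list assigns it. As a rough sketch, assuming the goal is to keep only rows where myUDF returns 1 (the subquery alias t is just an illustrative name), either of these forms should work:

```sql
-- Option 1: compute the UDF in a subquery, then filter on the alias outside it.
SELECT number, result
FROM (
    SELECT number, myUDF(col1, col2) AS result
    FROM mytable
) t
WHERE result = 1;

-- Option 2: repeat the UDF call in the WHERE clause instead of using the alias.
SELECT number, myUDF(col1, col2) AS result
FROM mytable
WHERE myUDF(col1, col2) = 1;
```

Option 1 writes the UDF call only once, which is easier to maintain. To match the original filter instead, replace `result = 1` with `result IS NOT NULL`.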
