Multiple IN subqueries in WHERE

I'm having trouble converting the following query from Impala to Hive 1.1 on Cloudera 5.8.

SELECT   *
FROM   
table1 t1,table2 t2
WHERE   concat(t1.field1, t1.field2) IN
               (SELECT   concat(T3.field1, T3.field2)
                  FROM   table3 T3
                 WHERE   T3.field3 = 'value')
         AND concat(t1.field3, t1.field4) IN
               (SELECT   concat(T3.field1, T3.field2)
                  FROM   table3 T3
                 WHERE   T3.field3 = 'value')
AND t1.some_field = t2.some_field

The error I get says that I can't use more than one subquery in the WHERE clause:

Only 1 SubQuery expression is supported.

I tried to work around this with UNION, but this version only supports UNION ALL. I'm also not sure how to use a join here to solve the problem.

I'd appreciate any suggestions on how to rewrite this query so that it produces the expected result without raising the error.

Their documentation says you can use a CTE: https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression

Can you try this, so that each query block contains only one subquery?

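WITH firstConcatResult AS (
    -- first concat check, applied to table1 on its own
    SELECT t1.*
    FROM table1 t1
    WHERE concat(t1.field1, t1.field2) IN
          (SELECT concat(T3.field1, T3.field2)
           FROM table3 T3
           WHERE T3.field3 = 'value')
)
SELECT *
FROM firstConcatResult f
JOIN table2 t2 ON f.some_field = t2.some_field
-- second concat check, in the outer query block
WHERE concat(f.field3, f.field4) IN
      (SELECT concat(T3.field1, T3.field2)
       FROM table3 T3
       WHERE T3.field3 = 'value')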

I would use EXISTS and proper JOIN syntax:

SELECT *
FROM table1 t1 JOIN
     table2 t2
     ON t1.some_field = t2.some_field
WHERE EXISTS (SELECT 1
              FROM table3 T3
              WHERE T3.field3 = 'value' AND
                    T3.field1 = t1.field1 AND t3.field2 = t1.field2
             ) AND
      EXISTS (SELECT 1
              FROM table3 T3
              WHERE T3.field3 = 'value' AND
                    T3.field1 = t1.field3 AND t3.field2 = t1.field4
             );
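One way to gain some confidence in a rewrite like this is to compare it with the original IN form on a few made-up rows. The sketch below does that with SQLite through Python's `sqlite3` module, purely as a stand-in engine: it uses `||` in place of `concat`, the sample rows are invented, and it says nothing about whether Hive 1.1 itself accepts either statement. Note also that comparing the columns one by one avoids accidental matches that concatenation can produce (for example 'ab' + 'c' and 'a' + 'bc' both give 'abc'), so the two forms can differ on such data.

```python
import sqlite3

# Toy schema mirroring the question; every row here is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (field1 TEXT, field2 TEXT, field3 TEXT, field4 TEXT, some_field INTEGER);
CREATE TABLE table2 (some_field INTEGER, label TEXT);
CREATE TABLE table3 (field1 TEXT, field2 TEXT, field3 TEXT);
INSERT INTO table1 VALUES ('a', 'b', 'c', 'd', 1);
INSERT INTO table1 VALUES ('a', 'b', 'x', 'y', 2);
INSERT INTO table1 VALUES ('q', 'r', 'c', 'd', 3);
INSERT INTO table2 VALUES (1, 'one'), (2, 'two'), (3, 'three');
INSERT INTO table3 VALUES ('a', 'b', 'value'), ('c', 'd', 'value'), ('x', 'y', 'other');
""")

# Original shape of the query: two IN subqueries on concatenated columns.
in_form = """
SELECT t1.some_field
FROM table1 t1, table2 t2
WHERE (t1.field1 || t1.field2) IN
      (SELECT T3.field1 || T3.field2 FROM table3 T3 WHERE T3.field3 = 'value')
  AND (t1.field3 || t1.field4) IN
      (SELECT T3.field1 || T3.field2 FROM table3 T3 WHERE T3.field3 = 'value')
  AND t1.some_field = t2.some_field
ORDER BY t1.some_field
"""

# Rewritten shape: explicit JOIN plus two correlated EXISTS checks.
exists_form = """
SELECT t1.some_field
FROM table1 t1
JOIN table2 t2 ON t1.some_field = t2.some_field
WHERE EXISTS (SELECT 1 FROM table3 T3
              WHERE T3.field3 = 'value'
                AND T3.field1 = t1.field1 AND T3.field2 = t1.field2)
  AND EXISTS (SELECT 1 FROM table3 T3
              WHERE T3.field3 = 'value'
                AND T3.field1 = t1.field3 AND T3.field2 = t1.field4)
ORDER BY t1.some_field
"""

in_rows = conn.execute(in_form).fetchall()
exists_rows = conn.execute(exists_form).fetchall()
print(in_rows, exists_rows)
```

On this sample data only the first `table1` row has both of its column pairs present in `table3` with `field3 = 'value'`, so both forms return the same single row.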

Using joins and a CTE:

WITH s3 AS (
    SELECT T3.field1, T3.field2
    FROM table3 T3
    WHERE T3.field3 = 'value'
)
SELECT *
FROM table1 t1
     INNER JOIN table2 t2 ON t1.some_field = t2.some_field
     -- s3 is joined twice, so each use gets its own alias
     LEFT SEMI JOIN s3 a ON t1.field1 = a.field1
                        AND t1.field2 = a.field2
     LEFT SEMI JOIN s3 b ON t1.field3 = b.field1
                        AND t1.field4 = b.field2