Query does not find column, but suggests the same column, in Hive SQL
I have the following query in SQL:
select midquery.account, midquery.name, midquery.label, midquery.labelfrequency
from(
-- Count the appearance of each label.
select count(*) as labelfrequency, account, name, label
from(
select account, name, label from myTable
) innerquery
group by account, name, label
) midquery
-- Select most frequent values only.
where rank() over
(partition by midquery.account, midquery.name
order by midquery.labelfrequency desc) = 1
The idea is to find the most frequent label for each (account, name) pair. When I run this query, I get the following error:
Error while compiling statement: FAILED: SemanticException [Error 10002]: Line 12:74 Invalid column reference 'labelfrequency': (possible column names are: labelfrequency, account, name, label)
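I don't quite understand why the interpreter cannot find the column labelfrequency and yet is able to suggest it. Do you have any suggestions on how to fix this?
Edit: If I move rank() into the select part, I do get results.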
select midquery.account, midquery.name, midquery.label, midquery.labelfrequency,
rank() over (partition by midquery.account, midquery.name
order by midquery.labelfrequency desc)
from(
-- Count the appearance of each label.
select count(*) as labelfrequency, account, name, label
from(
select account, name, label from myTable
) innerquery
group by account, name, label
) midquery
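Window functions are simply not allowed in the WHERE clause. There are good reasons for this, but you can treat it as just another SQL rule, similar to column aliases not being recognized there.
(The real reason has to do with specifying how window functions would behave when there are multiple filter conditions. It is (almost?) impossible to come up with a coherent set of rules.)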
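As a minimal sketch of the smallest change to the original query: compute the rank in a select list, then filter on it one level further out. The alias `ranked` and the column name `labelrank` are hypothetical names introduced here for illustration; the rest comes from the question.

```sql
-- Sketch only: `ranked` and `labelrank` are illustrative names, not from the question.
select ranked.account, ranked.name, ranked.label, ranked.labelfrequency
from (
    -- Rank each label within its (account, name) group by frequency.
    select midquery.account, midquery.name, midquery.label, midquery.labelfrequency,
           rank() over (partition by midquery.account, midquery.name
                        order by midquery.labelfrequency desc) as labelrank
    from (
        -- Count the appearance of each label.
        select count(*) as labelfrequency, account, name, label
        from myTable
        group by account, name, label
    ) midquery
) ranked
-- The window function result is now an ordinary column, so WHERE can filter on it.
where ranked.labelrank = 1;
```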
That said, you can simplify the query:
select t.account, t.name, t.label, t.labelfrequency
from (select count(*) as labelfrequency, account, name, label,
rank() over (partition by account, name
order by count(*) desc
) as seqnum
from myTable t
group by account, name, label
) t
where seqnum = 1;
That is, window functions and aggregate functions can be combined. And you don't need a subquery just to select a handful of columns.
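One thing to keep in mind: rank() assigns the same value to ties, so an (account, name) pair whose top labels share the same count will return several rows. If exactly one row per pair is wanted, a variant with row_number() could look like the sketch below. The tie-breaker on `label` is an assumption added here; any deterministic ordering would do.

```sql
-- Sketch only: row_number() instead of rank(), so ties yield a single row.
-- Ordering by label as a tie-breaker is an assumption, not part of the original answer.
select t.account, t.name, t.label, t.labelfrequency
from (select count(*) as labelfrequency, account, name, label,
             row_number() over (partition by account, name
                                order by count(*) desc, label
                               ) as seqnum
      from myTable
      group by account, name, label
     ) t
where seqnum = 1;
```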