Get rows that have the same column value
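I am trying to use BigQuery on GitHub's public data to select rows that share the same column value. In SQL Server I would approach it like this, but in BigQuery I get the error message "Correlated subqueries that reference other tables are not supported unless they can be de-correlated, such as by transforming them into an efficient JOIN."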
select t1.id as id, t1.path as path
from `bigquery-public-data.github_repos.sample_files` t1
where exists (select path
from `bigquery-public-data.github_repos.sample_files` t2
where t1.path = t2.path
group by path, id
having count(id) > 1)
I also tried a self-join like this:
SELECT t1.repo_name as repo_name, t1.path as path, t2.repo_name as reponame2, t2.path as path2
FROM `bigquery-public-data.github_repos.sample_files` as t1
JOIN `bigquery-public-data.github_repos.sample_files` as t2 ON t1.path = t2.path GROUP BY repo_name, path, reponame2, path2
But that gives me a timeout error. What is the correct way to achieve this?
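I think you are looking for the following (at least, this is a direct translation of your original query into one that actually works while preserving the logic):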
select id, path
from `bigquery-public-data.github_repos.sample_files`
qualify count(id) over(partition by path) > 1
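If you would rather not use QUALIFY, the same filter can probably be written with the window count computed in a subquery and an ordinary WHERE on top. This is only a sketch of an equivalent form, not something I have benchmarked against this table; the alias rows_with_path is a name I made up for illustration.

```sql
-- Compute, for every row, how many rows share its path,
-- then keep only the rows whose path occurs more than once.
SELECT id, path
FROM (
  SELECT
    id,
    path,
    COUNT(id) OVER (PARTITION BY path) AS rows_with_path  -- hypothetical alias
  FROM `bigquery-public-data.github_repos.sample_files`
)
WHERE rows_with_path > 1
```

Either form reads the table once and never pairs rows with each other, which is likely why it avoids the timeout you hit with the self-join: joining the table to itself on path produces one output row for every pair of rows sharing a path.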