Find distinct count of common accounts between products

Suppose a table has two columns, as follows:

Account_ID (integer)
Product_ID (integer)

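The other columns are not material. The table lists the products purchased by each account. I want to create an output with three columns, as follows: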

Account_ID_1 | Account_ID_2 | Count(distinct product_ID)

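The result should contain every combination of Account_ID values, together with the distinct count of Product_IDs that the two accounts in each combination have in common.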

I am using Google BigQuery. Is there a SQL way to do this, or should I plan to code it in a full programming language?

Here I count how many products two accounts have in common.

SELECT
     T1.Account_ID AS Account_ID_1,
     T2.Account_ID AS Account_ID_2,
     COUNT(DISTINCT T1.Product_ID) AS common_products
FROM YourTable AS T1
JOIN YourTable AS T2
  ON T1.Account_ID < T2.Account_ID    -- report each pair of accounts once
 AND T1.Product_ID = T2.Product_ID    -- match only products both accounts bought
GROUP BY
     T1.Account_ID,
     T2.Account_ID
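
To sanity-check the query on a small data set, here is a minimal sketch that runs the same self-join in an in-memory SQLite database from Python. SQLite is only a stand-in for BigQuery here, and the sample rows, the `common_products` alias and the `common_product_counts` helper are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: (Account_ID, Product_ID) purchase rows.
# Account 3 has a duplicated row for product 30 on purpose.
ROWS = [(1, 10), (1, 20), (1, 30), (2, 20), (2, 30), (3, 30), (3, 30)]

PAIR_QUERY = """
SELECT T1.Account_ID AS Account_ID_1,
       T2.Account_ID AS Account_ID_2,
       COUNT(DISTINCT T1.Product_ID) AS common_products
FROM YourTable AS T1
JOIN YourTable AS T2
  ON T1.Account_ID < T2.Account_ID
 AND T1.Product_ID = T2.Product_ID
GROUP BY T1.Account_ID, T2.Account_ID
ORDER BY 1, 2
"""


def common_product_counts(rows):
    """Load rows into an in-memory table and run the pairwise self-join."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE YourTable (Account_ID INTEGER, Product_ID INTEGER)"
        )
        conn.executemany("INSERT INTO YourTable VALUES (?, ?)", rows)
        return conn.execute(PAIR_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in common_product_counts(ROWS):
        print(row)
```

With these rows it returns (1, 2, 2), (1, 3, 1) and (2, 3, 1). Note that pairs of accounts with no product in common do not appear at all, because the inner join finds no matching rows for them.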

This worked for me:

select
   t1.Account_ID, t2.Account_ID, count(distinct t1.Product_ID) count_product_id
from
   MYTABLE t1 join MYTABLE t2 on t1.Product_ID = t2.Product_ID
where t1.Account_ID <> t2.Account_ID  -- note: <> returns each pair in both orders
group by t1.Account_ID, t2.Account_ID
order by 1,2
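
One thing to watch with the `<>` filter: it returns every pair twice, once as (A, B) and once as (B, A), and a plain COUNT counts joined rows rather than distinct products. A rough sketch to see both effects side by side, again using in-memory SQLite from Python as a stand-in for BigQuery; the sample rows and the `pair_counts` helper are hypothetical, with one purchase row deliberately duplicated.

```python
import sqlite3

# Hypothetical sample data; account 3 has a duplicated row for product 30.
ROWS = [(1, 10), (1, 20), (1, 30), (2, 20), (2, 30), (3, 30), (3, 30)]

COMPARE_QUERY = """
SELECT t1.Account_ID,
       t2.Account_ID,
       COUNT(t1.Product_ID)          AS plain_count,
       COUNT(DISTINCT t1.Product_ID) AS distinct_count
FROM MYTABLE t1
JOIN MYTABLE t2 ON t1.Product_ID = t2.Product_ID
WHERE t1.Account_ID <> t2.Account_ID
GROUP BY t1.Account_ID, t2.Account_ID
ORDER BY 1, 2
"""


def pair_counts(rows):
    """Return (account_1, account_2, plain_count, distinct_count) tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE MYTABLE (Account_ID INTEGER, Product_ID INTEGER)"
        )
        conn.executemany("INSERT INTO MYTABLE VALUES (?, ?)", rows)
        return conn.execute(COMPARE_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in pair_counts(ROWS):
        print(row)
```

Here accounts 1 and 3 share only product 30, yet the plain count for that pair is 2 because of the duplicated row, while the distinct count is 1. If each pair should be listed once, replacing `<>` with `<` halves the output.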

BigQuery version:

(The JOIN is on equality only, with the inequality comparison kept in the WHERE clause.)

SELECT a.corpus, b.corpus, EXACT_COUNT_DISTINCT(a.word) c
FROM
(SELECT corpus, word FROM [publicdata:samples.shakespeare]) a
JOIN
(SELECT corpus, word FROM [publicdata:samples.shakespeare]) b
ON a.word=b.word
WHERE a.corpus>b.corpus
GROUP BY 1, 2
ORDER BY c DESC