Find distinct count of common accounts between products

Question

Suppose a table has two columns, as follows:

Account_ID (integer)
Product_ID (integer)

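The other columns are not material. The table lists the products purchased by each account. I want to create an output with three columns, as follows: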

Account_ID_1 | Account_ID_2 | Count(distinct product_ID)

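The result should contain every combination of Account_ID values, together with the distinct count of Product_IDs that the two accounts in each combination have in common.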

I am using Google BigQuery. Is there a SQL way to do this, or should I plan to code it in a full programming language?

Answer 1

Here I count how many products two accounts have in common.

SELECT
     T1.Account_ID AS Account_ID_1,
     T2.Account_ID AS Account_ID_2,
     COUNT(DISTINCT T1.Product_ID)
FROM YourTable AS T1
JOIN YourTable AS T2
  ON T1.Account_ID <  T2.Account_ID   -- keep each pair of accounts once
 AND T1.Product_ID =  T2.Product_ID   -- match rows on the shared product
GROUP BY
     T1.Account_ID,
     T2.Account_ID
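To see what this self-join returns, here is a minimal sketch that runs the same query against an in-memory SQLite database. The sample rows are made up for illustration, and SQLite stands in for BigQuery only because it is easy to run locally; the join logic is the same.

```python
import sqlite3

# Hypothetical sample data: one (Account_ID, Product_ID) row per purchase.
rows = [
    (1, 10), (1, 11), (1, 12),
    (2, 10), (2, 11),
    (3, 11), (3, 13),
    (1, 10),  # repeated purchase, to show that DISTINCT ignores it
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Account_ID INTEGER, Product_ID INTEGER)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?)", rows)

query = """
SELECT T1.Account_ID AS Account_ID_1,
       T2.Account_ID AS Account_ID_2,
       COUNT(DISTINCT T1.Product_ID) AS common_products
FROM YourTable AS T1
JOIN YourTable AS T2
  ON T1.Account_ID < T2.Account_ID
 AND T1.Product_ID = T2.Product_ID
GROUP BY T1.Account_ID, T2.Account_ID
ORDER BY 1, 2
"""

pair_counts = conn.execute(query).fetchall()
print(pair_counts)  # [(1, 2, 2), (1, 3, 1), (2, 3, 1)]
```

Because this is an inner join, a pair of accounts with no product in common does not appear in the output at all, rather than appearing with a count of 0.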

Answer 2

This worked for me:

select
   t1.Account_ID, t2.Account_ID, count(distinct t1.Product_ID) count_product_id
from
   MYTABLE t1 join MYTABLE t2 on t1.Product_ID = t2.Product_ID
where t1.Account_ID <> t2.Account_ID  -- each pair is returned in both orders
group by t1.Account_ID, t2.Account_ID
order by 1,2
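Two details of this kind of query are worth checking on a small example: filtering with <> returns every pair in both orders, and COUNT without DISTINCT counts joined rows rather than products. The sketch below uses made-up rows in SQLite, and `pair_rows` is a hypothetical helper, not part of the answer.

```python
import sqlite3

# Hypothetical sample data; the last row repeats an existing purchase.
rows = [(1, 10), (1, 11), (2, 10), (2, 11), (2, 11)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MYTABLE (Account_ID INTEGER, Product_ID INTEGER)")
conn.executemany("INSERT INTO MYTABLE VALUES (?, ?)", rows)


def pair_rows(count_expr):
    """Run the pairwise query with the given aggregate expression."""
    sql = f"""
    SELECT t1.Account_ID, t2.Account_ID, {count_expr}
    FROM MYTABLE t1 JOIN MYTABLE t2 ON t1.Product_ID = t2.Product_ID
    WHERE t1.Account_ID <> t2.Account_ID
    GROUP BY t1.Account_ID, t2.Account_ID
    ORDER BY 1, 2
    """
    return conn.execute(sql).fetchall()


plain = pair_rows("COUNT(t1.Product_ID)")
distinct = pair_rows("COUNT(DISTINCT t1.Product_ID)")

print(plain)     # [(1, 2, 3), (2, 1, 3)]
print(distinct)  # [(1, 2, 2), (2, 1, 2)]
```

Accounts 1 and 2 share two products, but the repeated row makes the plain count report 3. If the table is guaranteed to hold one row per account and product, the two counts agree.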

Answer 3

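BigQuery version:

(The JOIN is on equality only, and the inequality comparison is kept in the WHERE clause. The query below uses > where the answers above use <.)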

SELECT a.corpus, b.corpus, EXACT_COUNT_DISTINCT(a.word) c
FROM
(SELECT corpus, word FROM [publicdata:samples.shakespeare]) a
JOIN
(SELECT corpus, word FROM [publicdata:samples.shakespeare]) b
ON a.word=b.word
WHERE a.corpus>b.corpus
GROUP BY 1, 2
ORDER BY c DESC

sql

combinations

distinct

google-bigquery