MySQL: how to write SQL to find excessive transactions in any 15-minute window?

Question

MySQL
Say there is a credit card processing company.

create table tran(
  id int,
  tran_dt datetime, 
  card_id int,
  merchant_id int,
  amount int
);

#Customer #1 
insert into tran values(1, '2015-01-01 01:00:00', 1, 1, 10);
insert into tran values(2, '2015-01-01 01:01:00', 1, 1, 10);
insert into tran values(3, '2015-01-01 01:02:00', 1, 1, 10);

#Customer #2 
insert into tran values(21, '2015-01-01 01:00:00', 2, 1, 10);
insert into tran values(22, '2015-01-01 01:01:00', 2, 1, 10);
insert into tran values(23, '2015-01-01 01:02:00', 2, 1, 10);
insert into tran values(24, '2015-01-01 01:03:00', 2, 1, 10);

#Customer #3 
insert into tran values(31, '2015-01-01 01:00:00', 3, 1, 10);
insert into tran values(32, '2015-01-01 01:00:00', 3, 1, 10);
insert into tran values(33, '2015-01-01 01:00:00', 3, 1, 10);
insert into tran values(34, '2015-01-01 01:00:00', 3, 1, 10);
insert into tran values(35, '2015-01-01 01:00:00', 3, 1, 10);

I need to report the cards that were used 3 or more times at the same merchant within any 15-minute window.

SELECT t1.card_id, t1.merchant_id, count(*) 
FROM tran t1 
JOIN tran t2
  on t2.card_id=t1.card_id 
  and t2.merchant_id=t1.merchant_id 
  and t2.tran_dt <= DATE_ADD(t1.tran_dt, INTERVAL 15 MINUTE)
  and t2.id>t1.id
GROUP BY t1.card_id, t1.merchant_id
HAVING count(*)>2

Result

card_id     merchant_id     count(*)
1           1               3
2           1               6
3           1               10

The count for the first customer is correct, but the counts for the other customers are too high. Where is the mistake in my SQL?

http://sqlfiddle.com/#!9/e0de2/1

PS: Triggers are not allowed.

Answer 1

You should have the id of the first transaction in the GROUP BY:

SELECT t1.id, t1.card_id, t1.merchant_id, t1.tran_dt, count(*) 
FROM tran t1 JOIN
     tran t2
     on t2.card_id = t1.card_id and
        t2.merchant_id = t1.merchant_id and
        t2.tran_dt >= t1.tran_dt and
        t2.tran_dt <= DATE_ADD(t1.tran_dt, INTERVAL 15 MINUTE) 
GROUP BY t1.id, t1.card_id, t1.merchant_id, t1.tran_dt
HAVING count(*) > 2;

The SQL Fiddle is here.
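The query above returns one row per starting transaction, so a card can appear several times. If the report should list each card and merchant once, one possible sketch is to wrap it in an outer query that keeps the busiest window. The aliases `w`, `tran_count` and `max_tran_count` are made up for illustration, and the sketch assumes `id` is unique per transaction, as it is in the sample data.

```sql
-- Assumes the tran table and sample rows from the question.
-- Inner query: count transactions in the 15 minutes starting at each transaction.
-- Outer query: keep the largest count per card and merchant.
SELECT w.card_id, w.merchant_id, MAX(w.tran_count) AS max_tran_count
FROM (
    SELECT t1.id, t1.card_id, t1.merchant_id, COUNT(*) AS tran_count
    FROM tran t1
    JOIN tran t2
      ON t2.card_id = t1.card_id
     AND t2.merchant_id = t1.merchant_id
     AND t2.tran_dt >= t1.tran_dt
     AND t2.tran_dt <= DATE_ADD(t1.tran_dt, INTERVAL 15 MINUTE)
    GROUP BY t1.id, t1.card_id, t1.merchant_id
    HAVING COUNT(*) > 2
) AS w
GROUP BY w.card_id, w.merchant_id;
```

With the sample rows from the question this gives 3, 4 and 5 for cards 1, 2 and 3.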

Answer 2

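As per my answer last night to your related question (glad you accepted my answer :>), I was merely trying to find those with >2, that's it, and I described the count as synthetic.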

This should fix it; adjust the interval as needed:

SELECT t1.card_id,t1.merchant_id,count(distinct t1.id)+1 as ChargeCount
FROM tran t1 
INNER JOIN tran t2
on t2.card_id=t1.card_id 
and t2.merchant_id=t1.merchant_id 
and t2.tran_dt <= DATE_ADD(t1.tran_dt, INTERVAL 15 MINUTE)
and t2.id>t1.id
GROUP BY t1.card_id,t1.merchant_id
HAVING ChargeCount>2;

24 hours:

SELECT t1.card_id,t1.merchant_id,count(distinct t1.id)+1 as ChargeCount
FROM tran t1 
INNER JOIN tran t2
on t2.card_id=t1.card_id 
and t2.merchant_id=t1.merchant_id 
and t2.tran_dt <= DATE_ADD(t1.tran_dt, INTERVAL 24 HOUR)
and t2.id>t1.id
GROUP BY t1.card_id,t1.merchant_id
HAVING ChargeCount>2;
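One case that may be worth testing before relying on the synthetic count: `count(distinct t1.id)+1` counts every transaction that has a later one within the interval, across the card's whole history, so two separate pairs of charges can add up to 3 even though no single window holds 3. A minimal sketch with made-up rows (card_id 4 and ids 41 to 44 are hypothetical, not part of the original data):

```sql
-- Hypothetical test rows: two charges at 01:00-01:01 and two at 05:00-05:01.
-- No 15-minute window contains more than 2 of them.
INSERT INTO tran VALUES (41, '2015-01-01 01:00:00', 4, 1, 10);
INSERT INTO tran VALUES (42, '2015-01-01 01:01:00', 4, 1, 10);
INSERT INTO tran VALUES (43, '2015-01-01 05:00:00', 4, 1, 10);
INSERT INTO tran VALUES (44, '2015-01-01 05:01:00', 4, 1, 10);

-- Same query as above, restricted to the test card.
SELECT t1.card_id, t1.merchant_id, COUNT(DISTINCT t1.id) + 1 AS ChargeCount
FROM tran t1
INNER JOIN tran t2
   ON t2.card_id = t1.card_id
  AND t2.merchant_id = t1.merchant_id
  AND t2.tran_dt <= DATE_ADD(t1.tran_dt, INTERVAL 15 MINUTE)
  AND t2.id > t1.id
WHERE t1.card_id = 4
GROUP BY t1.card_id, t1.merchant_id
HAVING ChargeCount > 2;
-- Returns card 4 with ChargeCount 3: ids 41 and 43 each have one later
-- charge in range, so the distinct count is 2, plus 1.

-- Remove the test rows again.
DELETE FROM tran WHERE card_id = 4;
```

If that result is not wanted, grouping by the starting transaction id, as in Answer 1, keeps each window separate.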

Edit:

imo (= 1 cent's worth), it is best to use the standard date functions and not home-grown date math. Here is why:

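1) Someone might say the join is not necessary, but it is probably on composite indexes, and those are screaming fast. If you don't have them, add them. They are useful for all of your queries (and they are thin ints). If you build a covering index on (id,id,datetime), you now have an access path that is resolved entirely in the index pages, with no need to go on to the data pages. Remember that index pages normally point you to data pages that still have to be read (which is not necessary with a well-chosen covering index). I have seen joins over more than a million rows cut down to a few seconds by using covering indexes.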

2) Home-grown date math is error-prone

3) Home-grown date math is hard to modify and tweak

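4) Shorter-looking SQL does not = faster SQL. Shorter SQL can shift the processing burden onto functions that index optimization cannot reach, and it often (dare I say many times) results in table scans. The dreaded table scan.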
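As a rough illustration of point 1, a covering index for the self-join might look like the sketch below. The index name `idx_tran_card_merchant_dt` is made up, and appending `id` is an assumption: the sample table declares no primary key, so `id` would not otherwise be available from the index.

```sql
-- Hypothetical index name. Equality columns of the join come first,
-- then the range column, then id so the queries never touch the data pages.
CREATE INDEX idx_tran_card_merchant_dt
    ON tran (card_id, merchant_id, tran_dt, id);

-- Inspect the plan. "Using index" in the Extra column indicates that the
-- table is being read from the index alone (a covering index).
EXPLAIN
SELECT t1.id, t1.card_id, t1.merchant_id, t1.tran_dt, COUNT(*)
FROM tran t1
JOIN tran t2
  ON t2.card_id = t1.card_id
 AND t2.merchant_id = t1.merchant_id
 AND t2.tran_dt >= t1.tran_dt
 AND t2.tran_dt <= DATE_ADD(t1.tran_dt, INTERVAL 15 MINUTE)
GROUP BY t1.id, t1.card_id, t1.merchant_id, t1.tran_dt
HAVING COUNT(*) > 2;
```

Note that `DATE_ADD` is applied to `t1.tran_dt`, not to the `t2.tran_dt` column being searched, so the comparison on `t2.tran_dt` can still use the index range. Whether the optimizer actually chooses the index depends on the data, so it is worth checking the plan on realistic volumes.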

mysql

sql

group-by

self-join