How to join a bigger table T1 to a smaller table T2 with random, repeating rows from T2
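Sorry if this has already been asked in some form. I have searched but could not find this particular solution, so I would appreciate an answer or a pointer to the right place.
I have two tables of different (and varying) lengths. A toy example: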
T1:
SELECT * FROM T1;
gid | call_s1
-----+---------
1 | 1
3 | 1
4 | 1
7 | 1
8 | 1
(5 rows)
and
SELECT * FROM T2;
gid | dt_ping
-----+---------------------
1 | 2009-06-06 19:00:00
2 | 2009-06-06 19:00:15
3 | 2009-06-06 19:00:30
4 | 2009-06-06 19:00:45
(4 rows)
I want a result T3 in which every row of T1 is assigned a random dt_ping from T2, with T2 rows repeated where necessary. For example, one possible result is:
gid | call_s1 | dt_ping
-----+----------+---------
1 | 1 | 2009-06-06 19:00:45
3 | 1 | 2009-06-06 19:00:30
4 | 1 | 2009-06-06 19:00:15
7 | 1 | 2009-06-06 19:00:00
8 | 1 | 2009-06-06 19:00:45
I have tried offsets, random ordering, and so on. I get either a Cartesian product or nulls. For example, this is my latest attempt:
SELECT
T1.gid
, T1.call_s1
, T2.dt_ping
FROM
(
SELECT
gid
, call_s1
, ceiling( random() * (SELECT count(*)::int as n FROM fake_called_small ) ) tgid
FROM fake_called_small
) T1
LEFT OUTER JOIN
fake_times_small T2
ON
T2.gid = T1.tgid;
One of the results I got:
gid | call_s1 | dt_ping
-----+---------+---------------------
1 | 1 |
3 | 1 | 2009-06-06 19:00:30
4 | 1 | 2009-06-06 19:00:45
7 | 1 | 2009-06-06 19:00:15
8 | 1 |
(5 rows)
I know I am missing something simple, but what?
By the way, I also tried this:
SELECT
T1.gid
, T1.call_s1
, (SELECT dt_ping FROM fake_times_small T2 ORDER BY random() LIMIT 1)
FROM fake_called_small T1;
and got:
gid | call_s1 | dt_ping
-----+---------+---------------------
1 | 1 | 2009-06-06 19:00:30
3 | 1 | 2009-06-06 19:00:30
4 | 1 | 2009-06-06 19:00:30
7 | 1 | 2009-06-06 19:00:30
8 | 1 | 2009-06-06 19:00:30
(5 rows)
The same dt_ping row is repeated in every row, because the subselect is executed only once.
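One possible workaround for the subselect being run only once is to make it refer to the outer row, so that it becomes a correlated subquery and should be re-evaluated for each row of fake_called_small. The sketch below assumes PostgreSQL; the WHERE T1.gid = T1.gid condition is only a device introduced here to create that reference, and how often the subselect actually runs is up to the planner, so treat this as something to try rather than guaranteed behaviour.

```sql
SELECT
    T1.gid
  , T1.call_s1
  , (
      SELECT dt_ping
      FROM fake_times_small T2
      -- Reference to the outer row: makes the subselect correlated.
      -- The condition is true whenever T1.gid is not null.
      WHERE T1.gid = T1.gid
      ORDER BY random()
      LIMIT 1
    ) AS dt_ping
FROM fake_called_small T1;
```

Note that a row with a null T1.gid would get a null dt_ping, and that fake_times_small is sorted once per outer row, so this is only reasonable while that table stays small.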
If you don't care about performance, the query below should do what you want:
SELECT gid, call_s1, dt_ping
FROM (
SELECT t1.gid, call_s1, dt_ping,
ROW_NUMBER() OVER (PARTITION BY t1.gid, call_s1 ORDER BY RANDOM()) AS rn
FROM t1
CROSS JOIN t2
) x
WHERE rn = 1;
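As for the nulls in the join attempt from the question: tgid is drawn using the row count of fake_called_small (5 rows in the toy data), while fake_times_small only has gid values 1 to 4, so a tgid of 5 finds no partner in the LEFT OUTER JOIN. ceiling() can also return 0 if random() happens to return exactly 0. A minimal sketch of a repaired version follows, assuming PostgreSQL 8.4 or later (for window functions) and assuming that fake_called_small and fake_times_small are the question's T1 and T2; the aliases c, t, pick and rn are introduced here for illustration only.

```sql
SELECT
    c.gid
  , c.call_s1
  , t.dt_ping
FROM
  (
    SELECT
        gid
      , call_s1
        -- random() is in [0, 1), so floor(...) + 1 is an integer in 1..n,
        -- where n is the row count of the table we pick from
      , floor( random() * (SELECT count(*) FROM fake_times_small) )::int + 1 AS pick
    FROM fake_called_small
  ) c
JOIN
  (
    SELECT
        dt_ping
        -- number the rows 1..n so that gaps in gid do not matter
      , row_number() OVER (ORDER BY gid) AS rn
    FROM fake_times_small
  ) t
ON
    t.rn = c.pick;
```

Compared with the CROSS JOIN query, this avoids building every T1 × T2 combination: each row of the larger table draws one number and is matched to exactly one numbered row of the smaller table. If fake_times_small is empty, the inner join returns no rows at all.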