Join each row with the record that has the most similar name from each of multiple tables
Platform: PostgreSQL
Tables:
shortlist: name (text), city (text)...
data1: name (text), ranking (integer), score1 (double)...
data2: name (text), ranking (integer), score1 (double)...
data3: name (text), ranking (integer), score1 (double)...
data4: name (text), ranking (integer), score1 (double)...
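There is a limited number of data tables with a similar format.
I want to join each row in shortlist with the row in each data table that has the most similar name, as measured by similarity(shortlist.name, data#.name).
Pseudocode for the same idea: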
for each s_row in shortlist:
    select shortlist.*
    join (SELECT data1.*, similarity(s_row.name, data1.name) AS sim FROM data1 ORDER BY sim DESC LIMIT 1)
    join (SELECT data2.*, similarity(s_row.name, data2.name) AS sim FROM data2 ORDER BY sim DESC LIMIT 1)
    join (SELECT data3.*, similarity(s_row.name, data3.name) AS sim FROM data3 ORDER BY sim DESC LIMIT 1)
    join (SELECT data4.*, similarity(s_row.name, data4.name) AS sim FROM data4 ORDER BY sim DESC LIMIT 1)
Is there a way to do this in SQL?
I'm not entirely sure what you are looking for, but something like this:
select s.name,
       d1.name as d1_name,
       d2.name as d2_name
from shortlist s
  left join lateral (
    select data1.*, similarity(s.name, data1.name) as sim
    from data1
    order by sim desc
    limit 1
  ) d1 on true
  left join lateral (
    select data2.*, similarity(s.name, data2.name) as sim
    from data2
    order by sim desc
    limit 1
  ) d2 on true
You want an outer join (left join) for each table; otherwise, if at least one of the tables has no match, you won't see anything for that shortlist row.
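The same pattern extends to all four data tables from the question. The following is a sketch rather than a tested solution: it assumes similarity() is the function provided by the pg_trgm extension, that each data table has the name, ranking and score1 columns listed in the question, and the output aliases (d1_name, d1_sim, and so on) are made up for illustration.

```sql
-- Assumption: similarity() comes from pg_trgm, so the extension must be installed.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

SELECT s.*,
       d1.name AS d1_name, d1.ranking AS d1_ranking, d1.sim AS d1_sim,
       d2.name AS d2_name, d2.ranking AS d2_ranking, d2.sim AS d2_sim,
       d3.name AS d3_name, d3.ranking AS d3_ranking, d3.sim AS d3_sim,
       d4.name AS d4_name, d4.ranking AS d4_ranking, d4.sim AS d4_sim
FROM shortlist s
-- Each lateral subquery is evaluated once per shortlist row and
-- returns at most one row: the best-matching name in that table.
LEFT JOIN LATERAL (
    SELECT d.name, d.ranking, d.score1, similarity(s.name, d.name) AS sim
    FROM data1 d
    ORDER BY sim DESC, d.name   -- d.name only makes ties deterministic
    LIMIT 1
) d1 ON true
LEFT JOIN LATERAL (
    SELECT d.name, d.ranking, d.score1, similarity(s.name, d.name) AS sim
    FROM data2 d
    ORDER BY sim DESC, d.name
    LIMIT 1
) d2 ON true
LEFT JOIN LATERAL (
    SELECT d.name, d.ranking, d.score1, similarity(s.name, d.name) AS sim
    FROM data3 d
    ORDER BY sim DESC, d.name
    LIMIT 1
) d3 ON true
LEFT JOIN LATERAL (
    SELECT d.name, d.ranking, d.score1, similarity(s.name, d.name) AS sim
    FROM data4 d
    ORDER BY sim DESC, d.name
    LIMIT 1
) d4 ON true;
```

Note that each subquery returns the closest row even when the similarity is very low, as long as the data table is not empty, so you may want to filter on the sim columns afterwards. On larger tables, every shortlist row scans each data table in full; a GiST index using gist_trgm_ops combined with ordering by the pg_trgm distance operator (s.name <-> d.name) is one possible way to speed this up.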