Joining the result of an inner join back onto the original table
I have the following two tables:
claims_tbl
client_ID claim_ID date
personA claim1 date1
personB claim2 date2
personB claim3 date3
personB claim4 date4
personC claim5 date5
personD claim6 date5
procedures_tbl
claim_id procedure_code
claim1 hash1
claim2 hash27
claim3 hash1
claim4 hash45
claim5 hash22
claim6 hash1
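I want to pull the claim_IDs associated with hash1 from claims_tbl, then pull the client_IDs associated with those claim IDs, and then use those client_IDs to pull all of the procedures associated with them.
I have: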
select *
from ((select distinct c.claims_id
from procedures_tbl p
inner join claims_tbl c
on p.claim_id=c.claim_id
and p.procedure_code = 'hash1') t
inner join claims_tbl c
on c.claims_id=t.claims_id) t
inner join from procedures_tbl p
on t.claims_id=p.claims_id
Is there a more efficient way to do this?
My expected table would be:
client_ID claim_ID date procedure_code
personA claim1 date1 hash1
personB claim2 date2 hash27
personB claim3 date3 hash1
personB claim4 date4 hash45
personD claim6 date5 hash1
Do the joins in the same order in which you described the relationships in the question.
SELECT c1.client_id, c2.claim_id, c2.date, p2.procedure_code
FROM procedures_tbl AS p1 -- Get claims associated with procedure hash1
JOIN claims_tbl AS c1 ON p1.claim_id = c1.claim_id -- get client_ID from those claims
JOIN claims_tbl AS c2 ON c1.client_id = c2.client_id -- get other claims for that client_id
JOIN procedures_tbl AS p2 ON p2.claim_id = c2.claim_id -- get those procedure codes
WHERE p1.procedure_code = 'hash1'
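As a quick sanity check, the query above can be run against the sample rows from the question. This is only a sketch: it assumes an in-memory SQLite database (through Python's built-in sqlite3 module) instead of Snowflake, guesses TEXT for every column type, and adds an ORDER BY purely so the output order is predictable.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; column types are guessed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claims_tbl (client_id TEXT, claim_id TEXT, date TEXT);
CREATE TABLE procedures_tbl (claim_id TEXT, procedure_code TEXT);

INSERT INTO claims_tbl VALUES
    ('personA', 'claim1', 'date1'),
    ('personB', 'claim2', 'date2'),
    ('personB', 'claim3', 'date3'),
    ('personB', 'claim4', 'date4'),
    ('personC', 'claim5', 'date5'),
    ('personD', 'claim6', 'date5');

INSERT INTO procedures_tbl VALUES
    ('claim1', 'hash1'),
    ('claim2', 'hash27'),
    ('claim3', 'hash1'),
    ('claim4', 'hash45'),
    ('claim5', 'hash22'),
    ('claim6', 'hash1');
""")

# The self-join query from the answer, with an ORDER BY added for stable output.
QUERY = """
SELECT c1.client_id, c2.claim_id, c2.date, p2.procedure_code
FROM procedures_tbl AS p1                                  -- claims with hash1
JOIN claims_tbl AS c1 ON p1.claim_id = c1.claim_id         -- their client_id
JOIN claims_tbl AS c2 ON c1.client_id = c2.client_id       -- all claims of that client
JOIN procedures_tbl AS p2 ON p2.claim_id = c2.claim_id     -- procedures of those claims
WHERE p1.procedure_code = 'hash1'
ORDER BY c2.claim_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

On this sample each client has at most one hash1 claim, so no duplicates show up. If a client had two hash1 claims, every row for that client would come back twice, and a SELECT DISTINCT would be needed.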
So I just slightly reformatted your SQL:
select *
from (
    select * from
    (
        select distinct
            c.claims_id
        from procedures_tbl p
        join claims_tbl c
            on p.claim_id = c.claim_id
            and p.procedure_code = 'hash1'
    ) t1
    join claims_tbl c
        on c.claims_id = t1.claims_id
) t2
join from procedures_tbl p
    on t2.claims_id=p.claims_id
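t1 selects the distinct set of claims_ids that appear in both the procedures and claims tables and also have the procedure_code 'hash1'.
That set is then used to select all claims with the same id, which in turn is used to select all procedures with the same claims id.
The DISTINCT in the t1 block implies that the relationship between these two tables is more than one-to-one.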
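To see what that DISTINCT is doing, here is a small sketch that counts the joined rows with and without it. It assumes SQLite (through Python's sqlite3 module) rather than Snowflake, and a made-up data set in which every claims_id has two hash1 rows; the TEMPLATE string and the variable names are illustrative only.

```python
import sqlite3

# Assumption: SQLite stands in for Snowflake; the data is invented for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE procedures_tbl (claim_id INTEGER, claims_id INTEGER,
                             procedure_code TEXT, p_details TEXT);
CREATE TABLE claims_tbl (claim_id INTEGER, claims_id INTEGER, c_details TEXT);

INSERT INTO procedures_tbl VALUES
    (1, 10, 'hash1', 'pd 1'),
    (2, 10, 'hash1', 'pd 2'),
    (3, 11, 'hash1', 'pd 3'),
    (4, 11, 'hash1', 'pd 4');

INSERT INTO claims_tbl VALUES
    (1, 10, 'cd 1'),
    (2, 10, 'cd 2'),
    (3, 11, 'cd 3'),
    (4, 11, 'cd 4');
""")

# Same join shape as the query under discussion; {distinct} is filled in
# with either "DISTINCT" or an empty string.
TEMPLATE = """
SELECT COUNT(*)
FROM (
    SELECT {distinct} c.claims_id
    FROM procedures_tbl p
    JOIN claims_tbl c
      ON p.claim_id = c.claim_id
     AND p.procedure_code = 'hash1'
) t1
JOIN claims_tbl c ON c.claims_id = t1.claims_id
JOIN procedures_tbl p ON p.claims_id = t1.claims_id
"""

with_distinct = conn.execute(TEMPLATE.format(distinct="DISTINCT")).fetchone()[0]
without_distinct = conn.execute(TEMPLATE.format(distinct="")).fetchone()[0]

print("rows with DISTINCT:   ", with_distinct)
print("rows without DISTINCT:", without_distinct)
```

Because each claims_id is matched twice in the inner join, dropping the DISTINCT doubles every output row (16 rows instead of 8 on this data).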
So, making some fake data with that kind of relationship:
WITH procedures_tbl(claim_id, claims_id, procedure_code,p_details) AS (
SELECT * FROM VALUES
(1, 10, 'hash1', 'pd 1'),
(2, 10, 'hash1', 'pd 2'),
(3, 11, 'hash1', 'pd 3'),
(4, 11, 'hash1', 'pd 4')
), claims_tbl(claim_id, claims_id, c_details) AS (
SELECT * FROM VALUES
(1, 10, 'cd 1'),
(2, 10, 'cd 2'),
(3, 11, 'cd 3'),
(4, 11, 'cd 4')
)
and then altering the SQL so that it runs (in Snowflake):
select p.*
    ,t2.*
from (
    select c.* from
    (
        select distinct
            c.claims_id
        from procedures_tbl p
        join claims_tbl c
            on p.claim_id = c.claim_id
            and p.procedure_code = 'hash1'
    ) t1
    join claims_tbl c
        on c.claims_id = t1.claims_id
) t2
join procedures_tbl p
    on t2.claims_id = p.claims_id
order by 1,2,3,4;
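CLAIM_ID | CLAIMS_ID | PROCEDURE_CODE | P_DETAILS | CLAIM_ID | CLAIMS_ID | C_DETAILS |
---|---|---|---|---|---|---|
1 | 10 | hash1 | pd 1 | 1 | 10 | cd 1 |
1 | 10 | hash1 | pd 1 | 2 | 10 | cd 2 |
2 | 10 | hash1 | pd 2 | 1 | 10 | cd 1 |
2 | 10 | hash1 | pd 2 | 2 | 10 | cd 2 |
3 | 11 | hash1 | pd 3 | 3 | 11 | cd 3 |
3 | 11 | hash1 | pd 3 | 4 | 11 | cd 4 |
4 | 11 | hash1 | pd 4 | 3 | 11 | cd 3 |
4 | 11 | hash1 | pd 4 | 4 | 11 | cd 4 |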
So this gets the full procedures_tbl joined to claims_tbl for every claims_id where any one of its entries has code = hash1.
This is equivalent to:
SELECT p.*
    ,c.*
FROM procedures_tbl p
JOIN claims_tbl c
    ON c.claims_id = p.claims_id
WHERE c.claims_id IN (
    select
        sc.claims_id
    from procedures_tbl sp
    join claims_tbl sc
        on sp.claim_id = sc.claim_id
        and sp.procedure_code = 'hash1'
);
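The two should have the same execution plan. Still, I would guess your version will be faster in Snowflake, because the WHERE ... IN filter might not always run fast enough when procedures_tbl and claims_tbl are large.
I would tend to write it like this, as it is the same thing: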
select
    p.*
    ,c.*
from (
    select distinct
        c.claims_id
    from procedures_tbl p
    join claims_tbl c
        on p.claim_id = c.claim_id
        and p.procedure_code = 'hash1'
) d
join claims_tbl c
    on c.claims_id = d.claims_id
join procedures_tbl p
    on p.claims_id = d.claims_id
order by 1,2,3,4;
So with this slightly expanded data:
WITH procedures_tbl(claim_id, claims_id, procedure_code,p_details) AS (
SELECT * FROM VALUES
(1, 10, 'hash1', 'pd 1'),
(2, 10, 'hash2', 'pd 2'),
(3, 11, 'hash1', 'pd 3'),
(4, 11, 'hash3', 'pd 4'),
(5, 12, 'hash2', 'pd 5'),
(6, 12, 'hash3', 'pd 6')
), claims_tbl(claim_id, claims_id, c_details) AS (
SELECT * FROM VALUES
(1, 10, 'cd 1'),
(2, 10, 'cd 2'),
(3, 11, 'cd 3'),
(4, 11, 'cd 4'),
(5, 12, 'cd 5'),
(6, 12, 'cd 6')
)
we get:
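CLAIM_ID | CLAIMS_ID | PROCEDURE_CODE | P_DETAILS | CLAIM_ID | CLAIMS_ID | C_DETAILS |
---|---|---|---|---|---|---|
1 | 10 | hash1 | pd 1 | 1 | 10 | cd 1 |
1 | 10 | hash1 | pd 1 | 2 | 10 | cd 2 |
2 | 10 | hash2 | pd 2 | 1 | 10 | cd 1 |
2 | 10 | hash2 | pd 2 | 2 | 10 | cd 2 |
3 | 11 | hash1 | pd 3 | 3 | 11 | cd 3 |
3 | 11 | hash1 | pd 3 | 4 | 11 | cd 4 |
4 | 11 | hash3 | pd 4 | 3 | 11 | cd 3 |
4 | 11 | hash3 | pd 4 | 4 | 11 | cd 4 |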
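The claim that the nested form, the WHERE ... IN form, and the flattened form return the same rows can be checked on the expanded data. The sketch below assumes SQLite via Python's sqlite3 module rather than Snowflake, so it says nothing about Snowflake's execution plans or speed; it only compares result sets. The names NESTED, WHERE_IN, FLAT and results are made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for Snowflake; only result sets are compared.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE procedures_tbl (claim_id INTEGER, claims_id INTEGER,
                             procedure_code TEXT, p_details TEXT);
CREATE TABLE claims_tbl (claim_id INTEGER, claims_id INTEGER, c_details TEXT);

INSERT INTO procedures_tbl VALUES
    (1, 10, 'hash1', 'pd 1'), (2, 10, 'hash2', 'pd 2'),
    (3, 11, 'hash1', 'pd 3'), (4, 11, 'hash3', 'pd 4'),
    (5, 12, 'hash2', 'pd 5'), (6, 12, 'hash3', 'pd 6');

INSERT INTO claims_tbl VALUES
    (1, 10, 'cd 1'), (2, 10, 'cd 2'),
    (3, 11, 'cd 3'), (4, 11, 'cd 4'),
    (5, 12, 'cd 5'), (6, 12, 'cd 6');
""")

# Form 1: the nested version that was made to run.
NESTED = """
SELECT p.*, t2.*
FROM (
    SELECT c.*
    FROM (
        SELECT DISTINCT c.claims_id
        FROM procedures_tbl p
        JOIN claims_tbl c
          ON p.claim_id = c.claim_id AND p.procedure_code = 'hash1'
    ) t1
    JOIN claims_tbl c ON c.claims_id = t1.claims_id
) t2
JOIN procedures_tbl p ON t2.claims_id = p.claims_id
"""

# Form 2: plain join filtered with WHERE ... IN.
WHERE_IN = """
SELECT p.*, c.*
FROM procedures_tbl p
JOIN claims_tbl c ON c.claims_id = p.claims_id
WHERE c.claims_id IN (
    SELECT sc.claims_id
    FROM procedures_tbl sp
    JOIN claims_tbl sc
      ON sp.claim_id = sc.claim_id AND sp.procedure_code = 'hash1'
)
"""

# Form 3: the flattened version with a single derived table.
FLAT = """
SELECT p.*, c.*
FROM (
    SELECT DISTINCT c.claims_id
    FROM procedures_tbl p
    JOIN claims_tbl c
      ON p.claim_id = c.claim_id AND p.procedure_code = 'hash1'
) d
JOIN claims_tbl c ON c.claims_id = d.claims_id
JOIN procedures_tbl p ON p.claims_id = d.claims_id
"""

# Sort in Python so the comparison does not depend on row order.
results = {
    name: sorted(conn.execute(sql).fetchall())
    for name, sql in [("nested", NESTED), ("where_in", WHERE_IN), ("flat", FLAT)]
}

print(results["nested"] == results["where_in"] == results["flat"])
print(len(results["flat"]), "rows")
```

claims_id 12 has no hash1 procedure, so none of its rows appear in any of the three results.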