SQL - Create a new table by joining two tables based on Start and End values and then concatenate rows based on Start and End values?
This is a follow-up to an earlier question. I am new to SQL. I have two tables, Table 1 and Table 2. Table 1 has the columns ID, SStart, SEnd, UpStart, UpEnd, DnStart, and DnEnd. Table 2 has the columns Position and Seq.
Table 1
ID | UPSTART | UPEND | SStart | SEnd | DNSTART | DNEND |
---|---|---|---|---|---|---|
1 | 98 | 99 | 100 | 104 | 105 | 106 |
2 | 98 | 99 | 100 | 104 | 105 | 106 |
3 | 100 | 101 | 102 | 106 | 107 | 108 |
4 | 100 | 101 | 102 | 106 | 107 | 108 |
Table 2
Position | Seq |
---|---|
98 | M |
99 | N |
100 | A |
101 | T |
102 | C |
103 | T |
104 | G |
105 | T |
106 | T |
107 | G |
108 | T |
109 | G |
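My final table needs the columns ID, SStart, SSEnd, FullSeq, UpStart, UpEnd, UpSeq, DnStart, DnEnd, and DnSeq, as shown below:
ID | UpStart | UpEnd | UpSeq | SStart | SSEnd | FullSeq | DnStart | DnEnd | DnSeq |
---|---|---|---|---|---|---|---|---|---|
1 | 98 | 99 | MN | 100 | 104 | ATCTG | 105 | 106 | TT |
2 | 98 | 99 | MN | 100 | 104 | ATCTG | 105 | 106 | TT |
3 | 100 | 101 | AT | 102 | 106 | CTGTT | 107 | 108 | GT |
4 | 100 | 101 | AT | 102 | 106 | CTGTT | 107 | 108 | GT |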
I am using SQL Server 2016. I tried:
; WITH SequenceCTE AS (
    SELECT
        [ID],
        [SStart],
        [SSEnd],
        [UpStart],
        [UpEnd],
        Seq,
        [DnStart],
        [DnEnd],
        [Position]
    FROM Table_1 a
    JOIN Table_2 b
        ON b.Position >= a.[UpStart] AND
           b.Position <= a.[DnEnd]
)
SELECT DISTINCT
    a.ID,
    a.[UpStart],
    a.[UpEnd],
    UpSeq = (
        SELECT STUFF(',' + Seq,1,1,'')
        FROM SequenceCTE b
        WHERE
            a.ID = b.ID AND
            a.[Position] >= b.[UpStart] AND
            a.[Position] <= b.[UpEnd] AND
        order by b.Position
        FOR XML PATH ('')
    ),
    a.[SStart],
    a.[SSEnd],
    FullSeq = (
        SELECT STUFF(',' + Seq,1,1,'')
        FROM SequenceCTE b
        WHERE
            a.ID = b.ID AND
            a.[SStart] = b.[SStart] AND
            a.[SSEnd] = b.[SSEnd]
        order by b.Position
        FOR XML PATH ('')
    ),
    a.[DnStart],
    a.[DnEnd],
    DownSeq = (
        SELECT STUFF(',' + Seq,1,1,'')
        FROM SequenceCTE b
        WHERE
            a.ID = b.ID AND
            a.[Position] >= b.[DnStart] AND
            a.[Position] <= b.[DnEnd]
        order by b.Position
        FOR XML PATH ('')
    )
FROM SequenceCTE a
but it did not work. Then I tried:
With FullSeqCTE as (
    Select *,
           b1.Seq as FullSeq
    from Table_1 a
    join Table_2 b1 ON b1.Position >= a.SStart and b1.Position <= a.SSEnd
), UpperSeqCTE as (
    Select *,
           b2.Seq as UpSeq
    from Table_1 a
    join Table_2 b2 ON b2.Position >= a.UpStart and b2.Position <= a.UpEnd
), LowerSeqCTE as (
    Select *,
           b3.Seq as DownSeq
    from Table_1 a
    join Table_2 b3 ON b3.Position >= a.DnStart and b3.Position <= a.DnEnd
)
but I am not sure how to proceed. Any help is much appreciated.
Create statement for Table 1
CREATE TABLE [Table_1](
[SStart] [int] NULL,
[SSend] [int] NULL,
[ID] [int] NULL,
[UpStart] [int] NULL,
[UpEnd] [int] NULL,
[DnStart] [int] NULL,
[DnEnd] [int] NULL
) ON [PRIMARY]
GO
Insert statement for Table 1
INSERT INTO [Table_1]
([ID]
,[UpStart]
,[UpEnd]
,[SStart]
,[SSend]
,[DnStart]
,[DnEnd])
VALUES
(1,98,99,100,104,105,106),
(2,98,99,100,104,105,106),
(3,100,101,102,106,107,108),
(4,100,101,102,106,107,108)
GO
Create statement for Table 2
CREATE TABLE [Table_2](
[Position] [int] NULL,
[Seq] [nvarchar](1) NULL
) ON [PRIMARY]
Go
Insert statement for Table 2
INSERT INTO [dbo].[Table_2]
([Position]
,[Seq])
VALUES
(98,'M'),
(99,'N'),
(100,'A'),
(101,'T'),
(102,'C'),
(103,'T'),
(104,'G'),
(105,'T'),
(106,'T'),
(107,'G'),
(108,'T'),
(109,'G')
GO
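I am using common table expressions (cte_Up, cte_S, cte_Dn) to limit the required groupings: each one aggregates a single range per Id.
Solution 1
With SQL Server 2017 or later, you can use the STRING_AGG() function to concatenate the column values.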
with cte_Up as
(
    select t1.Id, string_agg(t2.Seq, '') within group (order by t2.Position) as UpSeq
    from table_1 t1
    join table_2 t2
      on t2.Position >= t1.UpStart
     and t2.Position <= t1.UpEnd
    group by t1.Id
),
cte_S as
(
    -- the CREATE TABLE statement in the question names this column SSend
    select t1.Id, string_agg(t2.Seq, '') within group (order by t2.Position) as SSeq
    from table_1 t1
    join table_2 t2
      on t2.Position >= t1.SStart
     and t2.Position <= t1.SSend
    group by t1.Id
),
cte_Dn as
(
    select t1.Id, string_agg(t2.Seq, '') within group (order by t2.Position) as DnSeq
    from table_1 t1
    join table_2 t2
      on t2.Position >= t1.DnStart
     and t2.Position <= t1.DnEnd
    group by t1.Id
)
select t1.Id,
       t1.UpStart,
       t1.UpEnd,
       u.UpSeq,
       t1.SStart,
       t1.SSend as SEnd,
       s.SSeq,
       t1.DnStart,
       t1.DnEnd,
       d.DnSeq
from table_1 t1
join cte_Up u
  on u.Id = t1.Id
join cte_S s
  on s.Id = t1.Id
join cte_Dn d
  on d.Id = t1.Id;
Fiddle to see it in action.
Solution 2
When string_agg() is not available, use the for xml clause to implement the string concatenation.
with cte_Up as
(
    select t1.Id,
           ( select '' + t2.Seq
             from table_2 t2
             where t2.Position >= t1.UpStart
               and t2.Position <= t1.UpEnd
             order by t2.Position
             for xml path('') ) as UpSeq
    from table_1 t1
),
cte_S as
(
    -- the CREATE TABLE statement in the question names this column SSend
    select t1.Id,
           ( select '' + t2.Seq
             from table_2 t2
             where t2.Position >= t1.SStart
               and t2.Position <= t1.SSend
             order by t2.Position
             for xml path('') ) as SSeq
    from table_1 t1
),
cte_Dn as
(
    select t1.Id,
           ( select '' + t2.Seq
             from table_2 t2
             where t2.Position >= t1.DnStart
               and t2.Position <= t1.DnEnd
             order by t2.Position
             for xml path('') ) as DnSeq
    from table_1 t1
)
select t1.Id,
       t1.UpStart,
       t1.UpEnd,
       u.UpSeq,
       t1.SStart,
       t1.SSend as SEnd,
       s.SSeq,
       t1.DnStart,
       t1.DnEnd,
       d.DnSeq
from table_1 t1
join cte_Up u
  on u.Id = t1.Id
join cte_S s
  on s.Id = t1.Id
join cte_Dn d
  on d.Id = t1.Id;
Fiddle to see it in action.
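One caveat with the for xml approach: the concatenated text is XML-encoded, so a Seq value such as `<` or `&` would come back as `&lt;` or `&amp;`. The sample data only contains plain letters, so it is not affected. If other characters are possible, a common variant (shown here as a sketch for the Up range only, not part of the original answer) adds the `type` directive and reads the text back with `.value()`:

```sql
-- Sketch: same correlated subquery as cte_Up, but the XML result is
-- converted back to plain text so special characters are not entity-encoded.
select t1.Id,
       ( select '' + t2.Seq
         from table_2 t2
         where t2.Position >= t1.UpStart
           and t2.Position <= t1.UpEnd
         order by t2.Position
         for xml path(''), type ).value('.', 'nvarchar(max)') as UpSeq
from table_1 t1;
```

The other two ranges would follow the same pattern with their own start and end columns.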
Result
Id | UpStart UpEnd UpSeq | SStart SEnd SSeq | DnStart DnEnd DnSeq
-- | ------- ----- ----- | ------ ---- ----- | ------- ----- -----
1 | 98 99 MN | 100 104 ATCTG | 105 106 TT
2 | 98 99 MN | 100 104 ATCTG | 105 106 TT
3 | 100 101 AT | 102 106 CTGTT | 107 108 GT
4 | 100 101 AT | 102 106 CTGTT | 107 108 GT
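Since the question asks for a new table, the result can also be materialized with `select ... into`. The following is a minimal sketch that works on SQL Server 2016, assuming the column names from the CREATE TABLE statements in the question (including `SSend`); the target name `Table_3` is a placeholder, not something from the original post.

```sql
-- Sketch: build the final table in one statement using correlated
-- for xml subqueries. Table_3 is a hypothetical name for the new table.
select t1.ID,
       t1.UpStart,
       t1.UpEnd,
       ( select '' + t2.Seq
         from Table_2 t2
         where t2.Position between t1.UpStart and t1.UpEnd
         order by t2.Position
         for xml path('') ) as UpSeq,
       t1.SStart,
       t1.SSend,
       ( select '' + t2.Seq
         from Table_2 t2
         where t2.Position between t1.SStart and t1.SSend
         order by t2.Position
         for xml path('') ) as FullSeq,
       t1.DnStart,
       t1.DnEnd,
       ( select '' + t2.Seq
         from Table_2 t2
         where t2.Position between t1.DnStart and t1.DnEnd
         order by t2.Position
         for xml path('') ) as DnSeq
into Table_3
from Table_1 t1;
```

Note that `select ... into` creates the target table and raises an error if a table with that name already exists; for an existing table, `insert into ... select` would be the usual alternative.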