Best way to concatenate rows in SQL
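I am currently using the code below to produce a summary report of audited fixed assets. At first it worked fine, but after 3 years I noticed that the system has become very slow. I revisited the code and found that this query is what slows the system down. When I remove the SUBSTRING part, the system runs faster. Is there a workaround for this?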
SELECT
    ENTITY
    , [ASSET NUMBER]
    , [YEAR AUDITED]
    , SUM(COUNT) AS AUDITED
    , SUBSTRING
    (
        (
            SELECT
                ', ' + [SCANNED BY].USERNAME
            FROM dbo.vwAUDITED as [SCANNED BY]
            WHERE vwAUDITED.[ASSET NUMBER] = [SCANNED BY].[ASSET NUMBER]
                AND vwAUDITED.ENTITY = [SCANNED BY].ENTITY
                AND vwAUDITED.[YEAR AUDITED] = [SCANNED BY].[YEAR AUDITED]
            ORDER BY [SCANNED BY].[ASSET NUMBER] FOR XML PATH('')
        ), 2, 1000
    ) AS [SCANNED BY]
    , MAX(DATE) AS [COMPLETION DATE]
FROM dbo.vwAUDITED
GROUP BY ENTITY
    , [ASSET NUMBER]
    , [YEAR AUDITED]
One approach is to add a user-defined aggregate function to your version of MS SQL Server, as an alternative for the missing STRING_AGG function, and then replace the correlated subquery with that function.
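For comparison, on SQL Server 2017 or later STRING_AGG is built in, so the correlated subquery can be dropped entirely. A minimal sketch, assuming such a version is available; the table variable @audited and its sample rows are made up here to stand in for dbo.vwAUDITED, and the column types are guesses:

```sql
-- Hypothetical sample data standing in for dbo.vwAUDITED (types are assumptions).
DECLARE @audited TABLE (
    [ENTITY]       INT NOT NULL,
    [ASSET NUMBER] INT NOT NULL,
    [YEAR AUDITED] INT NOT NULL,
    [COUNT]        INT,
    [DATE]         DATE,
    [USERNAME]     VARCHAR(100)
);

INSERT INTO @audited
    ([ENTITY], [ASSET NUMBER], [YEAR AUDITED], [COUNT], [DATE], [USERNAME])
VALUES
    (1, 100, 2020, 1, '2020-03-01', 'alice'),
    (1, 100, 2020, 1, '2020-03-05', 'bob'),
    (1, 200, 2020, 1, '2020-04-10', 'carol');

-- One pass over the data: STRING_AGG is an ordinary aggregate,
-- so it sits next to SUM and MAX in the same GROUP BY.
SELECT
    [ENTITY],
    [ASSET NUMBER],
    [YEAR AUDITED],
    SUM([COUNT]) AS AUDITED,
    -- The cast to VARCHAR(MAX) avoids the 8000-byte limit on non-MAX results.
    STRING_AGG(CAST([USERNAME] AS VARCHAR(MAX)), ', ')
        WITHIN GROUP (ORDER BY [USERNAME]) AS [SCANNED BY],
    MAX([DATE]) AS [COMPLETION DATE]
FROM @audited
GROUP BY [ENTITY], [ASSET NUMBER], [YEAR AUDITED];
```

Note that, unlike the SUBSTRING(..., 2, 1000) version in the question, this does not cut the list off at 1000 characters.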
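Another approach, which doesn't involve the database administrator, is to use a temporary table with a composite index on the fields used to correlate the FOR XML subquery.
Tested on db<>fiddle.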
-- Just assuming the datatypes here, so change them to the correct types.
IF OBJECT_ID('tempdb..#tmpAUDITED') IS NOT NULL DROP TABLE #tmpAUDITED;
CREATE TABLE #tmpAUDITED
(
    [ENTITY] INT NOT NULL,
    [ASSET NUMBER] INT NOT NULL,
    [YEAR AUDITED] INT NOT NULL,
    [COUNT] INT,
    [DATE] DATE,
    [USERNAME] VARCHAR(100),
    INDEX idx_1 NONCLUSTERED ([ENTITY], [ASSET NUMBER], [YEAR AUDITED])
);
INSERT INTO #tmpAUDITED (
    [ENTITY], [ASSET NUMBER], [YEAR AUDITED], [COUNT], [DATE], [USERNAME]
)
SELECT
    [ENTITY], [ASSET NUMBER], [YEAR AUDITED], [COUNT], [DATE], [USERNAME]
FROM dbo.vwAUDITED;
-- Now using the temp table instead of the view.
SELECT
    [ENTITY],
    [ASSET NUMBER],
    [YEAR AUDITED],
    SUM([COUNT]) AS AUDITED,
    SUBSTRING
    ((
        SELECT ', ' + s.[USERNAME]
        FROM #tmpAUDITED AS s
        WHERE s.[ENTITY] = [AUDITED].[ENTITY]
            AND s.[ASSET NUMBER] = [AUDITED].[ASSET NUMBER]
            AND s.[YEAR AUDITED] = [AUDITED].[YEAR AUDITED]
        -- ORDER BY s.[ENTITY], s.[ASSET NUMBER], s.[YEAR AUDITED], s.[USERNAME]
        FOR XML PATH('')
    ), 2, 1000) AS [SCANNED BY],
    MAX([DATE]) AS [COMPLETION DATE]
FROM #tmpAUDITED AS [AUDITED]
GROUP BY [ENTITY], [ASSET NUMBER], [YEAR AUDITED];
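A side note on the FOR XML part itself. SUBSTRING(..., 2, 1000) removes only the leading comma, so the result starts with a space, and it cuts the list at 1000 characters. Plain FOR XML PATH('') also XML-escapes characters such as & and <. A commonly used variant, sketched below, is STUFF combined with TYPE and .value(). The table variable @audited and its rows are hypothetical stand-ins for the temp table above:

```sql
-- Hypothetical sample data; the username with '&' shows the escaping issue.
DECLARE @audited TABLE (
    [ENTITY]       INT NOT NULL,
    [ASSET NUMBER] INT NOT NULL,
    [YEAR AUDITED] INT NOT NULL,
    [COUNT]        INT,
    [DATE]         DATE,
    [USERNAME]     VARCHAR(100)
);

INSERT INTO @audited
    ([ENTITY], [ASSET NUMBER], [YEAR AUDITED], [COUNT], [DATE], [USERNAME])
VALUES
    (1, 100, 2020, 1, '2020-03-01', 'alice'),
    (1, 100, 2020, 1, '2020-03-05', 'R&D bob');

SELECT
    a.[ENTITY],
    a.[ASSET NUMBER],
    a.[YEAR AUDITED],
    SUM(a.[COUNT]) AS AUDITED,
    STUFF((
        SELECT ', ' + s.[USERNAME]
        FROM @audited AS s
        WHERE s.[ENTITY] = a.[ENTITY]
            AND s.[ASSET NUMBER] = a.[ASSET NUMBER]
            AND s.[YEAR AUDITED] = a.[YEAR AUDITED]
        ORDER BY s.[USERNAME]
        -- TYPE returns an xml value; .value() reads it back as plain text,
        -- which undoes the XML escaping of characters such as '&'.
        FOR XML PATH(''), TYPE
    ).value('.', 'VARCHAR(MAX)'), 1, 2, '') AS [SCANNED BY], -- drop the leading ', '
    MAX(a.[DATE]) AS [COMPLETION DATE]
FROM @audited AS a
GROUP BY a.[ENTITY], a.[ASSET NUMBER], a.[YEAR AUDITED];
```

Without TYPE and .value(), the second username would come back as 'R&amp;D bob'. This changes the output format only; it is not expected to make the query faster by itself.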
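Also, it may be worth checking whether the query inside the view can be optimized. Adding suitable indexes on the tables the view uses may also improve the performance of the view itself.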
The best way to concatenate is:
declare @str nvarchar(max) = space(0);
select @str += [column] from [yourTable];
select @str;
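To make the snippet above concrete, here is a small self-contained sketch with a separator. The table variable @names and its rows are made up for illustration. Two caveats, as far as I know: this pattern builds a single string for the whole row set rather than one string per group, and the order in which rows are appended is not guaranteed.

```sql
-- Hypothetical sample rows to concatenate.
DECLARE @names TABLE ([USERNAME] VARCHAR(100) NOT NULL);
INSERT INTO @names ([USERNAME]) VALUES ('alice'), ('bob'), ('carol');

DECLARE @str NVARCHAR(MAX) = N'';

-- Each row appends ', ' plus its value to the variable.
SELECT @str += N', ' + [USERNAME] FROM @names;

-- STUFF removes the leading ', ' (2 characters starting at position 1).
SELECT STUFF(@str, 1, 2, N'') AS [SCANNED BY];
```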
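However, your case is a bit different.
In your case the concatenation itself is written fine, and the performance drop is probably caused by the structure of the view. If you cannot see the view's definition,
you can reduce the performance problem by using the main view recordset as a CTE and then performing the resource-expensive concatenation on the CTE instead of the view: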
;with [data] as (
    select [entity], [asset number], [year audited], [username], [date], [count]
    from [dbo].[vwAUDITED]
)
select
    [entity] = [d].[entity]
    ,[asset number] = [d].[asset number]
    ,[year audited] = [d].[year audited]
    ,[audited] = sum([d].[count])
    ,[scanned by] = substring((
        select
            ', ' + [s].[username]
        from
            [data] as [s] -- read from the CTE, not the view, to match the description above
        where
            [s].[asset number] = [d].[asset number]
            and [s].[entity] = [d].[entity]
            and [s].[year audited] = [d].[year audited]
        order by
            [s].[asset number]
        for xml path('')
    ), 2, 1000)
    ,[completion date] = max([d].[date])
from
    [data] as [d]
group by
    [d].[entity]
    ,[d].[asset number]
    ,[d].[year audited];