SQL Server stored procedure taking a LONG time to run after minor changes
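I have a stored procedure that takes information from two tables I created and generates a summary table, which is then used by several views.
It used to take 60-90 seconds to run. I had two calls to functions for different costs, and a third call for cost * quantity. I removed all three and replaced them with a single new function that is almost identical to one of the other cost functions.
I wrote this post while working on the problem, so things improved along the way. I have sped it up, but it is still not as fast as it used to be, and I'm not sure why.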
ALTER FUNCTION [dbo].[fn_getFactoryStdCost]
(@PartID int)
RETURNS decimal(20, 4)
AS
BEGIN
DECLARE @pureID int = 0
SET @pureID = (SELECT TOP(1) PURE_COST_ID
FROM visuser.PART_COST
WHERE EN_PART_ID = @partID
ORDER BY EN_REV_MASTER_ID DESC, IC_WAREHOUSE_ID DESC)
RETURN (SELECT TOP(1) (TOT_MATERIAL_N + TOT_MATERIAL_OVERHEAD_N)
FROM visuser.PURE_COST
WHERE PURE_COST_ID = @pureID
ORDER BY (TOT_MATERIAL_N + TOT_MATERIAL_OVERHEAD_N) DESC)
END
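That function was replaced with the one below. I added WITH INLINE = OFF after it got stuck the first time, to rule inlining out as the cause. The function works fine on its own.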
ALTER FUNCTION [dbo].[fn_getFactoryStdCost]
(@PartID int)
RETURNS decimal(20,4)
WITH INLINE = OFF
AS
BEGIN
DECLARE @pureID int = 0
SET @pureID = (SELECT TOP(1) PURE_COST_ID
FROM visuser.PART_COST
WHERE EN_PART_ID = @partID
ORDER BY EN_REV_MASTER_ID DESC, IC_WAREHOUSE_ID DESC)
RETURN (SELECT TOP(1) (TOT_MATERIAL_N + TOT_MATERIAL_OVERHEAD_N + TOT_RUN_VALUE_N + TOT_FIXED_OVERHEAD_N) FROM visuser.PURE_COST WHERE PURE_COST_ID = @pureID ORDER BY (TOT_MATERIAL_N + TOT_MATERIAL_OVERHEAD_N) DESC)
END
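The other changes I made were adding [Qty] > 0 AND to the [Part Count] line, and replacing the string-based commodity entries with integers (more appropriate), since COMMODITY_ID is a reference to COMMODITY_CODE, which is the string.
I expected it to run faster, not to run indefinitely. The procedure now takes forever to run: I'm at 38 minutes and counting. I also tried copying just the code out of the procedure and running it directly, and that takes just as long, so the problem is somewhere in the code itself.
The AllPartsList table has 1.04m rows, as does the bomBreakdown table. The bomBreakdown table is far more complex and takes 40-60 seconds to generate. The bomSummary table will have 4,100 rows. The AllPartsList table is properly indexed; bomBreakdown is not.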
ALTER PROCEDURE [dbo].[createBOMSummary]
AS
DECLARE @processTime int=0, @begin datetime, @end datetime
SET @begin = SYSDATETIME()
IF OBJECT_ID(N'dbo.bomSummary', N'U') IS NOT NULL
DROP TABLE bomSummary
SELECT
DISTINCT ap.[SourcePartID] AS [Assembly Part ID],
p.[PART_X] AS [Assembly Part #],
p.[DESCR_X] AS [Assembly Part Description],
(SELECT COUNT(DISTINCT [Component Part #]) FROM [bomBreakdown] WHERE [Qty] > 0 AND [Component Part ID] IS NOT NULL AND SourcePartID = ap.SourcePartID GROUP BY [SourcePartID]) AS [Part Count],
(SELECT SUM([Qty]) FROM [bomBreakdown] WHERE [Component Part ID] IS NOT NULL AND SourcePartID = ap.[SourcePartID] GROUP BY [SourcePartID]) AS [Total # of Parts],
([dbo].[fn_getFactoryStdCost](ap.[SourcePartID])) AS [Factory Std Cost],
COALESCE(
(SELECT COUNT(DISTINCT ComponentPartID)
FROM AllPartsList apl
LEFT JOIN visuser.EN_PART p1
ON p1.[EN_Part_ID] = apl.[ComponentPartID]
WHERE
apl.ComponentPartID IS NOT NULL AND
apl.SourcePartID = ap.SourcePartID AND
p1.Commodity_ID IN (15, 84, 85, 87, 81, 92) -- Commodity Codes: 009, 072, 073, 075, 079, 082
GROUP BY SourcePartID
), 0) AS [# of Docs], --0sec
COALESCE(
(SELECT COUNT(DISTINCT ComponentPartID)
FROM AllPartsList apl
LEFT JOIN visuser.EN_PART p1
ON p1.[EN_Part_ID] = apl.[ComponentPartID]
WHERE
apl.ComponentPartID IS NOT NULL AND
apl.SourcePartID = ap.SourcePartID AND
p1.Commodity_ID IN (28) -- Commodity Code 034
GROUP BY SourcePartID
), 0) AS [# of Software], --0sec
COALESCE(
(SELECT COUNT(*)
FROM visuser.[PART_COST]
WHERE [STD_PO_Cost_N] > 0 AND
EN_PART_ID IN
(SELECT DISTINCT ComponentPartID FROM AllPartsList WHERE ComponentPartID IS NOT NULL AND SourcePartID = ap.SourcePartID)
), 0) AS [# of Std Cost Items], --0sec
COALESCE(
(SELECT COUNT(DISTINCT ComponentPartID)
FROM AllPartsList apl
LEFT JOIN visuser.EN_PART p1
ON p1.[EN_Part_ID] = apl.[ComponentPartID]
WHERE
apl.ComponentPartID IS NOT NULL AND
apl.SourcePartID = ap.SourcePartID AND
p1.Commodity_ID IN (11) -- Commodity Code: 002
GROUP BY SourcePartID), 0
) AS [# of HR Devices] ,--0sec
COALESCE(
(SELECT COUNT(DISTINCT ComponentPartID)
FROM AllPartsList apl
LEFT JOIN visuser.EN_PART p1
ON p1.[EN_Part_ID] = apl.[ComponentPartID]
WHERE
apl.ComponentPartID IS NOT NULL AND
apl.SourcePartID = ap.SourcePartID AND
p1.Commodity_ID IN (5) -- Commodity Code: 007
GROUP BY SourcePartID), 0
) AS [# of 3rd Party Devices], --0sec
COALESCE(
(SELECT COUNT(DISTINCT ComponentPartID)
FROM AllPartsList apl
LEFT JOIN visuser.EN_PART p1
ON p1.[EN_Part_ID] = apl.[ComponentPartID]
WHERE
apl.ComponentPartID IS NOT NULL AND
apl.SourcePartID = ap.SourcePartID AND
p1.Commodity_ID IN (13) AND -- Commodity Code: 005
p1.MAKE_BUY_C = 'B'
GROUP BY SourcePartID
), 0) AS [# of Robots], --0sec
COALESCE(
(SELECT COUNT(*)
FROM visuser.[PART_COST] c
LEFT JOIN visuser.[EN_PART] p
ON p.[EN_PART_ID] = c.[EN_PART_ID]
WHERE
c.[STD_PO_Cost_N] > 0 AND
p.[MAKE_BUY_C] = 'B' AND
c.[EN_PART_ID] IN
(SELECT DISTINCT ComponentPartID FROM AllPartsList WHERE ComponentPartID IS NOT NULL AND SourcePartID = ap.SourcePartID)
), 0) AS [# of Buy Parts], --0sec
COALESCE(
(SELECT COUNT(*)
FROM visuser.[PART_COST] c
LEFT JOIN visuser.[EN_PART] p
ON p.[EN_PART_ID] = c.[EN_PART_ID]
WHERE
c.[STD_PO_Cost_N] > 0 AND
p.[MAKE_BUY_C] = 'M' AND
c.[EN_PART_ID] IN
(SELECT DISTINCT ComponentPartID FROM AllPartsList WHERE ComponentPartID IS NOT NULL AND SourcePartID = ap.SourcePartID)
), 0) AS [# of Make Parts]
INTO bomSummary
FROM AllPartsList ap
LEFT JOIN visuser.EN_PART p
ON p.[EN_Part_ID] = ap.[SourcePartID]
ORDER BY [PART_X]
SET @end = SYSDATETIME()
SET @processTime = DATEDIFF(s, @begin, @end)
PRINT @end
PRINT CHAR(10)+CHAR(13)
PRINT 'bomSummary Processing Time: ' + CONVERT(varchar, @processTime)
GO
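This is what the bomBreakdown table looks like: (screenshot not included)
And the AllPartsList table: (screenshot not included)
If I comment out the function line, 2 records take 1m 20s to process. Here is part of the execution plan. It looks like each of my COALESCE expressions adds 4-6 seconds of processing time.
If I remove all of the COALESCE expressions, processing all 4981 records takes 2m 50s. Here is the execution listing for that run:
The execution plan suggested several additional indexes, so I added them. Now 1 record takes 0 seconds, 2 take 5 seconds, 10 take 1 second, 100 take 2 seconds, 1000 take 28 seconds, and all 4981 take 4m 17s.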
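The indexes that were actually added are not listed here. Purely as an illustration, an index shaped for the correlated lookups against bomBreakdown might look something like the sketch below. The index name is invented, and the key and included columns are a guess based on the predicates in the subqueries above, not the plan's actual recommendation.

```sql
-- Hypothetical index: the name and the column list are assumptions taken
-- from the subquery predicates, not from the real missing-index suggestion.
create nonclustered index IX_bomBreakdown_SourcePartID
    on bomBreakdown (SourcePartID)
    include ([Component Part ID], [Component Part #], Qty);
```

Keying on SourcePartID lets each correlated subquery seek to one assembly's rows, and the included columns let it answer from the index without going back to the base table.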
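The additional indexes definitely helped. I no longer see percentages above 1000%, but a few are still above 100%, which makes me think more optimization is possible; I'm just not sure where. The execution plan is large, so here are just a few shots:
Not sure what is going on with the 2-record case. It's not the 90 seconds it used to be, but at least it finishes now.
One odd thing I see is that it reports (1000 rows affected) and then (1 row affected). I don't know what that 1 row is or where it comes from. And I would still like to know why such small changes made such a big difference.
I am using:
- SQL Server 2019 (v15.0.2070.41)
- SSMS v18.5
Here is what I have after revising it based on allmhuran's suggestions: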
SELECT
DISTINCT ap.[SourcePartID] AS [Assembly Part ID],
p.[PART_X] AS [Assembly Part #],
p.[DESCR_X] AS [Assembly Part Description],
oa2.[Part Count],
oa2.[Total # of Parts],
([dbo].[fn_getFactoryStdCost](ap.[SourcePartID])) AS [Factory Std Cost],
oa2.[# of Docs],
oa2.[# of Software],
'Logic Pending' AS [# of Std Cost Items],
oa2.[# of HR Devices],
oa2.[# of 3rd Party Devices],
oa2.[# of Robots],
oa2.[# of Buy Parts],
oa2.[# of Make Parts]
FROM AllPartsList ap
LEFT JOIN visuser.EN_PART p
ON p.[EN_Part_ID] = ap.[SourcePartID]
OUTER APPLY (
SELECT
[Part Count] = COUNT( DISTINCT IIF( [Qty] = 0, null, [Component Part #]) ),
[Total # of Parts] = SUM([Qty]),
[# of Docs] = COUNT( DISTINCT IIF( [Commodity Code] IN ('009', '072', '073', '075', '079', '082'), [Component Part #], null) ), -- Commodity Codes: 009, 072, 073, 075, 079, 082 : Commodity ID: 15, 84, 85, 87, 81, 92
[# of Software] = COUNT( DISTINCT IIF( [Commodity Code] IN ('034'), [Component Part #], null) ), -- Commodity Code 034 : Commodity ID: 28
[# of HR Devices] = COUNT( DISTINCT IIF( [Commodity Code] IN ('002'), [Component Part #], null) ), -- Commodity Code 002 : Commodity ID: 11
[# of 3rd Party Devices] = COUNT( DISTINCT IIF( [Commodity Code] IN ('007'), [Component Part #], null) ), -- Commodity Code 007 : Commodity ID: 5
[# of Robots] = COUNT( DISTINCT IIF( ( [Commodity Code] IN ('005') AND [Make/Buy] = 'B' ), [Component Part #], null) ), -- Commodity Code 005 : Commodity ID: 13
[# of Buy Parts] = COUNT( DISTINCT IIF( [Make/Buy] = 'B', [Component Part #], null) ),
[# of Make Parts] = COUNT( DISTINCT IIF( [Make/Buy] = 'M', [Component Part #], null) )
FROM bomBreakdown
WHERE
[Component Part ID] IS NOT NULL AND
[SourcePartID] = ap.[SourcePartID] AND
--[SourcePartID] = ap.[AssemblyPartID] AND
ap.SourcePartID = 964
GROUP BY [SourcePartID]
) oa2
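OK, taking a moment to work through this.
Scalar function refactor
As I mentioned in the comments, scalar functions do bad things to set-based operations. In general, if you have a pattern like this: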
create function dbo.scalar_UDF(@i int) returns int as begin
    return @i * 2;
end
go
-- scalar UDFs have to be called with at least a two-part name
select c = dbo.scalar_UDF(t.c)
from t;
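then this turns your select into a row-by-agonizing-row (RBAR) operation behind the scenes.
You can improve performance by sticking with set-based operations. One way is to mark the scalar UDF as inline (in SQL Server 2019, with inline = on, which is already the default for eligible functions at compatibility level 150), which basically tells SQL Server that it may rewrite your query, before generating the query plan, as:
select c = t.c * 2
from t;
But scalar function inlining is something Microsoft has had a hard time getting right, and it is still a bit buggy. The other way is to handle it yourself, using an inline table-valued function with cross apply or outer apply: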
create function dbo.inline_TVF(@i int) returns table as return
(
    select result = @i * 2
)
go
select c = u.result
from t
outer apply dbo.inline_TVF(t.c) u;
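Applied to the function in the question, that might look roughly like the sketch below. The name fn_getFactoryStdCost_TVF and the FactoryStdCost column alias are made up for illustration, the table and column names are copied from the question, and I have not run this against the real schema, so treat it as a starting point rather than a drop-in replacement.

```sql
-- Hypothetical inline table-valued version of dbo.fn_getFactoryStdCost.
-- The function name and the FactoryStdCost alias are illustrative only.
create function dbo.fn_getFactoryStdCost_TVF (@PartID int)
returns table
as return
(
    select top (1)
           FactoryStdCost = convert(decimal(20, 4),
                                pc.TOT_MATERIAL_N + pc.TOT_MATERIAL_OVERHEAD_N
                              + pc.TOT_RUN_VALUE_N + pc.TOT_FIXED_OVERHEAD_N)
    from   visuser.PURE_COST pc
    where  pc.PURE_COST_ID =
           (
               -- same lookup and ordering as the scalar version
               select top (1) c.PURE_COST_ID
               from   visuser.PART_COST c
               where  c.EN_PART_ID = @PartID
               order by c.EN_REV_MASTER_ID desc, c.IC_WAREHOUSE_ID desc
           )
    order by pc.TOT_MATERIAL_N + pc.TOT_MATERIAL_OVERHEAD_N desc
);
go

-- Usage: outer apply keeps assemblies that have no cost row (cost comes back null).
select [Assembly Part ID] = ap.SourcePartID,
       [Factory Std Cost] = f.FactoryStdCost
from   AllPartsList ap
outer apply dbo.fn_getFactoryStdCost_TVF(ap.SourcePartID) f;
```

Because the body is a single select, the optimizer can fold it into the calling query instead of invoking it once per row, which is the whole point of choosing an inline TVF over a multi-statement function.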
The actual factoring refactor
Part of your existing procedure looks like this:
select [Part Count] =
(
select count(distinct [Component Part #])
from bomBreakdown
where Qty > 0
and [Component Part ID] is not null
and SourcePartID = ap.SourcePartID
group by SourcePartID
),
[Total # of Parts] =
(
select sum(Qty)
from bomBreakdown
where [Component Part ID] is not null
and SourcePartID = ap.SourcePartID
group by SourcePartID
)
-- , more ...
These two subqueries look very similar. It is this pattern:
select a = (
select x1 from y where z
),
b = (
select x2 from y where almost_z
)
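What we would really like to do is something like the following. If we could, the query would only need to hit the y table once instead of twice. But of course this syntax is invalid: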
select a = t.x1,
b = t.x2
from (
select x1 where z,
x2 where almost_z
from y
) t
Aha, but maybe we can be clever. Looking back at your specific case, we might change it to something like this:
select oa1.[Part Count],
       oa1.[Total # of Parts]
into bomSummary
from AllPartsList ap
left join visuser.EN_PART p on p.EN_Part_ID = ap.SourcePartID
outer apply (
    select [Part Count] = count
           (
               distinct iif
               (
                   Qty > 0, [Component Part #], null
               )
           ),
           [Total # of Parts] = sum(Qty)
    from bomBreakdown
    where [Component Part ID] is not null
    and SourcePartID = ap.SourcePartID
    group by SourcePartID
)
oa1
Here, iif(Qty > 0, [Component Part #], null) makes the value null whenever the quantity is not greater than zero. count ignores those nulls, and we still get the distinct, just like before. So we have sneakily managed to get a where clause in here: "count the distinct component part # values where the quantity is greater than zero". Now we can sum the Qty column in the same pass, and the refactoring is done.
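If you want to convince yourself that the two forms agree, here is a small throwaway demo of the same trick. The table variable, its column names, and the sample rows are all invented for the demo; the condition is written as Qty > 0 so that it matches the where clause of the original subquery exactly.

```sql
-- Throwaway data; the table variable and its columns are invented for this demo.
declare @bom table (SourcePartID int, ComponentPart varchar(10), Qty int);

insert into @bom (SourcePartID, ComponentPart, Qty)
values (1, 'A', 2), (1, 'A', 3), (1, 'B', 0), (1, 'C', 1),
       (2, 'A', 0), (2, 'D', 4);

-- Original shape: one correlated subquery (one pass over the table) per column.
select s.SourcePartID,
       [Part Count]       = (select count(distinct b.ComponentPart)
                             from @bom b
                             where b.Qty > 0
                             and   b.SourcePartID = s.SourcePartID),
       [Total # of Parts] = (select sum(b.Qty)
                             from @bom b
                             where b.SourcePartID = s.SourcePartID)
from (select distinct SourcePartID from @bom) s;

-- Factored shape: a single pass, with the filter moved inside the aggregate.
select b.SourcePartID,
       [Part Count]       = count(distinct iif(b.Qty > 0, b.ComponentPart, null)),
       [Total # of Parts] = sum(b.Qty)
from @bom b
group by b.SourcePartID;
```

Both queries return a part count of 2 and a total of 6 for SourcePartID 1, and a part count of 1 and a total of 4 for SourcePartID 2. The second one may also print a "Null value is eliminated by an aggregate" warning, which is expected here and harmless.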
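The same kind of refactoring can be done in many places in this stored procedure. It is actually a very good exercise for learning to refactor SQL. I'm not going to do all of them; just try to recognize the pattern and follow the factoring process, the same way you would in algebra. Because, in many ways, this is algebra!
Apologies for any typos or syntax errors. I have no way to check this against an actual query window; my aim here is to show some ideas, not to actually rewrite the original query.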