Why does adding OrderBy to LINQ to EF query improve its performance?
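Please see the queries below. The object and property names have been obfuscated somewhat so as not to leak confidential or sensitive information, but the query structures are the same.
When I add .OrderBy(p => ""), which looks like complete nonsense to me, the query runs a lot faster. The execution time drops from approx. 2000 ms to approx. 400 ms. I have tested this several times, adding and removing only the OrderBy call.
I am totally puzzled: how can this be? The queries are executed against a SQL database in an Azure environment.
I can understand that ordering the data on property A and then selecting the records where property A equals some value could potentially speed up the query. But ordering on an empty string!? What is going on here?
I also want to note that the query without the OrderBy, using Expressions (as suggested in this post to circumvent SQL parameter sniffing), also lowers the execution time to approx. 400 ms. Adding .OrderBy(p => "") then doesn't make any noticeable difference.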
var query = (from p in Context.Punders.Where(p => p.A == A)
                 .Where(p => null != p.SomeNumber)
                 .Where(p => p.StatusCode == Default ||
                             p.StatusCode == Cancelled)
                 .Where(p => p.DatePosted >= startDate && p.DatePosted <= endDate)
             join f in Context.Founders.Where(f => f.A == A) on p.Code equals f.Code
             join r in Context.Rounders.Where(r => r.A == A) on p.Code equals r.Code
                 into rg
             from r in rg.DefaultIfEmpty()
             join pt in Context.FishTypes.Where(ft => ft.A == A) on p.Code equals pt.Code
             where r == null
             select new
             {
                 p.Code,
                 f.B,
                 f.C,
                 p.D,
                 p.E,
                 pt.F,
                 pt.G,
                 p.H
             })
            .OrderBy(p => "");
Query without the .OrderBy(...):
SELECT [Filter1].[q] AS [q],
       [Filter1].[c1] AS [edoc],
       [Filter1].[oc1] AS [wnrdc],
       [Filter1].[otc1] AS [weener],
       [Filter1].[ptc1] AS [pmtpdc],
       [Extent4].[isr] AS [isr],
       [Extent4].[rac] AS [rac],
       [Filter1].[arn] AS [arn]
FROM   (SELECT [Extent1].[pcid] AS [pcid1],
               [Extent1].[edoc] AS [c1],
               [Extent1].[pmtpdc] AS [ptc1],
               [Extent1].[q] AS [q],
               [Extent1].[arn] AS [arn],
               [Extent1].[dateposted] AS [DatePosted],
               [Extent2].[pcid] AS [pcid2],
               [Extent2].[wnrdc] AS [oc1],
               [Extent2].[weener] AS [otc1]
        FROM   [fnish].[post] AS [Extent1]
               INNER JOIN [fnish].[olik] AS [Extent2]
                       ON [Extent1].[olikedoc] = [Extent2].[edoc]
               LEFT OUTER JOIN [fnish].[receivable] AS [Extent3]
                            ON ( [Extent3].[pcid] = @p__linq__4 )
                               AND ( [Extent1].[edoc] = [Extent3].[pepstedoc] )
        WHERE  ( [Extent1].[arn] IS NOT NULL )
               AND ( [Extent1].[posttedoc] IN ( N'D', N'X' ) )
               AND ( [Extent3].[id] IS NULL )) AS [Filter1]
       INNER JOIN [fnish].[paymenttype] AS [Extent4]
               ON [Filter1].[ptc1] = [Extent4].[edoc]
WHERE  ( [Filter1].[pcid1] = @p__linq__0 )
       AND ( [Filter1].[dateposted] >= @p__linq__1 )
       AND ( [Filter1].[dateposted] <= @p__linq__2 )
       AND ( [Filter1].[pcid2] = @p__linq__3 )
       AND ( [Extent4].[pcid] = @p__linq__5 )
Query with the .OrderBy(...):
SELECT [Project1].[q] AS [q],
       [Project1].[edoc] AS [edoc],
       [Project1].[wnrdc] AS [wnrdc],
       [Project1].[weener] AS [weener],
       [Project1].[pmtpdc] AS [pmtpdc],
       [Project1].[isr] AS [isr],
       [Project1].[rac] AS [rac],
       [Project1].[arn] AS [arn]
FROM   (SELECT N'' AS [C1],
               [Filter1].[c1] AS [edoc],
               [Filter1].[ptc1] AS [pmtpdc],
               [Filter1].[q] AS [q],
               [Filter1].[arn] AS [arn],
               [Filter1].[oc1] AS [wnrdc],
               [Filter1].[otc1] AS [weener],
               [Extent4].[isr] AS [isr],
               [Extent4].[rac] AS [rac]
        FROM   (SELECT [Extent1].[pcid] AS [pcid1],
                       [Extent1].[edoc] AS [c1],
                       [Extent1].[pmtpdc] AS [ptc1],
                       [Extent1].[q] AS [q],
                       [Extent1].[arn] AS [arn],
                       [Extent1].[dateposted] AS [DatePosted],
                       [Extent2].[pcid] AS [pcid2],
                       [Extent2].[wnrdc] AS [oc1],
                       [Extent2].[weener] AS [otc1]
                FROM   [fnish].[post] AS [Extent1]
                       INNER JOIN [fnish].[olik] AS [Extent2]
                               ON [Extent1].[olikedoc] = [Extent2].[edoc]
                       LEFT OUTER JOIN [fnish].[receivable] AS [Extent3]
                                    ON ( [Extent3].[pcid] = @p__linq__4 )
                                       AND ( [Extent1].[edoc] = [Extent3].[pepstedoc] )
                WHERE  ( [Extent1].[arn] IS NOT NULL )
                       AND ( [Extent1].[posttedoc] IN ( N'D', N'X' ) )
                       AND ( [Extent3].[id] IS NULL )) AS [Filter1]
               INNER JOIN [fnish].[paymenttype] AS [Extent4]
                       ON [Filter1].[ptc1] = [Extent4].[edoc]
        WHERE  ( [Filter1].[pcid1] = @p__linq__0 )
               AND ( [Filter1].[dateposted] >= @p__linq__1 )
               AND ( [Filter1].[dateposted] <= @p__linq__2 )
               AND ( [Filter1].[pcid2] = @p__linq__3 )
               AND ( [Extent4].[pcid] = @p__linq__5 )) AS [Project1]
ORDER BY [Project1].[c1] ASC
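Conclusion
To the best of my knowledge, and with a bit of guesswork: this is case-specific behavior. In my case the performance gain is most likely caused by SQL Server building a different execution plan, which results in a better-performing query. I have seen a different execution plan, and a similar performance gain, for the query without the OrderBy when the SQL statement uses OPTION (RECOMPILE). So adding the OrderBy to the LINQ query most likely (I think) produces a different execution plan, which in turn produces a better-performing query.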
Based on your note:
Also I want to note, that the query, without the OrderBy, using
Expressions ( as suggested in this post to circumvent SQL parameter
sniffing) lowers the execution time also to approx. 400ms. Adding the
.OrderBy(p => "") then doesn't make any noticeable difference.
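The most reasonable explanation is that the OrderBy has the same effect as using explicit values in place of parameters. If a plan was already cached for the given query, compiled for specific parameter values, and that plan was not optimal (2 seconds), then changing the query by adding a useless OrderBy forces SQL Server to create a new execution plan for it, which negates the effect of the old non-optimal plan. Of course, it should be clear that this is not a good way to defeat plan caching.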
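To make that reasoning concrete, here is a minimal toy model in Python. It is not SQL Server or Entity Framework, only a sketch under stated assumptions: plans are cached by the exact SQL text, a plan is shaped by the parameter value seen at compile time, and a recompile flag stands in for OPTION (RECOMPILE). The class names, the "seek"/"scan" labels, the row counts, and the shortened SQL strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    strategy: str      # "seek" or "scan" (hypothetical labels)
    compiled_for: int  # parameter value "sniffed" when the plan was compiled


class ToyPlanCache:
    """Toy plan cache keyed by the exact SQL text (an assumption of this sketch)."""

    def __init__(self, row_counts, seek_limit=1000):
        self.row_counts = row_counts  # rows matching each parameter value
        self.seek_limit = seek_limit
        self.plans = {}               # SQL text -> cached Plan
        self.compilations = 0

    def _compile(self, param):
        # The plan shape depends only on the value seen at compile time.
        self.compilations += 1
        rows = self.row_counts.get(param, 0)
        strategy = "seek" if rows <= self.seek_limit else "scan"
        return Plan(strategy, param)

    def get_plan(self, sql_text, param, recompile=False):
        if recompile:
            # Stand-in for OPTION (RECOMPILE): compile for this value, cache nothing.
            return self._compile(param)
        if sql_text not in self.plans:
            self.plans[sql_text] = self._compile(param)
        return self.plans[sql_text]


# Hypothetical data: value 1 matches few rows, value 2 matches many.
cache = ToyPlanCache({1: 10, 2: 500_000})
base = "SELECT q FROM post WHERE pcid = @p0"
ordered = base + " ORDER BY c1"  # same query plus a no-op ordering

first = cache.get_plan(base, 1)                   # compiled for the small value
reused = cache.get_plan(base, 2)                  # same text: old plan is reused
fresh = cache.get_plan(ordered, 2)                # new text: compiled for value 2
hinted = cache.get_plan(base, 2, recompile=True)  # recompiled for value 2

for label, plan in [("first", first), ("reused", reused),
                    ("fresh", fresh), ("hinted", hinted)]:
    print(label, plan)
```

In this model the reused plan is the one built for the small value, while both the reworded query and the recompiled one get a plan built for the value actually passed. That matches the observation that OPTION (RECOMPILE) and the dummy OrderBy gave similar gains. The dummy OrderBy only helps until the new text's cached plan becomes a poor fit for some later parameter value.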