EF: Include with where 子句, + SubIncludes
EF: Include with where clause, + SubIncludes
这是此处问题的后续问题:Include with where clause。这个问题想找到所有 Awake Passengers on Driving Buss
没有Passengers的WHERE子句,就很简单了,像这样:
var result = Context.Busses.Where(x => x.IsDriving)
.Include(x => x.Passengers);
如果没有 Passengers 的 WHERE 子句,包含来自 Passengers 的子关系也非常简单,如下所示:
var result = Context.Busses.Where(x => x.IsDriving)
.Include(x => x.Passengers.CarryOns)
.Include(x => x.Passengers.Luggage);
但是这个问题需要在导航中使用 WHERE 子句 属性。
对先前问题的回答 完美无缺,没有子关系:
var result = Context.Busses.Where(x => x.IsDriving)
.Select(bus => new {bus, Passengers = bus.Passengers.Where(x => x.Awake)})
.AsEnumerable().Select(x => x.bus).ToList();
但是如何使用这种方法将乘客的随身行李和行李也包括在内呢? Passengers 不可查询,因此此时您不能 Include
。我尝试过这样的事情,但第一部分刚刚被第二部分覆盖:
var bussesQuery = Context.Busses.Where(x => x.IsDriving)
.Include(x => x.Passengers.CarryOns)
.Include(x => x.Passengers.Luggage);
// var check = bussesQuery.ToList();
// The Sub-Relationship data is included here, but the Passengers are not filtered.
var result = bussesQuery
.Select(bus => new {bus, Passengers = bus.Passengers.Where(x => x.Awake)})
.AsEnumerable().Select(x => x.bus).ToList();
// The Sub-Relationship data is missing, but the Passengers are filtered
通过查看您之前的查询,我发现您从 N+1 的极端到单一查询的极端。你曾经有很多疑问,现在你想要一个,但请考虑一下幕后发生的事情。要获取数据 entity framework 需要交叉连接所有实体,因此对于每个包含的实体,您会在结果中获得额外的列并且结果与该包含交叉连接。
假设您有 5 辆正在行驶的公共汽车,每辆公共汽车有 30 名醒着的乘客和 15 件行李,因此您得到公共汽车 x 行李 x 乘客 = 2250 条记录,每条记录都包含乘客和行李数据`。如果您使用单独的查询来查询乘客和行李,您将拥有更少的记录(5 * 30 + 5 * 15 = 225)并且每个实体将被提取一次。
做一个会 return 一切的大查询不是什么好主意 - 它更慢,更难维护,不值得你花时间。只查询醒着的乘客,再查询行李。
如需答案,请向下滚动至答案部分。
免责声明:我喜欢 EF。对于在我的系统中进行的 99.999% 的调用,我可以最快地编写代码 (LINQ),而 OR-Mapping 是最快的系统。此外,生成的查询(虽然看起来很混乱)的执行计划比手写的 SQL 快得多。但这里不是这样。
研究科
先说一句旁白:查看我的最终请求的原始 SQL 是这样的:
SELECT * FROM [Busses] [bus]
LEFT JOIN [Passengers] [passenger] ON [passenger].[BusID] = [bus].[BusID] AND [passenger].[Awake] <> 1
LEFT JOIN [CarryOns] [carryOn] ON [carryOn].[PassengerID] = [passenger].[PassengerID]
LEFT JOIN [Luggages] [luggage] ON [luggage].[PassengerID] = [passenger].[PassengerID]
WHERE [bus].[IsDriving] = 1
当然,如果 EF 要为这些结果生成一些东西,则需要嵌套和关键字段才能知道如何映射它们。没什么大不了的。
不幸的是,为了通过一次访问数据库来实现这一点,我必须执行以下操作:
var busses = context.Set<BusEntity>().Where(x => x.IsDriving);
var passengers = context.Set<PassengerEntity>().Where(x => x.Awake);
var carryOns = context.Set<CarryOnEntity>();
var luggages = context.Set<LuggageEntity>();
var passengerJoins = passengers.GroupJoin(
carryOns,
x => x.PassengerID,
y => y.PassengerID,
(x, y) => new { Passenger = x, CarryOns = y }
)
.SelectMany(
x => x.CarryOns.DefaultIfEmpty(),
(x, y) => new { Passenger = x.Passenger, CarryOns = x.CarryOns }
).GroupJoin(
luggages,
x => x.Passenger.PassengerID,
y => y.PassengerID,
(x, y) => new { Passenger = x.Passenger, CarryOns = x.CarryOns, Luggages = y }
)
.SelectMany(
x => x.Luggages.DefaultIfEmpty(),
(x, y) => new { Passenger = x.Passenger, CarryOns = x.CarryOns, Luggages = x.Luggages }
);
var bussesToPassengers = busses.GroupJoin(
passengerJoins,
x => x.BusID,
y => y.Passenger.BusID,
(x, y) => new { Bus = x, Passengers = y }
)
.SelectMany(
x => x.Passengers.DefaultIfEmpty(),
(x, y) => new { Bus = x.Bus, Passengers = x.Passengers }
)
.GroupBy(x => x.Bus);
var rez = bussesToPassengers.ToList()
.Select(x => x.First().Bus)
.ToList();
我不抱怨 EF Generated SQL,但单个 SQL 语句有几百行。我破解了它,删除了 SELECT 列,并更改了一些 ID 以匹配这个问题,它是这样的:
SELECT *
FROM ( SELECT *
FROM (SELECT *
FROM ( SELECT DISTINCT *
FROM [dbo].[Bus] AS [Extent1]
LEFT OUTER JOIN (SELECT *
FROM [dbo].[Passenger] AS [Extent2]
LEFT OUTER JOIN [dbo].[CarryOns] AS [Extent3] ON [Extent2].[PassengerId] = [Extent3].[PassengerId]
LEFT OUTER JOIN [dbo].[Luggages] AS [Extent4] ON [Extent2].[PassengerId] = [Extent4].[PassengerId]
WHERE [Extent1].[IsDriving] = 1
) AS [Distinct1] ) AS [Project2]
OUTER APPLY (SELECT *
FROM (SELECT *
FROM [dbo].[Bus] AS [Extent6]
LEFT OUTER JOIN (SELECT *
FROM [dbo].[Passenger] AS [Extent7]
LEFT OUTER JOIN [dbo].[CarryOns] AS [Extent8] ON [Extent7].[PassengerId] = [Extent8].[PassengerId]
LEFT OUTER JOIN [dbo].[Luggages] AS [Extent9] ON [Extent7].[PassengerId] = [Extent9].[PassengerId]
WHERE ([Extent6].[IsDriving] = 1) AND ([Project2].[BusId] = [Extent6].[BusId]) ) AS [Project3]
OUTER APPLY (SELECT *
FROM [dbo].[Passenger] AS [Extent11]
LEFT OUTER JOIN [dbo].[CarryOns] AS [Extent12] ON [Extent11].[PassengerId] = [Extent12].[PassengerId]
LEFT OUTER JOIN [dbo].[Luggages] AS [Extent13] ON [Extent11].[PassengerId] = [Extent13].[PassengerId]
LEFT OUTER JOIN [dbo].[CarryOns] AS [Extent15] ON [Extent11].[PassengerId] = [Extent15].[PassengerId]
WHERE ([Extent11].[IsAwake] = 1) AND ([Project3].[BusId] = [Extent11].[BusId])
UNION ALL
SELECT *
FROM [dbo].[Passenger] AS [Extent16]
LEFT OUTER JOIN [dbo].[CarryOns] AS [Extent17] ON [Extent16].[PassengerId] = [Extent17].[PassengerId]
LEFT OUTER JOIN [dbo].[Luggages] AS [Extent18] ON [Extent16].[PassengerId] = [Extent18].[PassengerId]
WHERE ([Extent16].[IsAwake] = 1) AND ([Project3].[BusId] = [Extent16].[BusId])
) AS [Project7]
ORDER BY ........................
我个人的测试数据,My Hand-Written SQL Query returns 54 行,EF Generated Query returns 大约 30,000 行。因此,如果您仅考虑数据的 Over-The-Wire 传输时间的增加,那是不可接受的。
答案部分
答案是:您可以使用 Linq to Entities(在 DB 上)和 Linq to Objects(在代码中)在一次调用中实现您的结果,但性能不佳。您可以改为选择具有更好性能的多个调用,包括通过网络传输的数据更少、生成的查询更具可读性以及代码更易于理解。
最好的办法是执行多个查询。我就是这样做的:
var bus = context.Set<BusEntity>().Where(x => x.IsDriving).ToList();
var busIDs = bus.Select(x => x.BusID).ToList();
var passengers = context.Set<PassengerEntity>().Where(x => x.IsAwake && busIDs.Contains(x.BusID)).ToList();
var passengerIDs = passengers.Select(x => x.PassengerID).ToList();
var carryOns = context.Set<CarryOnEntity>().Where(x => passengerIDs.Contains(x.PassengerID)).ToList();
var luggages = context.Set<LuggageEntity>().Where(x => passengerIDs.Contains(x.PassengerID)).ToList();
passengers.ForEach(x => {
x.CarryOns = carryOns.Where(y => y.PassengerID == x.PassengerID).ToList();
x.Luggages = luggages.Where(y => y.PassengerID == x.PassengerID).ToList();
});
bus.ForEach(x => x.Passengers = passengers.Where(y => y.BusID == x.BusID).ToList());
这产生了 4 个调用。 SQL 总共有大约 40 行。我破解了它,删除了 SELECT 列,并更改了一些 ID 以匹配这个问题,它是这样的:
SELECT * FROM [dbo].[Busses] AS [Extent1]
WHERE [Extent1].[IsDriving] = 1
SELECT * FROM [dbo].[Passengers] AS [Extent1]
WHERE ([Extent1].[Awake] = 1) AND ([Extent1].[BusID] IN (......................))
SELECT * FROM [dbo].[CarryOns] AS [Extent1]
WHERE [Extent1].[PassengerID] IN (......................)
SELECT * FROM [dbo].[Luggages] AS [Extent1]
WHERE [Extent1].[PassengerID] IN (......................)
EF 生成的查询 returns 4 次往返调用总共约 100 行。所以这意味着对数据库的 4 次调用,但都非常小、可读且非常快。
我没有计时,但每当我在此答案代码上方的断点处暂停,然后 F5 到结果的另一边时,它是即时的。当我在我的研究中对 Single-Call 做同样的事情时,它花了整整一秒或更长时间,明显滞后 运行。
这是此处问题的后续问题:Include with where clause。这个问题想找到所有 Awake Passengers on Driving Buss
没有Passengers的WHERE子句,就很简单了,像这样:
var result = Context.Busses.Where(x => x.IsDriving)
.Include(x => x.Passengers);
如果没有 Passengers 的 WHERE 子句,包含来自 Passengers 的子关系也非常简单,如下所示:
var result = Context.Busses.Where(x => x.IsDriving)
.Include(x => x.Passengers.CarryOns)
.Include(x => x.Passengers.Luggage);
但是这个问题需要在导航中使用 WHERE 子句 属性。 对先前问题的回答 完美无缺,没有子关系:
var result = Context.Busses.Where(x => x.IsDriving)
.Select(bus => new {bus, Passengers = bus.Passengers.Where(x => x.Awake)})
.AsEnumerable().Select(x => x.bus).ToList();
但是如何使用这种方法将乘客的随身行李和行李也包括在内呢? Passengers 不可查询,因此此时您不能 Include
。我尝试过这样的事情,但第一部分刚刚被第二部分覆盖:
var bussesQuery = Context.Busses.Where(x => x.IsDriving)
.Include(x => x.Passengers.CarryOns)
.Include(x => x.Passengers.Luggage);
// var check = bussesQuery.ToList();
// The Sub-Relationship data is included here, but the Passengers are not filtered.
var result = bussesQuery
.Select(bus => new {bus, Passengers = bus.Passengers.Where(x => x.Awake)})
.AsEnumerable().Select(x => x.bus).ToList();
// The Sub-Relationship data is missing, but the Passengers are filtered
通过查看您之前的查询,我发现您从 N+1 的极端到单一查询的极端。你曾经有很多疑问,现在你想要一个,但请考虑一下幕后发生的事情。要获取数据 entity framework 需要交叉连接所有实体,因此对于每个包含的实体,您会在结果中获得额外的列并且结果与该包含交叉连接。
假设您有 5 辆正在行驶的公共汽车,每辆公共汽车有 30 名醒着的乘客和 15 件行李,因此您得到公共汽车 x 行李 x 乘客 = 2250 条记录,每条记录都包含乘客和行李数据`。如果您使用单独的查询来查询乘客和行李,您将拥有更少的记录(5 * 30 + 5 * 15 = 225)并且每个实体将被提取一次。
做一个会 return 一切的大查询不是什么好主意 - 它更慢,更难维护,不值得你花时间。只查询醒着的乘客,再查询行李。
如需答案,请向下滚动至答案部分。
免责声明:我喜欢 EF。对于在我的系统中进行的 99.999% 的调用,我可以最快地编写代码 (LINQ),而 OR-Mapping 是最快的系统。此外,生成的查询(虽然看起来很混乱)的执行计划比手写的 SQL 快得多。但这里不是这样。
研究科
先说一句旁白:查看我的最终请求的原始 SQL 是这样的:
SELECT * FROM [Busses] [bus]
LEFT JOIN [Passengers] [passenger] ON [passenger].[BusID] = [bus].[BusID] AND [passenger].[Awake] <> 1
LEFT JOIN [CarryOns] [carryOn] ON [carryOn].[PassengerID] = [passenger].[PassengerID]
LEFT JOIN [Luggages] [luggage] ON [luggage].[PassengerID] = [passenger].[PassengerID]
WHERE [bus].[IsDriving] = 1
当然,如果 EF 要为这些结果生成一些东西,则需要嵌套和关键字段才能知道如何映射它们。没什么大不了的。
不幸的是,为了通过一次访问数据库来实现这一点,我必须执行以下操作:
var busses = context.Set<BusEntity>().Where(x => x.IsDriving);
var passengers = context.Set<PassengerEntity>().Where(x => x.Awake);
var carryOns = context.Set<CarryOnEntity>();
var luggages = context.Set<LuggageEntity>();
var passengerJoins = passengers.GroupJoin(
carryOns,
x => x.PassengerID,
y => y.PassengerID,
(x, y) => new { Passenger = x, CarryOns = y }
)
.SelectMany(
x => x.CarryOns.DefaultIfEmpty(),
(x, y) => new { Passenger = x.Passenger, CarryOns = x.CarryOns }
).GroupJoin(
luggages,
x => x.Passenger.PassengerID,
y => y.PassengerID,
(x, y) => new { Passenger = x.Passenger, CarryOns = x.CarryOns, Luggages = y }
)
.SelectMany(
x => x.Luggages.DefaultIfEmpty(),
(x, y) => new { Passenger = x.Passenger, CarryOns = x.CarryOns, Luggages = x.Luggages }
);
var bussesToPassengers = busses.GroupJoin(
passengerJoins,
x => x.BusID,
y => y.Passenger.BusID,
(x, y) => new { Bus = x, Passengers = y }
)
.SelectMany(
x => x.Passengers.DefaultIfEmpty(),
(x, y) => new { Bus = x.Bus, Passengers = x.Passengers }
)
.GroupBy(x => x.Bus);
var rez = bussesToPassengers.ToList()
.Select(x => x.First().Bus)
.ToList();
我不抱怨 EF Generated SQL,但单个 SQL 语句有几百行。我破解了它,删除了 SELECT 列,并更改了一些 ID 以匹配这个问题,它是这样的:
SELECT *
FROM ( SELECT *
FROM (SELECT *
FROM ( SELECT DISTINCT *
FROM [dbo].[Bus] AS [Extent1]
LEFT OUTER JOIN (SELECT *
FROM [dbo].[Passenger] AS [Extent2]
LEFT OUTER JOIN [dbo].[CarryOns] AS [Extent3] ON [Extent2].[PassengerId] = [Extent3].[PassengerId]
LEFT OUTER JOIN [dbo].[Luggages] AS [Extent4] ON [Extent2].[PassengerId] = [Extent4].[PassengerId]
WHERE [Extent1].[IsDriving] = 1
) AS [Distinct1] ) AS [Project2]
OUTER APPLY (SELECT *
FROM (SELECT *
FROM [dbo].[Bus] AS [Extent6]
LEFT OUTER JOIN (SELECT *
FROM [dbo].[Passenger] AS [Extent7]
LEFT OUTER JOIN [dbo].[CarryOns] AS [Extent8] ON [Extent7].[PassengerId] = [Extent8].[PassengerId]
LEFT OUTER JOIN [dbo].[Luggages] AS [Extent9] ON [Extent7].[PassengerId] = [Extent9].[PassengerId]
WHERE ([Extent6].[IsDriving] = 1) AND ([Project2].[BusId] = [Extent6].[BusId]) ) AS [Project3]
OUTER APPLY (SELECT *
FROM [dbo].[Passenger] AS [Extent11]
LEFT OUTER JOIN [dbo].[CarryOns] AS [Extent12] ON [Extent11].[PassengerId] = [Extent12].[PassengerId]
LEFT OUTER JOIN [dbo].[Luggages] AS [Extent13] ON [Extent11].[PassengerId] = [Extent13].[PassengerId]
LEFT OUTER JOIN [dbo].[CarryOns] AS [Extent15] ON [Extent11].[PassengerId] = [Extent15].[PassengerId]
WHERE ([Extent11].[IsAwake] = 1) AND ([Project3].[BusId] = [Extent11].[BusId])
UNION ALL
SELECT *
FROM [dbo].[Passenger] AS [Extent16]
LEFT OUTER JOIN [dbo].[CarryOns] AS [Extent17] ON [Extent16].[PassengerId] = [Extent17].[PassengerId]
LEFT OUTER JOIN [dbo].[Luggages] AS [Extent18] ON [Extent16].[PassengerId] = [Extent18].[PassengerId]
WHERE ([Extent16].[IsAwake] = 1) AND ([Project3].[BusId] = [Extent16].[BusId])
) AS [Project7]
ORDER BY ........................
我个人的测试数据,My Hand-Written SQL Query returns 54 行,EF Generated Query returns 大约 30,000 行。因此,如果您仅考虑数据的 Over-The-Wire 传输时间的增加,那是不可接受的。
答案部分
答案是:您可以使用 Linq to Entities(在 DB 上)和 Linq to Objects(在代码中)在一次调用中实现您的结果,但性能不佳。您可以改为选择具有更好性能的多个调用,包括通过网络传输的数据更少、生成的查询更具可读性以及代码更易于理解。
最好的办法是执行多个查询。我就是这样做的:
var bus = context.Set<BusEntity>().Where(x => x.IsDriving).ToList();
var busIDs = bus.Select(x => x.BusID).ToList();
var passengers = context.Set<PassengerEntity>().Where(x => x.IsAwake && busIDs.Contains(x.BusID)).ToList();
var passengerIDs = passengers.Select(x => x.PassengerID).ToList();
var carryOns = context.Set<CarryOnEntity>().Where(x => passengerIDs.Contains(x.PassengerID)).ToList();
var luggages = context.Set<LuggageEntity>().Where(x => passengerIDs.Contains(x.PassengerID)).ToList();
passengers.ForEach(x => {
x.CarryOns = carryOns.Where(y => y.PassengerID == x.PassengerID).ToList();
x.Luggages = luggages.Where(y => y.PassengerID == x.PassengerID).ToList();
});
bus.ForEach(x => x.Passengers = passengers.Where(y => y.BusID == x.BusID).ToList());
这产生了 4 个调用。 SQL 总共有大约 40 行。我破解了它,删除了 SELECT 列,并更改了一些 ID 以匹配这个问题,它是这样的:
SELECT * FROM [dbo].[Busses] AS [Extent1]
WHERE [Extent1].[IsDriving] = 1
SELECT * FROM [dbo].[Passengers] AS [Extent1]
WHERE ([Extent1].[Awake] = 1) AND ([Extent1].[BusID] IN (......................))
SELECT * FROM [dbo].[CarryOns] AS [Extent1]
WHERE [Extent1].[PassengerID] IN (......................)
SELECT * FROM [dbo].[Luggages] AS [Extent1]
WHERE [Extent1].[PassengerID] IN (......................)
EF 生成的查询 returns 4 次往返调用总共约 100 行。所以这意味着对数据库的 4 次调用,但都非常小、可读且非常快。
我没有计时,但每当我在此答案代码上方的断点处暂停,然后 F5 到结果的另一边时,它是即时的。当我在我的研究中对 Single-Call 做同样的事情时,它花了整整一秒或更长时间,明显滞后 运行。