Filtering duplicates from a type collection using LINQ
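I am filtering a list by grouping on two properties and then selecting the newest item in each group by creation date (using First()).
This removes duplicates on the x.Application and x.ExternalID properties.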
var list = ((List<SomeType>)xDic)
    .GroupBy(x => new { x.Application, x.ExternalID })
    .OrderByDescending(z => z.First().CreateDate)
    .Select(y => y.First())
    .ToList();
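The problem I am having is defining a second property combination (x.Application and x.ExternalDisplayId) to filter and group on so that I still get the first item of each group.
In short, I need to filter out any duplicates based on ((x.Application, x.ExternalID) OR (x.Application, x.ExternalDisplayId)), so that I end up with a list of unique SomeType items.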
Example set:
{ "extID": 1234, "extDspID" : 111, "App" : "Test", "CreateDate": 2/01/2015}
{ "extID": 1234, "extDspID" : 5, "App" : "Test", "CreateDate": 1/01/2015}
{ "extID": 012, "extDspID" : 90, "App" : "Mono", "CreateDate": 6/06/2015}
{ "extID": 999, "extDspID" : 78, "App" : "Epic", "CreateDate": 8/08/2015}
{ "extID": 333, "extDspID" : 78, "App" : "Epic", "CreateDate": 8/12/2015}
{ "extID": 345, "extDspID" : 33, "App" : "Test", "CreateDate": 2/01/2015}
{ "extID": 666, "extDspID" : 33, "App" : "Test", "CreateDate": 1/01/2015}
desired result:
{ "extID": 1234, "extDspID" : 111, "App" : "Test", "CreateDate": 2/01/2015}
{ "extID": 012, "extDspID" : 90, "App" : "Mono", "CreateDate": 6/06/2015}
{ "extID": 333, "extDspID" : 78, "App" : "Epic", "CreateDate": 8/12/2015}
{ "extID": 345, "extDspID" : 33, "App" : "Test", "CreateDate": 2/01/2015}
First, declare two equality comparers to specify your two criteria, like this:
// Treats two items as equal when Application and ExternalID match.
public class MyEqualityComparer1 : IEqualityComparer<SomeType>
{
    public bool Equals(SomeType x, SomeType y)
    {
        return x.Application == y.Application && x.ExternalID == y.ExternalID;
    }

    public int GetHashCode(SomeType obj)
    {
        return (obj.Application + obj.ExternalID).GetHashCode();
    }
}

// Treats two items as equal when Application and ExternalDisplayId match.
public class MyEqualityComparer2 : IEqualityComparer<SomeType>
{
    public bool Equals(SomeType x, SomeType y)
    {
        return x.Application == y.Application && x.ExternalDisplayId == y.ExternalDisplayId;
    }

    public int GetHashCode(SomeType obj)
    {
        return (obj.Application + obj.ExternalDisplayId).GetHashCode();
    }
}
Then order your list by CreateDate, and use Distinct to filter it, like this:
var result = xDic
    .OrderByDescending(x => x.CreateDate)
    .Distinct(new MyEqualityComparer1())
    .Distinct(new MyEqualityComparer2());
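The Distinct method should remove the later items, so because we sorted with OrderByDescending first, we should be able to rely on Distinct keeping the most recent item of each set of duplicates and removing the ones with an older CreateDate.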
However, since the documentation for Distinct does not guarantee this, you can use a custom distinct method like this:
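public static class Extensions
{
    // Yields each item the first time it is seen, preserving the input order.
    public static IEnumerable<T> OrderedDistinct<T>(this IEnumerable<T> enumerable, IEqualityComparer<T> comparer)
    {
        HashSet<T> hash_set = new HashSet<T>(comparer);
        foreach (var item in enumerable)
        {
            // HashSet<T>.Add returns false when an equal item is already present.
            if (hash_set.Add(item))
                yield return item;
        }
    }
}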
And use it like this:
var result = xDic
    .OrderByDescending(x => x.CreateDate)
    .OrderedDistinct(new MyEqualityComparer1())
    .OrderedDistinct(new MyEqualityComparer2());
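To make this approach easier to try out, here is a minimal self-contained sketch that runs it against the example set from the question. Several things in it are assumptions rather than facts from the thread: the shape of SomeType (in particular that the IDs are strings, suggested by the "012" value), the reading of the dates as month/day/year, and the OrderedDistinctDemo class with its Run method, which is a hypothetical helper added only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's SomeType (string IDs are an assumption).
public class SomeType
{
    public SomeType(string app, string extId, string extDspId, DateTime created)
    {
        Application = app; ExternalID = extId;
        ExternalDisplayId = extDspId; CreateDate = created;
    }
    public string Application { get; }
    public string ExternalID { get; }
    public string ExternalDisplayId { get; }
    public DateTime CreateDate { get; }
}

// Equal when Application and ExternalID match.
public class MyEqualityComparer1 : IEqualityComparer<SomeType>
{
    public bool Equals(SomeType x, SomeType y) =>
        x.Application == y.Application && x.ExternalID == y.ExternalID;
    public int GetHashCode(SomeType obj) =>
        (obj.Application + obj.ExternalID).GetHashCode();
}

// Equal when Application and ExternalDisplayId match.
public class MyEqualityComparer2 : IEqualityComparer<SomeType>
{
    public bool Equals(SomeType x, SomeType y) =>
        x.Application == y.Application && x.ExternalDisplayId == y.ExternalDisplayId;
    public int GetHashCode(SomeType obj) =>
        (obj.Application + obj.ExternalDisplayId).GetHashCode();
}

public static class Extensions
{
    // Keeps the first occurrence of each item and preserves the input order.
    public static IEnumerable<T> OrderedDistinct<T>(
        this IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        var seen = new HashSet<T>(comparer);
        foreach (var item in source)
            if (seen.Add(item))
                yield return item;
    }
}

// Hypothetical demo wrapper, not part of the original answer.
public static class OrderedDistinctDemo
{
    public static List<string> Run()
    {
        // Example set from the question; dates assumed to be month/day/year.
        var xDic = new List<SomeType>
        {
            new SomeType("Test", "1234", "111", new DateTime(2015, 2, 1)),
            new SomeType("Test", "1234", "5",   new DateTime(2015, 1, 1)),
            new SomeType("Mono", "012",  "90",  new DateTime(2015, 6, 6)),
            new SomeType("Epic", "999",  "78",  new DateTime(2015, 8, 8)),
            new SomeType("Epic", "333",  "78",  new DateTime(2015, 8, 12)),
            new SomeType("Test", "345",  "33",  new DateTime(2015, 2, 1)),
            new SomeType("Test", "666",  "33",  new DateTime(2015, 1, 1)),
        };

        // Newest first, then drop later (older) duplicates for each criterion.
        return xDic
            .OrderByDescending(x => x.CreateDate)
            .OrderedDistinct(new MyEqualityComparer1())
            .OrderedDistinct(new MyEqualityComparer2())
            .Select(x => x.ExternalID)
            .ToList();
    }
}
```

Under these assumptions, OrderedDistinctDemo.Run() returns the external IDs 333, 012, 1234, 345, which are the four items in the desired result, listed newest first. Note that chaining two filters this way is not guaranteed to match the desired result for every possible input, because an item dropped by the first filter can no longer suppress anything in the second.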
The currently accepted answer does not order your SomeType objects correctly, so it will not produce the result set you want.
I have implemented a solution here:
https://dotnetfiddle.net/qBkIXo
My solution is also based on Distinct (see the MSDN documentation). The way I generate the hash is based on a neat approach that uses anonymous types, e.g.
public int GetHashCode(SomeType sometype)
{
    // Calculate the hash code for the SomeType.
    return new { sometype.Application, sometype.ExternalID }.GetHashCode();
}
To get the correct expected result, you need to apply a combination of grouping, ordering, and Distinct, e.g.
var noduplicates = products
    .GroupBy(x => new { x.Application, x.ExternalDisplayId })
    .Select(y => y.OrderByDescending(x => x.CreateDate).First())
    .ToList()
    .Distinct(new ApplicationExternalDisplayIdComparer())
    .GroupBy(x => new { x.Application, x.ExternalID })
    .Select(y => y.OrderByDescending(x => x.CreateDate).First())
    .ToList()
    .Distinct(new ApplicationExternalIDComparer());
As you will see in the fiddle output, this gives the result you expected.
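Since the two comparer classes used in the pipeline are only shown in the fiddle, here is a hedged, self-contained sketch of what they might look like, together with the pipeline applied to the example set. The comparer bodies, the shape of SomeType (string IDs), the month/day/year reading of the dates, and the GroupThenDistinctDemo wrapper are all assumptions made for illustration; the fiddle may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's SomeType (string IDs are an assumption).
public class SomeType
{
    public SomeType(string app, string extId, string extDspId, DateTime created)
    {
        Application = app; ExternalID = extId;
        ExternalDisplayId = extDspId; CreateDate = created;
    }
    public string Application { get; }
    public string ExternalID { get; }
    public string ExternalDisplayId { get; }
    public DateTime CreateDate { get; }
}

// Assumed implementation: equality on Application + ExternalID,
// hash taken from an anonymous type with the same two properties.
public class ApplicationExternalIDComparer : IEqualityComparer<SomeType>
{
    public bool Equals(SomeType x, SomeType y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Application == y.Application && x.ExternalID == y.ExternalID;
    }
    public int GetHashCode(SomeType obj) =>
        new { obj.Application, obj.ExternalID }.GetHashCode();
}

// Assumed implementation: equality on Application + ExternalDisplayId.
public class ApplicationExternalDisplayIdComparer : IEqualityComparer<SomeType>
{
    public bool Equals(SomeType x, SomeType y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Application == y.Application
            && x.ExternalDisplayId == y.ExternalDisplayId;
    }
    public int GetHashCode(SomeType obj) =>
        new { obj.Application, obj.ExternalDisplayId }.GetHashCode();
}

// Hypothetical demo wrapper, not part of the original answer.
public static class GroupThenDistinctDemo
{
    public static List<string> Run()
    {
        // Example set from the question; dates assumed to be month/day/year.
        var products = new List<SomeType>
        {
            new SomeType("Test", "1234", "111", new DateTime(2015, 2, 1)),
            new SomeType("Test", "1234", "5",   new DateTime(2015, 1, 1)),
            new SomeType("Mono", "012",  "90",  new DateTime(2015, 6, 6)),
            new SomeType("Epic", "999",  "78",  new DateTime(2015, 8, 8)),
            new SomeType("Epic", "333",  "78",  new DateTime(2015, 8, 12)),
            new SomeType("Test", "345",  "33",  new DateTime(2015, 2, 1)),
            new SomeType("Test", "666",  "33",  new DateTime(2015, 1, 1)),
        };

        // Same pipeline as in the answer: newest item per display-ID group,
        // then newest item per external-ID group.
        var noduplicates = products
            .GroupBy(x => new { x.Application, x.ExternalDisplayId })
            .Select(y => y.OrderByDescending(x => x.CreateDate).First())
            .ToList()
            .Distinct(new ApplicationExternalDisplayIdComparer())
            .GroupBy(x => new { x.Application, x.ExternalID })
            .Select(y => y.OrderByDescending(x => x.CreateDate).First())
            .ToList()
            .Distinct(new ApplicationExternalIDComparer());

        return noduplicates.Select(x => x.ExternalID).ToList();
    }
}
```

Under these assumptions, the result contains the external IDs 1234, 012, 333 and 345, matching the desired result. One observation about the design: each GroupBy followed by First() already leaves exactly one item per key, so the Distinct call that follows it with the same key should not remove anything further; it is kept here only to mirror the answer.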