Get duplicates based on property using LINQ

I have a class, say:

public class Sample 
{
    public string property;
    public List<string> someListProperty;
    public string someOtherPropery;
}

Now I have a List<Sample>, and I need to iterate over this collection and find the items whose property and someListProperty fields have the same values.

I tried this:

var listSample = new List<Sample>();
var result = listSample.GroupBy(x => new { x.property, x.someListProperty})
                       .Where(x => x.Count() > 1).ToList();

But this doesn't seem to work; any pointers would be greatly appreciated.

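Update after the comments below: as you describe it, you want to group by someOtherProperty (declared as someOtherPropery in your class) rather than someListProperty, so simply group by that:

// The field name matches the spelling used in the Sample class declaration.
listSample.GroupBy(x => new { x.property, x.someOtherPropery });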

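  1. Option 1 - Your attempt does not work because the anonymous-type key compares someListProperty by reference, so two samples holding different List<string> instances never match, even when the contents are equal. You should use SequenceEqual to check that the nested lists of two given samples are the same. To do that, pass a custom IEqualityComparer: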

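    public class SampleComparer : IEqualityComparer<Sample>
    {
        public bool Equals(Sample x, Sample y)
        {
            return x.property == y.property &&
                Enumerable.SequenceEqual(x.someListProperty, y.someListProperty);
        }
        public int GetHashCode(Sample obj)
        {
            // One possible implementation (an assumption, not the only choice):
            // combine the same members Equals compares, in order.
            unchecked
            {
                int hash = 17;
                hash = hash * 23 + (obj.property == null ? 0 : obj.property.GetHashCode());
                foreach (var item in obj.someListProperty)
                    hash = hash * 23 + (item == null ? 0 : item.GetHashCode());
                return hash;
            }
        }
    }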
    

    (To implement GetHashCode, see: What is the best algorithm for an overridden System.Object.GetHashCode?)

    Then:

    var result = list.GroupBy(k => k, new SampleComparer());
    

    Tested with the following data, it returns 3 groups:

    List<Sample> a = new List<Sample>()
    {
        new Sample { property = "a", someListProperty = new List<string> {"a"}, someOtherPropery = "1"},
        new Sample { property = "a", someListProperty = new List<string> {"a"}, someOtherPropery = "2"},
        new Sample { property = "a", someListProperty = new List<string> {"b"}, someOtherPropery = "3"},
        new Sample { property = "b", someListProperty = new List<string> {"a"}, someOtherPropery = "4"},
    };
    
  2. Option 2 - Instead of creating a class that implements the interface, use a ProjectionEqualityComparer, as described in: Can you create a simple 'EqualityComparer<T>' using a lambda expression
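
To make Option 2 more concrete, here is a minimal sketch of a lambda-based comparer. LambdaComparer<T> and FindDuplicates are hypothetical names written for this illustration; this is not the ProjectionEqualityComparer from the linked question, only a stand-in that shows the general idea. The hash deliberately uses only property, which is coarse but stays consistent with the equality lambda.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Sample
{
    public string property;
    public List<string> someListProperty;
    public string someOtherPropery;
}

// Hypothetical helper (not a framework type): wraps two lambdas
// as an IEqualityComparer<T>.
public sealed class LambdaComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _equals;
    private readonly Func<T, int> _hash;

    public LambdaComparer(Func<T, T, bool> equals, Func<T, int> hash)
    {
        _equals = equals;
        _hash = hash;
    }

    public bool Equals(T x, T y) { return _equals(x, y); }
    public int GetHashCode(T obj) { return _hash(obj); }
}

public static class Demo
{
    // Returns only the groups that contain more than one sample.
    public static List<IGrouping<Sample, Sample>> FindDuplicates(IEnumerable<Sample> list)
    {
        var comparer = new LambdaComparer<Sample>(
            (x, y) => x.property == y.property
                      && x.someListProperty.SequenceEqual(y.someListProperty),
            // Equal samples always share the same property, so this hash is valid.
            s => s.property == null ? 0 : s.property.GetHashCode());

        return list.GroupBy(k => k, comparer)
                   .Where(g => g.Skip(1).Any())
                   .ToList();
    }

    public static void Main()
    {
        // Same test data as in Option 1.
        var list = new List<Sample>
        {
            new Sample { property = "a", someListProperty = new List<string> {"a"}, someOtherPropery = "1"},
            new Sample { property = "a", someListProperty = new List<string> {"a"}, someOtherPropery = "2"},
            new Sample { property = "a", someListProperty = new List<string> {"b"}, someOtherPropery = "3"},
            new Sample { property = "b", someListProperty = new List<string> {"a"}, someOtherPropery = "4"},
        };

        foreach (var g in FindDuplicates(list))
            Console.WriteLine($"{g.Key.property}: {g.Count()} items");
    }
}
```

With the test data from Option 1, only the first two samples form a duplicate group, so this prints a single line, `a: 2 items`.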


As a side note, instead of using Count in the Where clause, use:

var result = list.GroupBy(k => k, new SampleComparer())
                 .Where(g => g.Skip(1).Any());

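Because all you want is to check whether the group contains more than one item, not the actual count, this checks at most two items instead of counting them all in an O(n) operation.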
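
The difference can be observed with a small sketch. It uses a hypothetical lazy iterator (Numbers) that counts how many elements it has been asked to produce; it is not a grouping. Whether Count() really costs O(n) depends on the concrete sequence type, so treat this as an illustration of the general case only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipAnyDemo
{
    // Number of elements the source has been asked to produce so far.
    public static int Pulled;

    // A lazy sequence that records every element it yields.
    public static IEnumerable<int> Numbers(int n)
    {
        for (int i = 0; i < n; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    public static void Main()
    {
        Pulled = 0;
        bool viaSkip = Numbers(1000).Skip(1).Any();
        Console.WriteLine($"Skip(1).Any(): {viaSkip}, pulled {Pulled}");  // pulled 2

        Pulled = 0;
        bool viaCount = Numbers(1000).Count() > 1;
        Console.WriteLine($"Count() > 1:   {viaCount}, pulled {Pulled}"); // pulled 1000
    }
}
```

Both checks give the same answer, but the first stops after the second element while the second walks the whole sequence.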

Adjust your LINQ like this:

    static void Main(string[] args)
    {
        var samples = new List<Sample>
        {
            new Sample("p1", "aaa,bbb,ccc,ddd"),
            new Sample("p1", "bbb,ccc,xxx"),
            new Sample("p2", "aaa,bbb,ccc"),
            new Sample("p1", "xxx")
        };

        var grp = samples.GroupBy(b => b.property)
            .Where(a => a.Key == "p1")
            .SelectMany(s => s.ToList())
            .Where(b => b.someListProperty.Contains("ccc"));

        foreach (var g in grp)
            System.Console.WriteLine(g.ToString());

        System.Console.ReadLine();
    }

    private class Sample
    {
        public string property;

        public List<string> someListProperty;

        public string someOtherPropery;

        public Sample(string p, string props)
        {
            property = p;
            someListProperty = props.Split(',').ToList();
            someOtherPropery = string.Concat(from s in someListProperty select s[0]);
        }

        public override string ToString()
        {
            return $"{property} - {string.Join(", ", someListProperty)} -"
                       + $" ({someOtherPropery})";
        }
    }

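Misread on my part - I read it as meaning that the "some" in someListProperty referred to the values you wanted to filter the grouping by.

You can always group like this to achieve it: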

var grp = samples
    // choose a joiner that does not occur in your someListProperty data
    .GroupBy(b => $"{b.property}:{string.Join("|", b.someListProperty)}")
    .Where(g => g.Skip(1).Any())
    .SelectMany(s => s.ToList());

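Note, however, that this is somewhat hacky and depends on your data. You basically group by everything you are interested in and then select every group that has more than one result.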
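
To illustrate why the joiner matters, here is a small sketch of how the string key can collide. The Key helper and the sample values are made up for this example; it only reproduces the key format used in the grouping above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinerDemo
{
    // Builds the same string key as the string-based grouping.
    public static string Key(string property, List<string> items)
    {
        return $"{property}:{string.Join("|", items)}";
    }

    public static void Main()
    {
        var twoItems = new List<string> { "b", "c" };
        var oneItem = new List<string> { "b|c" };  // hypothetical value containing the joiner

        Console.WriteLine(Key("a", twoItems));              // a:b|c
        Console.WriteLine(Key("a", oneItem));               // a:b|c
        Console.WriteLine(twoItems.SequenceEqual(oneItem)); // False
    }
}
```

The two lists differ, yet they produce the same key, so they would end up in the same group. If your data can contain the joiner character, the comparer-based approach is the safer choice.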