How to remove duplicates in the middle

Question

Given the following sequence:

var list = new[] {"1a", "1b", "1c", "1d", "2a", "3a", "4a", "4b", "5a", "6a", "7a", "7b", "8a"}.Select(x => new { P1 = x.Substring(0,1), P2 = x.Substring(1,1)});

I want to remove the duplicates in the "middle" so that I end up with:

var expected = new[] {"1a", "1d", "2a", "3a", "4a", "4b", "5a", "6a", "7a", "7b", "8a"}.Select(x => new { P1 = x.Substring(0, 1), P2 = x.Substring(1, 1) });

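So whenever an item repeats more than twice in a row, the extra repeats are removed. It's important, though, that I keep the first and the last duplicate.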

Answer 1

For those who aren't into Aggregate and want a super short answer that uses a closure, here it is:

var data = new[] { "1a", "1b", "1c", "1d", "2a", "3a", "4a", "4b", "1e", "5a", "6a", "7a", "7b", "8a" };
char priorKey = ' ';
int currentIndex = 0;

var result2 = data.GroupBy((x) => x[0] == priorKey ? new { k = x[0], g = currentIndex } : new { k = priorKey = x[0], g = ++currentIndex })
    .Select(i => new[] { i.First(), i.Last() }.Distinct())
    .SelectMany(i => i).ToArray();

Hat tip to @Slai for the code this is based on (I added a fix for the non-contiguous groups problem).
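To make the non-contiguous groups problem concrete, here is a rough sketch that contrasts grouping by key with grouping by consecutive run. The class and method names (`GroupingComparison`, `ByKey`, `ByRun`) are made up for illustration, and `ByRun` uses a plain loop instead of the closure trick, so treat it as one possible way to express the same idea rather than the code from this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, not part of the original answers.
public static class GroupingComparison
{
    // Plain GroupBy: every item with the same first character lands in one
    // group, no matter where it appears in the sequence.
    public static string[] ByKey(IEnumerable<string> items) =>
        items.GroupBy(x => x[0])
             .SelectMany(g => new[] { g.First(), g.Last() }.Distinct())
             .ToArray();

    // Run-based grouping: a new run starts whenever the first character
    // differs from the first character of the current run.
    public static string[] ByRun(IEnumerable<string> items)
    {
        var runs = new List<List<string>>();
        foreach (var item in items)
        {
            if (runs.Count == 0 || runs[runs.Count - 1][0][0] != item[0])
                runs.Add(new List<string>());
            runs[runs.Count - 1].Add(item);
        }

        // Keep the first and last item of each run (one item for runs of length 1).
        return runs.SelectMany(r => new[] { r.First(), r.Last() }.Distinct())
                   .ToArray();
    }
}
```

With the `data` array above (which has a stray "1e" after "4b"), `ByKey` returns 1a, 1e, 2a, 3a, 4a, 4b, 5a, 6a, 7a, 7b, 8a, so "1d" is lost and "1e" moves to the front. `ByRun` returns 1a, 1d, 2a, 3a, 4a, 4b, 1e, 5a, 6a, 7a, 7b, 8a, which keeps the original order.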

Here is how to do it with Aggregate. I did not test all the edge cases... just your test case.

var list = new[] { "1a", "1b", "1c", "1d", "2a", "3a", "4a", "4b", "5a", "6a", "7a", "7b", "8a" }
    .Aggregate(new { result = new List<string>(), first = "", last = "" },
        (store, given) =>
        {
            var result = store.result;
            var first = store.first;
            var last = store.last;

            if (first == "")
            {
                // This is the very first item.
                first = given;
            }
            else if (first[0] == given[0])
            {
                // Same run: remember the latest item as the candidate "last".
                last = given;
            }
            else
            {
                // A new run starts: flush the previous run's first and last.
                result.Add(first);
                if (last != "")
                    result.Add(last);
                first = given;
                last = "";
            }

            return new { result = result, first = first, last = last };
        },
        // Flush the final run once the input is exhausted.
        (store) =>
        {
            store.result.Add(store.first);
            if (store.last != "")
                store.result.Add(store.last);
            return store.result;
        })
    .Select(x => new { P1 = x.Substring(0, 1), P2 = x.Substring(1, 1) });

I create an object that holds the list built so far, plus the first and last items of the current run as known so far.

Then I just apply the logic to drop the ones in the middle.
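The same first/last bookkeeping can also be written as an iterator, which some may find easier to read than an Aggregate with three arguments. This is only a sketch: the extension method name `KeepFirstAndLastOfRuns` and its `keySelector` parameter are my own invention, not something from the answer above. Unlike the Aggregate version, it yields nothing for an empty input instead of adding an empty string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical extension method, named here for illustration only.
public static class SequenceExtensions
{
    public static IEnumerable<T> KeepFirstAndLastOfRuns<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext())
                yield break;                      // empty input yields nothing

            var comparer = EqualityComparer<TKey>.Default;
            T first = e.Current;                  // first item of the current run
            T last = first;                       // latest item of the current run
            bool hasLast = false;                 // true once the run has 2+ items
            TKey runKey = keySelector(first);

            while (e.MoveNext())
            {
                T item = e.Current;
                TKey key = keySelector(item);
                if (comparer.Equals(key, runKey))
                {
                    // Same run: overwrite the candidate last item.
                    last = item;
                    hasLast = true;
                }
                else
                {
                    // New run: emit the finished run, then start tracking the new one.
                    yield return first;
                    if (hasLast) yield return last;
                    first = item;
                    last = item;
                    hasLast = false;
                    runKey = key;
                }
            }

            // Emit the final run.
            yield return first;
            if (hasLast) yield return last;
        }
    }
}
```

Usage would look like `items.KeepFirstAndLastOfRuns(x => x[0])`, followed by the same `Select` into `P1`/`P2` as in the question. It only remembers two items per run, so nothing is buffered beyond that.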

Answer 2

Group by the first character and take the first and last item of each group:

var list = "1a 1b 1c 1d 2a 3a 4a 4b 5a 6a 7a 7b 8a".Split();

var result = list.GroupBy(i => i[0])
    .Select(i => new[] { i.First(), i.Last() }.Distinct())
    .SelectMany(i => i).ToArray();

Debug.Print(string.Join(", ", result));
// 1a, 1d, 2a, 3a, 4a, 4b, 5a, 6a, 7a, 7b, 8a

linq

enumerable