Ignoring strings in LINQ order by clause c#

I have a collection of strings as follows:

"[Unfinished] Project task 1"
"Some other Piece of work to do"
"[Continued] [Unfinished] Project task 1"
"Project Task 2"
"Random other work to do"
"Project 4"
"[Continued] [Continued] Project task 1"
"[SPIKE] Investigate the foo"

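What I would like to do is order these strings alphabetically, but ignoring the values in square brackets. So I would like the end result to be: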

"[SPIKE] Investigate the foo"
"Project 4"
"[Continued] [Continued] Project task 1"
"[Continued] [Unfinished] Project task 1"
"[Unfinished] Project task 1"
"Project Task 2"
"Random other work to do"
"Some other Piece of work to do"

Question:

How can this be achieved in LINQ? This is as far as I have got:

collection.OrderBy(str => str)

It sounds like you should write a method which retrieves "the part of the string which isn't in brackets" (e.g. using a regular expression). Then you can use:

var ordered = collection.OrderBy(RemoveTextInBrackets);

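Your RemoveTextInBrackets method probably only wants to remove bracketed text at the start of the string, along with the space that follows it.

Complete example: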

using System;
using System.Linq;
using System.Text.RegularExpressions;

public class Program
{
    private static readonly Regex TextInBrackets = new Regex(@"^(\[[^\]]*\] )*");

    public static void Main()
    {
        var input = new[]
        {
            "[Unfinished] Project task 1 bit",
            "Some other Piece of work to do",
            "[Continued] [Unfinished] Project task 1",
            "Project Task 2",
            "Random other work to do",
            "Project 4",
            "[Continued] [Continued] Project task 1",
            "[SPIKE] Investigate the foo",
        };

        var ordered = input.OrderBy(RemoveTextInBrackets);

        foreach (var item in ordered)
        {
            Console.WriteLine(item);
        }
    }

    static string RemoveTextInBrackets(string input) =>
        TextInBrackets.Replace(input, "");
}
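
To illustrate why anchoring the pattern at the start of the string matters, here is a small sketch comparing the anchored pattern above with an unanchored one. The KeyDemo class, its method names, and the sample string "Fix [urgent] bug" are made up for illustration and are not part of the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KeyDemo
{
    // Same pattern as above: only a run of "[tag] " groups at the very start.
    private static readonly Regex LeadingTags = new Regex(@"^(\[[^\]]*\] )*");

    // Hypothetical alternative for comparison: any "[tag]" anywhere,
    // plus the spaces that follow it.
    private static readonly Regex AnyTags = new Regex(@"\[[^\]]*\] *");

    public static string StripLeading(string s) => LeadingTags.Replace(s, "");

    public static string StripAll(string s) => AnyTags.Replace(s, "");

    public static void Main()
    {
        // Leading tags are removed by the anchored pattern.
        Console.WriteLine(StripLeading("[Continued] [Unfinished] Project task 1")); // Project task 1

        // A tag in the middle of the text is kept by the anchored pattern...
        Console.WriteLine(StripLeading("Fix [urgent] bug")); // Fix [urgent] bug

        // ...but removed by the unanchored one.
        Console.WriteLine(StripAll("Fix [urgent] bug")); // Fix bug
    }
}
```

Which behaviour is right depends on whether brackets can legitimately appear inside the task text itself.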

Given a simple regex:

var rx = new Regex(@"\[[^]]*\] *");

that searches for text inside brackets (followed by optional spaces), you can:

var ordered = collection.OrderBy(str => rx.Replace(str, string.Empty));

This will order by the text with the bracketed parts removed.

Note that there is no "secondary ordering" here, so:

"[Continued] [Unfinished] Project task 1"
"[Continued] [Continued] Project task 1"

will stay in the order written (Unfinished, then Continued) and will not be swapped.

If you need a secondary ordering, then:
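
The reason the order is kept is that Enumerable.OrderBy is a stable sort: elements whose keys compare equal keep their relative input order. A minimal sketch, where the StableDemo class and the SortIgnoringBrackets helper are illustrative names only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class StableDemo
{
    // Equivalent to the pattern above, with the inner ']' escaped for readability.
    private static readonly Regex Rx = new Regex(@"\[[^\]]*\] *");

    // Orders by the text with bracketed tags removed; no secondary key.
    public static List<string> SortIgnoringBrackets(IEnumerable<string> items) =>
        items.OrderBy(s => Rx.Replace(s, string.Empty)).ToList();

    public static void Main()
    {
        var input = new[]
        {
            "[Continued] [Unfinished] Project task 1",
            "[Continued] [Continued] Project task 1",
        };

        // Both keys are "Project task 1", so the input order is preserved.
        foreach (var item in SortIgnoringBrackets(input))
        {
            Console.WriteLine(item);
        }
    }
}
```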

var ordered = collection
    .OrderBy(str => rx.Replace(str, string.Empty))
    .ThenBy(str => str);

Using the whole string as the secondary ordering is probably fine. But then:

"[Continued] [Unfinished] project task 1"
"[Continued] project task 1"

would stay as it is (lowercase letters come after [ and ] in Unicode), whereas

"[Continued] [Unfinished] Project task 1"
"[Continued] Project task 1"

would become

"[Continued] Project task 1"
"[Continued] [Unfinished] Project task 1"

because uppercase letters come before [ and ] in Unicode.

Try some combination of extension methods, like this:
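
One caveat: the ordering described here is code-point (ordinal) order, whereas the default string comparer used by OrderBy and ThenBy is culture-sensitive and may not rank punctuation relative to letters in the same way. The sketch below makes the tie-break explicitly ordinal by passing StringComparer.Ordinal to ThenBy; the TieBreakDemo class and its Sort method are illustrative names, not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TieBreakDemo
{
    // Equivalent to the pattern above, with the inner ']' escaped for readability.
    private static readonly Regex Rx = new Regex(@"\[[^\]]*\] *");

    // Primary key: text without tags. Tie-break: whole string, compared ordinally.
    public static List<string> Sort(IEnumerable<string> items) =>
        items.OrderBy(s => Rx.Replace(s, string.Empty))
             .ThenBy(s => s, StringComparer.Ordinal)
             .ToList();

    public static void Main()
    {
        var upper = Sort(new[]
        {
            "[Continued] [Unfinished] Project task 1",
            "[Continued] Project task 1",
        });
        // 'P' (U+0050) sorts before '[' (U+005B), so the pair is swapped.
        Console.WriteLine(upper[0]); // [Continued] Project task 1

        var lower = Sort(new[]
        {
            "[Continued] [Unfinished] project task 1",
            "[Continued] project task 1",
        });
        // '[' (U+005B) sorts before 'p' (U+0070), so the pair stays as written.
        Console.WriteLine(lower[0]); // [Continued] [Unfinished] project task 1
    }
}
```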

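// "+ 1" skips the "]" itself and TrimStart() drops the space after it;
// otherwise every bracketed item would get a key starting with "] ".
inputList.OrderBy(x => x.Contains("]") ? x.Substring(x.LastIndexOf("]") + 1).TrimStart() : x)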
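
A Substring-based key needs care about where it starts, because LastIndexOf returns the position of the "]" itself rather than the character after it. Here is a minimal sketch of a regex-free key function; SubstringKeyDemo and KeyAfterLastBracket are hypothetical names chosen for this example.

```csharp
using System;

public static class SubstringKeyDemo
{
    // Returns the text after the last ']', without leading whitespace.
    // LastIndexOf returns -1 when there is no ']', so "+ 1" yields 0 and
    // Substring(0) gives back the whole string.
    public static string KeyAfterLastBracket(string s) =>
        s.Substring(s.LastIndexOf(']') + 1).TrimStart();

    public static void Main()
    {
        Console.WriteLine(KeyAfterLastBracket("[Continued] [Unfinished] Project task 1")); // Project task 1
        Console.WriteLine(KeyAfterLastBracket("Project 4")); // Project 4
    }
}
```

It could then be passed directly as the key selector, e.g. inputList.OrderBy(SubstringKeyDemo.KeyAfterLastBracket). Note that this approach discards everything up to the last "]" wherever it occurs, so it only suits input where brackets appear solely as leading tags.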


Similar to the suggestions above, here is my implementation.

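    // Build the regex once instead of once per element.
    // The verbatim string (@"...") is required: "\[" is not a valid
    // escape sequence in a regular C# string literal.
    var rgx = new Regex(@"\[(.*?)\] ");

    var newCollection = collection.OrderBy(s =>
    {
        if (s.Contains("]"))
        {
            // Sort on the text with every "[tag] " group removed.
            return rgx.Replace(s, "");
        }
        else
        {
            return s;
        }
    }).ToList();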