C#: need a regular expression to capture the second occurrence after an underscore
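I am using the regular expression below to capture all the numbers/letters after an underscore, but I only need to capture the second occurrence, i.e. "00500", as shown below:
regular expression: (?<=_)[a-zA-Z0-9]+
string:
"-rw-rw-rw- 1 rats rats 31K Sep 17 13:33 /opt/data/automation_sent/20180918/labc/0/20180918_00500.itx"
I am doing this in C#. I thought the value would be in the second group [1], but it is not; it only captures the string "_sent":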
string temp2 = "";
Regex getValueAfterUnderscore = new Regex(@"(?<=_)[a-zA-Z0-9]+");
Match match2 = getValueAfterUnderscore.Match(line);
if (match2.Success)
{
    temp2 = match2.Groups[1].Value;
    Console.WriteLine(temp2);
}
Any ideas? Thanks!
If all of your strings look like {SOME_STRING}_{YOUR_NUMBER}.itx, you can use this solution (no regular expression):
var arr = str.Split(new[] {"_", ".itx"}, StringSplitOptions.RemoveEmptyEntries);
var result = arr[arr.Length - 1];
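The same no-regex idea can also be written with index lookups, which avoids allocating the intermediate array. This is only a minimal sketch: the names FileNameParts and ValueAfterLastUnderscore are made up for illustration, and it assumes the value you want sits between the last underscore and the last dot of the line.

```csharp
using System;

public static class FileNameParts
{
    // Hypothetical helper: returns the text between the last '_' and the last '.',
    // or an empty string when the input does not have that shape.
    public static string ValueAfterLastUnderscore(string line)
    {
        int underscore = line.LastIndexOf('_');
        int dot = line.LastIndexOf('.');

        // No underscore at all, or no dot after the last underscore.
        if (underscore < 0 || dot <= underscore)
            return string.Empty;

        return line.Substring(underscore + 1, dot - underscore - 1);
    }
}
```

Calling FileNameParts.ValueAfterLastUnderscore(str) on the sample line should give the same "00500" as the Split version.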
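Perhaps you are confusing "groups" with "matches". Your pattern contains no capturing group, so Groups[1] is always empty, and Regex.Match only returns the first match. You should search for all matches of the regular expression instead. Here is how to list every match of a regex in a given string: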
string str = "-rw-rw-rw- 1 rats rats 31K Sep 17 13:33 /opt/data/automation_sent/20180918/labc/0/20180918_00500.itx";
MatchCollection matches = Regex.Matches(str, @"(?<=_)[a-zA-Z0-9]+");
foreach (Match curMatch in matches)
    Console.WriteLine(curMatch.Value);
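For your specific case, verify that there are at least 2 matches and retrieve the value of matches[1] (the collection is zero-based, so this is the second match).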
if (matches.Count >= 2)
    Console.WriteLine($"Your result: {matches[1].Value}");
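If you need this in more than one place, the count check and the indexing can be wrapped in a small helper. The following is a sketch only: UnderscoreMatches and TryGetNth are hypothetical names, n is assumed to be 1-based, and the pattern is the one from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnderscoreMatches
{
    // Hypothetical helper: tries to get the n-th (1-based) run of letters/digits
    // that directly follows an underscore. Returns false when n is out of range.
    public static bool TryGetNth(string input, int n, out string value)
    {
        MatchCollection matches = Regex.Matches(input, @"(?<=_)[a-zA-Z0-9]+");

        if (n < 1 || matches.Count < n)
        {
            value = string.Empty;
            return false;
        }

        value = matches[n - 1].Value; // MatchCollection itself is zero-based
        return true;
    }
}
```

With the sample line, TryGetNth(str, 1, out var first) should yield "sent" and TryGetNth(str, 2, out var second) should yield "00500".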
var input = "-rw-rw-rw- 1 rats rats 31K Sep 17 13:33 /opt/data/automation_sent/20180918/labc/0/20180918_00500.itx";
Regex regex = new Regex(@"(?<Identifier1>\d+)_(?<Identifier2>\d+)");
var results = regex.Matches(input);
foreach (Match match in results)
{
    Console.WriteLine(match.Groups["Identifier1"].Value);
    Console.WriteLine(match.Groups["Identifier2"].Value); // second occurrence
}
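You can use the following code to capture the text after the second underscore. Note that .+ is greedy, so the group really captures what follows the last underscore, which happens to be the second one in this string: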
var line = "-rw-rw-rw- 1 rats rats 31K Sep 17 13:33 /opt/data/automation_sent/20180918/labc/0/20180918_00500.itx";
string temp2 = "";
Regex getValueAfterUnderscore = new Regex(@"_.+_([a-zA-Z0-9]+)");
Match match2 = getValueAfterUnderscore.Match(line);
if (match2.Success)
{
    temp2 = match2.Groups[1].Value;
    Console.WriteLine(temp2);
}
Output:
00500
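If a line could ever contain more than two underscores, the greedy pattern above would pick the text after the last one. A possible variant that counts underscores explicitly is sketched below; SecondUnderscore and AfterSecondUnderscore are illustrative names, and the pattern assumes you want exactly the second underscore counted from the start of the string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SecondUnderscore
{
    // ^[^_]*_   : everything up to and including the first underscore
    // [^_]*_    : everything up to and including the second underscore
    // ([a-zA-Z0-9]+) : the letters/digits that follow it (group 1)
    private static readonly Regex Pattern =
        new Regex(@"^[^_]*_[^_]*_([a-zA-Z0-9]+)");

    // Hypothetical helper: returns the capture, or an empty string if
    // the input has fewer than two underscores.
    public static string AfterSecondUnderscore(string input)
    {
        Match match = Pattern.Match(input);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }
}
```

For the sample line both patterns agree. For an input such as "a_b_c_d" this variant returns "c", while the greedy pattern would capture "d".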