Locate RegEx match then extract
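I am trying to read the text from a RichTextBox in order to locate the first occurrence of a matched expression. I would then like to extract the string that satisfies the query so that I can use it as a variable. Below is the basic code I have to start with and build upon.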
private string returnPostcode()
{
string[] allLines = rtxtDocViewer.Text.Split('\n');
string expression = string expression = "^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([AZa-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$"
foreach (string line in allLines)
{
if (Regex.Matches(line, expression, RegexOptions.Count > 0)
{
//extract and return the string that is found
}
}
}
An example of what the RichTextBox contains is shown below. I want to extract "E12 8SD", which the regex above should be able to find. Thanks.
Damon Brown
Flat B University Place
26 Park Square
London
E12 8SD
Mobile: 1111 22222
Email: dabrown192882@gmail.com Date of birth: 21/03/1986
Gender: Male
Marital Status: Single
Nationality: English
Summary
I have acquired a multifaceted skill set with experience using several computing platforms.
You need to use Regex.IsMatch and remove RegexOptions.Count > 0:
string[] allLines = s.Split('\n');
string expression = "^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$";
foreach (string line in allLines)
{
    if (Regex.IsMatch(line, expression)) // Regex.IsMatch checks whether the string matches the regex
    {
        Console.WriteLine(line); // Print the matched line
    }
}
Most likely your text contains CR+LF line breaks. In that case, adjust your code as follows:
string[] allLines = s.Split(new[] {"\r\n"}, StringSplitOptions.RemoveEmptyEntries);
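Putting the two points above together, here is a minimal sketch of how the method from the question could return the postcode rather than print the line. The class and method names (PostcodeFinder, FindFirstPostcode) are hypothetical, and the pattern is a regrouped variant of the one above: the alternatives are wrapped in one outer group so that ^ and $ apply to all of them, and (?i) stands in for the paired upper/lower-case ranges. Treat it as an illustration under those assumptions, not as a validated UK postcode matcher.

```csharp
using System;
using System.Text.RegularExpressions;

static class PostcodeFinder
{
    // Assumption: same alternatives as the pattern in the answer, regrouped so the
    // anchors cover every alternative; (?i) makes the match case-insensitive.
    private const string Pattern =
        @"(?i)^(gir 0a{2}|([a-z][0-9]{1,2}|[a-z][a-hj-y][0-9]{1,2}|[a-z][0-9][a-z]|[a-z][a-hj-y][0-9]?[a-z]) [0-9][a-z]{2})$";

    // Returns the first line that is a postcode, or null when no line matches.
    public static string FindFirstPostcode(string text)
    {
        // Split on CR+LF as well as bare LF, skipping empty lines.
        string[] lines = text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            // Trim stray whitespace so that $ can match at the end of the line.
            Match m = Regex.Match(line.Trim(), Pattern);
            if (m.Success)
            {
                return m.Value;
            }
        }
        return null;
    }
}
```

In the form from the question, this would be called as something like string postcode = PostcodeFinder.FindFirstPostcode(rtxtDocViewer.Text); and the result checked for null before use.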
Update
To extract the code with a regex alone, there is no need to split the content into lines; just use Regex.Match on the whole text:
string s = "Damon Brown\nFlat B University Place\n26 Park Square \nLondon\nTW1 1AJ Twickenham Mobile: +44 (0) 7711223344\nMobile: 1111 22222\nEmail: dabrown192882@gmail.com Date of birth: 21/03/1986\nGender: Male\nMarital Status: Single\nNationality: English\nSummary\nI have acquired a multifaceted skill set with experience using several computing platforms.";
string expression = @"(?i)\b(gir 0a{2})|((([a-z][0-9]{1,2})|(([a-z][a-hj-y][0-9]{1,2})|(([a-z][0-9][a-z])|([a-z][a-hj-y][0-9]?[a-z])))) [0-9][a-z]{2})\b";
Match res = Regex.Match(s, expression);
if (res.Success)
    Console.WriteLine(res.Value); // => TW1 1AJ
I also removed the uppercase ranges and replaced them with the case-insensitive modifier (?i).
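As a variation on the whole-text approach, the sketch below wraps the match in a helper that returns the value. The names PostcodeScan and FirstPostcode are made up for illustration. It also assumes two changes to the pattern: all alternatives sit in one non-capturing group so that \b applies on both sides of each of them, and RegexOptions.IgnoreCase is passed instead of the inline (?i).

```csharp
using System;
using System.Text.RegularExpressions;

static class PostcodeScan
{
    // Assumption: same alternatives as the pattern above, grouped so that the
    // word boundaries surround every alternative.
    private const string Pattern =
        @"\b(?:gir 0a{2}|(?:[a-z][0-9]{1,2}|[a-z][a-hj-y][0-9]{1,2}|[a-z][0-9][a-z]|[a-z][a-hj-y][0-9]?[a-z]) [0-9][a-z]{2})\b";

    // Scans the whole text and returns the first postcode-like substring, or null.
    public static string FirstPostcode(string text)
    {
        Match m = Regex.Match(text, Pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Value : null;
    }
}
```

Because no splitting is involved, the kind of line break in the text does not matter here; the trade-off is that the postcode no longer has to occupy a line of its own.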