Split a sentence if it ends with a period, but not at a number containing a decimal point
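I want to split my text into sentences, where each sentence ends with a period (.), without splitting numbers that contain a decimal point.
I used the Split function, but it breaks numbers into two parts, and I don't want numbers to be split.
Example:
My package a mount is 85.5 daily, how can I make use of it. any body has an idea for that. please let me know.
It should be split into: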
My package a mount is 85.5 daily, how can I make use of it
any body has an idea for that
please let me know
You can use "period followed by whitespace" as the delimiter:
string source = "My package amount is 85.5 daily, how can I make use of it. Anybody has an idea for that? Please let me know.";
string[] splits = Regex.Split(source, @"[\.?!]\s+");
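This also handles sentences that end with ! or ?, and it consumes any number of space characters (and other whitespace characters), with a minimum of one.
If you want to keep the periods, split on whitespace that is preceded by an end-of-sentence mark: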
string[] splits2 = Regex.Split(source, @"(?<=[\.?!])\s+");
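Both patterns require whitespace after the punctuation, so the final sentence keeps its trailing period. The following is a minimal sketch, assuming the terminating punctuation should be dropped from every sentence, including the last one, as in the expected output from the question. It lets the punctuation be followed by either whitespace or the end of the string and then discards the empty trailing entry. The names `SplitSentences` and `parts` are hypothetical, and the sample is written with top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: split at '.', '?' or '!' when it is followed by
// whitespace or by the end of the input. The dot in "85.5" is followed
// by a digit, so it never matches.
static string[] SplitSentences(string text) =>
    Regex.Split(text, @"[.?!](?:\s+|$)")
         .Where(s => s.Length > 0)   // the match at end of input leaves an empty last entry
         .ToArray();

string source = "My package amount is 85.5 daily, how can I make use of it. Anybody has an idea for that? Please let me know.";
string[] parts = SplitSentences(source);

foreach (string part in parts)
    Console.WriteLine(part);
```

With the sample string this yields three entries, and "85.5" stays intact in the first one.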
Dmitry Bychenko's answer is simple and elegant, but it loses one character after the . (dot).
var regex = new System.Text.RegularExpressions.Regex(@"(?<!\d)\.(?!\d)");
var myText = @"My package a mount is 85.5 daily, how can I make use of it. any body has an idea for that. please let me know.";
Console.WriteLine(regex.Replace(myText, Environment.NewLine));
It consumes only a . (dot) character that has neither a digit before it nor a digit after it.
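If an array of sentences is wanted instead of one string with line breaks, the same lookaround pattern can be passed to `Regex.Split`. This is a sketch under that assumption: `SplitOnSentenceDots` is a hypothetical name, and the trimming and empty-entry filtering are additions that are not part of the answer above. It uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: split at every dot that has no digit directly
// before it and no digit directly after it.
static string[] SplitOnSentenceDots(string text) =>
    Regex.Split(text, @"(?<!\d)\.(?!\d)")
         .Select(s => s.Trim())      // remove the space left after each dot
         .Where(s => s.Length > 0)   // drop the empty piece after the final dot
         .ToArray();

string myText = "My package a mount is 85.5 daily, how can I make use of it. any body has an idea for that. please let me know.";
string[] sentences = SplitOnSentenceDots(myText);

foreach (string sentence in sentences)
    Console.WriteLine(sentence);
```

One consequence of the digit check on both sides: a sentence that ends in a number, such as "It costs 85. Next.", is not split at the dot after "85", because that dot is preceded by a digit.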