Regular expression to capture N words at the start of a line without affecting the next line

Question

I have a large CSV file and need to capture the first few words (six or fewer) of each line. I'm using Replace in Notepad++ with this regular expression:

^((\w+\S+)\s+){1,6}.*$     (My 'replace with' text is ...)

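My problem is that when a line has fewer than six words, the match spills over into the next line.

For example, if I run the replacement on this text: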

one two three four five six
one two three four
one two three four five six
one two three four five six

I get this result:

one two three four five six...
one two three four
one two...
one two three four five six...

This is the result I want:

one two three four five six...
one two three four...
one two three four five six...
one two three four five six...

Any help would be appreciated.

Answer 1

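\s also matches newline characters, so \s+ can run past the end of a short line and into the next one. Try (?:(?!\n)\s) or [^\S\n] instead: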

^((\w+\S+)[^\S\n]+){1,6}.*$
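
To see the difference outside Notepad++, here is a minimal sketch in Python. It assumes Python's re module with re.MULTILINE behaves like Notepad++ for these constructs; the variable names are only illustrative.

```python
import re

# A short line followed by a full line, as in the question.
text = "one two three four\none two three four five six"

# The asker's pattern: \s+ is allowed to consume the newline.
original = re.compile(r"^((\w+\S+)\s+){1,6}.*$", re.MULTILINE)

# The suggested pattern: [^\S\n]+ is whitespace that is not a newline.
fixed = re.compile(r"^((\w+\S+)[^\S\n]+){1,6}.*$", re.MULTILINE)

original_match = original.search(text).group(0)
fixed_match = fixed.search(text).group(0)

print(repr(original_match))  # spans both lines
print(repr(fixed_match))     # stays on the first line
```

The first match contains the newline because the fourth repetition ends with \s+ swallowing it; the second match stops at the end of the short line.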

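Also, that pattern requires whitespace after every word, so on a line with exactly six words and no trailing whitespace the sixth word falls outside the repeated group. Try this instead: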

^((?:\w+\S*)(?:[^\S\n]+(\w+\S*)){0,5}).*$
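
As a quick check, the pattern above can be run against the sample text from the question. This is a sketch in Python rather than Notepad++, and the replacement \1... is an assumption, since the question does not show the full replacement text.

```python
import re

text = (
    "one two three four five six\n"
    "one two three four\n"
    "one two three four five six\n"
    "one two three four five six"
)

# First word, then up to five more words separated by non-newline whitespace.
pattern = re.compile(
    r"^((?:\w+\S*)(?:[^\S\n]+(\w+\S*)){0,5}).*$",
    re.MULTILINE,
)

# Assumed replacement: keep group 1 and append an ellipsis.
result = pattern.sub(r"\1...", text)
print(result)
```

Each line is handled on its own, so the four-word line gets its ellipsis and the line after it keeps all six words.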

Answer 2

I would do:

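Find what: ^((?:\S+\h){6}).*$
Replace with: (lost from the source; presumably a backreference to group 1, e.g. $1)

This removes everything after the sixth word. Lines with fewer than six words are left unchanged.

\h stands for horizontal whitespace.
\w+\S+ can be reduced to \S+; if that doesn't suit your needs, keep \w+\S+.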

Make sure the ". matches newline" option is unchecked.
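
A rough Python sketch of the same idea, under two assumptions: Python's re module has no \h, so [ \t] stands in for it here, and the replacement is taken to be a backreference to group 1. re.DOTALL plays the role of the ". matches newline" option.

```python
import re

text = "one two three four five six seven eight\none two three four"

# [ \t] is a stand-in for \h, which Python's re module does not support.
pattern_src = r"^((?:\S+[ \t]){6}).*$"

# Assumed replacement \1: keep only the first six words (with trailing space).
line_by_line = re.sub(pattern_src, r"\1", text, flags=re.MULTILINE)

# With DOTALL, .* no longer stops at the line break and eats the rest of the text.
dot_all = re.sub(pattern_src, r"\1", text, flags=re.MULTILINE | re.DOTALL)

print(repr(line_by_line))
print(repr(dot_all))
```

In the first call the four-word line does not match and stays as it is; in the second call it disappears, which is why the option should be off.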
