sed repetition match misbehaving

Question

I am trying to get the file path from the following string:

"# configuration file /etc/nginx/conf.d/default.conf"

by passing it to sed:

sed -n 's,\(# configuration file \)\(\/[a-zA-Z_.]\+\)\+,\2,p'

I expected /etc/nginx/conf.d/default.conf to be captured in \2, but surprisingly only the default.conf part is returned. From this I gather that \2 is repopulated by each successive match of /[a-zA-Z_.]\+. Wouldn't it be more logical for each successive match to go to the next reference, so that default.conf would be returned in \5?

/[a-zA-Z_.]\+ >>> \(/etc\)\(/nginx\)\(/conf.d\)\(/default.conf\)

Answer 1

This might work for you (GNU sed):

sed -nE 's,(# configuration file )((/[a-zA-Z_.]+)+),\2,p' file

This captures the whole file path.

sed -nE 's,(# configuration file )((/[a-zA-Z_.]+)+),\1,p' file

This captures the start of the comment.

sed -nE 's/(# configuration file )((\/[a-zA-Z_.]+)+)/\3/p' file

This captures the end of the file path.
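
To try the three variants without creating a file, here is a minimal sketch that pipes the sample line from the question through each of them. The variable names (line, full_path, prefix, last_part) are illustrative and not part of the original answer, and the sketch assumes a sed that supports -E, such as GNU sed.

```shell
# Sample input taken from the question.
line='# configuration file /etc/nginx/conf.d/default.conf'

# \2: the outer group wraps the whole repetition, so it holds the full path.
full_path=$(printf '%s\n' "$line" |
  sed -nE 's,(# configuration file )((/[a-zA-Z_.]+)+),\2,p')

# \1: the literal comment prefix, including its trailing space.
prefix=$(printf '%s\n' "$line" |
  sed -nE 's,(# configuration file )((/[a-zA-Z_.]+)+),\1,p')

# \3: the inner, repeated group keeps only its last iteration.
last_part=$(printf '%s\n' "$line" |
  sed -nE 's,(# configuration file )((/[a-zA-Z_.]+)+),\3,p')

printf '[%s]\n' "$full_path" "$prefix" "$last_part"
```

The only structural difference from the command in the question is the extra pair of parentheses around the repetition, which gives the whole path a group of its own.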

N.B. When a capture group is qualified by something that can repeat, i.e. *, ?, + or a {...} interval, it retains only the last such repetition (see solution 3).
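
The rule can be seen in isolation with a shorter input. The following is a sketch only, again assuming a sed with -E support; the variable names are made up for illustration. It also shows why \5 is not populated automatically: group numbers come from the parentheses written in the pattern, not from how many times a group repeats, so a fifth reference exists only if the pattern spells out five groups.

```shell
# Each iteration of (.) overwrites \1, so only the final character survives.
last_char=$(printf 'abc\n' | sed -E 's/(.)+/\1/')

# An outer group around the repetition captures the whole run instead.
whole_run=$(printf 'abc\n' | sed -E 's/((.)+)/\1/')

# Writing four path groups out by hand is what makes \5 available.
# This only matches paths with exactly four components.
fifth=$(printf '# configuration file /etc/nginx/conf.d/default.conf\n' |
  sed -nE 's,(# configuration file )(/[a-zA-Z_.]+)(/[a-zA-Z_.]+)(/[a-zA-Z_.]+)(/[a-zA-Z_.]+),\5,p')

printf '[%s]\n' "$last_char" "$whole_run" "$fifth"
```

Spelling the groups out fixes the path depth in advance, which is why wrapping the repetition in one outer group is usually the more practical choice.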


regex

sed

pattern-matching