R - why does str_detect return a different result than grepl when using word boundary on 'words' ending with dash

Question

The help page for str_detect states "Equivalent to grepl(pattern, x)", but:

str_detect("ALL-", str_c("\\b", "ALL-", "\\b"))
[1] FALSE

while

grepl(str_c("\\b", "ALL-", "\\b"), "ALL-")
[1] TRUE

I assume one of them is not working as expected? Or am I missing something?

Answer 1

When the argument perl = TRUE is added to grepl(), it returns the same result as str_detect():

> grepl(str_c("\\b", "ALL-", "\\b"), "ALL-")
[1] TRUE
> grepl(str_c("\\b", "ALL-", "\\b"), "ALL-", perl = TRUE)
[1] FALSE

This argument makes grepl() use Perl-compatible regular expressions (PCRE) instead of its default regex engine.

There is this warning in ?grep, which may be related:

The POSIX 1003.2 mode of gsub and gregexpr does not work correctly with repeated word-boundaries (e.g., pattern = "\b"). Use perl = TRUE for such matches (but that may not work as expected with non-ASCII inputs, as the meaning of ‘word’ is system-dependent).
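
To illustrate why the Perl-compatible result is FALSE, here is a minimal base R sketch. It relies on the usual definition of \b as a position between a word character and a non-word character: "-" is not a word character, so nothing separates it from the end of the string. The lookaround pattern at the end is not from the question; it is just one possible way to match "ALL-" as a standalone token, and the variable names are made up for the example.

```r
# Base R only; perl = TRUE selects the PCRE engine.
# Note the doubled backslash: "\b" on its own is a backspace character in R.
pattern <- "\\bALL-\\b"

# "-" is a non-word character, so there is no word boundary between
# "-" and the end of the string: the trailing \b cannot match.
at_end <- grepl(pattern, "ALL-", perl = TRUE)

# When a word character follows the dash, the trailing \b does match.
before_word <- grepl(pattern, "ALL-x", perl = TRUE)

# Assumption (not from the question): lookarounds that only require
# "no word character on either side" of the token.
lookaround <- "(?<!\\w)ALL-(?!\\w)"
standalone <- grepl(lookaround,
                    c("ALL-", "foo ALL- bar", "ALL-x", "xALL-"),
                    perl = TRUE)

print(c(at_end = at_end, before_word = before_word))  # FALSE, TRUE
print(standalone)                                     # TRUE TRUE FALSE FALSE
```

The FALSE that str_detect() returned in the question is consistent with the same boundary rule; the TRUE from grepl() without perl = TRUE comes from its default engine, which is the one the warning above refers to.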
