Regex excluding words in a certain sequence
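I want to filter out certain fields if they do not meet a condition. The problem is the order in which the words appear. I tried the following constructions: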
(EXCLUDING)(?!\(MONDAY)(.*MONDAY).*
and
(EXCLUDING)(?!\()(.*MONDAY).*
What I want to achieve is a filter that catches EXCLUDING * MONDAY,
but not if there is a parenthesis between those words. That is, I want to catch:
EXCLUDING MONDAY
EXCLUDING WEDNESDAY AND MONDAY
EXCLUDING MONDAY AND WEDNESDAY
EXCLUDING MONDAY (WEDNESDAY IS OK)
but not
EXCLUDING WEDNESDAY (MONDAY IS OK)
The expressions above, of course, catch all of them. This is run in R.
How about this?
mystrings <- c("EXCLUDING MONDAY",
"EXCLUDING WEDNESDAY AND MONDAY",
"EXCLUDING MONDAY AND WEDNESDAY",
"EXCLUDING MONDAY (WEDNESDAY IS OK)",
"EXCLUDING WEDNESDAY (MONDAY IS OK)")
# "\(" is not a valid escape in an R string; inside a character class a bare ( is enough
grepl("EXCLUDING[^(]+MONDAY", mystrings)
# [1]  TRUE  TRUE  TRUE  TRUE FALSE
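One thing to keep in mind with the character-class approach: `[^(]+` refuses to cross any opening parenthesis between the two words, not only one sitting directly in front of MONDAY. A minimal sketch of that behaviour, using input strings invented for illustration (they are not from the question):

```r
# Hypothetical inputs, made up for illustration (not from the question)
extra <- c("EXCLUDING MONDAY (WEDNESDAY IS OK)",
           "EXCLUDING TUESDAY (NOT FRIDAY) AND MONDAY")

# [^(]+ cannot step over "(", so the second string is rejected even
# though its MONDAY is outside the parentheses
res_class <- grepl("EXCLUDING[^(]+MONDAY", extra)
print(res_class)
# [1]  TRUE FALSE
```

Whether the second result is what you want depends on how your real data looks.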
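If you only want to rule out a ( immediately before MONDAY, you can use a negative lookbehind. Your regex uses a negative lookahead placed right after EXCLUDING, so it only tests the character that follows EXCLUDING (a space in your examples) and never sees the ( in (MONDAY, which is why it does not work.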
strs <- c("EXCLUDING MONDAY",
"EXCLUDING WEDNESDAY AND MONDAY",
"EXCLUDING MONDAY AND WEDNESDAY",
"EXCLUDING MONDAY (WEDNESDAY IS OK)",
"EXCLUDING WEDNESDAY (MONDAY IS OK)")
# The backslash must be doubled in an R string; lookbehind needs perl=TRUE
grepl("EXCLUDING.*(?<!\\()MONDAY", strs, perl=TRUE)
# [1] TRUE TRUE TRUE TRUE FALSE
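To see why the lookahead versions from the question let every string through, it may help to run them as written (with the backslashes doubled for R). This is only a sketch; the variable names are mine:

```r
strs <- c("EXCLUDING MONDAY",
          "EXCLUDING WEDNESDAY AND MONDAY",
          "EXCLUDING MONDAY AND WEDNESDAY",
          "EXCLUDING MONDAY (WEDNESDAY IS OK)",
          "EXCLUDING WEDNESDAY (MONDAY IS OK)")

# Both lookaheads are evaluated once, directly after EXCLUDING, where the
# next character is a space, so the assertion always succeeds
res_first  <- grepl("(EXCLUDING)(?!\\(MONDAY)(.*MONDAY).*", strs, perl = TRUE)
res_second <- grepl("(EXCLUDING)(?!\\()(.*MONDAY).*", strs, perl = TRUE)

print(res_first)
# [1] TRUE TRUE TRUE TRUE TRUE
print(res_second)
# [1] TRUE TRUE TRUE TRUE TRUE
```

The lookbehind works because it is attached to MONDAY itself, so the check happens at the position where the parenthesis would actually be.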