Searching and replacing characters with classes in R

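I'm trying to replace text in R. I only want to find the spaces between letters and digits and remove them, but when I search with [:alpha:] and [:alnum:], the match gets replaced with the literal class names.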

> string <- "WORD = 500 * WORD + ((WORD & 400) - (WORD & 300))"

> str_replace_all(string,
+                 "[:alpha:] & [:alnum:]",
+                 "[:alpha:]&[:alnum:]")

[1] "WORD = 500 * WORD + ((WOR[:alpha:]&[:alnum:]00) - (WOR[:alpha:]&[:alnum:]00))"

How can I use this function so that it returns:

[1] "WORD = 500 * WORD + ((WORD&400) - (WORD&300))"

Your requirement is easy to handle with gsub and lookarounds:

string <- "WORD = 500 * WORD + ((WORD & 400) - (WORD & 300))"
output <- gsub("(?<=\\w) & (?=\\w)", "&", string, perl=TRUE)
output

[1] "WORD = 500 * WORD + ((WORD&400) - (WORD&300))"

Here is a brief explanation of the regex:

(?<=\w)   assert that what precedes is a word character
[ ]&[ ]    then match a space, followed by `&`, followed by another space
(?=\w)    assert that what follows is also a word character

We then replace the match with a single &, with no spaces on either side.
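To see what the lookarounds buy you, here is a small sketch in base R. The input string `x` is made up for illustration and is not from the question. Only a ` & ` that sits directly between two word characters should be rewritten.

```r
# Hypothetical input: three occurrences of "&", only the first one
# has the shape <word char><space>&<space><word char>.
x <- "A & 1, ( & ), B &2"

# Lookbehind and lookahead check the neighbours without consuming them,
# so the replacement only has to supply the "&" itself.
result <- gsub("(?<=\\w) & (?=\\w)", "&", x, perl = TRUE)
result
#[1] "A&1, ( & ), B &2"
```

The parenthesised `&` and the one missing its trailing space are left alone, because the assertions or the literal spaces fail there.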

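Here is another option: use regex lookarounds to match one or more whitespace characters (\s+) before or after & and replace them with an empty string ("").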

gsub("(?<=&)\\s+|\\s+(?=&)", "", string, perl = TRUE)
#[1] "WORD = 500 * WORD + ((WORD&400) - (WORD&300))"
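The two gsub patterns give the same result on the question's string, but they are not equivalent in general. The sketch below uses a made-up input `x` (an assumption, not from the question) to show where they differ. The first pattern needs exactly one space on each side and a word character as neighbour, while the second strips any run of whitespace touching an `&`.

```r
# Hypothetical input: doubled spaces around the first "&",
# and a second "&" surrounded by parentheses instead of word characters.
x <- "a  &  b ( & )"

# Strict pattern: one space each side, word characters as neighbours.
strict <- gsub("(?<=\\w) & (?=\\w)", "&", x, perl = TRUE)

# Loose pattern: any whitespace directly before or after "&".
loose <- gsub("(?<=&)\\s+|\\s+(?=&)", "", x, perl = TRUE)

strict
#[1] "a  &  b ( & )"
loose
#[1] "a&b (&)"
```

Which one fits depends on whether an `&` next to punctuation should also be tightened.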
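
With stringr, you can also capture the character on each side and put it back with backreferences:

library(stringr)
str_replace_all(string, "([:alpha:]) & ([:alnum:])", "\\1&\\2")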