R: gsub words between spaces

Question

I have a text that looks like this:

a <- "233,236,241 solitude ΔE=1.9"

What I want to do is extract the second word, the one between the two spaces ( ), giving this output:

> solitude

I tried two approaches:

a1 <- strsplit(a,' ',fixed=TRUE)[[1]][2]
a2 <- sapply(strsplit(a, " ", fixed=TRUE), "[", 2)

But it always shows:

ΔE=1.9

What is the correct way to do this?

Answer 1

Try this:

gsub("\\s.+$", "", gsub("^.+[[:digit:]]\\s", "", a))
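
To see what each call contributes, the nested expression can be unpacked into two steps. This is only a sketch: step1 and step2 are names introduced here for illustration, and the approach assumes (as in the example) that the first word ends in a digit. Note that inside an R string literal the backslash has to be doubled, so the regex \s is written "\\s".

```r
a <- "233,236,241 solitude ΔE=1.9"

# Inner call: remove everything from the start through the last
# "digit followed by whitespace" (here, through "241 ").
step1 <- gsub("^.+[[:digit:]]\\s", "", a)

# Outer call: remove the first whitespace character and everything after it.
step2 <- gsub("\\s.+$", "", step1)

print(step2)
```

With the sample string this prints "solitude". If the first word did not end in a digit, the inner pattern would not match and the result would instead be the first word.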

Answer 2

Here is an approach that uses capture groups (patterns inside parentheses) and character classes (patterns inside square brackets).

sub("(^[^ ]*[ ])([^ ]*)([ ].*$)" , "\\2", a)
[1] "solitude"

Annotating the first capture group:

"(^[^ ]*[ ])([^ ]*)([ ].*$)" , "\\2", a)
         \ finds the first space
       \ an arbitrary number of times
    \ inside a character class, a '^' as the first character
       signals negation of the class; this one contains only the space character.
  \----- '^' marks the beginning of the string

The second capture group:

"(^[^ ]*[ ])([^ ]*)([ ].*$)" , "\\2", a)
                 \ an arbitrary number of times
              \negation of character class with only the space character in it.

The third capture group:

"(^[^ ]*[ ])([^ ]*)([ ].*$)" , "\\2", a)
                     \ the second space
                        \anything after second space to end.

The "\\<n>" entries in the replacement refer to the capture-group matches, numbered n in the order the groups appear in the pattern argument.
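
As a quick check of that numbering, the same pattern can be run with each backreference in turn. This is a sketch; pattern, g1, g2 and g3 are names introduced here for illustration. Because the pattern matches the whole string, each call returns exactly what the corresponding group captured.

```r
a <- "233,236,241 solitude ΔE=1.9"
pattern <- "(^[^ ]*[ ])([^ ]*)([ ].*$)"

g1 <- sub(pattern, "\\1", a)  # first word plus the first space
g2 <- sub(pattern, "\\2", a)  # the word between the two spaces
g3 <- sub(pattern, "\\3", a)  # the second space and the rest of the string

print(c(g1, g2, g3))
```

Pasting g1, g2 and g3 back together reproduces the original string, which is an easy way to confirm that the three groups cover the input without overlap.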
