Filter paths that have only one "/" in R using a regular expression

I have a vector of different paths, for example

 levs<-c( "20200507-30g_25d" , "20200507-30g_25d/ggg" , "20200507-30g_25d/grn", "20200507-30g_25d/ylw", "ggg" , "grn", "tre_livelli", "tre_livelli/20200507-30g_25d", "tre_livelli/20200507-30g_25d/ggg", "tre_livelli/20200507-30g_25d/grn", "tre_livelli/20200507-30g_25d/ylw" , "ylw" )

This is actually the output of list.dirs with recursive set to TRUE.

I only want to identify the paths that have exactly one subfolder (i.e. "20200507-30g_25d/ggg", "20200507-30g_25d/grn", "20200507-30g_25d/ylw").

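My idea is to filter the vector down to the paths that contain exactly one "/", and then compare those against the paths with more than one "/" to drop the partial paths.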

I tried a regular expression such as:

grep(levs,pattern='/{1}', value=T)

but I get:

 "20200507-30g_25d/ggg"             "20200507-30g_25d/grn"             "20200507-30g_25d/ylw"             "tre_livelli/20200507-30g_25d"     "tre_livelli/20200507-30g_25d/ggg" "tre_livelli/20200507-30g_25d/grn" "tre_livelli/20200507-30g_25d/ylw"

Any idea how to proceed?

One option is str_count from stringr, which counts the number of occurrences of /:

library(stringr)
levs[str_count(levs, "/") == 1]

-output

[1] "20200507-30g_25d/ggg"         "20200507-30g_25d/grn" 
[3] "20200507-30g_25d/ylw"         "tre_livelli/20200507-30g_25d"
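
If loading stringr is not an option, the same counting idea can be sketched in base R. This is an alternative, not part of the answer above; the helper name count_slashes is hypothetical, and levs is shortened here to keep the example small.

```r
# Shortened version of the vector from the question
levs <- c("20200507-30g_25d", "20200507-30g_25d/ggg", "ggg",
          "tre_livelli/20200507-30g_25d", "tre_livelli/20200507-30g_25d/ggg")

# Hypothetical helper: count the literal "/" characters in each element.
# gregexpr() finds every match, regmatches() extracts them as a list,
# and lengths() gives the number of matches per element (0 if none).
count_slashes <- function(x) {
  lengths(regmatches(x, gregexpr("/", x, fixed = TRUE)))
}

# Keep only the paths with exactly one "/"
one_slash <- levs[count_slashes(levs) == 1]
print(one_slash)
```

Using fixed = TRUE treats "/" as a literal string, so no regex escaping is involved.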

/{1} is a regex equivalent to /: it simply matches a / anywhere in the string, and the string may still contain several of them. See the tag page:

Using {1} as a single-repetition quantifier is harmless but never useful. It is basically an indication of inexperience and/or confusion.

h{1}t{1}t{1}p{1} matches the same string as the simpler expression http (or ht{2}p for that matter) but as you can see, the redundant {1} repetitions only make it harder to read.
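
A quick way to check this is to run both patterns on a few toy strings. The vector x below is made up for illustration.

```r
# Toy inputs with zero, one and two slashes
x <- c("a", "a/b", "a/b/c")

# Both patterns only ask "does the string contain a / somewhere?"
with_quantifier    <- grepl("/{1}", x)
without_quantifier <- grepl("/", x)

print(with_quantifier)
print(identical(with_quantifier, without_quantifier))
```

Both patterns flag "a/b/c" as a match, which is why the original attempt also returned the deeper paths.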

You can use

grep(levs, pattern="^[^/]+/[^/]+$", value=TRUE)
# => [1] "20200507-30g_25d/ggg"         "20200507-30g_25d/grn"         "20200507-30g_25d/ylw"         "tre_livelli/20200507-30g_25d"

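See the regex demo. Pattern details:

  • ^ - matches the start of the string
  • [^/]+ - one or more characters other than /
  • / - a / character
  • [^/]+ - one or more characters other than /
  • $ - end of the string.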

Note: if the parts before and after the single / in the string may be empty, replace + with *: ^[^/]*/[^/]*$.
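
The difference between the two quantifiers can be sketched on a few made-up strings. The vector x and the names strict and lenient are illustrative only.

```r
# Toy inputs: a normal path, empty sides, two slashes, and no slash
x <- c("a/b", "/b", "a/", "/", "a/b/c", "abc")

strict  <- grepl("^[^/]+/[^/]+$", x)  # both sides of the / must be non-empty
lenient <- grepl("^[^/]*/[^/]*$", x)  # either side of the / may be empty

print(data.frame(x, strict, lenient))
```

Here strict accepts only "a/b", while lenient also accepts "/b", "a/" and "/"; both reject "a/b/c" and "abc".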