Filter paths that have only one "/" in R using a regular expression

Question

I have a vector of different paths, for example

 levs<-c( "20200507-30g_25d" , "20200507-30g_25d/ggg" , "20200507-30g_25d/grn", "20200507-30g_25d/ylw", "ggg" , "grn", "tre_livelli", "tre_livelli/20200507-30g_25d", "tre_livelli/20200507-30g_25d/ggg", "tre_livelli/20200507-30g_25d/grn", "tre_livelli/20200507-30g_25d/ylw" , "ylw" )

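This is actually the output of list.dirs with recursive set to TRUE.

I want to identify only the paths that have exactly one subfolder (that is, "20200507-30g_25d/ggg", "20200507-30g_25d/grn", "20200507-30g_25d/ylw").

My idea is to filter the vector to find only those paths that contain exactly one "/", and then compare them with the paths that contain more than one "/" to get rid of the partial paths.

I tried a regular expression such as: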

grep(levs, pattern='/{1}', value=T)

but I get:

 "20200507-30g_25d/ggg"             "20200507-30g_25d/grn"             "20200507-30g_25d/ylw"             "tre_livelli/20200507-30g_25d"     "tre_livelli/20200507-30g_25d/ggg" "tre_livelli/20200507-30g_25d/grn" "tre_livelli/20200507-30g_25d/ylw"

Any idea how to proceed?

Answer 1

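An option is to use str_count to count the number of instances of / in each string and keep the elements where that count is 1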

library(stringr)
levs[str_count(levs, "/") == 1 ]

-output

[1] "20200507-30g_25d/ggg"         "20200507-30g_25d/grn" 
[3] "20200507-30g_25d/ylw"         "tre_livelli/20200507-30g_25d"
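
If adding the stringr dependency is not desirable, the same count can be sketched in base R. This is a minimal sketch, not part of the original answer; the names n_slash and one_slash are made up for illustration, and levs is the vector from the question.

```r
# Same sample vector as in the question
levs <- c("20200507-30g_25d", "20200507-30g_25d/ggg", "20200507-30g_25d/grn",
          "20200507-30g_25d/ylw", "ggg", "grn", "tre_livelli",
          "tre_livelli/20200507-30g_25d", "tre_livelli/20200507-30g_25d/ggg",
          "tre_livelli/20200507-30g_25d/grn", "tre_livelli/20200507-30g_25d/ylw",
          "ylw")

# gregexpr() finds every "/" in each string, regmatches() extracts the matches,
# and lengths() counts them per element (0 when there is no "/").
n_slash <- lengths(regmatches(levs, gregexpr("/", levs, fixed = TRUE)))

# Keep only the paths with exactly one "/"
one_slash <- levs[n_slash == 1]
one_slash
```

Because fixed = TRUE treats "/" as a literal string, no regex escaping is involved here.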

Answer 2

/{1} is a regex equivalent to /. It simply matches a / anywhere in the string, and the string may still contain more than one /. See the regex tag page:

Using {1} as a single-repetition quantifier is harmless but never useful. It is basically an indication of inexperience and/or confusion.

h{1}t{1}t{1}p{1} matches the same string as the simpler expression http (or ht{2}p for that matter) but as you can see, the redundant {1} repetitions only make it harder to read.
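
The equivalence is easy to check in R. The snippet below is a small sketch using a made-up vector x (an assumption for illustration, not data from the question): both patterns flag every string that contains at least one /, no matter how many.

```r
# Hypothetical test strings with zero, one and two slashes
x <- c("a", "a/b", "a/b/c")

with_quantifier    <- grepl("/{1}", x)  # pattern from the question
without_quantifier <- grepl("/", x)     # plain slash

# Both patterns give the same result: any string containing a "/" matches
same <- identical(with_quantifier, without_quantifier)
same
with_quantifier
```

So the {1} quantifier says "one / at this position", not "exactly one / in the whole string".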

You can use

grep(levs, pattern="^[^/]+/[^/]+$", value=TRUE)
# => [1] "20200507-30g_25d/ggg"         "20200507-30g_25d/grn"         "20200507-30g_25d/ylw"         "tre_livelli/20200507-30g_25d"

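See the regex demo. Pattern details:

^ - matches the start of the string
[^/]+ - one or more characters other than /
/ - a / character
[^/]+ - one or more characters other than /
$ - end of the string.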
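
To see why the ^ and $ anchors matter in ^[^/]+/[^/]+$, the sketch below compares the anchored pattern with an unanchored variant on a few made-up strings (x, anchored and unanchored are names chosen for illustration only).

```r
# Hypothetical test strings: no slash, one slash, two slashes
x <- c("ggg", "a/b", "a/b/c")

# Anchored: the whole string must be "non-slashes, one slash, non-slashes"
anchored <- grepl("^[^/]+/[^/]+$", x)

# Unanchored: it is enough that some part of the string looks like "a/b",
# so "a/b/c" matches as well
unanchored <- grepl("[^/]+/[^/]+", x)

data.frame(x, anchored, unanchored)
```

Only the anchored version rejects "a/b/c", because after matching "a/b" the $ anchor requires the string to end there.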

Note: if the parts before and after the only / in the string may be empty, replace + with *: ^[^/]*/[^/]*$.
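
The difference between the + and * variants only shows up for strings that start or end with /. A small sketch on made-up inputs (x, plus_variant and star_variant are illustrative names, not from the answer):

```r
# Hypothetical test strings, including leading and trailing slashes
x <- c("a/b", "/b", "a/", "/", "a/b/c")

# "+" requires at least one non-slash character on each side of the "/"
plus_variant <- grepl("^[^/]+/[^/]+$", x)

# "*" also accepts an empty part before and/or after the "/"
star_variant <- grepl("^[^/]*/[^/]*$", x)

data.frame(x, plus_variant, star_variant)
```

Both variants still reject "a/b/c", since neither allows a second / anywhere in the string.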
