Filter paths that have only one "/" in R using a regular expression
I have a vector of different paths, for example:
levs<-c( "20200507-30g_25d" , "20200507-30g_25d/ggg" , "20200507-30g_25d/grn", "20200507-30g_25d/ylw", "ggg" , "grn", "tre_livelli", "tre_livelli/20200507-30g_25d", "tre_livelli/20200507-30g_25d/ggg", "tre_livelli/20200507-30g_25d/grn", "tre_livelli/20200507-30g_25d/ylw" , "ylw" )
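This is actually the output of list.dirs with recursive set to TRUE.
I want to identify only the paths that have exactly one subfolder (i.e. "20200507-30g_25d/ggg", "20200507-30g_25d/grn", "20200507-30g_25d/ylw").
I thought of filtering the vector to find only the paths that contain exactly one "/", and then comparing them with the paths that contain more than one "/" to get rid of the partial paths.
I tried a regular expression such as:
grep(levs, pattern='/{1}', value=T)
But I get: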
"20200507-30g_25d/ggg" "20200507-30g_25d/grn" "20200507-30g_25d/ylw" "tre_livelli/20200507-30g_25d" "tre_livelli/20200507-30g_25d/ggg" "tre_livelli/20200507-30g_25d/grn" "tre_livelli/20200507-30g_25d/ylw"
Any idea how to proceed?
One option is str_count, which counts the number of instances of / in each string:
library(stringr)
levs[str_count(levs, "/") == 1 ]
Output:
[1] "20200507-30g_25d/ggg" "20200507-30g_25d/grn"
[3] "20200507-30g_25d/ylw" "tre_livelli/20200507-30g_25d"
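If adding stringr as a dependency is not desirable, the same count can be obtained with base R. The following is a minimal sketch, not part of the original answer; it uses a shortened sample of levs, and the name slash_count is only illustrative.

```r
# Shortened sample of the vector from the question (assumption: a subset is enough to illustrate)
levs <- c("ggg",
          "20200507-30g_25d/ggg",
          "tre_livelli/20200507-30g_25d",
          "tre_livelli/20200507-30g_25d/ggg")

# gregexpr() locates every "/" and regmatches() extracts the matches;
# lengths() then gives the number of "/" per element (0 when there is none)
slash_count <- lengths(regmatches(levs, gregexpr("/", levs, fixed = TRUE)))

# Keep only the elements with exactly one "/"
one_slash <- levs[slash_count == 1]
print(one_slash)
```

With fixed = TRUE the "/" is treated as a literal string rather than a regular expression, which is all that is needed for counting.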
/{1} is a regular expression equivalent to /. It simply matches a / anywhere in the string, and the string may contain more than one of them. See the regex tag page:
Using {1} as a single-repetition quantifier is harmless but never useful. It is basically an indication of inexperience and/or confusion. h{1}t{1}t{1}p{1} matches the same string as the simpler expression http (or ht{2}p for that matter) but as you can see, the redundant {1} repetitions only make it harder to read.
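The equivalence of /{1} and / can be checked directly in R. This is a small illustrative sketch on a shortened sample of levs; the variable names with_quantifier and plain are made up for the example.

```r
# Shortened sample of the vector from the question
levs <- c("ggg",
          "20200507-30g_25d/ggg",
          "tre_livelli/20200507-30g_25d/ggg")

# Both patterns only ask "is there a / somewhere in the string?"
with_quantifier <- grepl("/{1}", levs)
plain <- grepl("/", levs)

# The two logical vectors are the same, so {1} adds nothing
same_result <- identical(with_quantifier, plain)
print(same_result)
print(with_quantifier)
```

The path with two slashes still matches, which is why the attempt in the question returned the deeper paths as well.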
You can use
grep(levs, pattern="^[^/]+/[^/]+$", value=TRUE)
# => [1] "20200507-30g_25d/ggg" "20200507-30g_25d/grn" "20200507-30g_25d/ylw" "tre_livelli/20200507-30g_25d"
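The question also mentions dropping partial paths, that is, one-slash paths that are themselves parents of deeper paths (such as "tre_livelli/20200507-30g_25d"). The answers do not cover that step; the following is one possible sketch, under the assumption that "partial" means "some other element of levs starts with this path followed by /". The names one_slash, has_children and leaf_paths are hypothetical.

```r
# Shortened sample of the vector from the question
levs <- c("20200507-30g_25d",
          "20200507-30g_25d/ggg",
          "20200507-30g_25d/grn",
          "tre_livelli",
          "tre_livelli/20200507-30g_25d",
          "tre_livelli/20200507-30g_25d/ggg")

# Step 1: paths with exactly one "/"
one_slash <- grep("^[^/]+/[^/]+$", levs, value = TRUE)

# Step 2: flag a path as partial when some element of levs continues it with "/"
has_children <- vapply(
  one_slash,
  function(p) any(startsWith(levs, paste0(p, "/"))),
  logical(1),
  USE.NAMES = FALSE
)

# Step 3: keep the one-slash paths that are not parents of anything deeper
leaf_paths <- one_slash[!has_children]
print(leaf_paths)
```

Appending "/" before the prefix test avoids treating a sibling such as "a/bc" as a child of "a/b".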
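See the regex demo. Pattern details:
^ - matches the start of the string
[^/]+ - one or more characters other than /
/ - a / character
[^/]+ - one or more characters other than /
$ - the end of the string.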
Note: if the parts before or after the only / in the string may be empty, replace + with *: ^[^/]*/[^/]*$.
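The difference between the + and * variants only shows up on edge cases. The sketch below uses a few made-up strings (edge_cases is not from the question) to compare the two patterns.

```r
# Hypothetical edge cases: empty parts before and/or after the slash
edge_cases <- c("a/b", "a/", "/b", "/", "a/b/c", "abc")

# + requires at least one non-slash character on each side of the /
strict <- grepl("^[^/]+/[^/]+$", edge_cases)

# * also accepts an empty part on either side
lenient <- grepl("^[^/]*/[^/]*$", edge_cases)

print(data.frame(edge_cases, strict, lenient))
```

Both variants reject "a/b/c" and "abc", because the anchors ^ and $ force the whole string to contain exactly one /.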