Sort a vector by a substring
Hello, I have a list of files:
"1_EX-P1-H2.3000" "10_EX-P1-H2.3002" "100_EX-P1-H2.3074"
"1004_EX-P1-H2.4059" "1006_EX-P1-H2.4070" "2_EX-P1-H2.3000" "3_EX-P1-H2.3000" "4_EX-P1-H2.3000"
"5_EX-P1-H2.3001"
I don't want to sort them lexicographically, but by the first number before the "_", which ranges from 1 to 1000. As a result I should get:
"1_EX-P1-H2.3000" "2_EX-P1-H2.3000" "3_EX-P1-H2.3000" "4_EX-P1-H2.3000"
"5_EX-P1-H2.3001" "10_EX-P1-H2.3002" "100_EX-P1-H2.3074"
"1004_EX-P1-H2.4059" "1006_EX-P1-H2.4070"
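As the OP mentioned, the goal is to order based on the first number before the _. We can use parse_number from readr to extract the first numeric substring, then pass it to order and use the result to rearrange the vector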
v1[order(readr::parse_number(v1))]
#[1] "1_EX-P1-H2.3000" "2_EX-P1-H2.3000" "3_EX-P1-H2.3000" "4_EX-P1-H2.3000" "5_EX-P1-H2.3001" "10_EX-P1-H2.3002"
#[7] "100_EX-P1-H2.3074" "1004_EX-P1-H2.4059" "1006_EX-P1-H2.4070"
Or use sub to remove the substring from the first _ onward, convert what remains to numeric, and order on that
v1[order(as.numeric(sub("_.*", "", v1)))]
#[1] "1_EX-P1-H2.3000" "2_EX-P1-H2.3000" "3_EX-P1-H2.3000" "4_EX-P1-H2.3000" "5_EX-P1-H2.3001" "10_EX-P1-H2.3002"
#[7] "100_EX-P1-H2.3074" "1004_EX-P1-H2.4059" "1006_EX-P1-H2.4070"
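To make the sub approach easier to follow, here is a step-by-step sketch of the same one-liner in base R. The intermediate names prefix, key, idx and sorted are only for illustration and are not part of the original answer.

```r
v1 <- c("1_EX-P1-H2.3000", "10_EX-P1-H2.3002", "100_EX-P1-H2.3074",
        "1004_EX-P1-H2.4059", "1006_EX-P1-H2.4070", "2_EX-P1-H2.3000",
        "3_EX-P1-H2.3000", "4_EX-P1-H2.3000", "5_EX-P1-H2.3001")

# Drop everything from the first underscore onward; what is left is the
# leading number, still stored as a character string ("1", "10", "100", ...)
prefix <- sub("_.*", "", v1)

# Convert to numeric so that 2 sorts before 10 instead of after it
key <- as.numeric(prefix)

# order() returns the positions that would sort key in ascending order
idx <- order(key)

# Index the original vector with those positions
sorted <- v1[idx]
sorted
```

The conversion with as.numeric is the important step: ordering prefix directly would compare the values as strings and give the lexicographic result the question is trying to avoid.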
Another option is mixedsort from gtools
gtools::mixedsort(v1)
-output
#[1] "1_EX-P1-H2.3000" "2_EX-P1-H2.3000" "3_EX-P1-H2.3000" "4_EX-P1-H2.3000" "5_EX-P1-H2.3001" "10_EX-P1-H2.3002"
#[7] "100_EX-P1-H2.3074" "1004_EX-P1-H2.4059" "1006_EX-P1-H2.4070"
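If adding a package is not an option, the base R variant can be wrapped in a small function. The sketch below is an assumption-laden illustration, not part of the original answer: sort_by_prefix is a hypothetical name, the sample file names are made up, and the choices to break ties on the full string and to place names without a numeric prefix last are mine.

```r
# Hypothetical helper: sort by the number before the first underscore.
# Names whose prefix is not a number get an NA key and are placed last;
# ties on the numeric key are broken by the full string.
sort_by_prefix <- function(x) {
  key <- suppressWarnings(as.numeric(sub("_.*", "", x)))
  x[order(key, x, na.last = TRUE)]
}

# Made-up input for illustration only
res <- sort_by_prefix(c("10_b", "2_z", "2_a", "notes.txt", "1_c"))
res
```

For the vector in the question every element has a numeric prefix and no two prefixes repeat, so this should give the same ordering as the one-liners above.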
data
v1 <- c("1_EX-P1-H2.3000", "10_EX-P1-H2.3002", "100_EX-P1-H2.3074",
"1004_EX-P1-H2.4059", "1006_EX-P1-H2.4070", "2_EX-P1-H2.3000",
"3_EX-P1-H2.3000", "4_EX-P1-H2.3000", "5_EX-P1-H2.3001")