Remove rows containing matching numeric strings
I have a data frame with 3 columns:
df
A B C
round1 test1 testing1
round1 test1 testing2
round1 test1 testing3
round1 test1 testing4
round1 test1 testing5
round2 test2 testing1
round2 test2 testing2
round2 test2 testing3
round2 test2 testing4
round2 test2 testing5
.
.
.
.
.
round100 test30 testing30
round100 test30 testing31
How can I remove the rows where the numeric parts of the strings in columns B and C match?
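Just extract the numeric part of each string and compare.
# Backslashes must be doubled inside R string literals
NumB = sub("\\D+(\\d+).*", "\\1", DAT$B)
NumC = sub("\\D+(\\d+).*", "\\1", DAT$C)
DAT = DAT[NumB != NumC,]
Data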
DAT = read.table(text="A B C
round1 test1 testing1
round1 test1 testing2
round1 test1 testing3
round1 test1 testing4
round1 test1 testing5
round2 test2 testing1
round2 test2 testing2
round2 test2 testing3
round2 test2 testing4
round2 test2 testing5",
header=TRUE, stringsAsFactors = FALSE)
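To see what the comparison is doing, here is a minimal sketch on a few of the sample rows. The `keep` column and the `kept` variable are names made up for illustration; they are not part of the answer above.

```r
DAT <- read.table(text = "A B C
round1 test1 testing1
round1 test1 testing2
round2 test2 testing2
round2 test2 testing3",
header = TRUE, stringsAsFactors = FALSE)

# Capture the first run of digits that follows the leading non-digits
NumB <- sub("\\D+(\\d+).*", "\\1", DAT$B)
NumC <- sub("\\D+(\\d+).*", "\\1", DAT$C)

# Show the extracted pieces next to the keep/drop decision for each row
print(data.frame(B = DAT$B, C = DAT$C, NumB, NumC, keep = NumB != NumC))

# Keep only the rows where the two extracted numbers differ
kept <- DAT[NumB != NumC, ]
```

Printing the intermediate vectors like this is a quick way to confirm the pattern captures what you expect before overwriting DAT.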
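Replace the non-digits, "\\D", with the empty string and compare what is left:
subset(DF, gsub("\\D", "", B) != gsub("\\D", "", C))
giving the following, where the input DF is shown reproducibly in the Note below: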
A B C
2 round1 test1 testing2
3 round1 test1 testing3
4 round1 test1 testing4
5 round1 test1 testing5
6 round2 test2 testing1
8 round2 test2 testing3
9 round2 test2 testing4
10 round2 test2 testing5
12 round100 test30 testing31
Note
The input in reproducible form is:
Lines <- "
A B C
round1 test1 testing1
round1 test1 testing2
round1 test1 testing3
round1 test1 testing4
round1 test1 testing5
round2 test2 testing1
round2 test2 testing2
round2 test2 testing3
round2 test2 testing4
round2 test2 testing5
round100 test30 testing30
round100 test30 testing31"
DF <- read.table(text = Lines, header = TRUE)
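One thing to watch: the digits are compared as character strings, so a value like "test01" would not match "testing1". If that matters for your data, a possible variant is to convert to integer before comparing. This is only a sketch; `DF2` and `same_number` are hypothetical names, and it assumes every value in B and C contains exactly one run of digits (a value with no digits would become NA).

```r
DF2 <- data.frame(
  A = c("round1", "round1", "round2"),
  B = c("test01", "test1", "test2"),
  C = c("testing1", "testing2", "testing2"),
  stringsAsFactors = FALSE
)

# Strip every non-digit, then compare what is left as integers,
# so that "01" and "1" are treated as the same number
same_number <- as.integer(gsub("\\D", "", DF2$B)) ==
  as.integer(gsub("\\D", "", DF2$C))

# Keep only the rows whose numbers differ
result <- DF2[!same_number, ]
print(result)
```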