Conditional replacement of specific row values
I have a problem with conditional replacement. Suppose I have the following data frame:
a=c("0","1","0","B","NA","NA","NA","NA","NA")
b=c(0,1,0,0,1,0,1,0,1)
c=c(0,0,0,0,1,0,0,1,1)
d=c("0","1","0","0","1","0","B","NA","NA")
dat=data.frame(rbind(a,b,c,d))
names(dat)=c("P1","P2","P3","P4","C1","C2","C3","C4","C5")
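Now, within each row, I want to replace the values of P1:P4 with NA if one of them is B, and likewise replace the values of C1:C5 with NA if one of those is B. So I would like the data frame to look like this: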
a=c(**"NA","NA","NA","NA"**,"NA","NA","NA","NA","NA")
b=c(0,1,0,0,1,0,1,0,1)
c=c(0,0,0,0,1,0,0,1,1)
d=c("0","1","0","0",**"NA","NA","NA"**,"NA","NA")
dat=data.frame(rbind(a,b,c,d))
names(dat)=c("P1","P2","P3","P4","C1","C2","C3","C4","C5")
I hope the question is understandable; any help would be much appreciated.
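As I understand it, if the value B is found in columns P1 to P4, all values in P1 to P4 should be set to NA.
You could try the following. Note that any() here tests the whole P1:P4 block at once, so every row of those columns becomes NA: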
nm <- c("P1", "P2", "P3", "P4")
cols <- which(names(dat) %in% nm)
dat[,cols][any(dat[,cols] == "B")] <- NA
dat
# P1 P2 P3 P4 C1 C2 C3 C4 C5
# a NA NA NA NA NA NA NA NA NA
# b NA NA NA NA 1 0 1 0 1
# c NA NA NA NA 1 0 0 1 1
# d NA NA NA NA 1 0 B NA NA
If you only want to apply this to the first row, use dat[1,cols][any(dat[,cols] == "B")] <- NA.
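The output above blanks P1:P4 in every row, because any() collapses the whole comparison to a single TRUE. A minimal row-wise sketch, assuming the columns are character (stringsAsFactors = FALSE) and using has_B as an illustrative name that is not part of the original answer, builds one logical per row instead:

```r
# Rebuild the example data from the question (row names a, b, c, d)
dat <- data.frame(rbind(
  a = c("0", "1", "0", "B", "NA", "NA", "NA", "NA", "NA"),
  b = c(0, 1, 0, 0, 1, 0, 1, 0, 1),
  c = c(0, 0, 0, 0, 1, 0, 0, 1, 1),
  d = c("0", "1", "0", "0", "1", "0", "B", "NA", "NA")
), stringsAsFactors = FALSE)
names(dat) <- c("P1", "P2", "P3", "P4", "C1", "C2", "C3", "C4", "C5")

p_cols <- c("P1", "P2", "P3", "P4")

# One logical per row: TRUE where any of P1:P4 equals "B"
has_B <- rowSums(dat[, p_cols] == "B") > 0

# Blank P1:P4 only in the rows that contain a "B"
dat[has_B, p_cols] <- NA
dat
```

Only row a is changed here; row d keeps its P values because its B sits in C3. The "NA" entries in the example data are character strings rather than real missing values, so they do not affect the comparison.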
Taking dat to be the data frame as originally provided, here is a somewhat longer piece of code that may be easier to follow. I hope it helps.
dat2 <- data.frame()
for (i in seq_len(nrow(dat))) {
  # Take the i-th row as a one-row data frame
  datSubset <- dat[i, ]
  # Column index of the first "B" in this row (NA if the row has none)
  col.num.of.B <- which(datSubset == "B")[1]
  if (!is.na(col.num.of.B)) {
    if (col.num.of.B < 5) {
      datSubset[, 1:4] <- NA  # "B" is in P1:P4
    } else {
      datSubset[, 5:9] <- NA  # "B" is in C1:C5
    }
  }
  dat2 <- rbind(dat2, datSubset)
}
dat2
# P1 P2 P3 P4 C1 C2 C3 C4 C5
# a <NA> <NA> <NA> <NA> NA NA NA NA NA
# b 0 1 0 0 1 0 1 0 1
# c 0 0 0 0 1 0 0 1 1
# d 0 1 0 0 <NA> <NA> <NA> <NA> <NA>
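The same result can probably be reached without growing a data frame inside a loop. The sketch below is one possible variant, not the answerer's method: blank_group is a hypothetical helper introduced only for illustration, and the code assumes character columns (stringsAsFactors = FALSE). Unlike the loop, it also covers a row that has a B in both column groups.

```r
# Rebuild the example data from the question (row names a, b, c, d)
dat <- data.frame(rbind(
  a = c("0", "1", "0", "B", "NA", "NA", "NA", "NA", "NA"),
  b = c(0, 1, 0, 0, 1, 0, 1, 0, 1),
  c = c(0, 0, 0, 0, 1, 0, 0, 1, 1),
  d = c("0", "1", "0", "0", "1", "0", "B", "NA", "NA")
), stringsAsFactors = FALSE)
names(dat) <- c("P1", "P2", "P3", "P4", "C1", "C2", "C3", "C4", "C5")

# Hypothetical helper: set `cols` to NA in every row where
# at least one of those columns equals `flag`
blank_group <- function(df, cols, flag = "B") {
  hit <- rowSums(df[, cols, drop = FALSE] == flag, na.rm = TRUE) > 0
  df[hit, cols] <- NA
  df
}

# Apply the rule to each column group independently
dat2 <- dat
for (grp in list(paste0("P", 1:4), paste0("C", 1:5))) {
  dat2 <- blank_group(dat2, grp)
}
dat2
```

Treating each group separately keeps the rule for P1:P4 independent of the rule for C1:C5, and further groups could be handled by adding their column names to the list.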