Replacing NA in column with values in adjacent column
I have a data frame like this:
A B   C   D   E    F    G    H
a LOW 1.5 0.2 NA   1000 2000 NA
b LOW 2.9 0.4 HIGH 6000 1000 NA
c LOW 1   1.3 LOW  400  1111 LOW
d LOW 2   10  LOW  1000 400  HIGH
How can I replace the NA values using a conditional statement?
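For column E, I want to take the difference between columns C and D: if it is below 0, show "small decrease"; if it is above 0, show "small increase".
Column H should be handled the same way, except using the difference between columns F and G: if it is below 0, show "small decrease"; if it is above 0, show "small increase".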
The final output should look like this:
A B   C   D   E              F    G    H
a LOW 1.5 0.2 Small Increase 1000 2000 Small Decrease
b LOW 2.9 0.4 HIGH           6000 1000 Small Increase
c LOW 1   1.3 LOW            400  1111 LOW
d LOW 2   10  LOW            1000 400  HIGH
Do the same for column E as shown here, then repeat the step for the other column:
df$E <- ifelse(is.na(df$E), ifelse(df$C-df$D <0,"small decrease","small increase"), df$E)
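The one-liner above only covers column E. As a rough sketch of how the same idea could be reused for column H, the nested ifelse can be wrapped in a small helper. The name fill_na_by_diff is hypothetical (it is not from the answer), the labels are capitalized to match the expected output in the question, and a difference of exactly 0 is treated as "Small Increase" here, which is an assumption because the question does not say what should happen in that case.

```r
# Sample data copied from the question; E and H are character columns with NA
df <- data.frame(
  A = c("a", "b", "c", "d"),
  B = "LOW",
  C = c(1.5, 2.9, 1, 2),
  D = c(0.2, 0.4, 1.3, 10),
  E = c(NA, "HIGH", "LOW", "LOW"),
  F = c(1000L, 6000L, 400L, 1000L),
  G = c(2000L, 1000L, 1111L, 400L),
  H = c(NA, NA, "LOW", "HIGH"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fill NA entries of `target` with a label derived
# from the sign of (left - right); non-NA entries are kept unchanged.
# Assumption: a difference of exactly 0 falls into "Small Increase".
fill_na_by_diff <- function(target, left, right) {
  label <- ifelse(left - right < 0, "Small Decrease", "Small Increase")
  ifelse(is.na(target), label, target)
}

df$E <- fill_na_by_diff(df$E, df$C, df$D)  # E uses C - D
df$H <- fill_na_by_diff(df$H, df$F, df$G)  # H uses F - G
print(df)
```

This assumes E and H are character vectors; if they were factors, ifelse would return the underlying integer codes, so they would need to be converted with as.character first.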
Here is an option using set from data.table, which should be very efficient because it assigns in place:
library(data.table)
setDT(df1)  # converts 'data.frame' to 'data.table'
# loop through the index of the concerned columns
for (j in c(5L, 8L)) {
  # get the row index of NA for each column
  i1 <- which(is.na(df1[[j]]))
  # get the value to be replaced based on the difference
  val <- c("Small Increase", "Small Decrease")[((df1[[j-2]][i1] - df1[[j-1]][i1]) < 0) + 1]
  # set the NA elements to the above val
  set(df1, i = i1, j = j, value = val)
}
df1
# A B C D E F G H
#1: a LOW 1.5 0.2 Small Increase 1000 2000 Small Decrease
#2: b LOW 2.9 0.4 HIGH 6000 1000 Small Increase
#3: c LOW 1.0 1.3 LOW 400 1111 LOW
#4: d LOW 2.0 10.0 LOW 1000 400 HIGH
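The line that builds val in the loop relies on a logical vector being coerced to 0/1 when 1 is added to it, so that FALSE selects the first label and TRUE the second. A minimal sketch of just that indexing step, using the three differences that occur for the NA cells in the sample data (the variable names labels, diffs, idx and picked are only for illustration):

```r
labels <- c("Small Increase", "Small Decrease")

# Differences for the NA cells in the sample data:
# C - D for row a, F - G for row a, F - G for row b
diffs <- c(1.5 - 0.2, 1000 - 2000, 6000 - 1000)

# (diffs < 0) is FALSE/TRUE; adding 1 turns it into index 1 or 2
idx <- (diffs < 0) + 1
picked <- labels[idx]
print(picked)
```

With this indexing, a difference of exactly 0 gives FALSE and therefore picks "Small Increase"; the question does not define that case, so treat it as a side effect of the approach rather than a stated requirement.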
Data
df1 <- structure(list(A = c("a", "b", "c", "d"), B = c("LOW", "LOW",
"LOW", "LOW"), C = c(1.5, 2.9, 1, 2), D = c(0.2, 0.4, 1.3, 10
), E = c(NA, "HIGH", "LOW", "LOW"), F = c(1000L, 6000L, 400L,
1000L), G = c(2000L, 1000L, 1111L, 400L), H = c(NA, NA, "LOW",
"HIGH")), .Names = c("A", "B", "C", "D", "E", "F", "G", "H"),
class = "data.frame", row.names = c(NA, -4L))