Replacing NA in column with values in adjacent column

I have a data frame like this:

  A       B        C       D       E       F        G        H
  a     LOW      1.5     0.2      NA    1000     2000       NA
  b     LOW      2.9     0.4    HIGH    6000     1000       NA
  c     LOW        1     1.3     LOW     400     1111      LOW 
  d     LOW        2      10     LOW    1000      400     HIGH

How can I replace the NA values using conditional statements?

For column E, I want to take the difference between columns C and D: if it is below 0, show "small decrease"; if it is above 0, show "small increase".

Then for column H, do the same thing, except using the difference between columns F and G. If it is below 0, show "small decrease"; if it is above 0, show "small increase".

The final output should look like this:

  A       B        C       D                 E       F        G                    H
  a     LOW      1.5     0.2    Small Increase    1000     2000       Small Decrease
  b     LOW      2.9     0.4              HIGH    6000     1000       Small Increase
  c     LOW        1     1.3               LOW     400     1111                  LOW 
  d     LOW        2      10               LOW    1000      400                 HIGH

A similar step can be followed for the other column!

df$E <- ifelse(is.na(df$E), ifelse(df$C-df$D <0,"small decrease","small increase"), df$E)
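To make the "similar step" concrete, here is a minimal base R sketch that applies the same nested `ifelse` idea to both E and H. The helper name `fill_from_diff` is hypothetical (it is not part of the answer above), the sample data is rebuilt from the question, and the labels are capitalised to match the desired output. It assumes E and H are character columns, not factors.

```r
# Sample data rebuilt from the question; stringsAsFactors = FALSE keeps
# E and H as character so that ifelse() returns labels, not factor codes.
df <- data.frame(
  A = c("a", "b", "c", "d"),
  B = "LOW",
  C = c(1.5, 2.9, 1, 2),
  D = c(0.2, 0.4, 1.3, 10),
  E = c(NA, "HIGH", "LOW", "LOW"),
  F = c(1000, 6000, 400, 1000),
  G = c(2000, 1000, 1111, 400),
  H = c(NA, NA, "LOW", "HIGH"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: where `target` is NA, use a label based on the
# sign of x - y; everywhere else keep the existing value.
fill_from_diff <- function(target, x, y) {
  label <- ifelse(x - y < 0, "Small Decrease", "Small Increase")
  ifelse(is.na(target), label, target)
}

df$E <- fill_from_diff(df$E, df$C, df$D)
df$H <- fill_from_diff(df$H, df$F, df$G)
df
```

Note that, as in the one-liner above, a difference of exactly 0 falls into the "increase" branch, since the question does not say what should happen in that case.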

Here is an option using set from data.table, which should be very efficient because it assigns in place.

library(data.table)
setDT(df1)  # converts the 'data.frame' to a 'data.table' by reference
# loop through the indices of the columns concerned (E = 5, H = 8)
for (j in c(5L, 8L)) {
  # get the row indices of the NA values in column j
  i1 <- which(is.na(df1[[j]]))
  # get the replacement value based on the difference of the two preceding columns
  val <- c("Small Increase", "Small Decrease")[((df1[[j-2]][i1] - df1[[j-1]][i1]) < 0) + 1]
  # set the NA elements to the above val, in place
  set(df1, i = i1, j = j, value = val)
}

df1
#   A   B   C    D              E    F    G              H
#1: a LOW 1.5  0.2 Small Increase 1000 2000 Small Decrease
#2: b LOW 2.9  0.4           HIGH 6000 1000 Small Increase
#3: c LOW 1.0  1.3            LOW  400 1111            LOW
#4: d LOW 2.0 10.0            LOW 1000  400           HIGH
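The indexing expression in the loop may look cryptic, so here is a small standalone sketch of the trick. The vector `d` holds made-up differences chosen purely for illustration. Comparing with `< 0` gives a logical vector, and adding 1 turns FALSE/TRUE into the positions 1/2 of the label vector.

```r
labels <- c("Small Increase", "Small Decrease")

# Illustrative differences only (not taken from the data above)
d <- c(1.3, -1000, 5000, 0)

# (d < 0) is FALSE/TRUE; adding 1 coerces it to the index 1 or 2
idx <- (d < 0) + 1

# Index the label vector: non-negative -> first label, negative -> second
labels[idx]
```

This avoids a nested `ifelse` when there are only two outcomes; a zero difference maps to index 1, which is the "increase" label.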

Data

df1 <- structure(list(A = c("a", "b", "c", "d"), B = c("LOW", "LOW", 
"LOW", "LOW"), C = c(1.5, 2.9, 1, 2), D = c(0.2, 0.4, 1.3, 10
), E = c(NA, "HIGH", "LOW", "LOW"), F = c(1000L, 6000L, 400L, 
1000L), G = c(2000L, 1000L, 1111L, 400L), H = c(NA, NA, "LOW", 
"HIGH")), .Names = c("A", "B", "C", "D", "E", "F", "G", "H"), 
class = "data.frame", row.names = c(NA, -4L))