Converting multiple columns from factor to numeric returns NAs in R
Suppose we have the following data frame df:
df <- structure(list(`week1` = structure(c(number = 4L,
area1 = 1L, area2 = 2L, price1 = 3L,
price2 = 5L), .Label = c("154.93", "304.69", "3554.50",
"49", "7587.22"), class = "factor"), `week2` = structure(c(number = 3L,
area1 = 1L, area2 = 4L, price1 = 2L,
price2 = 5L), .Label = c("28.12", "2882.91", "30",
"44.24", "4534.47"), class = "factor")), class = "data.frame", row.names = c("number",
"area1", "area2", "price1",
"price2"))
I tried to convert its week1 and week2 columns from factor to numeric with:
cols = c(1, 2)
df[, cols] <- as.numeric(as.character(df[, cols]))
# df[cols] <- lapply(df[cols], as.numeric) # gives incorrect results
But it returns NA, or incorrect results, for those columns. The following code, however, gives the correct answer:
cols = c(1, 2)
df[, cols] = apply(df[, cols], 2, function(x) as.numeric(as.character(x)))
df
Why does the first approach, which works in this (linked) case, produce NAs here? Thanks.
as.character/as.numeric expect a vector as input. With df[, cols] you are passing them a data frame (check class(df[, cols])).
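To make the difference concrete, here is a minimal sketch. It uses a small hypothetical data frame `d` (not the `df` from the question) to show that a multi-column selection is a data frame, and that calling as.numeric directly on a factor returns its internal integer codes rather than the printed values, which is likely why the commented-out lapply variant gave incorrect results.

```r
# Hypothetical two-column data frame of factors (not the question's df)
d <- data.frame(a = factor(c("49", "154.93")),
                b = factor(c("30", "28.12")))

class(d[, 1:2])  # a data frame, not a vector
class(d[[1]])    # a single column is a factor vector

# as.numeric() on a factor returns the internal integer codes
codes <- as.numeric(d$a)

# converting to character first recovers the values that are printed
values <- as.numeric(as.character(d$a))
```

Here `codes` holds the level positions, while `values` holds the numbers you actually wanted.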
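If you are referring to the accepted answer in the link, it says to change the code inside the for loop; it does not suggest passing the whole data frame. To change the class of multiple columns, you can use a for loop, apply, or lapply.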
df[cols] <- lapply(df[cols], function(x) as.numeric(as.character(x)))
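For comparison, the for-loop variant mentioned above could look roughly like this. It is only a sketch on a small hypothetical data frame `d` (the column names mirror the question, the values are a subset chosen for illustration), converting one column at a time so that as.character and as.numeric always receive a vector.

```r
# Hypothetical data frame with factor columns (illustrative values only)
d <- data.frame(week1 = factor(c("49", "154.93")),
                week2 = factor(c("30", "28.12")))

cols <- c(1, 2)

# Convert each selected column individually; d[[j]] is a vector
for (j in cols) {
  d[[j]] <- as.numeric(as.character(d[[j]]))
}

# Check the resulting column classes
col_classes <- sapply(d, class)
```

Unlike apply, which first coerces the selection to a matrix, the loop and the lapply version both work column by column and leave the data frame structure untouched.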