How to force specific levels when a data frame column does not contain that level? (Using R)
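I have columns in a dataset that may contain 0 or 1, but some columns contain only 0.
I want to use these numbers as factors, but I still want every column to have both levels 0 and 1. I tried the code below, but I keep getting an error and I don't understand why...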
#dataframe df has 100 rows
column_list = c("col1", "col2", "col3")
for (col in column_list) {
#convert number 0 and number 1 to factors
# (but sometimes the data only has zeros)
df[,col] <- as.factor(df[,col])
# I want to force levels to be 0 and 1
# this is for when the data is completely missing number 1
levels(df[, col] <- c(0,1)) #give error
# Error in `[<-.data.frame`(`*tmp*`, , col, value = c(0, 1)) :
# replacement has 2 rows, data has 100
print(levels(df[, col]))
#this produces "0" "1" or just "0" depending on the column
}
The line you flagged is where the error is; it is written incorrectly. It should be:
df[, col] <- factor(df[, col], levels = c(0, 1))
You don't even need the line before it.
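You can even avoid the for loop and use lapply, which keeps df a data frame and lets you pass levels through to factor:
df[column_list] <- lapply(df[column_list], factor, levels = c(0, 1))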
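As a minimal sketch of declaring the levels up front, the snippet below builds a small made-up data frame (the values and the four-row size are assumptions for illustration, not the asker's data) and converts every listed column with factor(..., levels = c(0, 1)) in one step.

```r
# Hypothetical example data: col2 contains only zeros
df <- data.frame(
  col1 = c(0, 1, 0, 1),
  col2 = c(0, 0, 0, 0),
  col3 = c(1, 1, 0, 0)
)
column_list <- c("col1", "col2", "col3")

# factor() with explicit levels keeps "1" as a level even when no row has it
df[column_list] <- lapply(df[column_list], factor, levels = c(0, 1))

# Every column now reports both levels
print(lapply(df[column_list], levels))

# table() lists the unused level with a count of zero
print(table(df$col2))
```

Assigning into df[column_list] rather than overwriting df keeps the result a data frame and leaves any other columns untouched.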
I think you just put the ) in the wrong place.
This works:
column_list = c("col1", "col2", "col3")
df <- data.frame(matrix(0, nrow = 100, ncol = 3))
names(df) <- column_list
for (col in column_list) {
#convert number 0 and number 1 to factors
# (but sometimes the data only has zeros)
df[,col] <- as.factor(df[,col])
# I want to force levels to be 0 and 1
# this is for when the data is completely missing number 1
levels(df[, col]) <- c(0,1) #no error anymore
print(levels(df[, col]))
#this now produces "0" "1" for every column
}
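One caveat worth checking before relying on levels(x) <- c(0,1): it renames the existing levels by position rather than declaring which values are allowed. The sketch below uses a made-up column that contains only ones (an assumption, the question only mentions columns with only zeros) to compare it with factor(..., levels = ).

```r
# Hypothetical column that contains only ones
x <- c(1, 1, 1)

# Renaming levels: the single existing level "1" is relabelled to "0"
renamed <- as.factor(x)
levels(renamed) <- c(0, 1)

# Declaring levels up front: the values keep their meaning
declared <- factor(x, levels = c(0, 1))

print(as.character(renamed))
print(as.character(declared))
```

If a column can ever be all ones, the factor(..., levels = c(0, 1)) form is probably the safer choice, since it never changes what a value means.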