Why isn't my logistic regression model output a factor with 2 levels? (Error: `data` and `reference` should be factors with the same levels.)
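From reading similar questions, I understand that the problem is that `yhat.logisticReg` is not a factor with 2 levels, whereas `training.prepped$TARGET_FLAG` is. I assume this can be fixed by changing either my model or the prediction step so that `yhat.logisticReg` becomes a factor with 2 levels. How can I do that?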
logisticReg = glm(TARGET_FLAG ~ .,
                  data = training.prepped,
                  family = binomial())
yhat.logisticReg = predict(logisticReg, training.prepped, type = "response")
confusionMatrix(yhat.logisticReg, training.prepped$TARGET_FLAG)
Error: `data` and `reference` should be factors with the same levels.
str(training.prepped$TARGET_FLAG)
Factor w/ 2 levels "0","1": 1 1 1 1 1 2 1 2 2 1 ...
str(yhat.logisticReg)
Named num [1:8161] 0.1656 0.2792 0.3717 0.0894 0.272 ...
- attr(*, "names")= chr [1:8161] "1" "2" "3" "4" ...
You probably need to choose a threshold first and then convert your real-valued predictions into binary values, for example:
a <- c(0.2, 0.7, 0.4)
threshold <- 0.5
binary_a <- factor(as.numeric(a>threshold))
str(binary_a)
Factor w/ 2 levels "0","1": 1 2 1
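Applied to a fitted model, the same idea might look like the sketch below. It uses a small simulated data frame (`toy`, with made-up columns `x` and `TARGET_FLAG`) as a stand-in for `training.prepped`, and it assumes a cutoff of 0.5. Passing `levels = levels(reference)` explicitly is a precaution: without it, a threshold that puts every prediction on one side would produce a one-level factor, and the same error would come back.

```r
# Simulated stand-in for training.prepped (hypothetical data, not the asker's)
set.seed(1)
toy <- data.frame(x = rnorm(200))
toy$TARGET_FLAG <- factor(rbinom(200, 1, plogis(toy$x)), levels = c(0, 1))

fit <- glm(TARGET_FLAG ~ ., data = toy, family = binomial())
prob <- predict(fit, toy, type = "response")  # numeric probabilities in [0, 1]

reference <- toy$TARGET_FLAG
threshold <- 0.5  # assumed cutoff; choose one that suits the problem

# Threshold the probabilities and reuse the reference's levels
pred_class <- factor(as.numeric(prob > threshold), levels = levels(reference))

# Base-R confusion table; caret::confusionMatrix(pred_class, reference)
# takes the same two factors as its first two arguments
print(table(Predicted = pred_class, Actual = reference))
```

With both arguments being factors that share the levels "0" and "1", the level mismatch that triggered the error no longer occurs.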
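The `confusionMatrix` function from the caret library implements several metrics. Through its `overall` element you can get the accuracy. If you want a different metric, check whether it is already implemented and use that one instead. The loop below tries each predicted probability as a threshold and keeps the one with the highest accuracy.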
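library(caret)
reference = training.prepped$TARGET_FLAG
acc = c()
# Try each predicted probability as a candidate threshold
for(value in yhat.logisticReg)
{
  # Convert probabilities to a factor with the same levels as the reference
  predictions <- factor(ifelse(yhat.logisticReg <= value, 0, 1), levels = levels(reference))
  # Compare predictions against the true labels, not against the probabilities
  confusion_matrix = confusionMatrix(predictions, reference)
  acc = c(acc, confusion_matrix$overall["Accuracy"])
}
best_acc = max(acc)
best_threshold = yhat.logisticReg[which.max(acc)]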
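Calling `confusionMatrix` once per observation can be slow with thousands of rows. If accuracy is the only metric needed, a lighter variant is to sweep a fixed grid of thresholds and compute accuracy directly in base R. The following is a sketch under assumptions: `prob` and `reference` are simulated stand-ins for `yhat.logisticReg` and `training.prepped$TARGET_FLAG`, `accuracy_at` is a hypothetical helper, and the grid spacing of 0.01 is arbitrary.

```r
# Simulated stand-ins (hypothetical data, not the asker's)
set.seed(42)
x <- rnorm(500)
reference <- factor(rbinom(500, 1, plogis(2 * x)), levels = c(0, 1))
prob <- plogis(2 * x)  # plays the role of yhat.logisticReg

# Candidate thresholds on an evenly spaced grid (spacing is an arbitrary choice)
thresholds <- seq(0.01, 0.99, by = 0.01)

# Accuracy at one threshold: share of predictions that match the reference
accuracy_at <- function(t) {
  pred <- factor(as.numeric(prob > t), levels = levels(reference))
  mean(pred == reference)
}

acc <- vapply(thresholds, accuracy_at, numeric(1))
best_threshold <- thresholds[which.max(acc)]
best_acc <- max(acc)
```

Because the threshold is tuned on the same data the model was fitted to, the resulting accuracy is likely optimistic; a held-out set would give a fairer estimate.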