R Language: Storing Results of a Loop into a Table
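I am using the R programming language. I am trying to learn how to loop a procedure and store the results in a table. For this example, I first generated some data: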
#load libraries
library(caret)
library(rpart)
#generate data
a = rnorm(1000, 10, 10)
b = rnorm(1000, 10, 5)
c = rnorm(1000, 5, 10)
group <- sample( LETTERS[1:2], 1000, replace=TRUE, prob=c(0.5,0.5) )
group_1 <- 1:1000
#put data into a frame
d = data.frame(a,b,c, group, group_1)
d$group = as.factor(d$group)
Then I created the final table in which I want to store the results of the loop:
#create the final results table in which the results of the loop will be stored
final_table = matrix(1, nrow = 6, ncol=2)
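This is the procedure I want to loop. Basically, I want to fit a decision tree model to this data. I want to fit 6 different decision trees: in each one, the variable "group_1" (the response variable) becomes a factor variable ("1" if "group_1 > i", otherwise "0"). The variable "i" takes 6 values (400, 401, 402, 403, 404, 405), so the decision tree is fitted 6 times. I want to store the accuracy of each of these decision trees in "final_table":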
for (i in 400:405)
{
d$group_1 = ifelse(d$group_1 > i, "1","0")
d$group_1 = as.factor(d$group_1)
trainIndex <- createDataPartition(d$group_1, p = .8,
list = FALSE,
times = 1)
training = d[ trainIndex,]
test <- d[-trainIndex,]
fitControl <- trainControl(## 10-fold CV
method = "repeatedcv",
number = 10,
## repeated ten times
repeats = 10)
TreeFit <- train(group_1 ~ ., data = training,
method = "rpart2",
trControl = fitControl)
pred = predict(TreeFit, test, type = "prob")
labels = as.factor(ifelse(pred[,2]>0.5, "1", "0"))
con = confusionMatrix(labels, test$group_1)
#update results into table
row = i - 399
final_table[row,1] = con$overall[1]
final_table[row,2] = i
}
However, this gives me the following error:
Error in na.fail.default(list(group = c(2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, :
missing values in object
In addition: Warning message:
In Ops.factor(d$group_1, i) : ‘>’ not meaningful for factors
Can someone please tell me what I am doing wrong?
Thanks
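You can keep a copy of the original data frame in another variable and use it to reset the modified data frame at the start of each iteration. In your loop, the first pass overwrites group_1 with a factor, so on the second pass the comparison d$group_1 > i is no longer numeric; it returns NA, which is what the warning and the "missing values in object" error are reporting.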
library(caret)
library(rpart)
e <- d
for (i in 400:405) {
d <- e
d$group_1 = as.integer(d$group_1 > i)
d$group_1 = as.factor(d$group_1)
trainIndex <- createDataPartition(d$group_1, p = .8,list = FALSE,times = 1)
training = d[ trainIndex,]
test <- d[-trainIndex,]
fitControl <- trainControl(## 10-fold CV
method = "repeatedcv",
number = 10,
## repeated ten times
repeats = 10)
TreeFit <- train(group_1 ~ ., data = training,
method = "rpart2",
trControl = fitControl)
pred = predict(TreeFit, test, type = "prob")
labels = as.factor(ifelse(pred[,2]>0.5, "1", "0"))
con = confusionMatrix(labels, test$group_1)
#update results into table
row = i - 399
final_table[row,1] = con$overall[1]
final_table[row,2] = i
}
final_table
# [,1] [,2]
#[1,] 0.585 400
#[2,] 0.618 401
#[3,] 0.598 402
#[4,] 0.608 403
#[5,] 0.533 404
#[6,] 0.570 405
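The cause of the error can be reproduced without caret. What follows is only a minimal base R sketch, not the code from the answer above: it shows that ">" on a factor returns NA, and then one possible alternative to copying the data frame, namely leaving the numeric column untouched and building the response in a separate variable on each pass. The names thresholds, response and results are hypothetical, and the "accuracy" stored here is a placeholder (the share of class "1"), standing in for the value that con$overall[1] would supply in the real loop.

```r
# Base R only; no model is fitted here.
group_1 <- 1:1000

# Why the original loop failed: after the first pass group_1 is a factor,
# and `>` on an unordered factor returns NA (with the warning shown in the question).
f <- as.factor(ifelse(group_1 > 400, "1", "0"))
cmp <- suppressWarnings(f > 401)
all(is.na(cmp))

# Alternative sketch: never overwrite the numeric column.
# `thresholds`, `response` and `results` are illustrative names.
thresholds <- 400:405
results <- data.frame(threshold = thresholds, accuracy = NA_real_)

for (k in seq_along(thresholds)) {
  # Recompute the 0/1 response from the untouched numeric vector each time.
  response <- factor(as.integer(group_1 > thresholds[k]), levels = c(0, 1))

  # Placeholder value: in the real loop this would be con$overall[1]
  # taken from confusionMatrix(); here it is just the share of class "1".
  results$accuracy[k] <- mean(response == "1")
}

results
```

Indexing rows with seq_along(thresholds) also avoids the hard-coded i - 399 offset, so the loop keeps working if the range of thresholds changes.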