Test set and train set for each fold in caret cross-validation
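I am trying to understand 5-fold cross-validation in the caret package, but I could not find out how to get the training set and test set for each fold, and the similar suggested questions did not cover it either. Suppose I want to cross-validate a random forest; I would do the following: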
library(caret)
set.seed(12)
train_control <- trainControl(method="cv", number=5,savePredictions = TRUE)
rfmodel <- train(Species~., data=iris, trControl=train_control, method="rf")
first_holdout <- subset(rfmodel$pred, Resample == "Fold1")
str(first_holdout)
'data.frame': 90 obs. of 5 variables:
$ pred : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
$ obs : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
$ rowIndex: int 2 3 9 11 25 29 35 36 41 50 ...
$ mtry : num 2 2 2 2 2 2 2 2 2 2 ...
$ Resample: chr "Fold1" "Fold1" "Fold1" "Fold1" ...
Are these 90 observations in Fold1 used as the training set? If so, where is the test set for this fold?
str(rfmodel)
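The fitted model object stores everything in the form shown below. Its control element holds the row indices of the samples used for training and of the corresponding hold-out samples, in index and indexOut respectively.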
names(rfmodel)
# [1] "method" "modelInfo" "modelType" "results" "pred"
# [6] "bestTune" "call" "dots" "metric" "control"
# [11] "finalModel" "preProcess" "trainingData" "resample" "resampledCM"
# [16] "perfNames" "maximize" "yLimits" "times" "levels"
# [21] "terms" "coefnames" "xlevels"
Paths to the train and hold-out sample indices:
# Indexes of Hold Out Sets
rfmodel$control$indexOut
# Indexes of Train Sets for above hold outs
rfmodel$control$index
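To turn those indices into actual data sets, something like the following minimal sketch should work. It assumes caret and randomForest are installed, refits the model from the question, and introduces the names train_rows, test_rows, fold1_train, fold1_test and best_holdout purely for illustration. It also suggests an answer to the "90 observations" part: rfmodel$pred holds hold-out predictions, not training rows, and with savePredictions = TRUE it keeps them for every candidate mtry value, so each hold-out row of Fold1 most likely appears once per mtry value tried.

```r
library(caret)   # assumes the caret and randomForest packages are installed

set.seed(12)
train_control <- trainControl(method = "cv", number = 5, savePredictions = TRUE)
rfmodel <- train(Species ~ ., data = iris, trControl = train_control, method = "rf")

# Position 1 corresponds to "Fold1". Index by position rather than by name,
# because the two lists are not guaranteed to carry the same names.
train_rows <- rfmodel$control$index[[1]]     # rows used to fit the model in fold 1
test_rows  <- rfmodel$control$indexOut[[1]]  # rows held out in fold 1

fold1_train <- iris[train_rows, ]
fold1_test  <- iris[test_rows, ]

# The two sets should be disjoint and together cover every row of iris.
length(intersect(train_rows, test_rows))
nrow(fold1_train) + nrow(fold1_test) == nrow(iris)

# Hold-out predictions for Fold1, one copy per candidate mtry value.
first_holdout <- subset(rfmodel$pred, Resample == "Fold1")
table(first_holdout$mtry)

# Keep only the predictions made with the selected mtry; their rowIndex
# values should be exactly the hold-out rows of fold 1.
best_holdout <- subset(first_holdout, mtry == rfmodel$bestTune$mtry)
setequal(best_holdout$rowIndex, test_rows)
```

If only the predictions for the selected tuning parameters are needed, savePredictions = "final" in trainControl avoids storing the rows for the other mtry values.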