How to determine the epoch hyperparameter from a grid search result
I have run a grid search with epochs as one of the hyperparameters. Now that I have picked the best model, how do I find out which epochs value was chosen for that particular model?
Below is the model summary.
Model Details:
==============
H2OBinomialModel: deeplearning
Model ID: dl_grid_model_19
Status of Neuron Layers: predicting Churn, 2-class classification, bernoulli distribution, CrossEntropy loss, 4,226 weights/biases, 44.1 KB, 47,520 training samples, mini-batch size 1
  layer units type             dropout  l1       l2       mean_rate rate_rms momentum mean_weight weight_rms mean_bias bias_rms
1     1    30 Input            0.00 %
2     2    32 RectifierDropout 20.00 %  0.000010 0.000010 0.009995  0.000000 0.501901   -0.011006   0.210611  0.401924 0.136989
3     3    32 RectifierDropout 20.00 %  0.000010 0.000010 0.009995  0.000000 0.501901   -0.035854   0.191687  0.938406 0.041128
4     4    32 RectifierDropout 20.00 %  0.000010 0.000010 0.009995  0.000000 0.501901   -0.029072   0.185352  0.950918 0.043826
5     5    32 RectifierDropout 20.00 %  0.000010 0.000010 0.009995  0.000000 0.501901   -0.057359   0.186863  0.915588 0.060796
6     6     2 Softmax                   0.000010 0.000010 0.009995  0.000000 0.501901    0.122655   0.406789  0.019925 0.175195
H2OBinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on full training frame **
MSE: 0.1946901
RMSE: 0.441237
LogLoss: 0.5731371
Mean Per-Class Error: 0.194215
AUC: 0.8767996
Gini: 0.7535992
Confusion Matrix for F1-optimal threshold:
No Yes Error Rate
No 1755 614 0.259181 =614/2369
Yes 308 2075 0.129249 =308/2383
Totals 2063 2689 0.194024 =922/4752
Maximum Metrics: Maximum metrics at their respective thresholds
metric threshold value idx
1 max f1 0.216316 0.818218 266
2 max f2 0.058723 0.889206 348
3 max f0point5 0.306487 0.801744 216
4 max accuracy 0.217122 0.805976 265
5 max precision 0.730797 1.000000 0
6 max recall 0.006754 1.000000 398
7 max specificity 0.730797 1.000000 0
8 max absolute_mcc 0.216316 0.616944 266
9 max min_per_class_accuracy 0.257957 0.795636 242
10 max mean_per_class_accuracy 0.217122 0.805792 265
Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
H2OBinomialMetrics: deeplearning
** Reported on validation data. **
** Metrics reported on full validation frame **
MSE: 0.1418929
RMSE: 0.3766867
LogLoss: 0.4374728
Mean Per-Class Error: 0.2603761
AUC: 0.8306744
Gini: 0.6613489
Confusion Matrix for F1-optimal threshold:
No Yes Error Rate
No 1075 201 0.157524 =201/1276
Yes 162 284 0.363229 =162/446
Totals 1237 485 0.210801 =363/1722
Maximum Metrics: Maximum metrics at their respective thresholds
metric threshold value idx
1 max f1 0.323830 0.610097 183
2 max f2 0.087110 0.740000 319
3 max f0point5 0.514027 0.608666 94
4 max accuracy 0.514027 0.800232 94
5 max precision 0.668538 0.875000 21
6 max recall 0.011443 1.000000 389
7 max specificity 0.717464 0.999216 0
8 max absolute_mcc 0.323830 0.466764 183
9 max min_per_class_accuracy 0.229876 0.746082 238
10 max mean_per_class_accuracy 0.173814 0.753367 273
Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
The best way to find out how many epochs the model actually used is to look at its score history. E.g., for a model m:
h2o.scoreHistory(m)
(Or, for a graphical version, plot the model: plot(m).)
That may be too much information, so reduce it to show just the epochs:
h2o.scoreHistory(m)[,c("epochs")]
(I've just noticed that h2o.scoreHistory(m)$epochs also works.)
To show the epochs of the final model that was returned:
last( h2o.scoreHistory(m)[,c("epochs")] )
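Putting those pieces together for a grid search, here is a minimal sketch; the grid id "dl_grid" and the sort metric "auc" are illustrative assumptions, not values from your run:
library(h2o)
grid <- h2o.getGrid("dl_grid", sort_by = "auc", decreasing = TRUE)  # "dl_grid" is an assumed grid_id
best <- h2o.getModel(grid@model_ids[[1]])                           # best model by the chosen sort metric
tail(h2o.scoreHistory(best)$epochs, 1)                              # epochs the returned model actually ran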
By the way, if you just print the grid object, you should see epochs as one of the columns, provided it was one of your hyperparameters.
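For example, reusing the grid object from the sketch above:
print(grid)  # one row per model, with an epochs column when epochs was a grid hyperparameter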
To answer the question you didn't ask: look into early stopping. It saves you from having to guess in advance how many epochs you need, and it therefore also removes one hyperparameter from your grid search.
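A hedged sketch of what early stopping looks like for h2o.deeplearning; the predictors variable, the frame names (train, valid), and the specific stopping values are assumptions for illustration:
m <- h2o.deeplearning(
  x = predictors, y = "Churn",            # predictors is a placeholder for your feature column names
  training_frame = train, validation_frame = valid,
  hidden = c(32, 32, 32, 32),
  epochs = 1000,                          # generous upper bound; early stopping cuts it short
  stopping_metric = "logloss",            # watch validation logloss
  stopping_rounds = 5,                    # stop after 5 scoring rounds without improvement
  stopping_tolerance = 1e-3
)
h2o.scoreHistory(m)                       # shows how many epochs were actually run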
You could also simply build one model with the highest epochs value you are considering, and read the scores at each of the other epoch values you are interested in off its score history.
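A short sketch of that approach; it assumes a model m_max trained with epochs set to the largest value you were going to try, and a validation_frame so that validation metrics appear in the history. The column names below are an assumption, so check colnames(h2o.scoreHistory(m_max)) on your own run:
sh <- h2o.scoreHistory(m_max)
sh[, c("epochs", "training_logloss", "validation_logloss")]  # score at each checkpointed epoch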