RandomSearchCV super slow - troubleshooting performance enhancement
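I have been working on the following script for random forest classification and have run into some performance problems with the randomized search. It takes a very long time to complete, and I am wondering whether I am doing something wrong or whether there is something I could do better to make it faster.
Can anyone suggest speed/performance improvements I could make?
Thanks in advance!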
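import time

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

forest_start_time = time.time()
model = RandomForestClassifier()
# Candidate values that the randomized search samples from
param_grid = {
    'bootstrap': [True, False],
    'max_depth': [80, 90, 100, 110],
    'max_features': [2, 3],
    'min_samples_leaf': [3, 4, 5],
    'min_samples_split': [8, 10, 12],
    'n_estimators': [200, 300, 500, 1000]
}
# available_processor_count, train_features, train_labels, test_features
# and test_labels are defined elsewhere in my script
bestforest = RandomizedSearchCV(estimator = model,
                                param_distributions = param_grid,
                                cv = 3, n_iter = 10,
                                n_jobs = available_processor_count)
bestforest.fit(train_features, train_labels.ravel())
forest_score = bestforest.score(test_features, test_labels.ravel())
print(forest_score)
forest_end_time = time.time()
# Elapsed seconds: end minus start (the reverse order gives a negative value)
forest_duration = forest_end_time - forest_start_time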
The only ways to speed this up are 1) to reduce the number of features and/or 2) to use more CPU cores by setting n_jobs = -1:
bestforest = RandomizedSearchCV(estimator = model,
                                param_distributions = param_grid,
                                cv = 3, n_iter = 10,
                                n_jobs = -1)
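To get a feel for how much work the search in the question asks for, here is a rough back-of-the-envelope sketch. It only counts model fits and trees from the values posted in the question; it does not measure real run time. The helper name search_cost is made up for this illustration, and the "expected trees" figure assumes every combination in the grid is equally likely to be sampled.

```python
from math import prod

# Search space and settings copied from the question.
param_grid = {
    'bootstrap': [True, False],
    'max_depth': [80, 90, 100, 110],
    'max_features': [2, 3],
    'min_samples_leaf': [3, 4, 5],
    'min_samples_split': [8, 10, 12],
    'n_estimators': [200, 300, 500, 1000],
}
n_iter, cv = 10, 3


def search_cost(grid, n_iter, cv):
    """Hypothetical helper: return (grid size, CV fits, expected trees grown)."""
    # Number of distinct parameter combinations in the grid.
    grid_size = prod(len(values) for values in grid.values())
    # The search cannot sample more candidates than the grid contains.
    sampled = min(n_iter, grid_size)
    # Each sampled candidate is fitted once per cross-validation fold.
    cv_fits = sampled * cv
    # Assumption: uniform sampling, so the average forest size is the
    # mean of the listed n_estimators values.
    mean_trees = sum(grid['n_estimators']) / len(grid['n_estimators'])
    return grid_size, cv_fits, cv_fits * mean_trees


grid_size, cv_fits, expected_trees = search_cost(param_grid, n_iter, cv)
print(f"{grid_size} combinations, {cv_fits} cross-validation fits, "
      f"about {expected_trees:.0f} trees expected")
```

So even with only 10 sampled candidates, the search fits 30 forests averaging 500 trees each, plus one more fit of the best candidate on the full training set if refit is left at its default. That is why spreading the fits over all cores, or lowering the n_estimators values, tends to matter most here.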