How to execute only a particular part of a scikit-learn pipeline?
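Below is the part of the code relevant to the question. If you need the complete code, a fully reproducible version (with the data available for download) is here: https://github.com/ageron/handson-ml2/blob/master/02_end_to_end_machine_learning_project.ipynb
I have a pipeline: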
prepare_select_and_predict_pipeline = Pipeline([
    ('preparation', full_pipeline),
    ('feature_selection', TopFeatureSelector(feature_importances, k)),
    ('svm_reg', SVR(**rnd_search.best_params_))
])
Now I want to execute only this part of the pipeline above:
('preparation', full_pipeline),
('feature_selection', TopFeatureSelector(feature_importances, k)),
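I tried prepare_select_and_predict_pipeline.fit(housing, housing_labels), but it also runs the SVM step.
In the end, I need to get the same result from the pipeline above as I would from running the following code: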
preparation_and_feature_selection_pipeline = Pipeline([
    ('preparation', full_pipeline),
    ('feature_selection', TopFeatureSelector(feature_importances, k))
])
housing_prepared_top_k_features = preparation_and_feature_selection_pipeline.fit_transform(housing)
How can I do that?
FeatureUnion can do this:
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import SVR

# Group the preparation and feature-selection steps into their own pipeline.
prepare_select_pipeline = Pipeline([
    ('preparation', full_pipeline),
    ('feature_selection', TopFeatureSelector(feature_importances, k))
])
# Wrap that sub-pipeline so it becomes a single named step.
feats = FeatureUnion([('prepare_and_select', prepare_select_pipeline)])
prepare_select_and_predict_pipeline = Pipeline([
    ('feats', feats),
    ('svm_reg', SVR(**rnd_search.best_params_))
])
More information can be found in the scikit-learn documentation for FeatureUnion.
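Once the transformer steps are grouped under one name, that step can be run on its own by looking it up in the outer pipeline. The following is only a minimal sketch of that idea: the toy data, `StandardScaler`, and `SelectKBest` are stand-ins chosen for illustration (they are assumptions, not the question's `full_pipeline` or `TopFeatureSelector`), and the SVR uses default parameters.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Toy regression data (hypothetical; stands in for housing / housing_labels).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 4))
y = X[:, 1] - X[:, 3]

# Stand-ins for the question's preparation and feature-selection steps.
prepare_select = Pipeline([
    ("preparation", StandardScaler()),
    ("feature_selection", SelectKBest(f_regression, k=2)),
])
feats = FeatureUnion([("prepare_and_select", prepare_select)])
full = Pipeline([("feats", feats), ("svm_reg", SVR())])

# Fit and apply only the 'feats' step; the SVR step is never touched.
X_feats = full.named_steps["feats"].fit_transform(X, y)
```

Because `named_steps` returns the step object itself rather than a copy, fitting it this way also changes the state that the full pipeline would use at prediction time.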
You can slice a pipeline like a list (scikit-learn version >= 0.21), so
prepare_select_and_predict_pipeline[:-1].fit_transform(housing)
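should work.
(You need to be careful here: you are refitting the transformer steps of the pipeline, so if you do this on a new dataset, a later call to prepare_select_and_predict_pipeline.predict(X_new) will use the refitted transformers! If needed, you can clone the sliced pipeline into a new variable.)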
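The caveat above can be illustrated with a small sketch. It assumes toy data and uses `StandardScaler` and `SelectKBest` as stand-ins for the question's `full_pipeline` and `TopFeatureSelector` (these substitutions are for illustration only). It shows that a slice shares its step objects with the original pipeline, that `clone` produces independent unfitted copies, and that an already fitted pipeline can be sliced and used with `transform` alone so that nothing is refitted.

```python
import numpy as np
from sklearn.base import clone
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Toy regression data (hypothetical; stands in for housing / housing_labels).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=50)

pipe = Pipeline([
    ("preparation", StandardScaler()),
    ("feature_selection", SelectKBest(f_regression, k=2)),
    ("svm_reg", SVR()),
])

# A slice is a new Pipeline object, but it holds the same step instances.
head = pipe[:-1]
shares_steps = head.named_steps["preparation"] is pipe.named_steps["preparation"]

# clone() returns unfitted copies, so fitting them leaves `pipe` untouched.
independent_head = clone(pipe[:-1])
X_top_k = independent_head.fit_transform(X, y)

# If the full pipeline is already fitted, transform() on the slice applies
# the fitted transformers without refitting anything.
pipe.fit(X, y)
X_top_k_fitted = pipe[:-1].transform(X)
```

Using `transform` on a slice of the fitted pipeline is the safer choice when the goal is only to inspect the features that the final estimator receives.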