Difference between imblearn's Pipeline and sklearn's Pipeline
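I want to use `sklearn.pipeline` instead of `imblearn.pipeline` to incorporate `RandomUnderSampler()`. My original data requires missing-value imputation and scaling. Here I use the breast cancer data as an example. However, it gives me the error message below. I would appreciate any suggestions. Thank you for your time!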
from numpy.random import seed
seed(12)
from sklearn.datasets import load_breast_cancer
import time
from sklearn.metrics import make_scorer
from imblearn.metrics import geometric_mean_score
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MaxAbsScaler
from imblearn.under_sampling import RandomUnderSampler
gmean = make_scorer(geometric_mean_score, greater_is_better=True)
X, y = load_breast_cancer(return_X_y=True)
start_time1 = time.time()
scoring = {'G-mean': gmean}
LR_pipe = Pipeline([
    ("impute", SimpleImputer(strategy='constant', fill_value=0)),
    ("scale", MaxAbsScaler()),
    ("rus", RandomUnderSampler()),
    ("LR", LogisticRegression(solver='lbfgs', random_state=0, class_weight='balanced', max_iter=100000)),
])
LRscores = cross_validate(LR_pipe, X, y, cv=5, scoring=scoring)
end_time1 = time.time()
print("Computational time in seconds = " + str(end_time1 - start_time1))
sorted(LRscores.keys())
LR_Gmean = LRscores['test_G-mean'].mean()
print("G-mean: %f" % (LR_Gmean))
Error message:
TypeError: All intermediate steps should be transformers and implement fit and transform or be the string 'passthrough' 'RandomUnderSampler()' (type <class 'imblearn.under_sampling._prototype_selection._random_under_sampler.RandomUnderSampler'>) doesn't
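We should import `make_pipeline` (or `Pipeline`) from `imblearn.pipeline` rather than from `sklearn.pipeline`. The sklearn version requires every intermediate step to be a transformer that implements `fit` and `transform`. `RandomUnderSampler` is a sampler: it implements `fit_resample` instead of `transform`, which is why sklearn rejects it, while the imblearn pipeline accepts samplers as intermediate steps.

Also note that `from sklearn.pipeline import Pipeline` conflicts with `from imblearn.pipeline import Pipeline`! Both bind the same name, so whichever import runs last is the one the code actually uses.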