Repeated FeatureUnion in scikit-learn
I am learning about pipelines and FeatureUnion in scikit-learn, and I was wondering whether 'make_union' can be applied repeatedly to the same class.
Consider the following code:
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.linear_model import LogisticRegression
import sklearn.datasets as d

class IrisDataManupulation(BaseEstimator, TransformerMixin):
    """
    Raise every entry of the feature matrix to a given power
    """
    def __init__(self, power=2):
        self.power = power

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.power(X, self.power)

iris_data = d.load_iris()
X, y = iris_data.data, iris_data.target

# feature union:
fu = FeatureUnion(transformer_list=[('squared', IrisDataManupulation(power=2)),
                                    ('third', IrisDataManupulation(power=3))])
Question
Is there a neat way to create the FeatureUnion without repeating the same transformer, by passing a list of parameters instead?
For example:
# Hypothetical syntax only: FeatureUnion does not accept a param_grid argument
fu_new = FeatureUnion(transformer_list=[('raise_power', IrisDataManupulation())],
                      param_grid={'raise_power__power': [2, 3]})
You can move all of the power work into a single custom transformer. We can change your IrisDataManupulation so that it handles a list of powers itself:
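class IrisDataManupulation(BaseEstimator, TransformerMixin):
    # A tuple default avoids sharing one mutable list between instances
    def __init__(self, powers=(2,)):
        self.powers = powers

    # Nothing to learn, but fit must exist so that fit_transform works
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # Raise X to each power and stack the results column-wise
        powered_arrays = []
        for power in self.powers:
            powered_arrays.append(np.power(X, power))
        return np.hstack(powered_arrays)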
Then you can use this new transformer on its own instead of a FeatureUnion:
fu = IrisDataManupulation(powers=[2,3])
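As a quick check, the sketch below compares the single multi-power transformer with a FeatureUnion built from a list comprehension. The comprehension is an alternative I am adding here, not part of the answer above, and the step names `power_2` and `power_3` are made up for illustration. The `fit` method is included on the assumption that you want `fit_transform` to work.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import load_iris
from sklearn.pipeline import FeatureUnion


class IrisDataManupulation(BaseEstimator, TransformerMixin):
    """Raise X to each power in `powers` and stack the results column-wise."""

    def __init__(self, powers=(2,)):
        self.powers = powers

    def fit(self, X, y=None):
        # Stateless transformer: nothing to learn from the data.
        return self

    def transform(self, X):
        return np.hstack([np.power(X, p) for p in self.powers])


X, y = load_iris(return_X_y=True)
powers = (2, 3)

# Option 1: one transformer that handles every power itself.
single = IrisDataManupulation(powers=powers).fit_transform(X)

# Option 2: keep FeatureUnion, but generate its steps from the list of powers
# (step names are illustrative).
union = FeatureUnion(
    [(f"power_{p}", IrisDataManupulation(powers=(p,))) for p in powers]
)
stacked = union.fit_transform(X)

print(single.shape)                  # (150, 8): 4 iris features x 2 powers
print(np.allclose(single, stacked))  # True: same columns in the same order
```

Both options give the same matrix. The FeatureUnion version keeps each power as a separately named step, which can be convenient if you later want to address the steps individually.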
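Note: if you want to generate polynomial features from your original features, I would recommend looking at PolynomialFeatures, which can generate the powers you want in addition to the interaction terms between features.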
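For comparison, here is a minimal sketch of the PolynomialFeatures route on the same iris data. Choosing `degree=3` and `include_bias=False`, and filtering out the interaction columns through the `powers_` attribute, are my own choices for illustration, not something the answer prescribes.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import PolynomialFeatures

X, y = load_iris(return_X_y=True)

# All monomials of degree 1 to 3 in the 4 iris features, without the constant column.
poly = PolynomialFeatures(degree=3, include_bias=False)
X_poly = poly.fit_transform(X)
print(X_poly.shape)  # (150, 34)

# powers_[i, j] is the exponent of input feature j in output column i.
# A "pure power" column involves exactly one input feature.
pure = (poly.powers_ > 0).sum(axis=1) == 1
X_pure = X_poly[:, pure]
print(X_pure.shape)  # (150, 12): x, x^2 and x^3 for each of the 4 features
```

Note that the pure-power columns also include the original (degree 1) features, so this is not a drop-in replacement for the squared and cubed features alone.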