Featuretools: skip the target feature
Is it possible to skip the target feature when using Featuretools? For example, consider the iris dataset:
import pandas as pd
import featuretools as ft

data = pd.read_csv('https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/639388c2cbc2120a14dcf466e85730eb8be498bb/iris.csv')
target = "species"
# Encode the class labels as integers
data[target] = data[target].map({'setosa':0, 'versicolor':1, 'virginica':2})
# Make an entityset and add the entity
es = ft.EntitySet(id='iris dataset')
es.entity_from_dataframe(entity_id='data', dataframe=data,
                         make_index=True, index='index')
# Run deep feature synthesis with transformation primitives
feature_matrix, feature_defs = ft.dfs(entityset=es, target_entity='data',
                                      trans_primitives=['add_numeric', 'multiply_numeric'])
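The resulting feature_matrix contains useless features such as sepal_width + species, because the integer-encoded target is treated as just another numeric column. How can I remove them?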
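You can use the ignore_variables argument of DFS to ignore the target feature. It takes a dictionary that maps each entity id to the list of variables to leave out.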
feature_matrix, feature_defs = ft.dfs(
    entityset=es,
    target_entity='data',
    trans_primitives=['add_numeric', 'multiply_numeric'],
    ignore_variables={'data': ['species']},
)
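An alternative that is not part of the answer above, offered only as a sketch: pull the target column out of the DataFrame before it is added to the EntitySet, so DFS never sees it. This avoids the keyword altogether, which may be convenient if your installed Featuretools release uses different parameter names. The small inline DataFrame and the names X and y are stand-ins introduced here for illustration; in the question, the frame would come from the iris CSV.

```python
import pandas as pd

# Tiny stand-in for the iris data (values are illustrative only).
data = pd.DataFrame({
    "sepal_length": [5.1, 7.0, 6.3],
    "sepal_width": [3.5, 3.2, 3.3],
    "species": ["setosa", "versicolor", "virginica"],
})
target = "species"

# pop() removes the column from `data` in place and returns it as a Series.
y = data.pop(target).map({"setosa": 0, "versicolor": 1, "virginica": 2})
X = data  # only the predictor columns remain

print(list(X.columns))
print(y.tolist())
```

X can then be passed to the EntitySet in place of data, and y is kept aside for model training. Since the generated feature matrix is indexed by the entity's index, y may need to be aligned to that index before fitting.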