How to write seed_features that include a conditional statement

I am trying to write a seed feature that yields reward if place == 1 else 0.

place and reward are both ft.variable_types.Numeric:

Entity: results
  Variables:
    id (dtype: index)
    place (dtype: numeric)
    reward (dtype: numeric)

I tried the following alternatives, without success:

Alternative 1

roi = (ft.Feature(es['results']['reward'])
       if (ft.Feature(es['results']['place']) == 1)
       else 0).rename('roi')

This produces AssertionError: Column "roi" missing frome dataframe when generating the features.

Alternative 2

roi = ((ft.Feature(es['results']['place']) == 1) *
       ft.Feature(es['results']['reward'])).rename('roi')

This produces AssertionError: Provided inputs don't match input type requirements when assigning the seed feature.

Alternative 2 should work, because in Python:

>>> True * 3.14
3.14
>>> False * 3.14
0.0
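As a quick row-wise check of that reasoning, here is a minimal pure-Python sketch. The sample values are made up for illustration and no featuretools calls are involved; it only shows that multiplying by the boolean gives the same result as the conditional expression.

```python
# Hypothetical sample rows; the values are made up for illustration.
place = [1, 2, 1, 3]
reward = [3.14, 2.0, 0.5, 7.0]

# Conditional form: reward if place == 1 else 0
conditional = [r if p == 1 else 0 for p, r in zip(place, reward)]

# Arithmetic form: (place == 1) * reward, relying on True == 1 and False == 0
arithmetic = [(p == 1) * r for p, r in zip(place, reward)]

print(conditional == arithmetic)  # the two forms agree row by row
```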

Full stack trace:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-211-94dd07d98076> in <module>()
     23 
     24
---> 25 roi = ((ft.Feature(es['results']['place']) == 1) * ft.Feature(es['results']['reward'])).rename('roi')

~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in __mul__(self, other)
    287     def __mul__(self, other):
    288         """Multiply by other"""
--> 289         return self._handle_binary_comparision(other, primitives.MultiplyNumeric, primitives.MultiplyNumericScalar)
    290 
    291     def __rmul__(self, other):

~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in _handle_binary_comparision(self, other, Primitive, PrimitiveScalar)
    230     def _handle_binary_comparision(self, other, Primitive, PrimitiveScalar):
    231         if isinstance(other, FeatureBase):
--> 232             return Feature([self, other], primitive=Primitive)
    233 
    234         return Feature([self], primitive=PrimitiveScalar(other))

~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in __new__(self, base, entity, groupby, parent_entity, primitive, use_previous, where)
    755                                                primitive=primitive,
    756                                                groupby=groupby)
--> 757             return TransformFeature(base, primitive=primitive)
    758 
    759         raise Exception("Unrecognized feature initialization")

~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in __init__(self, base_features, primitive, name)
    660                                                relationship_path=RelationshipPath([]),
    661                                                primitive=primitive,
--> 662                                                name=name)
    663 
    664     @classmethod

~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in __init__(self, entity, base_features, relationship_path, primitive, name, names)
     56         self._names = names
     57 
---> 58         assert self._check_input_types(), ("Provided inputs don't match input "
     59                                            "type requirements")
     60 

AssertionError: Provided inputs don't match input type requirements

This should work in featuretools v0.11.0. Here is an example using a demo dataset. unit_price and total are both numeric.

import featuretools as ft

es = ft.demo.load_retail(nrows=100)
es['order_products']
Entity: order_products
  Variables:
    ...
    unit_price (dtype: numeric)
    total (dtype: numeric)
    ...

I create the seed feature.

unit_price = ft.Feature(es['order_products']['unit_price'])
total = ft.Feature(es['order_products']['total'])
seed = ((total == 1) * unit_price).rename('seed')

Then, calculate the feature matrix.

fm, fd = ft.dfs(target_entity='customers', entityset=es, seed_features=[seed])
fm.filter(regex='seed').columns.tolist()[:5]
['SUM(order_products.seed)',
 'STD(order_products.seed)',
 'MAX(order_products.seed)',
 'SKEW(order_products.seed)',
 'MIN(order_products.seed)']

In your case, this would be the seed feature.

place = ft.Feature(es['results']['place'])
reward = ft.Feature(es['results']['reward'])
roi = ((place == 1) * reward).rename('roi')
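As for why Alternative 1 fails, a likely explanation is that Python evaluates the if/else expression once, on the feature object itself, rather than per row. The sketch below illustrates this with a hypothetical stand-in class, LazyColumn, which is not part of the featuretools API; it assumes only that == on a feature returns another feature-like object instead of a plain bool.

```python
class LazyColumn:
    """Hypothetical stand-in for a feature definition (not the featuretools API)."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Return a new object describing the comparison instead of True/False.
        return LazyColumn(f"{self.name} = {other}")

    # Defining __eq__ disables default hashing; restore it (not essential here).
    __hash__ = object.__hash__


place = LazyColumn("place")
reward = LazyColumn("reward")

condition = place == 1
# An object without __bool__ or __len__ is always truthy,
# so the else branch is never taken.
picked = reward if condition else 0

print(picked is reward)
```

Under that assumption, Alternative 1 would simply rename the plain reward feature to roi, which would be consistent with the missing "roi" column error.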

Let me know if this helps.