How to write seed_features that include a conditional statement
I am trying to write a seed feature that produces reward if place == 1 else 0. place and reward are both ft.variable_types.Numeric:
Entity: results
Variables:
id (dtype: index)
place (dtype: numeric)
reward (dtype: numeric)
I tried the following alternatives, without success:
Alternative 1
roi = (ft.Feature(es['results']['reward'])
if (ft.Feature(es['results']['place']) == 1)
else 0).rename('roi')
This produces AssertionError: Column "roi" missing frome dataframe when generating the features.
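A likely reason Alternative 1 fails: Python evaluates "x if cond else y" once, when the line runs, and comparing a feature with == appears to return another feature object rather than a bool. Plain objects are truthy by default, so the expression would collapse to the reward feature, merely renamed to "roi", which would explain why no "roi" column is found. The sketch below shows the mechanism with a hypothetical stand-in class (FakeFeature is not part of featuretools), assuming featuretools features do not define custom truthiness.

```python
class FakeFeature:
    """Hypothetical stand-in for a lazy feature object (not featuretools code)."""

    def __init__(self, expr, name=None):
        self.expr = expr          # what the feature would compute
        self.name = name or expr  # column label

    def __eq__(self, other):
        # Like a feature definition, the comparison builds a new object
        # instead of returning True or False.
        return FakeFeature(f"{self.expr} == {other!r}")

    def rename(self, name):
        return FakeFeature(self.expr, name=name)


place = FakeFeature("place")
reward = FakeFeature("reward")

condition = place == 1  # a FakeFeature, which is truthy like any plain object

# The conditional is resolved immediately, so the "else 0" branch is never used
roi = (reward if condition else 0).rename("roi")

print(roi.expr)  # prints "reward": the condition left no trace in the result
```

The row-wise condition therefore has to be expressed through operations on the feature objects themselves, as Alternative 2 attempts, rather than through Python control flow.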
Alternative 2
roi = ((ft.Feature(es['results']['place']) == 1) *
ft.Feature(es['results']['reward'])).rename('roi')
This produces AssertionError: Provided inputs don't match input type requirements when assigning the seed feature.
Alternative 2 should work, because in Python:
>>> True * 3.14
3.14
>>> False * 3.14
0.0
Full stack trace:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-211-94dd07d98076> in <module>()
23
24
---> 25 roi = ((ft.Feature(es['results']['place']) == 1) * ft.Feature(es['results']['reward'])).rename('roi')
~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in __mul__(self, other)
287 def __mul__(self, other):
288 """Multiply by other"""
--> 289 return self._handle_binary_comparision(other, primitives.MultiplyNumeric, primitives.MultiplyNumericScalar)
290
291 def __rmul__(self, other):
~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in _handle_binary_comparision(self, other, Primitive, PrimitiveScalar)
230 def _handle_binary_comparision(self, other, Primitive, PrimitiveScalar):
231 if isinstance(other, FeatureBase):
--> 232 return Feature([self, other], primitive=Primitive)
233
234 return Feature([self], primitive=PrimitiveScalar(other))
~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in __new__(self, base, entity, groupby, parent_entity, primitive, use_previous, where)
755 primitive=primitive,
756 groupby=groupby)
--> 757 return TransformFeature(base, primitive=primitive)
758
759 raise Exception("Unrecognized feature initialization")
~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in __init__(self, base_features, primitive, name)
660 relationship_path=RelationshipPath([]),
661 primitive=primitive,
--> 662 name=name)
663
664 @classmethod
~/dev/venv/lib/python3.6/site-packages/featuretools/feature_base/feature_base.py in __init__(self, entity, base_features, relationship_path, primitive, name, names)
56 self._names = names
57
---> 58 assert self._check_input_types(), ("Provided inputs don't match input "
59 "type requirements")
60
AssertionError: Provided inputs don't match input type requirements
This should work in featuretools v0.11.0. Here is an example using a demo dataset. unit_price and total are both numeric.
import featuretools as ft
es = ft.demo.load_retail(nrows=100)
es['order_products']
Entity: order_products
Variables:
...
unit_price (dtype: numeric)
total (dtype: numeric)
...
I create the seed feature.
unit_price = ft.Feature(es['order_products']['unit_price'])
total = ft.Feature(es['order_products']['total'])
seed = ((total == 1) * unit_price).rename('seed')
Then, compute the feature matrix.
fm, fd = ft.dfs(target_entity='customers', entityset=es, seed_features=[seed])
fm.filter(regex='seed').columns.tolist()[:5]
['SUM(order_products.seed)',
'STD(order_products.seed)',
'MAX(order_products.seed)',
'SKEW(order_products.seed)',
'MIN(order_products.seed)']
In your case, the seed feature would be:
place = ft.Feature(es['results']['place'])
reward = ft.Feature(es['results']['reward'])
roi = ((place == 1) * reward).rename('roi')
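As a sanity check on the arithmetic, the mask-and-multiply logic the question asks for (reward if place == 1 else 0) can be run on plain Python values. The sample place and reward lists below are made up for illustration and do not come from the original dataset.

```python
# Made-up sample rows (illustrative only)
place = [1, 2, 1, 3]
reward = [3.5, 2.0, 1.25, 4.0]

# (p == 1) is a bool; multiplying by it keeps the reward or zeroes it out
roi = [(p == 1) * r for p, r in zip(place, reward)]

# The explicit conditional from the question, applied row by row
expected = [r if p == 1 else 0 for p, r in zip(place, reward)]

print(roi == expected)  # True
```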
Let me know if this helps.