Pandas condition on DataFrame returns TypeError: '>' not supported between instances of 'str' and 'int'
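I am working with a DataFrame in pandas and need to add a new column based on some conditions.
My DataFrame is:

discount  tax  total  subtotal  productid
3         0    20     13        002
10        3    106    94        003
46.49     6    21     20        004

I need to apply some conditions when adding a new column named Class to the DataFrame.
The conditions are:
IF discount > 20 & total > 100 & tax == 0
then Class should be 1,
otherwise it should be 0.
This is what I tried: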
def conditions(s):
    if (s['discount'] > 20) and (s['tax'] == 0) and (s['total'] > 100):
        return 1
    else:
        return 0

df_full['Class'] = df_full.apply(conditions, axis=1)
But it returns the following error:
TypeError: ("'>' not supported between instances of 'str' and 'int'", 'occurred at index 18')
How can I solve this?
Please help me!
Thanks in advance!
I suggest creating a boolean mask and casting it to int, so that `True`s become 1s and `False`s become 0s. Also change `and` to `&`, the bitwise AND:
print(df_full)
   discount  tax  total  subtotal productid
0      3.00    0     20        13       002
1     40.00    0    106        94       003
2     46.49    6     21        20       004
The error means that at least one value in a compared column is a string. You can also check for all non-numeric values:

print(df_full[pd.to_numeric(df_full['discount'], errors='coerce').isnull()])

# convert to numeric; non-numeric values are converted to `NaN`s
df_full['discount'] = pd.to_numeric(df_full['discount'], errors='coerce')

df_full['Class'] = ((df_full['discount'] > 20) &
                    (df_full['tax'] == 0) &
                    (df_full['total'] > 100)).astype(int)
print(df_full)
   discount  tax  total  subtotal productid  Class
0      3.00    0     20        13       002      0
1     40.00    0    106        94       003      1
2     46.49    6     21        20       004      0
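
For completeness, here is a self-contained sketch of the whole approach that can be run as-is. The sample data is made up for illustration (it is not the asker's real df_full): it assumes `discount` was loaded as strings and that one entry, 'n/a', is not a number at all. The name `bad_rows` is also just an illustrative choice.

```python
import pandas as pd

# Hypothetical sample data: 'discount' holds strings, one of them unparseable.
df_full = pd.DataFrame({
    'discount': ['3', '40', '46.49', 'n/a'],
    'tax': [0, 0, 6, 0],
    'total': [20, 106, 21, 150],
})

# Row-wise comparison of a str with an int reproduces the TypeError.
try:
    df_full.apply(lambda s: s['discount'] > 20, axis=1)
    raised = False
except TypeError:
    raised = True

# Parse the column; anything that is not a number becomes NaN.
as_number = pd.to_numeric(df_full['discount'], errors='coerce')

# Rows whose 'discount' could not be parsed, for inspection.
bad_rows = df_full[as_number.isna()]

# Replace the column with its numeric version, then build the mask.
df_full['discount'] = as_number
df_full['Class'] = ((df_full['discount'] > 20) &
                    (df_full['tax'] == 0) &
                    (df_full['total'] > 100)).astype(int)

print(bad_rows)
print(df_full)
```

Note that a comparison against `NaN` evaluates to False, so rows with an unparseable discount end up with Class 0. If that is not what you want, inspect or clean the rows in `bad_rows` before building the mask.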