How to use nested np.where on a pandas DataFrame?
I want to add a custom label column based on the following conditions:
if (df['value1'] == 0) & (df['value2'] == 1) then label1
if (df['value1'] == 0) & (df['value2'] == 0) then label2
if (df['value1'] == 1) & (df['value2'] == 1) then label3
if (df['value1'] == 1) & (df['value2'] == 0) then label4
Expected output:
label_class | other columns
label1 |...
label1 |...
label3 |...
label2 |...
I tried np.where, but I am not sure how to nest it correctly.
Use numpy.select, which takes a list of boolean masks and a matching list of values:
m1 = (df['value1'] ==0) & (df['value2']==1)
m2 = (df['value1'] ==0) & (df['value2']==0)
m3 = (df['value1'] ==1) & (df['value2']==1)
m4 = (df['value1'] ==1) & (df['value2']==0)
labels = ['label1', 'label2', 'label3', 'label4']
df['label_class'] = np.select([m1, m2, m3, m4], labels)
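To make the np.select approach concrete, here is a small self-contained sketch. The sample data and the 'unknown' fallback label are assumptions for illustration, not part of the question. Passing default explicitly is a cautious choice: np.select falls back to 0 when default is omitted, and some recent NumPy versions may raise a TypeError when that integer default is combined with string labels.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row matches none of the four conditions.
df = pd.DataFrame({'value1': [0, 0, 1, 0, 2],
                   'value2': [1, 1, 1, 0, 0]})

conditions = [
    (df['value1'] == 0) & (df['value2'] == 1),
    (df['value1'] == 0) & (df['value2'] == 0),
    (df['value1'] == 1) & (df['value2'] == 1),
    (df['value1'] == 1) & (df['value2'] == 0),
]
labels = ['label1', 'label2', 'label3', 'label4']

# The first matching condition wins; rows matching nothing get the default.
# 'unknown' is an assumed placeholder label.
df['label_class'] = np.select(conditions, labels, default='unknown')
print(df)
```

Because the conditions are evaluated in list order, overlapping masks resolve in favor of the earlier entry.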
Another idea is to build a helper DataFrame with every combination and its label, then attach it to the original DataFrame with a left join:
df1 = pd.DataFrame({'value1':[0,0,1,1], 'value2':[1,0,1,0], 'label_class':labels})
df = df.merge(df1, on=['value1','value2'], how='left')
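The left-join idea can be tried end to end as follows. This is a minimal sketch: the sample frame, the 'other' column, and the name lookup are made up for illustration. The validate='many_to_one' argument is an optional safeguard added here, so that a duplicated key pair in the lookup table raises an error instead of silently multiplying rows.

```python
import pandas as pd

# Hypothetical sample data with an extra column to show it is preserved.
df = pd.DataFrame({'value1': [0, 0, 1, 0],
                   'value2': [1, 1, 1, 0],
                   'other': ['a', 'b', 'c', 'd']})

# Lookup table: one row per (value1, value2) combination.
lookup = pd.DataFrame({'value1': [0, 0, 1, 1],
                       'value2': [1, 0, 1, 0],
                       'label_class': ['label1', 'label2', 'label3', 'label4']})

# how='left' keeps every row of df; validate guards against duplicate keys in lookup.
out = df.merge(lookup, on=['value1', 'value2'], how='left', validate='many_to_one')
print(out)
```

Rows whose key pair is missing from the lookup table would end up with NaN in label_class, which makes unmatched combinations easy to spot.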
A third idea is to map both columns at once through a dictionary keyed by (value1, value2) tuples:
d = {(0, 1): 'label1', (0, 0): 'label2', (1, 1): 'label3', (1, 0): 'label4'}
df['label_class'] = df.set_index(['value1','value2']).index.map(d)
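A runnable version of the dictionary mapping, on assumed sample data, might look like the sketch below. set_index builds a MultiIndex whose entries are (value1, value2) tuples, and mapping that index through the dictionary looks up each tuple in turn.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({'value1': [0, 0, 1, 0],
                   'value2': [1, 1, 1, 0]})

# Keys are (value1, value2) pairs.
d = {(0, 1): 'label1', (0, 0): 'label2', (1, 1): 'label3', (1, 0): 'label4'}

# The mapped index has one entry per row, in the original row order,
# so it can be assigned back positionally as a new column.
mapped = df.set_index(['value1', 'value2']).index.map(d)
df['label_class'] = list(mapped)
print(df)
```

Wrapping the result in list() is a defensive choice made here to keep the assignment purely positional; the shorter form from the answer assigns the mapped index directly.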
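The syntax of np.where is np.where(condition, value_if_true, value_if_false), where condition is a boolean mask. Pass the mask itself rather than df[mask], and do not write if inside the call. To nest, put the next np.where in the value_if_false position:
df['label_class'] = np.where((df['value1'] == 0) & (df['value2'] == 1), 'label1',
                    np.where((df['value1'] == 0) & (df['value2'] == 0), 'label2',
                    np.where((df['value1'] == 1) & (df['value2'] == 1), 'label3',
                    np.where((df['value1'] == 1) & (df['value2'] == 0), 'label4', None))))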
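As a quick check of the nested np.where pattern, the following sketch runs it on a small made-up frame. The data and the helper variable names v1 and v2 are assumptions for illustration. The final None is the fallback for rows that match none of the four conditions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row matches none of the conditions.
df = pd.DataFrame({'value1': [0, 0, 1, 0, 2],
                   'value2': [1, 1, 1, 0, 0]})

v1, v2 = df['value1'], df['value2']

# Each np.where takes a boolean mask; the "else" branch is the next np.where.
df['label_class'] = np.where((v1 == 0) & (v2 == 1), 'label1',
                    np.where((v1 == 0) & (v2 == 0), 'label2',
                    np.where((v1 == 1) & (v2 == 1), 'label3',
                    np.where((v1 == 1) & (v2 == 0), 'label4', None))))
print(df)
```

With more than two or three conditions the nesting gets hard to read, which is the main practical argument for np.select in cases like this one.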