Creating a new column in pandas with respect to the values of other rows

Question

I have some sample data:

column1 column2 column3 column4
  0.       1.      1.      0
  1.       1.      1.      1
  0.       0.      0.      0
  1.       1.      1.      0
  1.       1.      1.      1

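I want to create a new column (output) that shows 1 if all the values in a row of the dataframe are 1, and 0 otherwise.

The sample output looks like this: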

column1 column2 column3 column4  output
  0.       1.      1.      0.     0
  1.       1.      1.      1.     1
  0.       0.      0.      0.     0
  1.       1.      1.      0.     0
  1.       1.      1.      1.     1

Answer 1

You can use np.select():

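import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'column1': [0., 1., 0., 1., 1.],
    'column2': [1., 1., 0., 1., 1.],
    'column3': [1., 1., 0., 1., 1.],
    'column4': [0, 1, 0, 0, 1],
})

# A row qualifies only if every column equals 1
condition = [(df.column1 == 1) & (df.column2 == 1) & (df.column3 == 1) & (df.column4 == 1)]
choices = [1]
df['output'] = np.select(condition, choices, default=0)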
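
Writing out one comparison per column gets long as the number of columns grows. As a rough sketch, the same condition can be built from a list of column names with np.logical_and.reduce. The sample frame is reconstructed from the question's data, and the `cols` list is a hypothetical name introduced for illustration.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (dtypes are an assumption).
df = pd.DataFrame({
    "column1": [0.0, 1.0, 0.0, 1.0, 1.0],
    "column2": [1.0, 1.0, 0.0, 1.0, 1.0],
    "column3": [1.0, 1.0, 0.0, 1.0, 1.0],
    "column4": [0, 1, 0, 0, 1],
})

# Hypothetical list of columns to test; adjust it to your own frame.
cols = ["column1", "column2", "column3", "column4"]

# AND together one boolean array per column instead of typing each comparison.
all_ones = np.logical_and.reduce([df[c].to_numpy() == 1 for c in cols])

df["output"] = np.select([all_ones], [1], default=0)
print(df["output"].tolist())
```

With a single condition, np.select behaves like np.where; it becomes more useful once there are several mutually exclusive conditions with different output values.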

If you have many columns, you can use np.apply_along_axis():

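def ex(x):
    # Return 1 only when every value in the row equals 1.
    # (x.all() == 1.0 would only test that all values are nonzero.)
    return 1 if (x == 1).all() else 0

# axis=1 applies ex to each row of the frame's underlying array
df['output'] = np.apply_along_axis(ex, 1, df)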

Answer 2

If the data contains only the values 0 and 1, use DataFrame.all, because 0 is treated like False and 1 like True:

df['new'] = df.all(axis=1).astype(int)
# Alternative:
# df['new'] = np.where(df.all(axis=1), 1, 0)
print(df)
   column1  column2  column3  column4  new
0      0.0      1.0      1.0        0    0
1      1.0      1.0      1.0        1    1
2      0.0      0.0      0.0        0    0
3      1.0      1.0      1.0        0    0
4      1.0      1.0      1.0        1    1

If other values can occur, compare with 1 explicitly:

df['new'] = df.eq(1).all(axis=1).astype(int)
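
The difference between the two approaches shows up as soon as a value other than 0 or 1 appears: DataFrame.all tests truthiness, so any nonzero number counts as True, whereas eq(1) tests equality with 1. A minimal sketch, using a hypothetical frame `demo` that is not part of the question's data:

```python
import pandas as pd

# Hypothetical frame containing a value (2) that is neither 0 nor 1.
demo = pd.DataFrame({"a": [1, 2, 0], "b": [1, 1, 1]})

# Truthiness check: every nonzero value counts as True.
truthy = demo.all(axis=1).astype(int)

# Equality check: only the value 1 counts.
exact = demo.eq(1).all(axis=1).astype(int)

print(truthy.tolist())  # [1, 1, 0]
print(exact.tolist())   # [1, 0, 0]
```

The second row is where the two results differ: it contains a 2, which is truthy but not equal to 1.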

If you only need to check some of the columns:

cols = ['column1', 'column2', 'column3', 'column4']
df['new'] = df[cols].eq(1).all(axis=1).astype(int)
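
When the columns to check share a naming pattern, the list does not have to be typed by hand. The sketch below assumes a frame that carries an extra `label` column (hypothetical, added here for illustration) alongside the question's data, and selects the columns to test with DataFrame.filter.

```python
import pandas as pd

# The question's data plus a hypothetical "label" column that should not be checked.
df = pd.DataFrame({
    "label": ["a", "b", "c", "d", "e"],
    "column1": [0.0, 1.0, 0.0, 1.0, 1.0],
    "column2": [1.0, 1.0, 0.0, 1.0, 1.0],
    "column3": [1.0, 1.0, 0.0, 1.0, 1.0],
    "column4": [0, 1, 0, 0, 1],
})

# Keep only the columns whose name contains "column".
cols = df.filter(like="column").columns

# 1 where every selected column equals 1 in that row, otherwise 0.
df["new"] = df[cols].eq(1).all(axis=1).astype(int)
print(df["new"].tolist())
```

Selecting the columns before assigning `new` keeps the result column itself, and any non-numeric columns, out of the comparison.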

python

numpy

pandas

data-science