How to combine different columns in a dataframe using a comprehension in Python

Suppose a dataframe contains:

attacker_1 attacker_2  attacker_3  attacker_4
Lannister   nan         nan         nan
nan         Stark       greyjoy     nan

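I want to create another column named AttackerCombo that aggregates the 4 columns into 1. How would I write this in Python? I have been practicing Python and thought a list comprehension would make sense here, but [list(x) for x in attackers], where attackers is a numpy array of the 4 columns, shows all 4 columns aggregated into 1 column, and I also want to remove all the nans. So instead of each row's result looking like

starknannanlannister
it would look like
stark/lannister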

With a lambda function, you can fill a new column in the dataframe:

df['attackers'] = df[['attacker_1','attacker_2','attacker_3','attacker_4']].apply(lambda x : '{}{}{}{}'.format(*x), axis=1)

You did not specify how to aggregate them. For example, if you want them separated by dashes:

df['attackers'] = df[['attacker_1','attacker_2','attacker_3','attacker_4']].apply(lambda x : '{}-{}-{}-{}'.format(*x), axis=1)
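
For reference, here is a small self-contained sketch of how a format-based approach behaves on the sample data from the question. The DataFrame construction is an assumption about how the data is stored (missing values as NaN), and the name cols is only for illustration. It suggests that missing values come out as the literal text nan, so formatting alone does not drop them:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's sample data; NaN marks a missing attacker.
df = pd.DataFrame({
    "attacker_1": ["Lannister", np.nan],
    "attacker_2": [np.nan, "Stark"],
    "attacker_3": [np.nan, "greyjoy"],
    "attacker_4": [np.nan, np.nan],
})
cols = ["attacker_1", "attacker_2", "attacker_3", "attacker_4"]  # illustrative name

# Format every value in the row, missing or not, separated by dashes.
formatted = df[cols].apply(lambda row: "{}-{}-{}-{}".format(*row), axis=1)
print(formatted.tolist())
# ['Lannister-nan-nan-nan', 'nan-Stark-greyjoy-nan']
```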

I think you need apply with join, removing the NaN values with dropna:

df['attackers'] = df[['attacker_1','attacker_2','attacker_3','attacker_4']] \
                    .apply(lambda x: '/'.join(x.dropna()), axis=1)
print (df)
  attacker_1 attacker_2 attacker_3  attacker_4      attackers
0  Lannister        NaN        NaN         NaN      Lannister
1        NaN      Stark    greyjoy         NaN  Stark/greyjoy
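
To try this end to end, the following is a minimal runnable sketch. It assumes the sample data is stored with NaN for the missing attackers; the DataFrame construction and the name cols are not from the original post:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's sample data.
df = pd.DataFrame({
    "attacker_1": ["Lannister", np.nan],
    "attacker_2": [np.nan, "Stark"],
    "attacker_3": [np.nan, "greyjoy"],
    "attacker_4": [np.nan, np.nan],
})
cols = ["attacker_1", "attacker_2", "attacker_3", "attacker_4"]  # illustrative name

# For each row: drop the missing values, then join what is left with '/'.
df["attackers"] = df[cols].apply(lambda row: "/".join(row.dropna()), axis=1)
print(df["attackers"].tolist())
# ['Lannister', 'Stark/greyjoy']
```

Selecting the four columns explicitly keeps any other columns in the dataframe out of the join.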

If you need an empty string as the separator, use DataFrame.fillna:

df['attackers'] = df[['attacker_1','attacker_2','attacker_3','attacker_4']].fillna('') \
                    .apply(''.join, axis=1)
print (df)
  attacker_1 attacker_2 attacker_3  attacker_4     attackers
0  Lannister        NaN        NaN         NaN     Lannister
1        NaN      Stark    greyjoy         NaN  Starkgreyjoy

Two more solutions with a list comprehension: the first keeps values that pass notnull, the second keeps values that are strings:

df['attackers'] = df[['attacker_1','attacker_2','attacker_3','attacker_4']] \
                    .apply(lambda x: '/'.join([e for e in x if pd.notnull(e)]), axis=1)
print (df)
  attacker_1 attacker_2 attacker_3  attacker_4      attackers
0  Lannister        NaN        NaN         NaN      Lannister
1        NaN      Stark    greyjoy         NaN  Stark/greyjoy


#python 3 - isinstance(e, str), python 2 - isinstance(e, basestring)
df['attackers'] = df[['attacker_1','attacker_2','attacker_3','attacker_4']] \
                    .apply(lambda x: '/'.join([e for e in x if isinstance(e, str)]), axis=1)
print (df)
  attacker_1 attacker_2 attacker_3  attacker_4      attackers
0  Lannister        NaN        NaN         NaN      Lannister
1        NaN      Stark    greyjoy         NaN  Stark/greyjoy
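
Since the question started from [list(x) for x in attackers] over a numpy array, the same filter can also be written as a plain list comprehension with no apply at all. This is only a sketch under the assumption that attackers is the object array of the 4 columns and that the sample data is built as shown; the column name AttackerCombo is taken from the question:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's sample data.
df = pd.DataFrame({
    "attacker_1": ["Lannister", np.nan],
    "attacker_2": [np.nan, "Stark"],
    "attacker_3": [np.nan, "greyjoy"],
    "attacker_4": [np.nan, np.nan],
})

# The 4 columns as a numpy array, one row per battle.
attackers = df[["attacker_1", "attacker_2", "attacker_3", "attacker_4"]].to_numpy()

# Outer comprehension walks the rows; the inner generator keeps only strings,
# which skips the NaN entries (they are floats).
df["AttackerCombo"] = [
    "/".join(v for v in row if isinstance(v, str)) for row in attackers
]
print(df["AttackerCombo"].tolist())
# ['Lannister', 'Stark/greyjoy']
```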