Add two columns values with ',' between them only if none of the cells is null

Question

I have the following dataframe:

>>>name   breakfast  lunch   dinner
0 Zoey    apple      egg     noodels
1 Rena    pear               pasta
2 Shila             tomato  potatoes
3 Daphni coffee             soup 
4 Dufi

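I want to create a new column that holds all the food values each person ate during the same day. I tried using '+' with ',' to separate the words, like this: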

df['food']=df['breakfast']+','+df['lunch']+','+df['dinner']

But wherever there is an empty value, I get a stray ',' in the middle:


>>>name   breakfast  lunch   dinner     food
0 Zoey    apple      egg     noodels    apple,egg,noodels
1 Rena    pear               pasta      pear,,pasta
2 Shila             tomato  potatoes    ,tomato,potatoes
3 Daphni coffee             soup       coffee,,soup
4 Dufi                                 ,,

I want to clean it up so that ',' appears only in the right places, i.e. no separator is added for a null value:

>>>name   breakfast  lunch   dinner     food
0 Zoey    apple      egg     noodels    apple,egg,noodels
1 Rena    pear               pasta      pear,pasta
2 Shila             tomato  potatoes    tomato,potatoes
3 Daphni coffee             soup       coffee,soup
4 Dufi

Is there any way to do this? That is, to specify that an empty cell should not be added, so that no ',' ends up in the wrong place.

Answer 1

If there are no missing values, only empty strings, join just the values that remain after filtering out the empty strings:

cols = ['breakfast','lunch','dinner']
df['food'] = df[cols].apply(lambda x: ','.join(y for y in x if y != ''), axis=1)
print (df)
     name breakfast   lunch    dinner               food
0    Zoey     apple     egg   noodels  apple,egg,noodels
1    Rena      pear             pasta         pear,pasta
2   Shila            tomato  potatoes    tomato,potatoes
3  Daphni    coffee              soup        coffee,soup
4    Dufi
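For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame construction is an assumption (the question never shows how df was built), with the blanks stored as empty strings:

```python
import pandas as pd

# Assumed reconstruction of the question's data; blanks are empty strings.
df = pd.DataFrame({
    'name': ['Zoey', 'Rena', 'Shila', 'Daphni', 'Dufi'],
    'breakfast': ['apple', 'pear', '', 'coffee', ''],
    'lunch': ['egg', '', 'tomato', '', ''],
    'dinner': ['noodels', 'pasta', 'potatoes', 'soup', ''],
})

cols = ['breakfast', 'lunch', 'dinner']

# For each row, keep only the non-empty strings and join them with ','.
df['food'] = df[cols].apply(lambda row: ','.join(v for v in row if v != ''), axis=1)

print(df['food'].tolist())
# ['apple,egg,noodels', 'pear,pasta', 'tomato,potatoes', 'coffee,soup', '']
```

A row with no food at all (Dufi) ends up with an empty string rather than a missing value.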

Or with a list comprehension:

cols = ['breakfast','lunch','dinner']
df['food'] = [','.join(y for y in x if y != '') for x in df[cols].to_numpy()]
print (df)
     name breakfast   lunch    dinner               food
0    Zoey     apple     egg   noodels  apple,egg,noodels
1    Rena      pear             pasta         pear,pasta
2   Shila            tomato  potatoes    tomato,potatoes
3  Daphni    coffee              soup        coffee,soup
4    Dufi

If the blanks are missing values instead, a similar solution relies on the fact that NaN != NaN:

cols = ['breakfast','lunch','dinner']
df['food'] = [','.join(y for y in x if y == y) for x in df[cols].to_numpy()]
print (df)
     name breakfast   lunch    dinner               food
0    Zoey     apple     egg   noodels  apple,egg,noodels
1    Rena      pear     NaN     pasta         pear,pasta
2   Shila       NaN  tomato  potatoes    tomato,potatoes
3  Daphni    coffee     NaN      soup        coffee,soup
4    Dufi       NaN     NaN       NaN
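If you are not sure whether the blanks are empty strings, NaN, or None, the two filters can be combined. The following is only a sketch: join_present is a hypothetical helper name, and the sample data is made up to mix the different kinds of blanks. It uses pd.notna instead of y == y, which also covers None and pd.NA (with pd.NA, the y == y comparison cannot be used in an if condition).

```python
import numpy as np
import pandas as pd

def join_present(values, sep=','):
    """Join values, skipping NaN/None and empty or whitespace-only strings."""
    return sep.join(str(v) for v in values if pd.notna(v) and str(v).strip() != '')

# Made-up sample mixing NaN, None, '' and ' ' as blanks.
df = pd.DataFrame({
    'name': ['Zoey', 'Rena', 'Shila', 'Dufi'],
    'breakfast': ['apple', 'pear', np.nan, None],
    'lunch': ['egg', '', 'tomato', np.nan],
    'dinner': ['noodels', 'pasta', 'potatoes', ' '],
})

cols = ['breakfast', 'lunch', 'dinner']
df['food'] = [join_present(row) for row in df[cols].to_numpy()]

print(df['food'].tolist())
# ['apple,egg,noodels', 'pear,pasta', 'tomato,potatoes', '']
```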

Answer 2

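Use .stack and a groupby on the index.

This assumes your blanks are actually true null values.

Since we don't need the name, we can either add it to the index or drop it; I've added it to the index here.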

df['food'] = df.set_index('name',append=True).stack().groupby(level=0).agg(','.join)

If your blanks are not null values, we can do

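import numpy as np

# ' ' matches cells holding exactly one space; adjust it to whatever your blanks contain
df['food'] = df.replace(' ', np.nan).set_index('name',append=True).stack()\
                       .groupby(level=0).agg(','.join)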

    name breakfast     lunch   dinner               food
0    Zoey     apple       egg  noodels  apple,egg,noodels
1    Rena      pear     pasta      NaN         pear,pasta
2   Shila    tomato  potatoes      NaN    tomato,potatoes
3  Daphni    coffee      soup      NaN        coffee,soup
4    Dufi       NaN       NaN      NaN                NaN
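A self-contained variation on the stack approach, as a sketch only. The data construction is assumed, and two details differ from the snippet above: blanks are matched with a regular expression so that both '' and whitespace-only cells become NaN, and dropna() is called explicitly in case stack() keeps NaN values in your pandas version.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data; blanks are empty strings.
df = pd.DataFrame({
    'name': ['Zoey', 'Rena', 'Shila', 'Daphni', 'Dufi'],
    'breakfast': ['apple', 'pear', '', 'coffee', ''],
    'lunch': ['egg', '', 'tomato', '', ''],
    'dinner': ['noodels', 'pasta', 'potatoes', 'soup', ''],
})

cols = ['breakfast', 'lunch', 'dinner']

stacked = (
    df[cols]
    .replace(r'^\s*$', np.nan, regex=True)  # empty or whitespace-only -> NaN
    .stack()                                # long format: one entry per (row, column)
    .dropna()                               # explicit, so NaN never reaches ','.join
)

# Group by the original row label and join whatever is left in each row.
food = stacked.groupby(level=0).agg(','.join)

# Rows with no food at all (Dufi) are absent from `food` and become NaN here.
df['food'] = food.reindex(df.index)

print(df['food'].tolist()[:4])
# ['apple,egg,noodels', 'pear,pasta', 'tomato,potatoes', 'coffee,soup']
```

Unlike the join-based solutions in Answer 1, a row with nothing to join gets NaN rather than an empty string; follow with .fillna('') if you prefer the latter.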
