Join column values with ',' between them, skipping any cell that is null

I have the following dataframe:

>>>name   breakfast  lunch   dinner
0 Zoey    apple      egg     noodels
1 Rena    pear               pasta
2 Shila             tomato  potatoes
3 Daphni coffee             soup 
4 Dufi                  

I want to create a new column that holds all the food each person ate that day. I tried using '+' with ',' as the separator between the words, like this:

df['food']=df['breakfast']+','+df['lunch']+','+df['dinner']

But wherever there is a null value, I end up with a stray ',' in the result:


>>>name   breakfast  lunch   dinner     food
0 Zoey    apple      egg     noodels    apple,egg,noodels
1 Rena    pear               pasta      pear,,pasta
2 Shila             tomato  potatoes    ,tomato,potatoes
3 Daphni coffee             soup       coffee,,soup
4 Dufi                                  ,,

I want to clean this up so that ',' only appears in the right places, i.e. nothing is added where a value is null:

>>>name   breakfast  lunch   dinner     food
0 Zoey    apple      egg     noodels    apple,egg,noodels
1 Rena    pear               pasta      pear,pasta
2 Shila             tomato  potatoes    tomato,potatoes
3 Daphni coffee             soup       coffee,soup
4 Dufi                  

Is there any way to do this? That is, to specify that an empty cell should not be added, so that no ',' ends up in the wrong place?

If there are no missing values, only empty strings, join only the values that remain after filtering out the empty strings:

cols = ['breakfast','lunch','dinner']
df['food'] = df[cols].apply(lambda x: ','.join(y for y in x if y != ''), axis=1)
print (df)
     name breakfast   lunch    dinner               food
0    Zoey     apple     egg   noodels  apple,egg,noodels
1    Rena      pear             pasta         pear,pasta
2   Shila            tomato  potatoes    tomato,potatoes
3  Daphni    coffee              soup        coffee,soup
4   Dufi                                                

Or with a list comprehension:

cols = ['breakfast','lunch','dinner']
df['food'] = [','.join(y for y in x if y != '') for x in df[cols].to_numpy()]
print (df)
     name breakfast   lunch    dinner               food
0    Zoey     apple     egg   noodels  apple,egg,noodels
1    Rena      pear             pasta         pear,pasta
2   Shila            tomato  potatoes    tomato,potatoes
3  Daphni    coffee              soup        coffee,soup
4   Dufi                                                
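
To try the empty-string approach end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, and it assumes the blank cells are empty strings (`''`), not NaN:

```python
import pandas as pd

# Sample data rebuilt from the question; blanks are assumed to be empty strings.
df = pd.DataFrame({
    "name": ["Zoey", "Rena", "Shila", "Daphni", "Dufi"],
    "breakfast": ["apple", "pear", "", "coffee", ""],
    "lunch": ["egg", "", "tomato", "", ""],
    "dinner": ["noodels", "pasta", "potatoes", "soup", ""],
})

cols = ["breakfast", "lunch", "dinner"]

# For each row, keep only the non-empty strings and join them with ','.
df["food"] = [",".join(y for y in row if y != "") for row in df[cols].to_numpy()]

print(df["food"].tolist())
```

A row whose cells are all empty (Dufi) gets an empty string in `food`, so no stray ',' is produced.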

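If the blanks are missing values (NaN) instead, the solution is similar and only needs the trick that NaN != NaN, so `y == y` is False exactly for the missing cells: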

cols = ['breakfast','lunch','dinner']
df['food'] = [','.join(y for y in x if y == y) for x in df[cols].to_numpy()]
print (df)
     name breakfast   lunch    dinner               food
0    Zoey     apple     egg   noodels  apple,egg,noodels
1    Rena      pear     NaN     pasta         pear,pasta
2   Shila       NaN  tomato  potatoes    tomato,potatoes
3  Daphni    coffee     NaN      soup        coffee,soup
4   Dufi        NaN     NaN       NaN                   
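
If the data may contain a mix of NaN, None and blank strings, one possible way to combine the two filters above is a small helper. This is only a sketch: `join_present` is a hypothetical name, and the sample data is made up to exercise each kind of blank:

```python
import numpy as np
import pandas as pd

def join_present(values, sep=","):
    """Join the values that are neither missing (NaN/None) nor blank strings."""
    return sep.join(
        str(v) for v in values
        if pd.notna(v) and str(v).strip() != ""
    )

# Made-up variant of the question's data mixing NaN, None, '' and ' '.
df = pd.DataFrame({
    "name": ["Zoey", "Rena", "Shila", "Daphni", "Dufi"],
    "breakfast": ["apple", "pear", np.nan, "coffee", np.nan],
    "lunch": ["egg", "", "tomato", None, np.nan],
    "dinner": ["noodels", "pasta", "potatoes", "soup", " "],
})

cols = ["breakfast", "lunch", "dinner"]
df["food"] = [join_present(row) for row in df[cols].to_numpy()]

print(df["food"].tolist())
```

`pd.notna` is used here instead of the `y == y` trick only because it also covers `None`; for data that contains nothing but strings and NaN the two behave the same.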

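Using .stack and groupby on the index

This assumes your blanks are actually true null values.

Since we don't need the name, we can either add it to the index or drop it; I've added it to the index here.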

df['food'] = df.set_index('name',append=True).stack().groupby(level=0).agg(','.join)

If your blanks are not null values, we can first replace them with NaN:

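import numpy as np
# treat empty or whitespace-only cells as missing values before stacking
df['food'] = df.replace(r'^\s*$', np.nan, regex=True).set_index('name',append=True).stack()\
                       .groupby(level=0).agg(','.join)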

    name breakfast     lunch   dinner               food
0    Zoey     apple       egg  noodels  apple,egg,noodels
1    Rena      pear     pasta      NaN         pear,pasta
2   Shila    tomato  potatoes      NaN    tomato,potatoes
3  Daphni    coffee      soup      NaN        coffee,soup
4    Dufi       NaN       NaN      NaN                NaN
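
For reference, a self-contained sketch of the stack/groupby idea. It is a variation on the code above, not the exact same call chain: it selects only the food columns instead of moving `name` into the index, it assumes the blanks are empty or whitespace-only strings, and it calls `dropna()` explicitly because newer pandas versions may keep NaN when stacking:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; blanks are assumed to be empty strings.
df = pd.DataFrame({
    "name": ["Zoey", "Rena", "Shila", "Daphni", "Dufi"],
    "breakfast": ["apple", "pear", "", "coffee", ""],
    "lunch": ["egg", "", "tomato", "", ""],
    "dinner": ["noodels", "pasta", "potatoes", "soup", ""],
})

cols = ["breakfast", "lunch", "dinner"]

stacked = (
    df[cols]
    .replace(r"^\s*$", np.nan, regex=True)  # blank cells become NaN
    .stack()                                # one entry per (row, column) pair
    .dropna()                               # make sure missing cells are gone
)

# Group by the original row label and join what is left of each row.
df["food"] = stacked.groupby(level=0).agg(",".join)

print(df["food"].tolist())
```

Rows with nothing left after dropping the blanks (Dufi) do not appear in the grouped result, so index alignment on assignment leaves NaN in `food` for them, rather than an empty string.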