Add two columns' values with ',' between them only if none of the cells is null
I have the following dataframe:
>>>  name    breakfast  lunch   dinner
0    Zoey    apple      egg     noodels
1    Rena    pear               pasta
2    Shila              tomato  potatoes
3    Daphni  coffee             soup
4    Dufi
I want to create a new column that contains all the food values each person ate that day. I tried using '+' with ',' to separate the words, like this:
df['food']=df['breakfast']+','+df['lunch']+','+df['dinner']
But where there are null values, I get stray ',' in the result:
>>>  name    breakfast  lunch   dinner    food
0    Zoey    apple      egg     noodels   apple,egg,noodels
1    Rena    pear               pasta     pear,,pasta
2    Shila              tomato  potatoes  ,tomato,potatoes
3    Daphni  coffee             soup      coffee,,soup
4    Dufi                                 ,,
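To make the problem reproducible, here is a minimal sketch that rebuilds the frame and the naive concatenation. It assumes the blank cells are empty strings rather than NaN; the question does not say which, and the variable name `naive` is only illustrative.

```python
import pandas as pd

# Assumption: blank cells are empty strings, not NaN.
df = pd.DataFrame({
    "name": ["Zoey", "Rena", "Shila", "Daphni", "Dufi"],
    "breakfast": ["apple", "pear", "", "coffee", ""],
    "lunch": ["egg", "", "tomato", "", ""],
    "dinner": ["noodels", "pasta", "potatoes", "soup", ""],
})

# Plain '+' always inserts both separators, even around empty cells.
naive = df["breakfast"] + "," + df["lunch"] + "," + df["dinner"]
print(naive.tolist())
```

Every row with a blank cell ends up with a leading, trailing, or doubled comma, which is what the question wants to avoid.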
I want to clean this up so that ',' only appears in the right places, i.e. no separator is added where a value is null:
>>>  name    breakfast  lunch   dinner    food
0    Zoey    apple      egg     noodels   apple,egg,noodels
1    Rena    pear               pasta     pear,pasta
2    Shila              tomato  potatoes  tomato,potatoes
3    Daphni  coffee             soup      coffee,soup
4    Dufi
Is there a way to do this, i.e. to skip empty cells so that no ',' ends up in the wrong place?
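If there are no missing values, only empty strings, the solution is to filter out the empty strings and join only the remaining values:
cols = ['breakfast','lunch','dinner']
df['food'] = df[cols].apply(lambda x: ','.join(y for y in x if y != ''), axis=1)
print(df)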
     name    breakfast  lunch   dinner    food
0    Zoey    apple      egg     noodels   apple,egg,noodels
1    Rena    pear               pasta     pear,pasta
2    Shila              tomato  potatoes  tomato,potatoes
3    Daphni  coffee             soup      coffee,soup
4    Dufi
Or a list comprehension:
cols = ['breakfast','lunch','dinner']
df['food'] = [','.join(y for y in x if y != '') for x in df[cols].to_numpy()]
print(df)
     name    breakfast  lunch   dinner    food
0    Zoey    apple      egg     noodels   apple,egg,noodels
1    Rena    pear               pasta     pear,pasta
2    Shila              tomato  potatoes  tomato,potatoes
3    Daphni  coffee             soup      coffee,soup
4    Dufi
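The two variants above can be checked against each other with a small self-contained sketch. The frame is rebuilt under the assumption that blanks are empty strings, and the names `via_apply` and `via_listcomp` are only illustrative.

```python
import pandas as pd

# Assumption: blank cells are empty strings, not NaN.
df = pd.DataFrame({
    "name": ["Zoey", "Rena", "Shila", "Daphni", "Dufi"],
    "breakfast": ["apple", "pear", "", "coffee", ""],
    "lunch": ["egg", "", "tomato", "", ""],
    "dinner": ["noodels", "pasta", "potatoes", "soup", ""],
})
cols = ["breakfast", "lunch", "dinner"]

# Row-wise apply: join only the non-empty strings of each row.
via_apply = df[cols].apply(
    lambda row: ",".join(v for v in row if v != ""), axis=1
)

# Same filter as a list comprehension over the underlying array.
via_listcomp = [
    ",".join(v for v in row if v != "") for row in df[cols].to_numpy()
]

df["food"] = via_listcomp
print(df)
```

Both produce the same strings; the list comprehension simply skips the per-row overhead of `apply`.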
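If the blanks are missing values (NaN), the solution is similar; it only relies on the fact that NaN != NaN, so the test y == y is False for NaN:
cols = ['breakfast','lunch','dinner']
df['food'] = [','.join(y for y in x if y == y) for x in df[cols].to_numpy()]
print(df)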
     name    breakfast  lunch   dinner    food
0    Zoey    apple      egg     noodels   apple,egg,noodels
1    Rena    pear       NaN     pasta     pear,pasta
2    Shila   NaN        tomato  potatoes  tomato,potatoes
3    Daphni  coffee     NaN     soup      coffee,soup
4    Dufi    NaN        NaN     NaN
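If it is unclear whether the blanks are empty strings or NaN, one filter can cover both cases. The following is a sketch under that assumption; the helper name `join_present` and the mixed sample data are hypothetical and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mixing both kinds of blanks: "" and NaN.
df = pd.DataFrame({
    "name": ["Rena", "Shila", "Dufi"],
    "breakfast": ["pear", np.nan, ""],
    "lunch": ["", "tomato", np.nan],
    "dinner": ["pasta", "potatoes", ""],
})
cols = ["breakfast", "lunch", "dinner"]


def join_present(values, sep=","):
    """Join the values that are neither missing (NaN/None) nor empty strings."""
    return sep.join(v for v in values if pd.notna(v) and v != "")


df["food"] = [join_present(row) for row in df[cols].to_numpy()]
print(df)
```

`pd.notna` is used here instead of the `y == y` trick only because it states the intent more directly; for NaN the two tests agree.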
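Using .stack and groupby on the index.
This assumes your blanks are actually true null values.
Since we don't need the name, we can either add it to the index or drop it; I've added it to the index here.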
df['food'] = df.set_index('name',append=True).stack().groupby(level=0).agg(','.join)
If your blanks are not null values, we can do:
df.replace(' ', np.nan).set_index('name',append=True).stack()\
.groupby(level=0).agg(','.join)
     name    breakfast  lunch     dinner   food
0    Zoey    apple      egg       noodels  apple,egg,noodels
1    Rena    pear       pasta     NaN      pear,pasta
2    Shila   tomato     potatoes  NaN      tomato,potatoes
3    Daphni  coffee     soup      NaN      coffee,soup
4    Dufi    NaN        NaN       NaN      NaN
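For completeness, here is a self-contained sketch of the stack and groupby idea. It assumes the blanks are empty strings, so they are converted to NaN first. It selects the meal columns instead of moving `name` into the index, calls `.dropna()` explicitly because, as far as I know, whether `stack` drops missing values by default has differed between pandas versions, and it reindexes so that a row with no food at all gets an empty string rather than NaN. These three choices are mine, not the original answer's.

```python
import numpy as np
import pandas as pd

# Assumption: blank cells are empty strings, not NaN.
df = pd.DataFrame({
    "name": ["Zoey", "Rena", "Shila", "Daphni", "Dufi"],
    "breakfast": ["apple", "pear", "", "coffee", ""],
    "lunch": ["egg", "", "tomato", "", ""],
    "dinner": ["noodels", "pasta", "potatoes", "soup", ""],
})
cols = ["breakfast", "lunch", "dinner"]

# Long format: one entry per (row, column) pair, with blanks removed.
stacked = df[cols].replace("", np.nan).stack().dropna()

# Join the surviving values per original row label (index level 0).
food = stacked.groupby(level=0).agg(",".join)

# Rows with no values at all are absent from `food`; fill them with "".
df["food"] = food.reindex(df.index, fill_value="")
print(df)
```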