Pandas groupby multiple columns take average of another based on condition
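I'm stuck on this problem, and the similar posts I found have been a bit of a black hole for me. I'm still learning.
I want to take the average of a group where a condition is met. My data looks like this: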
user date Flag Value
0 ron 12/23/2016 'flag' 10
1 ron 12/21/2016 'n/a' 25
2 ron 12/23/2016 'flag' 10
3 ron 12/21/2016 'n/a' 3
4 andy 12/22/2016 'flag' 5
5 andy 12/22/2016 'flag' 1
I want to group by user + Flag and create a new column 'Avg' that takes the average of only the 'flag' rows. So the data would look like this:
user date Flag Value Avg
0 ron 12/23/2016 'flag' 10 10
1 ron 12/21/2016 'n/a' 25 10
2 ron 12/23/2016 'flag' 10 10
3 ron 12/21/2016 'n/a' 3 10
4 andy 12/22/2016 'flag' 5 3
5 andy 12/22/2016 'flag' 1 3
I have something like this, but I've tried a lot of different variations:
groups = sample.groupby(['user','Flag'])
flag = sample.groupby(['user','Flag'])['Value'].transform('mean')
sample.loc[:,'Avg'] = np.select([flag.eq('flag'), groups.transform('mean')])
Thanks for any guidance.
Here is a solution using groupby and map:
# Use "flag" instead of "'flag'" if the values contain no literal quote characters.
flag_means = df[df['Flag'] == "'flag'"].groupby('user')['Value'].mean()
df['Avg'] = df['user'].map(flag_means)
Output:
user date Flag Value Avg
0 ron 12/23/2016 'flag' 10 10
1 ron 12/21/2016 'n/a' 25 10
2 ron 12/23/2016 'flag' 10 10
3 ron 12/21/2016 'n/a' 3 10
4 andy 12/22/2016 'flag' 5 3
5 andy 12/22/2016 'flag' 1 3
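For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame by hand and assumes the Flag values are the plain strings flag and n/a without literal quote characters, which may not match the real data. The second column, Avg_transform, is not part of the answer above; it is an alternative sketch that masks non-flag values and uses transform, and the column name is made up for illustration.

```python
import pandas as pd

# Sample data rebuilt by hand; Flag values are assumed to be plain strings
# (no literal quote characters), which may differ from the real data.
df = pd.DataFrame({
    "user": ["ron", "ron", "ron", "ron", "andy", "andy"],
    "date": ["12/23/2016", "12/21/2016", "12/23/2016",
             "12/21/2016", "12/22/2016", "12/22/2016"],
    "Flag": ["flag", "n/a", "flag", "n/a", "flag", "flag"],
    "Value": [10, 25, 10, 3, 5, 1],
})

# Per-user mean of Value, computed over the flagged rows only.
flag_means = df[df["Flag"] == "flag"].groupby("user")["Value"].mean()

# Look up each row's user in that Series so every row gets its user's mean.
df["Avg"] = df["user"].map(flag_means)

# Alternative (hypothetical column name): turn non-flag values into NaN,
# then take the per-user mean with transform, which skips NaN by default.
is_flag = df["Flag"] == "flag"
df["Avg_transform"] = (
    df["Value"].where(is_flag).groupby(df["user"]).transform("mean")
)

print(df)
```

Both columns come out as floats. A user with no flagged rows at all would get NaN in either version, since there is nothing to average.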