Mean Column by Data

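What I want to achieve is to create a new column named mean that holds the daily mean of value1 (the value column in the DataFrame below), then divide value2 by that daily mean and store the result as value3.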

Here is my DataFrame:

import pandas as pd

df = pd.DataFrame( [
[pd.Timestamp('2019-09-22 00:00:00'), 'device1', 10, 3000],
[pd.Timestamp('2019-09-22 04:00:00'), 'device1', 40, 2000],
[pd.Timestamp('2019-09-22 05:00:00'), 'device1', 45, 1000],
[pd.Timestamp('2019-09-22 06:00:00'), 'device1', 450, 1500],
[pd.Timestamp('2019-09-22 07:00:00'), 'device1', 500, 2000],
[pd.Timestamp('2019-09-22 08:00:00'), 'device1', 550, 3000],
[pd.Timestamp('2019-09-22 15:00:00'), 'device1', 600, 4000],
[pd.Timestamp('2019-09-22 16:00:00'), 'device1', 650, 3000],
[pd.Timestamp('2019-09-22 17:00:00'), 'device1', 700, 2000],
[pd.Timestamp('2019-09-22 21:00:00'), 'device1', 900, 1000],
[pd.Timestamp('2019-09-22 22:00:00'), 'device1', 1000, 1500],
[pd.Timestamp('2019-09-23 05:00:00'), 'device1', 1100, 2000],
[pd.Timestamp('2019-09-23 04:00:00'), 'device1', 1200, 3000],
[pd.Timestamp('2019-09-24 05:00:00'), 'device1', 1100, 2000],
[pd.Timestamp('2019-09-24 04:00:00'), 'device1', 1200, 3000]
],
columns=["devicetimestamp","id","value", "value2"]
)

What I want to end up with is something like this:

devicetimestamp, id, value1, value2, mean, value3
[pd.Timestamp('2019-09-22 00:00:00'), 'device1', 10, 3000, 404.09, 0.134],
[pd.Timestamp('2019-09-22 04:00:00'), 'device1', 40, 2000, 404.09, 0.202],
[pd.Timestamp('2019-09-22 05:00:00'), 'device1', 45, 1000, 404.09, 0.404],
[pd.Timestamp('2019-09-22 06:00:00'), 'device1', 450, 1500, 404.09, 0.269],
[pd.Timestamp('2019-09-22 07:00:00'), 'device1', 500, 2000, 404.09],
[pd.Timestamp('2019-09-22 08:00:00'), 'device1', 550, 3000, 404.09],
[pd.Timestamp('2019-09-22 15:00:00'), 'device1', 600, 4000, 404.09],
[pd.Timestamp('2019-09-22 16:00:00'), 'device1', 650, 3000, 404.09],
[pd.Timestamp('2019-09-22 17:00:00'), 'device1', 700, 2000, 404.09],
[pd.Timestamp('2019-09-22 21:00:00'), 'device1', 900, 1000, 404.09],
[pd.Timestamp('2019-09-22 22:00:00'), 'device1', 1000, 1500, 404.09],
[pd.Timestamp('2019-09-23 05:00:00'), 'device1', 1100, 2000, 1150, 1.04],
[pd.Timestamp('2019-09-23 04:00:00'), 'device1', 1200, 3000, 1150, 0.95],
[pd.Timestamp('2019-09-24 05:00:00'), 'device1', 1100, 2000, 1200, 1.09],
[pd.Timestamp('2019-09-24 04:00:00'), 'device1', 1300, 3000, 1200, 0.92]

I tried doing a groupby.mean and merging the result back in as a new column, but it didn't work.

You want transform('mean'):

df['mean'] = (df.groupby(['id',       # remove if you don't want groupby ID
                  df.devicetimestamp.dt.normalize()]) # normalize gives you the date
                  ['value'].transform('mean')
             )

df['value3'] = df['value2']/df['mean']

Output:

       devicetimestamp       id  value  value2  mean    value3
0  2019-09-22 00:00:00  device1     10    3000   495  6.060606
1  2019-09-22 04:00:00  device1     40    2000   495  4.040404
2  2019-09-22 05:00:00  device1     45    1000   495  2.020202
3  2019-09-22 06:00:00  device1    450    1500   495  3.030303
4  2019-09-22 07:00:00  device1    500    2000   495  4.040404
5  2019-09-22 08:00:00  device1    550    3000   495  6.060606
6  2019-09-22 15:00:00  device1    600    4000   495  8.080808
7  2019-09-22 16:00:00  device1    650    3000   495  6.060606
8  2019-09-22 17:00:00  device1    700    2000   495  4.040404
9  2019-09-22 21:00:00  device1    900    1000   495  2.020202
10 2019-09-22 22:00:00  device1   1000    1500   495  3.030303
11 2019-09-23 05:00:00  device1   1100    2000  1150  1.739130
12 2019-09-23 04:00:00  device1   1200    3000  1150  2.608696
13 2019-09-24 05:00:00  device1   1100    2000  1150  1.739130
14 2019-09-24 04:00:00  device1   1200    3000  1150  2.608696
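
For comparison, here is a minimal sketch of how transform('mean') relates to the groupby.mean-then-merge approach mentioned in the question. It uses a trimmed four-row version of the data above. The `day` helper and the names `via_transform`, `daily`, and `via_merge` are hypothetical, introduced only for illustration. The merge route is one plausible way to make that approach work, not necessarily what the asker tried.

```python
import pandas as pd

# Trimmed sample of the question's data (four rows, two days).
df = pd.DataFrame(
    [
        [pd.Timestamp("2019-09-22 00:00:00"), "device1", 10, 3000],
        [pd.Timestamp("2019-09-22 04:00:00"), "device1", 40, 2000],
        [pd.Timestamp("2019-09-23 05:00:00"), "device1", 1100, 2000],
        [pd.Timestamp("2019-09-23 04:00:00"), "device1", 1200, 3000],
    ],
    columns=["devicetimestamp", "id", "value", "value2"],
)

# Hypothetical helper: each timestamp truncated to midnight, i.e. its date.
day = df["devicetimestamp"].dt.normalize().rename("day")

# transform('mean') returns one value per original row, aligned to df.index,
# so it can be assigned directly as a new column.
via_transform = df.groupby(["id", day])["value"].transform("mean")

# groupby + mean collapses to one row per (id, day); a left merge on the
# same keys is then needed to broadcast the means back onto every row.
daily = (
    df.assign(day=day)
    .groupby(["id", "day"], as_index=False)["value"]
    .mean()
    .rename(columns={"value": "mean"})
)
via_merge = df.assign(day=day).merge(daily, on=["id", "day"], how="left")

print(via_transform.tolist())
print(via_merge["mean"].tolist())
```

Both routes give the same per-row means here. transform skips the intermediate table and the join, which is why it tends to be the shorter option for this kind of broadcast.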