Mean Column by Data
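What I want to achieve is to create a new column named mean that holds the daily mean of value1 (the column called value in my DataFrame), then divide value2 by that daily mean and store the result as value3.
This is my DataFrame: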
import pandas as pd

df = pd.DataFrame( [
[pd.Timestamp('2019-09-22 00:00:00'), 'device1', 10, 3000],
[pd.Timestamp('2019-09-22 04:00:00'), 'device1', 40, 2000],
[pd.Timestamp('2019-09-22 05:00:00'), 'device1', 45, 1000],
[pd.Timestamp('2019-09-22 06:00:00'), 'device1', 450, 1500],
[pd.Timestamp('2019-09-22 07:00:00'), 'device1', 500, 2000],
[pd.Timestamp('2019-09-22 08:00:00'), 'device1', 550, 3000],
[pd.Timestamp('2019-09-22 15:00:00'), 'device1', 600, 4000],
[pd.Timestamp('2019-09-22 16:00:00'), 'device1', 650, 3000],
[pd.Timestamp('2019-09-22 17:00:00'), 'device1', 700, 2000],
[pd.Timestamp('2019-09-22 21:00:00'), 'device1', 900, 1000],
[pd.Timestamp('2019-09-22 22:00:00'), 'device1', 1000, 1500],
[pd.Timestamp('2019-09-23 05:00:00'), 'device1', 1100, 2000],
[pd.Timestamp('2019-09-23 04:00:00'), 'device1', 1200, 3000],
[pd.Timestamp('2019-09-24 05:00:00'), 'device1', 1100, 2000],
[pd.Timestamp('2019-09-24 04:00:00'), 'device1', 1200, 3000]
],
columns=["devicetimestamp","id","value", "value2"]
)
What I want to end up with is something like this:
devicetimestamp , id , value1 , value2, mean, value3
[pd.Timestamp('2019-09-22 00:00:00'), 'device1', 10, 3000, 404.09, 0.134],
[pd.Timestamp('2019-09-22 04:00:00'), 'device1', 40, 2000, 404.09, 0.202],
[pd.Timestamp('2019-09-22 05:00:00'), 'device1', 45, 1000, 404.09, 0.404],
[pd.Timestamp('2019-09-22 06:00:00'), 'device1', 450, 1500, 404.09, 0.269],
[pd.Timestamp('2019-09-22 07:00:00'), 'device1', 500, 2000, 404.09],
[pd.Timestamp('2019-09-22 08:00:00'), 'device1', 550, 3000, 404.09],
[pd.Timestamp('2019-09-22 15:00:00'), 'device1', 600, 4000, 404.09],
[pd.Timestamp('2019-09-22 16:00:00'), 'device1', 650, 3000, 404.09],
[pd.Timestamp('2019-09-22 17:00:00'), 'device1', 700, 2000, 404.09],
[pd.Timestamp('2019-09-22 21:00:00'), 'device1', 900, 1000, 404.09],
[pd.Timestamp('2019-09-22 22:00:00'), 'device1', 1000, 1500, 404.09],
[pd.Timestamp('2019-09-23 05:00:00'), 'device1', 1100, 2000, 1150, 1.04],
[pd.Timestamp('2019-09-23 04:00:00'), 'device1', 1200, 3000, 1150, 0.95],
[pd.Timestamp('2019-09-24 05:00:00'), 'device1', 1100, 2000, 1200, 1.09],
[pd.Timestamp('2019-09-24 04:00:00'), 'device1', 1300, 3000, 1200, 0.92]
I tried doing a groupby.mean and merging the result back in as a new column, but it didn't work.
You want transform('mean'):
df['mean'] = (df.groupby(['id',  # remove if you don't want to group by id
                          df.devicetimestamp.dt.normalize()])  # normalize() resets each timestamp to midnight, so rows from the same day share a key
                ['value'].transform('mean')
             )
df['value3'] = df['value2']/df['mean']
Output:
        devicetimestamp       id  value  value2  mean    value3
0   2019-09-22 00:00:00  device1     10    3000   495  6.060606
1   2019-09-22 04:00:00  device1     40    2000   495  4.040404
2   2019-09-22 05:00:00  device1     45    1000   495  2.020202
3   2019-09-22 06:00:00  device1    450    1500   495  3.030303
4   2019-09-22 07:00:00  device1    500    2000   495  4.040404
5   2019-09-22 08:00:00  device1    550    3000   495  6.060606
6   2019-09-22 15:00:00  device1    600    4000   495  8.080808
7   2019-09-22 16:00:00  device1    650    3000   495  6.060606
8   2019-09-22 17:00:00  device1    700    2000   495  4.040404
9   2019-09-22 21:00:00  device1    900    1000   495  2.020202
10  2019-09-22 22:00:00  device1   1000    1500   495  3.030303
11  2019-09-23 05:00:00  device1   1100    2000  1150  1.739130
12  2019-09-23 04:00:00  device1   1200    3000  1150  2.608696
13  2019-09-24 05:00:00  device1   1100    2000  1150  1.739130
14  2019-09-24 04:00:00  device1   1200    3000  1150  2.608696
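As for why transform('mean') works where a plain groupby mean does not: a plain mean returns one row per group, while transform returns one value per original row, aligned to the original index, so it can be assigned straight back as a column. Here is a minimal sketch of the difference, assuming a cut-down four-row version of the question's data; the names day, per_group, per_row and merged are made up for illustration. It also shows the merge route, which is probably what the question was attempting.

```python
import pandas as pd

# Hypothetical four-row subset using the question's column names
df = pd.DataFrame({
    "devicetimestamp": pd.to_datetime([
        "2019-09-22 00:00:00", "2019-09-22 04:00:00",
        "2019-09-23 05:00:00", "2019-09-23 04:00:00",
    ]),
    "id": ["device1"] * 4,
    "value": [10, 40, 1100, 1200],
    "value2": [3000, 2000, 2000, 3000],
})

# Grouping key: each timestamp reset to midnight, i.e. one key per calendar day
day = df["devicetimestamp"].dt.normalize().rename("day")

# Plain mean: one row per (id, day) group, so its length differs from df
per_group = df.groupby(["id", day])["value"].mean()

# transform: one value per original row, sharing df's index
per_row = df.groupby(["id", day])["value"].transform("mean")

# Merge route: bring the per-group means back onto the rows by key
merged = df.assign(day=day).merge(
    per_group.rename("mean").reset_index(), on=["id", "day"], how="left"
)

df["mean"] = per_row
df["value3"] = df["value2"] / df["mean"]
print(df)
```

Both routes give the same mean column here; transform just skips the intermediate frame and the join.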