Is there a way to calculate std and mean over two parameters?
I generated the following pd.DataFrame using groupby:
Timestemp Altitude [m] Sequence ID Horizontal Wind Speed [m/s] ... Radial Wind Speed [m/s] CNR [dB] U-Component of Wind Speed V-Component of Wind Speed
0 2019-07-29 00:00:40.901 100 617375 7.2750 ... -0.006 -15.706 7.241811 -0.694118
1 2019-07-29 00:00:40.901 150 617375 8.0700 ... 0.252 -14.960 8.068156 -0.172526
2 2019-07-29 00:00:40.901 200 617375 9.6750 ... 0.572 -13.872 9.672698 -0.211059
3 2019-07-29 00:00:40.901 250 617375 9.7975 ... 0.424 -12.584 9.786624 0.461525
4 2019-07-29 00:00:40.901 300 617375 9.0325 ... 0.054 -10.998 9.029804 -0.220684
... ... ... ... ... ... ... ... ... ...
1612 2019-07-29 00:16:59.713 1500 617425 NaN ... NaN NaN NaN NaN
1613 2019-07-29 00:16:59.713 1550 617425 NaN ... NaN NaN NaN NaN
1614 2019-07-29 00:16:59.713 1600 617425 NaN ... NaN NaN NaN NaN
1615 2019-07-29 00:16:59.713 1650 617425 NaN ... NaN NaN NaN NaN
1616 2019-07-29 00:16:59.713 1700 617425 NaN ... NaN NaN NaN NaN
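But now it gets a bit tricky. I want to calculate the mean and std for every 5 minutes at each altitude, i.e. grouped over Altitude and over 5-minute bins of Timestemp.
How can I solve this? Does anyone have an idea?
Thanks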
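First you need to set your time column as the index. Then you can compute the mean and std at the desired sampling frequency:
df = df.set_index(pd.DatetimeIndex(df['Timestemp']))
# 5-minute bins; 'T' is a deprecated alias for 'min' in recent pandas
dfmean = df.groupby(pd.Grouper(freq='5min')).mean()
dfstd = df.groupby(pd.Grouper(freq='5min')).std()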
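To see what this does, here is a minimal sketch on made-up data (the values and the reduced set of columns are assumptions for illustration, not the asker's real measurements). Note that it bins by time only, so readings from all altitudes that fall in the same 5-minute window are pooled together.

```python
import pandas as pd

# Hypothetical sample: two altitudes, one reading every 2 minutes (made-up values).
times = pd.date_range("2019-07-29 00:00:00", periods=6, freq="2min")
df = pd.DataFrame({
    "Timestemp": list(times) * 2,
    "Altitude [m]": [100] * 6 + [150] * 6,
    "Horizontal Wind Speed [m/s]": [7.0, 7.2, 7.4, 7.6, 7.8, 8.0,
                                    8.0, 8.2, 8.4, 8.6, 8.8, 9.0],
})

# Put the timestamps on the index so Grouper(freq=...) can bin on them.
df = df.set_index(pd.DatetimeIndex(df["Timestemp"]))

# Restrict to one numeric column to keep the output readable.
speed = df["Horizontal Wind Speed [m/s]"]
binned_mean = speed.groupby(pd.Grouper(freq="5min")).mean()
binned_std = speed.groupby(pd.Grouper(freq="5min")).std()

print(binned_mean)
print(binned_std)
```

The result has one row per 5-minute window, which is not yet what the question asks for, since the altitudes are mixed.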
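You can use resample to group the DataFrame into 5-minute bins. First you need to set the timestamp column as the index, then apply the resample function. "min" (or the older alias "T") stands for minutes. You can find the list of all codes here: pandas resample documentation
# the column is spelled 'Timestemp' in the question's DataFrame
df.set_index('Timestemp', inplace=True)
df.resample("5min").mean()
df.resample("5min").std()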
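Edit: If you also want to group by altitude, keep in mind that you still need the timestamp on the index.
# the column is labelled 'Altitude [m]' in the question's DataFrame
df.groupby([pd.Grouper(freq="5Min"), "Altitude [m]"]).mean()
df.groupby([pd.Grouper(freq="5Min"), "Altitude [m]"]).std()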
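As a rough end-to-end sketch of the combined grouping, assuming made-up data with only one measurement column (the values below are hypothetical): passing key= to pd.Grouper points it at a datetime column, so the index does not have to be changed, and agg computes both statistics in one pass.

```python
import pandas as pd

# Hypothetical sample: two altitudes, one reading every 2 minutes (made-up values).
times = pd.date_range("2019-07-29 00:00:00", periods=6, freq="2min")
df = pd.DataFrame({
    "Timestemp": list(times) * 2,
    "Altitude [m]": [100] * 6 + [150] * 6,
    "Horizontal Wind Speed [m/s]": [7.0, 7.2, 7.4, 7.6, 7.8, 8.0,
                                    8.0, 8.2, 8.4, 8.6, 8.8, 9.0],
})

# Group by 5-minute time bin AND altitude; key= uses the column instead of the index.
grouped = df.groupby([pd.Grouper(key="Timestemp", freq="5min"), "Altitude [m]"])

# Mean and sample standard deviation for one column, side by side.
stats = grouped["Horizontal Wind Speed [m/s]"].agg(["mean", "std"])

print(stats)
```

The result is indexed by (time bin, altitude). A bin that contains only a single reading for an altitude gets NaN as its std, because pandas computes the sample standard deviation (ddof=1) by default.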