How to get the per-day difference in pandas
Example:
day | hours
monday 5
monday 6
tuesday 5
tuesday 6
tuesday 7
wednesday 5
wednesday 6
wednesday 7
Expected result:
day | hours
monday 1
tuesday 2
wednesday 2
You just need the difference between the first and last rows of each group.
Use GroupBy.agg to pick the first and last rows per group. Then use DataFrame.diff along the columns, and DataFrame.droplevel to flatten the resulting MultiIndex columns:
In [3831]: x = (
...: df.groupby("day")
...: .agg({"hours": ["first", "last"]})
...: .diff(axis=1)[("hours", "last")]
...: .reset_index()
...: .droplevel(level=1, axis=1)
...: )
In [3829]: x
Out[3829]:
day hours
0 monday 1.0
1 tuesday 2.0
2 wednesday 2.0
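For anyone who wants to reproduce this without the MultiIndex step, here is a minimal, self-contained sketch. It assumes the sample data from the question (the DataFrame construction is my reconstruction, not the asker's code) and subtracts the per-group first value from the per-group last value directly:

```python
import pandas as pd

# Sample data reconstructed from the question (assumed layout).
df = pd.DataFrame(
    {
        "day": ["monday", "monday", "tuesday", "tuesday", "tuesday",
                "wednesday", "wednesday", "wednesday"],
        "hours": [5, 6, 5, 6, 7, 5, 6, 7],
    }
)

# sort=False keeps the days in their order of appearance.
grouped = df.groupby("day", sort=False)["hours"]

# Last row minus first row within each day.
result = (grouped.last() - grouped.first()).reset_index()
print(result)
```

Because no diff is involved, no NaN is introduced and the hours column should stay integer rather than becoming float.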
I read the question too quickly. Let's try:
df.groupby('day', as_index=False)['hours'].apply(lambda x: x.max()-x.min())
Or (with numpy imported as np):
df.groupby('day', as_index=False)['hours'].apply(np.ptp)
Output:
day hours
0 monday 1
1 tuesday 2
2 wednesday 2
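Note that max minus min and last minus first only agree when the hours within each day are in ascending order. A small sketch of the difference, using hypothetical data where monday's rows are deliberately out of order:

```python
import pandas as pd

# Hypothetical data: monday's two rows are swapped on purpose.
df_unsorted = pd.DataFrame(
    {
        "day": ["monday", "monday", "tuesday", "tuesday", "tuesday"],
        "hours": [6, 5, 5, 6, 7],
    }
)

grouped = df_unsorted.groupby("day", sort=False)["hours"]

# Range per day: always non-negative, independent of row order.
range_per_day = (grouped.max() - grouped.min()).reset_index()

# Last minus first: depends on row order, so it can be negative.
last_minus_first = (grouped.last() - grouped.first()).reset_index()

print(range_per_day)
print(last_minus_first)
```

Which one is appropriate depends on whether "daily difference" means the spread of values or the change from the first record to the last.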