Number of days to the previous and next year - Pandas

I have a data frame as shown below:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'person_id': [11, 21, 31, 41, 51],
                    'date_1': ['12/30/1961', '05/29/1967', '02/03/1957', '7/27/1959', '01/13/1971'],
                    'date_2': ['07/23/2017', '05/29/2017', '02/03/2015', np.nan, np.nan]})
df1 = df1.melt('person_id', value_name='dates')

I would like to get the number of days to the previous year and to the next year.

I can get the previous year and the next year using the code below:

df1['cur_year'] = pd.DatetimeIndex(df1['dates']).year
df1['prev_year'] = (df1['cur_year'] - 1)
df1['next_year'] = (df1['cur_year'] + 1)

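As you can see, the year value changes from row to row, so I don't have a fixed base date. How can I calculate the difference in days between each date and 31/12 of the previous year, and between each date and 01/01 of the next year?

Please note that the end date is not included when counting the number of days.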

I have shown sample output for 2 subjects below.

[Updated screenshot]
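To make the target concrete, here is a minimal sketch of the arithmetic for a single date, using only the standard library. The helper name `days_to_year_boundaries` is hypothetical, and the interpretation (count from 31/12 of the previous year, and up to 01/01 of the next year) is an assumption inferred from the question text and the answer outputs.

```python
from datetime import date


def days_to_year_boundaries(d):
    """Return (days since Dec 31 of the previous year,
    days until Jan 1 of the next year) for a single date."""
    prev_year_end = date(d.year - 1, 12, 31)    # assumed "previous year" boundary
    next_year_start = date(d.year + 1, 1, 1)    # assumed "next year" boundary
    return (d - prev_year_end).days, (next_year_start - d).days


print(days_to_year_boundaries(date(1961, 12, 30)))  # (364, 2)
print(days_to_year_boundaries(date(2017, 7, 23)))   # (204, 162)
```

Plain date subtraction counts the days between the two dates without counting both endpoints, which seems to match the "end date is not included" requirement.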

As I understand it, you can try:

df1['dates'] = pd.to_datetime(df1['dates'])
out = df1.assign(prev_yr_days=df1['dates'].dt.dayofyear,
     next_yr_days=((df1['dates'] + pd.offsets.YearEnd(0)) - df1['dates']).dt.days.add(1))

   person_id variable      dates  prev_yr_days  next_yr_days
0         11   date_1 1961-12-30         364.0           2.0
5         11   date_2 2017-07-23         204.0         162.0
1         21   date_1 1967-05-29         149.0         217.0
6         21   date_2 2017-05-29         149.0         217.0
2         31   date_1 1957-02-03          34.0         332.0
7         31   date_2 2015-02-03          34.0         332.0
3         41   date_1 1959-07-27         208.0         158.0
8         41   date_2        NaT           NaN           NaN
4         51   date_1 1971-01-13          13.0         353.0
9         51   date_2        NaT           NaN           NaN
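The shortcut above works because `dayofyear` is exactly the number of days elapsed since 31/12 of the previous year, and rolling forward to the year end and adding 1 reaches 01/01 of the next year. As a rough sanity check, the sketch below compares the shortcut with explicit date subtraction on a few edge cases (a leap day, a 31 December and a 1 January). The sample dates and the variable name `edge_cases` are chosen here for illustration and are not part of the original data.

```python
from datetime import date

import pandas as pd

# Illustrative edge cases: leap day, last day of a leap year, first day of a year.
edge_cases = pd.Series(pd.to_datetime(["2016-02-29", "2016-12-31", "2017-01-01"]))

# Same expressions as in the answer above.
prev_yr_days = edge_cases.dt.dayofyear
next_yr_days = ((edge_cases + pd.offsets.YearEnd(0)) - edge_cases).dt.days.add(1)

# Compare against explicit boundaries computed with the standard library.
for ts, prev_days, next_days in zip(edge_cases, prev_yr_days, next_yr_days):
    d = ts.date()
    assert prev_days == (d - date(d.year - 1, 12, 31)).days
    assert next_days == (date(d.year + 1, 1, 1) - d).days

print(prev_yr_days.tolist())
print(next_yr_days.tolist())
```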

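We can build the previous-year and next-year boundary dates for each row and subtract them from (or from them) the row's date.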

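# Assumes df1["dates"] is already datetime64 (e.g. via pd.to_datetime).
# Missing dates produce the placeholder year 0; the explicit format plus
# errors="coerce" turns those strings into NaT instead of raising.
df1["next_year"] = (
    pd.to_datetime(
        "01-01-" + (df1["dates"].dt.year + 1).fillna(0).astype(int).astype(str),
        format="%d-%m-%Y",
        errors="coerce",
    )
    - df1["dates"]
)

df1["prev_year"] = (
    df1["dates"]
    - pd.to_datetime(
        "31-12-" + (df1["dates"].dt.year - 1).fillna(0).astype(int).astype(str),
        format="%d-%m-%Y",
        errors="coerce",
    )
)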

print(df1)

   person_id variable      dates next_year prev_year
0         11   date_1 1961-12-30    2 days  364 days
1         21   date_1 1967-05-29  217 days  149 days
2         31   date_1 1957-02-03  332 days   34 days
3         41   date_1 1959-07-27  158 days  208 days
4         51   date_1 1971-01-13  353 days   13 days
5         11   date_2 2017-07-23  162 days  204 days
6         21   date_2 2017-05-29  217 days  149 days
7         31   date_2 2015-02-03  332 days   34 days
8         41   date_2        NaT       NaT       NaT
9         51   date_2        NaT       NaT       NaT
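If building date strings feels fragile, one possible alternative is to let pandas offsets find the boundary dates. This is only a sketch under the same interpretation of the boundaries; the small frame `df` and its sample dates are made up for illustration, and the column names simply mirror the ones used above.

```python
import pandas as pd

# Hypothetical sample mirroring the question's data, including a missing date.
df = pd.DataFrame({"dates": pd.to_datetime(["1961-12-30", "2017-07-23", None])})

# Dec 31 of the previous year and Jan 1 of the next year, found via offsets.
prev_year_end = df["dates"] - pd.offsets.YearEnd(1)
next_year_start = df["dates"] + pd.offsets.YearBegin(1)

# Timedelta columns, as in the string-based version; NaT propagates for missing dates.
df["prev_year"] = df["dates"] - prev_year_end
df["next_year"] = next_year_start - df["dates"]

print(df)
```

Appending `.dt.days` to either difference should give plain numbers instead of Timedelta values.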

Here is one way:

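# Parse into a separate Series so the original string column is left as-is.
dates = pd.to_datetime(df1['dates'])
df1['prev_yr_days'] = dates.dt.dayofyear
# Days until Jan 1 of the next year: 366 + is_leap_year - dayofyear.
df1['next_yr_days'] = dates.dt.is_leap_year.sub(df1['prev_yr_days']).add(366)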

Result:

   person_id variable       dates  prev_yr_days  next_yr_days
0         11   date_1  12/30/1961         364.0           2.0
5         11   date_2  07/23/2017         204.0         162.0
1         21   date_1  05/29/1967         149.0         217.0
6         21   date_2  05/29/2017         149.0         217.0
2         31   date_1  02/03/1957          34.0         332.0
7         31   date_2  02/03/2015          34.0         332.0
3         41   date_1   7/27/1959         208.0         158.0
8         41   date_2         NaN           NaN           NaN
4         51   date_1  01/13/1971          13.0         353.0
9         51   date_2         NaN           NaN           NaN
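The formula behind `next_yr_days` follows from the length of the year: a year has 365 days plus 1 if it is a leap year, so the days remaining after a given day of the year are `365 + is_leap - dayofyear`, and one more day reaches 01/01 of the next year. The sketch below checks that identity for every day of one leap year and one common year using only the standard library; the two function names are hypothetical.

```python
import calendar
from datetime import date, timedelta


def next_year_days_formula(d):
    # 366 + is_leap_year - dayofyear, as in the answer above.
    return 366 + int(calendar.isleap(d.year)) - d.timetuple().tm_yday


def next_year_days_direct(d):
    # Explicit subtraction from Jan 1 of the following year.
    return (date(d.year + 1, 1, 1) - d).days


# Walk through every day of a leap year (2016) and a common year (2017).
for year in (2016, 2017):
    d = date(year, 1, 1)
    while d.year == year:
        assert next_year_days_formula(d) == next_year_days_direct(d)
        d += timedelta(days=1)

print(next_year_days_formula(date(1961, 12, 30)))  # 2
```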