Number of days to the previous and next year - Pandas
I have a dataframe like the one shown below:
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'person_id': [11, 21, 31, 41, 51],
                    'date_1': ['12/30/1961', '05/29/1967', '02/03/1957', '7/27/1959', '01/13/1971'],
                    'date_2': ['07/23/2017', '05/29/2017', '02/03/2015', np.nan, np.nan]})
df1 = df1.melt('person_id', value_name='dates')
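I would like to get the number of days to the previous year and to the next year.
I was able to get the previous and next year using the code below: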
df1['cur_year'] = pd.DatetimeIndex(df1['dates']).year
df1['prev_year'] = (df1['cur_year'] - 1)
df1['next_year'] = (df1['cur_year'] + 1)
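As you can see, the year value changes on every row, and I don't have a fixed base date. How can I calculate the difference in days between each date and dates such as 31/12 of the previous year and 01/01 of the next year?
Please note that the end date is not included while getting the number of days.
I have shown sample output for 2 subjects below.
[Updated screenshot of the sample output; image not included]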
If I understand correctly, you can try:
df1['dates'] = pd.to_datetime(df1['dates'])
out = df1.assign(
    prev_yr_days=df1['dates'].dt.dayofyear,
    next_yr_days=((df1['dates'] + pd.offsets.YearEnd(0)) - df1['dates']).dt.days.add(1),
)
   person_id variable      dates  prev_yr_days  next_yr_days
0         11   date_1 1961-12-30         364.0           2.0
5         11   date_2 2017-07-23         204.0         162.0
1         21   date_1 1967-05-29         149.0         217.0
6         21   date_2 2017-05-29         149.0         217.0
2         31   date_1 1957-02-03          34.0         332.0
7         31   date_2 2015-02-03          34.0         332.0
3         41   date_1 1959-07-27         208.0         158.0
8         41   date_2        NaT           NaN           NaN
4         51   date_1 1971-01-13          13.0         353.0
9         51   date_2        NaT           NaN           NaN
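To see how the `YearEnd(0)` expression behaves at the edges, here is a small sketch. The sample dates are made up for illustration and are not from the question. It relies on an offset with `n=0` rolling a date forward to 31 December of the same year and leaving a date that is already on 31 December unchanged, so the trailing `+ 1` is what moves the count up to 1 January.

```python
import pandas as pd

# Illustrative dates: an ordinary day, a leap-year day, a year end, and a missing value.
dates = pd.Series(pd.to_datetime(["1961-12-30", "2016-02-03", "2017-12-31", None]))

# YearEnd(0) rolls forward to 31 December of the same year (no move if already there).
year_end = dates + pd.offsets.YearEnd(0)

# Days up to 1 January of the next year, with the end date itself excluded.
next_yr_days = (year_end - dates).dt.days + 1

print(next_yr_days.tolist())  # [2.0, 333.0, 1.0, nan]
```

The leap-year row gives 333 rather than 332 because 2016 has 366 days, and the missing date simply propagates as NaN.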
We can conditionally build the next-year and previous-year reference dates from each row and subtract them from the date.
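# Assumes df1["dates"] has already been converted with pd.to_datetime.
# 1 January of the following year minus the date.
df1["next_year"] = (
    pd.to_datetime(
        "01-01-" + (df1["dates"].dt.year + 1).fillna(0).astype(int).astype(str),
        format="%d-%m-%Y",
        errors="coerce",  # rows with a missing date become NaT instead of failing to parse
    )
    - df1["dates"]
)
# The date minus 31 December of the preceding year.
df1["prev_year"] = (
    df1["dates"]
    - pd.to_datetime(
        "31-12-" + (df1["dates"].dt.year - 1).fillna(0).astype(int).astype(str),
        format="%d-%m-%Y",
        errors="coerce",
    )
)
print(df1)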
   person_id variable      dates next_year prev_year
0         11   date_1 1961-12-30    2 days  364 days
1         21   date_1 1967-05-29  217 days  149 days
2         31   date_1 1957-02-03  332 days   34 days
3         41   date_1 1959-07-27  158 days  208 days
4         51   date_1 1971-01-13  353 days   13 days
5         11   date_2 2017-07-23  162 days  204 days
6         21   date_2 2017-05-29  217 days  149 days
7         31   date_2 2015-02-03  332 days   34 days
8         41   date_2        NaT       NaT       NaT
9         51   date_2        NaT       NaT       NaT
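The same two reference dates can probably be reached without building date strings. The sketch below is an alternative, not the answerer's code: it uses `pd.offsets.YearBegin(1)` and `pd.offsets.YearEnd(1)`, and the names `next_jan_1` and `prev_dec_31` are made up for illustration. It also converts the timedeltas to plain day counts with `.dt.days`, since the question asks for a number of days.

```python
import pandas as pd

# Illustrative dates, including 1 January and a missing value.
dates = pd.Series(pd.to_datetime(["1961-12-30", "2017-07-23", "1971-01-01", None]))

# 1 January of the following year and 31 December of the preceding year.
next_jan_1 = dates + pd.offsets.YearBegin(1)
prev_dec_31 = dates - pd.offsets.YearEnd(1)

# Convert the timedeltas to numeric day counts; missing dates stay NaN.
next_year_days = (next_jan_1 - dates).dt.days
prev_year_days = (dates - prev_dec_31).dt.days

print(next_year_days.tolist())  # [2.0, 162.0, 365.0, nan]
print(prev_year_days.tolist())  # [364.0, 204.0, 1.0, nan]
```

Missing dates need no `fillna` step here because NaT passes through the offset arithmetic unchanged.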
Here is one approach:
dates = pd.to_datetime(df1['dates'])
df1['prev_yr_days'] = dates.dt.dayofyear
# Days left until 1 January: 366 - day of year, plus 1 in a leap year.
df1['next_yr_days'] = dates.dt.is_leap_year.astype(int).sub(df1['prev_yr_days']).add(366)
Result:
   person_id variable       dates  prev_yr_days  next_yr_days
0         11   date_1  12/30/1961         364.0           2.0
5         11   date_2  07/23/2017         204.0         162.0
1         21   date_1  05/29/1967         149.0         217.0
6         21   date_2  05/29/2017         149.0         217.0
2         31   date_1  02/03/1957          34.0         332.0
7         31   date_2  02/03/2015          34.0         332.0
3         41   date_1   7/27/1959         208.0         158.0
8         41   date_2         NaN           NaN           NaN
4         51   date_1  01/13/1971          13.0         353.0
9         51   date_2         NaN           NaN           NaN
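The arithmetic behind the leap-year formula can be checked against a direct subtraction. The following is only a sanity check under an assumed date range (2015 to 2017, chosen because it contains the leap year 2016). It compares `is_leap_year - dayofyear + 366` with the number of days to 1 January of the next year for every day in that range.

```python
import pandas as pd

# Every day of three consecutive years; 2016 is a leap year.
dates = pd.Series(pd.date_range("2015-01-01", "2017-12-31", freq="D"))

# Formula from the answer: 366 - day of year, plus 1 in a leap year.
formula = dates.dt.is_leap_year.astype(int) - dates.dt.dayofyear + 366

# Direct computation: 1 January of the following year minus the date.
next_jan_1 = pd.to_datetime((dates.dt.year + 1).astype(str) + "-01-01")
direct = (next_jan_1 - dates).dt.days

print((formula == direct).all())  # True
```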