Split a datetime into year and month columns in Python
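How can we split a datetime value into year and month? I need a separate column for each year (2017_year, 2018_year, and so on), and the values under each year column should be the months for that year.
Sample data: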
call time area age
2017-12-12 19:38:00 Rural 28
2018-01-12 22:05:00 Rural 50
2018-02-12 22:33:00 Rural 76
2019-01-12 22:37:00 Urban 45
2020-02-13 00:26:00 Urban 52
Desired output:
call time area age Year_2017 Year_2018
2017-12-12 19:38:00 Rural 28 jan jan
2018-01-12 22:05:00 Rural 50 Feb Feb
2018-02-12 22:33:00 Rural 76 mar mar
2019-01-12 22:37:00 Urban 45 Apr Apr
2020-02-13 00:26:00 Urban 52 may may
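I think you need to generate the years and months from the call time datetimes, so the output differs from the one you posted.
Explanation: first create a column of month names with DataFrame.assign and Series.dt.strftime, then move the years into the index with append=True to build a MultiIndex, which makes it possible to reshape with Series.unstack. Finally, join the result back to the original DataFrame: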
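# Assumes df['call time'] is already datetime64 (convert with pd.to_datetime if not).
df1 = (df.assign(m=df['call time'].dt.strftime('%b'))          # month abbreviation, e.g. 'Dec'
         .set_index(df['call time'].dt.year, append=True)['m']  # (row label, year) MultiIndex
         .unstack()                                             # years become columns
         .add_prefix('Year_'))
print(df1)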
call time Year_2017 Year_2018 Year_2019 Year_2020
0 Dec NaN NaN NaN
1 NaN Jan NaN NaN
2 NaN Feb NaN NaN
3 NaN NaN Jan NaN
4 NaN NaN NaN Feb
df = df.join(df1)
print(df)
call time area age Year_2017 Year_2018 Year_2019 Year_2020
0 2017-12-12 19:38:00 Rural 28 Dec NaN NaN NaN
1 2018-01-12 22:05:00 Rural 50 NaN Jan NaN NaN
2 2018-02-12 22:33:00 Rural 76 NaN Feb NaN NaN
3 2019-01-12 22:37:00 Urban 45 NaN NaN Jan NaN
4 2020-02-13 00:26:00 Urban 52 NaN NaN NaN Feb
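For anyone who wants to reproduce this from scratch, here is a minimal self-contained sketch of the same steps. It assumes the call time column arrives as plain strings, which is why it is converted with pd.to_datetime first; the names month, year, wide and result are only illustrative and do not appear in the answer above.

```python
import pandas as pd

# Sample data from the question; 'call time' starts out as plain strings (an assumption).
df = pd.DataFrame({
    "call time": [
        "2017-12-12 19:38:00",
        "2018-01-12 22:05:00",
        "2018-02-12 22:33:00",
        "2019-01-12 22:37:00",
        "2020-02-13 00:26:00",
    ],
    "area": ["Rural", "Rural", "Rural", "Urban", "Urban"],
    "age": [28, 50, 76, 45, 52],
})

# The .dt accessor only works on datetime-like columns, so convert first.
df["call time"] = pd.to_datetime(df["call time"])

month = df["call time"].dt.strftime("%b")  # abbreviated month name per row
year = df["call time"].dt.year             # calendar year per row

# Index the month names by (row label, year), then pivot the year level into columns.
wide = (
    month.to_frame("m")
    .set_index(year, append=True)["m"]
    .unstack()
    .add_prefix("Year_")
)

# Row labels are unchanged, so a plain index join lines everything up.
result = df.join(wide)
print(result)
```

Each row has a month only under its own year and NaN everywhere else, because every call belongs to exactly one year. Note that %b follows the active locale, so the month abbreviations may not be English on every system.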