Pandas add rows based on date
I have a pandas DataFrame with a single date column:
2015-11-01
2015-12-01
2016-01-01
2016-03-01
2016-03-01
2016-10-01
2016-10-01
2016-12-01
2017-03-01
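I want to insert two rows:
1. one at the beginning, holding the month before the first row's date
2. one at the end, holding the month after the last row's date
so that the desired output is: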
2015-10-01
2015-11-01
2015-12-01
2016-01-01
2016-03-01
2016-03-01
2016-10-01
2016-10-01
2016-12-01
2017-03-01
2017-04-01
What is the pythonic way to do this?
Try MonthBegin:
import pandas as pd
dates = ['2015-11-01', '2015-12-01', '2016-01-01', '2016-03-01', '2016-03-01',
         '2016-10-01', '2016-10-01', '2016-12-01', '2017-03-01']
df = pd.DataFrame(dates, columns=['date'])
df['date'] = pd.to_datetime(df['date'])
# Month start before the first date and month start after the last date
first = df.loc[0, 'date'] - pd.offsets.MonthBegin(1)
last = df.loc[len(df) - 1, 'date'] + pd.offsets.MonthBegin(1)
df = pd.DataFrame([first] + list(df['date']) + [last], columns=['date'])
df
Output:
         date
0  2015-10-01
1  2015-11-01
2  2015-12-01
3  2016-01-01
4  2016-03-01
5  2016-03-01
6  2016-10-01
7  2016-10-01
8  2016-12-01
9  2017-03-01
10 2017-04-01
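The same idea can also be written with pd.concat, which keeps the original frame intact and renumbers the index in one step. This is only a sketch: the helper name pad_month_rows is hypothetical, and it assumes the column is already datetime and the frame is not empty.

```python
import pandas as pd

def pad_month_rows(df, col="date"):
    """Return a new frame with one row prepended and one row appended.

    Hypothetical helper: assumes df[col] is datetime64 and df is non-empty.
    """
    first = df[col].iloc[0] - pd.offsets.MonthBegin(1)   # month start before the first date
    last = df[col].iloc[-1] + pd.offsets.MonthBegin(1)   # month start after the last date
    before = pd.DataFrame({col: [first]})
    after = pd.DataFrame({col: [last]})
    # ignore_index=True renumbers the rows 0..n+1
    return pd.concat([before, df, after], ignore_index=True)

df = pd.DataFrame({"date": pd.to_datetime([
    "2015-11-01", "2015-12-01", "2016-01-01", "2016-03-01", "2016-03-01",
    "2016-10-01", "2016-10-01", "2016-12-01", "2017-03-01",
])})
out = pad_month_rows(df)
print(out)
```

Using iloc[0] and iloc[-1] picks the first and last rows by position, so the sketch does not depend on the index labels starting at 0.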
Use:
df['date'] = pd.to_datetime(df['date'])
# With the default RangeIndex the first row has label 0
a = df.loc[0, 'date'] - pd.offsets.MonthBegin()
b = df.loc[len(df.index) - 1, 'date'] + pd.offsets.MonthBegin()
df = pd.DataFrame([a] + df['date'].tolist() + [b], columns=['date'])
print(df)
         date
0  2015-10-01
1  2015-11-01
2  2015-12-01
3  2016-01-01
4  2016-03-01
5  2016-03-01
6  2016-10-01
7  2016-10-01
8  2016-12-01
9  2017-03-01
10 2017-04-01
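Or, starting again from the original nine-row df, add the rows in place:
# Shift the labels by 1 so that label 0 is free for the new first row
df.index = df.index + 1
df.loc[0, 'date'] = df.loc[1, 'date'] - pd.offsets.MonthBegin()
# len(df.index) is now one past the last label, so this appends a row
df.loc[len(df.index), 'date'] = df.loc[len(df.index) - 1, 'date'] + pd.offsets.MonthBegin()
# Label 0 was added at the end; sort to move it to the top
df = df.sort_index()
print(df)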
         date
0  2015-10-01
1  2015-11-01
2  2015-12-01
3  2016-01-01
4  2016-03-01
5  2016-03-01
6  2016-10-01
7  2016-10-01
8  2016-12-01
9  2017-03-01
10 2017-04-01
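One caveat worth checking before reusing these answers: MonthBegin moves to the nearest month start, which only equals "one month earlier" when the date is already on the first of the month, as in the question. If the dates could fall mid-month (an assumption, not something the question states), pd.DateOffset(months=1) shifts by a calendar month instead. A small sketch of the difference:

```python
import pandas as pd

start = pd.Timestamp("2015-11-01")   # already on a month start
mid = pd.Timestamp("2015-11-15")     # hypothetical mid-month date

# On a month start, MonthBegin steps a full month
start_minus_mb = start - pd.offsets.MonthBegin(1)   # 2015-10-01

# Mid-month, subtracting MonthBegin only rolls back to the start of the same month
mid_minus_mb = mid - pd.offsets.MonthBegin(1)       # 2015-11-01
mid_plus_mb = mid + pd.offsets.MonthBegin(1)        # 2015-12-01

# DateOffset(months=1) keeps the day of month and shifts the month
mid_minus_do = mid - pd.DateOffset(months=1)        # 2015-10-15

print(start_minus_mb, mid_minus_mb, mid_plus_mb, mid_minus_do)
```

For the data in the question both offsets give the same result, so the choice only matters for dates that are not on the first of the month.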