Expanding a dataframe based on the date difference in Python

I have the following scenarios.

Scenario 1:

FromDate    ToDate
15-01-2018  15-12-2018
15-12-2018  15-10-2020
...
15-10-2020  15-11-2020
15-11-2020  15-12-2020
15-12-2020  15-01-2021

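Here the rows 15-01-2018 to 15-12-2018 and 15-12-2018 to 15-10-2020 are not split into consecutive monthly periods, so I want to expand the dataframe above as follows.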

FromDate    ToDate
15-01-2018  15-02-2018
15-02-2018  15-03-2018
15-03-2018  15-04-2018
...
15-11-2018  15-12-2018
15-12-2018  15-01-2020
15-01-2020  15-02-2020
15-02-2020  15-03-2020
...
15-09-2020  15-10-2020
15-10-2020  15-11-2020
15-11-2020  15-12-2020
15-12-2020  15-01-2021

Is there any way to achieve this?

Scenario 2: Here the last FromDate is 15-12-2020 and the ToDate is 10-01-2021, so the day of the month differs between the two.

Input:

FromDate    ToDate
15-01-2018  15-12-2018
15-12-2018  10-10-2020
...
15-10-2020  15-11-2020
15-11-2020  15-12-2020
15-12-2020  10-01-2021

Output:

FromDate    ToDate
15-01-2018  15-02-2018
15-02-2018  15-03-2018
15-03-2018  15-04-2018
...
15-11-2018  15-12-2018
15-12-2018  15-01-2020
15-01-2020  15-02-2020
15-02-2020  15-03-2020
...
15-09-2020  10-10-2020
15-10-2020  15-11-2020
15-11-2020  15-12-2020
15-12-2020  10-01-2021
  • Use date_range() on FromDate to generate a list of monthly start dates
  • Expand the list into rows with explode()
  • Compute ToDate
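To see what the first step in the list above produces for a single row, here is a minimal sketch. The variable names (`start`, `end`, `month_starts`, `from_dates`) are illustrative only, and the interval is the 15-12-2018 to 10-10-2020 row from scenario 2.

```python
import pandas as pd

# Illustrative interval taken from scenario 2 (written as yyyy-mm-dd here)
start = pd.Timestamp("2018-12-15")
end = pd.Timestamp("2020-10-10")

# Month starts from the FromDate month up to the month before ToDate
month_starts = pd.date_range(start.replace(day=1),
                             end.replace(day=1) - pd.Timedelta(days=1),
                             freq="MS")

# Shift every month start to the FromDate day of month (the 15th here)
from_dates = month_starts + pd.Timedelta(days=start.day - 1)
```

Shifting by a fixed number of days assumes the FromDate day exists in every month; a day of 29 to 31 would spill into the following month in shorter months.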
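import io

import numpy as np
import pandas as pd

df = pd.read_csv(io.StringIO("""FromDate    ToDate
15-01-2018  15-12-2018
15-12-2018  15-10-2020
15-10-2020  15-11-2020
15-11-2020  15-12-2020
15-12-2020  10-01-2021"""), sep=r"\s+")

# Dates are day-first (dd-mm-yyyy); an explicit format keeps 10-01-2021 from being read as 1 October
df.FromDate = pd.to_datetime(df.FromDate, format="%d-%m-%Y")
df.ToDate = pd.to_datetime(df.ToDate, format="%d-%m-%Y")

# One FromDate per month: month starts from the FromDate month up to the month
# before ToDate, shifted to the FromDate day of month
(df.assign(FromDate=df.apply(lambda r: pd.date_range(r.FromDate.replace(day=1),
                                                    r.ToDate.replace(day=1) - pd.Timedelta(days=1),
                                                    freq="MS") + pd.Timedelta(days=r.FromDate.day-1), axis=1))
 .explode("FromDate")
 # explode() leaves an object column; convert it back to datetime64
 .assign(FromDate=lambda dfa: pd.to_datetime(dfa.FromDate))
 .assign(
     # If the original ToDate is more than 35 days away, end the row one month after FromDate
     ToDate=lambda dfa: np.where((dfa.ToDate-dfa.FromDate).gt(pd.Timedelta(days=35)),
                                 dfa.FromDate + pd.DateOffset(months=1),
                                 dfa.ToDate))
)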
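
For comparison, the same expansion can be written as an explicit loop. This is only an alternative sketch, not the approach above: `expand_monthly` is a hypothetical helper name, and it assumes both columns are already datetime64 with no missing values. It steps by calendar months with `pd.DateOffset`, so it does not rely on the 35-day threshold, and it clips the last period of each row at the original ToDate.

```python
import pandas as pd

def expand_monthly(df):
    """Split each FromDate..ToDate row into consecutive one-month rows.

    Hypothetical helper: assumes datetime64 columns named FromDate and
    ToDate, with no missing values.
    """
    rows = []
    for start, end in zip(df["FromDate"], df["ToDate"]):
        k = 0
        while True:
            # Offsets are always taken from the row's start, so month-end
            # clipping (e.g. the 31st) does not drift between steps
            cur = start + pd.DateOffset(months=k)
            nxt = start + pd.DateOffset(months=k + 1)
            if nxt >= end:
                # Last period of this row: keep the original ToDate
                rows.append((cur, end))
                break
            rows.append((cur, nxt))
            k += 1
    return pd.DataFrame(rows, columns=["FromDate", "ToDate"])
```

On the scenario 2 input this reproduces the rows ending 15-09-2020 to 10-10-2020 and 15-12-2020 to 10-01-2021. When a ToDate falls later in the month than the FromDate day, the loop emits a short trailing row, whereas the 35-day rule folds that remainder into the preceding row, so the two versions are not interchangeable in that case.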