`to_datetime` limit or misuse? ValueError : Doesn't match format specified

I can't seem to convert a Series containing date strings to the datetime64 dtype. The following code reproduces the error:

import pandas as pd

gud_date_s = pd.Series(["2019/12/31 00:00:00.0"]*100)
gud_date_s2 = pd.Series(["2261/12/31 00:00:00.0"]*100)
bad_date_s = pd.Series(["9999/12/31 00:00:00.0"]*100)
bad_date_s2 = pd.Series(["2262/12/31 00:00:00.0"]*100)


gd1 = pd.to_datetime(gud_date_s, format="%Y/%m/%d", yearfirst=True).dt.date # Correct
gd2 = pd.to_datetime(gud_date_s2 , format="%Y/%m/%d", yearfirst=True).dt.date # Correct
bd1 = pd.to_datetime(bad_date_s, format="%Y/%m/%d", yearfirst=True).dt.date 
#Returns {ValueError}time data 9999/12/31 00:00:00.0 doesn't match format specified.
bd2 = pd.to_datetime(bad_date_s2 , format="%Y/%m/%d", yearfirst=True).dt.date
#Returns {ValueError}time data 2262/12/31 00:00:00.0 doesn't match format specified.

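So the threshold for an accepted year seems to be 2261. Why? And how can I work around this?

N.B.: Dates such as 9999/12/31 are meaningful in my data, so I would like to keep them as they are.

Cheers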

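The year 9999 is not valid here because it is outside the range pandas can represent, so errors='coerce' is needed to convert such values to NaT: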

bd1 = pd.to_datetime(bad_date_s, format="%Y/%m/%d", yearfirst=True, errors='coerce').dt.date

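Here the error is caused by the same limit: the year 2262 itself is valid, but the latest representable date in that year is 11 April, so 2262/12/31 is out of range. Unfortunately, the error message should be clearer about this.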

bd2 = pd.to_datetime(bad_date_s2 , format="%Y/%m/%d", yearfirst=True, errors='coerce').dt.date

print (pd.Timestamp.max)
2262-04-11 23:47:16.854775807
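
As a rough illustration of where this bound comes from: with the default nanosecond resolution, a timestamp is stored as a signed 64-bit count of nanoseconds since 1970-01-01. The sketch below redoes that arithmetic with the standard library only. The names `INT64_MAX_NS` and `latest` are made up for this example, and the result is truncated to microseconds because `datetime` has no nanosecond field.

```python
from datetime import datetime, timedelta

# Largest value of a signed 64-bit integer, read as a nanosecond count.
INT64_MAX_NS = 2**63 - 1

# datetime only has microsecond resolution, so drop the last three digits.
max_offset = timedelta(microseconds=INT64_MAX_NS // 1000)

# Add the offset to the Unix epoch to get the latest representable instant.
latest = datetime(1970, 1, 1) + max_offset

print(latest)  # 2262-04-11 23:47:16.854775
```

That matches pd.Timestamp.max up to the truncated nanosecond digits, which is why 2261/12/31 parses and 2262/12/31 does not.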

Using a Python datetime for 9999 to fill the missing values raises an error:

from datetime import datetime

d = datetime(year=9999, month=12, day=31)
bd1 = pd.to_datetime(bad_date_s, format="%Y/%m/%d", yearfirst=True, errors='coerce').dt.date.fillna(d)
print (bd1)

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 9999-12-31 00:00:00
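
If the 9999/12/31 values have to survive, one possible workaround (a sketch, not the only option) is to skip datetime64 altogether and keep plain Python date objects, which go up to year 9999. The helper `parse_date` and the sample list `raw` below are hypothetical names for this example, and the format string assumes every value looks like the strings in the question.

```python
from datetime import date, datetime


def parse_date(text: str) -> date:
    """Parse 'YYYY/MM/DD HH:MM:SS.f' and keep only the calendar date."""
    return datetime.strptime(text, "%Y/%m/%d %H:%M:%S.%f").date()


# Sample values taken from the question, including the out-of-range ones.
raw = ["2019/12/31 00:00:00.0", "2262/12/31 00:00:00.0", "9999/12/31 00:00:00.0"]
parsed = [parse_date(s) for s in raw]

print(parsed[-1])  # 9999-12-31
```

On the Series from the question this would be applied as bad_date_s.map(parse_date). The result has object dtype, so the .dt accessor and vectorized datetime operations are not available on it. If those are needed, pd.Period with a daily frequency may be worth a look, since it is designed to represent spans outside the Timestamp bounds.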