Calculate duration between two dates using Python
I have a dataset:
DaySchedule          DayAppointment
2016-04-29 18:38:08  2016-04-29
2016-04-29 16:08:27  2016-04-29
2016-04-26 15:04:17  2016-04-29
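I want to calculate the duration between the schedule date and the appointment date. If they fall on the same day, the duration should be 0; otherwise I subtract the appointment date from the schedule date.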
def duration_time(x, y):
    x = x.dt.date
    y = y.dt.date
    if x == y:
        return 0
    else:
        return x - y

Patient["duration"] = Patient.apply(lambda Patient: duration_time(Patient["DayAppointment"], Patient["DaySchedule"]), axis=1)
After I run this code I get this error:
AttributeError: ("'Timestamp' object has no attribute 'dt'", u'occurred at index 0')
Any idea why this error occurs?
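The error occurs because apply with axis=1 hands each cell to the function as a scalar Timestamp, and .dt is an accessor that exists only on a Series. Use numpy where + dt.date + sub on the whole columns instead: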
import numpy as np
import pandas as pd
# Convert both columns to datetime64 so the .dt accessor is available
Patient.DaySchedule = pd.to_datetime(Patient.DaySchedule)
Patient.DayAppointment = pd.to_datetime(Patient.DayAppointment)
Patient['Duration'] = np.where(Patient.DaySchedule.dt.date == Patient.DayAppointment.dt.date, 0, Patient.DaySchedule.sub(Patient.DayAppointment))
DaySchedule          DayAppointment  Duration
2016-04-29 18:38:08  2016-04-29      0 days 00:00:00
2016-04-29 16:08:27  2016-04-29      0 days 00:00:00
2016-04-26 15:04:17  2016-04-29      -3 days +15:04:17
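If you prefer to keep the row-wise function from the question, here is a minimal sketch of how it could be repaired. It assumes both columns are already datetime64, rebuilds the sample data from the question, and subtracts in the order described in the question text (schedule minus appointment). The parameter names schedule and appointment are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
Patient = pd.DataFrame({
    "DaySchedule": pd.to_datetime([
        "2016-04-29 18:38:08",
        "2016-04-29 16:08:27",
        "2016-04-26 15:04:17",
    ]),
    "DayAppointment": pd.to_datetime(["2016-04-29", "2016-04-29", "2016-04-29"]),
})

def duration_time(schedule, appointment):
    # With axis=1 each argument is a scalar Timestamp, so call
    # Timestamp.date() instead of the Series-only .dt accessor.
    if schedule.date() == appointment.date():
        return pd.Timedelta(0)
    return schedule - appointment

Patient["Duration"] = Patient.apply(
    lambda row: duration_time(row["DaySchedule"], row["DayAppointment"]),
    axis=1,
)
print(Patient)
```

Returning pd.Timedelta(0) rather than the integer 0 keeps every value in the column the same type. The vectorized np.where version above is usually the better choice on large frames, since apply calls the function once per row.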
You can also get just the number of days:
Patient['Duration']=Patient.DaySchedule.sub(Patient.DayAppointment).astype('timedelta64[D]')
DaySchedule          DayAppointment  Duration
2016-04-29 18:38:08  2016-04-29      0.0
2016-04-29 16:08:27  2016-04-29      0.0
2016-04-26 15:04:17  2016-04-29      -3.0
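The astype('timedelta64[D]') cast depends on the pandas version, and newer releases may reject day resolution, so check it against your installed version. As a hedged alternative, the sketch below reads the whole-day component with the .dt.days accessor. It rebuilds the sample data from the question and returns integers rather than the floats shown above.

```python
import pandas as pd

# Sample data copied from the question.
Patient = pd.DataFrame({
    "DaySchedule": pd.to_datetime([
        "2016-04-29 18:38:08",
        "2016-04-29 16:08:27",
        "2016-04-26 15:04:17",
    ]),
    "DayAppointment": pd.to_datetime(["2016-04-29", "2016-04-29", "2016-04-29"]),
})

# Timedelta Series: schedule minus appointment.
diff = Patient["DaySchedule"].sub(Patient["DayAppointment"])

# .dt.days is the day component of each timedelta; for a negative
# value such as "-3 days +15:04:17" that component is -3.
Patient["Duration"] = diff.dt.days
print(Patient)
```

Note that .dt.days is the day component of the timedelta, not a rounded value, so a difference of about minus 2.4 days is reported as -3, matching the table above.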
Using sub, it takes:
1.59 ms ± 63.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Using plain subtraction instead took almost a millisecond longer:
Patient['Duration']=np.where(Patient.DaySchedule.dt.date==Patient.DayAppointment.dt.date, 0, Patient.DaySchedule-Patient.DayAppointment)
2.51 ms ± 172 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
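Timings like these depend on the machine, the pandas version, and the size of the frame, and on a three-row sample the gap may be within measurement noise. The sketch below shows one way to re-measure it with the standard timeit module. It times only the subtraction step, not the full np.where expression, and the helper names with_sub and with_minus are made up for this example.

```python
import timeit

import pandas as pd

# Sample data copied from the question.
Patient = pd.DataFrame({
    "DaySchedule": pd.to_datetime([
        "2016-04-29 18:38:08",
        "2016-04-29 16:08:27",
        "2016-04-26 15:04:17",
    ]),
    "DayAppointment": pd.to_datetime(["2016-04-29", "2016-04-29", "2016-04-29"]),
})

def with_sub():
    # Method form of the subtraction.
    return Patient["DaySchedule"].sub(Patient["DayAppointment"])

def with_minus():
    # Operator form of the subtraction.
    return Patient["DaySchedule"] - Patient["DayAppointment"]

# Both forms should produce the same timedelta Series.
same_result = with_sub().equals(with_minus())

loops = 1000
t_sub = timeit.timeit(with_sub, number=loops)
t_minus = timeit.timeit(with_minus, number=loops)
print(f"sub: {t_sub / loops * 1e3:.3f} ms per loop")
print(f"-:   {t_minus / loops * 1e3:.3f} ms per loop")
```

Running it on a frame closer to your real data size gives a more meaningful comparison than the three sample rows.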