Calculating time range with timedelta & Boolean

I need help using timedelta to determine whether actn_dt is one year or more before today and, if so, return "Experienced".

The dataframe f2 looks like this:

           nm_emp_lst    actn_dt
14483   MACKENZIE         2015-03-22
132902  CAMPBELL          2015-04-19
124182  SJOSTROM          2015-03-22
103482  LAPLANTE          2014-11-30
45722   LEMAY             2014-11-30
169088  TAYLOR            2015-06-14
105355  HENDERSON         2015-11-01
105359  HENDERSON         2014-10-19
45394   PELLERIN          2015-07-12
119317  BOISSEAU          2015-07-12

It should look like this:

           nm_emp_lst    actn_dt        Experienced
14483   MACKENZIE         2015-03-22   
132902  CAMPBELL          2015-04-19    
124182  SJOSTROM          2015-03-22
103482  LAPLANTE          2014-11-30    Experienced
45722   LEMAY             2014-11-30    Experienced
169088  TAYLOR            2015-06-14    
105355  HENDERSON         2015-11-01    
105359  HENDERSON         2014-10-19    Experienced
45394   PELLERIN          2015-07-12    
119317  BOISSEAU          2015-07-12

So, any date that is one year old or older should be flagged.

I created a function:

year = timedelta(days=365)
today2 = datetime.datetime.strftime(datetime.datetime.now(),'%A_%B_%d_%Y_%H%M')

def year(row):
    if row['actn_dt'] >= today2 - year:
        return "Experienced"

And then a lambda function:

f2['Experienced'] = f2.apply (lambda row: year (row),axis=1)    

From this, I get the error:

TypeError: ("unsupported operand type(s) for -: 'str' and 'function'", u'occurred at index 14483')

My dtypes are:

nm_emp_lst            object
actn_dt       datetime64[ns]

Any help is appreciated!

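===UPDATE===
With jezrael's help, I came up with a solution. It may be the long way around, but it works. First, I had to create a new column holding the date one year before today.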

f2['year1'] = datetime.datetime.now().date() - datetime.timedelta(days=365)

Then I had to convert 'year1' from plain date objects to datetime:

f2['year1'] = pd.to_datetime(f2['year1'], errors='coerce')

From here I used the code jezrael provided.

f2.loc[f2['actn_dt'] <= f2['year1'], 'Experienced'] = "Experienced"

The new result is:

               nm_emp_lst    actn_dt      year1  Experienced
14483   MACKENZIE         2015-03-22 2015-02-12          NaN
132902  CAMPBELL          2015-04-19 2015-02-12          NaN
124182  SJOSTROM          2015-03-22 2015-02-12          NaN
103482  LAPLANTE          2014-11-30 2015-02-12  Experienced
45722   LEMAY             2014-11-30 2015-02-12  Experienced
169088  TAYLOR            2015-06-14 2015-02-12          NaN
105355  HENDERSON         2015-11-01 2015-02-12          NaN
105359  HENDERSON         2014-10-19 2015-02-12  Experienced
45394   PELLERIN          2015-07-12 2015-02-12          NaN
119317  BOISSEAU          2015-07-12 2015-02-12          NaN

This is awesome! Thanks, jezrael!
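On the original TypeError: `today2` is a string (the result of `strftime`), and `def year(row)` rebinds the name `year` from the timedelta to the function, so `today2 - year` tries to subtract a function from a string. The following is a minimal sketch of a row-wise version that avoids both problems. The names `label_experience`, `today`, and `one_year` are made up for illustration, only three rows of the sample data are used, and the reference date is fixed to 2016-02-12 (an assumption, chosen to match the outputs shown in this thread) so the result is reproducible.

```python
import pandas as pd

# Three rows copied from the sample data in the question.
f2 = pd.DataFrame(
    {
        "nm_emp_lst": ["MACKENZIE", "LAPLANTE", "HENDERSON"],
        "actn_dt": pd.to_datetime(["2015-03-22", "2014-11-30", "2014-10-19"]),
    },
    index=[14483, 103482, 105359],
)

# Assumption: a fixed reference date for reproducibility.
# In real use this could be pd.Timestamp.now().normalize().
today = pd.Timestamp("2016-02-12")
one_year = pd.Timedelta(days=365)  # kept under a name the function does not reuse


def label_experience(row, cutoff=today - one_year):
    """Return "Experienced" if actn_dt is on or before the cutoff, else ""."""
    return "Experienced" if row["actn_dt"] <= cutoff else ""


f2["Experienced"] = f2.apply(label_experience, axis=1)
print(f2)
```

Row-wise `apply` is kept here only to stay close to the original attempt; a vectorized comparison on the whole column is usually simpler and faster.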

You can use loc. The second row of df was changed for testing:

print(df)
       nm_emp_lst    actn_dt
14483   MACKENZIE 2015-03-22
132902   CAMPBELL 2018-04-19
124182   SJOSTROM 2015-03-22
103482   LAPLANTE 2014-11-30
45722       LEMAY 2014-11-30
169088     TAYLOR 2015-06-14
105355  HENDERSON 2015-11-01
105359  HENDERSON 2014-10-19
45394    PELLERIN 2015-07-12

print(datetime.timedelta(days=365))
365 days, 0:00:00

print(datetime.datetime.now().date())
2016-02-12

print(datetime.datetime.now().date() - datetime.timedelta(days=365))
2015-02-12

print(df['actn_dt'] <= datetime.datetime.now().date() - datetime.timedelta(days=365))
14483     False
132902    False
124182    False
103482     True
45722      True
169088    False
105355    False
105359     True
45394     False
119317    False
Name: actn_dt, dtype: bool

df.loc[df['actn_dt'] <= datetime.datetime.now().date() - datetime.timedelta(days=365), 'Experienced'] = "Experienced"
print(df)
       nm_emp_lst    actn_dt  Experienced
14483   MACKENZIE 2015-03-22          NaN
132902   CAMPBELL 2015-04-19          NaN
124182   SJOSTROM 2015-03-22          NaN
103482   LAPLANTE 2014-11-30  Experienced
45722       LEMAY 2014-11-30  Experienced
169088     TAYLOR 2015-06-14          NaN
105355  HENDERSON 2015-11-01          NaN
105359  HENDERSON 2014-10-19  Experienced
45394    PELLERIN 2015-07-12          NaN
119317   BOISSEAU 2015-07-12          NaN
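The same comparison can also be sketched with a `pd.Timestamp` cutoff instead of a `datetime.date`, which keeps both sides of the comparison as datetime values (newer pandas versions may reject comparing a `datetime64[ns]` column with a plain `datetime.date`). This is an illustrative variant, not the code from the answer above: it uses four rows of the sample data, a fixed reference date of 2016-02-12 (an assumption, so the output is reproducible), `pd.DateOffset(years=1)` for a calendar year rather than 365 days, and `np.where` to write empty strings instead of NaN, as in the desired output from the question.

```python
import numpy as np
import pandas as pd

# Four rows copied from the sample data in the question.
df = pd.DataFrame(
    {
        "nm_emp_lst": ["MACKENZIE", "CAMPBELL", "LAPLANTE", "HENDERSON"],
        "actn_dt": pd.to_datetime(
            ["2015-03-22", "2015-04-19", "2014-11-30", "2014-10-19"]
        ),
    },
    index=[14483, 132902, 103482, 105359],
)

# Assumption: fixed reference date; use pd.Timestamp.now().normalize() for "today".
today = pd.Timestamp("2016-02-12")
cutoff = today - pd.DateOffset(years=1)  # one calendar year earlier

# Boolean mask: True where actn_dt is on or before the cutoff.
is_experienced = df["actn_dt"] <= cutoff

# Fill the new column from the mask; non-matching rows get an empty string.
df["Experienced"] = np.where(is_experienced, "Experienced", "")
print(df)
```

With `np.where` the column is filled in one step, so no helper column such as `year1` is needed and rows that do not match hold `""` rather than NaN.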