Python Dateutil: Unable to calculate age from two dates (relativedelta)
I'm trying to create a new column in a DataFrame that holds a person's age, computed with dateutil's relativedelta function, using the following code:
df['Age'] = relativedelta(df['Today'], df['DOB']).years
However, I get the following error:
ValueError Traceback (most recent call last)
<ipython-input-99-f87ca88a2e3c> in <module>()
1
----> 2 df['Years of Age2'] = relativedelta(df['Today'], df['DOB']).years
C:\anaconda3\lib\site-packages\dateutil\relativedelta.py in __init__(self, dt1, dt2, years, months, days, leapdays, weeks, hours, minutes, seconds, microseconds, year, month, day, weekday, yearday, nlyearday, hour, minute, second, microsecond)
101 "ambiguous and not currently supported.")
102
--> 103 if dt1 and dt2:
104 # datetime is a subclass of date. So both must be date
105 if not (isinstance(dt1, datetime.date) and
C:\anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
953 raise ValueError("The truth value of a {0} is ambiguous. "
954 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 955 .format(self.__class__.__name__))
956
957 __bool__ = __nonzero__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Outside the DataFrame it works, as follows:
DOB = datetime.date(1990,8,25)
Today = datetime.date.today()
relativedelta(Today, DOB).years
Out[2]: 29
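So I assume I'm doing something wrong when passing the data from the DataFrame to the function?
I can calculate the age a different way with the code below; I just don't understand why the first method doesn't work.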
df['Years of Age'] = np.round((df['Today'] - df['DOB'])/np.timedelta64(1,'Y'),decimals = 0)
Here is the starting code:
import pandas as pd
import numpy as np
import datetime
from dateutil.relativedelta import relativedelta
ind = 'Andy Brandy Cindy'
MyDict = {"DOB" : [ (datetime.date(1954,7,5)),
(datetime.date(1998,1,27)),
(datetime.date(2001,3,15)) ]}
df = pd.DataFrame(data=MyDict,index=ind.split())
df['Today'] = datetime.date.today()
df
               DOB       Today
Andy    1954-07-05  2019-08-30
Brandy  1998-01-27  2019-08-30
Cindy   2001-03-15  2019-08-30
And here is the calculation:
df['Age'] = relativedelta(df['Today'], df['DOB']).years
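I don't think relativedelta can accept pandas Series as arguments. The traceback shows that the failure happens at the line if dt1 and dt2: inside relativedelta.__init__, where dt1 is the Series df['Today'] in your code. Evaluating a Series in a boolean context makes pandas raise the ValueError, because the truth value of a Series is ambiguous; the code never reaches the isinstance check against datetime.date on the following lines. Outside the DataFrame it works, as you found, because you pass datetime objects directly rather than Series. So you can use apply to compute the difference between the two datetime objects row by row: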
df['Age'] = df.apply(lambda x: relativedelta(x['Today'], x['DOB']).years, axis=1)
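To see the failure in isolation, here is a minimal sketch that triggers the same ValueError without dateutil and then computes the ages one scalar at a time. It assumes a fixed reference date (2019-08-30, taken from the output in the question) instead of datetime.date.today() so the numbers are reproducible; the variable names are illustrative only.

```python
import datetime

import pandas as pd
from dateutil.relativedelta import relativedelta

dates = pd.Series([datetime.date(1954, 7, 5), datetime.date(1998, 1, 27)])

# `if dt1 and dt2:` asks for the truth value of each argument.
# For a multi-element Series, pandas refuses to answer.
try:
    bool(dates)
    truth_error = None
except ValueError as exc:
    truth_error = str(exc)

# Passing scalars works: each element is a plain datetime.date.
fixed_today = datetime.date(2019, 8, 30)  # assumed reference date
ages = [relativedelta(fixed_today, dob).years for dob in dates]

print(truth_error)
print(ages)  # [65, 21]
```

A list comprehension like this is an alternative to apply with axis=1; both call relativedelta once per row.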
That said, I think the workaround you found is faster, though it may be less precise than using relativedelta.
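On the precision point: dividing the day difference by an average year length and rounding gives the nearest whole year, not the number of completed years, so it can overstate an age by one. The sketch below compares that rounding with a vectorized count of completed years. It assumes an average year of 365.2425 days, datetime64 columns rather than datetime.date objects, and the fixed date 2019-08-30; age_in_years is a hypothetical helper, not part of pandas or dateutil. It treats a Feb 29 birthday as not yet reached on Feb 28 of a non-leap year, which may differ from what relativedelta returns.

```python
import pandas as pd

df = pd.DataFrame(
    {"DOB": ["1954-07-05", "1998-01-27", "2001-03-15"]},
    index=["Andy", "Brandy", "Cindy"],
)
df["DOB"] = pd.to_datetime(df["DOB"])
df["Today"] = pd.Timestamp("2019-08-30")  # assumed fixed reference date


def age_in_years(dob: pd.Series, today: pd.Series) -> pd.Series:
    """Completed years between dob and today (hypothetical helper)."""
    # True where this year's birthday has not happened yet.
    before_birthday = (today.dt.month < dob.dt.month) | (
        (today.dt.month == dob.dt.month) & (today.dt.day < dob.dt.day)
    )
    return today.dt.year - dob.dt.year - before_birthday.astype(int)


# Nearest-year rounding, assuming 365.2425 days per year.
rounded = ((df["Today"] - df["DOB"]).dt.days / 365.2425).round()
exact = age_in_years(df["DOB"], df["Today"])

print(rounded.tolist())  # [65.0, 22.0, 18.0]
print(exact.tolist())    # [65, 21, 18]
```

Brandy is about 21.6 years old on that date, so rounding reports 22 while the completed-years count reports 21.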