Replacing values with integers in a dataset column

   House Number  Street        First Name  Surname   Age  Relationship to Head of House  Marital Status  Gender  Occupation                         Infirmity  Religion
0  1             Smith Radial  Grace       Patel     46   Head                           Widowed         Female  Petroleum engineer                 None       Catholic
1  1             Smith Radial  Ian         Nixon     24   Lodger                         Single          Male    Publishing rights manager          None       Christian
2  2             Smith Radial  Frederick   Read      87   Head                           Divorced        Male    Retired TEFL teacher               None       Catholic
3  3             Smith Radial  Daniel      Adams     58   Head                           Divorced        Male    Therapist, music                   None       Catholic
4  3             Smith Radial  Matthew     Hall      13   Grandson                       NaN             Male    Student                            None       NaN
5  3             Smith Radial  Steven      Fletcher  9    Grandson                       NaN             Male    Student                            None       NaN
6  4             Smith Radial  Alison      Jenkins   38   Head                           Single          Female  Physiotherapist                    None       Catholic
7  4             Smith Radial  Kelly       Jenkins   12   Daughter                       NaN             Female  Student                            None       NaN
8  5             Smith Radial  Kim         Browne    69   Head                           Married         Female  Retired Estate manager/land agent  None       Christian
9  5             Smith Radial  Oliver      Browne    69   Husband                        Married         Male    Retired Merchandiser, retail       None       None

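Hello,

I have a dataset, which you can see above. When I try to convert Age to int, I get this error: ValueError: invalid literal for int() with base 10: '43.54302670766108'

This means the column contains float values stored as strings. I tried replacing '.' with '0' and then converting, but it failed. Could you help me?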

df['Age'] = df['Age'].replace('.','0')
df['Age'] = df['Age'].astype('int')

I still get the same error. I think the replace line is not doing anything. Do you know why?

Thanks

Try:

df['Age'] = df['Age'].replace(r'\..*$', '', regex=True).astype(int)

Or, more drastically, replace every value that is empty or contains a non-digit character with 0:

df['Age'] = df['Age'].replace(r'^(?:.*\D.*)?$', '0', regex=True).astype(int)
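
As for why the original replace line had no effect: without regex=True, Series.replace only matches whole cell values, and no cell is exactly '.'. Here is a minimal sketch of the difference, using a small made-up Series (the sample values are assumptions, not taken from the real dataset):

```python
import pandas as pd

# Hypothetical sample: ages stored as strings, one with a fractional part.
ages = pd.Series(["46", "24", "43.54302670766108"], dtype=object)

# Without regex=True, replace() compares against whole cell values.
# No cell is exactly ".", so the Series comes back unchanged.
unchanged = ages.replace(".", "0")

# With regex=True the pattern is applied inside each string:
# drop the decimal point and everything after it, then convert.
truncated = ages.replace(r"\..*$", "", regex=True).astype(int)

# The more drastic pattern: any value that is empty or contains
# a non-digit character becomes "0" as a whole.
zeroed = ages.replace(r"^(?:.*\D.*)?$", "0", regex=True).astype(int)

print(unchanged.tolist())
print(truncated.tolist())
print(zeroed.tolist())
```

Note that the first pattern truncates rather than rounds, and the second one discards the fractional value entirely.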

You don't need to manipulate the strings at all; you can convert the values to float first and then to int, like this:

df["Age"] = df["Age"].astype('float').astype('int')
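
If the column might also contain missing or unparseable entries, the float-then-int conversion will raise an error. One possible variant, sketched here on made-up data, uses pd.to_numeric with errors="coerce" and fills the gaps before converting. The placeholder 0 and the sample values are assumptions chosen for illustration only:

```python
import pandas as pd

# Hypothetical sample: string ages, one fractional, one unparseable.
df = pd.DataFrame({"Age": ["46", "43.54302670766108", "unknown"]})

# errors="coerce" turns values that cannot be parsed into NaN
# instead of raising ValueError.
age = pd.to_numeric(df["Age"], errors="coerce")

# NaN cannot be stored in a plain int column, so fill it first
# (0 is an arbitrary placeholder for this sketch).
# astype(int) then truncates the fractional part toward zero.
df["Age"] = age.fillna(0).astype(int)

print(df["Age"].tolist())
```

If rounding is preferred over truncation, calling .round() before .astype(int) should give that behaviour.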