Replacing values with integers in a dataset column
| | House Number | Street | First Name | Surname | Age | Relationship to Head of House | Marital Status | Gender | Occupation | Infirmity | Religion |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Smith Radial | Grace | Patel | 46 | Head | Widowed | Female | Petroleum engineer | None | Catholic |
| 1 | 1 | Smith Radial | Ian | Nixon | 24 | Lodger | Single | Male | Publishing rights manager | None | Christian |
| 2 | 2 | Smith Radial | Frederick | Read | 87 | Head | Divorced | Male | Retired TEFL teacher | None | Catholic |
| 3 | 3 | Smith Radial | Daniel | Adams | 58 | Head | Divorced | Male | Therapist, music | None | Catholic |
| 4 | 3 | Smith Radial | Matthew | Hall | 13 | Grandson | NaN | Male | Student | None | NaN |
| 5 | 3 | Smith Radial | Steven | Fletcher | 9 | Grandson | NaN | Male | Student | None | NaN |
| 6 | 4 | Smith Radial | Alison | Jenkins | 38 | Head | Single | Female | Physiotherapist | None | Catholic |
| 7 | 4 | Smith Radial | Kelly | Jenkins | 12 | Daughter | NaN | Female | Student | None | NaN |
| 8 | 5 | Smith Radial | Kim | Browne | 69 | Head | Married | Female | Retired Estate manager/land agent | None | Christian |
| 9 | 5 | Smith Radial | Oliver | Browne | 69 | Husband | Married | Male | Retired Merchandiser, retail | None | None |
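Hello,
I have the dataset shown above. When I try to convert Age to int, I get this error: ValueError: invalid literal for int() with base 10: '43.54302670766108'
This means the column contains float data. I tried replacing '.' with '0' and then converting, but that failed. Can you help me?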
df['Age'] = df['Age'].replace('.','0')
df['Age'] = df['Age'].astype('int')
I still get the same error. I think the replace line is not working. Do you know why?
Thanks
Try:
df['Age'] = df['Age'].replace(r'\..*$', '', regex=True).astype(int)
Or, more drastically:
df['Age'] = df['Age'].replace(r'^(?:.*\D.*)?$', '0', regex=True).astype(int)
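As for why the original replace line had no effect: without regex=True, Series.replace compares each whole cell value against '.', so a value such as '43.54302670766108' never matches. The sketch below illustrates this on a small made-up Series (the sample values and variable names are assumptions for illustration, not the asker's full data) and then applies the first regex from this answer.

```python
import pandas as pd

# Hypothetical sample: ages stored as strings, one with a fractional part.
ages = pd.Series(["46", "24", "43.54302670766108"])

# Without regex=True, Series.replace matches whole cell values only.
# No cell is exactly ".", so nothing is replaced.
unchanged = ages.replace(".", "0")
print(unchanged.tolist())  # ['46', '24', '43.54302670766108']

# With regex=True the pattern is searched inside each string;
# everything from the first literal dot to the end is removed.
truncated = ages.replace(r"\..*$", "", regex=True)

# The remaining strings are plain digits, so the int conversion succeeds.
as_int = truncated.astype(int)
print(as_int.tolist())  # [46, 24, 43]
```

Note that stripping the fractional part truncates rather than rounds, so '43.54...' becomes 43.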
You don't need to manipulate strings; you can convert the values to float first and then to int, like so:
df["Age"] = df["Age"].astype('float').astype('int')
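If the column might also contain missing or non-numeric entries, the float-then-int conversion above would raise. A possible variant, offered only as a sketch: it swaps in pd.to_numeric with errors="coerce" and the nullable Int64 dtype, neither of which appears in the answers here, and the sample frame is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample, including a missing value and a non-numeric entry.
df = pd.DataFrame({"Age": ["46", "43.54302670766108", None, "unknown"]})

# errors="coerce" turns anything that cannot be parsed as a number into NaN.
age_float = pd.to_numeric(df["Age"], errors="coerce")

# Truncate toward zero (the same effect as astype('int') on floats),
# then cast to the nullable Int64 dtype so NaN becomes <NA> instead of raising.
df["Age"] = np.trunc(age_float).astype("Int64")

print(df["Age"].tolist())
```

Whether unparseable ages should become missing values, zero, or an error depends on the analysis; coercing to missing is just one choice.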