Getting an error that I should not be getting
I'm trying to get a percentage by dividing the numbers in one column by the numbers in another column, but I keep getting the same error.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-34-60166e8a919c> in <module>()
6 dataLake = dataLake[['day','Agent','Resolved','Meta','Week','Year']]
7 #Creating new data (atingimento)
----> 8 dataLake["atingimento"] = ((dataLake.Resolved.astype(int) / dataLake.Meta.astype(int)) * 100)
9 dataLake['Resolved'] = dataLake.Resolved.astype(int)
10 dataLake['Meta'] = dataLake.Meta.astype(str)
4 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/dtypes/cast.py in astype_nansafe(arr, dtype, copy, skipna)
972 # work around NumPy brokenness, #1987
973 if np.issubdtype(dtype.type, np.integer):
--> 974 return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
975
976 # if we have a datetime/timedelta array of objects
pandas/_libs/lib.pyx in pandas._libs.lib.astype_intsafe()
ValueError: invalid literal for int() with base 10: ''
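I tried using .astype(int) to convert both columns to int, but it doesn't work. As you can see from the dataset below, Google Colab reads the 'Meta' column as a string even though it is formatted the same way as the 'Resolved' column.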
    | day        | Agent              | Resolved | Meta       | Week | Year
----|------------|--------------------|----------|------------|------|-----
103 | 2021-01-26 | Ana Carolina B.    | 107      | 2525252525 | 4    | 2021
104 | 2021-01-25 | Bárbara D.         | 275      | 3831252128 | 4    | 2021
105 | 2021-01-25 | Danielly           | 192      | 3831252128 | 4    | 2021
106 | 2021-01-26 | Felipe Pereira     | 102      | 3125212822 | 4    | 2021
107 | 2021-01-26 | Fernanda Favalessa | 207      | 3125212822 | 4    | 2021
108 | 2021-01-25 | Guto R.            | 215      | 3831252114 | 4    | 2021
109 | 2021-01-25 | Helaine S.         | 253      | 3831252114 | 4    | 2021
110 | 2021-01-25 | João M.            | 145      | 38252128   | 4    | 2021
111 | 2021-01-25 | João P.            | 173      | 3535353535 | 4    | 2021
112 | 2021-01-26 | Livia Azeredo      | 89       | 3125212822 | 4    | 2021
113 | 2021-01-26 | Lucas Alves        | 70       | 1815101320 | 4    | 2021
114 | 2021-01-25 | Paula P.           | 137      | 3831252114 | 4    | 2021
You may want to use pandas.to_numeric to convert invalid data to NaN (followed by fillna with a default value, if needed):
Instead of:
dataLake.Resolved.astype(int)
Use:
pd.to_numeric(dataLake['Resolved'], errors='coerce')
# or
pd.to_numeric(dataLake['Resolved'], errors='coerce').fillna(-1) # -1 if invalid
And so on for all the other occurrences.
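Applied to the division from the question, that could look roughly like the sketch below. The sample frame is made up for illustration (only the column names come from the question), and the empty string stands in for whatever value made astype(int) fail.

```python
import pandas as pd

# Hypothetical sample data; the empty string mimics the value that broke astype(int).
dataLake = pd.DataFrame({
    "Resolved": ["107", "275", "192"],
    "Meta": ["25", "", "38"],
})

# Coerce both columns; anything unparseable becomes NaN instead of raising.
resolved = pd.to_numeric(dataLake["Resolved"], errors="coerce")
meta = pd.to_numeric(dataLake["Meta"], errors="coerce")

# NaN propagates through the division, so rows with a missing Meta stay NaN.
dataLake["atingimento"] = resolved / meta * 100
```

Leaving the missing rows as NaN is probably safer here than fillna(-1), since a -1 in Meta would turn into a negative percentage.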
Example:
pd.to_numeric(pd.Series(['1', ' 12 ', '']), errors='coerce')
Output:
0 1.0
1 12.0
2 NaN
dtype: float64
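To see which entries are actually causing the problem, one option is to compare the column before and after coercion. This is only a sketch on a hypothetical sample column; in the real frame you would use dataLake['Meta'] instead.

```python
import pandas as pd

# Hypothetical sample column with one empty and one non-numeric entry.
meta = pd.Series(["25", "", "38", "n/a"], name="Meta")

converted = pd.to_numeric(meta, errors="coerce")

# Rows that held a value originally but became NaN after coercion.
bad_rows = meta[converted.isna() & meta.notna()]
print(bad_rows)
```

The index of bad_rows points at the rows worth inspecting in the source data.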