Getting an error that I should not be getting

Question

I am trying to get a percentage by dividing the numbers in one column by the numbers in another column, but I keep getting the same error.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-34-60166e8a919c> in <module>()
      6 dataLake = dataLake[['day','Agent','Resolved','Meta','Week','Year']]
      7 #Creating new data (atingimento)
----> 8 dataLake["atingimento"] = ((dataLake.Resolved.astype(int) / dataLake.Meta.astype(int)) * 100)
      9 dataLake['Resolved'] = dataLake.Resolved.astype(int)
     10 dataLake['Meta'] = dataLake.Meta.astype(str)

4 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/dtypes/cast.py in astype_nansafe(arr, dtype, copy, skipna)
    972         # work around NumPy brokenness, #1987
    973         if np.issubdtype(dtype.type, np.integer):
--> 974             return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
    975 
    976         # if we have a datetime/timedelta array of objects

pandas/_libs/lib.pyx in pandas._libs.lib.astype_intsafe()

ValueError: invalid literal for int() with base 10: ''

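I tried using .astype(int) to convert both columns to int, but it did not work. As you can see from the dataset below, Google Colab reads the 'Meta' column as a string even though it has the same format as the 'Resolved' column.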

            day | Agent              | Resolved | Meta       | Week | Year
---------------------------------------------------------------------------
103  2021-01-26 | Ana Carolina B.    | 107      | 2525252525 | 4    | 2021
104  2021-01-25 | Bárbara D.         | 275      | 3831252128 | 4    | 2021
105  2021-01-25 | Danielly           | 192      | 3831252128 | 4    | 2021
106  2021-01-26 | Felipe Pereira     | 102      | 3125212822 | 4    | 2021
107  2021-01-26 | Fernanda Favalessa | 207      | 3125212822 | 4    | 2021
108  2021-01-25 | Guto R.            | 215      | 3831252114 | 4    | 2021
109  2021-01-25 | Helaine S.         | 253      | 3831252114 | 4    | 2021
110  2021-01-25 | João M.            | 145      | 38252128   | 4    | 2021
111  2021-01-25 | João P.            | 173      | 3535353535 | 4    | 2021
112  2021-01-26 | Livia Azeredo      | 89       | 3125212822 | 4    | 2021
113  2021-01-26 | Lucas Alves        | 70       | 1815101320 | 4    | 2021
114  2021-01-25 | Paula P.           | 137      | 3831252114 | 4    | 2021

Answer 1

You may want to use pandas.to_numeric to convert invalid data to NaN (and then fillna with a default value, if needed):

Instead of:

dataLake.Resolved.astype(int)

Use:

pd.to_numeric(dataLake['Resolved'], errors='coerce')
# or
pd.to_numeric(dataLake['Resolved'], errors='coerce').fillna(-1) # -1 if invalid

And so on for every other place where the columns are converted.
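
Applied to the failing line from the question, this could look like the sketch below. The small frame here is a hypothetical stand-in for dataLake (the values are made up, with one empty string in Meta to reproduce the problem); only the two to_numeric calls and the division mirror the original code.

```python
import pandas as pd

# Hypothetical stand-in for the asker's dataLake; the '' in Meta is what
# makes .astype(int) raise "invalid literal for int() with base 10: ''".
dataLake = pd.DataFrame({
    "Resolved": ["107", "275", " 192 "],
    "Meta": ["25", "", "38"],
})

# Convert both columns; anything that is not a number becomes NaN.
resolved = pd.to_numeric(dataLake["Resolved"], errors="coerce")
meta = pd.to_numeric(dataLake["Meta"], errors="coerce")

# Rows with an invalid Meta end up with NaN instead of raising an error.
dataLake["atingimento"] = resolved / meta * 100
print(dataLake)
```

Leaving the invalid rows as NaN is usually safer than fillna(-1) for a percentage, since a placeholder divisor would produce a number that looks real but is not.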

Example:

pd.to_numeric(pd.Series(['1', '   12  ', '']), errors='coerce')

Output:

0     1.0
1    12.0
2     NaN
dtype: float64
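
To see which rows actually contain the values that cannot be parsed, one option is to compare the column before and after coercion. This is a minimal sketch on a made-up Series; in the question the same check would be run on dataLake['Meta'].

```python
import pandas as pd

# Hypothetical sample; '' and 'n/a' stand in for whatever is breaking the cast.
meta = pd.Series(["25", "", "38", "n/a"])

converted = pd.to_numeric(meta, errors="coerce")

# Rows that were present in the raw data but could not be parsed as numbers.
bad = meta[converted.isna() & meta.notna()]
print(bad)
```

The index of bad points at the offending rows, which helps decide whether to drop them, fix the source data, or fill in a default.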

python

types