Getting a strange error while calculating z-score

I want to calculate the z-score for my whole dataset. I tried two versions of the code, but unfortunately both give me the same error. My first attempt:

from scipy import stats
zee = stats.zscore(df)
print(zee)

My second attempt:

from scipy import stats
import numpy as np
z = np.abs(stats.zscore(df))
print(z)

I am using Jupyter.

The error I get:

-----
TypeError                                 Traceback (most recent call last)
<ipython-input-23-ef429aebacfd> in <module>
      1 from scipy import stats
      2 import numpy as np
----> 3 z = np.abs(stats.zscore(df))
      4 print(z)

~/.local/lib/python3.8/site-packages/scipy/stats/stats.py in zscore(a, axis, ddof, nan_policy)
   2495         sstd = np.nanstd(a=a, axis=axis, ddof=ddof, keepdims=True)
   2496     else:
-> 2497         mns = a.mean(axis=axis, keepdims=True)
   2498         sstd = a.std(axis=axis, ddof=ddof, keepdims=True)
   2499 

~/.local/lib/python3.8/site-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims)
    160     ret = umr_sum(arr, axis, dtype, out, keepdims)
    161     if isinstance(ret, mu.ndarray):
--> 162         ret = um.true_divide(
    163                 ret, rcount, out=ret, casting='unsafe', subok=False)
    164         if is_float16_result and out is None:

TypeError: unsupported operand type(s) for /: 'str' and 'int' 

Here is the info for my DataFrame, in case the problem is with the DataFrame itself.

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 14 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Region          100 non-null    object 
 1   Country         100 non-null    object 
 2   Item Type       100 non-null    object 
 3   Sales Channel   100 non-null    object 
 4   Order Priority  100 non-null    object 
 5   Order Date      100 non-null    object 
 6   Order ID        100 non-null    int64  
 7   Ship Date       100 non-null    object 
 8   Units Sold      100 non-null    int64  
 9   Unit Price      100 non-null    float64
 10  Unit Cost       100 non-null    float64
 11  Total Revenue   100 non-null    float64
 12  Total Cost      100 non-null    float64
 13  Total Profit    100 non-null    float64
dtypes: float64(5), int64(2), object(7)
memory usage: 11.1+ KB

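Thanks in advance.

Your df contains columns that are not float/int: the object columns hold strings, which is why the mean fails with a str / int division. Try passing only the int/float columns to zscore.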

stats.zscore(df[['Unit Cost', 'Total Revenue', 'Total Cost', 'Total Profit']])
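If you would rather not list the columns by hand, one possible approach is to let pandas pick the numeric columns with select_dtypes. The sketch below uses a small made-up DataFrame as a stand-in for yours (the values are hypothetical), and it assumes you want to leave out Order ID because it is an identifier rather than a measurement.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical stand-in for the asker's DataFrame (values are made up).
df = pd.DataFrame({
    "Region": ["Asia", "Europe", "Asia", "Africa"],
    "Order ID": [101, 102, 103, 104],
    "Units Sold": [10, 20, 30, 40],
    "Unit Price": [1.5, 2.5, 3.5, 4.5],
})

# Keep only int/float columns; object (string) columns are dropped.
numeric = df.select_dtypes(include="number")

# Assumption: Order ID is an identifier, so a z-score for it is not meaningful.
numeric = numeric.drop(columns=["Order ID"])

# Compute column-wise z-scores on the raw array and rebuild a labeled DataFrame.
z = pd.DataFrame(
    stats.zscore(numeric.to_numpy(), axis=0),
    columns=numeric.columns,
    index=numeric.index,
)
print(z)
```

Note that the date columns are still stored as object here, so they are skipped as well; they would need pd.to_datetime before you could do anything numeric with them.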