计算 z-score 时出现奇怪的错误
getting strange error while calculating z-score
我想计算整个数据集的 z 分数。我尝试了两种类型的代码,但不幸的是它们都给了我同样的错误。
我的第一个代码在这里:
zee=stats.zscore(df)
print(zee)
我的 2 代码是:
from scipy import stats
import numpy as np
z = np.abs(stats.zscore(df))
print(z)
我正在使用 jupyter
我遇到的错误:
-----
TypeError Traceback (most recent call last)
<ipython-input-23-ef429aebacfd> in <module>
1 from scipy import stats
2 import numpy as np
----> 3 z = np.abs(stats.zscore(df))
4 print(z)
~/.local/lib/python3.8/site-packages/scipy/stats/stats.py in zscore(a, axis, ddof, nan_policy)
2495 sstd = np.nanstd(a=a, axis=axis, ddof=ddof, keepdims=True)
2496 else:
-> 2497 mns = a.mean(axis=axis, keepdims=True)
2498 sstd = a.std(axis=axis, ddof=ddof, keepdims=True)
2499
~/.local/lib/python3.8/site-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims)
160 ret = umr_sum(arr, axis, dtype, out, keepdims)
161 if isinstance(ret, mu.ndarray):
--> 162 ret = um.true_divide(
163 ret, rcount, out=ret, casting='unsafe', subok=False)
164 if is_float16_result and out is None:
TypeError: unsupported operand type(s) for /: 'str' and 'int'
这里是我的数据框的信息,如果我的数据农场有问题的话。
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 14 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Region 100 non-null object
1 Country 100 non-null object
2 Item Type 100 non-null object
3 Sales Channel 100 non-null object
4 Order Priority 100 non-null object
5 Order Date 100 non-null object
6 Order ID 100 non-null int64
7 Ship Date 100 non-null object
8 Units Sold 100 non-null int64
9 Unit Price 100 non-null float64
10 Unit Cost 100 non-null float64
11 Total Revenue 100 non-null float64
12 Total Cost 100 non-null float64
13 Total Profit 100 non-null float64
dtypes: float64(5), int64(2), object(7)
memory usage: 11.1+ KB
提前致谢。
您的 df 包含非 float/int
值,请尝试仅向您的 zscore 函数发送 int/float 列。
stats.zscore(df[['Unit Cost', 'Total Revenue', 'Total Cost', 'Total Profit']])
我想计算整个数据集的 z 分数。我尝试了两种类型的代码,但不幸的是它们都给了我同样的错误。 我的第一个代码在这里:
zee=stats.zscore(df)
print(zee)
我的 2 代码是:
from scipy import stats
import numpy as np
z = np.abs(stats.zscore(df))
print(z)
我正在使用 jupyter
我遇到的错误:
-----
TypeError Traceback (most recent call last)
<ipython-input-23-ef429aebacfd> in <module>
1 from scipy import stats
2 import numpy as np
----> 3 z = np.abs(stats.zscore(df))
4 print(z)
~/.local/lib/python3.8/site-packages/scipy/stats/stats.py in zscore(a, axis, ddof, nan_policy)
2495 sstd = np.nanstd(a=a, axis=axis, ddof=ddof, keepdims=True)
2496 else:
-> 2497 mns = a.mean(axis=axis, keepdims=True)
2498 sstd = a.std(axis=axis, ddof=ddof, keepdims=True)
2499
~/.local/lib/python3.8/site-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims)
160 ret = umr_sum(arr, axis, dtype, out, keepdims)
161 if isinstance(ret, mu.ndarray):
--> 162 ret = um.true_divide(
163 ret, rcount, out=ret, casting='unsafe', subok=False)
164 if is_float16_result and out is None:
TypeError: unsupported operand type(s) for /: 'str' and 'int'
这里是我的数据框的信息,如果我的数据农场有问题的话。
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 14 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Region 100 non-null object
1 Country 100 non-null object
2 Item Type 100 non-null object
3 Sales Channel 100 non-null object
4 Order Priority 100 non-null object
5 Order Date 100 non-null object
6 Order ID 100 non-null int64
7 Ship Date 100 non-null object
8 Units Sold 100 non-null int64
9 Unit Price 100 non-null float64
10 Unit Cost 100 non-null float64
11 Total Revenue 100 non-null float64
12 Total Cost 100 non-null float64
13 Total Profit 100 non-null float64
dtypes: float64(5), int64(2), object(7)
memory usage: 11.1+ KB
提前致谢。
您的 df 包含非 float/int
值,请尝试仅向您的 zscore 函数发送 int/float 列。
stats.zscore(df[['Unit Cost', 'Total Revenue', 'Total Cost', 'Total Profit']])