ValueError in scipy ttest_ind
I have the following CSV file:
SRA ID ERR169499 ERR169498 ERR169497
Label 1 0 1
TaxID PRJEB3251_ERR169499 PRJEB3251_ERR169499 PRJEB3251_ERR169499
333046 0.05 0.99 99.61
1049 0.03 2.34 34.33
337090 0.01 9.78 23.22
99007 22.33 2.90 0.00
I have 92 columns for cases (label 0) and 95 columns for controls (label 1). I need to perform a two-sample independent t-test and a rank-sum test. So far I have:
df = pd.read_csv('final_out_transposed.csv', header=[1,2], index_col=[0])
case = df.xs('0', axis=1, level=0).dropna()
ctrl = df.xs('1', axis=1, level=0).dropna()
(tt_val, p_ttest) = ttest_ind(case, ctrl, equal_var=False)
This gives me the error: ValueError: operands could not be broadcast together with shapes (92,) (95,)
The traceback is:
  File "<ipython-input-152-d58634e75106>", line 1, in <module>
    runfile('C:/IBD Bioproject/New folder/temp_3251.py', wdir='C:/IBD Bioproject/New folder')
  File "C:\Users\ksingh1\AppData\Local\Continuum\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile
    execfile(filename, namespace)
  File "C:\Users\ksingh1\AppData\Local\Continuum\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/IBD Bioproject/New folder/temp_3251.py", line 106, in <module>
    tt_val, p_ttest = ttest_ind(case, ctrl, equal_var=False)
  File "C:\Users\ksingh1\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\stats\stats.py", line 4068, in ttest_ind
    df, denom = _unequal_var_ttest_denom(v1, n1, v2, n2)
  File "C:\Users\ksingh1\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\stats\stats.py", line 3872, in _unequal_var_ttest_denom
    df = (vn1 + vn2)**2 / (vn1**2 / (n1 - 1) + vn2**2 / (n2 - 1))
ValueError: operands could not be broadcast together with shapes (92,) (95,)
I have read a few posts, but it is still unclear to me. I also went through the NumPy broadcasting documentation.
Thanks in advance.
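Apparently the xs method of a Pandas DataFrame creates objects that look like two-dimensional arrays. These must be flattened to look like one-dimensional arrays when passed to ttest_ind.
Try this: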
ttest_ind(case.values.ravel(), ctrl.values.ravel(), equal_var=False)
The values attribute of a Pandas object gives a NumPy array, and the ravel() method flattens that array to one dimension.
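To illustrate why the shapes (92,) and (95,) show up, here is a minimal sketch using synthetic random data in place of the real CSV. The number of rows (n_taxa = 4) and the random values are assumptions made purely for illustration. By default ttest_ind works along axis=0, so it produces one variance per column for each group, and arrays of length 92 and 95 cannot be combined. The sketch shows the flattening fix from above and, as a possible alternative if one test per taxon (row) is what is actually wanted, a call with axis=1. That second reading of the question is a guess, not something the question states.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind

# Synthetic stand-ins for the real data (assumption: 4 taxa as rows,
# 92 case samples and 95 control samples as columns).
rng = np.random.default_rng(0)
n_taxa = 4
case = pd.DataFrame(rng.random((n_taxa, 92)))
ctrl = pd.DataFrame(rng.random((n_taxa, 95)))

# Both groups are two-dimensional; with the default axis=0, ttest_ind
# would compare column by column, and 92 columns do not match 95.
print(case.shape, ctrl.shape)

# Fix from the answer: flatten each group into a single 1-D sample.
# This pools every value of every taxon into one overall test.
t_all, p_all = ttest_ind(case.values.ravel(), ctrl.values.ravel(),
                         equal_var=False)

# Possible alternative: one Welch t-test per row (taxon), comparing
# the 92 case values with the 95 control values of that row.
t_row, p_row = ttest_ind(case.values, ctrl.values, axis=1,
                         equal_var=False)
print(t_row.shape, p_row.shape)
```

Flattening gives a single statistic and p-value for all values pooled together, while axis=1 gives one statistic and p-value per row; which of the two fits depends on the hypothesis being tested.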