Why does SciPy return `nan` for a t-test with samples with 0 variance?

Question

I'm using SciPy in Python, and the following returns a nan value for whatever reason:

>>> from scipy import stats
>>> stats.ttest_ind([1, 1], [1, 1])
Ttest_indResult(statistic=nan, pvalue=nan)

>>> stats.ttest_ind([1, 1], [1, 1, 1])
Ttest_indResult(statistic=nan, pvalue=nan)

But whenever I use samples with different summary statistics, I actually get a reasonable value:

>>> stats.ttest_ind([1, 1], [1, 1, 1, 2])
Ttest_indResult(statistic=-0.66666666666666663, pvalue=0.54146973927558495)

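Is it reasonable to interpret a p-value of nan as 0? Is there a statistical reason why it doesn't make sense to run a 2-sample t-test on samples with the same summary statistics?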

Answer 1

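Dividing by zero either raises an exception or returns a floating-point value that, by convention, represents NaN (= not a number). In your first two examples both samples have zero variance, so the standard error in the denominator of the t statistic is 0, and the difference in means in the numerator is 0 as well: the statistic is 0/0. Pay particular attention to whether the standard deviation formula divides by N or by N minus one.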
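To make the 0/0 concrete, here is a minimal sketch of the pooled-variance two-sample t statistic, written with the standard library only. The function name `pooled_t_statistic` is hypothetical and this is not SciPy's actual implementation; it only assumes the textbook Student's t formula with equal variances, using the divide-by-N-minus-one sample variance.

```python
import math
import statistics

def pooled_t_statistic(a, b):
    """Hand-rolled pooled-variance t statistic (illustrative, not SciPy's code)."""
    n1, n2 = len(a), len(b)
    # statistics.variance divides by N - 1; statistics.pvariance divides by N.
    v1, v2 = statistics.variance(a), statistics.variance(b)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = statistics.mean(a) - statistics.mean(b)
    try:
        return diff / std_err
    except ZeroDivisionError:
        # Plain Python raises on division by zero; return NaN here to
        # mirror the floating-point convention described above.
        return math.nan

t_equal = pooled_t_statistic([1, 1], [1, 1, 1])    # zero variance in both samples
t_diff = pooled_t_statistic([1, 1], [1, 1, 1, 2])  # second sample has some spread

print(t_equal)  # nan
print(t_diff)   # approximately -0.667, matching the SciPy statistic above
```

In the first call both the numerator and the standard error are 0, so no finite statistic exists, and a p-value cannot be derived from it either.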

python

nan

scipy

hypothesis-test