DataFrame boxplot in Python displays incorrect whiskers

In this simple example, the boxplot shows the wrong minimum and maximum values.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([1, 2, 3, 4, 5]),
                  columns=['a'])
df.boxplot()

Result:

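According to the usual formulas (Q3 + 1.5 * IQR and Q1 - 1.5 * IQR), the whiskers should be at 7 and -1, but as the plot shows, they are at 5 and 1. It looks as if the formula uses 0.5 instead of 1.5. How can I change it back to the standard?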

Q1 = df['a'].quantile(0.25)
Q2 = df['a'].quantile(0.50)
Q3 = df['a'].quantile(0.75)

print(Q1, Q2, Q3)
IQR = Q3 - Q1
MaxO = Q3 + 1.5 * IQR
MinO = Q1 - 1.5 * IQR
print("IQR:", IQR, "Max:", MaxO, "Min:", MinO)

Result:

2.0 3.0 4.0
IQR: 2.0 Max: 7.0 Min: -1.0

(Q1, Q2, Q3 and IQR are correct, but Min and Max are not.)

Source

From above the upper quartile, a distance of 1.5 times the IQR is measured out and a whisker is drawn up to the largest observed point from the dataset that falls within this distance. Similarly, a distance of 1.5 times the IQR is measured out below the lower quartile and a whisker is drawn down to the lowest observed point from the dataset that falls within this distance. All other observed points are plotted as outliers.
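
The rule quoted above can be checked numerically. The following is a minimal sketch, not the plotting library's actual code: the helper name `whisker_ends` is hypothetical, and it assumes quartiles are computed with linear interpolation (NumPy's default), which gives the same Q1 and Q3 as in the question. It shows that 7 and -1 are only the limits of the search range, while the whiskers themselves stop at the most extreme observed points inside that range.

```python
import numpy as np

def whisker_ends(values, whis=1.5):
    """Hypothetical helper: return (low, high) whisker ends, i.e. the most
    extreme observed points within whis * IQR of the quartiles."""
    data = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1
    lower_fence = q1 - whis * iqr   # -1 for [1, 2, 3, 4, 5]
    upper_fence = q3 + whis * iqr   #  7 for [1, 2, 3, 4, 5]
    # Whiskers end at real data points, not at the fences themselves.
    inside = data[(data >= lower_fence) & (data <= upper_fence)]
    return float(inside.min()), float(inside.max())

# No observation lies beyond 1 or 5, so the whiskers stop there.
print(whisker_ends([1, 2, 3, 4, 5]))
# 20 falls outside the upper fence, so it would be drawn as an outlier.
print(whisker_ends([1, 2, 3, 4, 5, 20]))
```

In other words, the plot in the question is consistent with the 1.5 * IQR rule. matplotlib's `boxplot` does accept a `whis` argument (1.5 by default), and `df.boxplot` forwards extra keyword arguments to it, but changing `whis` only moves the fences; with this data the whiskers would still end at 1 and 5.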