DataFrame boxplot in Python displays incorrect whiskers
In this simple example, the boxplot appears to show the wrong minimum and maximum values.
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([1, 2, 3, 4, 5]), columns=['a'])
df.boxplot()
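Result: (plot not reproduced here; the whiskers are drawn at 1 and 5)
According to the usual formula (Q3 + 1.5 * IQR and Q1 - 1.5 * IQR), the whiskers should be at 7 and -1, but in the plot they are at 5 and 1. It looks as if the formula uses 0.5 instead of 1.5. How can I change it back to the standard?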
Q1 = df['a'].quantile(0.25)
Q2 = df['a'].quantile(0.50)
Q3 = df['a'].quantile(0.75)
print(Q1, Q2, Q3)
IQR = Q3 - Q1
MaxO = Q3 + 1.5 * IQR
MinO = Q1 - 1.5 * IQR
print("IQR:", IQR, "Max:", MaxO, "Min:", MinO)
Result:
2.0 3.0 4.0
IQR: 2.0 Max: 7.0 Min: -1.0
(Q1, Q2, Q3 and IQR match the plot, but Min and Max do not)
From above the upper quartile, a distance of 1.5 times the IQR is measured out and a whisker is drawn up to the largest observed point from the dataset that falls within this distance. Similarly, a distance of 1.5 times the IQR is measured out below the lower quartile and a whisker is drawn down to the lowest observed point from the dataset that falls within this distance. All other observed points are plotted as outliers.
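To make the rule above concrete, here is a minimal sketch of how the whisker ends could be computed by hand. The helper name `whisker_ends` and its `whis` argument are made up for this illustration and are not part of pandas. It assumes linearly interpolated quartiles (the NumPy default), which give the same Q1 and Q3 as in the question.

```python
import numpy as np

def whisker_ends(values, whis=1.5):
    """Return (low, high) whisker ends under the rule quoted above.

    Hypothetical helper for illustration only, not a pandas function.
    """
    data = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1
    # The "fences" are the limits Q1 - whis * IQR and Q3 + whis * IQR
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr
    # Each whisker stops at the most extreme observation inside its fence,
    # not at the fence itself
    low = data[data >= lower_fence].min()
    high = data[data <= upper_fence].max()
    return float(low), float(high)

# Fences are -1 and 7, but the data only span 1 to 5
print(whisker_ends([1, 2, 3, 4, 5]))      # (1.0, 5.0)
# 20 lies beyond the upper fence, so it would be drawn as an outlier
print(whisker_ends([1, 2, 3, 4, 5, 20]))  # (1.0, 5.0)
```

So for the data in the question the plot already follows the 1.5 * IQR rule: 7 and -1 are only the limits, and the whiskers end at 5 and 1 because those are the most extreme observations within them. As far as I know, the multiplier is matplotlib's `whis` argument (default 1.5), which `df.boxplot()` should pass through as a keyword argument if a different value is wanted.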