Python plot 1D array

Question

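I want to plot the difference at each point.

I have a Series y_test that is one-dimensional and contains continuous values. Its index is a bit wonky (7618, 276, 7045, 6095, 2296, 7191, 1213, 2408...).

I also have a numpy array ypred that is one-dimensional and contains the predictions for y_test. I want to see, in a chart, how far off each predicted value is.

I tried this: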

fig, ax1 = plt.subplots(figsize = (20,5))
ax1.bar(y_test, y_test.index, color = 'tab:orange')
ax1.set_ylabel('Actual',color = 'tab:orange')
ax2 = ax1.twinx()
ax2.bar(y_pred, y_test.index, color = 'tab:blue')
ax2.set_ylabel('Predicted',color = 'tab:blue')
plt.title('XGBoost Regression Performance')
fig.tight_layout()
plt.show()

But it returns an error:

ValueError: shape mismatch: objects cannot be broadcast to a single shape

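bar/scatter/anything is fine, I just want to look at all the values together.

That way I can group the best-predicted values and learn which feature values in my original data are easiest to predict.

By the way, if anyone can recommend the best XGBoost way to get that information, please let me know as well.

Here is some data: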

ypred: 
[10.410029 ,   4.4897604,  29.77089  ,  23.548471 ,  27.415161 ,
        56.28772  ,  13.083108 ,  38.086662 ,  19.128792 ,  42.49037  ,
        65.15919  ,  47.172436 ,  39.517883 ,  13.782948 , 121.52351  ,
         8.388838 ,  49.625607 ,  24.28464  ,  49.55232  ,  34.797436] 

y_test:
7618      9.88
276       2.69
7045     26.93
6095     23.49
2296     24.79
7191     57.09
1213     15.90
2408     46.26
5961     18.60
275      41.03
1707     66.25
2333     53.50
5717     40.60
1497     12.34
4937    121.93
2654      7.97
7442     53.65
7157     25.93
2141     54.28
4339     36.93

Thanks

Answer 1

plt.scatter(y_test, y_pred)?

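Many points lying close to the identity line (the diagonal) means the predictions are good; points far from it mean they are less good.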
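To make the diagonal visible, the scatter can be drawn together with a y = x reference line. The following is only a sketch: it uses the first five values of the sample data from the question, and the helper name plot_actual_vs_predicted is made up for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# First five values of the sample data posted in the question
y_test = pd.Series([9.88, 2.69, 26.93, 23.49, 24.79],
                   index=[7618, 276, 7045, 6095, 2296])
y_pred = np.array([10.410029, 4.4897604, 29.77089, 23.548471, 27.415161])

def plot_actual_vs_predicted(actual, predicted):
    """Scatter actual against predicted values and draw the y = x line.

    Hypothetical helper, not part of matplotlib or XGBoost.
    """
    fig, ax = plt.subplots(figsize=(6, 6))
    ax.scatter(actual, predicted, color='tab:blue')
    # Span the reference line over the full range of both arrays
    lo = min(actual.min(), predicted.min())
    hi = max(actual.max(), predicted.max())
    ax.plot([lo, hi], [lo, hi], color='tab:orange', linestyle='--', label='y = x')
    ax.set_xlabel('Actual')
    ax.set_ylabel('Predicted')
    ax.legend()
    return fig, ax

fig, ax = plot_actual_vs_predicted(y_test, y_pred)
```

Call plt.show() afterwards to display the figure. Because both axes share the same units, the vertical distance of a point from the dashed line is that sample's prediction error.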

Answer 2

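I'm assuming y_test has a 'val' column that stores the values you want to plot.
Maybe this helps?
It puts the index on the x-axis and both the predicted and actual values on the y-axis.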

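import matplotlib.pyplot as plt

fig, ax1 = plt.subplots(figsize = (20,5))

# Actual values on the left y-axis (assumes y_test is a DataFrame with a 'val' column)
ax1.plot(y_test.index, y_test['val'], color = 'tab:orange')
ax1.set_ylabel('Actual',color = 'tab:orange')
# Predicted values on a second y-axis that shares the same x-axis
ax2 = ax1.twinx()
ax2.plot(y_test.index, y_pred, color = 'tab:blue')
ax2.set_ylabel('Predicted',color = 'tab:blue')
plt.title('XGBoost Regression Performance')
fig.tight_layout()

plt.show()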
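If the goal is to see the difference itself and to group the best-predicted rows, one option is to compute the per-sample error and sort by its absolute value. This is a rough sketch under a few assumptions: y_test is a plain Series (as in the question, not a DataFrame with a 'val' column), only the first five sample values are used, and prediction_errors is a hypothetical helper name. Note that ax.bar takes the x positions first and the bar heights second; the original index labels are converted to strings so they are treated as categories rather than as numeric positions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# First five values of the sample data posted in the question
y_test = pd.Series([9.88, 2.69, 26.93, 23.49, 24.79],
                   index=[7618, 276, 7045, 6095, 2296])
y_pred = np.array([10.410029, 4.4897604, 29.77089, 23.548471, 27.415161])

def prediction_errors(actual, predicted):
    """Return actual, predicted and error per row, best predictions first.

    Hypothetical helper; predicted is matched to actual by position.
    """
    df = pd.DataFrame({'actual': actual,
                       'predicted': np.asarray(predicted)},
                      index=actual.index)
    df['error'] = df['predicted'] - df['actual']
    df['abs_error'] = df['error'].abs()
    return df.sort_values('abs_error')

errors = prediction_errors(y_test, y_pred)

fig, ax = plt.subplots(figsize=(8, 4))
# x = original row labels (as categories), height = signed error
ax.bar(errors.index.astype(str), errors['error'], color='tab:blue')
ax.axhline(0, color='black', linewidth=0.8)
ax.set_xlabel('Row label in y_test')
ax.set_ylabel('Predicted - Actual')
ax.set_title('Prediction error per sample')
fig.tight_layout()
```

Call plt.show() to display the chart. Since errors keeps the original index, the best-predicted rows (for example errors.head(10).index) can be used to look up the matching feature rows in the original data.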
