python - matplotlib subplot grid: where to insert row/column arguments
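I am trying to display the topic extraction results of an LDA text analysis across several datasets as matplotlib subplots.
Here is where I am at:
I think my problem is my unfamiliarity with matplotlib. I have done all the number crunching ahead of time so that I can focus on how to plot the data: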
top_words_master = []
top_weights_master = []
for i in range(len(tf_list)):
    tf = tf_vectorizer.fit_transform(tf_list[i])
    lda.fit(tf)
    n_top_words = 20
    tf_feature_names = tf_vectorizer.get_feature_names_out()
    top_features_ind = lda.components_[0].argsort()[: -n_top_words - 1 : -1]
    top_features = [tf_feature_names[i] for i in top_features_ind]
    weights = lda.components_[0][top_features_ind]
    top_words_master.append(top_features)
    top_weights_master.append(weights)
This gives me my words and weights (the x-axis values) for building a matrix of subplots of row/bar charts.
My attempt at constructing this with matplotlib:
fig, axes = plt.subplots(2, 5, figsize=(30, 15), sharex=True)
plt.subplots_adjust(hspace=0.5)
fig.suptitle("Topics in LDA Model", fontsize=18, y=0.95)
axes = axes.flatten()
for i in range(len(tf_list)):
    ax = axes[i]
    ax.barh(top_words_master[i], top_weights_master[i], height=0.7)
    ax.set_title(topic_map[f"Topic {i +1}"], fontdict={"fontsize": 30})
    ax.invert_yaxis()
    ax.tick_params(axis="both", which="major", labelsize=20)
    for j in "top right left".split():
        ax.spines[j].set_visible(False)
fig.suptitle("Topics in LDA Model", fontsize=40)
plt.subplots_adjust(top=0.90, bottom=0.05, wspace=0.90, hspace=0.3)
plt.show()
However, it only showed one plot, the first. For the remaining 6 datasets, it just printed:
<Figure size 432x288 with 0 Axes> <Figure size 432x288 with 0 Axes> <Figure size 432x288 with 0 Axes> <Figure size 432x288 with 0 Axes> <Figure size 432x288 with 0 Axes>
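Question
I have been working on this for days. I feel I am close, but this kind of result really confuses me. Does anyone have a solution, or can someone point me in the right direction?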
You should create the figure first:
import matplotlib.pyplot as plt

def top_word_comparison(axes, model, feature_names, n_top_words):
    # Draw one horizontal bar chart per topic on the given row of axes
    for topic_idx, topic in enumerate(model.components_):
        top_features_ind = topic.argsort()[: -n_top_words - 1 : -1]
        top_features = [feature_names[i] for i in top_features_ind]
        weights = topic[top_features_ind]
        ax = axes[topic_idx]
        ax.barh(top_features, weights, height=0.7)
        ax.set_title(topic_map[f"Topic {topic_idx +1}"], fontdict={"fontsize": 30})
        ax.invert_yaxis()
        ax.tick_params(axis="both", which="major", labelsize=20)
        for i in "top right left".split():
            ax.spines[i].set_visible(False)

tf_list = [cm_array, xb_array]
# One row of subplots per dataset, created once before the loop
fig, axes = plt.subplots(len(tf_list), 5, figsize=(30, 15), sharex=True)
fig.suptitle("Topics in LDA model", fontsize=40)
# range(enumerate(tf_list)) would raise a TypeError; iterate over the indices instead
for i in range(len(tf_list)):
    tf = tf_vectorizer.fit_transform(tf_list[i])
    n_components = 1  # kept from the original; assigning this name does not reconfigure lda
    lda.fit(tf)
    n_top_words = 20
    tf_feature_names = tf_vectorizer.get_feature_names_out()
    top_word_comparison(axes[i], lda, tf_feature_names, n_top_words)
plt.subplots_adjust(top=0.90, bottom=0.05, wspace=0.90, hspace=0.3)
plt.show()
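As far as I understand from your question, your problem is getting the correct index for your subplots.
In your case, you have an array range(len(tf_list)) to index your data, some data (e.g. top_words_master[i]) to plot, and a figure with 10 subplots (rows=2, cols=5). For example, if you want to plot the 7th item of your data (i=6), the index of ax would be axes[1,1].
To get the correct index of the subplot axes, you can use numpy.unravel_index. Of course, you should not flatten your axes.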
import matplotlib.pyplot as plt
import numpy as np

# dummy function
my_func = lambda x: np.random.random(x)
x_max = 100

# fig properties
rows = 2
cols = 5
fig, axes = plt.subplots(rows, cols, figsize=(30, 15), sharex=True)
for i in range(rows * cols):
    ax_i = np.unravel_index(i, (rows, cols))
    axes[ax_i[0], ax_i[1]].barh(np.arange(x_max), my_func(x_max), height=0.7)
plt.show()
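The flat-index-to-grid mapping used above can also be checked by hand. The following is only a minimal sketch, assuming row-major order (the default order of numpy.unravel_index); the helper name grid_position is hypothetical and not part of matplotlib or NumPy.

```python
def grid_position(i, cols):
    """Return (row, col) for flat index i in a row-major grid with `cols` columns."""
    # divmod gives (i // cols, i % cols): full rows passed, then offset within the row
    return divmod(i, cols)


rows, cols = 2, 5
# Position of every cell in the 2 x 5 grid, in the order the loop above visits them
positions = [grid_position(i, cols) for i in range(rows * cols)]

# The 7th data item (i=6) lands in the second row, second column
print(positions[6])
```

With 7 datasets in a 2 x 5 grid, only the first 7 positions are used, so the last three axes would stay empty.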