How do I plot a dataframe with only 2 columns (text and int)?
   index  reviews                              label
0  0      i admit the great majority of...     1
1  1      take a low budget inexperienced ...  0
2  2      everybody has seen back to th...     1
3  3      doris day was an icon of b...        0
4  4      after a series of silly fun ...      0
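I have a dataframe of movie reviews, and I predicted the label column (1 = positive, 0 = negative review) using kmeans.labels_. How do I visualize/plot the above?
Required output: a scatter plot of the 1s and 0s
Code tried: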
colors = ['red', 'blue']
pred_colors = [colors[label] for label in km.labels_]
import matplotlib.pyplot as plt
%matplotlib inline
plt.scatter(x='index',y='label',c=pred_colors)
Output: a plot with a single red dot in the middle
The plot is from:
http://www3.ntu.edu.sg/home/ehchua/programming/webprogramming/Python4_DataAnalysis.html
You don't have any values to plot on the x-axis, so we can simply use the index.
The reviews can be added to the data as another column.
import pandas as pd
from matplotlib import pyplot as plt
data = [1,0,1,0,0]
df = pd.DataFrame(data, index=range(5), columns=['label'])
#
# line plot
#df.reset_index().plot(x='index', y='label') # turn index into column for plotting on x-axis
#
# scatter plot
ax1 = df.reset_index().plot.scatter(x='index', y='label', c='DarkBlue')
#
plt.tight_layout() # helps prevent labels from being cropped
plt.show()
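The single red dot in the question is most likely because plt.scatter was given the literal strings 'index' and 'label' with no data= argument, so matplotlib plots them as one categorical point instead of looking up columns. Below is a minimal sketch that keeps the reviews column and colours each point by its label, as the question intended. The helper name plot_labels and its colors parameter are made up for this illustration, and the dataframe is rebuilt by hand from the sample rows rather than from km.labels_.

```python
import pandas as pd
from matplotlib import pyplot as plt


def plot_labels(df, colors=("red", "blue")):
    """Scatter the 'label' column against the row index, one colour per label.

    Hypothetical helper for illustration: colors[0] is used for label 0,
    colors[1] for label 1.
    """
    point_colors = [colors[label] for label in df["label"]]
    fig, ax = plt.subplots()
    # Pass the actual values, not the column names as strings.
    ax.scatter(df.index, df["label"], c=point_colors)
    ax.set_xlabel("index")
    ax.set_ylabel("label")
    ax.set_yticks([0, 1])  # the labels are binary, so only two ticks are needed
    return ax


# Sample data copied from the question; the review text is truncated there too.
df = pd.DataFrame({
    "reviews": [
        "i admit the great majority of...",
        "take a low budget inexperienced ...",
        "everybody has seen back to th...",
        "doris day was an icon of b...",
        "after a series of silly fun ...",
    ],
    "label": [1, 0, 1, 0, 0],
})

ax = plot_labels(df)
plt.tight_layout()

if __name__ == "__main__":
    plt.show()
```

With a real model, df["label"] = km.labels_ would replace the hard-coded list, assuming km is the fitted KMeans object from the question.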