Histogram based on frequent/common words

Question

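I am trying to create a histogram based on frequent/common words, but I only get errors when I run the code. I managed to find the 10 most common words, but I cannot visualize them in a histogram.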

description_list = df['description'].values.tolist()

from collections import Counter
Counter(" ".join(description_list).split()).most_common(10)

#histogram 
plt.bar(x, y)
plt.title("10 most frequent tokens in description")
plt.ylabel("Frequency")
plt.xlabel("Words")
plt.show

Answer 1

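It looks like a few things are missing here:

The result of Counter(...).most_common(10) is never assigned to x or y
x and y appear to be unbound
plt.show is not called, so it either does nothing or prints something like <function show at 0x...>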
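The points above can be illustrated without matplotlib. The following is only a minimal sketch: the token list is made up, and `show` is a hypothetical stand-in for a zero-argument function such as `plt.show`.

```python
from collections import Counter

tokens = "red blue red green blue red".split()

# In a script, evaluating the expression alone discards the result;
# binding it to a name keeps the list of (word, count) tuples.
top_two = Counter(tokens).most_common(2)
print(top_two)  # [('red', 3), ('blue', 2)]


def show():
    """Hypothetical stand-in for a zero-argument function such as plt.show."""
    return "figure displayed"


reference = show  # no parentheses: only the function object, nothing runs
result = show()   # parentheses: the function is actually called
print(callable(reference), result)
```

Writing `plt.show` without parentheses behaves like `reference` here: it names the function but never runs it.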

Here is a reproducible example that fixes these issues:

from collections import Counter
import matplotlib.pyplot as plt
import pandas as pd

data = {
    "description": [
        "This is the first example",
        "This is the second example",
        "This is similar to the first two",
        "This exists add more words"
    ]
}
df = pd.DataFrame(data)


description_list = df['description'].values.tolist()

# Assign the result of the `most_common` call to a variable:
word_frequency = Counter(" ".join(description_list).split()).most_common(10)

# `most_common` returns a list of (word, count) tuples
words = [word for word, _ in word_frequency]
counts = [count for _, count in word_frequency]

plt.bar(words, counts)
plt.title("10 most frequent tokens in description")
plt.ylabel("Frequency")
plt.xlabel("Words")
plt.show()
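As a possible variation, the two list comprehensions can be replaced by unpacking the (word, count) pairs with `zip`. This is a sketch under assumptions, not part of the original answer: the helper name `top_words` and its `n` parameter are hypothetical, and tokens are still split on whitespace only, with no lowercasing or punctuation handling.

```python
from collections import Counter


def top_words(descriptions, n=10):
    """Return (words, counts) for the n most frequent whitespace-separated tokens."""
    word_frequency = Counter(" ".join(descriptions).split()).most_common(n)
    if not word_frequency:
        # zip(*[]) yields nothing to unpack, so handle empty input explicitly
        return (), ()
    words, counts = zip(*word_frequency)
    return words, counts


descriptions = [
    "This is the first example",
    "This is the second example",
    "This is similar to the first two",
    "This exists add more words",
]

words, counts = top_words(descriptions, n=3)
print(words)
print(counts)
```

The returned tuples can be passed to `plt.bar(words, counts)` in the same way as the lists in the example above.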

Expected output:

[Image: histogram based on frequent/common words]


python

histogram