Histogram based on frequent/common words
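I'm trying to create a histogram based on frequent/common words, but I only get errors when I run the code. I managed to find the 10 most common words, but I can't visualize them in a histogram.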
description_list = df['description'].values.tolist()
from collections import Counter
Counter(" ".join(description_list).split()).most_common(10)
#histogram
plt.bar(x, y)
plt.title("10 most frequent tokens in description")
plt.ylabel("Frequency")
plt.xlabel("Words")
plt.show
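It looks like a few things are missing here:
- The result of Counter(...).most_common(10) is not assigned to x or y.
- x and y appear to be unbound.
- plt.show is not called, so it either does nothing or prints something like <function show at 0x...>.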
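As a small illustration of the first and last points, here is a minimal sketch using made-up sample text and a hypothetical stand-in function named show (not matplotlib's). It shows what most_common returns and why a function name without parentheses does nothing useful:

```python
from collections import Counter

# Made-up sample text, for illustration only.
text = "the cat and the hat and the bat"

# most_common(n) returns a list of (word, count) tuples; the result
# has to be stored somewhere before it can be plotted.
top = Counter(text.split()).most_common(2)
print(top)  # [('the', 3), ('and', 2)]


def show():
    """Hypothetical stand-in for a function such as plt.show."""
    return "called"


# Without parentheses this is only a reference to the function object;
# nothing runs. In a REPL or notebook it echoes <function show at 0x...>.
reference = show

# With parentheses the function is actually called.
result = show()
print(result)  # called
```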
Here is a reproducible example that fixes these issues:
from collections import Counter
import matplotlib.pyplot as plt
import pandas as pd
data = {
"description": [
"This is the first example",
"This is the second example",
"This is similar to the first two",
"This exists add more words"
]
}
df = pd.DataFrame(data)
description_list = df['description'].values.tolist()
# Assign the result of the `most_common` call to a variable:
word_frequency = Counter(" ".join(description_list).split()).most_common(10)
# `most_common` returns a list of (word, count) tuples
words = [word for word, _ in word_frequency]
counts = [count for _, count in word_frequency]
plt.bar(words, counts)
plt.title("10 most frequent tokens in description")
plt.ylabel("Frequency")
plt.xlabel("Words")
plt.show()
Expected output: a bar chart showing the 10 most frequent words and their counts.
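As a possible alternative to the two list comprehensions, the (word, count) pairs can be split in one step with zip. This is only a sketch on a shortened, made-up version of the sample data; it assumes the list of pairs is not empty, since unpacking an empty result would raise a ValueError:

```python
from collections import Counter

# Shortened sample data, for illustration only.
descriptions = [
    "This is the first example",
    "This is the second example",
]

word_frequency = Counter(" ".join(descriptions).split()).most_common(3)

# zip(*pairs) transposes [(word, count), ...] into two tuples:
# one holding the words, one holding the counts.
words, counts = zip(*word_frequency)

print(words)   # ('This', 'is', 'the')
print(counts)  # (2, 2, 2)
```

The resulting tuples can then be passed to plt.bar(words, counts) in the same way as the lists in the example above.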